On the nature of War & Peace

And just as autumn comes now and again, so does the time when my country, Spain, tears itself apart for one lack of reason or another. You might read all sorts of news about it, but it will mostly be the processed product of a well-oiled propaganda machine serving its masters on both sides of the aisle; true it is that the first casualty of war is truth.

In a conflict situation, words like truth, freedom, justice, democracy and, in general, any words with a heavy positive charge lose their meaning, since they are appropriated by every side in the conflict. Still, regardless of where the truth lies (pun intended), in this post I am more interested in the dynamics of conflicts: why they happen and whether they can be prevented or even solved.

To this end I will not use historical arguments; instead I will lay out a few common-sense dynamics for a social simulation and then discuss its results… Also, at the end I will surrender my opinion for anybody to refute at will.

Continue reading

IQ Tails of Race & Gender

Fear not, I am not going to perform any analysis proving the intellectual superiority of any race or gender. Also, as a 100% Spaniard (I need to check on that, though) I do not belong to the “elite” of ethnic groups disputing supremacy, namely Northern Europeans, Jews and Far-East Asians and, quite frankly, I feel kinda good about it, since I’d rather stick to the Latin Lover stereotype which, by all means, is true.

Recently I came across a video titled Steven Pinker – Jews, Genes and Intelligence. Typically I would have disregarded this video as your standard white supremacy internet rhetoric; however, I know Steven Pinker from his published works and achievements, and he is no small fish in the Psychology and Cognitive Science world. That is why I decided to give his video a shot to see what’s what, until he began talking about statistics. These are his words:

“…Jewish achievements might have an explanation on another fact that has long been known; that Jewish score on average higher on IQ tests than any ethnic group for what there’s comparable data. Their mean IQ is between 108 and 115, the mean of the European population is by definition a hundred which means that the Jewish average is a whole standard deviation higher than the [European] average… Importantly, even if the effect is moderate on average it’s a mathematical fact in Normal Distributions, that is Bell’s Curves, that small effects in the average can translate into huge effects at the extreme… So with one standard deviation difference between groups a score that is three standard deviation above the mean in the higher distribution is four standard deviations in the lower distribution which means there are 42 times as many people at that cut off.” – Steven Pinker

In short, according to Steven Pinker, a Jewish baby is 42 times more likely to be born an IQ genius than a European one… But is that really so?
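The arithmetic behind the "42 times" figure can be reproduced in a few lines. This is only a sketch of the calculation described in the quote, assuming two Normal distributions with equal standard deviations whose means sit one standard deviation apart; the function names `upper_tail` and `tail_ratio` are made up for this illustration. It checks the arithmetic, not whether those assumptions hold.

```python
import math

def upper_tail(z: float) -> float:
    """P(Z > z) for a standard Normal variable Z."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))

def tail_ratio(cutoff_sd: float, shift_sd: float) -> float:
    """Ratio of the shares of two groups above a common cutoff.

    cutoff_sd: cutoff in standard deviations above the HIGHER group's mean.
    shift_sd:  how far the LOWER group's mean sits below the higher one,
               also in standard deviations (equal spread is assumed).
    """
    share_higher = upper_tail(cutoff_sd)
    share_lower = upper_tail(cutoff_sd + shift_sd)
    return share_higher / share_lower

# Cutoff at 3 SD for the higher group is 4 SD for the lower group.
ratio = tail_ratio(3.0, 1.0)
print(f"ratio at the cutoff: {ratio:.1f}")
```

The ratio is a per-capita comparison, and it is very sensitive to both the size of the gap and the Normality assumption far out in the tails, which is exactly where that assumption is hardest to check.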


Continue reading

Robocoap: Text ⇢ Gephi

As promised in the Don Quijote de la Network post, I just packaged the R code that generated the data used in Gephi to visualize the network graphs describing Don Quijote.

Now it should be fairly simple (or at least simpler) for anybody to generate such graphs for their favorite books. And since the package automates the process, as if a robot were collecting the coappearances of elements within a text, its name came to be… Robocoap.

For now the package has just one function (novel.coap), intended for books in novel format. Down the road, and with minor work, the package will also handle theater play and movie script formats and, with a little more work, collections of research papers. Until then, enjoy your novels and have fun!
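For readers who want a feel for what collecting coappearances means, here is a minimal sketch in Python. It is not the package's R code and it does not reproduce novel.coap; the function names, the paragraph-level window and the naive substring matching are all assumptions made for this illustration. The output uses the Source, Target, Weight column headers that Gephi's spreadsheet import is expected to recognize for edge lists.

```python
from collections import Counter
from itertools import combinations

def coappearances(text, characters):
    """Count how often each pair of characters shares a paragraph.

    Assumption: paragraphs are separated by blank lines and a character
    "appears" when its name occurs as a substring (case-insensitive).
    """
    edges = Counter()
    for paragraph in text.split("\n\n"):
        lowered = paragraph.lower()
        present = sorted(name for name in characters if name.lower() in lowered)
        for pair in combinations(present, 2):
            edges[pair] += 1
    return edges

def to_edge_list_csv(edges):
    """Render the pair counts as a weighted edge list in CSV form."""
    lines = ["Source,Target,Weight"]
    for (source, target), weight in sorted(edges.items()):
        lines.append(f"{source},{target},{weight}")
    return "\n".join(lines)

# Toy text, invented for the example.
sample = (
    "Don Quijote spoke to Sancho.\n\n"
    "Sancho met Dulcinea and Don Quijote.\n\n"
    "Dulcinea was alone."
)
cast = ["Don Quijote", "Sancho", "Dulcinea"]
edges = coappearances(sample, cast)
print(to_edge_list_csv(edges))
```

A real text needs more care than this: aliases, nicknames and names that are substrings of other names would all be miscounted by plain substring matching.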

R corset: Bringing Math models back in shape

So your perfect, ideal mathematical model returns values that are impossible: probabilities bigger than one or smaller than zero, negative stock market values, et cetera, and now you feel like quoting George Box… again.


Sometimes the mathematical model embeds a solution to keep things real, as logistic regression does. However, many popular models, like ARIMA, offer no way to bound their results within business or scientific constraints. And then what? These are a few common options:
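As an illustration of the logistic idea mentioned above (and not necessarily one of the options the full post goes on to list), a bounded quantity can be mapped onto the whole real line, modeled there, and mapped back. The sketch below assumes hypothetical helper names `to_unbounded` and `to_bounded` and a scaled logit transform; it only shows that whatever the model returns on the transformed scale lands inside the bounds once it is mapped back.

```python
import math

def to_unbounded(y, lower, upper):
    """Scaled logit: map a value strictly inside (lower, upper) to the real line."""
    p = (y - lower) / (upper - lower)
    return math.log(p / (1.0 - p))

def to_bounded(x, lower, upper):
    """Inverse of the scaled logit: squeeze a real number into (lower, upper)."""
    return lower + (upper - lower) / (1.0 + math.exp(-x))

# Pretend these came out of an unconstrained model on the transformed scale.
raw_forecasts = [-10.0, -1.0, 0.0, 2.5, 10.0]

# Back on the original scale every forecast respects the bounds (0, 1).
bounded = [to_bounded(x, 0.0, 1.0) for x in raw_forecasts]
print(bounded)
```

The price of this trick is that the model's errors are no longer symmetric on the original scale, so intervals have to be back-transformed too rather than computed after the fact.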

Continue reading

Don Quijote de la Red

I publish my articles here in English to reach a wider audience interested in random topics. Still, publishing an article about Don Quijote in English feels strange to me, and not only because I miss using my mother tongue while grazing in faraway lands, but also because, Quijote-style, I dream of being able to cooperate with other Quijotes (though I will never snub a good Sancho) interested in the following network adventure:

I recently published an article (Don Quijote de la Network) on the use of social network analysis as a tool for analyzing the interactions or, to be more precise, the co-appearances of characters in literary works.

The spark for this deed, I must admit, was spotting through a lattice that the network analysis tool Gephi offers, as one of its examples, the co-appearance network of the novel Les Misérables, the famous literary work by Victor Hugo.

However much I searched, I found no equivalent on the web for the story of the famous hidalgo Don Quijote de la Mancha, and since no nag would come near me while I sat idle, I had no choice but to charge at this windmill on foot. So judge for yourselves, your worships, but remember meanwhile to lend a shoulder or to encourage donations and, that being so, may God repay you, for I cannot. I begin.

Continue reading

Don Quijote de la Network

Network theory is quite a thrilling subject, especially in today’s big data society, where we have awesome free tools like Gephi at our disposal.

There are many different kinds of networks and fields where this sort of analysis can take place; today’s post is on literature and, in particular, the social network structures within the masterpiece of Spanish literature: El ingenioso hidalgo don Quijote de la Mancha.

As an interesting anecdote about the qualities of this novel, Sigmund Freud first came to Don Quijote as a boy and loved the novel so much that he learnt Spanish so as to read it in its original language, keeping the secret from his parents, who might have disapproved of the hobby. So if you want to fully enjoy the book and lose nothing in translation, go Freud on it.

So let’s follow Don Quijote through the Network and, in case it is not obvious enough, doing this sort of analysis on a book implies major spoilers ahead.

Continue reading

If you play with your Prior you’ll go blind


And thus, the Huffington Post predicted a 98% probability for Hillary Clinton to be the next President of the United States. Amen… Let’s tease them a little bit, shall we?

My Bayesian friends, I understand playing with your priors is a very joyful activity but, you see, it leads to blindness. It allows you to believe, let me cap & bold this one, BELIEVE that Hillary’s chances to be the next President of the United States were 98%! No wonder that betting sites heavily favored Hillary’s side days before the election! I mean, 98%! Who wouldn’t put some money there, right?

But you know, a 98% probability coming from a Bayesian means very little unless, of course, they do some math pirouette to guarantee that the probability has frequentist properties, but then, if they do that, why bother going Bayesian in the first place?

If a frequentist tells you there is a 98% probability for an event to happen, he/she means that in 98 out of 100 situations like the one at hand the event will occur. Now, if a Bayesian tells you there is a 98% probability, he/she means that this is his/her degree of belief (wot?) that the event will happen… Amen again.

In other words, Bayesian results are as credible as the beliefs of the Bayesian statistician making the calculations; now we can understand why they calculate credible intervals instead of confidence ones.
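To see how much the prior can weigh on the answer, here is a toy sketch with a Beta prior on a success probability and Binomial data, where the posterior mean has a closed form. The numbers and the function name `posterior_mean` are invented for this example; it has nothing to do with the Huffpo model itself.

```python
def posterior_mean(prior_a, prior_b, successes, trials):
    """Posterior mean of a success probability under a Beta(prior_a, prior_b)
    prior after observing `successes` out of `trials` Binomial outcomes."""
    return (prior_a + successes) / (prior_a + prior_b + trials)

# Same hypothetical data for everybody: 6 successes in 10 trials.
successes, trials = 6, 10

# A flat Beta(1, 1) prior stays close to the observed 60%.
flat = posterior_mean(1, 1, successes, trials)

# A strongly opinionated Beta(50, 1) prior drags the answer above 90%.
opinionated = posterior_mean(50, 1, successes, trials)

print(f"flat prior:        {flat:.3f}")
print(f"opinionated prior: {opinionated:.3f}")
```

Same data, two statisticians, two very different numbers; with enough data the two would converge, but with little data the prior does most of the talking.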

If we check on the Huffpo methodology we can read:

Many Bayesian models ― including the Pollster averaging model as it’s implemented for our charts ― use “uninformed” priors that don’t affect the model or provide any background information.

However, we do use information from previous elections in these priors to make predictions in our presidential model.

Ba dum tsssss

Much has been written on the pros and cons of going Bayesian and how evil Frequentists are, but this amazing Bayesian result from Huffpo was just too good to let go as a beautiful example of how blind you can go when playing with your priors.

Objectivity is dead, long live Objectivity!

Are p-values an objective measure? Bayesian Statistics are not as objective as Frequentist statistics for the simple reason that they need more assumptions, that is, a prior. This is why even talking about Objective Bayesian Statistics is an oxymoron, and yet it seems to be the most popular Bayesian school out there. But anyhow, how about p-values then, can they be subjective? Is there such a thing as Objectivity in statistics?

For a time I thought p-values were an objective measure, but then a couple of blows put to rest my dream of having an objective procedure to deal with uncertainty. This is the story of the Subjectivity one-two combo that knocked my Objectivity dreams out flat…

Continue reading