Data Science vs Bimbo Math

Saint Valentine’s Day, that romantic and beautiful festivity for department stores, also gets everybody talking about love in all sorts of contexts, and TED, my favorite place for talks (I will have to rethink this), brought in complexity theorist Hannah Fry for the occasion to talk about The Mathematics of Love. She summoned the almighty and powerful daemon of Mathematics in a quite entertaining talk to reveal to us mere mortals the secrets of Love… Not really.

There are so many things to say about this talk that I do not know where to begin. But you know what? TED picking a math bimbo to sell books: I can understand. Turning science into show business to make it appealing to the general public: I am for it. Oversimplifying complex subjects to make them accessible to everyone, even if the oversimplification is not quite true: I can take that. Using all of the above to push people into making life-changing decisions based on sloppy science… Well, allow me to draw a line there, Ms. Fry. Science is acquiring a bad reputation little by little, and talks like this one are one of the reasons why.

Anyway, long story short: ignore her love tips, especially #2; that one is really damaging. For my part, I will use Data Science and common sense to show that the best you can do is marry or partner with the person you are in love with while you are in love. And when it comes to using reason in the field of love, please allow me to quote Monsieur Blaise Pascal:

“The heart has its reasons of which reason knows nothing”

Let’s now kick some ass in the name of good science. Ms. Fry presents us with three “Mathematically Verifiable” tips to:

  1. Win at online dating: Show yourself the way you are.
  2. Pick the perfect partner: Choose whoever is…

Scientist at last, Scientist at last, thanks God almighty I’m a Scientist at last!

I have wanted to be a scientist ever since I read a comic in which scientist Bruce Banner turns into The Incredible Hulk. I did not know what a scientist was or what kind of scientist I wanted to be, yet I thought that a scientific career sounded like lots of fun if it could turn you into a huge green monster.

What scientists look like to a 12-year-old

I guess that for children of my age back in those days Marvel comics were the closest thing to what Harry Potter is for children nowadays (let’s get ready for a massive turnout of sorcerers and witches in the coming years, by the way).

So there I am, a few years after reading the comic and, for reasons beyond the scope of this post but that can easily be described as a billiard break(ing bad), I end up with a couple of degrees, Computer Science and Statistics, and a Master’s in Operations Research (more of the same stuff).

Yet I never considered myself a scientist or a researcher (nor did anyone else) since, well… when programming I don’t feel much like I am doing science, no matter how big the word Science is in my Computer Science degree, and the degree in Statistics does not make me feel like a scientist either, nor the Master’s like a researcher.

Statistics by itself is just a field of mathematics, and mathematicians are more into precise grammar than into writing beautiful books. Not to mention the opinion of physicists like Feynman that the way statistics is currently used in science downgrades the Social Sciences and other fields to Pseudo-Science.

There was a time when some of the work I was doing could be called Data Mining, and this seemed to push me further and further away from my childhood dream, since now I could be considered a Miner instead of a Scientist… Don’t get me wrong, Miner is not a bad profession if you want to start a revolution, but all the glamour of the word science was gone, and so my dream of being a scientist darkened with soot.

But then… Data Science came along. Wait, what? That’s right! Data Science is what you get when you consider every procedure that brings us knowledge, stripped of any field background: the intersection of every science known to man, the Mixed Martial Arts of knowledge. Data Science… if you think about it, can there be any other kind of science?

Not surprisingly, when we meet fellow Data Scientists we find out that they come from all sorts of backgrounds, and that data science teams are usually Macedonian salads of scientific backgrounds which include Physicists (of course) and Musicians (you heard me).

So it turns out that after so many years my dream came true and I became exactly what I wished for back in those days: a Scientist with no particular field… but data, and since everything is data, now everything is my field. So I can finally and proudly say “Scientist at last, Scientist at last, thanks God almighty I’m a Scientist at last!”.

And now, if you’ll excuse me, I have to go back to my scientific project codenamed Green. Thank you very much.

How to combine p-values to avoid a sentence of life in prison

I find the use of statistics in the justice system a thrilling subject, especially when you find out that people like Lucia de Berk have been handed life sentences based solely on flawed statistics coming from experts like Mr. Henk Elffers. So in this post I’ll talk about what he did wrong and how to avoid this kind of huge boo-boo in our statistical lives.

Lucia reads post, photo by Carole Edrich (Photo credit: Wikipedia)

The use of statistics in the justice system actually has a long history. The amazing mathematician / engineer / physicist / philosopher of science Henri Poincaré already had to correct the misuse of statistics in the infamous Dreyfus trial.

But it was in the Lucia de Berk trial that wrongly combined p-values led to a life sentence. I won’t go into the details of the trial; for that there are many other places, like Mr. Richard D. Gill’s web page account of the trial and a video that is worth a look. Instead I will focus on how to deal appropriately with a bunch of p-values to make sense of our data.
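As a rough illustration of what "combining p-values" can mean, here is a minimal Python sketch of Fisher's method, one standard textbook technique for independent tests. It is not necessarily the procedure discussed in the full post, and the p-values are made up for the example. The point it shows: the raw product of several p-values is not itself a p-value, and treating it as one overstates the evidence.

```python
import math


def fisher_combine(p_values):
    """Combine independent p-values with Fisher's method.

    Under the null hypothesis, -2 * sum(ln p_i) follows a chi-square
    distribution with 2k degrees of freedom (k = number of p-values).
    """
    if not p_values or any(not (0.0 < p <= 1.0) for p in p_values):
        raise ValueError("p-values must lie in (0, 1]")
    k = len(p_values)
    half_stat = -sum(math.log(p) for p in p_values)  # Fisher statistic / 2
    # For even degrees of freedom 2k the chi-square tail has a closed form:
    # P(X > x) = exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!
    term, total = 1.0, 1.0
    for i in range(1, k):
        term *= half_stat / i
        total += term
    return math.exp(-half_stat) * total


# Hypothetical p-values: three completely unremarkable results.
p_values = [0.5, 0.5, 0.5]
naive = math.prod(p_values)          # looks "significant-ish", but is not a p-value
combined = fisher_combine(p_values)  # properly calibrated combination

print(f"naive product:   {naive:.3f}")
print(f"Fisher combined: {combined:.3f}")
```

With three unremarkable p-values of 0.5, the naive product is 0.125 while the Fisher combination stays around 0.66, which is what we would expect from data that show nothing unusual.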

Bayesian Führer finds out about CERN using p-values

I thought about writing a post about the whole Bayesian community’s overreaction to CERN using p-values to announce that a Higgs-like particle had been discovered, but since I cannot possibly explain it better than Ms. Mayo’s great blog does in this post and others (especially if you like the color pink), what I am going to do is explain it differently…

Warning for Bayesians: watching this video without a sense of humor may cause difficulty breathing; swelling of your face, lips, tongue, or throat; chest pain; sweating and irregular heartbeats… Yes, just like Viagra, but just like Viagra it is worth it.

This video actually contains more reality than its sarcastic clothing might lead you to believe, so I’ll let you enjoy the game of figuring out what is real and what is not. 😉

I am tempted to write more posts about this highly interesting and fun Bayesian vs Frequentist philosophical issue, but for now suffice it to say that I basically agree with Mr. Bradley Efron’s opinion in his Science magazine article on the subject, where he says:

My own practice is to use Bayesian analysis in the presence of genuine prior information; to use empirical Bayes methods in the parallel cases situation; and otherwise to be cautious when invoking uninformative priors. In the last case, Bayesian calculations cannot be uncritically accepted and should be checked by other methods, which usually means frequentistically.

Anyhow, for those interested, the Führer scenes come from the fantastic movie Der Untergang. This is one of those great movies you can’t miss if you like history!

The I Ching, random numbers, and why you are doing it wrong

One would think that humanity had no need for good random number generators until computers and simulations were invented since, for most practical purposes, tossing a coin or throwing a die should suffice. So you can imagine my surprise when I found, in the four-to-five-thousand-year-old Chinese divination book called the I Ching, an RNG algorithm reminiscent of modern Linear Congruential Generators! But why the need for such a complex procedure to produce random numbers?

Artemisia Stems

The I Ching divination process requires randomly selecting two trigrams via a rather convoluted process that uses stems of either Artemisia or Yarrow. And although I acquired this ancient book a long, long time ago, the truth is that when consulting it as an oracle I always used the simplified version for lazy busy people, which consists of simply tossing three coins and checking the combination of heads and tails.

I always thought that the traditional form was just a magical way of doing the same thing we can do by tossing three coins, but today, for no particular reason other than having too much free time on my hands, I took a deeper mathematical look at this traditional form, and it turns out that it produces a completely different random result than tossing three coins!

Well, a mathematical curiosity, you might think, but does it matter? It might! Millions of people seek advice using the simplified coin version to draw the I Ching Yin Yang oracles. In this post I will show how the three-coin method yields equal proportions of Old Yin and Old Yang signs, whereas the traditional method yields three times as many Old Yang signs as Old Yin!

This means that the I Ching, in its traditional form of drawing oracles, promotes Yang behaviour over Yin; that is, it promotes action, imagination, creativity and strength among its users, whereas nowadays, with the simplified three-coin version, the active and passive answers are evened out.

I am not a sinologist nor a psychologist, so I cannot really tell which version would have a better influence on practitioners’ lives. I do know, though, that the traditional form promotes Yang among those seeking advice which, at first glance, seems like a positive thing to do and, since this book is used by millions of people, maybe experts in the field should advise practitioners not to use three coins anymore when consulting the I Ching. For those interested in having a traditionally sound oracle in terms of probability, I will show a few simple ways to achieve just that at the end of this post.
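The contrast between the two methods can be checked with a few lines of Python. This is only a sketch under stated assumptions: it uses the usual coin convention (tails = 2, heads = 3, summed over three coins) and the commonly cited idealized model of the yarrow stalk procedure, in which the first counting operation yields a 2 with probability 1/4 and a 3 with probability 3/4, while the second and third yield 2 or 3 with probability 1/2 each. Those step probabilities are an assumption of this sketch, not something derived in this excerpt, and the function name is purely illustrative.

```python
from fractions import Fraction
from itertools import product


def line_distribution(steps):
    """Exact distribution of a line value (6..9) obtained by summing steps.

    `steps` is a list of dicts mapping a step value (2 or 3) to its probability.
    """
    dist = {}
    for combo in product(*(step.items() for step in steps)):
        total = sum(value for value, _ in combo)
        prob = Fraction(1)
        for _, p in combo:
            prob *= p
        dist[total] = dist.get(total, Fraction(0)) + prob
    return dist


half = Fraction(1, 2)

# Three fair coins: tails counts 2, heads counts 3.
coin = {2: half, 3: half}
coins = line_distribution([coin, coin, coin])

# Assumed idealized yarrow model: the first operation is asymmetric.
first = {2: Fraction(1, 4), 3: Fraction(3, 4)}
later = {2: half, 3: half}
yarrow = line_distribution([first, later, later])

names = {6: "Old Yin", 7: "Young Yang", 8: "Young Yin", 9: "Old Yang"}
for value in sorted(names):
    print(f"{names[value]:<10} coins={coins[value]}  yarrow={yarrow[value]}")
```

Under these assumptions the coins give Old Yin and Old Yang 1/8 each, while the yarrow model gives 1/16 and 3/16, which is the three-to-one ratio mentioned above.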

This book has impressed mathematicians like Leibniz, psychologists like Jung, poets like Jorge Luis Borges and all kinds of intellectuals all over the world for centuries. And regardless of whether or not you believe it has magical properties, what is certain is that it has deep psychological and sapiential ones. This is not only one of the oldest books in human history, but a beautiful one. So, before we plunge into the mathematical details of the traditional algorithm for drawing oracles, let’s share this poem by Borges about the I Ching to break the ice.

For a Version of I Ching
The future is as immutable
As rigid yesterday. There is nothing
That is no more than a single, silent letter
In the eternal and inscrutable
Writing whose book is time. He who walks away
From home has already come back. Our life
Is a future and well-traveled track.
Nothing dismisses us. Nothing leaves us.
Do not give up. The prison is dark,
Its fabric is made of incessant iron,
But in some corner of your cell
You might discover a mistake, a cleft.
The path is fatal as an arrow
But God is in the rifts, waiting.

Para una versión del I King
El porvenir es tan irrevocable
Como el rígido ayer. No hay una cosa
Que no sea una letra silenciosa
De la eterna escritura indescifrable
Cuyo libro es el tiempo. Quien se aleja
De su casa ya ha vuelto. Nuestra vida
Es la senda futura y recorrida.
Nada nos dice adiós. Nada nos deja.
No te rindas. La ergástula es oscura,
La firme trama es de incesante hierro,
Pero en algún recodo de tu encierro
Puede haber un descuido, una hendidura,
El camino es fatal como la flecha
Pero en las grietas está Dios, que acecha.


On the certainty that God exists and why Bayesians should go π

To this day I have defined my theological position as Agnostic, which is not saying much given the different interpretations and philosophical flavors available for positioning ourselves when it comes to God. This is why I sometimes simply reply to The Question with something like “Both alternatives are equally crazy, so I don’t know.” But can we use statistics to better describe our position in these kinds of philosophical matters, or even to dictate how we should live our lives? Yes, we can.

WARNING: Beware, agnostics!!! I will show mathematical arguments that might turn you into a full-blown Believer or a hardcore Atheist… So if you keep reading, don’t say I did not warn you.

These are my principles… But if you don’t like them I have others.

If we envision probability as a measure linked to a random process, then questions like “What is the probability that God exists?” imply a sort of Supra-God that creates universes with Gods with a frequency p. But then some might argue that this Supra-God is actually God so, in the end, these kinds of philosophical questions make no statistical sense under such a frequentist interpretation of probability.

Then we have those who interpret probability as a degree of belief on matters subject to uncertainty. This interpretation is the one held by Bayesian Statistics.

So if I wear a Bayesian hat and I am asked The Question then, instead of replying “I don’t know” to describe my ignorance, I should reply with “50%” or “p=1/2”. This is so because when Bayesians (the Objective kind) have no information on a problem, they use a plethora of principles, Groucho style, to figure out a prior distribution to kick off the machinery of Bayes’ Theorem.

But there are infinitely many prior distributions with an expected value of 1/2 so, which of them best describes my agnosticism? Is there such a thing as a unique agnostic prior to rule them all? Well, it seems this Holy Grail does not exist, since in highly commendable Bayesian books like Bernardo & Smith we can read things like:

In general we feel that it is sensible to choose a non-informative prior which expresses ignorance relative to information which can be supplied by a particular experiment. If the experiment is changed, then the expression of relative ignorance can be expected to change correspondingly. (Box and Tiao, 1973 p.46).

Wait, what? We change the experiment and our prior ignorance changes too? In fact, not all Bayesians agree that non-informative priors exist (Howson 2002; O’Hagan 2006; Press 2003); they regard any Objective Bayesian “non-informative” prior simply as a well-formed belief… So I’ll pick the Subjective interpretation, and in this post I am going to well form my belief in God.
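To make the earlier remark about infinitely many priors with mean 1/2 concrete, here is a small Python sketch using the symmetric Beta(a, a) family. The choice of the Beta family is just a convenient illustration, not the prior constructed in the full post. Every member has expected value exactly 1/2, yet the spread of belief ranges from "almost certainly 0 or 1" to "almost certainly 1/2".

```python
from fractions import Fraction


def beta_mean_var(a, b):
    """Exact mean and variance of a Beta(a, b) distribution."""
    a, b = Fraction(a), Fraction(b)
    mean = a / (a + b)
    var = a * b / ((a + b) ** 2 * (a + b + 1))
    return mean, var


# Illustrative shape parameters: all of these priors "say 50%",
# but they encode very different states of belief.
for a in (Fraction(1, 2), 1, 2, 10, 100):
    mean, var = beta_mean_var(a, a)
    print(f"Beta({a}, {a}): mean={mean}, variance={var}")
```

Beta(1, 1) is the flat prior and Beta(1/2, 1/2) is the Jeffreys prior for a proportion, both with mean 1/2, so saying "p=1/2" alone does not pin down which flavor of agnosticism is meant.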

Plus, in the process of cooking my Agnostic prior, I’ll discuss why Bayesians should measure their beliefs from 0 to π instead of from 0 to 1. The latter measure is too frequentist for them, and π makes more mathematical sense since trigonometric functions are going to pop up naturally everywhere in our prior belief endeavor.