7. The 5 mistakes that people make when talking about a statistic
Statistics and science are extremely valuable when it comes to informing our public debate and shining the light of truth on a matter. It is a shame then that we are rather bad at understanding how statistics actually work. While I am by no means a genius at understanding statistics, I know enough to outline the 5 most common mistakes people make when using statistics in a debate.
1. What a statistic shows and what a statistic means are not the same
Let us say that you have found this statistic:
“Over the course of her life, a newborn Danish girl will on average receive 1.6 million DKK more from the welfare system than she contributes via taxation. … For a newborn Danish boy the situation is the opposite. He will contribute 0.6 million DKK more than he will receive in his life.”
You might say “This shows that women are a bigger burden on the welfare system than men”. However, you are making the mistake of confusing results with interpretation.
The only thing a statistic shows is that when this type of inquiry is done at this time, in this way, in this population, it gives this result.
In this case, the statistic shows that when you add up all that a person pays in taxation over their lifetime, and compare with how much public money is spent on them in their lifetime, and you do that for everyone in Denmark, the result shows that the average man, during his life, pays more in taxes than the amount of money spent on him, and the opposite is true for the average woman.
The interpretation of a statistic is another part of the process. It is concerned with explaining why this type of inquiry, done at this time, in this way, with this population, gave this result. The interpretation of a statistic is usually done by the researchers who conducted the inquiry. Through their vast knowledge of prior research, and the research they conducted as part of this inquiry, they find likely explanations. Those are very high-quality explanations, but they are basically highly sophisticated, educated guesses.
The possible interpretations of the statistic about Danish men and women and their respective contributions to the welfare state are many, and it is possible that more than one interpretation might be true. The difference might come from the fact that women are on average paid less in their jobs than men, that women make up a smaller percentage of the most high-paid jobs, and/or that women, on average, do more unpaid labour, such as childcare, or caring for an elderly spouse.
2. You cannot prove a negative
Science is about providing proof for a hypothesis. It is all about showing that something is, for example, showing that a relationship between 2 things exists, that most people do live past 30, that the Earth is round, etc. Therefore, you cannot really prove that something isn’t. You can show that the evidence is not there to support it, that there is no proof, that the opposite hypothesis has proof to support it. But you cannot directly prove that something isn’t.
3. When thousands of randomly sampled people make the same “choice”, there is always an underlying causal factor
Recently, a study came out showing that the people living in my area, Aalborg East, had on average 13 years shorter lifespans than the people living in another area, Hasseris, just 7 km away. A lady from Hasseris was asked why she thought this was the case. Her answer: “Laziness”.
The mistake she is making here is thinking that a significant statistical difference can be explained entirely by “free will choices”. She is not the only one. I hear this fallacy often in debates about gender in the workplace. People attempt to explain statistics showing a highly gendered job market, or a huge lack of women in positions of leadership, with “women just don’t want those jobs”.
Here is why that makes no sense. Imagine a person who is your exact opposite, your alter ego. You and your alter ego are unlikely to make the same decisions, because your personalities are so different. It can happen that you and your alter ego make a similar choice just by chance, out of your free will, but it is unlikely. If you pick 1000 people from your country at random, some of them are going to be very like you in personality and make the same decisions as you. Others will be very different from you and make very different decisions. So if a majority of those 1000 randomly selected people make the same choice as you, it is very, very unlikely that you all just happened to make the same decision completely free from influence. More likely, it is because you all had the same reason, incentive or influence. In more technical terms, because of an underlying causal factor.
In the case of Aalborg East versus Hasseris, that causal factor could be something like the fact that Aalborg East is a lot poorer than Hasseris, and rich people simply have more resources (time, energy and money) to invest in healthy living. As for why women are not in leadership positions, the underlying causal factor that influences women’s choices might be the culture we live in, where we consciously and subconsciously associate leadership with men, and leadership jobs are often seen as a “man’s job”.
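To put a rough number on "very unlikely", here is a small sketch. It rests on assumptions that are mine and not from any study: each of 1000 people picks between two options by a fair coin flip, fully independent of everyone else, and "a clear majority" means at least 600 of them. The function name `prob_at_least` is made up for this illustration.

```python
from math import comb


def prob_at_least(k: int, n: int) -> float:
    """Chance that at least k of n people pick option A.

    Assumes every person flips a fair coin, independently of the others.
    The count is done with exact integers and divided only at the end.
    """
    favourable = sum(comb(n, i) for i in range(k, n + 1))
    return favourable / 2**n


# Hypothetical scenario: 1000 people, at least 600 land on the same option.
chance = prob_at_least(600, 1000)
print(f"Chance of 600+ out of 1000 by coin flip alone: {chance:.2e}")
```

Under these assumptions the chance comes out far below one in a million. Real choices are not coin flips, of course, but the point stands: when a large random sample leans clearly one way, pure individual chance is a poor explanation.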
4. Correlation does not equal causation
If you did a study on people who have been to the doctor recently, you would probably find that people who have been to the doctor recently, also have a tendency to miss work. If you didn’t know any better, you might conclude that going to the doctor makes you miss work. Those darn physicians are a bad influence on your willingness to attend your job! However, this would obviously be a wrong conclusion, because people don’t miss work because they have been to the doctor, but because the health problem that they are getting treated by the doctor is also causing them to miss work.
Statistics can often show that 2 things correlate. But the nature of the relationship between the 2 things depends on interpretation (as mentioned earlier in this piece). It might be that one causes the other, it might be that they are both influenced by a third factor, and in rare cases it might be that they are entirely coincidental.
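The "third factor" case can be sketched with a small simulation of the doctor example. Everything in it is an assumption made up for illustration: the probabilities, the names `simulate` and `missed_rate`, and the rule that being sick is the only thing driving both doctor visits and missed work. Visiting the doctor never affects missing work in this toy model, yet the two still end up correlated.

```python
import random

random.seed(0)  # fixed seed so repeated runs give the same numbers

# Hypothetical probabilities, chosen only for illustration.
P_SICK = 0.2
P_VISIT = {True: 0.7, False: 0.05}  # chance of a doctor visit, by sick / not sick
P_MISS = {True: 0.6, False: 0.02}   # chance of missing work, by sick / not sick


def simulate(n):
    """Return (sick, visited_doctor, missed_work) for n simulated people."""
    people = []
    for _ in range(n):
        sick = random.random() < P_SICK
        # Both outcomes depend only on `sick`, never on each other.
        visited = random.random() < P_VISIT[sick]
        missed = random.random() < P_MISS[sick]
        people.append((sick, visited, missed))
    return people


def missed_rate(people):
    """Share of the given people who missed work."""
    return sum(missed for _, _, missed in people) / len(people)


people = simulate(100_000)
visitors = [p for p in people if p[1]]
non_visitors = [p for p in people if not p[1]]
sick_visitors = [p for p in visitors if p[0]]
sick_non_visitors = [p for p in non_visitors if p[0]]

print("Missed work, visited doctor:      ", round(missed_rate(visitors), 3))
print("Missed work, no doctor visit:     ", round(missed_rate(non_visitors), 3))
print("Sick people only, visited doctor: ", round(missed_rate(sick_visitors), 3))
print("Sick people only, no doctor visit:", round(missed_rate(sick_non_visitors), 3))
```

Across everyone, doctor visitors miss work far more often than non-visitors. Once you compare only sick people with other sick people, the gap all but disappears, because the illness was doing the work all along.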
5. Percentage difference often looks more dramatic than percentage point difference
Say you have 100 people who eat unicorn meat every day, and 100 people who don’t. 6 of the unicorn eaters develop green spots, while only 4 of the non-unicorn eaters get green spots. That is 6% versus 4%. That is not a big difference, at least in this case. The percentage point difference is 6 minus 4, equal to 2 percentage points. However, the percentage difference is the percentage point difference (6 minus 4) divided by 4, multiplied by 100 (to change it into a percentage), equal to 50%. If you read that unicorn meat increases the risk of green spots by 50%, that sounds super dramatic and scary, but when you look at the percentage point difference, the difference is not that big.
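The unicorn arithmetic can be written out as a short sketch. The function names are my own, picked for this example; the numbers are the ones from the paragraph above.

```python
def percentage_point_difference(rate_a: float, rate_b: float) -> float:
    """Absolute gap between two rates that are already given in percent."""
    return rate_a - rate_b


def percentage_difference(rate_a: float, rate_b: float) -> float:
    """Relative gap: how much bigger rate_a is than rate_b, in percent of rate_b."""
    return (rate_a - rate_b) / rate_b * 100


unicorn_eaters = 100 * 6 / 100  # 6 of 100 people with green spots -> 6.0 %
non_eaters = 100 * 4 / 100      # 4 of 100 people with green spots -> 4.0 %

print(percentage_point_difference(unicorn_eaters, non_eaters))  # 2.0
print(percentage_difference(unicorn_eaters, non_eaters))        # 50.0
```

Same data, two very different-looking numbers: 2 percentage points or 50%. When a headline gives only the second one, it is worth asking what the first one is.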
I’m not a statistician, I am just a random person on the internet. If you want to learn more about statistics, go read some of the amazing literature there is about statistics here.