Are you making data driven mistakes?

The age of big data means we are all being encouraged to tell stories with data and compile numbers to show we are data driven. Clever dashboards and visualisations show us needles that need to be pushed, numbers that need to go green and curves that need to continually trend up. Much of this means we don’t need to think about why the numbers must go green – but go green they must.

I still feel a bit like a deer in headlights when I am asked to comment on data for the first time in meetings and haven’t had time to think about it. It feels a lot like brainstorming where I feel compelled to say either vague things confidently or stupid things because I haven’t had time to think them through. In this article I will look at some common data driven mistakes followed by some questions I use to sniff out potential problems that will lead to bad decisions.

Using only quantitative data to make your decisions

Robert McNamara was the US secretary of defense during the Vietnam War and modeled what success would look like based only on quantitative data. His plan sounds spookily familiar: he created clear objectives and achievable goals in the form of metrics, so success could be reported on and easily understood by people not working in the war department. McNamara said that all the important quantitative measures indicated that they were winning the war, despite what his generals were telling him. Daniel Yankelovich summarised the quantitative fallacy in 1972 like this:

The first step is to measure whatever can be easily measured. This is OK as far as it goes. The second step is to disregard that which can’t be easily measured or to give it an arbitrary quantitative value. This is artificial and misleading. The third step is to presume that what can’t be measured easily really isn’t important. This is blindness. The fourth step is to say that what can’t be easily measured really doesn’t exist. This is suicide.

McNamara had put too much faith in the data and had not factored in many variables that were hard or impossible to measure, like the resilience of a soldier fighting for his home rather than his president.

This happens all the time when too much blind faith is put into a system, metric or some new technology. McNamara also believed that learning technologies could be used to make people smarter and as a result lowered the IQ requirement for the draft to 80, a policy subtly alluded to in the movie Forrest Gump.

None of the data driven decisions I make will ever have the impact of these mistakes, but I strive to learn from history.

Looking for insight within the data you have rather than the data you need

There is an old joke where a drunk man looks for his keys under a lamppost rather than where he lost them, because that is where the light is. In corporate training we do the same thing when we use happy sheets, NPS and attendance rates to measure the effectiveness of a program. We know we should be measuring knowledge transfer and performance improvement after the training, but that is hard to measure, so often we don’t. After all, by looking under the lamppost the drunk was at least eliminating that spot as the place where he had lost his keys!

Using data like a drunk uses a lamppost – for support rather than illumination

I recently spoke to a training manager who had just given a successful presentation about return on investment for a new training system he had implemented. He used a data driven forecast of how many people would be using his training system next year based on uptake this year. He had proven his point and everybody loved the visualisations. However, when he extended the same forecast over a longer timeframe, he was quite surprised to see that he would have double the population of Australia using his corporate training system within 5 years, which seemed unlikely.
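
A quick way to catch this kind of problem is to run the forecast further out and compare it with a number you know is a hard ceiling. The sketch below is only an illustration: the starting figure of 500 users, the tenfold yearly growth and the 10,000 person workforce are all made-up assumptions, not the training manager's real numbers, and the population of Australia is rounded to about 26 million.

```python
# Rough figure for scale only; the real population is a little different.
AUSTRALIA_POPULATION = 26_000_000


def project_users(current_users, yearly_multiplier, years):
    """Extend this year's growth unchanged into the future.

    Returns a list of (year, projected_users) pairs.
    """
    projections = []
    users = current_users
    for year in range(1, years + 1):
        users = users * yearly_multiplier
        projections.append((year, round(users)))
    return projections


def first_implausible_year(projections, ceiling):
    """Return the first year the forecast exceeds the ceiling, or None."""
    for year, users in projections:
        if users > ceiling:
            return year
    return None


# Hypothetical inputs: 500 users today, growing tenfold every year.
forecast = project_users(current_users=500, yearly_multiplier=10, years=5)

for year, users in forecast:
    print(f"Year {year}: {users:,} projected users")

# Hypothetical ceiling: a company with 10,000 employees.
print("Exceeds workforce in year:", first_implausible_year(forecast, 10_000))
print("Exceeds Australia in year:", first_implausible_year(forecast, AUSTRALIA_POPULATION))
```

With these invented numbers the forecast passes the size of the whole workforce in the second year, long before it reaches the population of Australia. Asking "what is the biggest this number could ever be?" is often enough to show that a growth rate cannot simply be carried forward.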

How to talk back to the data

In his 1954 book ‘How to Lie with Statistics’, Darrell Huff outlines five questions you should ask about data. They are still relevant today and are great questions to ask when presented with any data.

Going back to the Vietnam example of a data driven decision, I am going to apply the five questions to the claim below. Doing this in retrospect is easy, and it grossly simplifies a complex subject, but it shows how the questions work:

All quantitative metrics indicate we are winning the war and therefore we should continue until we are victorious.  

  1. Who says so? The US secretary of defense. OK, so there is a chance of bias.
  2. How do they know? They are using quantitative data: metrics like body count and boots on the ground, compared with those of the enemy.
  3. What’s missing? Any precedent, as the US had never fought a conflict like this, and the qualitative data from his generals.
  4. Did somebody change the subject? A fact has been stated and a conclusion made, but is the conclusion related to the fact? Do those metrics indicate that we are winning the war?
  5. Does it make sense? We have been winning this war for nineteen years, so how come we haven’t won it yet?

Conclusion

H G Wells said that ‘One day statistical thinking will be as necessary for efficient citizenship as the ability to read and write’. Almost one hundred years later, I think that day has come. Nice one, Bertie!

At twelve years old I got two percent in my maths exam, which you get for putting both your first and last name on the paper. With a huge amount of work, I went on to do a degree in accountancy and statistics, but number analysis still doesn’t come to me easily. This article is the first of a series that will explore ways to look at data in a practical, fun way that I hope will help you (and me) use data better.