| March 29, 2017
Today we are drowning in a sea of data. Companies the world over – including Kantar Millward Brown – are using algorithms and cloud computing to beat that data into submission and force it to cough up its secrets. But what if the data is not measuring the most important things you need to know?
In discussing the topic of infobesity at the Next Forum in the King Abdullah Economic City, I was reminded that my colleague Phil Herr circulated a link to this article on Brand Strategy Insider, which addresses precisely this issue. I will let you read Derrick Daye’s take, but I will briefly recount the central story here because it is a salutary reminder that we need to ensure we have the data we need, not simply use the metrics that are most easily available.
Abraham Wald was a mathematician during the Second World War who was tasked with figuring out where to add armor to long-distance bombers to better protect them over hostile territory. In an attempt to answer the question, Wald started to analyze the places where the returning bombers were most damaged, but then realized that the data was not going to help answer the real question. The planes he was studying had made it back in spite of their damage. The real question was: where had the planes not been hit?
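The selection effect Wald spotted is easy to demonstrate with a toy simulation. The sketch below is purely illustrative: the section names and survival probabilities are invented assumptions, not historical figures. Hits land uniformly across four sections of the plane, but only planes that survive the hit make it home to be inspected, so the observed damage systematically under-counts the most vulnerable sections.

```python
import random

random.seed(42)

SECTIONS = ["engine", "cockpit", "fuselage", "wings"]
# Hypothetical survival probability given a hit in each section:
# engine and cockpit hits are often fatal; fuselage and wing hits are survivable.
SURVIVAL = {"engine": 0.3, "cockpit": 0.4, "fuselage": 0.9, "wings": 0.95}

actual = {s: 0 for s in SECTIONS}    # hits across ALL planes, survivors or not
observed = {s: 0 for s in SECTIONS}  # hits visible on returning planes only

for _ in range(10_000):
    section = random.choice(SECTIONS)  # hits land uniformly at random
    actual[section] += 1
    if random.random() < SURVIVAL[section]:
        observed[section] += 1         # only survivors can be inspected

print("actual hits:  ", actual)
print("observed hits:", observed)
# The observed sample shows far less engine and cockpit damage than
# really occurred -- precisely the sections that most need armor.
```

Although the true hit rates are identical across sections, the inspected sample makes the engine and cockpit look like the safest places on the plane. An analyst who armors the most-damaged visible areas is answering the wrong question with the data that happened to come home.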
Today, marketers and researchers risk letting the data that is easily available to them define the problems they need to solve, rather than taking a step back, figuring out what the real question might be, and then determining whether the existing data holds the solution or not. I suspect that in many cases huge opportunities go unseen simply because the available data has set limits on our thinking.
This is nothing new, of course; data has always defined how we see the world. One of the abiding problems with data generalizations is that people come to accept them as always true. I have spent over thirty years trying to figure out how brand attitudes and advertising response affect people’s behavior, and then systematizing that learning to facilitate research and guide future actions. But that does not mean I assume that these generalizations cover every eventuality.
The benefits of generalization and systematization in terms of efficiency and effectiveness are huge. However, there is always the risk that a difference in circumstances will produce a different result. The risk may be low, but any model – conceptual or statistical – will produce the wrong answer if the context changes. There is, however, a much bigger risk in not having a generalized framework on which to base your analysis and judge the results. As the economist Ronald Coase put it,
“If you torture the data long enough, it will confess.”
What it will confess to depends on whether you have the right data to start with, and how the findings fit with your existing understanding of how things work. But what do you think? How big is the risk that we let the available data define what we believe is important, rather than seeking out real insight? Please share your thoughts.