May 15, 2013
Kate Crawford, a principal researcher at Microsoft Research and a visiting professor at the MIT Center for Civic Media, has written a provocative post on the HBR Blog titled, “The Hidden Biases in Big Data.” She quotes former Wired Editor-In-Chief, Chris Anderson, as saying, “with enough data, the numbers speak for themselves.” Crawford then asks, can numbers actually speak for themselves?
Crawford’s answer is a simple no. She states:
Data and data sets are not objective; they are creations of human design. We give numbers their voice, draw inferences from them and define their meaning through our interpretations. Hidden biases in both the collection and analysis stages present considerable risks, and are as important to the big-data equation as the numbers themselves.
I agree. Data – big or small – can no more speak for itself than a goldfish can. Big data just makes a long-standing problem… bigger. Data must be cleaned and ordered before it can be used, and what numbers mean depends on how we interpret them. I also agree that what we really need is not big data but, to use Crawford’s term, data with depth. This is what I was trying to get at in my post about big data needing a little help.
When I chatted with my colleague Bill Pink, Senior Partner, Creative Analytics at Millward Brown North America, he suggested that making use of big data, or any data for that matter, comes back to first principles:
What question are we trying to answer? Do we understand the people, psychology, human relationships, the category or phenomena under study? The upside of the big data is we now have previously untapped assets to help us answer these questions – mobile collection of texts, social media, set top data on TV viewing… that’s the amazing thing.
And those new data assets can be used to provide a better explanation than if we did not have those data sets to include in the story. But that assumes a framework, analytic approach and tools to evaluate and integrate the data and reach these conclusions. It’s not the presence of the data that matters, it’s the question to be answered and the ability of the new data to take us further than we were before.
One of the ironies of the buzz around big data is that the folks reminding us most loudly to keep in mind curiosity, experience and the principles of research… are often the most technically savvy data science types.
To back up his case, Bill points us to a New York Times article titled, “Sure, Big Data Is Great. But So Is Intuition.” The article is worth a read, and I love this quote from Claudia Perlich, Chief Scientist at Media6Degrees: “You can fool yourself with data like you can’t with anything else.” All too true, I fear. What do you think? Please share your thoughts.