October 31, 2012
I have no doubt that big data will have a big impact in many areas including market research, but readers of this blog will know that I am skeptical about many of the claims made about the utility of big data. So a Harvard Business Review (HBR) blog post titled “Big Data Hype (and Reality)” was bound to get my attention.
Gregory Piatetsky-Shapiro presents some interesting examples to support his case. For instance, he discusses the Netflix challenge as a case where big data analysis failed to improve significantly on the existing ability to predict user preferences. You may remember that Netflix offered a $1 million prize to anyone who could improve the accuracy of recommendations over its existing algorithm by 10 percent. It took three years for a team to win the prize, and the prediction process proved so complex that it was never implemented by Netflix. Besides, a 10 percent improvement represents only about a 0.1-star improvement in predicting someone’s movie preference.
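For readers who want to check that arithmetic, here is a minimal sketch in Python. It assumes the contest measured accuracy as root mean squared error (RMSE) on the one-to-five star rating scale, and that the existing algorithm was off by roughly 0.95 stars on average. That baseline figure is my assumption, taken from commonly cited accounts of the contest rather than from the HBR post, and the function name is purely illustrative.

```python
# Assumed figure: the baseline RMSE is the commonly cited value for
# Netflix's existing algorithm; it does not come from the HBR post.
BASELINE_RMSE = 0.9525      # typical prediction error, in stars (assumption)
TARGET_IMPROVEMENT = 0.10   # the 10 percent threshold for the prize


def rmse_after_improvement(baseline: float, improvement: float) -> float:
    """Return the error left after cutting the baseline by a given fraction."""
    return baseline * (1.0 - improvement)


target_rmse = rmse_after_improvement(BASELINE_RMSE, TARGET_IMPROVEMENT)
stars_gained = BASELINE_RMSE - target_rmse

print(f"Target RMSE: {target_rmse:.4f} stars")
print(f"Reduction in typical error: {stars_gained:.3f} stars")
```

Under those assumptions, the prize-winning threshold shaves a little under a tenth of a star off the typical prediction error, which is consistent with the 0.1 star figure quoted above.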
Apart from anything else, this helps explain why I never saw an improvement in the system’s ability to recommend new movies for me to watch (compounded, no doubt, by the fact that my wife and I share an account but have very different tastes in movies). It seems to me that one of the fundamental issues is that the system assumes genre is the overriding reason why people might like a movie. Maybe the prediction could be improved if people got to classify movies by factors other than overall genre, and not just rate them. For instance, I like science fiction movies, but only if they are intelligent, thought-provoking, and well-crafted.
So where does Piatetsky-Shapiro suggest big data will have its biggest impact? Artificial intelligence. He points to IBM’s Watson and Apple’s Siri as indications of things to come. (Why am I reminded of HAL from 2001: A Space Odyssey, GERTY from Moon, and David from Prometheus?) Piatetsky-Shapiro also nominates individual healthcare and location-based analytics as areas where big data will prove important. He suggests that the success of social networks such as Facebook, Twitter, and LinkedIn depends on their scale, and that big data tools and analytics will be required for them to exploit that scale effectively.
In spite of my skepticism about many of the claims made about big data, I can’t help feeling that Piatetsky-Shapiro is too pessimistic when he suggests that the randomness inherent in human behavior is the limiting factor to consumer modeling success. He states:
“Marginal gains can perhaps be made thanks to big data, but breakthroughs will be elusive as long as human behavior remains inconsistent, impulsive, dynamic, and subtle.”
The true cause of people’s behavior may not be immediately apparent, and without asking people why they behave the way they do, you are entirely reliant on the researcher’s interpretational ability. Maybe what is needed, as I have suggested in the past, is the marriage of big and little data?
A major advantage of traditional questioning techniques is that they elicit responses that might not naturally occur to the respondent, but which we know are important determinants of behavior.
So what are your thoughts on the value of big data? Does it offer the ability to understand why people do what they do? Where do you think it will have the most influence? Please share your thoughts.