Incomplete, dirty, and not even free (but it's the only information we have!)

by Nigel Hollis | August 30, 2006
For pleasure reading, I turn to science fiction, and I'm currently reading Chris Moriarty's Spin Control.   In this novel, each chapter opens with a quote, which may be real or fictitious. The quote preceding the chapter titled Information Costs is not only real, it is highly relevant to our business.
These three assumptions are fundamental: information is partial, information is contaminated, and information costs. – J. Traub (1988).

While I have not been able to track down the source of the quote, I did find the book Complexity and Information, authored by Joseph F. Traub and Arthur G. Werschulz. The ideas in the book seem as pertinent to market research as to any other discipline that seeks to make predictions. If the assertions above are true, why do so many people expect research to yield certainties at minimal cost?

In the book's introduction, Traub and Werschulz explain that information is partial because it cannot uniquely identify a particular physical state, and that it is contaminated because it relies on measurements which are subject to error.  To illustrate these points, they cite the example of a weather forecast. A forecast combines measurements from a variety of sources: earth-based stations, planes, balloons, ships and satellites. They explain:
 "Since the number of measurements is finite, the information is partial. Inevitably, these measurements are contaminated with errors. Hence we can only know the current weather to within some error. Since weather appears to be chaotic, this error is amplified, and limits our ability to forecast the weather to only a matter of days."

(Even this guarded statement seems optimistic to those of us on the East Coast who got soaked last weekend after receiving a forecast of "partly cloudy.")

After providing a couple more examples, Traub and Werschulz conclude their discussion of incomplete and inaccurate information, stating, "At best, we can only guarantee an approximate solution." Then they move on to discuss another dimension of information:  the expense associated with gathering it.

The comparison to the practice of market research is obvious. While we do not seek to predict something as chaotic as the weather, we do make predictions based on partial and contaminated information, and that information does have a cost. Implicit in Traub and Werschulz's introduction is the trade-off between the quantity and quality of information and its cost. They describe a problem-solving algorithm as optimal if "it uses the information to produce an acceptable approximation to the answer at minimal cost."

The same could be said of any research project or analysis and it is worth bearing in mind the next time you write or review a research proposal. Will the information created by the research provide you with an acceptable "approximation" for the smallest possible cost? All too often, I believe, we focus on just one side of this equation, and assume the project with the lowest cost (and quickest turnaround) will be the best choice. Rarely do we focus on the other side of the equation and ask "Will this research give us an acceptable approximation given how we intend to use the information?"

The downside of not considering both sides of this equation is not just the risk of gathering inaccurate or misleading information. It extends to the increased likelihood of making poorly-informed decisions based on that information, and, ultimately, to poor business performance—shortfalls in revenue, weaker share prices and job losses.

We would do well to remember that there are potentially significant ramifications if we fail to approximate correctly. The opportunity cost of bad research may not be as destructive and immediate as the failure to predict the path of a hurricane, but it is still considerable, and potentially more insidious for its lack of immediacy.

To sum up:
Research does not deal in certainties; it deals in approximations. And the quality and reliability of those approximations are intrinsically linked to the cost of obtaining the information involved.

At a time when people seem to be looking for research to provide THE answer, quickly, and at a low cost, everyone from senior management to research buyers to research suppliers would do well to remember these facts.



Leave a comment
  1. Nigel, September 19, 2006
    Hi Diarmid, thanks for the comment.
    To your first point, yes, we do need to train people to better extrapolate from a set of data taken at a point in time. What might happen to the brand's equity if X happened?
    I am in two minds about the issue of timeliness. First, I know it is just as time-consuming to change an online script as a CATI script. The time advantages of online apply to the speed of response, provided you are willing to live with a lower response rate from a higher initial set of contacts and any potential bias introduced by the timing of response, e.g. people responding at the weekend may profile differently from those responding during the week.
    The best thing to do may be to conduct a robust measurement at a point in time and then use quick and focused follow up surveys to check the direction of change.
    To your second point, who is Diamond? Who, for that matter, is Diarmid? Will either of you tell us, I wonder?
  2. Diarmid Campbell-Jack, September 18, 2006
    Fascinating discussion that I think is touching on two of the most important questions in modern day research.

    Firstly, Yixin's comments are important as they go to the heart of the client-agency relationship. Yes, there is the truism that the same "events, dear boy" that Macmillan referred to decades ago equally apply today. In fact, it can be argued that the present integrated world system greatly inflates the impact of these events as they spread through the various networks which we work within. However, the implications of this in terms of research need to be stressed throughout all stages of the research programme. As an example, too often researchers examine the context and setting during which the research is being conducted (e.g. what outside forces affect the actual data) without continuing this contextualisation up to the output stage, and even, perhaps, beyond... I sense this is a wider issue!

    Of course, there are also wider questions about the speed of making amendments to the research programme and the widespread assumption that online systems allow this to a greater extent than more "traditional methodologies".

    Secondly, there is the equally important question of where Diamond got his name from? I am most intrigued...
  3. Nigel, September 05, 2006
    (On average.)
  4. Diamond Campbell-Jack, September 01, 2006
    Nigel, fabulous politician's response.

    If I may, a yes or no answer. Is the percentage mark-up for an online Link test higher than for an offline Link test?
  5. Nigel, August 31, 2006
    Thanks Diamond and Yixin, your comments are appreciated.
    Yixin, under the circumstances you describe I would have to suggest that the research buyer think long and hard about whether the research is worth doing, and, if it is, try to build in a forward looking component that asks respondents' reaction to different possible scenarios.
    Diamond, I would absolutely agree that larger sample sizes would be desirable in pre-testing, particularly if clients insist on trying to get a volume estimate from them.
    Unfortunately I have to disagree with your comment on the cost and margins attached to online research.
    It is a common perception that online research is a lot cheaper but in general it is not true. Online is cheaper than traditional mall and phone-based research and our price charged to clients has been adjusted to reflect the differences in the cost base.
    I say differences because in some cases it is a matter of shifting costs rather than lower costs. If the research company uses a reputable and representative source of respondents then the costs attached are still significant. Panel recruitment and management need to be paid for and, as I mentioned in a previous post, scarcity of supply is now becoming the norm as participation rates decline. Additionally, there is an infrastructure cost that we do not have to pay for in the mall. Internet research requires 24-7 availability of servers, with the attendant redundant internet connection and mirrored servers, secure off-site location and IT support.
    If we were talking thousands of interviews rather than a few hundred then yes, diminishing returns to scale would apply. The only other alternative would be to avoid panels and recruit from a representative set of web sites using pop-ups, which we do for some research.
    The belief that online research is cheap is still fairly common among the research community and I think it harks back to the "electrons are free" days of the late 90s. It has been compounded by VC-funded start-ups which have tried to penetrate the market with unsustainably low costs. My experience at Millward Brown Interactive made me uncomfortably aware of the costs involved and, unfortunately, recent lay-offs at one of the major panel operations in North America would also suggest that online research is really not the goldmine that people like to believe.
  6. Yixin, August 31, 2006
    In certain markets where the socio-economic environment is changing dramatically and very fast, on top of being "incomplete, dirty, and costly", research data are often found to be "outdated" or even "irrelevant" by the time decisions based on those data are being made or implemented.
  7. Diamond Campbell-Jack, August 30, 2006
    Point taken, but I think that recent innovations in data collection have changed 'some' of the dynamics of the cost-reliability curve.

    This is especially true in the case of copy testing. If we are being honest, the high cost of mall-based data collection meant that a lot of copy testing is close to 'bad research'... not because the questionnaire is poorly designed or the analysis is poor, but just because the low bases are inherently unreliable. No one should make multimillion-dollar decisions based on a sample of 150 (or fewer).

    The cost of increasing the reliability of research has fallen dramatically courtesy of online data collection. Yet the 'standard sample size' for a copy test has not changed. Although the cost per test has fallen slightly, it does not seem to reflect the steep drop in the cost of data collection. The only conclusion has to be that some agencies are placing higher profit margins over higher sample sizes.

    To put this in context, for some measures the difference between falling into the 'bottom 30%' of ads tested and the 'top 30%' can be less than 10% (for a top box score). Based on a sample of 150, the difference between abject failure and glorious victory may fall entirely within the realms of statistical error.

    Where am I going with this? I am imploring copy testing companies (that embrace online research), to increase the standard sample size for single cell copy tests from 150 to somewhat closer to 400.

    Yes, that will cost more... but the real cost is a fraction of what clients are being asked to pay. Yes, we are all in the business of making money, but it is in all of our interests to take advantage of the internet to increase the reliability of the data without incurring an unreasonable, prohibitive mark-up.
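
The sample-size argument in this comment can be checked with a rough calculation. The sketch below is not from the original discussion. It assumes simple random sampling, a 95% confidence level (z = 1.96), the normal approximation for a proportion, and a worst-case top box score of 50%. The function names are hypothetical, and real copy tests may use different designs and weighting.

```python
import math

Z_95 = 1.96  # assumed: two-sided 95% confidence level


def margin_of_error(p, n, z=Z_95):
    """Approximate margin of error for a single proportion p on a sample of n."""
    return z * math.sqrt(p * (1 - p) / n)


def difference_margin(p1, p2, n, z=Z_95):
    """Approximate margin of error for the difference between two proportions,
    each measured on an independent sample of n respondents."""
    return z * math.sqrt(p1 * (1 - p1) / n + p2 * (1 - p2) / n)


if __name__ == "__main__":
    for n in (150, 400):
        single = 100 * margin_of_error(0.5, n)
        diff = 100 * difference_margin(0.5, 0.5, n)
        print(f"n={n}: one score +/-{single:.1f} pts, "
              f"difference between two ads +/-{diff:.1f} pts")
```

Under these assumptions, a single score on 150 respondents carries a margin of roughly 8 points, and the gap between two ads tested on 150 each carries a margin of roughly 11 points, so a 10-point gap cannot be reliably distinguished from noise. At 400 respondents per cell the margin on the gap drops to about 7 points.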
