Market Research + Sentiment Analysis = New Insight
Q & A with Carol Haney, Toluna
Posted October 24, 2012
Carol Haney is vice president, product marketing at Toluna. She will be speaking at the upcoming Sentiment Analysis Symposium, October 30, 2012 in San Francisco. Her talk is titled "Up Close and Personal: Social Media Insights and the Mind of the Consumer."
Carol is one of four authorities who graciously agreed to respond to a series of questions exploring the role of sentiment analysis, text analytics, and emerging social-intelligence technologies in support of next-generation market research. Read her responses to questions from Seth Grimes and then return to the full set of interviews.
Q1> What impact is sentiment analysis having in the market-research world, whether applied for surveys or to social media?
Carol Haney> Manual coding for sentiment analysis has been going on for decades for open-ended data in survey and social research. The real impact is in the handling of increasingly "big data," whether that means, for example, the collection of hundreds of thousands of open-ends (usually in large customer satisfaction surveys) or social media scraping on brands with large market shares. In these cases, natural language processing techniques are not just optional, they're required. The applied NLP technique, together with an analytic understanding of the data, determines the accuracy of the automated coding, whether it is clustering, classification, or sentiment analysis. Automated coding is being performed with great accuracy by certain organizations to help drive business decisions. Achieving 70-80% accuracy using automated or semi-automated machine learning techniques requires an in-depth understanding of the subject domain that the textual data falls in, a significant amount of analytic preparation, and ongoing maintenance of the processing over time. Companies that engage with this level of commitment can and do derive great value from this analysis. On the other hand, companies that rely only on the results of the sentiment analysis and don't look inside the "black box" should beware of basing any business decision on that analysis. The graphs may be pretty, yet drilling into the data will most likely show that the analysis was significantly faulty.
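As a rough illustration of what looking inside the "black box" can mean in practice, the sketch below compares automated sentiment codes against a human-coded sample of the same items. The function name `audit_accuracy`, the label names, and the sample data are hypothetical and chosen only for illustration; they are not taken from any Toluna tool.

```python
from collections import Counter


def audit_accuracy(human_codes, machine_codes):
    """Compare machine-assigned codes with human codes for the same items.

    Returns the share of items where both agree, plus a Counter of
    (human_code, machine_code) pairs showing where they disagree.
    """
    if len(human_codes) != len(machine_codes):
        raise ValueError("both lists must cover the same items")
    if not human_codes:
        raise ValueError("need at least one coded item")
    pairs = list(zip(human_codes, machine_codes))
    agreement = sum(h == m for h, m in pairs) / len(pairs)
    confusion = Counter(pairs)
    return agreement, confusion


# Hypothetical audit sample: five open-ends coded by a person and by a tool.
human = ["pos", "neg", "neu", "pos", "neg"]
machine = ["pos", "neg", "pos", "pos", "neu"]

agreement, confusion = audit_accuracy(human, machine)
print(f"agreement: {agreement:.0%}")
for (h, m), n in confusion.items():
    if h != m:
        print(f"human={h} machine={m}: {n} item(s)")
```

A single agreement figure hides where the errors fall, which is why the sketch also lists the disagreeing pairs. In this made-up sample, neutral comments coded as positive would inflate a brand's apparent sentiment.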
Q2> How effective is text analytics at getting at sentiment, and to what extent should researchers continue to rely on human coding, whether by experts or crowd-sourced? Could you describe one or two things you or your clients have learned, via automated text/sentiment analysis, that you wouldn't have discovered otherwise?
Carol> In my opinion, except in very tightly defined subject domains that have already had a significant amount of manual analysis done, human intervention is required in some aspect of the modeling. That may mean coding enough textual items to create a strong training set, or manually reviewing clusters as part of an iterative analysis: for example, cluster, then human review and validation, then adjustment based on that review, then categorize.
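One way to picture the human-in-the-loop iteration described above is a triage step: items the model codes with a clear margin are accepted, and the rest are routed to a person whose codes then feed back into the training set. The sketch below is a deliberately simple word-count scorer, not a production classifier and not the method used by any particular vendor. All names (`train`, `score`, `triage`, `min_margin`) and the sample texts are hypothetical.

```python
from collections import Counter, defaultdict


def train(labeled):
    """Build per-label word counts from human-coded (text, label) pairs."""
    counts = defaultdict(Counter)
    for text, label in labeled:
        counts[label].update(text.lower().split())
    return counts


def score(text, counts):
    """Return (best_label, margin over the runner-up label) for one text."""
    words = text.lower().split()
    totals = {label: sum(c[w] for w in words) for label, c in counts.items()}
    ranked = sorted(totals.items(), key=lambda kv: kv[1], reverse=True)
    best_label, best = ranked[0]
    runner_up = ranked[1][1] if len(ranked) > 1 else 0
    return best_label, best - runner_up


def triage(items, counts, min_margin=1):
    """Split items into auto-coded and needs-human-review lists."""
    auto, review = [], []
    for text in items:
        label, margin = score(text, counts)
        (auto if margin >= min_margin else review).append((text, label))
    return auto, review


# Hypothetical human-coded training set.
labeled = [
    ("love the fast delivery", "pos"),
    ("great service love it", "pos"),
    ("terrible slow delivery", "neg"),
    ("awful service never again", "neg"),
]
counts = train(labeled)
auto, review = triage(["love it", "delivery service", "awful and slow"], counts)
print("auto-coded:", auto)
print("needs review:", review)

# After a person codes the review items, add them and retrain.
labeled.append(("delivery service", "neg"))  # assumed human decision
counts = train(labeled)
```

The point of the sketch is the loop rather than the scorer: each round of human review enlarges the training set, so fewer items should need review on the next pass.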
Q3> What factors should companies consider when selecting text and sentiment analysis tools for marketing research?
Carol> Primarily, the tool should provide visibility into its machine learning techniques and let a human audit, intervene, validate, and make changes to improve accuracy iteratively. Whether the client uses a tool or a consultant, the client should be ready to do significant evaluation before basing any forward-looking decisions on the results of the analysis. There is still a lot of snake oil out there.
Q4> Are there quality concerns in social-media MR, for instance due to not having representative sources or due to language and usage irregularities? And are there special advantages in social-media MR?
Carol> There are significant quality concerns. In my opinion, the representativeness of the population under test and standard language use are not as much of a concern as usage irregularities. Managing usage irregularities is tricky in analysis, especially in short text or in text written by specific segments. Depending on the subject domain, not having a lexical resource of any kind -- even a basic dictionary of abbreviations, domain-specific terms, and slang -- hurts the accuracy of the analysis. Today, excellent web services and software products are available to help identify the standard language(s) used within texts, and the marketplace also offers advanced tools that specialize in automated text analysis for multiple or double-byte languages. Marrying social media analysis with traditional survey research makes a lot of sense in a number of ways. First, survey research can help validate assumptions made within the analysis. Further, survey research done well can help map representativeness onto the social media data. Finally, survey research allows for a deep dive into the concerns and issues brought up by the social media analysis.
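The "basic dictionary of abbreviations, domain-specific terms, and slang" mentioned in the answer above can be as simple as a lookup table applied before any automated coding. The sketch below is a minimal illustration under that assumption. The lexicon entries and the function name `normalize` are made up for the example, and a real lexical resource would be far larger and domain specific.

```python
import re

# Hypothetical lexicon mapping informal usage to standard forms.
SLANG_LEXICON = {
    "gr8": "great",
    "luv": "love",
    "thx": "thanks",
    "cust": "customer",
    "svc": "service",
}

# Tokens are runs of letters, digits, and apostrophes; punctuation is left alone.
TOKEN = re.compile(r"[A-Za-z0-9']+")


def normalize(text, lexicon=SLANG_LEXICON):
    """Replace tokens found in the lexicon (case-insensitively) with their standard form."""
    return TOKEN.sub(lambda m: lexicon.get(m.group(0).lower(), m.group(0)), text)


print(normalize("Thx, luv the new phone, gr8 battery"))
# thanks, love the new phone, great battery
```

Normalizing first means a downstream sentiment model sees "great" and "love" rather than tokens it has never encountered, which is one concrete way a missing lexicon hurts accuracy on short social media text.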
Meet Carol Haney and other market research innovators at the Sentiment Analysis Symposium, October 30, 2012 in San Francisco. For now, return to the full set of interviews.