Should you analyze your conversations for sentiment? If so, should you use computers or humans to do the job?
by Katie Delahaye Paine
Whether they bill themselves as listening tools, measurement services, or media monitoring firms, there are now more than 150 companies pitching their social media measurement services to overloaded corporate communicators. Almost all offer sentiment analysis, the art or science of gleaning how people feel about your brand by reading what they have to say.
Should You Do Sentiment Analysis At All?
Yes, it’s the latest shiny new measurement toy, but sentiment analysis is not always possible, or even useful. Carefully consider these two questions before you decide:
1. Do people express any sentiment at all in discussing your brand?
You can't measure sentiment if it's not there. For many sectors (a B2B product, for instance) the conversations out there may all be factual discussions, with no sentiment to glean.
2. Do you have any direct interaction with customers?
Measuring sentiment is only useful if you can act on your results. If you have no direct customer interaction, it will be very difficult to determine whether expressions of sentiment have any real impact on your business. Only if you are an online retailer, or in a field where people make reservations or register online, can you tie sentiment to customer behavior.
Should You Use Computers or Humans to Analyze Sentiment?
If and when you decide to go ahead with social media sentiment analysis, the biggest decision you have to make is whether to use human or computer-automated analysis. Before you rush out and buy a sentiment analysis system, here are five questions that will help you decide whether computers or humans will do the best job for you.
1. Do you receive more than 10,000 qualified mentions a month in social media?
That's not counting spam, copy generated by content farms, or mentions of a similar-sounding brand (for instance, Carmax the car superstore vs. Carmex the lip balm). If your volume falls below this mark, it may well be more expensive to program a computer than to use humans to accurately analyze your coverage. See the chart to the right, based on our experience.
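To see why a volume threshold like this exists, consider a simple cost model: automation carries a large one-time programming cost but a tiny marginal cost per mention, while human coding costs roughly the same per mention at any volume. The dollar figures below are illustrative assumptions for the sketch, not numbers from the article.

```python
# Illustrative cost model; all rates are assumed, not figures from the article.
HUMAN_RATE = 0.25      # assumed human coding cost per mention ($)
SETUP_COST = 3000.0    # assumed one-time cost to program and tune a system ($)
MACHINE_RATE = 0.01    # assumed marginal automated cost per mention ($)

def human_cost(mentions: int) -> float:
    """Total cost of having humans code every mention."""
    return HUMAN_RATE * mentions

def machine_cost(mentions: int) -> float:
    """One-time programming cost plus the marginal cost of running the system."""
    return SETUP_COST + MACHINE_RATE * mentions

for volume in (2_000, 10_000, 50_000):
    cheaper = "humans" if human_cost(volume) < machine_cost(volume) else "automation"
    print(f"{volume:>6} mentions: {cheaper} cheaper")
```

With these assumed rates the break-even point falls around 12,500 mentions; the exact number depends entirely on your rates, but the crossover itself is why low-volume brands are often better served by human coders.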
2. What level of accuracy is acceptable to your executive leadership?
Most automated sentiment analysis tools get sentiment right about 40-60% of the time. If that is good enough, you can use an automated system. If not, then to ensure a higher degree of accuracy you need, at a minimum, to have humans check random samples of the automated analysis and re-code as necessary.
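One practical way to do that spot-checking is to have humans re-code a random sample of the machine's calls and report the estimated accuracy with a margin of error, rather than re-reading everything. A minimal sketch, in which the 55% machine accuracy, the 10,000-mention population, and the 400-item sample are assumptions for illustration:

```python
import math
import random

random.seed(0)

# Simulated population: assume the machine gets each mention right 55% of the
# time (within the 40-60% range cited in the article). Each entry records
# whether the machine's call on that mention was correct.
population = [random.random() < 0.55 for _ in range(10_000)]

# Humans audit a random sample instead of re-coding every mention.
sample = random.sample(population, 400)
p = sum(sample) / len(sample)                         # observed accuracy
margin = 1.96 * math.sqrt(p * (1 - p) / len(sample))  # 95% margin of error

print(f"estimated machine accuracy: {p:.1%} +/- {margin:.1%}")
```

A 400-item sample pins the accuracy estimate down to within about five percentage points at 95% confidence, which is usually tight enough to decide whether the automated system clears an executive's bar.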
3. If you need a high degree of accuracy, do you have 20,000 qualified mentions?
Computer-coded accuracy increases with the number of mentions analyzed: the more mentions you have to work with, the easier it is to reach a higher degree of accuracy. In our experience the threshold is about 20,000. See the chart to the right, based on our experience.
4. What level of detail do you need from your sentiment analysis system?
If you need to track complex messaging, quotes, issues, positioning, or other fine-grained details, computers will be complex to program and slow to deliver results. Chances are you will need humans to get the job done with acceptable accuracy and reasonable speed.
5. Do you run numerous campaigns, each requiring different search terms, different message tracking, and different definitions of positive or neutral?
Again, computers can take weeks to reprogram, test, and fix. If you need fast turnaround on changes to your system, use a human.
###
(Thanks for the illustration: Personal Robot 02 by Franz Steiner.)
Katie Delahaye Paine is CEO of KDPaine & Partners, a company that delivers custom research to measure brand image, public relationships, and engagement. Katie Paine is a dynamic and experienced speaker on public relations and social media measurement.
Katie -- While you raise useful questions for thought, I am troubled by two issues. First, "accuracy" is not an appropriate measure for sentiment analysis. The right statistical test is an inter-observer reliability statistic, of which Krippendorff's alpha is one specifically adapted to content analysis. Second, you and many others throw around statistics such as "most automated sentiment analysis tools get sentiment right about 40% to 60% of the time." This raises the question of the level of agreement between two humans analyzing the same data. Are we sure that they agree more than 60% of the time?
Posted by: David Geddes | May 04, 2011 at 10:18 AM
I can't speak for anyone else's systems, David, but yes, we conduct intercoder reliability testing using Scott's Pi for our human readers, to ensure that they agree at least 88% of the time. If an individual reader's scores fall below 80%, they are given one month to improve. If they don't improve during a second month, they are dismissed.
Posted by: Katie Delahaye Paine | May 05, 2011 at 06:49 AM
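For readers curious what the Scott's Pi test mentioned above actually computes: it is the observed agreement between two coders, corrected for the agreement you would expect by chance given how often each category is used overall. A minimal sketch for nominal sentiment labels (the example labels are invented):

```python
from collections import Counter

def scotts_pi(coder_a, coder_b):
    """Scott's Pi for two coders labelling the same items with nominal categories."""
    assert len(coder_a) == len(coder_b)
    n = len(coder_a)
    # Observed agreement: share of items on which the two coders match.
    p_o = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    # Expected agreement: sum of squared joint proportions of each category,
    # pooling both coders' labels to estimate how often each category occurs.
    joint = Counter(coder_a) + Counter(coder_b)
    p_e = sum((count / (2 * n)) ** 2 for count in joint.values())
    return (p_o - p_e) / (1 - p_e)

a = ["pos", "pos", "neg", "neu", "pos", "neg", "neu", "pos"]
b = ["pos", "neg", "neg", "neu", "pos", "neg", "pos", "pos"]
print(f"Scott's Pi = {scotts_pi(a, b):.3f}")
```

Because the chance correction grows with skew in the category distribution, two coders can agree 88% of the time yet score well below 0.88 on Pi, which is exactly why commenters above distinguish raw agreement from reliability statistics.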
There are a number of issues to be addressed.
In the first instance, any sentiment analysis needs inter-coder reliability with a perspective bias. Variance in factual content does need to be considered.
Interactions are not limited to customers. The financial and political spheres, not to mention the influence of competitive politics, are important too.
This is possible using semantics and harder with human coders.
The problem with using Grunig's six components of relationships is that it is not really very good in content analysis compared to much more granular applications of values analysis (bayesian derived mutuality of semantic concepts being one approach).
I guess, the automated systems that you are using are based on POS tagging and word counts which will always be limited in their applications.
Google, Bing, and Amazon abandoned such approaches a number of years ago.
Posted by: David Phillips | November 23, 2011 at 11:31 AM