Seth Grimes’ recent post, “Never Trust Sentiment Accuracy Claims,” was probably designed to set off a lively exchange about social media analytics. It worked. His challenge to the validity of benchmarking accuracy against human analysts, or accuracy as a goal altogether, raises some of my favorite points.
(Full disclosure: I work for SAS, where we use both human and automated analysis. Our SAS® Social Media Analytics employs automated sentiment analysis.)
I love Seth’s core question: Is “as good as human” a valid objective? Yes, and no, I think.
After all, are we looking for “better” than humans, or more consistent? Is “positive” defined as containing positive keywords associated with a brand, or is it, rather, something we like? If it’s the latter, you need humans – but they’d better be well trained. If it’s the former, consistency wins.
We see these issues pop up all the time. Let’s start with who is good at what.
Humans, compared to computer chips, are better at extrapolating. For example, SAS offers a category of solutions we call “customer intelligence.” (Problem is, few others call it that.) I can teach human coders and computers alike to look for terms, e.g., CRM, marketing automation, marketing optimization, customer segmentation, loyalty program, and so on. But real people often talk around marketing terms, without using the actual terms. So we teach our human coders the concepts behind these terms, and then they can spot other ways of describing marketing activities managed via technology. Computers can’t do that. While they can make some assumptions based on similarity, they still need the terms.
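To make that concrete, here is a minimal sketch (the term list and sentences are hypothetical, not SAS’s actual rules) of how keyword matching catches explicit mentions but sails right past a paraphrase that any trained coder would flag:

```python
# Hypothetical term list and sentences -- a sketch, not SAS's actual rules.
MARKETING_TERMS = {"crm", "marketing automation", "marketing optimization",
                   "customer segmentation", "loyalty program"}

def mentions_customer_intelligence(text):
    """Return True only if one of the literal terms appears in the text."""
    lowered = text.lower()
    return any(term in lowered for term in MARKETING_TERMS)

explicit = "We rolled out a new CRM and a loyalty program last quarter."
paraphrase = "We started grouping shoppers by habits and rewarding repeat buyers."

print(mentions_customer_intelligence(explicit))    # True  -- the literal terms are there
print(mentions_customer_intelligence(paraphrase))  # False -- the concept is there, the terms are not
```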
Computers, on the other hand, don’t make mistakes (assuming you’ve programmed them correctly). They don’t get distracted or change their minds. More importantly, they have no opinions. Example: When Bank Systems & Technology picks up our press release about SAS’ new security analytics platform, I’m thrilled. But is it positive? Not according to the computer, which looks for sentiment indicators. Simply repeating the availability of the product is neutral, a statement of fact. Computers can resist the urge to take intent into account when scoring. So, points for consistency over comprehension, right?
Well, it depends.
It’s the question that will never die: What do you want to measure? In this case, is it expressions of third-party opinion or successful efforts to plant key messages?
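For what it’s worth, “looking for sentiment indicators” often boils down to something like the sketch below (the word lists are hypothetical, not any vendor’s actual model): count the positive cues, count the negative cues, and a plain statement of fact scores zero no matter how happy it makes the PR team.

```python
# A sketch of lexicon-based scoring -- hypothetical word lists, not any vendor's model.
POSITIVE = {"thrilled", "love", "great", "amazing", "excellent"}
NEGATIVE = {"hate", "terrible", "awful", "broken"}

def score(text):
    """Positive cues minus negative cues; zero means neutral."""
    words = text.lower().replace(".", " ").replace(",", " ").split()
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)

announcement = "SAS announces availability of its new security analytics platform."
print(score(announcement))                         # 0 -> neutral: a statement of fact, not an opinion
print(score("I'm thrilled this got picked up!"))   # 1 -> positive, but that's my reaction, not the coverage
```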
And what happens when you encounter a word that screams “negative” for some but not for others? Most standard sentiment taxonomies rightfully consider “fraud” to be a negative term. But SAS offers a product aimed at detecting and preventing fraud. So for us, it’s neutral. As are terms like “killer” (as in “killer app”). For one of our customers, “f***ing amazing” is what they strive for. No plug-and-play vendor can customize sentiment to accommodate individual needs.
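A tool that does allow customization might expose something like the following (a hypothetical sketch, not any particular product’s interface): a generic lexicon plus per-client overrides that turn “fraud” and “killer” neutral for a brand like ours.

```python
# Hypothetical lexicon, overrides, and sentence -- a sketch of per-client customization.
GENERIC_LEXICON = {"fraud": -1, "killer": -1, "amazing": +1}

# Illustrative overrides for a client like SAS: fraud prevention is the product,
# and "killer" usually shows up as "killer app."
SAS_OVERRIDES = {"fraud": 0, "killer": 0}

def score(text, overrides=None):
    """Sum lexicon values for each word, with client overrides applied first."""
    lexicon = dict(GENERIC_LEXICON, **(overrides or {}))
    words = text.lower().replace(".", " ").split()
    return sum(lexicon.get(w, 0) for w in words)

mention = "Their fraud detection is a killer app."
print(score(mention))                 # -2 with the generic lexicon
print(score(mention, SAS_OVERRIDES))  #  0 once the domain overrides are applied
```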
So a lot comes down to defining your terms – “positive” in this case or, maybe, “influence” in another (that’s for another column). But even more comes down to transparency and flexibility. Be skeptical. If your vendors can’t explain to you how they derive sentiment, step cautiously. If they say “natural language processing” but talk only about keywords, send up a flare. And if they make lofty claims of accuracy and expertise before they understand your needs and the context, change the locks.
Diane Lennox is PR Services Manager for SAS, the leader in business analytics software and services. A 30-year veteran in marketing communications and writing for all media, she has spent the past six years supporting SAS' internal PR agency by managing the Global PR Resource Center (internal), acting as international liaison with dozens of country PR managers, guiding PR measurement and monitoring, overseeing communications and media training, supporting the blogging and social media program and providing SEO guidance. She does not do windows.
(Thanks to Across the Litoverse for the image.)