Summary: The author attempted to develop an automated system for rating media mentions using Grunig's relationship concepts. Traditional media was found to include too little expression of sentiment. A second attempt was made with only social media mentions, and some success was achieved. In general, coding with humans was found to be much faster and more accurate than automated coding.
by Katie Delahaye Paine
With the ubiquity of social media today, the notion of using social media as a way to listen to and gauge public opinion has captured the interest of marketers. At the moment, the hot button is sentiment measurement, which is the notion that you can teach a computer what positive and negative tone is by precoding it with key words. So the computer searches for brand proximity to words like “suck” and “hate” as a proxy for negative sentiment.
However, sentiment analysis has been shown to be notoriously inaccurate, unreliable, and therefore not terribly useful. With the addition of human coding to improve the model, most sentiment analysis tools can achieve 75% accuracy at best.
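To make the keyword-proximity idea concrete, here is a minimal sketch in Python. It does not reproduce any vendor's tool: the word list, the `window` parameter, the function name `is_negative`, and the brand name "Acme" are all assumptions made for illustration.

```python
import re

# Hypothetical negative keyword list; a real tool would use a far larger lexicon.
NEGATIVE_WORDS = {"suck", "sucks", "hate", "hates"}


def is_negative(text, brand, window=3):
    """Return True if a negative keyword sits within `window` tokens of the brand."""
    tokens = re.findall(r"[a-z]+", text.lower())
    brand = brand.lower()
    for i, token in enumerate(tokens):
        if token == brand:
            # Look at the tokens on either side of the brand mention.
            nearby = tokens[max(0, i - window): i + window + 1]
            if NEGATIVE_WORDS.intersection(nearby):
                return True
    return False


print(is_negative("Acme support really sucks", "Acme"))
print(is_negative("I don't hate Acme at all", "Acme"))
print(is_negative("Acme opened a new store downtown", "Acme"))
```

The second sentence is flagged as negative even though the writer says the opposite, because proximity matching ignores negation. Misreadings of this kind are one reason keyword-based sentiment scores are unreliable.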
Instead of sentiment, why not relationships?
We postulate that marketers are chasing after the wrong emotion. Rather than attempt to gauge sentiment, it would be far more useful to understand how customers, prospects, employees and other constituents feel about their relationship with your brand. So we decided to try to use the core concepts of the Grunig Relationship model to define an algorithm that would rate how stakeholders feel about an organization based on their conversations in social media.
What we thought we were going to do
We defined a list of key words, phrases and concepts that would be tracked, individually and as a cluster, around each of Grunig’s six components of relationships: Control Mutuality, Trust, Commitment, Satisfaction, Exchange, and Communal. We selected three institutions which we thought would be good candidates for this experiment: Harvard University, Stanford University, and MIT. To ensure comparability, we had to eliminate coverage of sports, since Stanford was headed for the NCAA championships and MIT is not known for its athletic achievement.
We then used human coders to content analyze 2,000 items for the relationship concepts. The idea was to take the terms and concepts that showed up and teach the computer to identify relationship components. The plan was to use an algorithm to score the conversations on a scale of 1 to 5, so that each institution would receive a score on each element of the relationship.
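The planned scoring step might have looked something like the sketch below. The study's actual algorithm is not described here, so everything in the code is an assumption: the phrases, the 1 to 5 rating attached to each phrase, the simple averaging, and the placeholder institution names.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical taxonomy: relationship component -> {phrase: rating on the 1-5 scale}.
TAXONOMY = {
    "Trust": {"rely on": 5, "can't trust": 1},
    "Satisfaction": {"love it here": 5, "disappointed": 2},
}


def score_items(items):
    """Take (institution, text) pairs; return {institution: {component: mean rating}}."""
    ratings = defaultdict(lambda: defaultdict(list))
    for institution, text in items:
        lowered = text.lower()
        for component, phrases in TAXONOMY.items():
            for phrase, rating in phrases.items():
                # Simple substring match; a real system would need smarter matching.
                if phrase in lowered:
                    ratings[institution][component].append(rating)
    return {
        institution: {comp: mean(vals) for comp, vals in comps.items()}
        for institution, comps in ratings.items()
    }


sample = [
    ("University A", "I can rely on the lab staff"),
    ("University A", "I can't trust the housing office"),
    ("University B", "Honestly disappointed with the response"),
]
print(score_items(sample))
```

An item that matches no phrase contributes nothing to any score, which is why a data set with little emotional content leaves very little to average.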
What really happened, and lessons learned
During the human coding phase we learned a tremendous amount about using content analysis to measure relationships.
Lesson #1: You can’t analyze passion if it’s not there.
The first discovery we made was that when you remove sports from any study of the university environment, most of the emotion and passion goes with it. What was left was a great number of discussions about scientific and academic research, very few of which actually contained any sentiment at all. Of the 2,000 total items, only 265 (13%) actually contained any of the relationship concepts we were trying to study.
Lesson #2: You need Twitter and Facebook to analyze sentiment.
No matter what liberals and conservatives may say, the vast majority of traditional media is neutral or balanced, i.e., it doesn’t contain much opinion at all. It also doesn’t reflect what people are directly saying, just what a reporter or an editor has allowed in the piece. So, traditional media does not offer much to work with in a relationship study. You can count the comments in the online versions of these stories, but you also should know that only a small percentage of these comments are actually seen, and then only by a very small percentage of the audience.
In reality, if you want to measure sentiment you need to be analyzing Twitter and Facebook, and that’s the rub. Getting full content from either of those sources is not easy. Most services only get about 15% of Twitter, and even those with a full Firehose license only get about 85%. Facebook poses an even greater challenge since most content providers only provide what is publicly available, which again is a small percentage of the universe.
Personal blogs are another good source of sentiment, and we definitely recommend including them, but you need to exclude the content farms and fake sites to make sure you are getting a true reflection of people’s opinions, not something a bot created.
Lesson #3: Trust and Communal Relationships were the easiest relationship concepts to detect.
Communal Relationships (each party sees mutual benefit in the relationship succeeding) are frequently reflected in an organization’s corporate social responsibility, so community events and good deeds were easily identified as promoting a positive communal relationship. 44% of all items contained some expression of communal relationship and 30% contained references to trust.
Automated coding: Does not compute
Given the relatively low level of sentiment detected and the challenges we had in training humans to code for relationship concepts, our expectations weren’t very high for automated coding. Nonetheless we used what we learned from human coding to create a taxonomy based on the concepts and key words we had discovered.
While it was interesting to discover that the accuracy level for positive coding was better than negative, overall the machine coding data was unusable. There was simply not sufficient data to analyze with any degree of accuracy. Further, we learned that the linguistic modeling tools we were using require a valid/invalid construct rather than the scale model that we wanted to use. Back to the drawing board.
So we tried focusing on social media
We then decided to analyze an entirely different set of data: 12,000 purely social media items about a highly controversial organization that had been frequently in the news during the past year. This time the results were much more interesting. We found that more than half (56%) of the 12,000 items contained some expression of a relationship concept.
Lesson #4: Some concepts are easier than others to identify and score
Once again Communal and Trust were found most frequently, with 17% of mentions conveying trust and 13% suggesting a communal relationship. However, we found that Satisfaction and Trust were most accurately identified. And we learned that some of the other Grunig concepts are virtually impossible to track, such as Control Mutuality and Exchange Relationship.
Taking it to another level, we learned that it’s relatively easy to describe a set of terms to define trust, but teaching a computer takes a lot more time. So, for example, a human knows the difference between “I fell in love with Stanford” and “I fell in love at Stanford.” But it requires a fairly sophisticated algorithm to teach a computer the difference.
Pulling trust out of a posting gets even more complicated. So, if someone posts to Facebook that “I fell in love at Stanford and felt very comfortable openly expressing my feelings for my gay lover,” that clearly expresses trust in the Stanford campus environment. But teaching a computer that the feelings are for the campus and not the lover takes a whole other level of complexity.
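The "with" versus "at" distinction can be written as a single rule, as in the sketch below. This is a toy illustration, not the algorithm used in the study: the regular expression, the function name `classify`, and the returned labels are all assumptions.

```python
import re

# Hypothetical rule: "with <Org>" makes the organization the object of the feeling,
# while "at <Org>" makes it only the place where something happened.
PATTERN = re.compile(r"fell in love (with|at) (\w+)", re.IGNORECASE)


def classify(text, org):
    """Label a "fell in love" statement as being about the organization or merely located there."""
    match = PATTERN.search(text)
    if not match or match.group(2).lower() != org.lower():
        return "no match"
    if match.group(1).lower() == "with":
        return "about organization"
    return "location only"


print(classify("I fell in love with Stanford", "Stanford"))
print(classify("I fell in love at Stanford", "Stanford"))
```

A rule this narrow handles one phrasing only. Applied to the longer Facebook example above, it would return "location only" and miss the expression of trust in the campus environment entirely, which shows how quickly hand-written rules fall behind what a human coder understands at a glance.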
Lesson #5: It takes too long.
The only restriction on getting a computer to do this accurately is time. If we had nothing to do but perfect our automated relationship taxonomies for the next year or two, we’d eventually get it right. For now, for the most part, humans are faster and more accurate.
What’s Next?
Clearly this is an important first step – perhaps just a baby step – in the direction of using content analysis to determine relationship health. We feel pretty confident that, given sufficient emotional content, human coders can accurately glean the concepts of Trust, Communal Relationships, and Satisfaction from social media conversations. As we conduct more such analyses, we will continue to define more words and concepts that will help refine the process and improve accuracy still further. In terms of getting a computer to be as accurate, we clearly have a long way to go.
Further complicating the process is the difficulty in creating a generalized taxonomy. Good taxonomies must be specific to the organization studied. So, to get any generalized taxonomy you will need to model which words or phrases are more important. In order to get there you need a greater quantity of human-coded relationship data to build a statistical model that would validate relationship constructs in social media.
###
Katie Delahaye Paine is CEO of KDPaine & Partners, a company that delivers custom research to measure brand image, public relationships, and engagement. Katie Paine is a dynamic and experienced speaker on public relations and social media measurement.