A process known as sentiment analysis discerns essential truths embedded in blog posts, Facebook pages, Twitter tweets and the like.
An intrepid sleuth makes the rounds in Washington, shrewdly revealing profound secrets that had been hidden in plain sight.
That’s the synopsis of Dan “The Da Vinci Code” Brown’s latest blockbuster, “The Lost Symbol.” It also describes a new tool for decoding data and discovering insights into human behavior that could alter the way government agencies do business.
In the nonfiction scenario, a process known as sentiment analysis allows the government to discern essential truths embedded in the unstructured data that overflows blog posts, Facebook pages, Twitter tweets and the like. Understanding how groups of people feel about various topics could be useful for a wide range of purposes, including preventing terrorist attacks, understanding public opinion abroad and gauging the attitude toward proposed policies at home.
“If you’re able to engage sentiment on particular policies from a broad set of constituents, you’ll get a much more accurate read than you would using most other techniques,” said Larry Levy, chief executive officer of Jodange, which uses linguistic analysis to extract opinions from data. “Millions of people don’t lie.”
Since the beginning of recorded history, public information has been kept in physical formats — papyrus, petroglyphs — and frequently organized into tablets, books and libraries. In a giant leap forward, digitization made it possible to organize, analyze and cross-reference large amounts of structured information stored in databases.
However, in the era of Web 2.0 and social media, most new data exists in the form of unstructured, unanalyzed text that is accumulating at a dizzying rate. By one estimate, the current annual production of unique data is about 40 exabytes; an exabyte is 10^18 bytes, a 1 followed by 18 zeros. Until recently, it would have taken 5,000 years to generate that much new information.
“More and more unstructured information is flying at us,” said Franz Aman, vice president of business intelligence and information management at enterprise software vendor SAP. “It’s very difficult to grasp by traditional tools.”
Using the technique of natural language processing, sentiment analysis attempts to identify, within the churn of information, currents of conviction and ripples of predilection that would otherwise be lost in the raging torrent.
“The CIA and [the National Security Agency] have been using this kind of capability for a long time to pursue terrorist activity,” said Aman, who added that the field has been evolving for about a dozen years.
Now, other government agencies are beginning to use sentiment analysis, too.
- An organization funded by a U.S. government agency wants to use sentiment analysis to monitor Iraqi media, both conventional and social, in the run-up to regional and parliamentary elections. The goal is “to know who the players are, where they are located, what people think about them and their opinions, in order to ensure fair elections and to maintain some cognizance of the security situation,” said Seth Grimes, president of Alta Plana, a business analytics consulting firm.
- A foreign government agency is analyzing unstructured text to monitor social acceptance of its policies. “They’re trying to understand how they are being interpreted across traditional and social media and whether they should change,” Levy said.
- The State Department is using data analysis tools to measure how much a particular country likes or dislikes the United States, an SAP spokesperson said.
- The federal government is experimenting with the use of sentiment analysis to facilitate electronic rule-making, which seeks to increase public participation in policy-making.
“There is unstructured data everywhere,” said Olivier Jouve, vice president of corporate development for text analytics at SPSS, an analytics software producer acquired by IBM. “It is about understanding what people say.”
National mood ring
Sentiment analysis essentially automates labor-intensive polls and surveys for a fraction of the cost. Applications equipped with spidering technology scour the Web and capture opinions — including emoticons, which are little symbols, such as smiley faces — that are relevant to a particular query, such as: What do Americans think about health care reform?
The applications then analyze the unstructured data according to various criteria: who is the opinion holder, what is the topic, and what is the tonality or polarity of the opinion — positive, negative or neutral?
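As a rough illustration of those three criteria, the sketch below tags a snippet of text with an opinion holder, a topic and a polarity. It is a minimal word-counting approach, not the method of any vendor quoted here. The word lists, the `Opinion` record and the function names are all hypothetical, chosen only for this example.

```python
import re
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical, deliberately tiny word lists (assumptions for illustration only).
POSITIVE = {"good", "great", "love", "support", "happy"}
NEGATIVE = {"bad", "terrible", "hate", "oppose", "angry"}


@dataclass
class Opinion:
    holder: str    # who expressed the opinion
    topic: str     # what the opinion is about
    polarity: str  # "positive", "negative" or "neutral"


def polarity(text: str) -> str:
    """Count positive and negative words and report which side wins."""
    words = re.findall(r"[a-z']+", text.lower())
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def extract_opinion(holder: str, text: str, topics: List[str]) -> Optional[Opinion]:
    """Return an Opinion for the first known topic mentioned in the text, if any."""
    lowered = text.lower()
    for topic in topics:
        if topic in lowered:
            return Opinion(holder, topic, polarity(text))
    return None  # the snippet is not relevant to the query


if __name__ == "__main__":
    print(extract_opinion("blogger_a",
                          "I love the health care reform plan",
                          ["health care reform"]))
```

Plain word counting of this kind is exactly what slang defeats: the sketch would score a phrase such as "wicked bad" as negative because it only sees the word "bad."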
“You can roll up the individual sentiments you find in snippets and build an overall view of the happiness scale of the U.S. population on any given topic on any given day,” said Aman, who likens sentiment analysis to “slipping a mood ring on America’s hand.”
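The roll-up Aman describes can be pictured as simple averaging. The sketch below assumes each snippet has already been labeled with a day, a topic and a polarity, and that positive, neutral and negative map to +1, 0 and -1. That scoring scheme, the function name and the sample dates are assumptions made for illustration, not a description of SAP's software.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

# Assumed numeric scale for the three polarities.
SCORES = {"positive": 1, "neutral": 0, "negative": -1}


def roll_up(snippets: Iterable[Tuple[str, str, str]]) -> Dict[Tuple[str, str], float]:
    """Average snippet-level polarity into one score per (day, topic)."""
    totals = defaultdict(int)
    counts = defaultdict(int)
    for day, topic, label in snippets:
        key = (day, topic)
        totals[key] += SCORES[label]
        counts[key] += 1
    # A mean near +1 suggests a happy mood on that topic; near -1, an unhappy one.
    return {key: totals[key] / counts[key] for key in totals}


if __name__ == "__main__":
    # Hypothetical labeled snippets.
    sample = [
        ("2009-10-01", "health care reform", "positive"),
        ("2009-10-01", "health care reform", "negative"),
        ("2009-10-01", "health care reform", "positive"),
        ("2009-10-02", "health care reform", "negative"),
    ]
    for key, mood in sorted(roll_up(sample).items()):
        print(key, round(mood, 2))
```

An unweighted mean treats every snippet equally; a real system would likely need to account for duplicate posts and for how representative the sampled authors are.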
Companies are using the tools to determine customer satisfaction, improve management of brands and reputations, and analyze media, among other applications. Proponents tout it as a direct link to what people think and feel, as opposed to traditional surveys that force respondents to choose among fixed choices.
Sentiment analysis is also emerging as a predictive tool. Knowing how people feel offers insight into how they might act. Financial firms have begun using the tools to forecast the price of stocks, which can rise and fall based on emotion, regardless of objective indicators such as price-earnings ratios.
“They are using sentiment analysis to make money,” said Jeff Catlin, CEO of Lexalytics, a maker of information analysis software.
The technology continues to evolve to more accurately discern meaning in unstructured data. The complexity and nuances of human language are such that they often confound text analytics, said Andy Beal, who helps clients manage their online reputations. Think of the blogger who writes that his new cell phone is “wicked bad.”
At present, sentiment analysis tools are right 70 to 80 percent of the time, but they cost far less than surveying 1,000 people by telephone, Beal said.
“If a government agency can use sentiment analysis to determine the pulse of people affected by a planned project or spending proposal, it will cut down on human input and the expense of market research,” he said.