Data Mining

Predictive analytics allows feds to track outbreaks in real time

Scientists are using social media to track the spread of the flu virus. (CDC image)

The flu spreads fast, but tweets spread faster. Health organizations and federal agencies, including the U.S. Centers for Disease Control and Prevention, are beginning to use predictive analytics of social data to monitor emerging situations like this season’s deadly influenza epidemic.

The CDC is among the agencies that now use social insights gleaned from Google Flu Trends and MappyHealth, predictive tools that take collective web searches and tweets on flu-related symptoms and correlate the data on regional maps. The CDC partners with Google and MappyHealth, which won the Department of Health and Human Services’ NowTrending2012 challenge, to use social media surveillance in the service of public health.

The CDC uses a variety of surveillance methods to track the spread of disease, said Richard Quartertone, health communication specialist at CDC’s Division of Notifiable Diseases and Healthcare Information. They include longstanding techniques such as monitoring hospital emergency room visits, performing laboratory tests and conducting population surveys. Now, epidemiologists also watch trends in web usage and at social-media sites, he said.

"CDC is actively working with partners such as Google and MappyHealth to increase the public health surveillance value of information from social-media sites," Quartertone said.

Google Flu Trends uses aggregate Google search data to provide real-time estimates of flu activity in more than 25 countries. When a user makes a Google search for relevant terms such as "influenza," he or she becomes part of the dataset used by Google Flu Trends to predict flu activity by geographic area.

MappyHealth, meanwhile, mines real-time data from Twitter, looking for health trends through the search of 234 unique terms. Mined data is churned into visual graphs to assist end-users in spotting trends, which are then reported – and reported much more quickly than traditional health data compilation methods can manage.
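As a rough illustration of how term-based trend spotting of this kind might work, the Python sketch below counts tweets per day that mention a tracked term and flags days that rise sharply above the recent average. It is not MappyHealth's code: the three terms stand in for the 234 the article mentions, and the function names, window and threshold are all assumptions made for the example.

```python
from collections import Counter

# Hypothetical stand-ins for the 234 tracked terms, which the article does not list.
TRACKED_TERMS = ("flu", "influenza", "fever")


def daily_mentions(tweets):
    """Count tweets per day that contain any tracked term.

    tweets: iterable of (day, text) pairs.
    Plain substring matching is used, so "flu" would also match "fluent";
    a real system would need stricter matching.
    """
    counts = Counter()
    for day, text in tweets:
        lowered = text.lower()
        if any(term in lowered for term in TRACKED_TERMS):
            counts[day] += 1
    return counts


def rising_days(counts, window=3, factor=2.0):
    """Return days whose count exceeds `factor` times the mean of the
    previous `window` recorded days. Days with no mentions are not in
    `counts`, a simplification of this sketch.
    """
    days = sorted(counts)
    flagged = []
    for i in range(window, len(days)):
        baseline = sum(counts[d] for d in days[i - window:i]) / window
        if counts[days[i]] > factor * baseline:
            flagged.append(days[i])
    return flagged
```

A chart of the daily counts, with the flagged days highlighted, would be one way to present the result to an end-user.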

Traditional health reports prepared by the CDC take weeks, with local and state health departments compiling information and sending it up the federal chain of command.

This season’s particularly extensive flu outbreak actually began in late October, but it wasn’t widely reported in the media until Dec. 3, when the CDC released a public warning highlighting the danger.

Six weeks before that warning, in mid-October, Baltimore-based Sickweather sent out a tweet warning users that the flu season had already arrived.

Sickweather, another data-mining application, had scanned millions of Facebook posts and tweets for mentions of 24 flu-related symptoms, such as the word "fever," and ran them through further linguistic analysis to weed out information unrelated to the flu. That data was then used to plot illness-related mentions on a map.
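The pipeline described here (match symptom words, filter out unrelated uses, then tally by location) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Sickweather's method: the symptom terms, the list of figurative phrases and the function names are hypothetical, and a simple phrase blocklist stands in for the "further linguistic analysis" the company performs.

```python
from collections import Counter

# Hypothetical keywords; the article cites 24 symptoms but names only "fever".
SYMPTOM_TERMS = {"fever", "flu", "chills"}

# Hypothetical phrases that use a symptom word without referring to illness.
FALSE_POSITIVE_PHRASES = ("bieber fever", "cabin fever", "spring fever")


def mentions_illness(text):
    """Return True if the post appears to describe a flu-related symptom."""
    lowered = text.lower()
    # Crude stand-in for linguistic analysis: reject known figurative uses.
    if any(phrase in lowered for phrase in FALSE_POSITIVE_PHRASES):
        return False
    # Compare whole words, with surrounding punctuation stripped.
    words = {word.strip(".,!?\"'#") for word in lowered.split()}
    return bool(words & SYMPTOM_TERMS)


def count_by_region(posts):
    """Tally illness-related posts per region.

    posts: iterable of (region, text) pairs.
    """
    counts = Counter()
    for region, text in posts:
        if mentions_illness(text):
            counts[region] += 1
    return counts
```

The per-region totals are the kind of figure that could then be shaded or pinned on a map.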

Justin Herman, new media manager at the General Services Administration’s Center for Excellence in Digital Government, said predictive analysis of social data is creating new avenues for the public and government to work together.

"Social data, as part of open data, is building new ways for agencies and the public to work together," said Herman, who works with the GSA-led Social Performance Metrics Working Group to build collaboration between agencies in analyzing social data.

"When a federal program manager can see the value and power of social data, it can help them identify emerging trends and develop an approach that will help them meet their unique program goals," Herman said.

Of course, with epidemics like the flu, which has already killed some 40 children this winter, quicker trend-spotting can translate into faster reactions from government agencies like the CDC. Decisions to ship flu vaccines or deploy additional nurses to hard-hit areas can be made sooner with predictive insight.

"Being able to spot general trends as they occur allows federal agencies to be more responsive and can sometimes result in immediate life-saving decisions, while also protecting citizens’ individual privacy," Herman said. "What’s unique about social data is the volume and immediacy of the information, which allows agencies to improve programs faster and more effectively."
