Can Wikipedia forecast the flu?

Wikipedia can tell you not only obscure things, like where Bala Cynwyd is, but also whether the Pennsylvania town is a current hotspot for the flu bug -- if you know how to ask.

Researchers at Los Alamos National Laboratory say they have learned how to forecast the upcoming flu season, and outbreaks of other infectious diseases, by analyzing views of Wikipedia articles.

Research teams at the laboratory recently published “Forecasting the 2013-2014 Influenza Season Using Wikipedia” in the Public Library of Science journal PLOS Computational Biology.

Understanding the dynamics of influenza and other infectious diseases, and forecasting their impact, is fundamental to developing prevention and mitigation strategies. To do that, Los Alamos researchers combined modern data assimilation methods with Wikipedia access logs and Centers for Disease Control and Prevention influenza-like illness (ILI) reports to create a weekly forecast for seasonal influenza.

The research taps into the angst people have when they come down with the flu bug and are searching for more information online. Los Alamos said its researchers found Wikipedia article access logs are highly correlated with historical ILI records and allow for accurate prediction of ILI data several weeks before it becomes available. The researchers' results showed that prior to the peak of the flu season, their forecasting method projected the actual outcome with a high probability.
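
The correlation idea can be illustrated with a small Python sketch. It is not the researchers' code: the function names and the weekly numbers are hypothetical, invented purely to show how one might check whether page views run ahead of ILI reports by a few weeks.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences of at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


def lagged_correlation(views, ili, lag):
    """Correlate page views in week t with ILI in week t + lag."""
    if lag < 0 or lag >= len(views) - 1:
        raise ValueError("lag out of range")
    if lag == 0:
        return pearson(views, ili)
    return pearson(views[:-lag], ili[lag:])


def best_lag(views, ili, max_lag):
    """Return the lag (in weeks) at which views best track later ILI."""
    return max(range(max_lag + 1),
               key=lambda lag: lagged_correlation(views, ili, lag))


# Made-up weekly series: ILI percentage and article views.
# The views are constructed to lead the ILI curve by two weeks.
ili = [1, 1, 2, 4, 7, 9, 7, 4, 2, 1, 1, 1]
views = [200, 400, 700, 900, 700, 400, 200, 100, 100, 100, 100, 100]

lead_weeks = best_lag(views, ili, max_lag=4)
```

A positive best lag means the page-view curve moves before the ILI curve does, which is the property that would let access logs stand in for official reports that have not yet been released.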

“The ability to more accurately forecast the flu season and other infectious diseases will transform the way health departments prepare for and respond to epidemics, ultimately saving lives,” scientist Sara Del Valle said.

“We used techniques often seen in weather forecasting to iteratively tune a model of influenza dynamics based on Wikipedia observations so that our forecast agrees with the most current ILI data,” said researcher Kyle Hickmann, the lead author of the paper and a member of Del Valle’s team.
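
As a rough illustration of tuning a model against incoming observations, the Python sketch below reweights an ensemble of candidate models week by week. It is a deliberately simplified stand-in, not the method from the paper: the SIR model, the parameter values, the noise level `sigma` and all function names are assumptions chosen for the example, and the "observations" are generated synthetically rather than taken from Wikipedia or CDC data.

```python
from math import exp


def simulate_sir(beta, gamma, i0, weeks):
    """Infected fraction per week from a discrete-time SIR model."""
    s, i = 1.0 - i0, i0
    series = []
    for _ in range(weeks):
        series.append(i)
        new_infections = min(s, beta * s * i)  # cannot infect more than S
        recoveries = gamma * i
        s -= new_infections
        i += new_infections - recoveries
    return series


def assimilate(betas, observations, gamma=0.5, i0=0.001, sigma=0.002):
    """Reweight candidate transmission rates as each observation arrives.

    Each candidate's weight is multiplied by a Gaussian likelihood of the
    observed value given that candidate's prediction (assumed noise: sigma).
    """
    runs = [simulate_sir(b, gamma, i0, len(observations)) for b in betas]
    weights = [1.0 / len(betas)] * len(betas)
    for week, observed in enumerate(observations):
        weights = [w * exp(-0.5 * ((run[week] - observed) / sigma) ** 2)
                   for w, run in zip(weights, runs)]
        total = sum(weights)
        if total == 0.0:
            raise ValueError("no ensemble member is consistent with the data")
        weights = [w / total for w in weights]
    return weights


def forecast(betas, weights, weeks, gamma=0.5, i0=0.001):
    """Weighted-average trajectory of the ensemble over the given horizon."""
    runs = [simulate_sir(b, gamma, i0, weeks) for b in betas]
    return [sum(w * run[t] for w, run in zip(weights, runs))
            for t in range(weeks)]


# Hypothetical ensemble of transmission rates and synthetic "truth".
betas = [0.6, 0.7, 0.8, 0.9, 1.0, 1.1, 1.2, 1.3, 1.4]
observations = simulate_sir(0.9, 0.5, 0.001, 8)  # first 8 weeks only

weights = assimilate(betas, observations)
most_likely_beta = betas[weights.index(max(weights))]
season_forecast = forecast(betas, weights, weeks=20)
```

Each new week of data shifts weight toward the ensemble members whose trajectories match what was observed, so the forecast for the remaining weeks is dominated by the models that have agreed with the data so far. Operational schemes such as those used in weather forecasting also adjust the model state itself, which this sketch does not attempt.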

Los Alamos said the methods were applied to the 2013-2014 influenza season but are standard enough to forecast any disease outbreak, given incidence or case count data.

“Disease forecasting is still in its infancy and there is much more to learn in this field,” Del Valle said. “We are continuing to refine our approach so our forecasts can be used for actionable decision-making.”

About the Author

Mark Rockwell is a senior staff writer at FCW, whose beat focuses on acquisition, the Department of Homeland Security and the Department of Energy.

Before joining FCW, Rockwell was Washington correspondent for Government Security News, where he covered all aspects of homeland security from IT to detection dogs and border security. Over the last 25 years in Washington as a reporter, editor and correspondent, he has covered an increasingly wide array of high-tech issues for publications like Communications Week, Internet Week, Fiber Optics News, magazine and Wireless Week.

Rockwell received a Jesse H. Neal Award for his work covering telecommunications issues, and is a graduate of James Madison University.


