Los Alamos uses Wikipedia to predict disease outbreaks

Researchers aim for "an operational disease monitoring and forecasting system with open data and open source code."

They may not like it, but politicians and communicable diseases have something in common. According to a team of researchers at Los Alamos National Laboratory, they're both trackable via Wikipedia.

Using techniques similar to those used to gauge public interest in political candidates before an election, scientists at the Department of Energy's New Mexico research lab said they can monitor and forecast disease outbreaks around the globe by analyzing views of Wikipedia articles and the viewers' locations.

Researchers hope the technique, recently unveiled in a paper published in the scientific journal PLoS Computational Biology, will speed the response to communicable disease hotspots around the world using an open source system built on open data.

"A global disease-forecasting system will improve the way we respond to epidemics," Los Alamos scientist Sara Del Valle said in a Nov. 13 statement. "In the same way we check the weather each morning, individuals and public health officials can monitor disease incidence and plan for the future based on today’s forecast."

According to Los Alamos' statement, Del Valle and her team successfully monitored influenza in the United States, Poland, Japan and Thailand; dengue fever in Brazil and Thailand; and tuberculosis in China and Thailand using the technique. They were able to forecast all but one disease outbreak (in China) at least 28 days in advance based on data gleaned from Wikipedia. According to Del Valle, that shows people start searching for disease-related information on Wikipedia before they seek medical attention.

The research team's paper said similar tracking models have been used for other applications, like estimating the popularity of politicians and political parties. Economic applications have also included attempts to forecast movie ticket sales and stock prices. The last two applications, the team said, were particularly interesting because they include a forecasting component, just like the work with communicable diseases.

The Los Alamos researchers said their disease tracking models could translate across regions: a model trained on public health data from one location could be applied in another. For example, researchers could create models using data from Japan to track and forecast disease in Thailand. This is particularly important for countries that do not offer reliable disease data, they said.
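
To make the idea of training in one place and forecasting in another concrete, here is a minimal Python sketch. It is not the Los Alamos team's code or necessarily their method: it assumes a simple straight-line relationship between article views and case counts a fixed number of time steps later. The function names, the two-step lag and all of the numbers are made up for illustration.

```python
from statistics import mean

def fit_lagged_line(views, cases, lag):
    """Least-squares fit of cases[t + lag] = slope * views[t] + intercept."""
    x = views[:len(views) - lag]   # views that have a matching later case count
    y = cases[lag:]                # case counts `lag` steps after each view count
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx

def forecast(views, slope, intercept):
    """Predicted case count `lag` steps after each view count."""
    return [slope * v + intercept for v in views]

# Hypothetical "location A": article views and reported cases (synthetic data).
views_a = [10, 20, 40, 80, 60, 30, 15, 10]
cases_a = [3, 4, 6, 11, 21, 41, 31, 16]

# Train on location A, assuming cases trail views by two time steps.
slope, intercept = fit_lagged_line(views_a, cases_a, lag=2)

# Apply the same model to views from a hypothetical "location B"
# that has no reliable case data of its own.
views_b = [12, 24, 50]
predicted_b = forecast(views_b, slope, intercept)
print(predicted_b)
```

A real system would need to choose the lag from data, use many articles rather than one, and check how well a model trained in one country actually carries over to another.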

"The goal of this research is to build an operational disease monitoring and forecasting system with open data and open source code," Del Valle said. "This paper shows we can achieve that goal."