Big Data

Data scientists: Stop trying to just predict; look back, and think

Sometimes you miss the forest for the trees, and sometimes you get so caught up in plotting a path through the forest, you never stop to figure out how the forest got there in the first place.

Taking a look backward instead of forward was a major theme at the 2nd annual Federal Big Data Summit sponsored by the Advanced Technology Academic Research Center, a non-profit forum for government, industry and academia to collaborate.

“You absolutely have to look back to make sense of your data,” said Nagesh Rao, chief technologist at the Small Business Administration. Rather than just blindly forging ahead with predictive analytics, looking back provides the opportunity to digest information and figure out why things happen, Rao said.

That deeper understanding – drawing “causal inferences” – is sorely needed.

Right now, humanity has a “component-level understanding of complicated systems, not a system-level understanding,” said DARPA program manager Paul Cohen. That, he said, “is a dangerous place to be in,” as humans feel confident enough to interfere with individual systems yet cannot measure the impact those interferences have on the entire global ecosystem.

Cohen used the example of dumping iron filings into the ocean to promote algae blooms. The algae consume carbon dioxide, die and sink to the bottom of the ocean, effectively trapping the greenhouse gas underwater. While it may seem like a perfect climate change-fighting tactic, Cohen warned that we really don’t understand the broader implications of widespread iron dumping.

Jennifer Bachner, government certificates director of Johns Hopkins University, echoed Rao and Cohen, calling for data studies to get beyond “prediction” and into “causal studies,” which she said will require the cooperation of data scientists, social scientists and others across a wide variety of disciplines.

For Cohen, machines will provide the mulling-over power that humans lack.

“The vast majority of scholarly work we do never gets synthesized into a model of how the world works,” Cohen noted, lamenting the thousands of pages of research that are published each year and then lie unread. “The goal three to five years from now is that machines will read everything that everyone writes.”

Cohen is working toward that goal with DARPA’s Big Mechanism cancer research-processing program.

Humanity will still play the crucial role of inputting the thinking. “[IBM’s] Watson itself doesn’t have a causal understanding of anything,” he said, noting the system can provide an answer to a question only if someone somewhere has already written and uploaded said answer.

The assembled speakers did, of course, tout the predictive power of big data analytics, with Bachner looking forward to predictive policing – “kind of like ‘Minority Report,’ but in the real world” – and Rao noting how big data can help identify emerging trends and opportunities for investment in critical technologies of the future.

The FCC’s Tony Summerlin cautioned that it’s a “waste of time” when people “use data to prove things that are well-known to everyone.”

But for those answers that are not well-known, and for crafting a bigger-picture view of how and why the world works, big data holds immense promise – if humans are willing to take a look back and work with machines, said Cohen.

“The highest quality knowledge is about small numbers of things,” he added, saying all those small bits of information need to be brought together in a more cohesive whole. The end goal: developing “causal knowledge,” not just statistical predictions.

About the Author

Zach Noble is a former FCW staff writer.


