Big Data

The intelligence community's big-data problem


The intelligence community is perhaps the most innovative data collector on the planet, with each of its 17 agencies able to siphon off various pools of information from nearly any source.

Yet the IC collects voluminous amounts of mostly fragmented data, and therein lies a challenge every other body in government struggling to make use of big data can relate to.

“In our world, we’re very good at collecting data, we’re also pretty good at analyzing it – we have to quickly parse out what is valuable,” Roger Hockenberry, a former chief technology officer for the Central Intelligence Agency, said during a panel session March 11 at the Symantec Government Symposium in Washington.

“Our data is always fragmented, and we’re trying to make sense of fragmented data options, which is extremely difficult,” said Hockenberry, who is now a consultant. “How we analyze every piece of data, how we reprocess it to continue to make better sense of what is going on – that is the biggest [challenge] we have, especially when we can’t get complete databases.”

Former National Security Agency contractor Edward Snowden’s public disclosures of classified information have highlighted how the NSA and other agencies collect various sorts of signals intelligence. A significant amount of this data doesn’t come packaged neatly for ingestion and analysis in any open-source or proprietary platform. Social media feeds and emails, for example, represent large but highly unstructured datasets. To “normalize” that kind of unstructured data in a way that it becomes useful continues to be a major challenge, Hockenberry said.
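As a rough illustration of what "normalizing" can mean in practice, the Python sketch below maps an email and a social media post onto one common record layout. It is a minimal sketch under stated assumptions, not a description of any agency's tooling: the function names, the four-field schema and the shape of the social media post (`user`, `created`, `body`) are all hypothetical.

```python
from datetime import datetime, timezone
from email import message_from_string
from email.utils import parsedate_to_datetime


def normalize_email(raw: str) -> dict:
    """Map a raw RFC 5322 email onto the shared (hypothetical) schema."""
    msg = message_from_string(raw)
    sent = parsedate_to_datetime(msg["Date"])  # timezone-aware when an offset is given
    return {
        "source": "email",
        "author": msg["From"],
        "timestamp": sent.astimezone(timezone.utc).isoformat(),
        "text": msg.get_payload().strip(),  # assumes a plain, non-multipart body
    }


def normalize_post(post: dict) -> dict:
    """Map a social media post (assumed keys: user, created, body) onto the same schema."""
    sent = datetime.fromtimestamp(post["created"], tz=timezone.utc)
    return {
        "source": "social",
        "author": post["user"],
        "timestamp": sent.isoformat(),
        "text": post["body"].strip(),
    }


raw_email = (
    "From: a@example.com\n"
    "Date: Tue, 11 Mar 2014 09:30:00 -0400\n"
    "\n"
    "Meeting moved.\n"
)
post = {"user": "@someone", "created": 1394544600, "body": " Meeting moved. "}

records = [normalize_email(raw_email), normalize_post(post)]
```

Once both sources share the same fields and a single time zone, the records can be sorted, joined or searched together, which is the point of the exercise. Real feeds are far messier than this (missing headers, multipart bodies, inconsistent encodings), which is where most of the effort goes.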

To conduct its large-scale analytics effectively, the CIA uses a mixture of open-source and commercial products built on a data-science-oriented reference architecture that sprang up from one of its small labs in the past decade. The CIA started with OpenStack and added commercial products in various places to note differences and build an effective and scalable solution.

Hockenberry said platforms and tools differ in usefulness depending on the environment in which they’re operating, and that logic also carries over to the post-analytic visualizations a dataset produces.

“You have to decide the right mix,” said Hockenberry, adding that big data forces analysts or data scientists to be creative in how they ask questions.

The intelligence community is at the forefront of big data as a technology, but even at its most effective levels, analyzing piles of unstructured, fragmented data is challenging. Algorithms will improve and data holders will inevitably learn to ask better questions of data, yet as the deluge of unstructured information continues to pour forth, finding meaningful signal in the noise is likely to remain problematic for some time.

“It’d be nice if al-Qaeda would ship us all their records in a nice, standard format, but they don’t,” Hockenberry said.  

About the Author

Frank Konkel is a former staff writer for FCW.
