Finding structure for unstructured data
- By Frank Konkel
- Jan 09, 2013
If hype equaled performance, big data and cloud computing would already be unmatched cost-cutting, efficiency-increasing, bottom line-building tools in the federal space.
Those end results are not yet reality, but innovative techniques are already being put to use by certain agencies as they take advantage of the capabilities of big data in a cloud environment to monitor fraud and search for terrorists.
Intelligence analysts are using one such approach, called the Cloud Analytics Reference Architecture, to compile all available intelligence data into an easily analyzed pool of data, according to Booz Allen Hamilton, which collaborated with the government to develop the technique.
“Reference architecture is simply a new way of looking at data, but one that revolutionizes our ability to gain knowledge and insight,” said Mark Jacobsohn, senior vice president at Booz Allen Hamilton, in a new paper titled “Delivering on the Promise of Big Data and the Cloud.”
The paper was one of two cloud-related papers the firm released on Jan. 9.
Jacobsohn said reference architecture does away with conventional data and analytics silos, consolidating all information into a single medium called a “data lake” that is designed to foster connections. That consolidation, he said, reduces complexity and creates efficiencies that improve data visualization, making insights easier for analysts to reach.
Reference architecture can be deployed into a cloud environment, ingesting petabytes of raw data and organizing it in a single structure that is then open for analysis.
“Imagine the data lake as the largest spreadsheet you ever saw, with billions and billions of cells,” said Booz Allen Hamilton executive vice president Mark Herman. “This isn’t a cloud solution that hasn’t been implemented or 'should' work. The most important thing about reference architecture is that it is being used in operational systems, we’ve already used it successfully for government clients and we know it works.”
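To make the spreadsheet analogy concrete, here is a minimal Python sketch of the general idea: records from separate silos are ingested into one shared pool of tagged cells, which can then be searched for values that connect sources. It is an illustration only, not Booz Allen Hamilton's implementation; the record contents, field names and the functions `ingest` and `find_connections` are all hypothetical.

```python
from collections import defaultdict

# Hypothetical records from two separate "silos" (all names and fields are made up).
travel_records = [
    {"person": "A. Smith", "city": "Paris"},
    {"person": "B. Jones", "city": "Cairo"},
]
payment_records = [
    {"payer": "A. Smith", "amount": 950},
    {"payer": "C. Lee", "amount": 120},
]


def ingest(lake, source, records):
    """Append raw records to the shared pool, one tagged cell per field."""
    for row_id, record in enumerate(records):
        for field, value in record.items():
            lake.append(
                {"source": source, "row": row_id, "field": field, "value": value}
            )


def find_connections(lake):
    """Return values that appear in more than one source, with those sources."""
    sources_by_value = defaultdict(set)
    for cell in lake:
        sources_by_value[cell["value"]].add(cell["source"])
    return {
        value: sorted(sources)
        for value, sources in sources_by_value.items()
        if len(sources) > 1
    }


# One pool for everything, instead of one store per silo.
lake = []
ingest(lake, "travel", travel_records)
ingest(lake, "payments", payment_records)

connections = find_connections(lake)
print(connections)
```

Because every cell carries its source tag, a value recorded under different field names in different silos can still be matched once it sits in the common pool. A system handling petabytes would rely on distributed storage and indexing rather than an in-memory list.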
Without giving too many specifics, Herman said an intelligence agency has implemented the approach to search for terrorist threats, while the Federal Deposit Insurance Corporation, the Securities and Exchange Commission and the Internal Revenue Service could benefit from using similar models for fraud detection. Meanwhile, the National Security Agency is using similar technology to improve performance and mission effectiveness.
Jacobsohn claimed the visualization capabilities that a reference architecture provides could be as important to decision-making as pie charts and bar graphs were in the 1950s and 1960s. “Reference architecture will do the same – but this time with big data,” he said.
Yet as attractive as cloud-based big data solutions might seem, Herman offered caution to federal agencies looking to implement them, particularly those that have recently made large investments in legacy IT systems.
“You want to look at where you’re at in the life-cycle of your IT,” Herman said. “If you’ve just made a huge investment in something like a really expensive computer system, going to the cloud at that point is not going to save you money. But if you’re an agency at the end of your IT life-cycle, now you can invest and gain efficiencies. It all depends on IT liquidity.”