Tapping a data mother lode

About 85 percent of the information in the databases of agencies or private corporations is so-called unstructured data — e-mail messages, word-processing documents and other files that do not fit neatly into organized rows and columns — but rarely do organizations tap into it.

Looking to reverse that trend, In-Q-Tel, the venture capital arm of the CIA, has invested more than $1 million in Stratify Inc., a specialist in managing such data, and signed on as a customer.

Stratify's tool automatically organizes millions of documents, e-mail messages and Web pages into an easy-to-navigate hierarchy that can be integrated with structured data, allowing analysts to draw new insights from immense bodies of information.

It's the kind of analysis that usually happens only with the quantitative data (dollars spent, number bought) stored in traditional databases.

"The goal here is to help the company develop products that government agencies can use off the shelf, because it's cheaper, quicker to implement and [it costs less] to maintain," said Gilman Louie, chief executive officer and president of In-Q-Tel.

In-Q-Tel contacted Stratify in March and in less than six months had agreed to invest in the firm and buy its software for internal use, said Nimish Mehta, Stratify's president and CEO.

There's a compelling argument for getting a better grasp on that unstructured data, Mehta said.

"Key corporate insight resides in that [unstructured] information," Mehta said. Management is relying only on structured data — which is about 15 percent of the total data — but "when running an agency, they ought to be using all the resources and data available.... This tool allows the same kind of structured access to unstructured information as, for example, an Oracle [Corp.] database gives for structured data."

Ramon Barquin, president and CEO of Barquin and Associates Inc., an information technology consulting firm specializing in knowledge management, said agencies have recognized the need to dig into their unstructured data for some time and that the events of Sept. 11 have only reinforced that need.

Barquin said Stratify's software could aid an agency that is sorting through myriad pieces of unstructured data "just fishing for things that have a bearing on that problem, and [that needs to] manage and find content that is going to be helpful in that problem solving."

In-Q-Tel officials agree.

"In a crisis like [the Sept. 11 attacks], people are easily buried in too much information, but every piece of information is important," Louie said. If you normally receive 30 e-mail messages a day and now receive 300 or 3,000, "you need a tool to organize the information that allows you to ingest and digest what's there and more easily categorize it."

The software can handle the entire Microsoft Corp. Office suite and Adobe Systems Inc.'s PDF, as well as Web-based HTML data and text. At the urging of In-Q-Tel, Stratify is now finishing work on support for Microsoft Exchange and Lotus Development Corp. Notes and adding support for western European and Middle Eastern languages, both of which should make its products more attractive to government buyers, Mehta said.

"We told Stratify, as part of their commercial strategy, that they can't be English only," Louie said. "And you better support Microsoft Exchange and Lotus Notes...to compete in the knowledge management world."

Stratify's Mehta said the company has had a number of discussions with federal agencies, and "by summer of next year, we expect to announce several significant government relationships."

