Taxonomy use grows

Federal agencies are increasingly likely to use and understand the system of data classification that goes hand in hand with XML.

Federal agencies are increasingly likely to use and understand taxonomies, the data classification systems that go hand in hand with Extensible Markup Language (XML) in organizing data, according to industry experts.

Taxonomies are especially critical in making unstructured data -- e-mails, word processor documents and other sources where data items are not entered into predefined fields -- sensible to automated search tools.

Gartner Inc. analyst Rita Knox discussed the topic today at a forum hosted by Convera, which develops XML-based search technology for both commercial and government customers.

Standards bodies can develop taxonomies. One example is eXtensible Business Reporting Language (XBRL), created by the nonprofit XBRL International Inc. XBRL's purpose is to set a common reporting platform for complex financial information, so that one company's XML documents can be understood by another company's systems. Individual organizations can develop their own taxonomies as well.

Taxonomies will help federal agencies meet mandates such as the E-Government Act, Knox noted. However, agencies should not underestimate the amount of work that goes into first creating, and then maintaining and updating, an internal taxonomy.

"You need [information technology] people, but you really need some business people, you need people who understand the business resources" available to an agency, Knox said.

Intelligence agencies, followed by law enforcement agencies, are ahead of others in implementing the technology, said Sean Alger, Convera's vice president and general manager of the federal sector. However, "I think there's still a huge education curve," he said.

Agencies could play a key role in showing commercial businesses the usefulness of taxonomies, Knox added. "If the federal sector does something, it tends to be more organized," she said. "It's often across bigger groups, and it's more public."