Xerox announces categorization software

Research scientists at Xerox Research Centre Europe say they have perfected a new method for automatically categorizing electronic messages and documents for future retrieval.

The method uses unnamed software that performs what the scientists call "deep linguistic analysis." The technique could be useful, for example, for categorizing documents that should be preserved as federal records, the scientists said. Written in Java, the software can be integrated into existing document management and workflow systems.

"It's exciting news if true," said J. Timothy Sprehe, president of Sprehe Information Management Associates Inc., a consulting company in Washington, D.C. "There's enormous interest in auto-categorizing e-mail," especially among federal records managers.

Eric Gaussier, a research scientist at the center, said the new software represents an advance over existing categorization software, which is offered in some products and in the public domain. The software recognizes, for example, that words can have several meanings, depending on their context. It also recognizes that different words can mean the same thing, he said.

Since 1993, the research center has been developing linguistic analysis tools for different uses and in 20 languages, Gaussier said. The categorization software is a new use for those tools and for machine learning, for which the center is also known.

Such tools are badly needed, Sprehe said. In most federal departments, the volume of e-mail has grown so large that having people categorize messages by hand for preservation as federal records is nearly impossible. "It's no longer a practical solution," he said.

However, most experts in the field of records management say that automated filtering of records still leaves much to be desired. "The general conclusion is that auto-categorization is not yet ready for prime time," Sprehe said. "Everyone who is interested in this will say they want to see the proof first."
