National Library of Medicine prescribes XML

XML is the technology behind the largest database of published medical information in the world.

XML helps the U.S. National Library of Medicine input, store, process and output the data in its MEDLINE database, said Simon Liu, director of information systems at NLM.

MEDLINE is the largest database of published medical information anywhere, holding more than 11 million citations to articles published in more than 40,000 medical journals from 70 countries. NLM has been using XML in the MEDLINE system for almost three years, making it one of the government's early adopters of the technology, Liu said.

"In the past, [publishers] were using hard copies to send it to us and then we had to manually input it," Liu said. "But now we use an XML format for the input process and into the metadatabase."

For storage, NLM currently uses mostly Oracle Corp. databases and a few XML databases, but Liu expects a large jump in the amount of information stored in XML format by the end of next year.

Because MEDLINE is the largest medical citation resource on the planet, it has users all over the world, and they now receive data, or "output," generated in an XML format, Liu said.

"In the past, we stored in a plain text structure, and output was in a different format also," he said. "But XML supports Unicode [a universal character-encoding standard], so all users from around the globe who speak different languages can use it. XML is portable from machine to machine and system to system, but also from language to language."
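The portability Liu describes can be illustrated with a minimal sketch in Python. It assumes a made-up record layout: the element names `Citation` and `Title` and the sample title are hypothetical and are not taken from NLM's actual MEDLINE format. The sketch writes a non-English title to UTF-8 encoded XML and reads it back unchanged.

```python
import xml.etree.ElementTree as ET

# Hypothetical record: the element names and the title are illustrative only,
# not NLM's real MEDLINE structure.
original_title = "心臓病の研究"

citation = ET.Element("Citation")
ET.SubElement(citation, "Title").text = original_title

# Serialize to bytes; the XML declaration records the encoding,
# so any conforming parser on any system can decode the text.
payload = ET.tostring(citation, encoding="utf-8")

# Parse the bytes back, as a receiving system would.
restored = ET.fromstring(payload)
title = restored.findtext("Title")
```

Because the encoding travels with the document, the receiving side needs no out-of-band agreement about character sets.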

One drawback of using XML is that NLM has to write its own document type definition (DTD) files to produce the format MEDLINE requires. A DTD defines the structure of a particular kind of XML document: which elements and attributes may appear, and in what order.

"In the future, doing that should be easier, assuming the vendors come up with more tools to allow us to do customization," Liu said. "Now, for XML-based DTDs, we have to do it by ourselves."
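To give a sense of what a hand-written DTD looks like, here is a minimal sketch in Python. The element names (`Citation`, `Title`, `Journal`, `Author`) and the `lang` attribute are hypothetical and are not NLM's actual MEDLINE DTD. Python's standard-library parser checks only that a document is well-formed and does not validate against a DTD, so the sketch pairs the DTD with a small hand-written check that mirrors its rules.

```python
import xml.etree.ElementTree as ET

# Hypothetical DTD: a Citation holds one Title, one Journal, then one or
# more Authors, and must carry a lang attribute. Not NLM's real DTD.
CITATION_DTD = """\
<!ELEMENT Citation (Title, Journal, Author+)>
<!ELEMENT Title (#PCDATA)>
<!ELEMENT Journal (#PCDATA)>
<!ELEMENT Author (#PCDATA)>
<!ATTLIST Citation lang CDATA #REQUIRED>
"""

# The DTD is embedded in the document as an internal subset.
DOCUMENT = f"""<!DOCTYPE Citation [
{CITATION_DTD}]>
<Citation lang="en">
  <Title>Example article title</Title>
  <Journal>Example Journal of Medicine</Journal>
  <Author>A. Author</Author>
  <Author>B. Author</Author>
</Citation>
"""


def matches_citation_dtd(root):
    """Hand-written structural check mirroring the hypothetical DTD above."""
    tags = [child.tag for child in root]
    return (
        root.tag == "Citation"
        and "lang" in root.attrib
        and tags[:2] == ["Title", "Journal"]
        and len(tags) >= 3
        and all(tag == "Author" for tag in tags[2:])
    )


# The parser reads the DOCTYPE but enforces well-formedness only.
root = ET.fromstring(DOCUMENT)
is_valid = matches_citation_dtd(root)
```

A validating parser would enforce the DTD directly and make the hand-written check unnecessary; the check appears here only to keep the sketch within the standard library.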
