National Library of Medicine prescribes XML

XML is the technology behind the largest database of published medical information in the world.

XML helps the U.S. National Library of Medicine input, store, process and output the data in its MEDLINE database, said Simon Liu, director of information systems at NLM.

MEDLINE is the largest database of published medical information anywhere, holding more than 11 million citations to articles published in more than 40,000 medical journals from 70 countries. NLM has been using XML in the MEDLINE system for almost three years, which makes it one of the government's early adopters of the technology, Liu said.

"In the past, [publishers] were using hard copies to send it to us and then we had to manually input it," Liu said. "But now we use an XML format for the input process and into the metadatabase."

For storage, NLM currently uses mostly Oracle Corp. databases and a few XML databases, but Liu expects a large jump in the amount of information stored in XML format by the end of next year.

Because it is the largest medical citation resource on the planet, MEDLINE has users all over the world, and they now receive output generated in XML format, Liu said.

"In the past, we stored in a plain text structure, and output was in a different format also," he said. "But XML supports Unicode [a character encoding standard], so all users from around the globe who speak different languages can use it. XML is portable from machine to machine and system to system, but also from language to language."

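As a rough illustration of the portability Liu describes, the sketch below writes a citation record with titles in two languages to UTF-8 bytes and reads it back unchanged. The element names (Citation, Title) and the sample titles are assumptions made up for this example, not NLM's actual format.

```python
import xml.etree.ElementTree as ET

# Hypothetical citation record; element names are illustrative, not NLM's.
record = ET.Element("Citation")
ET.SubElement(record, "Title", lang="ja").text = "心臓病の研究"
ET.SubElement(record, "Title", lang="fr").text = "Étude des maladies cardiaques"

# Serialize to UTF-8 bytes, as one system might send the record to another.
payload = ET.tostring(record, encoding="utf-8")

# Parse the bytes back, as the receiving system would.
parsed = ET.fromstring(payload)
titles = {t.get("lang"): t.text for t in parsed.findall("Title")}
```

Because the text travels as Unicode, the receiving side recovers the same characters regardless of which language the title is written in.
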
One drawback of using XML is that NLM has to write its own document type definition (DTD) files to produce the format MEDLINE requires. A DTD defines the structure of a particular kind of XML document: which elements and attributes it may contain and how they nest.
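To give a sense of what such a file looks like, here is a minimal sketch of a DTD embedded in an XML document. The DTD and the record are hypothetical and far simpler than anything MEDLINE would need. Python's standard parser reads the DTD but does not enforce it, so the final check compares the child elements against the declared order by hand; a validating parser would do this automatically.

```python
import xml.etree.ElementTree as ET

# Hypothetical DTD and record; these element names are illustrative only.
DOCUMENT = """<?xml version="1.0"?>
<!DOCTYPE Citation [
  <!ELEMENT Citation (Title, Journal, Year)>
  <!ELEMENT Title (#PCDATA)>
  <!ELEMENT Journal (#PCDATA)>
  <!ELEMENT Year (#PCDATA)>
]>
<Citation>
  <Title>An example article</Title>
  <Journal>Example Journal of Medicine</Journal>
  <Year>2001</Year>
</Citation>
"""

# The standard-library parser checks well-formedness only; it does not
# validate the document against the DTD.
root = ET.fromstring(DOCUMENT)

# Manual stand-in for validation: the DTD says a Citation holds exactly
# a Title, a Journal and a Year, in that order.
child_tags = [child.tag for child in root]
matches_model = child_tags == ["Title", "Journal", "Year"]
```

The first ELEMENT line is the part that does the defining: it states which child elements a Citation must contain and in what order.
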

"In the future, doing that should be easier, assuming the vendors come up with more tools to allow us to do customization," Liu said. "Now, for XML-based DTDs, we have to do it by ourselves."
