Digital standards come first

Before agencies digitize their records, LOC group must develop standards

There are no governmentwide standards for digitizing books, records, photos, maps, films and other analog materials. But federal agencies are working together to create standards for bringing millions of creative works into the digital world.

Representatives from the Library of Congress, the Government Printing Office, the National Archives and Records Administration, the Transportation Department and other organizations are establishing guidelines for a massive digitization project.

The Federal Digitization Standards Working Group of the National Digital Strategy Advisory Board (NDSAB) is developing governmentwide standards or guidelines that will help agencies preserve documents and other works and share them.

The board is part of the Library of Congress’ National Digital Information Infrastructure and Preservation Program (NDIIPP), whose purpose is to foster governmentwide collaboration and public/private consensus on standards for creating new digital works. Standards for digitizing various works would benefit librarians, archivists, researchers and businesses, said Michael Stelmach, who leads NDSAB’s Federal Digitization Standards Working Group. However, for an agency such as NARA, which is trying to digitize 9 billion federal records, the most pressing need is best practices for digitizing paper documents.

Digitizing a document requires making decisions about the type of equipment to use, the appearance of the digital representation and the format in which the digital document will be stored. Those decisions affect the usability, integrity and longevity of the digital document, officials say.

Records managers also must create metadata — the information describing a document’s technical specifications — and establish basic descriptive metadata about a particular document, said Amanda Wilson, DOT’s NDSAB working group representative and director of the agency’s digital National Transportation Library. “We are hoping to coordinate with state [and local] Department of Transportation libraries and other transportation agencies throughout the country.”

The standards that NDSAB develops most likely will resemble a menu offering choices among different specifications and types of equipment, she added.

NDSAB will post the draft standards for public comment.

“Basically there will be a platform of standards that says, ‘If you are going to come in and scan our stuff, here is where the bar is set,’ ” Stelmach said.

Agencies are only beginning what will be a long and expensive digitization project, but standardization will be good for businesses and agencies, said Scott Christensen, vice president of electronic production at iArchives. That company scanned newspapers for the Library of Congress’ National Digital Newspaper Program, a searchable public database of newspapers that had been available on microfilm.

“If it’s in a common format, it makes it that much quicker and easier to obtain,” Christensen said. Governmentwide guidelines would lower costs for vendors and open up the marketplace, he added.

The usability of digital documents depends largely on their metadata. However, some organizations also use optical character recognition software to search for key words in digitized documents. A reasonable rate for OCR accuracy is at least 90 percent, which in most cases is sufficient for good search results, Stelmach said.
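
One way to check a scan against a threshold like the 90 percent figure is to compare the OCR output with a hand-corrected transcription of the same page. The sketch below is illustrative only: the article does not say how accuracy is measured, so it assumes character-level accuracy based on edit distance, and the function names and the 0.90 default are hypothetical choices made for this example.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum single-character insertions,
    deletions and substitutions needed to turn a into b."""
    prev = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        curr = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            curr.append(min(prev[j] + 1,          # delete ch_a
                            curr[j - 1] + 1,      # insert ch_b
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def ocr_character_accuracy(reference: str, ocr_output: str) -> float:
    """Share of reference characters recovered correctly (assumed metric)."""
    if not reference:
        return 1.0 if not ocr_output else 0.0
    errors = edit_distance(reference, ocr_output)
    return max(0.0, 1.0 - errors / len(reference))


def meets_threshold(reference: str, ocr_output: str,
                    threshold: float = 0.90) -> bool:
    """True when accuracy reaches the threshold (0.90 assumed from the article)."""
    return ocr_character_accuracy(reference, ocr_output) >= threshold
```

Word-level accuracy would be stricter than this character-level measure, so a working group setting a bar would need to state which one it means.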

Metadata describing the structural characteristics of an original document, such as type of work and volume number, are used for cataloging and searching. Many librarians use the Metadata Encoding and Transmission Standard (METS) to locate digital materials.
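
As a rough illustration of structural metadata, the sketch below builds a stripped-down METS-style structural map for one digitized volume using Python's standard library. The element and attribute names (`mets`, `structMap`, `div`, `TYPE`, `LABEL`, `ORDER`, `ORDERLABEL`) and the namespace come from the METS schema, but the helper function, its parameters and the sample values are hypothetical, and the output has not been validated against the schema. A real METS document would also carry file, descriptive and administrative sections.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
ET.register_namespace("mets", METS_NS)


def build_struct_map(work_type: str, title: str,
                     volume_number: int, page_count: int) -> ET.Element:
    """Return a minimal METS-style tree: one volume div containing page divs."""
    root = ET.Element(f"{{{METS_NS}}}mets", {"LABEL": title})
    struct_map = ET.SubElement(root, f"{{{METS_NS}}}structMap",
                               {"TYPE": "physical"})
    # The type of work and the volume number are recorded on the top-level div.
    volume = ET.SubElement(struct_map, f"{{{METS_NS}}}div", {
        "TYPE": work_type,
        "LABEL": f"{title}, volume {volume_number}",
        "ORDERLABEL": str(volume_number),
    })
    # One child div per scanned page, in reading order.
    for page in range(1, page_count + 1):
        ET.SubElement(volume, f"{{{METS_NS}}}div",
                      {"TYPE": "page", "ORDER": str(page)})
    return root


if __name__ == "__main__":
    # Hypothetical sample values, for illustration only.
    tree = build_struct_map("newspaper", "Example Gazette", 12, 3)
    print(ET.tostring(tree, encoding="unicode"))
```

Because the type of work and volume number sit in predictable attributes, a catalog can index them without opening the page images.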

“From the point of view from an organization like the Library of Congress where [officials] are dealing with huge amounts of content, having standards for how materials are actually digitized is incredibly important,” said Jerry McDonough, one of METS’ creators and assistant professor of library science at the University of Illinois at Urbana-Champaign.

“The sad reality is that we can make our own decision in the library community about how we want to digitize our analog material, but there are publishers out there who are going to have their own standards,” McDonough said. “They’re not driven by the same sort of preservation and access concerns as the library community is.”

About the Author

Ben Bain is a reporter for Federal Computer Week.
