Digital standards come first

Before agencies digitize their records, LOC group must develop standards

There are no governmentwide standards for digitizing books, records, photos, maps, films and other analog materials. But federal agencies are working together to create standards for bringing millions of creative works into the digital world.

Representatives from the Library of Congress, the Government Printing Office, the National Archives and Records Administration, the Transportation Department and other organizations are establishing guidelines for a massive digitization project.

The Federal Digitization Standards Working Group of the National Digital Strategy Advisory Board (NDSAB) is developing governmentwide standards or guidelines that will help agencies preserve documents and other works and share them.

The board is part of the Library of Congress’ National Digital Information Infrastructure and Preservation Program (NDIIPP), whose purpose is to foster governmentwide collaboration and public/private consensus on standards for creating new digital works. Standards for digitizing various works would benefit librarians, archivists, researchers and businesses, said Michael Stelmach, who leads NDSAB’s Federal Digitization Standards Working Group. However, for an agency such as NARA, which is trying to digitize 9 billion federal records, best practices for digitizing paper documents are the most pressing need.

Digitizing a document requires making decisions about the type of equipment to use, the appearance of the digital representation and the format in which the digital document will be stored. Those decisions affect the usability, integrity and longevity of the digital document, officials say.

Records managers also must create metadata — the information describing a document’s technical specifications — and establish basic descriptive metadata about a particular document, said Amanda Wilson, DOT’s NDSAB working group representative and director of the agency’s digital National Transportation Library. “We are hoping to coordinate with state [and local] Department of Transportation libraries and other transportation agencies throughout the country.”

The standards that NDSAB develops most likely will resemble a menu offering choices among different specifications and types of equipment, she added.

NDSAB will post the draft standards for public comment.

“Basically there will be a platform of standards that says, ‘If you are going to come in and scan our stuff, here is where the bar is set,’ ” Stelmach said.

Agencies are only beginning what will be a long and expensive digitization project, but standardization will be good for businesses and agencies, said Scott Christensen, vice president of electronic production at iArchives. That company scanned newspapers for the Library of Congress’ National Digital Newspaper Program, a searchable public database of newspapers that had been available on microfilm.

“If it’s in a common format, it makes it that much quicker and easier to obtain,” Christensen said. Governmentwide guidelines would lower costs for vendors and open up the marketplace, he added.

The usability of digital documents depends largely on their metadata. However, some organizations also use optical character recognition software to search for key words in digitized documents. A reasonable rate for OCR accuracy is at least 90 percent, which in most cases is sufficient for good search results, Stelmach said.
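The article does not say how the 90 percent figure is measured. As a rough illustration only, one common approach is character-level accuracy: compare OCR output against a hand-checked transcription and count the edits needed to turn one into the other. The sketch below assumes that measure; the function names and the 0.90 default are illustrative and are not part of any NDSAB guideline.

```python
def edit_distance(reference: str, hypothesis: str) -> int:
    """Levenshtein distance: insertions, deletions and substitutions."""
    previous = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        current = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            substitution = previous[j - 1] + (ref_char != hyp_char)
            deletion = previous[j] + 1
            insertion = current[j - 1] + 1
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1]


def character_accuracy(reference: str, ocr_output: str) -> float:
    """Share of reference characters recovered correctly (assumed metric)."""
    if not reference:
        raise ValueError("reference transcription must not be empty")
    errors = edit_distance(reference, ocr_output)
    # Clamp at zero: badly garbled output can need more edits than characters.
    return max(0.0, 1.0 - errors / len(reference))


def meets_threshold(reference: str, ocr_output: str, threshold: float = 0.90) -> bool:
    """True when accuracy reaches the threshold (0.90 mirrors the figure quoted above)."""
    return character_accuracy(reference, ocr_output) >= threshold
```

In practice a project would run a check like this on a sample of pages with trusted transcriptions, since full ground truth is rarely available for an entire collection.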

Metadata describing the structural characteristics of an original document, such as type of work and volume number, are used for cataloging and searching. Many librarians use the Metadata Encoding and Transmission Standard (METS) to locate digital materials.
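To make the idea of structural metadata concrete, here is a minimal sketch that builds a METS-style structural map using Python's standard library. The `mets`, `structMap` and `div` elements, their attributes and the namespace come from the METS schema; the function name and the sample values are hypothetical, and the output is a fragment rather than a complete, validated METS document.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
ET.register_namespace("mets", METS_NS)


def build_struct_map(work_type: str, title: str, volume_numbers: list) -> ET.Element:
    """Build a bare METS structMap describing a work and its volumes.

    Hypothetical helper for illustration; real METS records also carry
    descriptive, administrative and file sections that are omitted here.
    """
    root = ET.Element(f"{{{METS_NS}}}mets")
    struct_map = ET.SubElement(root, f"{{{METS_NS}}}structMap", {"TYPE": "physical"})
    # Top-level div records the type of work and a human-readable label.
    work = ET.SubElement(
        struct_map, f"{{{METS_NS}}}div", {"TYPE": work_type, "LABEL": title}
    )
    # One nested div per volume; ORDER is the sequence, ORDERLABEL the printed number.
    for order, number in enumerate(volume_numbers, start=1):
        ET.SubElement(
            work,
            f"{{{METS_NS}}}div",
            {"TYPE": "volume", "ORDER": str(order), "ORDERLABEL": str(number)},
        )
    return root


if __name__ == "__main__":
    # Sample values are invented for the example.
    record = build_struct_map("newspaper", "Example Gazette", [12, 13])
    print(ET.tostring(record, encoding="unicode"))
```

A catalog or search tool can then read the `TYPE` and `ORDERLABEL` attributes to group and locate items, which is the role the article describes for structural metadata.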

“From the point of view from an organization like the Library of Congress where [officials] are dealing with huge amounts of content, having standards for how materials are actually digitized is incredibly important,” said Jerry McDonough, one of METS’ creators and assistant professor of library science at the University of Illinois at Urbana-Champaign.

“The sad reality is that we can make our own decision in the library community about how we want to digitize our analog material, but there are publishers out there who are going to have their own standards,” McDonough said. “They’re not driven by the same sort of preservation and access concerns as the library community is.”

About the Author

Ben Bain is a reporter for Federal Computer Week.
