LOC to save data 'born digital'

Library of Congress introduces plan for preserving Web sites, CDs and other digital information

The Library of Congress introduced a plan last week for preserving Web sites, CDs, electronic journals and other digital information.

The National Digital Information Infrastructure and Preservation Program has been approved by Congress and received funding, and now archivists face the daunting task of figuring out just how to save information that was "born digital."

"Your great-great-grandchildren will have a picture of the early days of the Internet," said Laura Campbell, associate librarian for strategic initiatives.

The plan develops a nationwide strategy for collecting and preserving digital information across many federal agencies and private entities. The strategy would create a network of partners, determining who gathers what information, and build a digital architecture for preservation, Campbell said.

Digital photographs, movies, music, Web-based journals and other cultural electronic items would be preserved for future generations, even as the program tries to keep up with constantly changing technologies. Archivists must first determine a format that everyone can agree on to preserve the digital data, while keeping in mind that the format likely will change as technology advances.

"You have to consider the ability to play back and provide access to it in 100 years," Campbell said. Today, users trying to read information from a 5 1/4-inch floppy disk are out of luck, so choosing the correct format is not only important but also a decision that must be revisited as technology changes.

"This will be a continuing process," Campbell said, noting that new digital information will have to be archived and already-archived information will need to be updated. "There won't be an end date," she said. "We try to create some efficiencies in terms of responsibility, technology and cost of preserving digital information."

In December 2000, Congress authorized the establishment of the preservation program and provided the library with $100 million, $5 million of which was dedicated to developing the plan.

With the recent approval of the plan, an additional $20 million of the $100 million is available to embark on the early stages of the plan. For Congress to dole out the remaining $75 million, the money must be matched dollar-for-dollar by nonfederal sources in the form of cash, hardware or software.

The plan is in its early stages, Campbell said, as participants are selecting items that are at risk of disappearing, such as articles about the Sept. 11, 2001, terrorist attacks and Web sites covering the 2000 presidential election.

The National Archives and Records Administration is facing similar obstacles with its electronic records archive, which launched last fall to preserve all the government's public records.

As with NARA's archive, Library of Congress archivists must find a format that is hardware- and software-independent so the information can be accessed in its original form 100 years from now, said Reynolds Cahoon, assistant archivist for human resources and information services at NARA.
