Archives to scale volumes of snapshots

When the National Archives completes the task of collecting "snapshots" of all federal Web sites, it will have to figure out how to store and search through 21 terabytes of digital information.

"We're not scaled to do that now. We will have to build up the capacity to handle it," said Mike Miller, director of the Archives' modern records program.

The snapshots were ordered by the outgoing Clinton administration to preserve archival copies of federal Web sites as they existed Jan. 20. Senior Clinton officials said they wanted a record of the electronic government developed during their watch.

To archivists, the snapshots have a less specific but perhaps greater worth.

"We save these things for one reason and find that people find tons of ways to use them," Miller said. For example, he said, accounting records captured during the collapse of Nazi Germany sat largely unused for about half a century, but in recent years they have become valuable for tracing looted gold and treasure.

The snapshots of government Web sites are also certain to prove valuable, he said.

Some agencies may find them useful in settling legal disputes. Researchers will no doubt find them valuable for tracing the early development of electronic government.

"We felt we would be kicking ourselves if we did not" take the snapshots, Miller said. So far, 38 agencies, mainly small ones, have sent Web snapshots to the Archives, Miller said Feb. 16. There are at least three times that many federal agencies. The deadline is March 20.

Agencies must capture the Web site as it appeared Jan. 20, complete with working links between the site's pages and layers. Snapshots are being sent to the Archives on CD-ROMs or tape and eventually are to be transferred to digital linear tape for long-term storage.

If printed on paper, the 21 terabytes of Web data would be roughly double the amount of information contained in the Library of Congress' collection of 20 million volumes.

Because of the volume of data involved, the Archives does not want to make a practice of periodically collecting agency Web site snapshots. "We want to get this on a more regularized basis," Miller said. The record-keeping agency hopes to have new guidelines in place next month instructing agency Web managers on how to routinely preserve Web site records.
