Crawling for content

Government Printing Office officials are investigating the use of Web-crawler and data-mining technologies to capture government information published on the Web.

They need technology that can:

Find and capture government information on the Web in any format.

Examine file content and any metadata associated with the file.

Follow rules for capturing government information and avoid capturing information that fails to conform to the rules.

Tolerate rule changes as GPO officials gain a better understanding of the types of electronic information they need to preserve.

Perform automated comparisons between newly captured government information and information already stored in GPO's electronic repository to eliminate duplication.

Source: Government Printing Officea

Featured

  • People
    Federal CIO Suzette Kent

    Federal CIO Kent to exit in July

    During her tenure, Suzette Kent pushed on policies including Trusted Internet Connection, identity management and the creation of the Chief Data Officers Council

  • Defense
    Essye Miller, Director at Defense Information Management, speaks during the Breaking the Gender Barrier panel at the Air Space, Cyber Conference in National Harbor, Md., Sept. 19, 2017. (U.S. Air Force photo/Staff Sgt. Chad Trujillo)

    Essye Miller: The exit interview

    Essye Miller, DOD's outgoing principal deputy CIO, talks about COVID, the state of the tech workforce and the hard conversations DOD has to have to prepare personnel for the future.

Stay Connected

FCW INSIDER

Sign up for our newsletter.

I agree to this site's Privacy Policy.