Crawling for content

Government Printing Office officials are investigating the use of Web-crawler and data-mining technologies to capture government information published on the Web.

They need technology that can:

Find and capture government information on the Web in any format.

Examine file content and any metadata associated with the file.

Follow rules for capturing government information and avoid capturing information that fails to conform to the rules.

Tolerate rule changes as GPO officials gain a better understanding of the types of electronic information they need to preserve.

Perform automated comparisons between newly captured government information and information already stored in GPO's electronic repository to eliminate duplication.

Source: Government Printing Office
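
The rule-following and duplicate-elimination requirements above can be illustrated with a minimal Python sketch. It is not GPO's system or any vendor's product. Every name in it (Document, CaptureFilter, is_gov_host, has_content) is hypothetical, the two rules are placeholders, and the ".gov" host check and SHA-256 comparison are assumptions chosen only to keep the example short.

```python
import hashlib
from dataclasses import dataclass, field
from typing import Callable, Iterable
from urllib.parse import urlparse


@dataclass
class Document:
    """A captured file plus whatever metadata came with it (hypothetical type)."""
    url: str
    content: bytes
    metadata: dict = field(default_factory=dict)


# A rule is any function that returns True when a document may be captured.
Rule = Callable[[Document], bool]


def is_gov_host(doc: Document) -> bool:
    # Placeholder rule: treat only .gov hosts as in scope (an assumption).
    host = urlparse(doc.url).hostname or ""
    return host.endswith(".gov")


def has_content(doc: Document) -> bool:
    # Placeholder rule: skip empty files.
    return len(doc.content) > 0


def fingerprint(doc: Document) -> str:
    # Exact-match fingerprint of the file bytes.
    return hashlib.sha256(doc.content).hexdigest()


class CaptureFilter:
    """Applies the current rules, then rejects content already in the repository."""

    def __init__(self, rules: Iterable[Rule], known_fingerprints: Iterable[str] = ()):
        self.rules = list(rules)               # can be replaced as rules evolve
        self.known = set(known_fingerprints)   # stands in for the repository index

    def should_capture(self, doc: Document) -> bool:
        if not all(rule(doc) for rule in self.rules):
            return False                       # fails to conform to the rules
        digest = fingerprint(doc)
        if digest in self.known:
            return False                       # duplicate of stored content
        self.known.add(digest)
        return True


capture_filter = CaptureFilter([is_gov_host, has_content])
docs = [
    Document("https://www.example.gov/report.pdf", b"annual report"),
    Document("https://mirror.example.gov/report.pdf", b"annual report"),
    Document("https://www.example.com/page.html", b"not government"),
]
print([capture_filter.should_capture(d) for d in docs])  # [True, False, False]
```

Because the rules are held in an ordinary list, they can be swapped without touching the capture logic, which is one simple way to tolerate rule changes. Hashing the file bytes catches only byte-identical copies; recognizing the same publication in a different format would need a fuzzier comparison than this sketch attempts.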
