XML allows Census to change directions

Managing a growing mound of information has been an ongoing concern for the Census Bureau. The agency collects data from various internal and external data feeds and needs to enable government agencies and citizens to work with it in a variety of formats, from flat files to CD-ROM to Web interfaces.

In the late 1990s, the bureau began building a corporate metadata repository, a central database that would identify where all of its survey information is located, what specific records are in each database or file, and what format the information uses.

Once the agency decided to place the information in the repository, it needed to provide internal and external users with an easy way to access the data. "Quickly, [Extensible Markup Language] emerged as the best method of making information available to different applications and users," said Samuel Highsmith, a principal researcher at the bureau.

The Census Bureau relies on Oracle Corp. products for its primary database and Web application server, and it used the Oracle XML Development Kit to design the repository. The agency developed a common Web interface so that applications could create, edit, browse and exchange metadata.
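To make the idea concrete, here is a minimal sketch of how one repository entry might be written as XML and read back by another application. It is not the bureau's actual schema or its Oracle-based code: the element names (dataset, location, format, records), the function names and the sample values are all hypothetical, and the sketch uses Python's standard library rather than the Oracle XML Development Kit.

```python
import xml.etree.ElementTree as ET


def build_record(survey, location, data_format, records):
    """Build one metadata entry. All element names here are hypothetical."""
    entry = ET.Element("dataset", attrib={"survey": survey})
    ET.SubElement(entry, "location").text = location    # where the data lives
    ET.SubElement(entry, "format").text = data_format   # how it is stored
    recs = ET.SubElement(entry, "records")               # what it contains
    for name in records:
        ET.SubElement(recs, "record").text = name
    return entry


def to_xml(entry):
    """Serialize the entry to an XML string that any application can exchange."""
    return ET.tostring(entry, encoding="unicode")


def read_record(xml_text):
    """Parse an exchanged entry back into a plain dictionary."""
    entry = ET.fromstring(xml_text)
    return {
        "survey": entry.get("survey"),
        "location": entry.findtext("location"),
        "format": entry.findtext("format"),
        "records": [r.text for r in entry.iter("record")],
    }


if __name__ == "__main__":
    # Sample values are invented for illustration only.
    xml_text = to_xml(
        build_record(
            "Example Survey",
            "example_db.sample_table",
            "flat file",
            ["establishment", "payroll"],
        )
    )
    print(xml_text)
    print(read_record(xml_text))
```

Because the entry travels as plain text, the application that reads it does not need to share a database driver or programming language with the one that wrote it, which is the kind of independence the bureau was after.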

Rather than build all of the needed distributed security components directly into its applications, the agency opted to split its systems in two. The production systems sit inside its firewalls, which closes them to outside interference, and a copy of various Census files is placed outside so that other agencies or citizens can access them.

"It may not be an optimal setup, but it has worked pretty well to date," Highsmith said. The downside is that it requires the agency to maintain multiple copies of the information and ensure they are in sync. The agency would prefer to let outsiders directly into its main systems but didn't think XML security was robust enough to warrant taking that step.

The first application to take advantage of the features was the 2002 Economic Census, which consists of 450 surveys.

A second beneficiary is American FactFinder, which provides data from Census 2000 and related historical information to users via a Web browser. For example, an agency or an individual can use the repository to find out how many people are of a certain age in a city.

