XML allows Census to change directions

Managing a growing mound of information has been an ongoing concern for the Census Bureau. The agency collects data from various internal and external data feeds and needs to enable government agencies and citizens to work with it in a variety of formats, from flat files to CD-ROM to Web interfaces.

In the late 1990s, the bureau began building a corporate metadata repository, a central database that would identify where all of its survey information is located, what specific records are in each database or file, and what format the information uses.

Once the agency decided to place the information in the repository, it needed to provide internal and external users with an easy way to access the data. "Quickly, [Extensible Markup Language] emerged as the best method of making information available to different applications and users," said Samuel Highsmith, a principal researcher at the bureau.

The Census Bureau relies on Oracle Corp. products for its primary database and Web application server, and it used the Oracle XML Development Kit to design the repository. The agency developed a common Web interface so that applications could create, edit, browse and exchange metadata.
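To illustrate the kind of exchange described here, the following is a minimal sketch of one application writing a metadata record as XML and another reading it back. It uses Python's standard library rather than the Oracle XML Development Kit, and the element names (`dataset`, `location`, `format`, `records`) and sample values are assumptions made for illustration, not the bureau's actual schema.

```python
import xml.etree.ElementTree as ET


def build_record(survey, location, fmt, record_names):
    """Serialize one metadata entry to an XML string (hypothetical schema)."""
    root = ET.Element("dataset", attrib={"survey": survey})
    ET.SubElement(root, "location").text = location   # where the data lives
    ET.SubElement(root, "format").text = fmt          # what format it uses
    records = ET.SubElement(root, "records")          # which records it holds
    for name in record_names:
        ET.SubElement(records, "record").text = name
    return ET.tostring(root, encoding="unicode")


def read_record(xml_text):
    """Parse the XML string back into a plain dictionary."""
    root = ET.fromstring(xml_text)
    return {
        "survey": root.get("survey"),
        "location": root.findtext("location"),
        "format": root.findtext("format"),
        "records": [r.text for r in root.findall("records/record")],
    }


# Hypothetical values, used only to show the round trip.
xml_text = build_record(
    "example-survey", "/data/example.dat", "flat file", ["age", "city"]
)
record = read_record(xml_text)
```

Because the record is plain text with named fields, any application that can parse XML can read it without knowing how the producing application stores its data.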

Rather than try to build all of the needed distributed security components directly into its applications, the agency opted to separate its systems. The production systems sit inside its firewalls, closing them to outside interference, and copies of various Census files are placed outside so that other agencies or citizens can access them.

"It may not be an optimal setup, but it has worked pretty well to date," Highsmith said. The downside is that it requires the agency to maintain multiple copies of the information and ensure they are in sync. The agency would prefer to let outsiders directly into its main systems but didn't think XML security was robust enough to warrant taking that step.
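The synchronization burden can be sketched as a simple comparison of content hashes between the internal files and their public copies. This is an assumed approach for illustration only; the article does not say how the bureau checks its copies, and the function names and file names below are hypothetical.

```python
import hashlib


def digest(data: bytes) -> str:
    """Return the SHA-256 hex digest of a file's contents."""
    return hashlib.sha256(data).hexdigest()


def out_of_sync(internal: dict, public: dict) -> list:
    """List internal files whose public copy is missing or differs.

    Both arguments map a file name to its contents as bytes.
    """
    stale = []
    for name, data in internal.items():
        copy = public.get(name)
        if copy is None or digest(copy) != digest(data):
            stale.append(name)
    return sorted(stale)


# Hypothetical file names and contents.
internal = {"a.xml": b"<a/>", "b.xml": b"<b/>", "c.xml": b"<c/>"}
public = {"a.xml": b"<a/>", "b.xml": b"<b>old</b>"}
stale = out_of_sync(internal, public)
```

Any file reported by such a check would need to be copied outside the firewall again before the public view matches the production systems.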

The first application to take advantage of the features was the 2002 Economic Census, which consists of 450 surveys.

A second beneficiary is American FactFinder, which provides data from Census 2000 and related historical information to users via a Web browser. For example, an agency or an individual can use the repository to find out how many people are of a certain age in a city.

