NARA joins document management test bed
By Brian Robinson
May 24, 1998
The National Archives and Records Administration plans to take part in an advanced computer networking project that could point the way to how it will manage future federal government document handling and storage.
The Distributed Object Computation Testbed (DOCT) brings together supercomputers, large-scale electronic archival technology and high-speed wide-area networks to develop solutions for handling massive amounts of data stored at sites nationwide.
DOCT is the result of an $8.4 million contract that the Defense Advanced Research Projects Agency, in partnership with the Patent and Trademark Office, awarded to Science Applications International Corp. and the San Diego Supercomputer Center (SDSC) in mid-1996. The DOCT has allowed PTO to test concepts the agency developed as part of its strategy for creating a paperless process for submitting patent requests. Following last fall's court decision that may force agencies to maintain more extensive electronic records, NARA plans to use the test bed to experiment with similar concepts in records management.
Ken Thibodeau, director of NARA's Center for Electronic Records, last week confirmed his agency's involvement in the DOCT. "We saw the results the PTO got from the DOCT, and that's what got us interested," he said. Thibodeau declined to elaborate until NARA makes a formal announcement.
SDSC and SAIC focused from the start on PTO's need to manage documents created in the filing and updating of patent applications. DARPA is interested in leading-edge, distributed computing technologies that may develop from the project.
The test bed hardware comes from existing supercomputer, archival storage and networking systems at various university and federal institutions. Original development centered on object-relational database interfaces to high-performance storage systems, text retrieval software, and support for complex and distributed documents.
The DOCT will rely on the industry-standard Standard Generalized Markup Language (SGML) for the large-scale conversion of PTO's patent data. The project also provides proof-of-concept for an "electronic mail room" at the agency to handle the flow of patent requests submitted electronically. The first elements of the electronic mail room may be in place later this year.
Lawrence Cogut, manager of infrastructure engineering for PTO, said the DOCT provided for:
- The design and implementation of an electronic patent filing system.
- The demonstration of the feasibility of an electronic mail room that would allow a patent application to be securely registered.
- The demonstration of how large amounts of data could be converted to SGML.
- The modeling of a complete electronic patent office, showing how a patent would be handled and tracked from the electronic mail room through to final publication.
"Because of the DOCT, we feel we have a jump-start on electronic filing, the electronic mail room and on SGML conversion," he said. "We expect to put at least some of the initial processes of the mail room into place later this year."
NARA also is considering electronic records management, he said, and a standard way to archive records, possibly using the process PTO used for its electronic patent documents. "They are pleased we settled on SGML, for example," Cogut said. "After all, the name of the game is interoperability so that agencies can seamlessly swap documents."
NARA particularly wants to test archiving solutions using its own protocols and data, said Chaitanya Baru, manager of the data intensive computing group at SDSC. However, while the testing process may be similar to that conducted for PTO, Baru said NARA's expectations are somewhat different. "The PTO has just a 17-year longevity on patents to deal with, whereas NARA is faced with an indefinite time for storage of its data, so the problems will be somewhat different," he said. "We now have a draft document from NARA about investigation of indefinite storage."
The first official period of the DOCT contract ended late last year, but excess funds allowed an extension into 1999. The extension will allow other work to be completed and other agencies to participate. The Nuclear Regulatory Commission and other agencies have expressed interest in the DOCT, Baru said, but they cited a lack of funding as the reason they cannot participate.
Brian Robinson is a freelance writer based in Portland, Ore.