Everything’s on the record
Thanks to e-government, agencies have more to keep track of. Electronic records management helps decide what to archive—and what to toss
As the business of government, like that of the rest of the world, is increasingly done digitally, the task of managing official records grows more important. It isn’t just the volume of information that’s changing; the oversight required to manage electronic records is also increasing.
“Government records officers have a huge challenge,” said L. Reynolds Cahoon, CIO for the National Archives and Records Administration. “As more federal records are created electronically, they need to work with analysts and business process designers to build records management right into the business process[es] as they’re being designed.”
Agencies at all levels face similar problems. Financial accountability regulations, Privacy Act requirements, and even requirements for accessibility to government services under Section 508 of the Rehabilitation Act are making not just retention but also creation of records an important architectural consideration. It’s one that needs to be taken into account in an agency’s enterprise architecture.
No extra help
And records officers have to adapt to this new world without much hope of more hands coming to their aid. “We don’t see these organizations ramping up and hiring more people to handle records management,” said Frank McGovern, a product marketing specialist at FileNet Corp. of Costa Mesa, Calif., and a retired Air Force records officer. “You’re starting to see repositories with over a billion objects. How do you manage that?”
NARA remains a leader in establishing best practices in government records management. It is the managing partner of the Electronic Records Management e-Government initiative. As part of the initiative, the Environmental Protection Agency is working on a process for evaluating commercial, off-the-shelf solutions.
The software industry has moved away from standalone tools for records capture and is now creating records management platforms that integrate into the fabric of enterprise systems. “Vendors are componentizing their records management systems so they can be more deeply embedded,” Cahoon said.
The move to integrate records management with enterprise architecture is consistent across the public and private sectors. “Instead of buying a records management software package, [organizations] are looking to buy a suite of software that does Web collaboration, team forms and basic document management,” McGovern said.
Kathleen Kummer, head of the government business unit at Open Text Corp. of Waterloo, Ontario, sees it much the same way. “Enterprise suites are kind of the trend. We still also offer a standalone product, but from a technical architecture perspective, we’re moving toward a service-oriented architecture, based on reusable components.”
Open Text is one of the earliest adopters of JSR 170, a Java Community Process proposed standard for accessing content repositories in Java 2 Enterprise Edition, independent of system type. The standard API will help Open Text and other records management platforms integrate directly with applications based on J2EE, including major enterprise software platforms such as PeopleSoft, Oracle Application Server, SAP, and IBM’s WebSphere and Domino programs.
Making records management as automatic as possible is the key, McGovern said.
“We use workflow to automate a lot of records management tasks. National Archives puts out a lot of requirements for vital records programs. If you depend on the records officer to occasionally remind people of those policies, how effective can that be? But you can automate those policies through workflow.
“When you depend on an end user to decide what’s a record, they make mistakes,” McGovern said. FileNet’s software uses a “zero-click” approach to capturing records, based on a combination of application events, metadata within documents, and integration with enterprise processes to capture records with no additional work on the end user’s part. E-mails, for example, are retained automatically based on rules at the server level, so users are not required to drag them to an archive folder.
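As an illustration only, server-side capture rules of the kind McGovern describes might look like the following minimal Python sketch. It is not FileNet’s implementation; the rule fields, addresses and category names are assumptions made up for the example.

```python
from email import message_from_string

# Hypothetical capture rules: a header name, a substring to look for,
# and the records category to assign when the rule matches.
RULES = [
    {"header": "To", "contains": "contracts@agency.example", "category": "procurement"},
    {"header": "Subject", "contains": "FOIA", "category": "foia-request"},
]


def classify_message(raw_message):
    """Return the category of the first matching rule, or None if no rule matches."""
    msg = message_from_string(raw_message)
    for rule in RULES:
        value = msg.get(rule["header"], "")
        if rule["contains"].lower() in value.lower():
            return rule["category"]
    return None


def capture(raw_message, repository):
    """Append the message to the repository list if a rule marks it as a record."""
    category = classify_message(raw_message)
    if category is not None:
        repository.append({"category": category, "message": raw_message})
    return category
```

Because the rules run where the mail is processed, the sender and recipient never make a filing decision; a message that matches no rule is simply left out of the repository.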
Systems can be configured to automatically notify record owners when documents are up for destruction. E-mails can be automatically captured, based on metadata within headers or based on where they are addressed, and stored in the repository. And business transaction documents created by ERP systems can be captured as part of the business process.
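The destruction notices mentioned above come down to comparing each record’s age against a retention schedule. The sketch below shows one way that check could work; the schedule, the record fields and the 30-day warning window are all hypothetical values chosen for illustration, not any agency’s actual policy.

```python
from datetime import date, timedelta

# Hypothetical retention schedule: record category -> years to retain.
RETENTION_YEARS = {"procurement": 6, "correspondence": 3}


def destruction_date(created, category):
    """Return the date on which a record becomes eligible for destruction."""
    years = RETENTION_YEARS[category]
    try:
        return created.replace(year=created.year + years)
    except ValueError:
        # A Feb. 29 creation date has no match in a non-leap year; use Feb. 28.
        return created.replace(year=created.year + years, day=28)


def due_notices(records, today, warning_days=30):
    """Return (owner, record_id, destroy_on) for records due within the warning window."""
    notices = []
    for rec in records:
        destroy_on = destruction_date(rec["created"], rec["category"])
        if destroy_on <= today + timedelta(days=warning_days):
            notices.append((rec["owner"], rec["id"], destroy_on))
    return notices
```

A scheduled job could call `due_notices` once a day and hand the result to whatever mail or workflow system the agency already uses.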
Lubor Ptacek, director of product marketing for Hopkinton, Mass.-based EMC Corp.’s Documentum division, said Documentum’s platform goes even further with automation. Its system can also be trained to automatically classify content, indexing it based on an organization’s record taxonomy. “We support both automatic and aided classification,” he said. “The aided portion is important, because it usually takes a year to get to full automation, to tune the rules.”
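The split between automatic and aided classification can be pictured as a confidence threshold: above it the system files the document itself, below it the system only suggests categories for a person to confirm. The following sketch assumes a simple keyword taxonomy and an arbitrary threshold; it does not describe how Documentum’s classifier works.

```python
# Hypothetical taxonomy: category -> keywords that suggest it.
TAXONOMY = {
    "personnel": {"leave", "payroll", "appointment"},
    "procurement": {"contract", "invoice", "vendor"},
}


def score(text):
    """Score each category by the fraction of its keywords found in the text."""
    # Naive whitespace tokenizing; punctuation is not stripped in this sketch.
    words = set(text.lower().split())
    return {cat: len(words & kws) / len(kws) for cat, kws in TAXONOMY.items()}


def classify(text, threshold=0.5):
    """File automatically when confident; otherwise return ranked suggestions."""
    scores = score(text)
    best = max(scores, key=scores.get)
    if scores[best] >= threshold:
        return {"mode": "automatic", "category": best}
    # Below the threshold, a person picks from the ranked suggestions.
    suggestions = sorted(scores, key=scores.get, reverse=True)
    return {"mode": "aided", "suggestions": suggestions}
```

Tuning the rules, in this picture, means adjusting the keywords and the threshold until fewer documents fall into the aided path.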
In the majority of modern electronic records management and enterprise content management systems (the heart of most records platforms), metadata is collected from records based either on tags within them or on other information collected at the time they are created, and stored in a relational or Extensible Markup Language database.
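To make the relational option concrete, here is a minimal sketch that stores record metadata as field/value rows in SQLite and looks records up by a metadata value. The table layout and function names are assumptions for illustration; commercial platforms use their own schemas.

```python
import sqlite3


def open_catalog(path=":memory:"):
    """Open (or create) a metadata catalog with one row per record field."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS record_metadata (
               record_id TEXT NOT NULL,
               field     TEXT NOT NULL,
               value     TEXT NOT NULL,
               PRIMARY KEY (record_id, field)
           )"""
    )
    return conn


def store_metadata(conn, record_id, metadata):
    """Insert or update the metadata fields collected for one record."""
    conn.executemany(
        "INSERT OR REPLACE INTO record_metadata VALUES (?, ?, ?)",
        [(record_id, key, str(value)) for key, value in metadata.items()],
    )
    conn.commit()


def find_records(conn, field, value):
    """Return the IDs of records whose metadata field equals the given value."""
    rows = conn.execute(
        "SELECT record_id FROM record_metadata "
        "WHERE field = ? AND value = ? ORDER BY record_id",
        (field, value),
    )
    return [row[0] for row in rows]
```

Keeping the metadata apart from the stored file means a record can be found and managed without opening the document itself.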
Most current systems store records in their original format or in a specified archival format such as Portable Document Format. But the longevity of file formats is a major issue for federal records.
Many federal records must be retained permanently, yet the technologies tied to their creation are ephemeral at best. There are over 14,000 different file formats in use today within the federal government, according to Cahoon.
NARA is currently building the Electronic Records Archive under a contract awarded last August to Harris Corp. of Melbourne, Fla., and Lockheed Martin Corp. The challenge is this: The new system must be able to present documents without relying on the applications that originally created them.
Delivery of the system is at least three years away, and just what that platform-independent format will be remains to be seen. One of the possible solutions is Portable Document Format/Archive, a standard recently ratified by the International Organization for Standardization. PDF/A is a bare-bones, platform-independent version of PDF based on Version 1.4 of Adobe Systems Inc.’s publicly available specification.
PDF/A was developed with help from government agencies that have large-scale document retention needs, such as the Internal Revenue Service and the U.S. Courts. The courts moved to PDF/A as their preferred electronic archiving format after discovering that older PDFs were no longer readable by many Acrobat Readers because of the proprietary LZWDecode compression they used.
But because it is a stripped-down version of PDF, PDF/A may not meet everyone’s needs all the time. And according to NARA officials, while PDF/A might be fine for agency storage, it doesn’t currently meet NARA’s standards for transmitting records to the archives.
“The stuff they had to take out of Adobe to guarantee readability removed a lot of functionality,” said Paul Chan, vice president of marketing at PureEdge Solutions Inc., an XML-based electronic forms vendor based in Victoria, British Columbia.
The Army and Air Force are using PureEdge’s software to create forms for their business processes. The functionality and business logic behind the forms are written into them with a declarative programming language. Since the forms are pure XML, it is easier to perform searches on the metadata within them.
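The search advantage of an XML form is easy to show with a standard parser. The sketch below uses an invented form layout, not PureEdge’s actual format, and reads the metadata fields with Python’s built-in XML library.

```python
import xml.etree.ElementTree as ET

# Hypothetical form layout, invented for this example.
SAMPLE_FORM = """<form>
  <metadata>
    <field name="form_number">EX-100</field>
    <field name="unit">Logistics</field>
  </metadata>
  <body><entry id="amount">1200</entry></body>
</form>"""


def form_metadata(xml_text):
    """Return the form's metadata fields as a name -> value dictionary."""
    root = ET.fromstring(xml_text)
    return {
        field.get("name"): (field.text or "").strip()
        for field in root.findall("./metadata/field")
    }


def matches(xml_text, name, value):
    """Report whether the form carries the given metadata value."""
    return form_metadata(xml_text).get(name) == value
```

No vendor software is needed to answer a question such as “which forms belong to Logistics?”; any tool that can parse XML can do it.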
NARA hasn’t yet endorsed a specific archival format, but agencies should keep an eye on NARA’s work in formulating e-record guidelines.
Without help, according to Cahoon, records officers might find themselves overwhelmed by the size of their management task.
S. Michael Gallagher is an independent technology consultant based in Baltimore.