A Gov 2.0 spin on archiving 2.0 data

Brand Niemann proposes a grass-roots approach to archiving social media records.

Brand Niemann is senior data scientist at Semanticommunity.net and former senior enterprise architect and data scientist at the Environmental Protection Agency.

Agencies that struggle to preserve their growing volume of social media records ought to look to an untapped resource: their own employees.

I have preserved the record of my work for the federal government for the past 30 years, most thoroughly over the past 10. Based on that experience, I believe agency administrators, CIOs, chief knowledge officers and senior managers should encourage individuals in their organizations to be information architects and preservationists for their own information.

Employees can also serve as solutions architects by redesigning existing systems to move data into the cloud, as I have done for the Environmental Protection Agency and a number of communities of practice — for example, by bringing Taxonomy Tuesday discussions and Data.gov into the complete archive at the Semantic Community Web portal.

According to the National Archives and Records Administration, federal agencies must preserve Facebook and Twitter records that meet standard criteria for official records. A recent report published by the IBM Center for the Business of Government titled “How Federal Agencies Can Effectively Manage Records Created Using New Social Media Tools” provides recommendations for improving the management of social media records and offers a number of best practices.

But the options are still evolving. NARA is preparing its response to the Obama administration’s Open Government Directive of December 2009 and has set up a blog to discuss related issues. Last year, I posted a comment on the blog asking how NARA would feel about having government employees move their desktop files into the cloud.

That approach would save infrastructure costs and increase collaboration — and it would provide a way to preserve the artifacts of feds’ careers so that when they retire, the public has a record. A NARA official said the agency is exploring those ideas.

I learned long ago that the Internet is one of the best storage places for my own work materials. Recently, when several floods at my EPA office seriously disrupted my colleagues’ work, I was not affected because all my files were readily accessible online.

My model for this approach is based on my work at EPA and the classification of EPA and interagency information. Here are the steps I followed.

  1. I put all my critical e-mail messages, attachments, work products, Web pages, scanned documents, etc., into a wiki — first using the General Services Administration’s ColabWiki, then 40 MindTouch Deki wikis and now the MindTouch Technical Communication Suite.
  2. I developed an information classification system at EPA and created a living document version of the Census Bureau’s annual Statistical Abstract for organizing interagency information, which is the basis for organizing Data.gov.
  3. I developed an EPA ontology that organized and preserved the agency’s best content by topic, subtopic, data table, and data elements or dictionary.
  4. I developed a case study of the CIO Council that suggested how to organize the human resources and information across the entire government.
  5. I recently migrated everything I had done into a new environment, Semanticommunity.net’s Community Infrastructure Sandbox for 2011. It includes the complete archive and tutorials on what I have learned from building each piece and what I need to create a new environment for multiple communities of practice to use in 2011.
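
For readers who want to picture the hierarchy described in step 3, here is a minimal Python sketch of content organized by topic, subtopic, data table and data elements. It is only an illustration under stated assumptions: the class names, the `element_paths` helper and the sample entries are hypothetical and do not reflect the actual EPA ontology or any MindTouch feature.

```python
from dataclasses import dataclass, field
from typing import Iterable, Iterator, List


@dataclass
class DataTable:
    name: str
    # Data elements (the "dictionary" entries) that belong to this table.
    elements: List[str] = field(default_factory=list)


@dataclass
class Subtopic:
    name: str
    tables: List[DataTable] = field(default_factory=list)


@dataclass
class Topic:
    name: str
    subtopics: List[Subtopic] = field(default_factory=list)


def element_paths(topics: Iterable[Topic]) -> Iterator[str]:
    """Yield one 'topic / subtopic / table / element' path per data element."""
    for topic in topics:
        for subtopic in topic.subtopics:
            for table in subtopic.tables:
                for element in table.elements:
                    yield " / ".join(
                        [topic.name, subtopic.name, table.name, element]
                    )


# Hypothetical sample entries, invented for illustration only.
catalog = [
    Topic(
        "Water",
        [
            Subtopic(
                "Drinking water",
                [DataTable("Sample results", ["site_id", "sample_date"])],
            )
        ],
    )
]

for path in element_paths(catalog):
    print(path)
```

Each printed path gives an archived item a single, predictable place in the hierarchy, which is the property that makes a personal archive browsable by someone other than its author.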

I hope this approach can help other federal employees become information architects and preservationists for their agencies in 2011 and beyond.