Is every tweet and Facebook post worth archiving?

Web 2.0 content is ephemeral but not insignificant

Federal media managers probably felt a little relieved when the National Archives and Records Administration issued a report this fall that categorizes most federal blogs, wikis, and Facebook and Twitter postings as temporary in nature and not official records.

“NARA and agencies must recognize that the majority of Web 2.0 records … do not have permanent value,” agency officials wrote in “A Report on Federal Web 2.0 Use and Record Value.”

The respite was brief. Several weeks later, U.S. Archivist David Ferriero strongly cautioned that any social network or Web 2.0 content could be an official federal record if it met criteria for uniqueness or for business or policy value.

“The informal tone of the content … should not be confused with insignificance,” Ferriero wrote on his blog Nov. 2.

So now, after hearing that most Web 2.0 content is temporary, do managers need to comb through every tweet and Facebook comment to choose the nuggets that are fit for posterity?

“It is a little confusing,” said Chase Reeves, director of marketing at Iterasi, which provides records systems.

A records specialist was blunt: “It would be impossible to sort through 10,000 tweets a day to see what is affected.” The specialist asked not to be named because he has contracts with federal agencies. “The federal Web 2.0 record policies are a moving target.”

I posed the questions to Paul Wester, NARA’s director of modern records programs, and Arian Ravanbakhsh, the agency’s electronic records policy analyst. They explained that agencies’ use of social media is evolving and that NARA is continuing to offer updated guidance.

They agreed that most Web 2.0 content is temporary, but as Wester put it, “We are trying to give direction to be able to identify the 1 percent to 3 percent that is a record.”

The clearest way for a federal agency to publish social media content that doesn’t need archiving is to publish only duplicative content on Web 2.0 platforms — for instance, agencies can post news releases on Facebook and Twitter that already appeared on the agency’s website, Wester and Ravanbakhsh said.

Initially, many agencies did just that. But now, they are finding that approach too restrictive and want to do more.

“The challenge is when the content is no longer repurposed and the tools are being used to develop new content,” Ravanbakhsh said.

“Agencies may not have thought through the record implications before starting a Twitter account,” Wester added. He advised having the agency’s CIO, records management official and relevant program office work together while getting started in social media.

Save Everything … or Nothing?

The duplicative nature of agencies’ initial Web 2.0 use is likely to diminish. After all, why would an agency bother with social networking if it simply replicates other content? It seems like a waste to ignore Web 2.0 platforms’ ability to support collaboration and the sharing of information and feedback.

If an agency uses a collaborative Web 2.0 platform for an explicit business use or policy development purpose that meets the criteria of a record, the agency probably needs to manage that record, NARA officials said.

Another complication arises because of a general expectation that all content on the Internet is permanent. Federal agencies might fear removing content, even if it is temporary in nature and duplicative, because of that belief. But NARA’s view is that preserving everything is not the way to go.

“The idea that everything on the Internet will be there forever runs counter to the idea of managing content,” Ravanbakhsh said.

While NARA develops further guidance, Wester encouraged agencies to create their own disposition schedules for Web 2.0 records and submit them for NARA’s review under the Federal Records Act. However, NARA has a two-year backlog on reviewing such schedules and takes as long as a year to complete the reviews, according to a report from the Government Accountability Office released Oct. 27.

Meanwhile, data storage is inexpensive, and industry is urging agencies to store all Web 2.0 content while the policy debate continues. “Archiving everything is cheaper than hiring more people,” said Michael Riedyk, founder of PageFreezer.com, which provides website archiving services.

But NARA officials warn that archiving data indiscriminately runs contrary to the philosophy of records management. That approach might be affordable in the short term, but in the long run, the costs of storing all information — and making it searchable and accessible — are unrealistic, Wester said.

It looks as though Web 2.0 archiving policies might take a few more months to develop. Depending on the results, maybe we’ll start seeing listings for federal jobs with titles such as tweet sorter and wiki comment reader.

Reader comments

Tue, Nov 30, 2010

Archive it all. That way someone in the future can laugh at the waste of gov resources and the inane idiotic ravings that facebook, twitter, zinger, ziptext, doodler, squat (yeah, some of those names are probably not in use yet) attract. Besides, if it is not archived, then more clandestine business will get done in that direction to avoid archiving. Our upper elected officials are good at that.

Sat, Nov 20, 2010 Ron Ventura, Calif

I agree with what the anon poster stated. And as Bruce seemed to have sort of tried to point out in many words, a lot of garbage with a few nuggets of silver that need to be captured.

But I believe that it ALL should be archived along with a very public accounting of the cost for the doing as well as for the archiving. An interesting snapshot of the silliness of the "communications" mode for future posterity to chuckle over and to wonder at the cost to the government for the waste. (yes, I know that email has some of the same idiocies)

Thu, Nov 18, 2010

As far as I am concerned, government presence on Twitter, FB, et al, is FWA. Traditional hardcopy channels and web pages are more than sufficient for making data available. It hasn't been that many years since you had to subscribe to a newspaper of record, or travel to a depository library or reading room, to see public information. FB is nothing but a private mini-internet. If FB goes belly-up, what happens to all the content? Twitter is mostly like bathroom wall graffiti, and deserving of about as much respect and archiving. One sustained wireless outage or exploit, and people will see how untrustworthy and superfluous they are.

Thu, Nov 18, 2010 Bruce Falk Washington, DC

Federal communications policy should be segregated from its record-keeping policy, as they represent distinct challenges. It's in everyone's best interests for government to be both responsive and nimble, getting word out by the fastest and most convenient means, using all available channels. Difficult though it may be to do, archivists must find ways to identify, capture, and retain important information however it may happen to be promulgated. However, with the increasing sophistication of search technology, it might make more sense to focus immediately on establishing industry-standard formatting, tagging, storage, and retrieval protocols (not least to assure reasonable future access) than trying to identify what to keep and what to let pass. While putting such protocols in place should receive urgent attention, communications cannot be brought to a standstill. Full effort should be exerted to automatically capture and save everything that goes over the transom in whatever form, even though this means investing in long-term storage (and deletion) and migration strategy/solutions. Irrespective of the solutions adopted, automation is certainly the way to go. Semantic data tagging of captured text is already here, audio recognition/ transcription are being polished, and last to be perfected may well be visual recognition and motion capture/mapping (for video/image/3D capture/search capability). No doubt it will take time to get these ironed out, combined, and made workable on a day-to-day basis. What investment is there in securing the use of Wolfram Alpha and like technologies?
