Digital Government

Smithsonian transcription project moves out of beta

Volunteers are powering an effort by the Smithsonian Institution to create online, searchable versions of its vast collections of diaries, journals, biological specimens and other historical gems. After more than a year of testing, the project came out of beta Aug. 12, with officials inviting the public to join in the massive transcription and labeling effort.

The Transcription Center attracted about 1,000 active volunteers during its testing phase, and that group has grown by more than 800 since the public launch, according to project coordinator Meghan Ferriter. Volunteers dive into a variety of projects, including transcribing texts that are often handwritten and occasionally in languages other than English. Volunteers also review submitted work before it is published.

So far, more than 13,000 transcribed pages have been produced, and several projects have been completed, including the archives of the Monuments, Fine Arts and Archives Section, popularized in the book and film "The Monuments Men," and the Charles Henry Hart autograph collection, which includes letters from notable artists and sculptors.

Some projects focus on gathering the full text of diaries, notebooks and other primary source material. Other projects, such as cataloging specimen records for the U.S. National Herbarium, focus on collecting structured data.

There's no shortage of work. Sylvia Orli, an information manager in the Department of Botany at the National Museum of Natural History, estimates that it would take about 110 years, at current rates, to digitize the 3.5 million uncataloged items in the National Herbarium's collection of 5 million specimens.

In a panel discussion about the Transcription Center at the annual meeting of the Society of American Archivists on Aug. 15, Orli said she hoped the volunteer effort would greatly improve the pace of digitizing the collection.

Ferriter leads the push to promote the Transcription Center on social media and keep in touch with volunteers, who, she said, are morphing into a self-sustaining community -- answering one another's questions and providing help via Twitter and other platforms.

Inside individual projects, volunteers can share notes on specific challenges, such as rendering marginal notes or interpreting scientific symbols. Ferriter also sees the possibility of communities of interest springing up around individual projects. For instance, a project to transcribe the diary of Earl Shaffer, the first man to walk the Appalachian Trail in one continuous hike, was completed in just two weeks thanks to a Reddit group that generated volunteers. Although that process happened organically, social media could be used to seed interest in projects.

The Transcription Center was built in Drupal, but the source code hasn't been released. Although the front-end display and the back-end collection are open source, customization was required to connect them with the Smithsonian's enterprise system. But the Smithsonian is prepared to help libraries, institutions and museums that want to launch a similar service by providing code and support in the future, Ferriter said.

About the Author

Adam Mazmanian is executive editor of FCW.

Before joining the editing team, Mazmanian was an FCW staff writer covering Congress, government-wide technology policy and the Department of Veterans Affairs. Prior to joining FCW, Mazmanian was technology correspondent for National Journal and served in a variety of editorial roles at B2B news service SmartBrief. Mazmanian has contributed reviews and articles to the Washington Post, the Washington City Paper, Newsday, New York Press, Architect Magazine and other publications.

Connect with him on Twitter at @thisismaz.



