VanRoekel's open data strategy may have some limitations

While Federal Chief Information Officer Steven VanRoekel is a strong supporter of government-sponsored application programming interfaces, or APIs, to distribute federal data to the public, a major transparency group is suggesting that APIs alone can be limiting.

APIs are specifications that serve as interfaces between software. For example, Google Maps has an API that allows third-party websites to overlay restaurant reviews on top of Google Maps. APIs are what allow app developers to write apps that pull data from a database down to a user's smart phone or tablet computer. The API controls what information is accessible to the third-party sites or devices, and how it can be used. 
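The gatekeeping role an API plays can be sketched in a few lines of Python. This is a purely hypothetical illustration, not any real service: the `reviews_api` function, the sample records, and the field names are all invented for the example. The point is that the API layer, not the client, decides which parts of the underlying data are reachable.

```python
# Hypothetical dataset held by a provider. The "owner_phone" field
# exists in the database but is never meant to reach third parties.
RESTAURANTS = [
    {"name": "Cafe A", "rating": 4.5, "owner_phone": "555-0100"},
    {"name": "Diner B", "rating": 3.0, "owner_phone": "555-0199"},
]

def reviews_api(min_rating=0.0):
    """Expose only name and rating, and only the queries the provider
    chose to support (here, a minimum-rating filter)."""
    return [{"name": r["name"], "rating": r["rating"]}
            for r in RESTAURANTS if r["rating"] >= min_rating]

top = reviews_api(min_rating=4.0)
# 'top' contains Cafe A's name and rating, but no phone numbers
```

A client of this interface can filter by rating because the provider built that in, but it cannot ask for anything the interface does not expose.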

VanRoekel famously said, “Everything should be an API,” in April 2011, according to news reports, including a report by O'Reilly Radar. At that time VanRoekel was leading the redesign of the FCC.gov website in his previous role as FCC managing director.

He has continued to endorse APIs as a means for distributing government data from sites such as Data.gov. “VanRoekel is vocal in his support of open data and the use of APIs,” wrote Justin Ellis in an article published March 13 by Nieman Journalism Lab. In the article, VanRoekel also spoke of APIs as providing a platform for users to experience government data in interesting ways, as an alternative to simply pushing data out through a 'dumb pipe.' While he may not advocate APIs as the exclusive channel for providing data to the public, they are clearly his first resort.

However, Sunlight Labs, an arm of the prominent transparency group Sunlight Foundation, suggests that APIs should be only one of the strategies for distributing government data -- and not necessarily the first choice. Sunlight did not connect their conclusions to VanRoekel specifically, but their thoughts do concern the technology VanRoekel champions.

APIs should be used to supplement, not displace, the availability of direct data downloads to the public, Eric Mill, Web and mobile developer for Sunlight, wrote in a March 21 blog entry on the Sunlight Labs Blog.

Mill presented several reasons why direct data downloads are still the best default strategy. For one, developing government APIs to distribute data might actually limit flexibility for users of the data, Mill wrote.

“There's no way to predict ahead of time the right data format and structure for every client who's interested in your data. Expect clients to need to transform your data for their own requirements, and for that transformation to require clients to first obtain all of your data,” Mill wrote.
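Mill's argument can be illustrated with a short, hypothetical Python sketch. The CSV dump, the field names, and the `to_client_shape` function are all invented for the example; what it shows is a client reshaping a complete bulk download into a structure the provider could not have predicted ahead of time.

```python
import csv
import io

# A hypothetical bulk dump in the one format the agency chose to publish.
bulk_csv = """agency,year,spending
EPA,2011,100
EPA,2012,120
FCC,2011,80
"""

def to_client_shape(raw):
    """Reshape the full dump into the nested structure THIS client needs:
    {agency: {year: spending}}. Another client might want it keyed by
    year, or joined against other data -- no single API response format
    serves them all."""
    out = {}
    for row in csv.DictReader(io.StringIO(raw)):
        out.setdefault(row["agency"], {})[int(row["year"])] = int(row["spending"])
    return out

reshaped = to_client_shape(bulk_csv)
```

Because the client holds the entire dataset, it can perform this transformation (or any other) locally, without waiting for the provider to add a new API feature.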

In addition, creating APIs is time-consuming, expensive and difficult for government agencies, he added.

“Providing bulk access is several orders of magnitude less work on the part of the provider than building and maintaining an API,” Mill wrote in the blog. “An API is a system you need to design, create, keep running, attend to, and worry about. Bulk data access is uploading some files, forgetting about it, and letting [hyper text transfer protocol, or] HTTP do the work.”
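The asymmetry Mill describes can be made concrete with a toy Python comparison, using invented data and function names. Bulk access amounts to writing a file that any standard web server can hand out unchanged, while even a minimal API obliges the provider to design, implement, and maintain query logic of its own, such as the pagination sketched below.

```python
import json
import os
import tempfile

records = [{"id": i, "value": i * 10} for i in range(5)]

# Bulk access: serialize once to a static file. From here on, a plain
# web server (i.e., HTTP itself) does all the work of delivery.
path = os.path.join(tempfile.mkdtemp(), "dump.json")
with open(path, "w") as f:
    json.dump(records, f)

# API access: the provider must build and keep running its own request
# handling. Even this toy endpoint already needs pagination parameters.
def api_get(offset=0, limit=2):
    """A minimal paginated read -- one of many behaviors the provider
    would have to design, document, and maintain in a real API."""
    return records[offset:offset + limit]

page = api_get(offset=2)
```

The file write is a one-time act; the `api_get` function stands in for a running service that must be kept available and evolved as client needs change.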

Overall, government should take a nuanced view of APIs, Mill recommended. VanRoekel did not immediately respond to a request for comment.

“To sum up, Sunlight is pro-API - we make our own, and we welcome them from the government when they enhance access to information (the FederalRegister.gov API is a particularly good example),” Mill wrote. “However, the first step government should take, in nearly all cases, is to offer the data directly and in bulk.”

About the Author

Alice Lipowicz is a staff writer covering government 2.0, homeland security and other IT policies for Federal Computer Week.


Reader comments

Sun, Mar 25, 2012 Paul Wilkinson San Diego

Good question, John. As I understand it, Sunlight and others are seeking what should be a lighter lift than what every U.S. public company is already required to do, i.e., make audited financial statements and a selection of other key data available in machine-readable format. Historical data isn't necessarily obsolete upon delivery. At least it isn't with respect to public companies, since the vast majority of disclosure is historical. A standardized federal (or government) taxonomy, similar to the standardized taxonomies used by public companies, would make it possible to better coordinate government priorities and give taxpayers and lenders to the government better insight into the use of their resources. See http://xbrl.sec.gov/ for an overview of how it works in the private sector. States are using similar systems, but bringing such transparency to the federal level would be a sea change in representative collaboration to achieve public policy goals.

Sat, Mar 24, 2012 Justin Houk

A quick trip to the FCC developer websites will reveal that much of the data they offer via API is also offered via bulk download.

Fri, Mar 23, 2012 also

Also, creating GOOD APIs is hard -- it's very much an art and a science, coupled with lots of interaction with the developer community.

Fri, Mar 23, 2012 John Denver

I'm not at all understanding what Sunlight is asking for - dumps of entire databases on demand? As soon as the data is transferred to a client, it will be obsolete. Also - can you attempt to scope the requirements of such a design? We'd need massive network pipes to all clients just to effect the 'obsolete on arrival' download. API access to data is PLENTY good enough for transparency, and the API solution is based in reality...I'm not sure what Sunlight is thinking.
