How cloud can take open data to new heights

By removing the storage and transmission barriers, agencies are seeing the use of their datasets explode.

According to IDC, the volume of data being produced each day is growing at an explosive rate, and is now estimated at 2.5 exabytes. That’s equivalent to producing 250,000 Libraries of Congress every single day.
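The comparison above implies a particular size for one Library of Congress. A quick check of the arithmetic, as a minimal sketch that assumes decimal units (1 exabyte = 10^18 bytes, 1 terabyte = 10^12 bytes); the variable names are illustrative only:

```python
# Figures quoted in the article.
EXABYTES_PER_DAY = 2.5
LIBRARIES_PER_DAY = 250_000

# Assumption: decimal (SI) units.
BYTES_PER_EXABYTE = 10**18
BYTES_PER_TERABYTE = 10**12

bytes_per_day = EXABYTES_PER_DAY * BYTES_PER_EXABYTE
bytes_per_library = bytes_per_day / LIBRARIES_PER_DAY
terabytes_per_library = bytes_per_library / BYTES_PER_TERABYTE

print(f"Implied size of one Library of Congress: {terabytes_per_library:.0f} TB")
```

In other words, the quoted figures treat one Library of Congress as roughly 10 terabytes.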

Much of it is "open data," which means the data can be used by anyone for any purpose without needing to pay a licensing fee. The availability of this data is a boon for entrepreneurs, scientists and public servants, who can use it to create new products, accelerate scientific discovery and provide better services to the public.

Unfortunately, the infrastructure usually used to serve this data is not keeping up with growth. Government data is still being provided with the assumption that users will download and store their own copies of data. That’s fine when a few gigabytes of data are being shared, but as data volumes increase, this approach simply doesn’t work.

For example, the National Oceanic and Atmospheric Administration’s new weather satellite, GOES-16, is estimated to produce one terabyte of data per day -- over 100 times the amount of its predecessor. Very few people have the hard disks or patience to download a terabyte of data. Using this old model of data distribution reduces the data’s value to end-users, and ultimately to the taxpayer.
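To put the terabyte-per-day figure in perspective, here is a rough sketch of how long one day of output would take to download. The connection speeds are hypothetical, chosen only for illustration, and the calculation assumes decimal units and a fully sustained transfer rate, which real connections rarely achieve:

```python
def download_hours(size_terabytes: float, bandwidth_mbps: float) -> float:
    """Hours needed to transfer size_terabytes at a sustained bandwidth_mbps."""
    bits = size_terabytes * 10**12 * 8          # decimal terabytes to bits
    seconds = bits / (bandwidth_mbps * 10**6)   # megabits per second to bits per second
    return seconds / 3600

# One day of GOES-16 output, per the estimate in the text.
DAILY_TERABYTES = 1.0

# Hypothetical connection speeds, chosen only for illustration.
for mbps in (50, 100, 1000):
    hours = download_hours(DAILY_TERABYTES, mbps)
    print(f"{mbps:>5} Mbit/s: {hours:.1f} hours per day of data")
```

Under these assumptions, a 50 Mbit/s connection needs more than 24 hours to fetch a single day of data, so a user on that link could never catch up with the satellite.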

This is why the cloud is emerging as the center of gravity for all big data analyses.

Once the data is made available in the cloud, anyone who wants to use it no longer needs to buy the hard drive capacity and spend months downloading the data. Interested users can instead use on-demand computing resources in the cloud to query as much, or as little, of the data as they need. When their analysis is done, they can save the results, turn off the virtual servers and not have to worry about paying for an individual copy of the original data.
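One common mechanism for reading "as little of the data as they need" is the HTTP Range header, which lets a client ask a server for a slice of a file instead of the whole thing. The sketch below only builds such a request and does not send it. The URL and the function name are hypothetical, and the article does not say which mechanism any particular tool relies on:

```python
from urllib.request import Request

def build_range_request(url: str, start: int, length: int) -> Request:
    """Build (but do not send) an HTTP GET for `length` bytes starting at `start`."""
    if start < 0 or length <= 0:
        raise ValueError("start must be >= 0 and length must be > 0")
    end = start + length - 1  # HTTP byte ranges are inclusive on both ends
    return Request(url, headers={"Range": f"bytes={start}-{end}"})

# Hypothetical object URL, used only for illustration.
EXAMPLE_URL = "https://example.com/open-data/scene.tif"

request = build_range_request(EXAMPLE_URL, start=0, length=1024)
print(request.get_header("Range"))
```

A server that supports range requests would answer with only the first kilobyte of the file, which is often enough to read a header or index before deciding what else to fetch.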

Through the Amazon Web Services Public Datasets program, we host some of the world's most valuable open datasets and show what’s possible when data is made available in the cloud. We try to think about what would be possible if people had fast, programmatic access to data, and the computing resources to analyze it. We've seen some cutting-edge results through these initiatives.

A new look at Landsat

Since 1972, Landsat satellites have produced the longest continuous record of Earth’s land surface as seen from space. These images have been available at no cost directly from the United States Geological Survey since 2008. However, many people were limited by their ability to download and store significant quantities. We talked to many end-users and learned that they had big ideas of what could be done with Landsat data, but couldn’t get it fast enough or couldn’t afford to store their own copies.

Amazon started hosting imagery from the Landsat 8 satellite in 2015. The response was astonishing. Within the first year, over 1 billion requests for Landsat imagery and metadata were logged from 147 countries. Businesses like Esri, Mapbox and MathWorks immediately created tools to take advantage of the new easy-to-access Landsat archive.

One of the most interesting developments has been how novices and amateurs have been able to create entirely new interfaces and tools to explore and analyze the data. A group of students from Code Fellows created Snapsat -- a fast and completely novel web-based service to browse and interact with Landsat imagery. An independent developer in Melbourne, Australia, even created an iPhone app, giving people the ability to access tremendous amounts of data on how the Earth has changed over time by simply reaching into their pocket.

NEXRAD opening new research frontiers

NOAA recognized early on that the cloud would be essential to fulfilling its mission, and in 2015 it entered into a research agreement with several cloud service providers to explore ways to drive usage of its data. Through that agreement, we have made several hundred terabytes of high-resolution NEXRAD radar data available in the cloud.

Similar to the response we saw with Landsat, the usage of NEXRAD data has been impressive. After making NEXRAD data available in the cloud, NOAA recorded a 130 percent spike in usage of the data, while simultaneously seeing a 50 percent decrease in the usage of its own servers.

This open data initiative has also made the full NEXRAD archive available on demand, creating new analysis and discovery possibilities. For example, Dr. Eli Bridge at the University of Oklahoma has leveraged this public dataset to compile radar data to estimate the size of Purple Martin bird roosts.

These birds form large, dense aggregations that appear as ring-shaped patterns on the radar images. Now that the researchers no longer have to make requests for individual scans and receive chunks of data at a time, the University of Oklahoma team is able to learn how the birds are responding to droughts, environmental change and seasonal cues. This is an example of “latent research” -- that is, research that has existed in the minds of researchers, but hasn’t been possible because of restricted data access.

Big (Census) data

Most recently, the Census Bureau discovered that increasing access to big data can lead to increased usage. Previously, the agency's American Community Survey data was available only in tabular file formats like CSV, which took days to access and required a separate reference document to interpret.

Now that it has been uploaded in bulk to the cloud, anyone can access and analyze the entire dataset for about 40¢ an hour. Additional partners such as the National Science Foundation have pitched in to further boost access to the data. They have even provided instructions on how to analyze the data using an open source graph database engine.

The open data era is just getting started

Once open data is uploaded to the public cloud, its potential use cases are virtually endless, and government agencies are just getting started. Governments around the world are investing billions in new sensors, ranging from Internet of Things devices in parking meters to Earth-observing satellites, which are producing huge volumes of data.

The best way to get a return on these investments is to make it easy for innovators across the country to access the data and put it to work. With a modernized data distribution method and some imagination, open data can be unleashed to become a tremendous force for the public good.
