Big data has big potential in the cloud

For the last several years, cloud computing has been one of the most talked-about technologies in IT. But now big data is coming on strong. What do you get when you combine the two? The opportunity to save money, improve end-user satisfaction and put more of your data to its fullest use.

This past January, IT heavyweights from the National Institute of Standards and Technology (NIST), other government agencies, industry, and academia got together to discuss the critical intersection of big data and the cloud. Although government agencies have been slower to adopt new technologies in the past, the event underscored the fact that the public sector is leading – and in some cases creating – big data innovation and adoption.

Cloud is a multiplier when used with other technology, and is capable of big things when it comes to big data, according to U.S. CIO Steven VanRoekel. Combining big data with cloud delivery and compute power might help create new industries and provide benefits to every citizen.

“The government is sitting on a treasure trove of data,” he said during a speech at the two-day NIST Joint Cloud and Big Data Workshop event. “We’re opening data, and looking at what we can do. We can greatly impact the lives of every American by just unlocking simple pieces of data.”

He also pointed to the formation of companies built and founded completely on government data.

Nowhere to go but up

Research and consulting firm Deltek Inc. says cloud environments are “optimal” for using analytics since cloud providers are investing in the best analytical and visualization tools available today. In addition, big data projects require processing speed and the most up-to-date technologies, two other cloud provider specialties.

But most important, said Richard Blake, senior technical advisor in the Enterprise GWAC Division of the Integrated Technology Service at the General Services Administration’s Federal Acquisition Service, is that big data in the cloud enables something crucial for innovation, analysis and return on investment: the ability to share resources among agencies.

Evan Quinn, a senior principal analyst with research firm ESG, agreed. “Big data is an incremental learning process, and when you can share expertise and resources you can score more small wins and progress more quickly than you may have on your own,” he said.

As highlighted in Deltek’s Federal Big Data Outlook 2012-2017, data is available from a wide variety of sources, including agency data such as data logs, space telescopes, reconnaissance, citizen information, and mission-critical apps. Data is also being sourced from the outside – in the form of email, text messages, pictures, embedded sensors, social media, GPS data, purchase data, traffic cams and market research. With the cloud, it becomes easier to store, analyze, and access this information.

In addition, the same data can be analyzed and used in different applications and analytic projects, since one agency no longer “owns” that data. With these tenets in mind going forward, every piece of data will be examined as a potential resource and used as the basis of future applications, explained VanRoekel during the conference.

One promising approach, he said, is to ensure that data is machine-readable – that is, that it can move from system to system without requiring human intervention or translation.

“Government agencies are ordered to look at the data they produce, catalog data, start to publish data, and think about machine-readable as the new default inside government,” VanRoekel explained. “Any time we’re building a new system, or amending a system, we focus on machine readability both on the collection, as well as getting agencies to develop [application programming interfaces] around their data.”
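The contrast VanRoekel draws – data that requires human interpretation versus data another system can consume directly – can be sketched in a few lines. The record, field names, and date values below are invented for illustration; they do not come from any actual agency system.

```python
import json

# A human-oriented report: a person can read it, but another program
# would have to parse free-form text to extract the facts.
human_report = "Permit #4471 issued 03/15/2012 in Region 9 (status: ACTIVE)"

# The same information in machine-readable form: explicit field names,
# an ISO 8601 date, and typed values, so it can move from system to
# system without human translation.
machine_record = {
    "permit_id": 4471,
    "issued_date": "2012-03-15",  # ISO 8601, unambiguous across systems
    "region": 9,
    "status": "active",
}

def to_api_payload(record):
    """Serialize a record as JSON, the common currency of data APIs."""
    return json.dumps(record, sort_keys=True)

payload = to_api_payload(machine_record)
print(payload)
```

Publishing data in this shape is what makes the agency APIs VanRoekel describes possible: a consumer can round-trip the payload back into structured fields with no guesswork.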

However, as promising as it is, agencies should not rush into putting big data in the cloud. Although it might mean that IT no longer has to worry about buying, maintaining, and managing those databases and their associated storage, big data is a very specialized workload, said ESG’s Quinn. Agencies must ensure that their cloud providers can handle big data and that they understand the public sector’s security and compliance requirements, he said.

The other issue is expertise. Most, if not all, agencies are currently doing some form of business intelligence (BI) on their data, but what works for most organizations with BI is not going to work for big data. Expectations as a whole must be realistic and in line with what is currently technologically possible, said Quinn.

“If any agency thinks it’s going to go in and put $1 million and six months of work into a big data cloud project and change how the agency works from the inside out, they are misinformed,” he said. “For most organizations this is a completely new discipline, so they need to think in terms of small wins.”