Sizing up big data: 4 ways to succeed

Big data, like other disruptive technologies, has a high failure rate today, with many CIOs missing great opportunities to make the most of the data they have. But four best practices can improve the odds of success for your new or existing big data projects.

Big data has been around long enough that there have been notable successes – and resounding failures – related to its use.

Today, 81 percent of IT executives surveyed say that big data is a top-five IT priority for 2013, and only 6 percent of IT shops do not have big data on their top 10 priorities list. But 55 percent of respondents also say they have already had big data failures – projects that were not completed or fell short of their objectives.

There are myriad reasons why big data projects fail, according to a recent industry report. The top two are a lack of expertise to connect the dots and a lack of business context around data. But nearly every organization faces similar challenges, including finding the right tools, dealing with time constraints, understanding the platforms and training staff.

The good news is that there are things IT executives can do ahead of time to improve their success rate and end-user satisfaction related to big data. Here are the top four strategies that can help organizations get the most out of their big data efforts.

1) Get the scope right

According to the report, 58 percent of respondents say inaccurate scope is responsible for their failed big data projects. Alex Rossino, principal research analyst at Deltek, says that the bigger and broader an organization’s mission is, the more complex its data requirements will be – and consequently the more work it will take to get the scope right.

“For example, if you think about traffic enforcement in Suffolk County, N.Y., a big data project might require making sure that data collected on red light cameras, tickets issued and any related data is what is needed, and the rest would theoretically be discarded,” he says. “However, agencies run into trouble when they start with one outcome in mind and it spirals past what they have already decided is mandatory to collect and analyze.”

Rossino suggests talking to everyone who might be affected by the analysis, from the secretary or commissioner of an agency down to the heads of offices and programs. “It needs to be discussed and then kicked over to the CIO to determine the resources that are needed to make it happen and put in writing so that no one is expecting more out of a project than has previously been discussed.”

That’s not to say a project’s scope won’t change at a future date. But by sticking to the agreed-upon scope – at least in the beginning – agencies and organizations are more likely to find success.

2) Get the business users involved

In the case of big data, success hinges on producing information that is of value. So it only makes sense to involve the people who will be using that information.

“IT must recognize that big data means something different to every business and IT user,” says Evan Quinn, a senior principal analyst with research firm Enterprise Strategy Group (ESG). “The first question every agency needs to ask itself is, ‘What am I trying to get out of this? What is the value?’”

The value should be reflected in the queries that people make, said Mukul Krishna, director for digital media at Frost & Sullivan, a market research firm. People might run ad hoc queries, but on the whole the payoff comes from asking the right questions.

It’s also important to have someone who can sift through the results and understand what they mean. Some agencies might find themselves adding a chief analytics officer who has both business and IT knowledge and understands how to turn raw data into specific deliverables.

“Data is only worth something if IT can work with someone who understands the agency’s mission and can explain why it even needs a big data project to begin with,” Krishna said.

3) Hire the right talent

In an August 2012 1105 Government Information Group survey, more than half of the nearly 200 government agencies polled reported that they are having difficulty finding and keeping knowledge workers and data scientists for their big data efforts. This challenge will only get worse as private and public organizations expand those efforts.

The difficulty, says David Loshin, president of Knowledge Integrity Inc., a consultancy focused on business data management, is that big data is a departure for most IT professionals. “There is a major learning curve to understand the opportunities that big data affords,” he said.

That’s not to say every agency will need to hire new data scientists or mathematicians. Loshin suggests identifying the trainable staff within your current organization by looking for existing strengths such as an aptitude for statistics and training in computer science. A security background is also important, since some of the data your teams will be working with may be classified or contain personally identifiable information.

4) Size the infrastructure right

Those agencies and organizations that are doing big data analysis on-site will need to make sure that they have the necessary storage and compute power, says Loshin.

“Tiered storage may work well for some since you have the capacity to flow data between disk and shared memory,” Loshin said. “The most important thing is to make sure that, if you’re pulling data from multiple sources, performance does not lag.” Evaluating network bandwidth should also be on your to-do list, since low-latency streaming will be key to end-user satisfaction.
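As a rough illustration of why bandwidth belongs on that checklist, the back-of-envelope sketch below estimates how long it would take to pull a data set across a network link. The data volume, link speed and efficiency factor are hypothetical assumptions chosen for illustration, not figures from the report.

# Back-of-envelope check: how long does it take to pull a data set
# across the network? All figures here are illustrative assumptions.

def transfer_hours(data_tb: float, link_gbps: float, efficiency: float = 0.7) -> float:
    """Hours needed to move data_tb terabytes over a link_gbps link,
    assuming only a fraction (efficiency) of raw bandwidth is usable."""
    data_bits = data_tb * 1e12 * 8                 # terabytes -> bits
    usable_bps = link_gbps * 1e9 * efficiency      # usable bits per second
    return data_bits / usable_bps / 3600           # seconds -> hours

# Example: 50 TB pulled from a remote source over a 10 Gbps link
# takes roughly 16 hours -- long enough to matter for daily analysis.
print(f"{transfer_hours(50, 10):.1f} hours")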

Many organizations may end up underestimating the amount of storage they will need, said Mike Gualtieri, an analyst with Forrester Research. “You may know how much data you have, but you might not realize that you need to duplicate everything in an analytical sandbox for those who want to do advanced analysis,” he said.
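To make Gualtieri’s point concrete, here is a quick sizing sketch. The starting volume, growth rate, sandbox count and overhead factor are hypothetical assumptions chosen for illustration, not numbers from Forrester.

# Rough storage sizing: raw data plus an analytical sandbox copy,
# plus headroom for growth and overhead. All factors are illustrative.

def storage_needed_tb(raw_tb: float,
                      sandbox_copies: int = 1,
                      annual_growth: float = 0.4,
                      years: int = 2,
                      overhead: float = 1.2) -> float:
    """Estimate total terabytes required if every raw data set is also
    duplicated into sandbox_copies analytical sandboxes."""
    projected_raw = raw_tb * (1 + annual_growth) ** years    # growth over the planning horizon
    return projected_raw * (1 + sandbox_copies) * overhead   # originals + copies + overhead

# Example: 100 TB today becomes roughly 470 TB once two years of growth,
# one sandbox copy and indexing/temp-space overhead are counted.
print(f"{storage_needed_tb(100):.0f} TB")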

Gualtieri suggests that clustered systems can help. Still, many organizations may find that cloud-based services are the easiest way to provide the scaling and elasticity needed when dealing with big data.

“For very large data sets it might be more economical, especially if you’re launching something completely new,” he says. “You won’t have to invest millions of dollars if you’re keeping everything in the cloud.”