To manage big data, go beyond IT


Chipping away at big data takes special talents -- and they're not always in the IT department.

If you have a big data program to manage, you need a talented IT person, right?

Not necessarily. Big data brings its own challenges and calls for some very specific skills, according to members of a Dec. 3 panel discussion hosted by Nextgov.

Although data has always been around, the amount seen today is what makes management so challenging, said Micheline Casey, principal at CDO, LLC and former chief data officer for the State of Colorado.

“The other thing that’s critical,” she said, “is that we have this nexus of what’s happening with the interconnected systems. The dynamic pace of change of innovation that’s just really requiring people to think proactively and not just reactively about data and how to manage data assets.”

Big data is not just about the volume of information, said Jeff Butler, director of research databases at IRS Research, Analysis and Statistics, but also about its many variations, which add to the challenge. These new types of data, whether unstructured or textual, are forcing a reevaluation of the skills needed to analyze those variations, he said.

Also, Butler said, organizations traditionally managed data in monthly, quarterly or annual cycles. Today, that tends to be done in real time, or close to it. The question for federal agencies, he said, becomes how to adapt to those real-time analytics models.

Today “data just happens” and is generated without much thought, said Michael Rappa, director of the Institute for Advanced Analytics and professor at North Carolina State University. Thirty years ago, organizations collected data with purpose and intent. The process required time, money and energy that needn’t be spent if there was no identified need for the data, he said.

Today, Rappa said, the challenge for those who manage and analyze all this data lies in how to do it predictively and very quickly. “That’s the talent pool that’s not really in place,” he said. “That’s the career track that really doesn’t exist in most agencies just quite yet.”

The right kind of person to take on the challenge is often not an IT person, Rappa stressed. Agencies are now starting to work on figuring out what the operational role looks like, and the “data scientist” job title gets used a lot. However, the term is misleading because the person who handles the big data challenge “is not a scientist and they shouldn’t be totally fixated on data,” Rappa said.

Agencies need to be more data-centric and think of data upfront, not just as an afterthought, Casey said.

“Building a culture around data is very different from the culture virtually all state and federal agencies and commercial sector businesses have today, quite frankly,” she said. “It doesn’t matter if you get the best data scientists in a room; if no one across the organization knows what to do with the data, the insight they come up with doesn’t matter. Data is a business issue, not an IT issue.”

Agencies also need to create an organizational capacity to take in, consume and distribute data across their ecosystem to make use of the insights data scientists provide, Casey suggested.

“Otherwise it’s not doing anyone any good -- having a room full of data scientists and Hadoop clusters,” she said.

About the Author

Camille Tuutti is a former FCW staff writer who covered federal oversight and the workforce.



Reader comments

Wed, Dec 5, 2012 Paul DC

Try crunching a million rows of data into something sensible without IT skills.

Wed, Dec 5, 2012 earth

Liars, Damn Liars and Statisticians… There are a variety of principal component analysis, dimensionality reduction and other methods of reducing “big data” into smaller data without losing much information. (Much of your nervous system is a bandwidth reduction processor with minimal artifacts, optical illusions and such.) But the real trick is knowing the difference between the entropy of the data and the semantic value. Folks trained in IT or even communication science probably won’t know the importance of knowing the interpretation protocol or know of any methods to generate automated systems to intuit the interpretation protocol. Without the right interpretation protocol your data compression has to compress the noise along with the signal.
Hire the people that can understand what I just said.

Wed, Dec 5, 2012 John Schutz Denver

Knowing that an expert in the arena would be best suited to utilize data pertinent to the mission is rudimentary. Putting IT types as the proper analysts of business data is not something I've seen as a solid or common practice in the real world. Then again, I've seen companies that were built by IT types get a good dose of business practice (look at Netscape\Sun, TCI, and D.E.C. for instance (may they RIP)). Big data is just another way of slicing and dicing data. The game has always been what you do with the data.

Wed, Dec 5, 2012 John Schutz Denver, Colorado

It seems to me there is a large hyperbole mass on this topic. As stated, the issue is attempting to find meaning in the data. It is a reactive approach to gather data to then find out how to use it. When I worked with great data miners, they got in front of the data generators and tailored the input of the data they wanted, much as the current big data approach of what I would analogize as trash picking...or perhaps panning for gold, but still, doing so without preemptive data management practices (planning on what data is to be collected and why). They didn't care much how the data was structured, as they would blend it into their warehouse for use as they saw fit, and produce the info-gems our leaders would use to our benefit. This was in the '90s. I do believe we are coming full circle, as stated in this article (Agencies need to be more data-centric and think of data upfront, not just as an afterthought, Casey said.) Could targeted data collection be the missing piece to big data? It would sure help relieve the burden of massive retention issues being faced. I guess my primary area of misunderstanding of all the hype is with the idea of looking for meaning past-tense, whether it be near real-time or quarterly - the game is the same as it has been for quite some time...we just have bigger and faster tools to deal with it.

Wed, Dec 5, 2012

You can have all the data and advanced technological tools in the world.......If you don't understand your processes, you are dead in the water.........Therein lies the true focal point of the big data issue.......
