Pro tips for using big data

As federal agencies begin incorporating substantial big-data capabilities into their organizations, they're grappling with some of the nitty-gritty details of how to do it. A couple of pro tips: Focus on the mission, and mix and match information from different sources to find innovative ways to use data.

Shawn Kingsberry, CIO at the Recovery Accountability and Transparency Board, advised federal agencies to "ignore the buzzword soup" of technology and focus on what they want to achieve with big-data applications.

"Technology can divert attention from the business needs," he said during a big-data conference in Washington on Feb. 25. "At the end of the day, you know the problem you're trying to solve, but sometimes we can't focus on that because we're worried about the latest buzzword."

A tight focus can lead to innovative thinking that yields useful big-data solutions without spending money on new technology. For instance, Kingsberry said his agency combined the Justice Department’s fraud indictment information with audits of big recipients of federal assistance to find data that indicated possible criminal activity.

Similarly, mixing and matching big databases helped the Social Security Administration develop datasets for verifying disability claims, said Herb Strauss, SSA's deputy CIO.

He said the agency combines its deep pool of information with outside databases such as LexisNexis to match property ownership records against the information supplied by claimants.
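The kind of cross-database matching Strauss describes can be sketched in a few lines of Python. This is only an illustration, not SSA's actual method: the record types, field names, and the name-only matching rule are all assumptions made for the example, and a real system would need far more robust identity matching.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Claim:
    # Hypothetical fields; the article does not describe SSA's schema.
    claimant_id: str
    name: str
    declared_properties: int


@dataclass(frozen=True)
class PropertyRecord:
    # Hypothetical shape of an outside property-ownership record.
    owner_name: str
    parcel_id: str


def normalize_name(name: str) -> str:
    """Lowercase and collapse whitespace so trivial differences still match."""
    return " ".join(name.lower().split())


def flag_mismatches(claims, records):
    """Return (claimant_id, declared, found) for claims that disagree with the records.

    Assumption: owners are matched on normalized name alone.
    """
    owned = {}
    for record in records:
        owned.setdefault(normalize_name(record.owner_name), []).append(record.parcel_id)

    flagged = []
    for claim in claims:
        found = len(owned.get(normalize_name(claim.name), []))
        if found != claim.declared_properties:
            flagged.append((claim.claimant_id, claim.declared_properties, found))
    return flagged


claims = [Claim("C1", "Jane  Doe", 0), Claim("C2", "John Roe", 1)]
records = [PropertyRecord("jane doe", "P-1"), PropertyRecord("JOHN ROE", "P-2")]
mismatches = flag_mismatches(claims, records)
```

A flagged claim here means only that the two sources disagree and a person should take a closer look; it is not evidence of wrongdoing by itself.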

Agencies can learn a great deal by sharing information with one another and mining external data sources, Strauss added.

He said SSA has been working in a big-data environment since its origins in the 1930s, although the technological capabilities were quite different when the agency was first tasked with assigning and maintaining accounts for every American. That responsibility eventually expanded to include tracking survivor and disability benefits and other duties that increased the amount of data SSA monitored.

Today, applying big-data technology and techniques is an ongoing process that requires continued attention. "It can't be like a cat fight, with 10 seconds of intense activity followed by a five-year pause," Strauss said.

Although SSA has been sifting data since the Great Depression, Kingsberry's agency has been at work only since the Great Recession.

Kingsberry detailed the construction of the site that tracks data on spending under the American Recovery and Reinvestment Act of 2009, which led audience members to ask how to track unstructured data. Such information can present problems because some systems cannot process it uniformly. That sparked further discussion about the difficulties of sharing data, unstructured or otherwise, across agencies.

Kingsberry said his agency took responsibility for data input from the beginning to ensure that it was presented in a uniform manner.
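One way to picture that approach is a step that rewrites every incoming record into a single format before it is stored. The sketch below is hypothetical: the field names, the accepted date formats, and the output layout are assumptions for illustration, not a description of the board's system.

```python
from datetime import datetime
from decimal import Decimal

# Assumed set of date formats that submitters might use.
DATE_FORMATS = ("%Y-%m-%d", "%m/%d/%Y", "%b %d, %Y")


def normalize_date(raw: str) -> str:
    """Return the date as ISO 8601 text, trying each accepted format in turn."""
    text = raw.strip()
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text, fmt).date().isoformat()
        except ValueError:
            continue
    raise ValueError(f"unrecognized date: {raw!r}")


def normalize_amount(raw: str) -> Decimal:
    """Strip currency symbols and thousands separators, keeping exact cents."""
    return Decimal(raw.replace("$", "").replace(",", "").strip())


def normalize_record(raw: dict) -> dict:
    """Map one submitted record (hypothetical fields) to the uniform layout."""
    return {
        "recipient": " ".join(raw["recipient"].split()).upper(),
        "award_date": normalize_date(raw["award_date"]),
        "amount": normalize_amount(raw["amount"]),
    }


record = normalize_record(
    {"recipient": " Acme  corp ", "award_date": "02/25/2014", "amount": "$1,250,000.50"}
)
```

Rejecting a record with an unrecognized date, rather than guessing, keeps inconsistent entries out of the shared dataset at the point of input.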

He added that when he works with other agencies to access their data, he sets up memoranda of understanding that explicitly state what each agency expects and what its responsibilities are.

About the Author

Mark Rockwell is a senior staff writer at FCW. His beat focuses on acquisition, the Department of Homeland Security and the Department of Energy.

Before joining FCW, Rockwell was Washington correspondent for Government Security News, where he covered all aspects of homeland security from IT to detection dogs and border security. Over the last 25 years in Washington as a reporter, editor and correspondent, he has covered an increasingly wide array of high-tech issues for publications like Communications Week, Internet Week, Fiber Optics News, magazine and Wireless Week.

Rockwell received a Jesse H. Neal Award for his work covering telecommunications issues, and is a graduate of James Madison University.

Follow Rockwell on Twitter at @MRockwell4.

