Pro tips for using big data
By Mark Rockwell
Feb. 25, 2014
As federal agencies begin incorporating substantial big-data capabilities into their organizations, they're grappling with some of the nitty-gritty details of how to do it. A couple of pro tips: Focus on the mission, and mix and match information from different sources to find innovative ways to use data.
Shawn Kingsberry, CIO at the Recovery Accountability and Transparency Board, advised federal agencies to "ignore the buzzword soup" of technology and focus on what they want to achieve with big-data applications.
"Technology can divert attention from the business needs," he said during a big-data conference in Washington on Feb. 25. "At the end of the day, you know the problem you're trying to solve, but sometimes we can't focus on that because we're worried about the latest buzzword."
A tight focus can lead to innovative thinking that yields useful big-data solutions without spending money on new technology. For instance, Kingsberry said his agency combined the Justice Department’s fraud indictment information with audits of big recipients of federal assistance to find data that indicated possible criminal activity.
Similarly, mixing and matching big databases helped the Social Security Administration develop datasets for verifying disability claims, said Herb Strauss, SSA's deputy CIO.
He said the agency combines its deep pool of information with outside databases such as LexisNexis to match property ownership records against the information supplied by claimants.
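The kind of cross-check described here can be pictured as a simple record-matching step. The following is a minimal Python sketch, not SSA's actual method: the field names (`claimant`, `declared_property`, `owner`, `parcel_id`), the sample records and the name-only matching rule are all hypothetical, and a real system would likely rely on stronger identifiers than a normalized name.

```python
# Hypothetical sketch: every field name and record below is invented for illustration.

def normalize_name(name: str) -> str:
    """Lowercase, drop punctuation, and collapse whitespace so names compare consistently."""
    cleaned = "".join(ch for ch in name.lower() if ch.isalnum() or ch.isspace())
    return " ".join(cleaned.split())


def find_undeclared_property(claims, property_records):
    """Return claims whose claimant appears as a property owner but declared no property."""
    # Index the outside property records by normalized owner name.
    owners = {}
    for rec in property_records:
        owners.setdefault(normalize_name(rec["owner"]), []).append(rec["parcel_id"])

    flagged = []
    for claim in claims:
        parcels = owners.get(normalize_name(claim["claimant"]), [])
        # Flag only a mismatch: property on record, none declared on the claim.
        if parcels and not claim["declared_property"]:
            flagged.append({"claim_id": claim["claim_id"], "parcels": parcels})
    return flagged


claims = [
    {"claim_id": "C-1", "claimant": "Jane Q. Doe", "declared_property": False},
    {"claim_id": "C-2", "claimant": "John Roe", "declared_property": True},
    {"claim_id": "C-3", "claimant": "Ann Poe", "declared_property": False},
]
property_records = [
    {"owner": "JANE Q DOE", "parcel_id": "P-100"},
    {"owner": "John Roe", "parcel_id": "P-200"},
]

flagged = find_undeclared_property(claims, property_records)
print(flagged)
```

In this sketch a flagged claim is only a lead for human review, since a shared name does not establish that the claimant and the property owner are the same person.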
Agencies can learn a great deal by sharing information with one another and mining external data sources, Strauss added.
He said SSA has been working in a big-data environment since its origins in the 1930s, although the technological capabilities were quite different when the agency was first tasked with assigning and maintaining accounts for every American. That responsibility eventually expanded to include tracking survivor and disability benefits and other duties that increased the amounts of data SSA monitored.
Today, applying big-data technology and techniques is an ongoing process that requires continued attention. "It can't be like a cat fight, with 10 seconds of intense activity followed by a five-year pause," Strauss said.
Although SSA has been sifting data since the Great Depression, Kingsberry's agency has been at work only since the Great Recession.
He detailed the construction of the Recovery.gov site that tracks data on spending under the American Recovery and Reinvestment Act of 2009, which led audience members to ask how to track unstructured data. Such information can present problems because some systems cannot process it uniformly. That sparked further discussion about the difficulties of sharing data, unstructured or otherwise, across agencies.
Kingsberry said his agency took responsibility for data input from the beginning to ensure that it was presented in a uniform manner.
He added that when he works with other agencies to access their data, he sets up memoranda of understanding that explicitly state what each agency expects and what its responsibilities are.
Mark Rockwell is a senior staff writer at FCW. His beat focuses on acquisition, the Department of Homeland Security and the Department of Energy.
Before joining FCW, Rockwell was Washington correspondent for Government Security News, where he covered all aspects of homeland security from IT to detection dogs and border security. Over the last 25 years in Washington as a reporter, editor and correspondent, he has covered an increasingly wide array of high-tech issues for publications like Communications Week, Internet Week, Fiber Optics News, tele.com magazine and Wireless Week.
Rockwell received a Jesse H. Neal Award for his work covering telecommunications issues, and is a graduate of James Madison University.