Beyond the buzz of big data

The Army and the SEC are two agencies putting big data to use right now.

A lot of big-data buzz centers on potential, hinting at how federal agencies and organizations might someday make sense of their information or find the oft-mentioned needle in the haystack of data.

Meanwhile, some agencies are already using big data to great effect.

The Securities and Exchange Commission, for example, collects about 1 billion records every day with its Market Information Data Analytics System (MIDAS) platform.

The trading data comes from more than 10 different market exchange feeds and totals some 23 petabytes per year. At that volume, even a large staff could not manually detect anomalies such as the tell-tale signs of insider trading, according to SEC CIO Thomas Bayer.

MIDAS makes the analysis far easier and faster, running in near real time and allowing for a detailed understanding of both the current market and long-term trends.

"When you look at what we’ve been able to do analyzing that data, it's taken a lot of mystery out of what the market is perceived to be doing," said Bayer, speaking at the Symantec Government Symposium on March 11.

The SEC's needs continue to grow. As an example, Bayer said, the agency dissects 9 million page reports, a task that would be impossible without the automated analysis that big data tools allow. Soon, the agency will collect even more, on the order of 2 petabytes of market data per day. Humans handle that kind of data only after advanced analytics carve it down. If sketchy trends, the needles in the haystacks, are found, "you can turn to humans to do further investigation and examination," Bayer said.

At the symposium, Lt. Col. William Saxon, division chief and program director for the Army’s Enterprise Management Decision Support System Program, described the Army’s foray into big data. Today, the Army collects a wide assortment of information on "people, training, equipment and installations," and runs metrics against that data to produce another data store for its "readiness" effort.

The big data challenge for the Army was always in the "access and discovery of it," Saxon said, mostly because the Army has 3,600 different systems on which it stores information. In recent years, "it was a challenge to bring all that data to one place so that a handful of action officers in a dark room somewhere" could analyze it. Thanks to a focused evolution of its big data systems, Saxon said, that handful of action officers "can turn answers quickly" and use the existing system to present their answers.

Ironically, while technology continues to evolve at a rapid pace, Saxon said the Army’s data problems now are "political."

"Someone had data but didn’t want to share," Saxon said. "Or we're getting the data out of legacy systems."