Deja vu all over again: Predictive analytics look forward into the past
It might not be on the level of cloud computing or Web 2.0, but a supposedly new form of data mining called predictive analytics is enjoying its 15 minutes of tech fame.
Gartner pegs the technology as one of its top 10 strategic technologies for 2010. That annual list of recommendations always gets the attention of chief information officers -- and this time likely more so in government circles, where data mining has seen extensive use for years to detect fraud, cut waste and track down bad guys.
IBM recently invested more than $12 billion to acquire two top-shelf players in the field and raise a small army of consultants to get in line for a bigger piece of the action. Other vendors are falling over one another to tout their products’ predictive analytics chops. And in the echo chamber that is the tech industry, a trade show called Predictive Analytics World debuted last year and has been scheduled for at least two reprises.
So does this mean it’s time to start listening in on predictive analytics Webinars and scheduling vendor meetings to get the scoop on this obviously important new technology? Not so fast.
“There’s nothing new about this. It’s just old techniques that are being done better,” said Pieter Mimno, an independent consultant who has helped several federal agencies build data-mining systems. What’s really happening is that vendors are “packaging up their statistical models with a new buzzword that makes them more attractive. But it’s the same old stuff.”
Still, that’s not the word on the street. To hear the trade journals tell it, predictive analytics is a significant advance over older data-mining technologies.
According to those reports, traditional tools analyze large sets of historic data to spot patterns and then deliver reports that shed light on events that have already happened. In comparison, predictive analytics slices and dices historic and newly created data to predict future events -- for example, to identify which customer is most likely to buy a new product or which unusual pattern of Medicare claims is a sign of ongoing fraudulent activity.
However, data-mining pros will tell you they have spent years using the output from the software’s algorithms and statistical models to decide what to do next -- target those prospective customers with special promotions because they will likely buy; investigate that particular doctor’s claims more closely because he’s likely to cheat.
Moreover, using the label “predictive analytics” to describe these products might be fresh and have unquestionable marketing panache -- software that lets you see the future! -- but at best, it’s an imprecise description of what the products do, and at worst, it’s a cheap marketing gimmick.
“‘Predicting’ may not be the most intuitive word to use,” said Dean Abbott, a data-mining consultant with numerous government customers. “What you’re doing is flagging or identifying [new] invoices or credit card transactions or Medicare billing submissions that look like patterns that in the past have been anomalous or fraudulent or suspicious.”
Abbott helped the Defense Finance and Accounting Service set up a data-mining system to analyze supplier invoices in exactly that way -- about 10 years ago.
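To make that idea of flagging concrete, here is a minimal sketch in Python. It is not the method Abbott or the Defense Finance and Accounting Service used, which the article does not describe; it is a simple nearest-neighbor comparison chosen for illustration. The invoice figures, the two features (dollar amount and number of line items), and the function names are all made up for the example.

```python
from math import dist

# Hypothetical past invoices: (amount in dollars, line-item count), paired with
# whether reviewers later judged each one suspicious. All values are invented.
HISTORY = [
    ((120.0, 3), False),
    ((135.0, 4), False),
    ((110.0, 2), False),
    ((150.0, 5), False),
    ((9800.0, 1), True),
    ((9900.0, 1), True),
    ((9750.0, 2), True),
]


def _scale(features, lows, highs):
    """Rescale each feature to the 0..1 range so no single unit dominates."""
    return tuple(
        (value - low) / (high - low) if high > low else 0.0
        for value, low, high in zip(features, lows, highs)
    )


def suspicion_score(new_invoice, history=HISTORY, k=3):
    """Return the share of the k most similar past invoices marked suspicious."""
    columns = list(zip(*(features for features, _ in history)))
    lows = [min(column) for column in columns]
    highs = [max(column) for column in columns]
    target = _scale(new_invoice, lows, highs)
    nearest = sorted(
        history,
        key=lambda record: dist(_scale(record[0], lows, highs), target),
    )[:k]
    return sum(1 for _, suspicious in nearest if suspicious) / k


def flag(new_invoices, threshold=0.5):
    """Keep only the new invoices that resemble past suspicious ones."""
    return [inv for inv in new_invoices if suspicion_score(inv) >= threshold]
```

Calling `flag([(125.0, 3), (9850.0, 1)])` keeps only the second invoice, because its closest matches in the history were all marked suspicious. Nothing here sees the future; the code simply compares new records with old ones, which is the point Abbott is making.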
So has Gartner got it wrong? Is there really nothing new or strategic about advanced analytics in 2010? Don’t cancel that Gartner subscription just yet. Many experts say the consulting firm is right and that advanced analytics tools will deliver much more value in the future.
However, that added value will not come from any game-changing widgets or dramatic technology breakthroughs from vendors, though the better data-mining products have gotten more powerful and easier to use. The real reason has to do with old-fashioned sweat equity.
“We’ve done the basic blocking and tackling we need to get our data house in order, and now we’re ready to get more value from the data-reporting infrastructure we’ve set up,” said Wayne Eckerson, director of research and services at the Data Warehousing Institute, an education and research firm owned by 1105 Media, Federal Computer Week’s parent company.
So it seems we are on the cusp of a new chapter for data mining. It’s just not going to come in a tidy little box.
John Zyskowski is a senior editor of Federal Computer Week. Follow him on Twitter: @ZyskowskiWriter.