When data deceives

Never mind big data -- John H. Johnson and Mike Gluck warn that little data can create real confusion all by itself.

Chief data officers and data science teams are now fairly common across government -- an acknowledgment that specialized skills are needed to wring actionable insights out of the big data most agencies are creating. But what about the small data we are exposed to on a daily basis? How easily can it deceive us?

In "Everydata: The Misinformation Hidden in the Little Data You Consume Every Day," John H. Johnson and Mike Gluck warn that it can do so quite easily. They contend that too many Americans are essentially innumerate, and that expertise in technology and management doesn't automatically translate into an understanding of the data concepts critical to processing the gigabytes of information that hit us every day.

They note that the space shuttle Challenger disaster stemmed from a sampling error. True outlier data, on the other hand, can ruin an analysis if it is not identified. Furthermore, predictive analytics are dangerous if the user doesn't understand the factors that go into such forecasts, and the way numbers are charted and graphed can mislead, even if the underlying data is devoid of error and bias.

Yet it doesn't take a data scientist to avoid such misunderstandings. Johnson and Gluck write for the generalist and devote a chapter each to seven core concepts. Readers seeking deeper dives can look elsewhere (including the glossary and notes that constitute the final quarter of the book), but "Everydata" alone is a quick investment in becoming a smarter data consumer.