A data scientist who is skeptical about data
As with all too much in our society today, our attitude toward the use of data in public policy decision-making has become part of our culture wars. On one side there are those who reject data and even science as some sort of elitist conspiracy against popular wisdom from what used to be called “pointy-headed intellectuals.” It will probably not surprise my blog readers to learn that as an academic I am on the other side of this debate, advocating data and “evidence-based government.”
I also like to think I am the kind of guy who wants to learn about opinions different from my own. So it caught my attention when Adam Grant, the renowned behavioral scientist at Wharton (who also has a strong interest in public-sector management), linked to an article on the website Quartz with the provocative title, “I’m a data scientist who is skeptical about data,” written by the NYU data science professor Andrea Jones-Rooy. Her article is worth reading.
There are four kinds of reasons, Jones-Rooy says, why the data being thrown at us may not be accurate and hence believable.
One is that the data may be subject to random error. This comes about because measurement equipment is flawed or people make an error in measuring (maybe due to fatigue or inattention). If the error is random, it doesn’t bias the aggregate, overall results in one way or another -- the different measurement errors (some give too high a value for what we are measuring, some too low) tend to cancel each other out when averaged over many measurements. But even here, in an individual case, random error can be damaging to the person or organization subject to it. If a monitor measuring gas concentrations in a mine is broken and gives false information, it may lead to an explosion, and an ingredient in deodorants can cause a white spot to appear on a mammogram that looks like a tumor.
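For readers who like to see this in numbers, here is a minimal simulation sketch of the point about random error. Everything in it is an assumption made for illustration: the "true" value of 100, the bell-shaped (Gaussian) noise, and the helper name `simulate_readings` are mine, not Jones-Rooy's.

```python
import random
import statistics


def simulate_readings(true_value, noise_sd, n, seed=0):
    """Return n readings of true_value, each with independent random error.

    Assumption for illustration: the error is Gaussian with mean zero,
    i.e. it is as likely to push a reading up as down.
    """
    rng = random.Random(seed)
    return [true_value + rng.gauss(0, noise_sd) for _ in range(n)]


true_value = 100.0  # hypothetical quantity being measured
readings = simulate_readings(true_value, noise_sd=5.0, n=10_000)

# Error in the aggregate: the highs and lows largely cancel.
mean_error = statistics.fmean(readings) - true_value

# Error in the single worst individual reading: nothing cancels here.
worst_single_error = max(abs(r - true_value) for r in readings)

print(f"error of the average:       {mean_error:+.3f}")
print(f"worst individual error:     {worst_single_error:.3f}")
```

In a run like this, the average should land very close to the true value while at least one individual reading is far off, which is the distinction between the aggregate and the individual case.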
A second form of error is much more insidious -- systematic error. An extremely common form of systematic error is what is called “selection bias,” where the people or phenomena we are measuring are not representative of the entire population about which we want to gather data. As an example, Jones-Rooy cites drawing conclusions about the general population from what’s posted on Twitter. “Using data from Twitter posts to understand public sentiment about a particular issue is flawed because most of us don’t tweet,” and people who do may not be representative of the population (they are likely to be more at the political extremes). If the people who sign up for job training programs are atypical of the unemployed, then we get flawed data about the success of such programs. Another example Jones-Rooy cites is that people who enroll in clinical trials for new medical treatments may be unusually wealthy and/or unusually sick compared with people with the illness in general.
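The Twitter example can be sketched in a few lines of code. This is a toy model under loudly stated assumptions, not a description of actual Twitter users: I assume opinions are spread evenly on a scale from -1 to 1, and that the chance of posting rises with how extreme a person's opinion is (the numbers 0.05 and 0.4 are invented).

```python
import random
import statistics

rng = random.Random(42)

# Hypothetical population: each person has an opinion score in [-1, 1],
# where 0 is the political center and -1 / 1 are the extremes.
population = [rng.uniform(-1, 1) for _ in range(100_000)]

# Assumption for illustration: the probability of posting grows with
# how extreme the opinion is (5% at the center, 45% at the extremes).
posters = [x for x in population if rng.random() < 0.05 + 0.4 * abs(x)]

# "Extremity" = average distance from the center.
extremity_all = statistics.fmean(abs(x) for x in population)
extremity_posters = statistics.fmean(abs(x) for x in posters)

print(f"share of people who post:     {len(posters) / len(population):.2f}")
print(f"average extremity, everyone:  {extremity_all:.2f}")
print(f"average extremity, posters:   {extremity_posters:.2f}")
```

Unlike random error, collecting more posts does not help here: the gap between posters and everyone else comes from who ends up in the sample, so it persists no matter how large the sample gets.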
A third source of error is measuring the wrong thing. The number we get might be correct, but the use we make of the number is inappropriate -- it does not correctly represent the phenomenon we are interested in. Jones-Rooy gives the following examples: “If we are looking for top job candidates, we might prefer those who went to top universities. But rather than that being a measure of talent, it might just be a measure of membership in a social network that gave someone the ‘right’ sequence of opportunities to get them into a good college in the first place. A person’s GPA is perhaps a great measure of someone’s ability to select classes they’re guaranteed to ace, and their SAT scores might be a lovely expression of the ability of their parents to pay for a private tutor.”
A similar criticism is often made of some government (or other) performance metrics: that they are poor indicators of the underlying phenomenon the government (or citizens) care about -- such as when the scores on a poorly designed standardized educational test are a poor measure of how much students have learned. In all these cases, the problem is not the number itself, but whether it measures what we really care about.
A final source of error is what Jones-Rooy calls errors of exclusion. This is when a certain type of person is undercounted in the data that are collected, and that underrepresentation leads to incorrect conclusions about a phenomenon affecting the undercounted. She notes that “women are now more likely to die from heart attacks than men, which is thought to be largely due to the fact that most cardiovascular data is based on men, who experience different symptoms from women, thus leading to incorrect diagnoses.”
Actually, there’s some good news in Jones-Rooy’s cautions. The fact is that the worries she has are well known. Both academics who study social problems and students who are being taught statistics and reasoning about data at places such as the Kennedy School (and others) spend a lot of time thinking or learning about how to avoid or at least mitigate them. Moreover, many studies discuss such dangers very explicitly. To take Jones-Rooy’s example about clinical trials, we can control for the wealth or degree of illness of participants and thus cleanse away the effects of these on the results of the treatment. It would be rare to see an academic paper that does not specifically discuss selection bias and the steps the researchers took to deal with it; when the problem cannot be completely solved, the paper will note this as a limitation of the research.
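One simple way researchers "control for" something like wealth is to estimate the treatment effect separately within each group and then weight the groups by their share of the real patient population. The sketch below illustrates that idea on simulated data. It is one technique among several, and every number in it is a hypothetical assumption of mine: the population and trial shares, the effect sizes, and the names `simulate_trial` and `effect` come from no real study.

```python
import random
import statistics

rng = random.Random(7)

# Hypothetical assumptions, invented for illustration only.
POP_SHARE = {"wealthy": 0.2, "not_wealthy": 0.8}     # patients in general
TRIAL_SHARE = {"wealthy": 0.7, "not_wealthy": 0.3}   # who enrolls in the trial
TRUE_EFFECT = {"wealthy": 10.0, "not_wealthy": 4.0}  # benefit of treatment


def simulate_trial(n):
    """Simulate n participants as (group, treated, outcome) tuples."""
    rows = []
    for _ in range(n):
        wealthy = rng.random() < TRIAL_SHARE["wealthy"]
        group = "wealthy" if wealthy else "not_wealthy"
        treated = rng.random() < 0.5  # coin-flip assignment to treatment
        outcome = rng.gauss(50, 5) + (TRUE_EFFECT[group] if treated else 0.0)
        rows.append((group, treated, outcome))
    return rows


def effect(rows):
    """Average outcome of the treated minus average outcome of controls."""
    treated = [y for _, t, y in rows if t]
    control = [y for _, t, y in rows if not t]
    return statistics.fmean(treated) - statistics.fmean(control)


rows = simulate_trial(20_000)

# Raw estimate: dominated by the over-represented wealthy participants.
raw_effect = effect(rows)

# Adjusted estimate: per-group effects, weighted by population shares.
adjusted_effect = sum(
    POP_SHARE[g] * effect([r for r in rows if r[0] == g]) for g in POP_SHARE
)

# What the treatment would do, on average, for patients in general.
population_effect = sum(POP_SHARE[g] * TRUE_EFFECT[g] for g in POP_SHARE)

print(f"raw trial estimate:      {raw_effect:.2f}")
print(f"adjusted estimate:       {adjusted_effect:.2f}")
print(f"true population effect:  {population_effect:.2f}")
```

The adjustment only works for characteristics the researchers measured and thought to adjust for, which is why a careful paper lists what it could not control for as a limitation.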
At the risk of being accused of elitism, I would say that most of the horror story examples of data problems due to these errors come either from journalists or from partisans who illustrate the old saw, “Figures don’t lie, but liars figure.” I would contend it’s a good rule of thumb to assume that if research has been published in a reputable academic journal, you should give it the benefit of the doubt. This doesn’t work completely -- often there are disputes among scholars about how legitimate it is to draw the conclusions one scholar has drawn from the data -- and probably few non-scholars have the patience to explore these disputes.
So there is no perfect solution to the kinds of issues Jones-Rooy raises. One step for any reader of this blog is to note the categories of worries she presents and analyze data you see in light of them -- if somebody is claiming, say, that a certain government job training program helps the unemployed find jobs, ask whether the people in the program about whom data is being gathered chose to participate in the program because they were already more job-ready than others who didn’t.
Jones-Rooy calls herself a data skeptic, and you should be one too. But you don’t need to be, and you shouldn’t be, a data nihilist like those on the anti-science side of the culture wars. Being skeptical doesn’t mean you should abandon a belief in evidence-based government.
As Jones-Rooy argues, “These errors don’t mean that we should throw out all data ever and nothing is knowable, however. It means approaching data collection with thoughtfulness, asking ourselves what we might be missing, and welcoming the collection of further data. This view is not anti-science or anti-data. To the contrary, the strength of both comes from being transparent about the limitations of our work. Being aware of possible errors can make our inferences stronger.”
Posted by Steve Kelman on Aug 06, 2019 at 10:41 AM