What's so scary about performance ratings?

Of course metrics can reflect factors outside your control. But that, Steve Kelman argues, is "why God invented multiple regression."

A May 26 front-page article in The New York Times, with the scary title "Colleges Rattled as Obama Seeks Rating System," began by stating that college presidents are "appalled" by an Obama administration plan that "would compare schools on factors like how many of their students graduate, how much debt their students accumulate and how much money their students earn after graduating."

I am a professor, working at a university. I am also an advocate of using performance measurement to improve organizational performance. Should I feel "rattled" by the administration's effort?

The basic worries the college presidents expressed in the article were that the measurements could have perverse effects. (Sound familiar?)

Two in particular were cited in the article. The first was that measures of how much students earn on average after they graduate would penalize schools that train more students in the liberal arts, where salaries are lower than for students with degrees in finance, and schools that prepare students for poorly paid public-service careers such as social work or even teaching. A focus on post-graduation income would push universities toward a narrower view of their mission.

The second worry cited was that comparisons of debt load, or the measures of graduation rates and even post-graduation income, would discourage universities from taking a risk on students from disadvantaged backgrounds.

Do I share these worries?

I think the bottom line is that neither worry need argue against a performance rating system, if the system is designed correctly. A number of states, in looking at public school test scores, already perform what are called "value added" adjustments. The basic idea is that the scores are adjusted to account for factors that can produce differences in test scores that have nothing to do with the quality of the school. So, for example, schools where most of the kids come from poor families are going to have a harder time showing the same test scores as schools where the students' families are all rich. But the first school may actually be doing a better job (adding more value) than the second, and that's the difference we want to measure. Because these techniques often use a kind of statistical analysis that allows controlling for the impact of these irrelevant variables, I have often reminded my students, "This is why God invented multiple regression."
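For readers who want to see the mechanics, here is a minimal sketch of a value-added adjustment using ordinary least squares. It is an illustration under stated assumptions, not any state's actual method: the six schools, their scores and the two background factors (share of students from poor families, and a second hypothetical factor, share of students still learning English) are all made up. Each school's adjusted score is its raw score minus the score the regression predicts from its student mix.

```python
# A sketch only: the data and the choice of background factors are hypothetical.

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_ols(rows, y):
    """Fit y = b0 + b1*x1 + b2*x2 + ... by least squares (normal equations)."""
    X = [[1.0] + list(r) for r in rows]  # prepend an intercept column
    k = len(X[0])
    xtx = [[sum(x[i] * x[j] for x in X) for j in range(k)] for i in range(k)]
    xty = [sum(x[i] * yi for x, yi in zip(X, y)) for i in range(k)]
    return solve(xtx, xty)


def value_added(rows, y):
    """Return raw score minus the score predicted from background factors."""
    beta = fit_ols(rows, y)
    predicted = [beta[0] + sum(b * v for b, v in zip(beta[1:], r)) for r in rows]
    return [yi - pi for yi, pi in zip(y, predicted)]


# Hypothetical schools: (share from poor families, share learning English)
names = ["A", "B", "C", "D", "E", "F"]
background = [(0.8, 0.3), (0.2, 0.1), (0.1, 0.2), (0.7, 0.4), (0.5, 0.0), (0.3, 0.5)]
scores = [58.0, 78.0, 87.0, 55.0, 70.0, 73.0]  # made-up average test scores

adjusted = value_added(background, scores)
for name, raw, adj in zip(names, scores, adjusted):
    print(f"School {name}: raw score {raw:.0f}, value added {adj:+.1f}")
```

In these invented numbers, School A posts a much lower raw score than School B, yet comes out ahead once the regression accounts for the students each school serves. A real system would need far more schools, carefully chosen control variables and attention to statistical uncertainty.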

So in the case of the college ratings, if the scores are adjusted to control for percentage of liberal arts majors, emphasis on public service jobs and the economic backgrounds of the students, then the scores we see will come much closer to measuring what they are intended to measure, which is differences in school teaching quality that influence graduation rates, student indebtedness and so forth. These performance measures can then play the role they are supposed to play for organizations, which is to spur performance improvements. (The administration has, in my view, over-emphasized the politically appealing punitive language about reducing federal funds to low performers, which puts the cart before the horse. Let's get a system working before we start playing around with financial penalties.)

There is, of course, a risk that the government will not introduce a value-added system, in which case the numbers might indeed create the problems the presidents are warning about. But I am sad that the universities seem to be looking for reasons this system might not work rather than embracing the potential for performance improvement that ratings can bring. Doctors have done the same thing over the years to try to prevent publication of hospital performance information. Unfortunately, both these efforts smell of special interest, and, although I'm at a university, I cannot endorse them.
