The NRC Rankings: Further Thoughts From Stephen Stigler

November 2, 2010, 3:32 pm

Five weeks ago, the National Research Council released its long-awaited rankings and assessments of American doctoral programs.

Many institutions are already making use of the avalanche of data in the NRC report. The University of Florida has announced a comprehensive review of its doctoral programs. And administrators at Ohio State University say that the NRC project helped provide a structure for a continuing assessment of doctoral programs there.

At the same time, doubts about the quality and accuracy of the NRC’s analyses have not died down. A nontrivial number of doctoral programs seem to have given the NRC incorrect counts of the number of faculty members involved in their programs. Such errors could cause large ripple effects because several of the NRC’s measures were analyzed on a per-faculty-member basis. If your program’s faculty denominator is too large, its per-capita publication and citation rates in the NRC report could be seriously distorted.

Two weeks ago, The Chronicle published a defense of the project by E. William Colglazier, executive officer of the National Academy of Sciences, and Jeremiah P. Ostriker, provost emeritus at Princeton University and chair of the committee that produced the NRC’s doctoral assessments. Some of their comments were made in response to Stephen M. Stigler, a professor of statistics at the University of Chicago who has been one of the report’s most vocal critics.

Here, Mr. Stigler offers some additional thoughts. —Ed.

E. William Colglazier and Jeremiah P. Ostriker make a number of cogent observations regarding the NRC rankings. I agree that if one judges the committee’s report in the light of the stringent (and perhaps impossible) restrictions placed upon them by the charge to the National Academy of Sciences, the report looks much better than if one views it in comparison with what might have been achieved without those restrictions. But I would like to highlight two points that can be of help going forward.

First, I think Mr. Colglazier and Professor Ostriker grant their critics too much in saying that many complaints derive from “potential errors in the huge database,” with the implication that these errors might be correctable, either now or in future reports.

The point could be made about most of the variables they recorded, but I focus on one for illustration, the list of program faculty members. The NRC committee gave reasonable and clear instructions to the universities on which faculty members to list, and the universities (in cases I have looked at) interpreted them in reasonable and intellectually sound ways. [To read those instructions, see Pages 163-71 of the NRC’s report. —Ed.]

When, for example, one university included over 20 postdoctoral instructors in the faculty list for one program, it was not a data error; it was based upon the fact that these instructors play an important role in the doctoral program. It is true this turned out to be a strategic error: Had the university known that the most important role of the faculty lists would be as the denominator of the variables “publications per faculty member” and “awards per faculty member,” and that the size of the program faculty would play no direct role in the study as an indicator of intellectual scope, they might well have interpreted the request differently.

Omitting the postdocs would have increased most of their indices by as much as 50 percent. But the real point is that it is wrong to think of counts of faculty members, publications, students, citations, awards, and so forth as simple objective numbers where there is a “right” answer.
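The arithmetic behind a shift of that size can be sketched in a few lines. The figures below are hypothetical, chosen only for illustration and not taken from the NRC database: a program with 40 regular faculty members and 20 postdoctoral instructors, with every publication credited to the regular faculty.

```python
def per_capita(total, faculty_count):
    """Return a per-faculty-member rate, e.g. publications per faculty member."""
    return total / faculty_count


# Hypothetical figures, not drawn from the NRC report.
regular_faculty = 40
postdocs = 20
publications = 120  # assumed to be credited entirely to the regular faculty

rate_with_postdocs = per_capita(publications, regular_faculty + postdocs)
rate_without_postdocs = per_capita(publications, regular_faculty)

# Relative change in the per-capita rate when postdocs leave the denominator.
relative_increase = rate_without_postdocs / rate_with_postdocs - 1

print(f"With postdocs listed:    {rate_with_postdocs:.2f} per faculty member")
print(f"Without postdocs listed: {rate_without_postdocs:.2f} per faculty member")
print(f"Relative increase:       {relative_increase:.0%}")
```

Under these assumptions the numerator is unchanged and the denominator shrinks from 60 to 40, so the rate rises by a factor of 60/40, or 50 percent. If the postdocs had contributed publications of their own, the increase would be smaller.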

Different programs will have different ways of interpreting “faculty,” and all of them may be correct, that is, intellectually defensible in light of the specific program. There is no single “right” answer, and these were not simple data errors; they were strategic errors that could only have been avoided if the committee had told the universities specifically what it intended to do with the data. The charge to the NRC committee was based upon a false hypothesis, that there were simple objective measures to be had.

Second, I would call attention to a way that the recent report differed from those issued by the NRC before, a difference that I have come to believe is important. The report gives ranks (as ranges), but it does not give the index values being ranked, which previous studies had included.

Had the index values been reported, readers would have seen the trivial (in many cases minute) differences between most of them, an indication that these indices (simple weighted averages of standardized variables) do not discriminate among the programs in a useful way, and that they tell us little of what we may wish to know.

The wide ranges of the ranks reflect the fact that small perturbations in the weights lead to major changes in the ranks even though these perturbations have little effect upon the index values.
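A toy example may make the mechanism concrete. It is a minimal sketch under stated assumptions, not the NRC's procedure or data: five hypothetical programs (P1 to P5), two hypothetical standardized variables, and an index formed as a simple weighted average. The weights are moved from (0.50, 0.50) to (0.52, 0.48).

```python
def index_values(scores, weights):
    """Weighted average of standardized variables for each program."""
    return {
        name: sum(w * z for w, z in zip(weights, zs))
        for name, zs in scores.items()
    }


def ranks(values):
    """Rank programs by index value; rank 1 is the highest index."""
    ordered = sorted(values, key=values.get, reverse=True)
    return {name: position for position, name in enumerate(ordered, start=1)}


# Hypothetical standardized scores on two variables (illustrative only).
scores = {
    "P1": (1.00, -1.00),
    "P2": (0.50, -0.49),
    "P3": (0.00, 0.02),
    "P4": (-0.50, 0.53),
    "P5": (-1.00, 1.04),
}

base = index_values(scores, (0.50, 0.50))
perturbed = index_values(scores, (0.52, 0.48))  # a small change in the weights

base_ranks = ranks(base)
perturbed_ranks = ranks(perturbed)

# Largest movement of any single index value caused by the perturbation.
max_shift = max(abs(perturbed[name] - base[name]) for name in scores)

for name in scores:
    print(f"{name}: index {base[name]:+.4f} -> {perturbed[name]:+.4f}, "
          f"rank {base_ranks[name]} -> {perturbed_ranks[name]}")
print(f"Largest index shift: {max_shift:.4f}")
```

In this constructed case no index value moves by more than about 0.04 on a scale where the underlying variables span roughly two units, yet the ordering of the five programs reverses completely. The example is built to be extreme; it shows only that nearly equal index values make ranks fragile.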

Put another way, they reflect the lack of statistically significant differences between the indices of a majority of the programs under study. But here, as usual, statistical significance does not tell the whole story: Had the index values themselves been reported, the reader could have seen that the differences were substantively negligible as well. In omitting the index values, the report overemphasizes rank, which is after all not itself important.

It is not too late to add these index values to the online spreadsheet; the computations have presumably already been done in-house, and no changes in the database would be needed.
