Counterpoint: Doctoral-Program Rankings—the NRC Responds

October 17, 2010

When A Data-Based Assessment of Research-Doctorate Programs in the United States was released by the National Research Council on September 28, we expected many comments, including criticisms. We would like to offer our perspective on some of these comments and look forward to hearing the views of others on the type of online database that should be maintained in the future to help strengthen doctoral programs.

Stephen Stigler, a professor at the University of Chicago, offered a thoughtful critique, saying that the project "was doomed from the start" when reputation was downplayed. Rankings based on reputation were provided in the 1982 and 1995 NRC reports, but they were excluded from the charge given to the committee for the 2010 report. We agree that reputationally based rankings contain important information, which was discussed in the report, but they also have serious weaknesses.

The decision not to include rankings based purely on reputation arose from three considerations. First, pure reputational rankings are subject to "halo effects": a program's ranking may be inflated by its university's overall reputation, or it may lag behind the program's current quality because of its past reputation. Second, reputationally based rankings were not supported by many of the universities whose participation was needed to collect the data. And third, this study was intended to provide a comprehensive, updatable collection of data on characteristics associated with perceived program quality, allowing faculty, students, administrators, and other stakeholders to assess programs according to their own values and thereby serving as a transparent tool for strengthening doctoral programs. The reputationally based rankings that received the most attention in previous reports did not provide a means to achieve this last important objective.

Professor Stigler also argues that the rankings have "little credence." We want to stress that the rankings provided in the NRC report are intended to be "illustrative." They are not definitive, and they are not endorsed by the NRC over other alternatives that might be constructed. Instead, they are examples of two ways of deriving weights from faculty values, illustrating how stakeholders can apply weights of their own.
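To make concrete what applying one's own weights might look like, here is a minimal sketch, assuming standardized (z-scored) program characteristics; the program names, characteristics, and numbers are invented for illustration and do not reproduce the NRC's actual computation.

# Hypothetical sketch: a stakeholder applies self-chosen weights to
# standardized program characteristics to produce one possible ranking.
# All names and values below are made up for illustration.

programs = {
    "Program A": {"pubs_per_fac": 1.2, "cites_per_fac": 0.8, "time_to_degree": -0.3},
    "Program B": {"pubs_per_fac": 0.4, "cites_per_fac": 1.5, "time_to_degree": 0.6},
    "Program C": {"pubs_per_fac": -0.2, "cites_per_fac": 0.1, "time_to_degree": 1.1},
}

# The stakeholder's own weights (summing to 1), reflecting what that reader values.
weights = {"pubs_per_fac": 0.5, "cites_per_fac": 0.3, "time_to_degree": 0.2}

# A weighted sum of standardized characteristics gives each program a score;
# sorting the scores yields one "illustrative" ranking among many possible ones.
scores = {
    name: sum(weights[c] * value for c, value in chars.items())
    for name, chars in programs.items()
}
for rank, (name, score) in enumerate(sorted(scores.items(), key=lambda kv: -kv[1]), start=1):
    print(rank, name, round(score, 2))

Changing the weights changes the ordering, which is precisely the point: the database, not any single ranking, is the product.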

Professor Stigler criticizes the fact that not all programs were surveyed in determining the weights for the R (regression-based) rankings. Even if they had been, the R and S (survey-based) rankings would still be called "illustrative." An important insight from comparing the S and R rankings is that faculty members generally do not assign great importance to program size (compared with scholarly output per faculty member) when assigning weights directly to characteristics, yet when they rank programs, size appears to carry substantial implicit weight. So it is not clear which set of weights most accurately represents faculty values.

The committee also wanted to avoid the "spurious precision" often associated with rankings. The ranges provided with the R and S rankings estimate statistical uncertainties, but they do not capture all of the uncertainties inherent in the challenging task of developing a methodology for assessing programs based solely on quantitative measures of program characteristics. The committee presented both the R and S rankings with large statistical ranges and called them "illustrative" to indicate that models based solely on quantitative variables and on implicit and explicit faculty views also have pitfalls, as noted in the report.
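As a rough illustration of how such ranges can arise, the following sketch jitters a set of invented weights many times and reports a range of ranks for each hypothetical program. This is a generic resampling illustration, not the committee's actual procedure, and all names and numbers are assumptions made for the example.

import random

# Hypothetical programs and baseline weights (invented for illustration).
programs = {
    "Program A": {"pubs_per_fac": 1.2, "cites_per_fac": 0.8},
    "Program B": {"pubs_per_fac": 0.4, "cites_per_fac": 1.5},
    "Program C": {"pubs_per_fac": -0.2, "cites_per_fac": 0.1},
}
base_weights = {"pubs_per_fac": 0.6, "cites_per_fac": 0.4}

rank_samples = {name: [] for name in programs}
for _ in range(500):
    # Jitter the weights to mimic variation in faculty responses, then renormalize.
    jittered = {c: max(w + random.gauss(0, 0.1), 0.0) for c, w in base_weights.items()}
    total = sum(jittered.values()) or 1.0
    jittered = {c: w / total for c, w in jittered.items()}
    scores = {n: sum(jittered[c] * v for c, v in chars.items()) for n, chars in programs.items()}
    for rank, name in enumerate(sorted(scores, key=lambda n: -scores[n]), start=1):
        rank_samples[name].append(rank)

# Report an approximate 5th-to-95th-percentile range of ranks for each program.
for name, ranks in rank_samples.items():
    ranks.sort()
    lo, hi = ranks[len(ranks) // 20], ranks[-len(ranks) // 20]
    print(name, f"rank range: {lo}-{hi}")

A program whose rank range spans several positions is better described by that range than by any single number.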

A valid criticism made by some observers concerns potential errors in the huge database. The NRC took many precautions during data collection to ensure the accuracy of the information, but some errors may nonetheless persist. The NRC seeks information regarding possible mistakes by November 1, 2010, and will work with each institution to identify the source of any error and to see whether it can be remedied. We will record what universities tell us on a publicly available list. Unfortunately, at least one institution submitted faculty lists for some departments that were incorrect because they included numerous adjunct faculty members.

A number of other universities corrected similar errors during the validation process. Some errors may have arisen from the complex guidance the NRC provided in an effort to obtain lists of only those faculty members involved in doctoral education and research. Other errors may have resulted from mistakes made by some universities when collecting and submitting data. At this point, it would be difficult to make corrections to the 2005-6 data for characteristics, such as publications and citations, that depend on faculty lists.

The fields of computer science and communication are two in which the illustrative rankings may be more problematic. In computer science, faculty publications were re-counted when it became clear that certain peer-reviewed conference proceedings are highly valued in the field. While publications per faculty member were updated based on this re-count, citations per faculty member had to be dropped, because peer-reviewed conference proceedings could not be included in the citation data. In communication, the information collected for the database may not include all of the types of scholarly work that are most important for assessing programs in that discipline.

We hope that departments will post updated data, perhaps for the 2009-10 academic year, on their program Web sites. We also hope that each discipline will discuss whether the program characteristics and methods of data collection encompass what is most important for assessing their programs, so that appropriate changes may be incorporated in the future. Lastly, we hope that all of those interested in graduate education in the United States will work together to maintain a regularly updated, online database that can be an important tool for helping to strengthen graduate programs through continuous improvement. The National Research Council is certainly willing to help with that task.

E. William Colglazier is executive officer of the National Academy of Sciences and chief operating officer of the National Research Council. Jeremiah P. Ostriker is chair of the Committee on an Assessment of Research-Doctorate Programs and provost emeritus at Princeton University.