No Classroom Left Unstudied
The federal government is embracing science-like experiments in the public schools, but some scholars argue that effective learning is harder to test than effective drugs

By DAVID GLENN
It was an afternoon that could have been scripted by Frank Capra. Busloads of fresh-faced kids filled a committee room as the U.S. Senate held hearings on the Bush administration's proposal to cut $400-million from an after-school program known as 21st Century Community Learning Centers.
It was May 2003. With dozens of cameras trained on him, Arnold Schwarzenegger -- who was then simply the "actor and youth activist Arnold Schwarzenegger" -- testified against the cuts. Steven Kinlock, a high-school senior from a poor Philadelphia neighborhood, told the committee that the centers had given him and his friends "a positive direction in life."
Something about the hearing felt fundamentally wrong to Grover J. (Russ) Whitehurst, director of the Institute of Education Sciences, which is the latest incarnation of the Education Department's oft-shifting research division. Movie stars and poignant personal stories were all well and good, he thought, but they're no substitute for careful scrutiny of the actual operation of the program.
The White House justified the cuts partly on the basis of a large-scale "randomized" study that had found, disappointingly, that children who attended the after-school centers did not enjoy significant academic gains relative to their peers. That study had been designed during the Clinton administration, so it could not easily be dismissed as the product of budget-cutting Republican ogres. But as far as Mr. Whitehurst could tell, few people at the hearing were giving the study its due.
"A lot of what goes on in Washington with respect to education evidence is simply advocacy," Mr. Whitehurst said at a conference a few months later. "I submit that that does not happen in medicine and agriculture."
"Education," Mr. Whitehurst continued, "is often degraded by the use of pseudoscience or weak science or anecdote in lieu of better methods."
Criticisms like those are not new. What has changed in the last several years is that people who share that perspective -- Mr. Whitehurst not least among them -- have risen to positions where they can act on their complaints.
Under Mr. Whitehurst, who was hired in 2001, the Education Department has greatly accelerated the Clinton team's slow turn toward medical-style randomized studies. The phrases "scientifically based research" and "scientifically based reading research" occur 111 times in the text of the No Child Left Behind Act, the Bush-backed education plan that was signed into law in 2002. If you want federal support to study, say, a new math curriculum, you had better be prepared to construct a valid experiment in which some classrooms use the new program while other, demographically similar, classrooms serve as a control group.
This summer the move to make education research more like medical research will reach a small milestone, with the debut of an ambitious online database of studies that meet the criteria laid down by the new law.
Many traditional education researchers, however, see peril and folly in the new regime. Some fear that randomized trials in education are based on a false analogy with health studies, and bristle at easy references to "medicine and agriculture." Resources now sometimes go to scholars with little sense of the real-world dynamics of classrooms, such skeptics warn.
Others suggest that the new experimental ethos is suffused with a naïve optimism about policy makers' responsiveness to research findings. Still others argue that the new federal studies tend to be centered on a too-narrow set of outcomes -- namely, scores on reading and mathematics achievement tests. The skeptics worry, in sum, that the government is emphasizing numbers and formal methods at the expense of other, less quantifiable kinds of knowledge about how schools operate.
"It's not just about methodological rigor," says Deborah J. Stipek, dean of the Stanford University School of Education. "Even if you achieve that, it's not always clear how your results will affect classroom practice. That's the question that the feds won't even touch."
Leaving No Child Behind
The coming months will provide both sides with some visible tests of their critiques. This summer the Education Department will unveil the What Works Clearinghouse, an online database of research findings that meet the government's new criteria for methodological rigor.
The project's proponents hope that within a few years, school superintendents and other policy makers will be able to turn to the clearinghouse (which is modeled on online databases for medical practitioners) for authoritative guidance about a wide range of topics, including science curricula and drug-abuse prevention.
And there is reason to believe superintendents will be scrambling for such advice: This summer thousands of schools and districts are expected to be officially designated as "failing" under the standards of No Child Left Behind. Some school leaders may feel desperate to find solutions that will help their students meet the new achievement-test benchmarks.
The catch is that the What Works Clearinghouse will initially focus on only a narrow range of topics, in part because very few existing studies meet the new methodology requirements. The demand for certain kinds of education research may rise much faster than the supply.
In part to remedy that perceived gap between supply and demand, more jobs and grants in education research are going to scholars outside the traditional circles of the American Educational Research Association.
Some of that money is flowing to labor economists who learned how to conduct randomized policy trials in such arenas as job training and welfare reform. Many of those economists have little or no experience in the schools and no ties to colleges of education. But their approach to research is a better fit with the new ethos.
"Fewer than 10 percent of AERA members are knowledgeable about randomized trials," says Robert F. Boruch, a professor of education and statistics at the University of Pennsylvania and the principal investigator for the What Works Clearinghouse. "And even fewer have actually worked on a randomized trial."
Mr. Whitehurst himself epitomizes the trend: He is a developmental psychologist by training -- he taught for many years at the State University of New York at Stony Brook -- and was appointed by the White House for his expertise as an experimenter, not for any deep background in education research.
"Though I have played around the edges of the education-research community for 30-plus years," he told members of the American Psychological Association at their meeting last year, "I wasn't really embedded in the issues of education research as a profession ... until I came to Washington in April of 2001."
Some members of the education-research association, especially those who believe strongly in the ethnographic and qualitative approaches that have dominated the field recently, are not at all pleased with the new climate.
At the association's annual meeting in San Diego in April, Paul S. Shaker, dean of the faculty of education at Simon Fraser University, in British Columbia, denounced what he called "a new 'common sense' of education that maligns or manipulates the corpus of educational research and attacks promising practices and reforms. In addition, a new type of education scholarship has emerged that is delivered in alternative ways, financed through unorthodox sources, motivated by nonacademic purposes, and supported through direct access to media and political organizations, including the federal government."
Mr. Shaker's panel was called "Is the Federal Government Taking Over Education Research?"
Errors by Trial?
Some of the new government-sponsored randomized education studies are vast, multimillion-dollar trials of, for example, the effectiveness of various instructional-software products. Those large-scale trials are generally led by private-sector research contractors like Mathematica Policy Research or MDRC, formerly the Manpower Development Research Corporation. Last September Mathematica was awarded a $5.6-million contract to assess 16 commercial software products that are touted as useful for improving students' reading and math skills.
The Education Department is also sponsoring smaller-scale projects. Take, for example, a $1.2-million grant that was awarded in 2003 to Sharon L. Ramey, a developmental psychologist who directs Georgetown University's Center for Health and Education. Ms. Ramey is examining the efficacy of a commonly used prekindergarten curriculum known as Building Language for Literacy.
Following the new requirements, Ms. Ramey has designed her study, which is taking place in 12 low-income schools in Maryland, as a randomized trial. Some schools have been assigned the Building Language for Literacy curriculum while others stick with the status quo.
And Ms. Ramey has made further distinctions: Half the teachers using the special curriculum will be given special training and support, while the other half will be given only the standard training that is normally attached to such curriculum packages. "We need to know what works for children," Ms. Ramey says. "We have had such a thin base on which we make decisions that affect children's everyday lives, and perhaps their future. ... We need rigorous science attached to a great deal of what we study."
The official expectation is that Ms. Ramey's randomized study, and dozens of others like it, will offer more-persuasive and more-useful evidence about what causes achievement-test gains than do the "process-oriented" program evaluations that have dominated education research during the past three decades. Such evaluations tend to involve detailed descriptions of the mechanisms of a particular school or classroom, without the use of a control group and often without explicit reference to concrete educational results like test scores.
"Education had drifted off into this other direction where the model was one from sociology and political science, where they studied schools as organizations," says Thomas D. Cook, a professor of sociology and psychology at Northwestern University who is a strong advocate of randomized trials. "As opposed to a model that says, 'How can I improve schools? What are the things that can work? Let's test them out.'"
Many scholars, however, insist that Mr. Whitehurst, Mr. Cook, and their allies are overselling the promise of randomized trials. Such studies may be elegantly designed, the critics say, but they may not have much external validity -- that is, they may not offer lessons that can be generalized throughout the country.
"The correct expectation is that local, and even personal, events are so determinative that you can't draw simple lessons from randomized trials," says Robert E. Stake, a professor of education at the University of Illinois at Urbana-Champaign. According to this line of argument, even if Ms. Ramey takes great pains to make sure that her prekindergarten classrooms are demographically and structurally similar and to allocate resources randomly, it would only take one unusually gifted (or incompetent) teacher to skew her results. A study large enough to flatten out such statistical anomalies would need to be enormous, and thus prohibitively expensive.
A related critique holds that it is extremely difficult to monitor randomized trials to ensure that the teachers are actually adhering to the prescribed curriculum.
"The randomized double-blind test is probably the best test of pharmaceuticals that could ever be designed," says Yvonna S. Lincoln, a professor of education at Texas A&M University. "But we're in deep trouble if we try to use that method in education."
She says that a colleague of hers used to joke, "'You've got to be kidding me. We can't control for when a first grader needs to go to the bathroom. How are we going to control all of these interventions?' And he was absolutely right. It's very difficult to maintain the boundaries of a randomized trial."
Half Empty, Half Full
Mr. Whitehurst concedes that randomized trials are difficult to do well. But he rejects Ms. Lincoln's suggestion that medical or pharmaceutical trials are somehow free of such challenges.
A study of falls and accidents among elderly people, for example, would involve "diet, exercise, the design of the house, the self-control of the individuals involved," Mr. Whitehurst says. "There are a whole number of challenges like that in the medical arena that I think are not less complex, not less contextual, not less multivariate than the problems you get in education. That doesn't mean that you can't use a randomized trial to study them."
Observing these conflicts from a slight remove is Timothy A. Hacsi, an instructor in history at the University of Massachusetts at Boston and the author of Children as Pawns: The Politics of Educational Reform (Harvard University Press, 2002). Mr. Hacsi believes that randomized trials were overdue for acceptance by education researchers, and he is cautiously optimistic about certain elements of the new regime.
He worries, however, that the Education Department may be too optimistic about the prospects for translating research findings into public policy. "It starts to look more like wishful thinking than anything else," he says.
The What Works Clearinghouse, and similar elements on the Education Department's own Web site, remind Mr. Hacsi of the technocratic hubris of the Kennedy and Johnson administrations, a time when hopes ran high that public policy could be guided by social science.
In particular, Mr. Hacsi suggests that the new randomized experiments are being conducted within a policy framework -- No Child Left Behind -- whose own premises are not supported by rigorous evidence. "The whole thrust of No Child Left Behind, that you're going to create accountability through testing -- there is no evidence that that will work," he says.
"There is no evidence at all that testing will improve overall scores," Mr. Hacsi says. "There is no evidence that accountability will improve overall scores. And there's no evidence that testing will lead to accountability."
A central premise of No Child Left Behind, Mr. Hacsi says, is that similar reforms were effective in Texas in the 1990s. "But there's all sorts of evidence now that, Oh, no, it didn't work in Texas," he says. "It just astounds me that the same people who are saying that high-stakes testing, without funding education to help meet the testing requirements, will work, are saying, 'Oh, we're just going to rely on randomized experiments.'"
Northwestern's Mr. Cook, who serves on several new federal advisory panels, is thrilled by the new experimental emphasis. He worries, however, about the cultural divide between the American Educational Research Association and the newly emergent cadre of experimental researchers. "The people in education schools have really been clubbed into acquiescence," he says. "But this project won't work unless we invite them into fraternity and sorority. We need to have a long-term perspective."
"This really is an attempt at a revolution," Mr. Cook continues. "It has a top-down, elitist feel because of the involvement of people from departments of economics and psychology and statistics. But unless departments of education are willingly incorporated, this won't work. It can't be top-down forever."
http://chronicle.com
Section: Research & Publishing
Volume 50, Issue 38, Page A12