
The Chronicle of Higher Education
From the issue dated November 5, 1999

The 'Open-Source Movement' Turns Its Eye to Science

Revealing the code for software used in research will help verify data, advocates say

By VINCENT KIERNAN

Upton, N.Y.

Scientists in academe, who already share research materials and papers freely with each other, should take a further step toward openness by releasing the full texts of computer programs that they use to reach conclusions in their work.

That's the position being advanced by a small but increasingly vocal group of scholars who maintain that the current state of affairs in science -- in which researchers rarely supply each other with details about the computer programs they use -- has left a cloud of uncertainty over the validity of modern, computer-intensive scientific research.

"Good science must be verifiable," said J. Daniel Gezelter, an assistant professor of chemistry at the University of Notre Dame. But scientific research that relies on software to analyze or simulate scientific processes, he said, "is not verifiable in practice unless you can look at the source code" -- the listing of individual commands in a piece of software.

He and other scientists from academe, government, and industry gathered here in October at the U.S. Energy Department's Brookhaven National Laboratory for a day-long conference that promoted the development and use of open software for scientific research.

Their efforts grew from a groundswell among computer enthusiasts known as the "open source" movement, which holds that the source code of software in general -- even commercial products -- should be freely available for inspection and even modification by consumers. Open-source enthusiasts are behind the creation of Linux, a computer operating system developed by volunteer programmers around the world, which can be downloaded and used by anyone at no charge.

Linux increasingly is finding fans among scholars. In August, the Energy Department's Fermilab particle accelerator, in Batavia, Ill., announced that it would begin using Linux on its computers. Researchers at the State University of New York at Stony Brook are building a supercomputer by hooking together a series of desktop machines equipped with Pentium II processors and the Linux operating system. By next year, the researchers plan to have 256 off-the-shelf computers working together to solve problems previously reserved for supercomputers that cost far more.

"It's so open that I can't keep my students in my lab," said Yuefan Deng, a professor of applied mathematics and statistics at Stony Brook. "They are hired away for salaries higher than faculty members'."

Other open-source programs recently developed by scientists in academe include Another Molecular Mechanics Program, developed by researchers at Thomas Jefferson University to determine the structure of biological molecules, and Loci, created by scholars at the Open Lab, an independent group in Hudson, Mass., to coordinate computer programs that work together on a network.

Even university libraries are benefiting from open-source software. Prospero, a program developed at Ohio State University's medical library, handles most of the tasks involved when one university requests material from another institution's library. Not yet six months old, the program is already in use by more than 100 institutions, according to Daniel Chudnov, of Yale University's medical library, which developed a predecessor of Prospero.

Outside academe, some companies are producing open-source versions of commercial scientific software. For example, International Business Machines Corporation recently decided to release the source code for Visualization Data Explorer, a product that some researchers rely on for scientific visualization -- the creation of maps, graphics, and animations that help make sense of research data.

The company's move came after a debate over how to pay for keeping Visualization Data Explorer current. I.B.M. officials considered simply withdrawing the product, but realized that doing so would hurt scientists who rely on it. Still, the company was leery of releasing the source code, for fear that competitors might be able to take advantage of its availability. I.B.M. decided to release it anyway, and to change the program's license so that others could modify the code.

Companies that produce open-source software for science do so because they hope to benefit financially from the arrangement even without selling the software, said R. Kent Koeninger, a strategic technologist for SGI, a high-performance-computer company. SGI has decided to produce an open-source version of a program it developed for managing files on a computer. The file system eventually will be incorporated into Linux, and SGI officials hope that its availability will help promote the sale of computers that the company builds, Mr. Koeninger said.

Echoing arguments favoring the communal process through which Linux was crafted, Notre Dame's Mr. Gezelter told the Brookhaven conference that scrutiny from peers would improve the programs that scientists write for their research. For example, one scientist might offer a piece of software to improve another scientist's program if both were willing to put their own source code out in the open.

Mr. Gezelter described his own experience in producing open-source software for modeling molecules. "Sometimes the donations for your code come from areas of science that you had no idea existed," he said. In his case, a scholar who studies lichen offered some source code that improved his program, which Mr. Gezelter found unexpected because analysis of the simple plant forms seems dissimilar from the study of molecular structures.

"This is good for scientific discoveries," said Mr. Gezelter, who operates a World-Wide Web site (http://www.openscience.org) that advocates the wider use of open-source software in science.

Not all scientists are interested in having flaws in their programs laid bare, however. In the 1970s, computer scientists devoted considerable attention to developing techniques for proving that a program has been written correctly, but interest in that area has waned, said Bruce Perens, co-founder of Software in the Public Interest, a non-profit group that promotes Linux.

"In general, computer science has not approached verifiability of computer programs as much as it should," he said. "In the end, people wanted to get their programs done" and were less concerned about whether the work was flawless.

Mr. Gezelter and others argue that science is a "gift culture," one based on the donation of goods and services, and especially ideas, by individual scientists. Open-source software is a logical extension of that culture, they say.

Geoff Davis, who recently left Dartmouth College's faculty to join Microsoft Research, collaborates with Mr. Gezelter on open-source issues in science. Mr. Davis noted that scholars effectively donate their intellectual property to journals by signing over the copyright to their articles, all to foster the dissemination of scientific knowledge. "A lot of the ideals of academia are close to those of the open-source movement," he said.

"The drive really is to obtain status, and you do that by giving things to the community," Mr. Gezelter said at the Brookhaven meeting.

Nonetheless, there are roadblocks to scientists' embracing the open-source movement. One is that, as academe now operates, publicly releasing a computer program does little or nothing to advance a scientist's academic career, in which productivity is judged largely by the number and quality of published scientific papers.

"Releasing your code should be treated as equivalent to publishing a paper," Mr. Gezelter said.

A related problem, said Mr. Davis, is that the rules governing scientific publishing do not require researchers to identify the software they have used, no matter who wrote it. The number of times a journal article is cited by other scholars is often taken as evidence of the article's importance, but since software is not routinely cited, scholars can't point to other researchers' use of their programs as evidence of achievement, Mr. Davis said.

For example, he has written an open-source program to analyze electronic signals, but because other scholars don't consistently cite software, he does not know who has used it. Reading papers in his field, he said, "you have no idea in general of what software they've used." The problem could be solved simply, Mr. Davis suggested, if makers of bibliographic software agreed on a format under which scientists could cite software used in their research.

Others argue that the federal government and other sources of funds for scientific research should do more to encourage scientists to release their software.

Mark Galassi, a research scientist at Los Alamos National Laboratory, which is operated by the University of California, told the Brookhaven conference that some legal hurdles stand in the way of developing a library of open-source programs for performing mathematical tasks that are common in scientific computing. He is one of several scientists and programmers who are working on such an effort, known as the GNU Scientific Library. The project is financed by the Free Software Foundation, one of the major players in the development of Linux, which incorporates the mathematical software developed by Mr. Galassi and his fellow volunteers.

Because Mr. Galassi works at the federal lab, the U.S. government ordinarily would own the rights to any software that he contributed to the library, even if he wrote the software on his own time, he said. He averted that problem only because, when he was hired by Los Alamos, he already had a legal agreement with a previous employer permitting him to work on the library as an open-source project.

The federal government -- instead of insisting on retaining rights to software and thereby wasting publicly produced programs -- should be promoting the open-use approach, Mr. Galassi argued.

In the end, Mr. Gezelter said, if open-source software is to become the standard, scientists themselves must take the initiative to change the norms of scientific culture. "Convince your adviser, students, postdocs, colleagues, and yourself that good science depends on verifiable results," he told the conference, "and that real verification cannot be achieved without publicly available source code."


http://chronicle.com
Section: Information Technology
Page: A51




Copyright © 1999 by The Chronicle of Higher Education