After Losing Users in Catalogs, Libraries Find Better Search Software

Lisa Billings/Freelance

Jean A. Bauer, a graduate student in American history at the U. of Virginia, has been frustrated with the confusing search results from the university library's old online catalog. A new one is in the works.
September 28, 2009

Thomas Jefferson founded the University of Virginia. So you might think that typing his name into Virgo, Virginia's online library catalog, would start you off with a book about him.

Jean A. Bauer tried it the other night. At the top of the results list were papers from a physics conference in Brazil.

The problem is that traditional online library catalogs don't tend to order search results by ranked relevance, and they can befuddle users with clunky interfaces. Bauer, a graduate student specializing in early American history, once had such a hard time finding materials that she titled a bibliography "Meager Fruits of an Ongoing Fight With Virgo."

That's changing because of two technology trends. First, a growing number of universities are shelling out serious money for sophisticated software that makes exploring their collections more like the easy-to-filter experience you might find in an online Sears catalog.

Second, Virginia and several other colleges, including Villanova University and the University of Rochester, are producing free open-source programs that tackle the same problems with no licensing fees.

A key feature of this software genre is that it helps you make sense of data through "faceted" searching, common when you shop online for a new jacket or a stereo system. Say you type in "Susan B. Anthony." The new system will ask if you want books by her or about her, said Susan L. Gibbons, vice provost and dean of Rochester's River Campus Libraries. Users can also sort by media type, language, and date.
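
For readers curious about what happens under the hood, faceted searching can be sketched in a few lines of Python. This is a minimal illustration, not the code behind any of the products named here. The records, field names, and helper functions are all invented for the example.

```python
from collections import Counter

# Hypothetical catalog records; titles, authors, and fields are invented.
RECORDS = [
    {"title": "Collected Letters", "author": "Susan B. Anthony",
     "subject": "Suffrage", "media": "book", "language": "English"},
    {"title": "A Biography", "author": "Jane Doe",
     "subject": "Susan B. Anthony", "media": "book", "language": "English"},
    {"title": "A Documentary", "author": "John Roe",
     "subject": "Susan B. Anthony", "media": "video", "language": "English"},
    {"title": "Une biographie", "author": "Jeanne Dupont",
     "subject": "Susan B. Anthony", "media": "book", "language": "French"},
]


def search(records, query):
    """Return every record that names the query as author or as subject."""
    q = query.lower()
    return [r for r in records
            if q in r["author"].lower() or q in r["subject"].lower()]


def role_counts(results, query):
    """Count works BY the person versus works ABOUT the person."""
    q = query.lower()
    return {
        "by": sum(q in r["author"].lower() for r in results),
        "about": sum(q in r["subject"].lower() for r in results),
    }


def facet_counts(results, field):
    """Count how many results fall under each value of one facet."""
    return Counter(r[field] for r in results)


def narrow(results, field, value):
    """Keep only the results matching the facet value the user clicked."""
    return [r for r in results if r[field] == value]


hits = search(RECORDS, "Susan B. Anthony")
print(role_counts(hits, "Susan B. Anthony"))
print(facet_counts(hits, "media"))
print([r["title"] for r in narrow(hits, "language", "French")])
```

The counts are what a user sees beside each filter. Clicking one simply narrows the list already in hand.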

These products can also rank search results by relevance and use prompts of "Did you mean … ?"

"It's sort of our answer to, Why is it you need a library when you have Google?" said Ms. Gibbons. "What this is going to do is show how much you've been missing."

It's a pressing issue. Libraries once had a monopoly on organizing data about content. No longer. And today some users gripe about how libraries present materials online: how scattered they are, how sluggish searches can be, and how often those searches are useful only if you already know exactly what you want.

The worry for Jennifer Bowen, assistant dean of the River Campus Libraries, is that library catalogs could become "marginalized."

"There are people who just cannot find what they need," she said. "And they're just sort of giving up on libraries."

A Single Entry Point

The issue concerns professors, too. One software developer pointed to a 2006 study by Ithaka, a nonprofit group that promotes the use of information technology in higher education. It found that faculty members value the campus library but "perceive themselves to be decreasingly dependent on the library for their research and teaching." The report described what appeared to be "growing ambivalence about the campus library."

The buzzwords for the technology that librarians hope will allow users to rediscover their collections are "Web-scale index searching."

That, in Ms. Gibbons's translation, is a fancy way of saying that the system, like Google, works by searching against a vast index of information. It's a contrast with an earlier attempt to deal with the search problem through "federated searching," where there is no local index, and each query is taken from the user and sent individually to various databases.
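
The difference between the two approaches can be shown with a toy example in Python. It is only a sketch under simplifying assumptions: the three "databases" are small dictionaries invented for illustration, and real systems deal with network calls, ranking, and far larger collections.

```python
from collections import defaultdict

# Hypothetical sources standing in for separate library databases.
SOURCES = {
    "catalog": {"c1": "thomas jefferson a biography",
                "c2": "physics conference proceedings"},
    "articles": {"a1": "jefferson and the early republic",
                 "a2": "cell division review"},
    "images": {"i1": "portrait of thomas jefferson"},
}


def federated_search(sources, term):
    """Federated: pass the query to each source in turn at search time."""
    hits = []
    for name, docs in sources.items():  # one round trip per source
        for doc_id, text in docs.items():
            if term in text.split():
                hits.append((name, doc_id))
    return sorted(hits)


def build_index(sources):
    """Index-based: harvest every source once, ahead of time."""
    index = defaultdict(set)
    for name, docs in sources.items():
        for doc_id, text in docs.items():
            for word in text.split():
                index[word].add((name, doc_id))
    return index


def indexed_search(index, term):
    """Answer the query with a single lookup in the prebuilt index."""
    return sorted(index.get(term, set()))


INDEX = build_index(SOURCES)
print(federated_search(SOURCES, "jefferson"))
print(indexed_search(INDEX, "jefferson"))
```

Both functions return the same records here. The indexed version does its heavy lifting before anyone types a query, which is the trade-off Ms. Gibbons describes.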

You expect a Google search to cast the broadest possible net. The same should apply to a library catalog, the thinking goes. That means a single entry point to the collection. The entire collection: books, articles, digital objects. Heck, why not even herbarium specimens?

Marshall Breeding, director of innovative technology and research at the Vanderbilt University library, calls the concept "an ambitious goal—and at this point I think it's more of a goal than reality."

But the move toward simplified, silo-busting, relevant-result-returning library searches may come with its own problems.

Mr. Breeding, who founded the Web site Library Technology Guides, has observed "pockets of resistance" in the library community. Some argue that new search products—sometimes called next-generation catalogs or discovery interfaces—amount to a dumbing-down of catalogs.

By contrast, traditional search tools reinforce the idea that library users need a clear understanding of the different materials involved in research, Mr. Breeding said, such as the difference between articles and monographs. New interfaces that mix many different information sources blur all that, he said.

And then there is the slew of devil-in-the-details questions that arise from the content convergence.

Will users understand it? Will they find what they want? Will books be properly represented among the flood of articles? What about image collections? Could the pile of stuff just get too big?

Libraries' online catalogs are typically one module of an integrated software system that runs library functions like the circulation desk, acquisitions, and cataloging. They are a window into what libraries manage inside their integrated systems, Mr. Breeding said, which tends to be mostly the print collections. But the problem is they lack a good way to include the growing electronic part of the library collection, he said.

What the new interfaces share is the ability to derive material from catalogs and combine it with other data in a modern package.

The commercial market for these interfaces has already produced Encore, from Innovative Interfaces, adopted by at least 44 academic libraries in the United States, according to Mr. Breeding's tally; AquaBrowser, from Medialab Solutions, used by 23 libraries; and Primo, from Ex Libris, adopted by 13 libraries.

How much institutions will have to pay for new commercial systems will vary depending on both what comes with the software and the size and complexity of the library. That could mean a price as low as $10,000 for a small academic library to one in the $100,000 range for a much larger one, Mr. Breeding said.

A 'Shift of Power'

In the open-source world, at least 10 academic libraries have turned to VuFind, which originated at Villanova. Virginia's Blacklight, with Stanford University as a development partner, is in a beta phase. And Rochester's eXtensible Catalog, or XC, backed by $1.2-million from the Andrew W. Mellon Foundation, will be rolled out in the spring.

The shift from commercial products to open-source ones is about more than money, though.

Bess Sadler, chief architect of the online library environment at the University of Virginia, sees the open-source Blacklight project as a "shift of power," as she wrote recently in the journal Library Hi Tech. The idea is that libraries, which know their local needs, should control the technology that patrons use to gain access to their collections. That's a change from the one-size-is-good-enough-for-everybody, commercially managed model that has prevailed in the industry.

The ability to customize is important when it comes to something like a music collection. A librarian might get this question: "I play the guitar. My boyfriend plays the flute. What duets can we play together?" In the past, even though Virginia had cataloged the instruments used in all of its sheet music, a search of that information was impossible because the fields that were indexed were maintained by a vendor, Ms. Sadler said.

"The problem with a vendor solution is that it's hard for vendors to tailor that solution for different collections, for different user populations, for different specializations," she said.

With an open-source system, a library can set its own relevance rankings and adjust them based on what users want. By maintaining the system itself, Virginia is now able to search by musical instrument.
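
What that local control might look like can be sketched in Python. The example below is hypothetical and is not drawn from Blacklight's code: the sheet-music records, the field weights, and the function names are all assumptions made for illustration.

```python
# Hypothetical sheet-music records; titles and fields are invented.
SHEET_MUSIC = [
    {"title": "Guitar Duets Volume One", "subject": "chamber music",
     "instruments": ["flute", "guitar"]},
    {"title": "Collected Works", "subject": "guitar music",
     "instruments": ["guitar"]},
    {"title": "Sonata", "subject": "sonatas",
     "instruments": ["flute", "piano"]},
]


def find_by_instruments(records, wanted):
    """Return titles of scores that call for every instrument requested."""
    wanted = set(wanted)
    return [r["title"] for r in records if wanted <= set(r["instruments"])]


def score(record, term, weights):
    """Add up the weight of each text field that contains the term."""
    term = term.lower()
    return sum(weight for field, weight in weights.items()
               if term in record[field].lower().split())


def rank(records, term, weights):
    """List matching titles, highest score first."""
    scored = [(score(r, term, weights), r["title"]) for r in records]
    scored.sort(key=lambda pair: -pair[0])
    return [title for s, title in scored if s > 0]


# Assumed weights: one set favors titles, a local adjustment favors subjects.
TITLE_FIRST = {"title": 3.0, "subject": 2.0}
SUBJECT_FIRST = {"title": 1.0, "subject": 5.0}

print(find_by_instruments(SHEET_MUSIC, ["guitar", "flute"]))
print(rank(SHEET_MUSIC, "guitar", TITLE_FIRST))
print(rank(SHEET_MUSIC, "guitar", SUBJECT_FIRST))
```

The instrument lookup works only because the library chose to index that field. Swapping one set of weights for the other reverses the order of the two guitar results without touching the records themselves.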

The downside is libraries need someone on staff who can install and maintain the open-source program. So far, vendors aren't supporting products like VuFind the way they support established open-source products like Koha and Evergreen, both integrated library systems, said Mr. Breeding. Vendors will install software like Evergreen, host it on their own servers, and provide a help desk that you can call if something breaks. Not so for the newer software. Another barrier is going to be trusting that an open-source project is sustainable. There is always a concern that there will not be a community of users to keep developing it.

Also, the open-source systems have been slower to fold in article-level data, Mr. Breeding said. Most of that action is on the commercial side.

With Blacklight, you won't be able to get individual journal articles. If you're doing research on cell division, for example, a search will tell you that Virginia subscribes to the journal about cell division, but you'll have to go to a journal database for the article.

"That's going to be true for a very long time," Ms. Sadler said. "For the foreseeable future, you're going to need to go to separate interfaces in order to search licensed content."

But commercial vendors, smelling a new market, are stepping in. Serials Solutions, a subsidiary of ProQuest, released a software product in July called Summon. The company has been negotiating deals with publishers and content providers to create a searchable index of their content. It's like Google, except what Summon provides is an index of the "deep Web" of paid content. So now university libraries that pay for a subscription to Summon can let their users search their licensed content as well as locally owned stuff, together. Summon has 17 customers so far, including Arizona State University and Dartmouth College.

The catch? It can be expensive.

Andrew S. Nagy, senior discovery-services engineer at Serials Solutions, wouldn't say how expensive. But the cost of a subscription can run into the tens of thousands of dollars, said one university administrator who was not authorized to discuss price and thus wanted to remain anonymous. Summon also does not have permission to display the full text of articles.

At Virginia, the open-source Blacklight has paid off for Ms. Bauer.

"You know the feeling of when you go into the stacks, and you're usually looking for one book, but then it's almost always the book that's next to it that's the one you really need?" she asked. "It helps replicate a bit of that experience."

And if you search for Thomas Jefferson, it even starts you off with a book about him.