• April 16, 2014

After Losing Users in Catalogs, Libraries Find Better Search Software

Lisa Billings/Freelance

Jean A. Bauer, a graduate student in American history at the U. of Virginia, has been frustrated with the confusing search results from the university library's old online catalog. A new one is in the works.

Thomas Jefferson founded the University of Virginia. So you might think that typing his name into Virgo, Virginia's online library catalog, would start you off with a book about him.

Jean A. Bauer tried it the other night. At the top of the results list were papers from a physics conference in Brazil.

The problem is that traditional online library catalogs don't tend to order search results by ranked relevance, and they can befuddle users with clunky interfaces. Bauer, a graduate student specializing in early American history, once had such a hard time finding materials that she titled a bibliography "Meager Fruits of an Ongoing Fight With Virgo."

That's changing because of two technology trends. First, a growing number of universities are shelling out serious money for sophisticated software that makes exploring their collections more like the easy-to-filter experience you might find in an online Sears catalog.

Second, Virginia and several other colleges, including Villanova University and the University of Rochester, are producing free open-source programs that tackle the same problems with no licensing fees.

A key feature of this software genre is that it helps you make sense of data through "faceted" searching, common when you shop online for a new jacket or a stereo system. Say you type in "Susan B. Anthony." The new system will ask if you want books by her or about her, said Susan L. Gibbons, vice provost and dean of Rochester's River Campus Libraries. Users can also sort by media type, language, and date.
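For readers curious how a faceted search of this kind might work, here is a minimal sketch in Python. It is not taken from any of the products named in this article. The records, field names, and the `relation` helper are all hypothetical, invented only to illustrate the "by her or about her" split and the filtering by media type.

```python
from collections import Counter

# Hypothetical catalog records; every title, name, and field here is made up
# for illustration and does not come from any real library system.
RECORDS = [
    {"title": "Selected Speeches", "author": "Susan B. Anthony",
     "subject": "Women's suffrage", "format": "Book", "language": "English"},
    {"title": "A Life of Reform", "author": "Jane Doe",
     "subject": "Susan B. Anthony", "format": "Book", "language": "English"},
    {"title": "Anthony Remembered", "author": "John Roe",
     "subject": "Susan B. Anthony", "format": "Video", "language": "English"},
    {"title": "Conference Proceedings", "author": "Ann Editor",
     "subject": "Physics", "format": "Book", "language": "Portuguese"},
]


def keyword_search(records, query):
    """Return records whose title, author, or subject contains every query word."""
    words = query.lower().split()
    hits = []
    for record in records:
        text = " ".join((record["title"], record["author"], record["subject"])).lower()
        if all(word in text for word in words):
            hits.append(record)
    return hits


def relation(record, query):
    """Label a hit 'by' when the query names its author, otherwise 'about'."""
    return "by" if record["author"].lower() == query.lower() else "about"


def facet_counts(hits, key):
    """Count hits per facet value; key is a function from record to value."""
    return Counter(key(record) for record in hits)


query = "Susan B. Anthony"
hits = keyword_search(RECORDS, query)

# Facet counts are what the interface would show beside the result list.
by_format = facet_counts(hits, lambda r: r["format"])
by_relation = facet_counts(hits, lambda r: relation(r, query))

# Choosing two facet values narrows the list without retyping the search.
books_about = [r["title"] for r in hits
               if r["format"] == "Book" and relation(r, query) == "about"]
```

The point of the design is that the user types one loose query and then narrows by clicking on counted categories, rather than having to know in advance which catalog field to search.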

These products can also rank search results by relevance and use prompts of "Did you mean … ?"

"It's sort of our answer to, Why it is you need a library when you have Google?" said Ms. Gibbons. "What this is going to do is show how much you've been missing."

It's a pressing issue. Libraries once had a monopoly on organizing data about content. No longer. And today some users gripe about how libraries present materials online: how scattered they are, how sluggish searches can be, and how often those searches are useful only if you already know exactly what you want.

The worry for Jennifer Bowen, assistant dean of the River Campus Libraries, is that library catalogs could become "marginalized."

"There are people who just cannot find what they need," she said. "And they're just sort of giving up on libraries."

A Single Entry Point

The issue concerns professors, too. One software developer pointed to a 2006 study by Ithaka, a nonprofit group that promotes the use of information technology in higher education. It found that faculty members value the campus library but "perceive themselves to be decreasingly dependent on the library for their research and teaching." The report described what appeared to be "growing ambivalence about the campus library."

The buzzwords for the technology that librarians hope will allow users to rediscover their collections are "Web-scale index searching."

That, in Ms. Gibbons's translation, is a fancy way of saying that the system, like Google, works by searching against a vast index of information. It's a contrast with an earlier attempt to deal with the search problem through "federated searching," where there is no local index, and each query is taken from the user and sent individually to various databases.
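The contrast between the two approaches can be sketched in a few lines of Python. This is an illustration under stated assumptions, not how any particular product works: the two "databases" are hypothetical in-memory dictionaries, and a real federated search would send each query over the network to a remote service.

```python
from collections import defaultdict

# Hypothetical sources standing in for the separate databases a library uses.
SOURCES = {
    "catalog": {"c1": "Thomas Jefferson a biography",
                "c2": "Physics conference proceedings"},
    "articles": {"a1": "Jefferson and the early republic",
                 "a2": "Cell division review"},
}


def federated_search(sources, term):
    """No local index: pass the query to each source in turn, then merge."""
    term = term.lower()
    results = []
    for name, docs in sources.items():  # one round trip per database
        for doc_id, text in docs.items():
            if term in text.lower().split():
                results.append((name, doc_id))
    return sorted(results)


def build_index(sources):
    """Harvest every source once, ahead of time, into one inverted index."""
    index = defaultdict(set)
    for name, docs in sources.items():
        for doc_id, text in docs.items():
            for word in text.lower().split():
                index[word].add((name, doc_id))
    return index


def indexed_search(index, term):
    """Answer a query with a single lookup against the prebuilt index."""
    return sorted(index.get(term.lower(), set()))


index = build_index(SOURCES)
federated = federated_search(SOURCES, "Jefferson")
indexed = indexed_search(index, "Jefferson")
```

Both functions return the same two records here. The difference is where the work happens: the federated version repeats the work across every database at query time, while the indexed version does it once in advance and then answers from a single place.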

You expect a Google search to cast the broadest possible net. The same should apply to a library catalog, the thinking goes. That means a single entry point to the collection. The entire collection: books, articles, digital objects. Heck, why not even herbarium specimens?

Marshall Breeding, director of innovative technology and research at the Vanderbilt University library, calls the concept "an ambitious goal—and at this point I think it's more of a goal than reality."

But the move toward simplified, silo-busting, relevant-result-returning library searches may come with its own problems.

Mr. Breeding, who founded the Web site Library Technology Guides, has observed "pockets of resistance" in the library community. Some argue that new search products—sometimes called next-generation catalogs or discovery interfaces—amount to a dumbing-down of catalogs.

By contrast, traditional search tools reinforce the idea that library users need a clear understanding of the different materials involved in research, Mr. Breeding said, such as the difference between articles and monographs. New interfaces that mix many different information sources blur all that, he said.

And then there is the slew of devil-in-the-details questions that arise from the content convergence.

Will users understand it? Will they find what they want? Will books be properly represented among the flood of articles? What about image collections? Could the pile of stuff just get too big?

Libraries' online catalogs are typically one module of an integrated software system that runs library functions like the circulation desk, acquisitions, and cataloging. They are a window into what libraries manage inside their integrated systems, Mr. Breeding said, which tends to be mostly the print collections. But the problem is they lack a good way to include the growing electronic part of the library collection, he said.

What the new interfaces share is the ability to derive material from catalogs and combine it with other data in a modern package.

The commercial market for these interfaces has already produced Encore, from Innovative Interfaces, adopted by at least 44 academic libraries in the United States, according to Mr. Breeding's tally; AquaBrowser, from Medialab Solutions, used by 23 libraries; and Primo, from Ex Libris, adopted by 13 libraries.

How much institutions will have to pay for new commercial systems will vary depending on both what comes with the software and the size and complexity of the library. That could mean a price as low as $10,000 for a small academic library to one in the $100,000 range for a much larger one, Mr. Breeding said.

A 'Shift of Power'

In the open-source world, at least 10 academic libraries have turned to VuFind, which originated at Villanova. Virginia's Blacklight, with Stanford University as a development partner, is in a beta phase. And Rochester's eXtensible Catalog, or XC, backed by $1.2-million from the Andrew W. Mellon Foundation, will be rolled out in the spring.

The shift from commercial products to open-source ones is about more than money, though.

Bess Sadler, chief architect of the online library environment at the University of Virginia, sees the open-source Blacklight project as a "shift of power," as she wrote recently in the journal Library Hi Tech. The idea is that libraries, which know their local needs, should control the technology that patrons use to gain access to their collections. That's a change from the one-size-is-good-enough-for-everybody, commercially managed model that has prevailed in the industry.

The ability to customize is important when it comes to something like a music collection. A librarian might get this question: "I play the guitar. My boyfriend plays the flute. What duets can we play together?" In the past, even though Virginia had cataloged the instruments used in all of its sheet music, a search of that information was impossible because the fields that were indexed were maintained by a vendor, Ms. Sadler said.

"The problem with a vendor solution is that it's hard for vendors to tailor that solution for different collections, for different user populations, for different specializations," she said.

With an open-source system, a library can set its own relevance rankings and adjust them based on what users want. By maintaining the system itself, Virginia is now able to search by musical instrument.
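One way to picture locally controlled relevance ranking is as a set of per-field weights that the library itself can edit. The Python sketch below is only an assumption about how such tuning could look. The weights, the records, and the `instrument` field name are hypothetical and are not drawn from Blacklight or any other system mentioned here.

```python
# Hypothetical per-field weights a library might tune; the numbers are made up.
FIELD_WEIGHTS = {"title": 3.0, "subject": 2.0, "instrument": 2.0, "notes": 0.5}

# Hypothetical sheet-music records with a locally indexed instrument field.
RECORDS = [
    {"title": "Duets for flute and guitar", "subject": "Chamber music",
     "instrument": "flute guitar", "notes": ""},
    {"title": "Guitar method", "subject": "Instruction",
     "instrument": "guitar", "notes": "includes one flute duet"},
    {"title": "Piano sonatas", "subject": "Sonatas",
     "instrument": "piano", "notes": ""},
]


def score(record, query, weights):
    """Sum weighted counts of query words across the weighted fields."""
    words = query.lower().split()
    total = 0.0
    for field, weight in weights.items():
        field_words = record.get(field, "").lower().split()
        total += weight * sum(field_words.count(word) for word in words)
    return total


def ranked_search(records, query, weights=FIELD_WEIGHTS):
    """Return titles of matching records, highest score first."""
    scored = [(score(record, query, weights), record["title"]) for record in records]
    scored.sort(key=lambda pair: -pair[0])  # stable sort keeps ties in input order
    return [title for value, title in scored if value > 0]


results = ranked_search(RECORDS, "flute guitar")
```

With these made-up weights, the duet collection scores highest because both instruments appear in its title and its instrument field. Changing the ranking is a matter of editing the weights dictionary, which is the kind of adjustment a library cannot make when a vendor decides which fields are indexed.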

The downside is libraries need someone on staff who can install and maintain the open-source program. So far, vendors aren't supporting products like VuFind the way they support established open-source products like Koha and Evergreen, both integrated library systems, said Mr. Breeding. Vendors will install software like Evergreen, host it on their own servers, and provide a help desk that you can call if something breaks. Not so for the newer software. Another barrier is going to be trusting that an open-source project is sustainable. There is always a concern that there will not be a community of users to keep developing it.

Also, the open-source systems have been slower to fold in article-level data, Mr. Breeding said. Most of that action is on the commercial side.

With Blacklight, you won't be able to get individual journal articles. If you're doing research on cell division, for example, a search will tell you that Virginia subscribes to the journal about cell division, but you'll have to go to a journal database for the article.

"That's going to be true for a very long time," Ms. Sadler said. "For the foreseeable future, you're going to need to go to separate interfaces in order to search licensed content."

But commercial vendors, smelling a new market, are stepping in. Serials Solutions, a subsidiary of ProQuest, released a software product in July called Summon. The company has been negotiating deals with publishers and content providers to create a searchable index of their content. It's like Google, except what Summon provides is an index of the "deep Web" of paid content. So now university libraries that pay for a subscription to Summon can let their users search their licensed content as well as locally owned stuff, together. Summon has 17 customers so far, including Arizona State University and Dartmouth College.

The catch? It can be expensive.

Andrew S. Nagy, senior discovery-services engineer at Serials Solutions, wouldn't say how expensive. But the cost of a subscription can run into the tens of thousands, said one university administrator who was not authorized to discuss price and thus wanted to remain anonymous. Summon also does not have permission to display the full text of articles.

At Virginia, the open-source Blacklight has paid off for Ms. Bauer.

"You know the feeling of when you go into the stacks, and you're usually looking for one book, but then it's almost always the book that's next to it that's the one you really need?" she asked. "It helps replicate a bit of that experience."

And if you search for Thomas Jefferson, it even starts you off with a book about him.

Comments

1. paievoli - September 28, 2009 at 09:18 am

Need to find a way to self sustain these costs or they are going to become prohibitive in the future. Self sustaining models are the future of all business and academia. Read Chris Anderson's "Free".
It explains how to deal with this new economic model that will affect us all.

2. mitt4jp - September 28, 2009 at 03:47 pm

I found this article a little misleading. First of all, a library catalog is structured differently from a search engine. To find items about Thomas Jefferson, the correct way is to use "Thomas Jefferson" as a subject, not as a keyword-anywhere search.

Unfortunately, instead of teaching students how to conduct a precise search with few relevant results, faculty and librarians have found an easy way out -- googlize everything.

3. uvalibmobile - September 28, 2009 at 06:26 pm

<Comment removed by moderator>

4. bsparris - September 29, 2009 at 09:07 am

The problem is people are trying to use the catalog the wrong way. Instead of a keyword search like on the internet and online databases, the catalog offers something unique-- direct access to exactly what you want through a browse or exact search using subject headings, authors, titles. An old idea but it still works--give it a try!

5. pucciot - September 29, 2009 at 09:58 am

The Library was once considered to be the center of the University. It is now treated the same as the food court in the student center. It seems that the University Libraries (and Librarians) are not being rightly considered as an important part of the educational process. Teaching students what to search, how to search, and how to choose good resources is an important part of the University education. Today it seems that just because our students come in knowing how to perform a google search that that is all they need. Library databases are "tools". Knowing how to use a tool properly must be taught. To apply a simple metaphor would be to think that just because a student took _Shop_ in High School that they should be able to be brought into a factory to build a car.
The University Library and the use of its resources should be considered part of the University Education. Web level discovery layers are new useful tools - but they do nothing to educate a student to be more information literate.

6. ladykaty - September 29, 2009 at 10:50 am

If the graduate students don't know the difference between a keyword and a subject search, I think, perhaps, that the university would do better to invest in a comprehensive information literacy instruction program rather than expensive "improvements" to the catalog.

7. commentarius - September 29, 2009 at 03:50 pm

Much as I am also irritated by users who don't know a keyword from a hole in the ground, the tendency to blame the user for not knowing how to use a catalog is exactly the kind of thinking that got us into this mess to start with. Yes, users are idiots. But good systems are designed for idiots and help idiots be successful despite their idiocy. That's why Google is so popular, and why catalogs are not. Any tool that requires "instruction" to use is doomed.

8. 11134078 - September 29, 2009 at 04:22 pm

There is a serious difficulty in all this. Faceted cataloging is inadequate. We have to start from this realization. Good old LC subject headings are still (SHOULD still) be the way to go. Learning to use them takes a few hours, but it is really not a big deal. (I taught this stuff until just a few years ago.) Once the concepts of the free-floating headings and the authority files are understood and there is also a basic knowledge of the material that used to be in the introductory section of the "big red books" and now should pop up online when needed, the system is at its base quite simple (despite its occasional bouts of illogic) and very effective. By the way, the current OCLC search engine is an unusable abomination.

9. rattebur - September 29, 2009 at 05:01 pm

Commenters who claim that students need to be taught the correct way to use existing catalogs need to come up with a comprehensive way to teach every student at a university this information. Librarians don't often have access to a wide swath of students for instructional purposes; at many institutions, they are dependent on teaching faculty and instructors to want to integrate library instruction. More user-friendly catalogs seem much more realistic at this point.

10. 11134078 - September 29, 2009 at 05:52 pm

rattebur, my friend, there are lots of things students need to be taught. Many of them are now subjected to freshman seminars, how to study sessions, long harangues to the effect that credit card companies really do send bills and really do charge extortionate rates of interest if those bills are not paid promptly. Come on now, how about a session on how to use subject headings? And "user friendly catalogs" are in fact hostile to users who actually know how to use catalogs because they are so damnably primitive and therefore yield so many irrelevant hits (or, alternatively, none at all).

11. jhough1 - September 30, 2009 at 08:05 am

I teach at Duke and live in Washington D. C. The LC catalog is wonderful. You can make a mistake in spelling, type in half a name, you name it, and you get something. Duke, I assume, has bought something, and you must have a perfectly spelled name, usually with first name and maybe the middle initial to get a reasonable response even on the author catalog. I just use LC and check the Duke stacks. Unfortunately, older books are off campus. Is it not possible to use LC technology?

12. erla32 - September 30, 2009 at 08:24 am

Duke uses an open-source solution developed by the NC State libraries and used to search all of the Triangle Research Network institutions (Duke, NCSU, UNC-CH, NCCU). Library of Congress has a purchased system -- Ex Libris.

13. zizzer - September 30, 2009 at 09:37 am

I guess I have finally reached the tipping point of the generational divide, maybe it's just my learning style, but I don't like getting a muddle of everything and the kitchen sink from search tools. I like knowing what media the tool I am searching indexes and where it will ultimately lead me.

Short of that I would want clear delineations in any results, and I see that frequently from students who didn't grow up digital. They don't want an eBook, they want a "real" book they can check out and take home. (We serve a rural area with spotty Internet access.) They don't want a citation, they want full text - right NOW - that they can print or save to a flash drive for later. We have a federated search tool to a set of consortium resources and many find it very confusing and it often yields inferior results because the searches have to be dumbed down to adapt to each individual database. In short, it stinks, and users often don't understand the results and miss great information. The smart ones ask for help, which gives me concern about the rest.

I would that we had more time to teach Information Literacy. When I was in elementary school our library visits had three components: Time that we learned about the library, story time, and time to find books to check out. In my freshman year of college I had to take a half-semester course called Bibliography where we learned to use the library and its resources. As it is now, we are lucky to get 50 minutes with the students who take Study Skills, but not all students are required to take it, and many consider the library day a day to blow off.

14. blackbart - September 30, 2009 at 09:38 am

I _think_ the issue that this article is trying to probe is the dichotomy between binary searching and search engines. Most well-established library catalogs use binary searching--you type in a term, and the catalog returns only those records that contain the term you typed (in whatever fields you did or didn't specify, depending on the search and the catalog interface). The results are binary: either the record matches the search string and is retrieved, or it doesn't and isn't. Search engines like Google, by contrast, use complex algorithms to interpret the search string in an effort to show you what the software "thinks" you wanted based on that search string.

It takes all of five minutes to explain that difference to students. It might take as long as an hour to drill the difference into them by demonstrating identical searches on binary and search-engine interfaces. Each has tremendous strengths; each has weaknesses relative to the other model. But do we really need to spend a gajillion dollars in software development and retrain the entire university community just because students were using Google before they got to campus?

15. greebie - September 30, 2009 at 09:44 am

Library instruction is limited. To remember what special ritual dance you need to do in your specific discipline, you need to actually practice it. That means dancing with each and every student for quite a long time. Personally, I'd rather put the teaching resources into critical thinking skills, source evaluation, finding learning networks (the best way to get the 'classic' tomes of a field is still knowing a prof and then tracing the scholarly pedigree via the bibliography).

Open source models look promising and hold the best option for sustainability over time. These products are very expensive for what they do - they shouldn't have to be.

16. erikagwen - September 30, 2009 at 11:12 am

Fact of the matter is students don't know how to use the catalog, library instruction is limited and frankly usually offered by people who are terrified of google and web 2.0. You don't need to revamp the library catalog and interface, you need to revamp the librarians and how they are taught.

17. intplibrarian - September 30, 2009 at 12:06 pm

erikagwen, I don't know what library you've been in lately, but librarians are most definitely NOT "terrified" of Google and web 2.0.

Many librarians will tell you that Google is incredibly useful for many many things, even tasks that most non-librarians wouldn't think of. Citation verification, for example.

But, a catalog is NOT a search engine. It's a specific tool with a specific purpose and, yes, it has to be learned -- though that should only take a few minutes at most. However, the student has to be willing to learn... oh, wait, that's what's supposed to define "student" isn't it?

18. chaselibrary - September 30, 2009 at 01:01 pm

To erikagwen - The student who won't consider a source that isn't online is just as bad as a librarian who might be afraid of Google. Some students will come to realize through use that some sources are easier to use in print than online because of the ability to page back and forth. They are closing their minds to a wealth of information if they think that online only is the answer to everything. That should be part of the teaching of critical thinking.

19. odulibrary - September 30, 2009 at 02:21 pm

The commentary discussion is as enlightening, and disheartening, as the article itself.

20. primaryovertone - September 30, 2009 at 03:50 pm

It seems that this argument does have two sides: 1) The catalog system is flawed and needs to change to support the user, 2) The user is flawed and needs to change to use the catalog system. In my view neither side is 100% correct but then neither side is 100% incorrect either. There are some things about the catalog systems I have used which could be made to work better and faster and more intuitive even for someone who is familiar with the systems. Personally I started with a card catalog so computerized systems are a great leap forward to me. But at the same time the users need to learn about the existing system and learn how to interact with it to gain access to what they are looking for. I am still surprised when I run across college students who do not have at least a rudimentary understanding of the Dewey Decimal system and how to take the catalog reference and find something on the right shelf.

21. pucciot - September 30, 2009 at 05:28 pm

primaryovertone,
I do not see the comments in those two categories. The two categories seem to be 1) The catalog system is flawed and needs to change to support the user; 2) The catalog system can be improved _and_ the user is flawed and needs to change to use the catalog system. The use of Discovery Layers in searching the catalog would be useless if the library's catalog system didn't already exist and was well maintained. Librarians have always tried to make their systems usable for their patrons. Interfaces to the catalog have been changing and evolving for hundreds of years. The main point is that even the use of a Discovery layer will have to be "taught". Installing a simple and easy search interface does not relieve the University and its librarians of the duty to teach proper information searching to its students.
Students are ignorant, they enter the University to change that.

22. daniell - September 30, 2009 at 07:14 pm

I don't think I've ever had trouble with my library's search functions because I never blindly go in searching for a topic I've never dealt with before. If I go into an area blind, I first search out preliminary articles in JSTOR. After I've read a few, that gives me a basic list of authors and their bibliographies. Then hey, it's pretty much cake from there.

But that doesn't really address the issue here, I don't think.

23. daniel_tunkelang - September 30, 2009 at 09:15 pm

I'm pretty happy with the Triangle Research Libraries Network, which I saw referenced above. Here's what happens when you search for Thomas Jefferson:

http://search.trln.org/search?Nty=1&N=0&Ntt=Thomas+Jefferson&Ntk=Keyword

Not only do you see a first page of highly relevant results, but you have the opportunity to refine the thousands of results by subject, format, location, and other meaningful facets.

As one of Endeca's co-founders, I'm especially proud of how TRLN has used our software to deliver a next-generation library experience based on faceted search.

For more information (not a sales pitch), I encourage you to check out my blog post entitled "Librarian 2.0":

http://thenoisychannel.com/2008/07/07/librarian-20/

24. primaryovertone - October 01, 2009 at 09:02 am

pucciot,
Personally I am not a fan of discovery layer technology for any kind of searching whether it be in a library or on the internet. Creating a search protocol that assumes that my previous searches for information have bearing on my current search is illogical to me. When I imply that improvements can be made to library catalogs I mean that I want the ability to filter my results by media type and sort by year of publication and other features of that type. I have found features like that in a few catalogs but not consistently. None of this however changes the fact that students need to learn how to use a library. Whether there is a class in library use or they have to get the information one on one from a librarian, they need to learn the resources for continued learning.

25. tjwiebe76 - October 01, 2009 at 03:19 pm

"You expect a Google search to cast the broadest possible net. The same should apply to a library catalog, the thinking goes."

The less effort the better, right?

I agree this is a great way to begin a library research experience and most students would prefer it as such, but there's still the issue of knowing what a source "is" and what makes it good or bad, credible, relevant, etc. I have no trouble with starting big (in fact, I think it's great!), but the concepts of narrowing and evaluating will become all the more critical.

Also, knowing how to formulate a search query will need to be stressed even more than usual due to the fact that integrated database/catalogue searches blur the lines between Google and the library simply based on how they look. With a single, non-threatening search box and nothing else, students will likely start typing in questions or full sentences like they so often do in Google.

26. abcrs - October 02, 2009 at 02:03 pm

This is such bad journalism - if you do what they say in the catalog you get books on the man (using either last name and first name) http://www.lib.virginia.edu/ .

Does Chronicle bother with a basic fact check?

27. kmerguer - October 05, 2009 at 07:56 pm

I think the Univ. of VA has added a couple of books about Jefferson since the article was written. The Brazil proceedings are still there, but are now in the #3 position. They have something to do with a "Thomas Jefferson National Accelerator Facility" which is why they are appearing in the search results.

28. elleanor21 - November 24, 2009 at 01:43 pm

The Library Catalog is, indeed, a very useful and powerful tool. But useful and powerful means absolutely nothing if it is not actually used. Ultimately it comes down to this - you cannot change human behavior on a large scale without an equally large scale endeavor.

Human behavior dictates that people will use the simplest means to get what they want. People don't purposely try to use more complex tools, unless they value the results that those tools can yield (enough, at least to expend the effort to understand those tools).

Students are not frequently among this group. They do not usually care if their search results are a perfect example of precision and recall. They want results in the quickest way possible and do not always care to evaluate those results. And, honestly, it does not always matter if they get the best results or not.

You cannot change this behavior without a tremendous educational push - complete with a massive library instruction program. Even this is not guaranteed to change user behavior. And the expense of such a program is definitely not a realistic possibility for most libraries - either in terms of funding or staffing.

So, we must ask ourselves, what do we value most? Our catalogs or our users? Organizing information or making information accessible? It doesn't have to be an issue of choosing only one or the other - but we do need to emphasize one OVER the other. And despite what most comments here proclaim, they are NOT the same thing - or else this article would not even be an issue.
