Colleges working with Google on the company’s effort to scan millions of library books today unveiled their own search tool to comb the full text of some 500,000 volumes.
The tool has a few features that Google lacks, said John P. Wilkin, an associate university librarian for the University of Michigan at Ann Arbor. He is leading the group formed by the colleges, which is called HathiTrust Digital Library. The killer app: HathiTrust’s search lists every page that contains a user’s search term, while Google’s might return a partial list, said Mr. Wilkin. “That’s a small amenity, but I think you’ll see us continue to devote energy to that sort of tool,” he added.

Last year we mentioned that HathiTrust had released a campus-by-campus search of the books they had digitized in collaboration with Google. They had not yet figured out how to allow searching across those collections, though. They spent the last year working out the kinks, and now the global search tool is up and running. The feature is in beta, meaning that it is still a work in progress.
What’s next? A draft press release about today’s new feature said the digital library would soon include an advanced search, a “more like this” option, and “tools that can be used in computational research.”

6 Responses to Universities Add Their Own Search of Google Books
dlsadmin - November 18, 2009 at 5:56 pm
Please note that the full text search we’ve launched is actually all 4.6 million volumes. The 500,000 volume figure was for our earlier experimental search, which we launched to help us test our strategy. We’ll continue to provide a search over the entire body of content in HathiTrust, which will soon reach 5 million volumes, a pretty phenomenal accomplishment!
d_fevens - November 19, 2009 at 6:08 am
My work “Fevens, a family history” was digitized by the University of Wisconsin in partnership with Google without my permission or consent. As the university is a member of the Hathi Trust, a digital copy was deposited with them. Since I consider the digitization of my book by Google & Company (i.e. The University of Wisconsin) an illegal act under U.S. and international law, I insisted that the Hathi Trust destroy all of the products they received from the university as a product of the digitization of my book. The digitization of books holds the enormous promise of bringing more works to more people; however, Google & Company’s appropriation of in-copyright works is wrong, and I believe illegal. It is unfortunate that the Hathi Trust holds in their library these illicit volumes. Douglas Fevens, Halifax, Nova Scotia. The University of Wisconsin, Google, & Me
mbelvadi - November 19, 2009 at 7:01 am
Mr. Fevens, you seem unaware that American copyright law has an exception to the requirement of obtaining permission, called “Fair Use”. Many people consider Google’s scanning of copyrighted materials for search purposes to fall within that exception and hence to be perfectly legal. Many copyright holders who don’t like the Google Books project wish that Fair Use didn’t exist, but it does, and it is a much stronger exception than “Fair Dealing” is in Canada.
commentarius - November 19, 2009 at 1:25 pm
Maybe libraries – those that for some odd reason found it necessary to acquire a book about Mr. Fevens’ family history – should also destroy all print copies of his book, just in case somebody were to make an illegal photocopy or perhaps open it without his express permission. Advice to authors uncomfortable with the 21st century information landscape: complete your priceless manuscript on a typewriter, package it carefully, and bury it in your back yard. Then destroy the typewriter and kill anyone with whom you have discussed the content. Then you can be reasonably sure that the Digitization bogeyman won’t get you.
d_fevens - November 19, 2009 at 6:22 pm
I am familiar with the concept of “Fair Use” and I am also aware that many people do not believe Google & Company’s use of it can be applied to the way they use in-copyright material. The University of Wisconsin calls their relationship with Google a “partnership” and it is in fact a commercial partnership. The university library’s online book catalogue cards even have a direct link to Google Books! It was an unauthorized commercial use of my book and my name! Being a commercial partnership voids the university’s privileges granted to them in Section 108, Chapter 1, Title 17 of the United States Code to make copies of in-copyright works contrary to Article 1, Section 8, Clause 8 of the United States Constitution. The Section 108 Study Group (“a select committee of copyright experts charged with updating for the digital world the Copyright Act’s balance between the rights of creators and copyright owners and the needs of libraries and archives,” as the group is described on their web site) states in its 2008 report: “Machines read and render digital content by copying it. As a result, copies are routinely made in connection with any use of a digital file. While these copies may be temporary or incidental to the use, they are considered ‘reproductions’ under the copyright law for which authorization is required absent an applicable exception.” (Introduction, Page 6, second bulleted item) Even if Google had gone out and bought my book that they digitized without my authorization, “Fair Use” would still not entitle them to make a digital volume and place it in their so-called “digital library”. Every time a digital book is searched, an entire “book” is used, not just one section but the entire book. It is not the “snippet” that Google displays that violates copyright law; it is the digital volume that does.
There is a genealogical section in “Fevens, a family history” and if I had known that my book was to be digitized, it would not have been published at all since it contains personal information of living individuals. If an individual finds their personal information displayed at Google Books they have no way of having it removed because Google & Company only recognizes the copyright holder to be eligible for their so-called “opt-out” policy. All of my books were sold to or donated to family and friends, libraries and archives, except four that were sold to a company that has standing orders with research libraries. I thought I could trust a library to respect my copyright! Douglas Fevens, Halifax, Nova Scotia. The University of Wisconsin, Google, & Me
11159995 - November 20, 2009 at 2:07 pm
Douglas Fevens knows whereof he speaks, and the easy dismissal of his objection by other commentators does not reflect well on their own understanding of the law, which he clearly knows much better than they do. It should be noted that in its Settlement with the publisher/author plaintiffs Google did back off its approach of opt-out for in-copyright works, thus accepting what was the plaintiffs’ major objection in their infringement suit. Opt-out remains in the Settlement only as applying to “orphan works” where the copyright owner is not known or cannot be discovered. That is a significant concession to Google by the plaintiffs (and one of the main points of contention since the Settlement gives Google a virtual monopoly over making these works publicly available and charging for them), but Mr. Fevens’s work clearly does NOT fall into that category and he is entitled to the payment provided for in the Settlement and can remove his book from the system. Thus, while Google may still believe that what it did was “fair use,” in fact Google has accepted the claims of people like Mr. Fevens that what it did was not fair. Sandy Thatcher, Penn State University Press