Many writers and scholars have criticized Google’s immense digitization project for poor scanning quality and for a wide range of problems with the metadata associated with each item.
Metadata include information like author, title, publisher, and publication date: the information that structures search queries and information retrieval in library catalogs and reference databases. Without accurate metadata, even a search using Google Books’ Advanced Search feature, which allows you to restrict results to, for example, author: Dickens, Charles and publication date: 1850-1870, won’t return a complete listing of the books Google has scanned that meet those criteria. Precisely because Google Books encourages those of us working in historical fields to research new kinds of questions about and in public domain texts, the metadata problems can be very frustrating.
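To make the problem concrete, here is a minimal sketch of a metadata-restricted search. It is only an illustration, not how Google Books actually works: the `Record` class, the `search` function, and the catalog entries are all hypothetical, and two of the entries deliberately carry a wrong or missing date.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Record:
    title: str
    author: str
    year: Optional[int]  # None models a missing publication date


# Hypothetical catalog records, invented for illustration only.
RECORDS = [
    Record("Bleak House", "Dickens, Charles", 1853),
    Record("Great Expectations", "Dickens, Charles", 1861),
    # Bad metadata: a later reprint date recorded instead of the 1850s original.
    Record("Little Dorrit", "Dickens, Charles", 1899),
    # Bad metadata: the publication date is missing altogether.
    Record("A Tale of Two Cities", "Dickens, Charles", None),
    Record("Middlemarch", "Eliot, George", 1871),
]


def search(records: List[Record], author: str, start: int, end: int) -> List[str]:
    """Return titles whose recorded author and year match the query."""
    return [
        r.title
        for r in records
        if r.author == author and r.year is not None and start <= r.year <= end
    ]


hits = search(RECORDS, "Dickens, Charles", 1850, 1870)
print(hits)
```

In this toy catalog the query finds only two of the four Dickens titles that belong in the 1850-1870 range. The other two are present in the collection, but the search cannot see them because their date fields are wrong or empty.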
Mirlyn: the University of Michigan library catalog
However, there are other ways to locate digitized materials, including those scanned by Google, that many researchers are not yet aware of. When the Google Library Project began in 2004 with five library partners (Harvard, U of Michigan, Stanford, Oxford, and the New York Public Library), the University of Michigan began including links in Mirlyn, its library catalog, to the Google Books digital copy of items already held in its collections. In some instances, because of the order in which Google’s scanning and processing take place, these links go to a copy that was digitized from another Google Library partner. The Michigan Digitization Project will eventually digitize all items in Michigan’s collections.
In September 2008, with significant leadership from the University of Michigan and Indiana University, the digital repository HathiTrust was formed as a collaboration among the thirteen universities of the Committee on Institutional Cooperation, the University of California system, and the University of Virginia. (The name comes from the Hindi word for elephant.) HathiTrust now includes additional library partners.
HathiTrust serves as a shared repository for digitized items from the member institutions. Many of these items are also included in Google Books, but some are not, such as rare items not held by a Google Library partner. HathiTrust offers several different ways to search its materials, including a temporary catalog with basic bibliographic searching (a permanent catalog is being developed), full-text searching, and the ability to view and search particular collections of materials. Several institutions, including Michigan, now include HathiTrust items in their library catalogs as well.
Use of HathiTrust public domain materials is open to anyone, and the site offers both a scanned image view and a text-only view. However, only users affiliated with a HathiTrust partner institution may download PDFs of public domain items. Users can also create private or public collections of items. (Non-affiliated users may create a Guest ID through Mirlyn to use the collections feature.) HathiTrust does offer some limited searching of items that are not in the public domain and therefore not viewable in full text.
I should note that I am not affiliated with a member institution, nor do I have any professional involvement with HathiTrust. I’m just an enthusiastic user. Because none of these tools is comprehensive, I use Mirlyn, HathiTrust’s catalog, and Google’s Advanced Book Search in combination to find the items I’m interested in as easily and accurately as possible.
Typically, I will first use Google’s Advanced Book Search to gauge how large or problematic the result set is. Then I’ll go to Mirlyn, which offers catalog access to some Google Books items and all HathiTrust items. For the latter, I’ll view them at the HathiTrust site first. If it’s a book I’ll be using extensively, I might then use a very specific search string to find the item in Google Books in order to download a PDF.
The digitization of public domain texts has transformed the kinds of research and teaching I can do, and the University of Michigan library and the HathiTrust project offer important scholarly tools for finding and using those digitized items.
[Creative Commons licensed photo by flickr user timparkinson.]