February 4, 2008
A Million Books Scanned at U. of Michigan -- and Counting
Librarians at the University of Michigan at Ann Arbor threw themselves a party on Friday to celebrate a milestone in their ambitious effort to scan every single book in the collection. They scanned the one millionth book, leaving just 6.5-million to go.
Most of the scanning has been done as part of the library’s controversial deal with Google. The search giant is working with dozens of major libraries around the world to scan the full text of books to add to its index. But Michigan is one of the few institutions to agree to scan every one of its holdings, even those that are still covered by copyright. Some publishers have sued Google for copyright infringement over the scanning effort, though officials from Google say their effort is legal because they are not making the full text of copyrighted books available to the public.
Paul Courant, dean of libraries, wrote on his blog that the achievement is historic. “As far as I know, Michigan is the first library to have one million books from its own collections digitized and available for search (and, when in the public domain, available for viewing).”
Google has refused to release numbers on how many books have been scanned as part of its effort or to reveal which books are part of its index. But officials at Michigan have said that they are getting faster at scanning as they go.
“We all eagerly await (and it won’t be long) the next million, and the millions after that,” Mr. Courant wrote. —Jeffrey R. Young
Posted on Monday, February 4, 2008
This is excellent! Well done!!
— Ina Smith Feb 4, 12:11 AM #
This type of initiative is a windfall for those of us working in the digital humanities, especially literary computing. Each work we need that comes on line is like winning a prize in the lottery. Let’s hope others join in the effort.
— Joel Goldfield Feb 4, 09:16 AM #
If there were just some way to reward publishers and authors under this scheme. I understand their point of view, too. The current situation is doubly troublesome because the way Google is getting around copyright is by blocking pages. Invariably the key phrase I am looking for is on a page that is withheld. Last week I Google-booked a phrase and found it, but the page was blocked. I went to Amazon and found the book there too. The page was blocked. (Is Amazon using the same database?) In any case, the value of the project as a research tool is greatly diminished if I still have to trudge to the library to see all the pages. Oh well, at least this way I’ll know what page to look on (hmmm, that will only work if it’s the same edition, rats!). Maybe they can work out a Rhapsody kind of solution. I’d gladly pay a monthly fee for access to a million+ books in their entirety!
— R. Killoren Feb 4, 09:58 AM #
Excellent news. Now if only we can make all this great information available for free to the public. Forget “rewarding” the publishers and authors—they should be thrilled at the number of hits they get. Now that’s academic spirit :-)
— J. Stone Feb 4, 02:28 PM #
I’m a computer geek not a librarian.
I think the best way to solve the issues between publishers and libraries is “micro payments”.
What if they charged a nickel to view the 25 words before/the 25 words after the phrase you’re attempting to find? Not only does this reward the author/publisher (if the book is popular) but suddenly there is an incentive to digitize old magazines and books.
If only Google had a method like PayPal to collect money from people ;-)
Mike Honeycutt
UNC Asheville
— Mike Honeycutt Feb 4, 05:05 PM #
Perhaps when the writers’ strike that has recently paralyzed Hollywood is resolved, it will provide an example for future scholarly work to follow and help solve the problem of giving financial credit for digitized information. Or perhaps I am just a dreamer.
— pirata Feb 4, 07:04 PM #
So what use will this project be if you can’t access the books??? The article needed to include just how someone might access the books. Blocked pages are useless.
— annon Feb 4, 10:15 PM #
Wonderful news! How nice if the titles are put across for public access.
— Dr.Y.M.Patil Feb 5, 01:48 AM #
Once the collection is scanned, I guess Mr. Courant will close the library and fire the librarians. A perfect example of a library without walls, without books, and without librarians.
— Francois Lapelerie Feb 5, 03:49 AM #
Why would you think that? The digitized books still require librarians to organize, answer questions, maintain — just in different ways.
— Chris Feb 5, 09:48 AM #
When the collection is scanned, and Google users find access to the scanned pages blocked, guess what? Those users will have to get the books from a library. Google makes it pretty easy to find out which libraries hold a given book with its “find this book in a library” feature (which actually takes the user into OCLC’s “WorldCat.org”). The list of libraries shown is sorted by their distance from your location.
— Gary Charbonneau Feb 8, 08:20 AM #