• May 25, 2013

Digitizing the Personal Library

September 28, 2010, 2:00 pm

Chronicle of Higher Education

Books take up space. That’s a problem for any librarian tasked with finding room on overcrowded shelves. It’s also a problem for a book-loving scholar who lives in a small New York City apartment with a toddler and more than 3,000 books. Under those conditions, something’s got to give. Chances are good it won’t be the toddler.

Alexander Halavais, an associate professor of communications at Quinnipiac University, found a partial solution to his city dweller’s no-space-for-books dilemma: Slice and scan. A digital file takes up a lot less room than a codex book does.

In a post on his blog, A Thaumaturgical Compendium, Mr. Halavais described what he had done to some 800 of his books so far: “First I cut the boards off, and then slice the bindings. I have tried a table saw, but a cheap stack cutter works better. Then I feed [the pages] into my little page-fed scanner, OCR them (imperfectly) using Acrobat, and back them up to a small networked attached storage device.” (See before-and-after pictures, above.) Many of the scanned books he also stores as image files.

The process is slow and, yes, painful. “I’m over a thousand books in, and even now I get that—especially with hardcovers, for some reason—that gut feeling of ‘Will I be judged by the book gods for doing this?’” Mr. Halavais said in an interview. “Destroying books is very difficult. If I could do this in a nondestructive way, I would.”

Google can draw on multitudes of technicians and high-end equipment for its huge book-scanning project. Mr. Halavais does it all himself and works with technology that suits a scholar’s budget. He has a Fujitsu ScanSnap that “cost maybe $400,” he said, pointing out that if you’re handy you can build your own scanner out of scrap for a hundred bucks.

The top candidates for translation into the digital sphere are books Mr. Halavais might want or need to consult for his research, which focuses on social computing. (As he explains on his blog, he examines “the ways new communication technologies are affecting politics, government, education, journalism, and business.”) So his library contains books about technology and society and politics. “A lot of it is political science from the 1960s,” Mr. Halavais said, as well as works such as Nicholas Negroponte’s Being Digital (Vintage, 1995) and some of Bill Gates’s oeuvre. Most are things that Mr. Halavais might want to cite now and then but is unlikely to reread all the way through.

Mr. Halavais plans to keep about 500 of his 3,000 books in codex form. Fiction and unique or lovely books won’t get the slice-and-scan treatment. Books heavy on photography or art don’t scan well and are spared. So are books with sentimental value. “I still have my mom’s copy of Das Kapital with her marginalia in it,” Mr. Halavais said. “That’s stuff that I probably won’t scan, because I like to have it in the physical form.”

The Chronicle asked Mr. Halavais whether the scanned books still felt like a library. Without a physical presence, “no, they really don’t,” he said. The scanned material “feels like a resource, but it does not feel like a library in the same way.”

Mr. Halavais described his scanning project to an audience of publishers at the Association of American University Presses’ conference in June as part of a talk on “The New University Press.” He expected the audience to be dismayed. “They weren’t really,” he said. “In the publishing industry, remainders get destroyed.”

His students, however, are appalled when he describes what he does to his books. “I have students who are very digitally minded and are in a program in interactive communication, but they’re aghast at the idea that I scan [books] and that I destroy them afterwards,” he said. “Both of those are really disturbing to them.”

As for himself, “I will always have a bookcase,” Mr. Halavais said. “But when I was a kid I thought when I grew up and became rich and famous, I would have a library room. That seems very unlikely now.”

This entry was posted in Student Life.

28 Responses to Digitizing the Personal Library

22055008 - September 28, 2010 at 4:56 pm

Not a word about copyright? Who else will have access to the digital files?

drpopejoy - September 28, 2010 at 5:02 pm

I could probably do this to textbooks that I use for teaching purposes, but as for all my other books (about 9000), I do not have the emotional ability to make this leap forward into the 21st century. As the kids grew up and left home, their rooms have become ancillary library rooms that are specialized offshoots of the main “library” that takes up most of the 1800 square feet on the first floor. Destroying my books to digitize them would be like murdering friends. Just not in my DNA.

Michael W. Popejoy, Ph.D., M.P.H., FRSPH

zagros - September 28, 2010 at 5:40 pm

I have done the same with many of my journals. Copyright definitely enters into the picture with whether you share the material with others. Destruction of the books is a necessary step since if one passes on the book and keeps the digital copy, one now has a copyright issue. However, if I make a copy in my own home for my own use and don’t pass on the journal/book (since I have destroyed it), how is that a copyright issue? The fact remains that there is only 1 copy extant. Indeed, with e-books now proliferating (and offered by the publisher), isn’t the issue one of distribution (a copyright issue) as opposed to how the material is preserved?

annegjones - September 28, 2010 at 6:12 pm

Now that I’m in a smaller house, my huge library’s moving into a 10×10 storage unit. The books are organized by general categories and placed onto cheap shelves. I’m cataloguing each book on http://www.librarything.com, which instantly sorts by author, title, “tab,” etc., so I can see just what’s out there–and I’m finding out how many copies of a given book I have unknowingly acquired and can happily donate. Librarything.com lets you scan the barcode on your book (if there is one) to load it into the system by ISBN, or you can type in a title or author and be presented with pretty much every edition ever published from which to click on yours. Its searches include amazon.com, Library of Congress, and many others. Thank you, librarything! And if you haven’t tried it, give it a whirl. It’s an amazing tool even if you keep your books at home.

larrycuffe - September 29, 2010 at 7:24 am

Zagros’s comment (comment 3) is interesting, but betrays an American perspective. The idea of “a copy for your personal use” is American, and does not apply globally.

I feel tempted to do this when reading hard copy books, because all the tools I might use on a digital text, such as search, summarize, outline, etc., are not available in hard copy.

I can’t see it yet, but I think a principle here should be “non destructive”; I don’t have problems with the scanning, just the destruction.

In this spirit, I would expect the ghost of Anne Fadiman to make your apartment her first port of call when she dies. (If you have not read her “Ex Libris”, a hagiography of books, do so.)

dsbergccp - September 29, 2010 at 7:26 am

On one side of the coin, academics appear to be like obsessive-compulsive hoarders of books. However, it is the other side that is more interesting: DISPOSAPHOBIA. They can’t let themselves throw away books and journals regardless of their dust-ridden condition. A vague anxiety gets triggered when an academic tries to get rid of a book or old journal; the prof can’t get rid of it and intellectualizes that it will be needed someday or the old college textbook or novel will be read again in retirement. If you don’t believe me, go to your shelf and try to eliminate some of the stash — does it make you queasy? DONATE the books to libraries, hospices, Haiti … wherever. Stop carrying them from place to place; let some other hoarder hoard them.

d_fevens - September 29, 2010 at 8:08 am

Copyright. Unlike Google & Company (e.g., the University of Wisconsin), Professor Halavais is not, as far as I know, exploiting the digital files of his personal library. If he were to publish the digital copies of copyrighted works to the internet, there would be a problem. Even though Google Books only shows “snippets” of books in the search results of copyrighted works, the whole book was published or made available to the searcher, to get that search result.

Douglas Fevens, Halifax, Nova Scotia
The University of Wisconsin, Google, & Me

pippi - September 29, 2010 at 8:23 am

Sooner or later, there will be something like the iBooks app that will let you upload your personal book files, and when you can see them on a virtual shelf, perhaps it will feel more like a library.

jenhoward - September 29, 2010 at 8:35 am

Re copyright: The scanned books are for Mr. Halavais’s personal use, not for distribution, classroom use, etc. (In that sense, they’re no different from a codex collection, except that a printed book is easier to lend.) Sorry not to have made that clear in the story; I thought the headline made the point.

22268954 - September 29, 2010 at 8:38 am

I doubt I could do this to my own library…but maybe someone else’s! I’m reminded of my dear aunt, also a librarian, who kept about ten books in her apartment. Whenever I expressed my amazement/dismay (unlike her I’m a book hog), she blithely reminded me that “That’s what libraries are for.” She’s long gone from this plane, but my husband quotes her to me often. So maybe we’ve come full circle.

Candace Broughton

psameca - September 29, 2010 at 8:39 am

Sorry, but the mere thought of slicing up books to scan them is abhorrent to me. All right, I AM a librarian and we fight against this kind of thing (i.e., razoring books and magazines by students and others) all the time. I make liberal use of both online and hardcopy sources in my own reading and in my recommendations to students and faculty, but I do believe there is something intrinsically valuable about experiencing a book. “Playing in the collection” – where I wander through my library looking at spines – allows for serendipitous discovery in a way that no Google/Nook/Whatever can duplicate. Reading is about looking, and the visual and tactile work together to create a full experience.

Moira Steven, Library Director
Maine College of Art, Portland, ME

ted_major - September 29, 2010 at 9:42 am

There’s a copyright issue here, regardless of what Prof. Halavais does with the original book or the resulting files. US Copyright law prohibits making copies. What he’s doing is probably fair use, depending on the nature of the works being copied, the amount of the whole he copies, his use of the copies, and the economic effect of his copying, but (at least here in the US) there is no bright-line test for what is and is not fair use. Given that he’s copying whole books instead of excerpts, and that he is causing economic harm by making his own e-book instead of buying one from the publisher, a court could find that what he’s doing isn’t fair use. Without a lawsuit, there’s no way to know for sure.

He also doesn’t need to destroy the books to scan them, and could probably scan them more quickly with a non-destructive home-built book scanner.

ldoll - September 29, 2010 at 10:15 am

Professor Halavais is not so much making a copy as he is transposing his copy into a form more accessible to him. In the process, he is destroying his physical copy. The publisher will probably benefit in the long run. Being cited by authorities in the field usually results in interested persons obtaining a copy, by either borrowing from a library or buying their own. As long as Professor Halavais does not distribute his scanned copy, I see no harm to the publisher, and no violation of copyright. Enforcing their “rights” has just about killed the music industry; it’s time to come up with a new economic model and a better way to support authors. I fear that publishers are destined to go the way of buggy whip manufacturers….

kakerino - September 29, 2010 at 10:31 am

I agree with Ms. Steven about the pleasures of the physicality of books and also with dsbergccp’s comments about disposaphobia. My first response, however, was amazement at the tediousness of the task Mr. Halavais has set for himself. For me, one of the greatest joys of teaching is that 90% of my work life is NOT tedious — that I am not forced to do the mindless crap I did as a grill cook.

Kake

awegweiser - September 29, 2010 at 11:21 am

Bravo for the suggestion that a nondestructive way of copying be found.

I will soon be moving from my fairly large farm house (built in 1837) to a brand new apartment with modest space and, of course, the cost of moving lots of books. I am therefore giving all the textbooks to students at my University (if they will get off their butts and pick them up), others to the small bookstore in this town that is struggling to survive, and the rest to the Salvation Army/Goodwill and the like just for the picking up. I did have an otherwise quite successful garage sale, but only about 10-20% of the books went for peanuts (as expected).

Although it is a book, I suppose that a 10-20 year old intro geology textbook could be dumped, given that every 3 years or so a new edition comes out at around $100 a pop for often no real good reason except the obvious.

ephotog - September 29, 2010 at 11:27 am

I can relate to BOOK DISPOSAPHOBIA even though I find it relatively easy to downsize other parts of my accumulations! Thanks for info on the home-built book scanner; I look forward to trying it.

http://www.starr.net/is/fu.html has a chart on factors to consider with respect to fair use.

gotoole99 - September 29, 2010 at 11:28 am

What about search? Are these image files searchable? Keywords? Images?

matthew_tiffany - September 29, 2010 at 11:41 am

ScanSnap is one scanning choice. I use the portable NeatScan scanner, which costs less (especially when Woot.com offers it as their daily deal). It has the option of scanning pages into PDF files.

gotoole99: I believe PDF files can be searched if you use Evernote to catalog the files.

athlwulf - September 29, 2010 at 12:07 pm

I find myself both fascinated and horrified. I’ve thought about digitizing some of my 4000+ books, but could never muster the strength of will to destroy them in the process. I have, already, bought several eBook versions of books I already own and wish publishers offered some sort of discount for buying a digital version along with the physical one. I still think a library in the home is more important than a list of files on my computer, especially with kids in the house. My son may look at books in our library and pick one that interests him, but I doubt he’s ever gone to iBooks or Kindle to see what’s there.

halavais - September 29, 2010 at 1:01 pm

Thanks all for the great comments. Some responses:

It’s still not easy for me to cut into a book. I’m in the process of making a non-destructive scanner (my second effort), but the idea that it is faster to scan the book intact is just wrong. The physical cutting of the book takes very little time, and a sheet-fed scanner is a completely different investment in time when compared with hand-turning each page. Nonetheless, as Kake notes above, it is a tedious process. I usually do batches of 20 books or so at a time, but this year my grand total is one batch of 20.

The main reason I destroy the books is because it is a murky copyright area. I think I have a very good case for a fair use defense, but I really (please!) hope I never have to use it. In practice, by taking a physical volume out of circulation, I am helping a publisher’s bottom line. That book can never be resold–it ends with me.

Yes, the searchability of the books was a point in favor of digitizing. In practice, Google Books has scanned many of the books I own (though perhaps not a majority) and so for simple search I could rely on that. My additional hope is that someday I might be able to apply other mining tools to my own library, but I have not really even experimented with that at this stage.

By gaining these new ways of moving through my books, I lose the other–the spatial browsing and restructuring of the collection Moira points to. There is no question that this is a significant sacrifice. One of the other significant negatives is that I can no longer lend my books to my students and colleagues. And that those books I no longer need cannot find their way back into used book trade and donations. In fact, I would say this is the greatest disadvantage. It’s unfortunate that the advantages of the first-sale doctrine have not found (and perhaps cannot find?) an analogue in the digital world.

It wasn’t–and isn’t–an obvious choice for everyone. I would be surprised if most people decided to do the same. But for me, a peripatetic scholar with modest living spaces, it seems to provide more advantages than disadvantages.

- Alex, Book Butcher

lizcha - September 29, 2010 at 1:46 pm

Cannibalism on such a grand scale horrifies me, though I do sympathise with your motives (to some degree). For most ardent readers, books are not just sources of information but also deeply personal objects, with a host of memories attached. Destroying them would be akin to destroying a piece of jewellery, a painting, or a memento. Something that would make me wary of digitisation is the prospect of access to files and possible loss of electronic archives. Just how many copies can/should you keep and what happens when your computer crashes, catches a bug, etc. etc.?

I would only feel safe if I could lay my hands on a physical copy of the book – if I have to rent extra living space to accommodate my collection, so be it.

digilib - September 29, 2010 at 2:51 pm

I’ve been trying to interest librarians at the university where I work in offering this service to faculty members–let us scan and convert your books to searchable pdfs. We have an ATIZ book scanner so it wouldn’t be that much trouble. Problem is, I haven’t been able to convince any of them that it’s a service anyone would care about. Seeing this article makes me think it’s time to try them again.

And I agree, if the digital facsimiles are for personal use (e.g., not to be distributed) then there are no copyright implications.

beckerpa - September 29, 2010 at 10:23 pm

There is the theoretical and the practical in these pieces. My vision is to have a professional library filled with books that collect dust to show off my intellectual pursuits (many texts that I have yet to re-read), only to have that vision dashed by reality, which is not having this vision realized for many more decades because I can’t afford to save the down payment on a house (or the multi-thousands it would take to upgrade a fixture to the state it was before being trashed by renters – please save the diatribe of ‘you should buy now’ nonsense of the housing market).

Therefore, the practical, realistic side of me comes out with the notion of actual space to display said library, which I no longer have due to downsizing because of lack of raises or increase of salaries.

I read this with very keen interest and find nothing wrong with it. We have books that we will bequeath to future family members, but with other titles, I can download on the nook/kindle/e-reader and now possibly scan without a hint of transgression.

Reality in my life matters more than my philosophical issues of keeping books that I have no room for. Sad, but true. I envy those with the room to display their knowledge, but at this point, I’m very willing to let public libraries do this for me.

sefl_librarian - September 30, 2010 at 4:16 pm

Some people enjoy looking at their gardens and fine artwork in their homes. I enjoy gazing at my personal library of books, all lined up in their glory. It’s a beautiful sight to behold. The way they feel and smell and even sound when I open them gives me a sense of peace. These are the books I collect for my personal library on topics special to me. I would never destroy them or scan them. They are worth more together as a library than individually as bits of data. If I want bits of data, I am perfectly happy to peruse an ebook. If the ebook strikes my fancy, then I will look for it as a hardcopy to add to my collection. Perhaps I am a bookaholic, but my habits don’t harm anyone (just don’t ask my family about the money I spend or the amount of space I devote to books – they have a different opinion).

sagavia - October 1, 2010 at 7:51 am

It is not needful to cut apart the books. In fact it is slower. Acrobat is less accurate. Go look at http://www.diybookscanner.org/

mbelvadi - October 3, 2010 at 8:28 am

Mr. Halavais, let me be one of the few librarians to speak up to say that what you are doing is entirely OK, and should not be an affront to librarians or anyone else. To the librarians who have responded with “horror”, I would like to gently suggest that you are projecting your professional practice standards inappropriately onto a personal collection. The key here is that these are mass-produced books, not one-of-a-kind originals, manuscripts, etc. It’s absurd to suggest that any private individual (rather than a library) has any moral obligation to keep their own purchased books intact in perpetuity – that’s what libraries are for (as one commenter has already said). If libraries are doing such a poor job of managing the long term preservation of books that they think a private collection’s loss is worth a single moment’s fretting over, then we are seriously failing in our mission to society.

The folks who keep trying to suggest nondestructive scanning methods apparently don’t understand what an “automatic document feeder” (aka “sheet fed scanner”) is, what it means to the amount of work involved, or what it requires. The only automated way I know of to scan books without a human being having to stand there and turn every page involves incredibly expensive scanning equipment with tiny vacuums and robotic arms, and very few libraries, much less individuals, can afford it.

Finally, just a snarky comment – this story is yet another reason NOT to live in New York City – after all, something’s fundamentally wrong about a place where a serious scholar literally can’t afford the square feet necessary to keep their books and have a child.

mhmolinaro - October 5, 2010 at 1:32 pm

I hope after all of the effort and expense put into this project there is a plan in place for preserving the digital files. Digital files are unstable and subject to corruption. It would be unfortunate if the drives on the networked storage device failed and Professor Halavais lost not only his printed books but the digital surrogates as well. With books on the shelf you can be assured that when you open them in 20 years the words are still the same words; without active management of the digital files this simply isn’t true in the digital world.

When I talk about digital preservation to people, I often help them understand the issues by referencing things like eight track tapes, zip discs, floppy discs, Wordstar, etc.

sand6432 - October 11, 2010 at 8:54 pm

Several people commenting on this story, including Jennifer Howard herself, assume that the digitization doesn’t implicate copyright because the copying was done for “personal use.” They forget that back in the 1980s Hollywood studios sued Sony because its video recorders were being used by people to make copies of movies at home for “personal use.” The decision by the Supreme Court in 1984 held that this copying–even though duplication of entire movies was involved–was “fair use.” But the decision only ruled on the legality of copying for time-shifting, i.e., for enabling viewing at a later time, not for archiving the copies of movies on videotapes, so its relevance to Professor Halavais’s activity is limited. So far as I know, there has been no court decision directly on the type of activity the professor has undertaken, but it is certainly not true that he is taking no risk at all. There is much debate in the legal literature about how much, if at all, personal copying should be considered “fair use” when entire works are involved. L. Ray Patterson is one authority who made a strong case for its being “fair use”–use of the work, as he would say, not use of the copyright (which would only be involved if the professor were to try selling his digitized copies in the marketplace). The professor is undoubtedly minimizing his risk by destroying the physical copy of each book. But he is also not buying a digital copy that may be available in the market at a “reasonable” price, which would count against him on the fourth factor of fair use (impact on the market). But who knows what any given judge would say? A judge might even view this personal copying as acceptable if the professor refrained from selling the physical copies (if he used a nondestructive technology) and instead gave them away as a charitable gift to a public library. 
It’s unclear if the “first sale” doctrine would extend to the digitized copies because they were not bought by the professor but scanned from the original, which is at least a potential infringement of the reproduction right. I am curious that the professor assumes he cannot lend his digital copies, though. If he had bought them and kept them on physical media like CDs, he certainly could lend them because there would only be that one physical copy. He could even lend a digital copy from his computer if he temporarily removed it from his hard drive. The question about the possibility of a computer crash destroying his entire digitized library also raises the issue of backup copies. Copyright law does recognize the legality of making backup copies of software under certain conditions, and this might be extended to his set of circumstances. But, again, since every case of fair use depends on just what circumstances exist, and no court has ruled on just this set of circumstances, I wouldn’t want to be sticking my neck out and positioning myself to be the first test case. Litigation is expensive! —Sandy Thatcher
