Digital Preservation

Last year I blogged about the pros and cons of Google versus the Open Content Alliance, and I stand by most of what I wrote at that time. So far, Google’s content has remained free to access, although publishers have now reached a new understanding and profit-sharing arrangement with Google that has also placed new limits on what information one can find within a Google Book search. Interestingly, under the new agreement only English-language texts remain accessible, while texts in other languages (like German) have been almost completely blocked by publishers abroad.

I do, however, want to draw your attention to two digitization projects right here in the Pioneer Valley. One of the most impressive digitization projects (beyond Google and the Open Content Alliance) has been the digitization of all known Yiddish-language books by the Yiddish Book Center here in Amherst. Begun in 1998 through a major grant from Steven Spielberg, the Yiddish Book Center set out to digitize every book in its collection, along with many supporting documents and art prints. Most of the digitization is now complete, and the contents can be viewed either through the Center’s own website or through the Internet Archive. Although the interface is not as slick as that of Google Books, the historic preservation aspect is far more commendable. The collection, when taken as a whole, is the first digitization project to successfully preserve an entire national literature. Digitization was in keeping with the YBC’s core mission to preserve the literary heritage of European Jews. The story of the YBC itself is a fascinating one (watch the video to find out more).

Another digitization project of note in the Pioneer Valley is right here at UMass. As most of you know, the central research library here is named in honor of W. E. B. Du Bois. One of the many gems found in the library is the collection of Du Bois’ papers, which are housed in Special Collections. Although the Du Bois papers have been accessible to scholars for decades, it was only last year that the library was awarded a major grant from the Verizon Foundation to digitize the collection. The digitization process is slow-going, but some of its “fruits” are already being reaped. This past semester, some of my students worked on a project together with the UMass Center for Educational Software Development (CESD) to create the first e-learning website to use the newly digitized papers. CESD is currently working to port this project onto a larger dedicated site running Omeka, an open-source, database-driven exhibit builder.

One thing that I’ve learned from my own work on digitization projects is that the act of preserving can only be one part of the project. Preservation is the first step, but if one stops at this point, it raises the question of what the project is for. Information, of whatever kind, needs to be well organized (and, if possible, annotated). If we think of such digitization projects as only the next iteration of the archive, we haven’t really done anything other than move information from one format to another. I do believe that there should be some sort of value-added element in such digitization projects. The Google Book project adds a great deal of value: there are links to other texts and to reviews, and one can run extensive searches through books that would otherwise have taken years. Amazon’s digitization is much more restricted, but it adds a great tool: a list of the other books that cite a given book. These are all first steps toward what could greatly alter the way in which digital archives and document collections interact with one another.