Preserving the Future

We’ve already spent some time this semester discussing the challenges of digital preservation, but this week I’d like us to pay particular attention to the role of digital archives. Most archives accession only the most important documents and usually limit their content to particular fields. For instance, state archives focus on state government. The UMass special collections (which is also the University Archive) of course has many documents related to the history of UMass, but also specializes in the history of social movements (with the papers of W.E.B. Du Bois as the highlight). Yet, even with these areas of specialization, archives (when we look at the collection of physical documents) cannot collect everything and must constantly make decisions regarding what to keep and what to throw away (literally onto the dustbin of history…).

The advent of digital preservation, however, has allowed archives to be less “picky” and accession many more items than in the past. What we as historians are left with is a double-edged sword of sorts. On the one hand, we (or our counterparts in the future) now have much more information than we normally would have access to. On the other hand, however, we now have much more information than we normally would have access to… Yes, it is the rhetorical embarrassment of riches. How can we sort through all of this information? Most historians owe a great deal to the work done by archivists in organizing and culling the documents so that when we enter the archives, we can focus our attention on the contents of the documents, rather than on finding the relevant documents. Yet, we now might be able to find information more quickly (through digitization) and also vastly widen the scope of our research, since we now have access to so many other layers of information and society (i.e. not just the official record, but also the voices of secondary and tertiary sources).

Some national institutions have set out on amazingly ambitious archiving and digitization programs. A great example here is the national library of the Netherlands (Koninklijke Bibliotheek), which has set out to digitize every known document pertaining to the Netherlands (every document!). The library’s “E-Depot” guarantees long-term storage of digital material, and the library has been able to use this guarantee to attract most local and regional archives in the land to consolidate their information. The KB will accept any type of format – including websites, digitized originals, e-books, e-journals, and much more.

In 2008, the German parliament passed a digital preservation law regarding German websites. The law itself is rather bizarre to be honest – all websites must submit a copy of their website to the national library, the Deutsche Nationalbibliothek. According to the FAQ at the library’s own homepage, website producers must submit full copies of their web content to the library at regular intervals. The content covers all aspects of the website – from text, to pictures, to audio and video files, as well as any multimedia elements (like Flash-based plug-ins). Needless to say, this law has created quite a backlash in Germany. The library still has not been able to set up a way for website producers to easily submit their content, nor has the library explored a way to collect such data on its own (as Google or the Internet Archive does). In the meantime, the library has reduced its digital accession to just online journals, e-books, and university publications (master’s and Ph.D. theses). The same FAQ claims that the technical aspects of collecting websites are still in development and that the library is therefore not yet accessioning them (although by law, the website producers still need to keep copies for future donation to the library!).

Not to be only Euro-centric here in today’s post, I’ll also add a few words about two great examples of digital preservation here in the US. First, check out this article about the state archivist of Minnesota (where else, right?). Also, the state of Washington was the first state to create a separate digital archive (rather than expand the existing physical archive). I won’t go into detail here, but do check out those sites and let me know in class what you think.

Digital Copyrights

In 1998 the US Congress amended the US copyright laws to take into account the various changes that have come about due to the new digital nature of much of our creative content production. You can read more about the Digital Millennium Copyright Act here, although a good short explanation of the highlights can be found here. As you will see, you don’t need to be an intellectual property lawyer to tell that this was a major blow to the initial openness of information brought about by the internet. It is interesting that the main proponents of this were commercial interests, while those who opposed it were academics, scientists, and librarians – the same people who have gotten behind the Open Access journal movement. As you will see in your readings this week, an argument has been made (and failed) that the founding principle behind establishing a copyright law in the United States (and in other capitalist countries) was to force companies and individuals to share their work. A copyright was supposed to be treated similarly to a patent – in return for sharing your knowledge, you would receive a monopoly over that knowledge (or creative content) for a limited period of time. However, through the lobbying efforts of Walt Disney and other media giants, copyright laws have been extended and expanded to the point that some critics believe the system is now broken.

The Digital Millennium Copyright Act further limits the non-paying public’s access to this content and prohibits the circumvention of copy protections (like the encryption on commercial DVDs or CDs), since circumventing them makes it possible to produce a perfect digital copy of a movie or song. On the one hand, this is of course fair – it protects those who have invested a great deal of money in producing a film or album. Yet, the language of the DMCA is so broad that it has the effect of limiting access for those who cannot pay.

As protections have gotten stronger (both technologically and legally), the cost of purchasing content has generally risen. Sure, the music companies and others have suffered greatly as CD sales and film purchases have dropped with the rise of file-sharing networks. Yet, it does seem that there should still be room for “fair use” by individuals who are acting within their rights to own a backup copy, as was the case with analog media (you were allowed, for instance, to make a cassette recording of a record without the fear of being sued by the record company). Now, there is a difference (at least I would argue) between making a copy for personal use (say an MP3 copy of a CD for use on your iPod) and distributing those files to millions of people over the internet or (in the extreme case) selling illegal digital copies for profit. Yet, the DMCA does not make a provision for creating a personal copy of digital media. I would be curious to hear the opinions of others on this matter – either in the comments or in class on Tuesday.

Here are some other things to think about for Tuesday:
  • What are the limits of copyright protection?
  • As an online exhibit designer, what do you need to know about copyright?
  • How can you protect your own or your institution’s intellectual property?
  • What do you think about the Open Access movement? Does it have enough support to fend off the interests of Hollywood?

ADDENDUM: Check out this interesting lawsuit, Viacom v. YouTube. What do you think about this?

Born Digital

I blogged about the assigned websites last year, and you can read that post as well by clicking here.

This week I would like to expand on what I wrote last year by looking at a few different elements of those websites that I didn’t discuss last time around. First, there is a growing trend among digital archives to stop at just the digitization step and either ignore or purposely decline to curate the content that is being collected. This is becoming more and more common as individuals and institutions often underestimate the costs of setting up such a digital archive (and thus seldom have any money left to do any curation or analysis of their own) or they believe that it is not part of their purpose. These archives, like “real” archives, are there to facilitate analysis by others. Although I agree that accessibility to scholars and other interested parties is paramount, we might be missing a chance to rethink how an archivist interacts with scholars and the larger public. Perhaps the archivist could authenticate the donated item. Or the digital archivist could offer a way to provide contextual information to the “raw” data that is being collected. It seems that we have simply transferred the analog format onto the digital without looking at how it might be different this time around.

I do, however, want to raise an issue that I also raised last year and hope we can discuss in our seminar. How do we control for authenticity with such crowd-sourced digital archives? Does it matter if we know or don’t know the social, political, or economic background of those who contributed to the digital archives? What might be the pros and cons of being able to contribute anonymously? Does the trend toward digital archives (and self-submission) mean that we no longer want to preserve analog sources? Don’t people still often take notes on paper, doodle, draw, or leave behind other artifacts? Are digital archives only good for digitally produced items (digital photographs and electronic texts)? What would a hybrid archive look like in, say, 10 years’ time?

We should also discuss the concept of the “digital generation” and how the new generation might want to preserve their childhood now so they have access to it in the future.