From Collecting Dust to Digital Content: Lessons from Working with the Superiorland CDs

As an intern in Digital Content and Collections (DCC), I have been working with various older digital collections and projects to address any issues that have come up since their creation. Each of these collections is unique in content and format, with digitization, description, and access processes designed within the context of grants, stakeholder goals, user needs, and technical capacity. Most recently, I completed a project to ensure access to digitized materials from the Superiorland Library Cooperative (SLC). Through this project I learned several lessons that are useful for me -- and anyone else working with older digitized materials -- to keep in mind for future projects with digital collections.

SLC applied for a Digitization for Preservation and Access (DPA) grant through the Library of Michigan (LoM) and used that funding to coordinate with libraries and historical societies to collect and digitize materials related to the history of Michigan’s Upper Peninsula. LoM coordinated the DPA grant program in the early 2000s to help local institutions digitize materials related to Michigan history and culture and make those materials available online. DCC received the SLC materials through a 2006 memorandum of understanding (MOU) between the University of Michigan and the LoM. The MOU was meant to ensure that items digitized through the DPA grant would be hosted online, since SLC member institutions had the capacity to digitize materials but not, at that point in time, to make them available online to users.

Through this project, I discovered three key takeaways for working with older digital collections and projects:

  1. Check if someone else has already done the work.
  2. Digitization is not (always) preservation.
  3. Putting an item online does not automatically make it accessible.

Check if someone else has already done the work

When I resumed work on this project in 2018, the last communication between DCC and either LoM or SLC had taken place nearly a decade earlier, and our contacts at both organizations had since retired. Because so much time had passed since the digitization effort, we decided to check whether the materials had already been digitized and uploaded by other institutions. For published materials such as books, WorldCat and HathiTrust Digital Library are good places to check.

Of the twenty-three items represented within the SLC materials, fourteen were found to be books that had since been digitized and made available online through HathiTrust Digital Library by other libraries. Because these volumes had already been made publicly available online elsewhere, we decided we did not need to host an additional copy, since the MOU stipulations of public online access were already fulfilled. I then investigated the nine remaining volumes and extracted the bibliographic information from their title pages, so that we could put them online with the appropriate metadata.

Digitization is not (always) preservation

The physical SLC materials -- CDs in two cardboard lids.

The SLC materials came to me in the form of two cardboard bankers box lids stacked with CDs. While some of the CDs had printed slips with basic information about the item, others had only a couple words written on them with a Sharpie; paper-sleeved CDs were placed into padded envelopes, and many were labelled with only a word or two scrawled onto a small Post-It note. Since the labels were so sparse and there was no inventory list of the items or files within each item, each CD needed to be opened to understand what exactly its contents were.
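An inventory step like this can be partly automated. The following is a minimal sketch, not the process actually used on the SLC discs: it assumes each CD is mounted or copied to a folder, and the function names (`inventory_disc`, `write_inventory`) and the CSV columns are hypothetical choices for illustration. It records every file's path, size, and SHA-256 checksum, so a disc with no readable files shows up immediately as an empty inventory.

```python
import csv
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def inventory_disc(root: Path, disc_label: str) -> list[dict]:
    """List every file under `root` with its size and checksum.

    `disc_label` is whatever is written on the disc or its sleeve
    (an illustrative field, not part of any standard).
    """
    rows = []
    for path in sorted(root.rglob("*")):
        if path.is_file():
            rows.append({
                "disc": disc_label,
                "relative_path": path.relative_to(root).as_posix(),
                "size_bytes": path.stat().st_size,
                "sha256": sha256_of(path),
            })
    return rows


def write_inventory(rows: list[dict], out_csv: Path) -> None:
    """Write inventory rows to a CSV file with a header line."""
    fields = ["disc", "relative_path", "size_bytes", "sha256"]
    with out_csv.open("w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=fields)
        writer.writeheader()
        writer.writerows(rows)
```

For example, `write_inventory(inventory_disc(Path("/media/cdrom"), "disc 01"), Path("inventory.csv"))` would produce one row per file, where the mount point and label are placeholders. Keeping the checksums alongside the file list also gives later staff a way to confirm that copies of the files have not changed.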

Of the remaining nine volumes, only one turned out to be a published book; the other eight were archival materials. We decided to create a catalog record for the book and then ingest it directly into HathiTrust Digital Library. We were able to put Women on the Raisin during the War of 1812 online, and though the text is only searchable and not fully viewable for now (since the book is still in copyright), it is officially online and users can search it (and request the volume from their ILL service, if interested).

The remaining eight sets of materials were collections of unbound archival material. These were split into letters and journals related to the War of 1812 and photographic collections from Upper Peninsula historical societies. The metadata files for the War of 1812 materials were in the MDB (Microsoft Database) file format, a format specific to Microsoft Access (not used in our group), and there were some initial difficulties extracting the information. For the largest historical photograph collection, which spanned twelve CDs, the first CD, which should have held the metadata file, could not be opened at all. A consultation with Digital Preservation Librarian Lance Stuchell determined that because the initial burning of that particular CD had been interrupted, no files had actually been written to it. Though only one CD had issues, it was the one meant to contain the metadata for the entire collection, so we could not identify or display any of that collection's materials.

There is sometimes an assumption that digitizing materials -- especially old or fragile items -- is the best way to keep them safe. However, as the problems with the CD showed, digital storage presents its own challenges, since digital materials can be irreparably or irretrievably damaged, just like physical ones. In fact, while a book or box of photographs exposed to water, fire, mold, or pests may still be usable after undergoing additional preservation, a hard drive subjected to the same conditions is unlikely to be so lucky.

Putting an item online does not automatically make it accessible

Because the remaining materials were archival and not published, my initial search of HathiTrust Digital Library and WorldCat did not produce any matches. Additionally, there is no comparable consolidated catalog for archival materials, which makes digitized archival materials much more difficult to find than published ones -- though the Digital Public Library of America and Europeana exist, they are more limited in scope. I did attempt to search via Google, but because archival items often have sparse, general, or non-unique description at the item level -- for example, multiple images titled “Telegraph train order” -- it was difficult to track down particular items.

However, because we could not move forward without the metadata for one of the collections, I took a step back to see if I could get in contact with someone at SLC who might have a backup copy of the missing metadata and image files. Through the SLC website, I discovered that in the twelve years since the MOU, all the archival materials (with the exception of one letter) had already been put online through the Upper Peninsula Regional Digitization Center and the Monroe County Library System. (Links to each of the SLC collections have been included at the end of this blog post.)

Since the remaining letter is by the same writer as several of the others, we decided that it would make the most sense to have it available in the same place as the others. We then reached out to the manager of the existing repository to send them the remaining images and metadata.

Though the materials had already been digitized and put online, along with their corresponding metadata, because of the lack of a centralized aggregate archival repository or catalog and because archival materials often have non-unique metadata, I could not initially find them. It was only by discovering the correct digital portals through the SLC website that I was able to navigate to the materials. In effect, I was only able to find the materials because I knew that the materials had been digitized and were likely in a particular location. And, even with that knowledge, I had some difficulty with navigating between the collections and items in the online portals, as well as with using the website’s search function to find the collection or item I wanted. Though the SLC materials are historically rich and deserve to be online to maximize public access, I could imagine the average user facing similar challenges in discovering and using them.

Final Thoughts

Overall, this was a really interesting project in understanding how the digital library landscape has shifted over the past decade and in thinking about the challenges of working with digital materials in such a rapidly changing landscape. While adaptability and flexibility are certainly important traits, more thorough documentation can help ensure that projects do not fall through the cracks when staff turnover occurs and that future staff can carry on and improve upon older collections. And as we learn more about user behaviors and needs in digital environments, being able to apply these new lessons to older collections ensures that they continue to fulfill their original mission and vision of making information freely available.

I would like to thank all the people at the University of Michigan and the Library of Michigan who helped with this project. Thank you to Shannon White at the Library of Michigan for helping us liaise with the various stakeholder libraries and institutions, and for making sure that everyone’s needs and concerns were communicated clearly and met. Thank you to my supervisor Kat Hagedorn, for taking the initiative to rediscover and revive this project, to ensure that as many materials as possible are available to as many users as possible, and for your infinite patience in answering my endless questions. Thank you to my predecessor intern Kathy Kosinski, for doing preliminary investigation on the books that were part of these materials. Thank you to Lance Stuchell at the Digital Preservation Unit for lending your equipment and expertise to help with some tricky old hardware. And lastly, a big thank you to project manager Lauren Havens for many rounds of feedback and editing on this blog post. It was a pleasure working with all of you!

SLC Materials

Published volumes already in HathiTrust Digital Library from other institutions:

Item direct-ingested into HathiTrust Digital Library:

Archival collections available through the digital portals of SLC member-institutions: