U-M Library Podcast Series: Digital Preservation
February 13, 2020
Remember 5.25 inch floppy disks? No? This early digital storage medium became obsolete in the mid-1990s, but the disks that remain are more than just artifacts. Find out how Lance Stuchell, head of the library's digital preservation lab, coaxes information off of these and other obsolete digital media — once he figures out which side is up.
Speaker 1: If an item doesn't appear in our records, it does not exist.
Speaker 2: I've never seen so many books in all my life.
Speaker 3: Get back to the library.
Speaker 4: You're right. No human being would stack books like this.
Speaker 2: What do you want to take out?
Speaker 3: The librarian.
Speaker 1: Shhh. Quiet please.
Joe Linstroth: Welcome back to the University of Michigan Library podcast. I'm Joe Linstroth. The technology we use every day is constantly changing. For those of you who are a certain age, you might remember saving your work on a floppy disk that actually flopped.
Speaker 5: Are you worried about a floppy disk losing all your payroll records?
Speaker 6: No.
Speaker 7: If you had 3M diskettes, you wouldn't have to worry. 3M floppies are certified 100% error free. No floppy is more reliable. 3M diskettes, one less thing to worry about.
Joe: Now those diskettes touted in that 1985 commercial eventually gave way to the three and a half inch plastic ones that didn't flop anymore. Then came CDs. Then thumb drives and external hard drives. The list goes on. Over the last few years, the U of M Library has been getting more and more archival materials stored on these old devices as well as in formats and operating systems that have long been forgotten.
Lance Stuchell: We were getting in this material that we didn't really know how to handle.
Joe: That's Lance Stuchell. He heads up U of M Library's Digital Preservation Unit.
Lance: So it was on obsolete media. It was on floppy disks, zip disks, kind of external hard drives, things like that. We didn't really have the capacity to take that material in, get it off the material while following best practice. So we had to establish this lab and get some specialized equipment so we could do that work. So if someone gives us a collection, it might be part paper, but it might have like a box of floppy disks. And so those floppy disks come here and then we get the material off those floppy disks and kind of moved into a better place.
Joe: For this episode of the podcast, I went over to the U of M Library's Digital Preservation Lab to check out Lance in action.
Lance: We're in the Buhr Building, which is not-
Joe: Centrally located.
Lance: Right. Thank you for rescuing me on something, a nice thing to say.
Joe: Is there an example that comes to mind from a collection that you've worked on that really just sticks out?
Lance: Yeah, so actually it was the first, I was testing the five and a quarter floppy drive. So five and a quarter floppy drives, you have to get vintage ones. They don't make those anymore. And you have to get a modern kind of a circuit board that will convert it to USB so you can use it with a modern computer. So it was very new to us.
We had just gotten the actual drives in. We purchased them from eBay and John Sayles' collection has a lot of five and a quarter floppy disks. So I just randomly kind of picked a disk up and put it in the floppy disk. And after about an hour when I realized I was putting the disk in upside down, then I was like, Oh, that would probably work better. So I flipped it around and it worked. It showed all the files. And I randomly picked on a file and it was actually, it looked like it was a cover letter that was sent to Charlie Sheen that contained kind of research materials for the character he would be playing in Eight Men Out.
Joe: That's a late eighties movie about the Black Sox-
Joe: The Chicago White Sox that threw the World Series.
Lance: Yeah. Right. So that was cool, instead of being like a random like financial information spreadsheet or something.
Speaker 8: We're going to see the Sox.
Speaker 9: Baseball. 1919. There were no free agents, no million dollar salaries, but there was a team no one could beat. The true story of the team they called the Black Sox and the scandal that broke the heart of a nation.
Lance: So that was kind of cool. I will say it's also really cool when we get faculty members or graduate students that come in and have work that they can't access anymore cause it's on a floppy disk or something. We just a couple of weeks ago had a researcher whose specialization is calls of crickets. So it was all these recordings of cricket calls and so they were actually on zip disks, so we were able to set up our zip disk set up and recover everything and kind of give them back to him on a regular like USB drive. He was very appreciative that we were around and kind of the word is getting out that we can provide this type of thing to campus, which is something we really enjoy doing. Just helping people out with that kind of thing.
Joe: Interesting. All right, so let's get a taste of how you do this.
Joe: You have something here from the U of M Library Collection that you haven't worked on yet. What do you have and what collection is it from?
Lance: Yep, so it's from the Stephanie Mills Collection and this particular disk was identified by the archivist as being particularly interesting. So one of the things she's known for is a 2002 book, Epicurean Simplicity. And this disk actually has some chapters from that book written in draft form.
Joe: And real quick just for listeners at home, it is a floppy disk and in someone's handwriting it reads: Epicurean Simplicity, overpopulation, fetal personhood, moral and ethical, with an ellipsis, all saved in two formats. WordPerfect 6,7,8. ANSI Windows Generic. WP.
Lance: Yes. You don't usually get labels that good. That's a great label. Particularly because it's calling out what format the material is in.
Joe: I mean, is that helpful to you?
Lance: Oh, it's super helpful. It's super helpful because I can tell — so that ANSI Windows generic, that's basically a text file, so it's going to be really simple and it's probably just going to be readable right off the disk.
Joe: ANSI, I just read all the letters.
Lance: Yeah, no, that's okay.
Joe: You can tell what an inexperienced techie I am.
Lance: Right. And then WordPerfect, right? Everyone was using WordPerfect in the late nineties, early two thousands.
Joe: For sure.
Lance: Yeah and now no one uses it anymore. So because that's a proprietary file format that's not around anymore, it's obsolete, that's going to be more challenging. But since we know that this was actually saved in two formats, the likelihood that we're going to get something off of this is pretty high.
Joe: All right, let it rip. Let's slip it in here.
Lance: Okay. So nothing happens at first because the computer is not recognizing the old file system. So we use something called FTK Imager. FTK is a forensic toolkit. And you'll notice as kind of I'm walking through this, it uses very forensic terms. So kind of law enforcement. This is a tool that law enforcement uses. There's a kind of odd parallel between what an archivist wants out of digital content like this and what a forensic investigator might be looking for. [Whirring sounds in background.] And what I'm doing now is drilling into the file structure. So it's taking a while to actually read off this disk to get into the actual files.
So it identifies all of these files in the directory. So that's good. It's identifying some of them as .wpd, which is WordPerfect. And it's also, which makes any digital preservation librarian happy, is that there's also some .txt files. So those are really simple. Those are the ASCII — or rather, the ANSI texts. So we're still looking at the disks. So this looks like —
Joe: Some words just popped up.
Joe: Chapter one Epicurean Simplicity.
Lance: Yeah, so someone interested in seeing her writing process or seeing an early draft of the book. I think they would have just gotten an early draft of chapter one from her book.
Joe: Okay. So it looks like we have something here.
Joe: We've found, you know, at least what might be a draft, an early draft of her book, Stephanie Mills' book Epicurean Simplicity. Okay, so once you've got the material unlocked, what happens next?
Lance: Yeah, so we have a great kind of partnership with archivists in special collections, so special collections is part of the library. So like in this case, the archivist that identified this disk as an interesting one, I can kind of do what I just explained I do and then we would turn that over to the archivist, at least telling them — so we create a lot of metadata and things like that as we do this. And if they were interested in a specific disk, like this disk, I would email the archivist and say, Oh, we have stuff. But it would definitely be kind of the archivist would now, they're the subject expert, so they would be the one that'd be like, Oh, we just struck gold, or like, Oh no one will care about this particular thing.
Joe: So to find out whether we had struck gold, I headed on over to the Special Collections Research Center on the sixth floor of the Hatcher Graduate Library to ask the curator. [Elevator dings.]
Julie Herrada: My name is Julie Herrada. I'm the curator of the Joseph A. Labadie Collection in the Special Collections Research Center at the University of Michigan Library. My connection with the Stephanie Mills collection is that I acquired the collection for inclusion into the Labadie Collection in about 2006.
Joe: So, Lance Stuchell, over at the library's Digital Preservation Unit, pulled some information off of a floppy disk that was among the Stephanie Mills collection. What did he find?
Julie: Well what was on that floppy disk was a draft of one of her books, and it was typed onto probably an old word processing program in probably the eighties or nineties, and then it was left there and never pulled off of that again until very recently.
Joe: So when you looked at it, when Lance sent you the email with the connections, how useful was having this particular information that Lance showed me how to take off this disk?
Julie: Well, it is an early draft of one of her books and it does show, maybe, if you were to compare that with other drafts of her book, you would see the various changes that were made in the writing process. I'm not sure how important that is in this context, but we didn't know what that included until we actually went and pulled it off of that floppy disk. And that goes with any AV, any analog format. We don't always know what's on there until we go through that process. And it might be not very important, but it might be great.
Joe: How important is the Digital Preservation Unit to the work that you do in managing the Labadie Collection?
Julie: Oh, it's very important. I mean, we wouldn't, a lot of material would be lost without them. We wouldn't know what was on there. We would never be able to access it. And it degrades over time, too. So it's important and imperative that we capture this information as soon as possible.
Joe: That was Julie Herrada, the curator of U of M Library’s Labadie Collection. Okay. So our digital detective work failed to discover anything earth-shattering this time around, but more and more material is coming into the U of M Library in all sorts of digital forms. I think it's safe to say that in the years to come, the trend will only get bigger. That means the technology we're using now will sooner or later become obsolete. And so will the technology we'll be using after that, and after that. So back at the Digital Preservation Lab, I had one last question for Lance. How are you preparing now for what you'll need to do your job 10, 20, 30 years from now?
Lance: Yeah, that's a great question. So I think there's a couple of things we're doing. One of the things is we're making sure that the digital content we create as the library is preservable. So what I mean by that is it's in a good format and we don't really worry about the container. We don't really worry about what media it's on because we know that media is going to be obsolete and we know that file format will likely be obsolete. So we just want to understand what's the important content in this file, and how do we migrate that forward, and we have a plan. The terrifying thing, it's terrifying but it's also job security, are things like this, like archival collections where people are creating digital content at their homes, right? And they might not know everything there is to know about creating digital content cause they're not digital archivists or librarians or digital preservation librarians.
And so what we try to do is we try to share our knowledge to the folks that are interested. So we do activities on campus that reach out to the campus community. We've done some public events, like a public library and things just to kind of push out how to create good digital content, because this will always be a thing. Obsolescence will always be a thing, right? There's-
Joe: It seems even faster and faster.
Lance: Yeah. Well all of the equipment and all the software we use, right, it's being made by people who have a commercial interest in us buying it again. Right? So it's always going to kind of be like that. But acknowledging that, we just kind of try to apply some of our digital preservation principles: create good content to begin with, and then manage the content rather than the physical media. We try to push that out so people know about it.
But not everyone has time to do that [content management]. And a lot of people are still going to be creating digital content, on an external hard drive probably nowadays, and the archive won't get it until 15 years later, and we're probably using different technology then. So part of our mission will be collecting older equipment as it goes obsolete, so we have it if we need it. But really, ideally, we're kind of pushing out that knowledge so people know just what it means to create good digital content, in case it ends up here, but also for themselves so the next generation can have their digital photos and things like that.
Joe: You touched on it briefly, sounds like job security.
Lance: Yeah, it's totally job security. Yeah, it's nice. It's horrifying sometimes, like when you open up a box and realize there's moldy floppies in there, but yeah. Yeah, it's job security.
Joe: And that's a wrap for this edition of the University of Michigan Library podcast. A special thanks to librarians Lance Stuchell and Julie Herrada. For more stories about what's happening at the library, check out the latest edition of our digital magazine, which is out now at magazine.lib.umich.edu. I'm Joe Linstroth. Until next time.