Web Archiving Pilot Program

About the Program

The Library is piloting a web archiving initiative to collect, preserve and provide access to online content through Archive-It, the Internet Archive’s service. Archived websites are freely available to the public and can be viewed by date of capture via our University of Michigan Library Archive-It partner page.

Our pilot is scoped into three collections, described below, that complement existing collection areas and research interests on campus. Websites are selected on an individual basis by subject specialists and curators based on the collection development policies for each of the three current collecting areas. Please feel free to reach out to them with suggestions. 

Diversity in Children’s Literature

(Angie Oehrli and Juli McLoone)

In recent years, there has been a growing awareness of the importance of diversity in children’s literature among authors, publishers, librarians, and readers. Therefore, a web archive focused on diversity in children’s literature will complement current collection strengths across the Library and will serve future researchers by documenting an important movement in the field of children’s literature. The centrality of social media to the movement, and as a site for ongoing conversations, makes web-archiving a particularly appropriate mechanism for documenting the diversity in children’s literature “discussion.” We hope that, as much as possible, the scope of the archive focuses on the authentic voices of content creators. Formats collected will include social media accounts, blogs, organizational websites, selected publishers’ websites, selected awards websites and other digital content. 

Interactive Fiction in Queer Digital Culture

(Sigrid Cordell and Meredith Kahn)

New digital platforms and tools for creative expression have transformed the landscape in experimental fiction and led to the emergence of a queer interactive fiction scene that pushes against mainstream commercial literary and gaming culture. LGBTQ writers are using tools that democratize development (because they require few programming skills) to create interactive fictions that take up questions of race, class, LGBTQ identity, mental and physical health, and a host of other issues. Because these works of interactive fiction are independently produced and often made freely available online, there are few mechanisms for preservation, and significant examples can disappear from defunct personal websites. We are collecting websites by individual authors/creators, examples of interactive fiction, and other publicly available contextual information about this form of queer digital culture.

Water Politics in Michigan

(Catherine Morse)

The state of Michigan has seen numerous crises and controversies surrounding the protection of its lakes and rivers as well as the responsibility to provide safe, clean water to residents. The Flint water crisis; the price increases and subsequent water shut-offs in Detroit; and the reaction of local communities to the Enbridge Mackinac pipeline have sparked debates about racism, environmental justice and water as a basic human right. This collection attempts to capture the dialogue about access to clean water taking place on websites and social media between government officials and communities.

How It Works

Archive-It crawls and stores web content using the Heritrix crawler. The crawler begins from a seed list of entry-point URLs and proceeds by following the links it finds on those pages. Archive-It is designed to crawl websites without interfering with access. Most crawls will be run only a couple of times per year, and will last for a few days. Once a crawl is complete, the crawler ceases to interact with the server. If you have any questions or concerns, please contact us at webarchiving@umich.edu.
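For readers curious about what "beginning from a seed list and proceeding via links" means in practice, the following is a minimal sketch in Python. It is not Heritrix or Archive-It code. The toy link map, the function names, the page limit and the rule of staying on the seeds' own hosts are all assumptions made for illustration, and the sketch contacts no real servers.

```python
from collections import deque
from urllib.parse import urlparse

# Hypothetical in-memory "web": each URL maps to the links found on that page.
TOY_WEB = {
    "https://example.org/": ["https://example.org/about", "https://example.org/blog"],
    "https://example.org/about": ["https://example.org/"],
    "https://example.org/blog": ["https://example.org/blog/post-1", "https://other.example.com/"],
    "https://example.org/blog/post-1": [],
    "https://other.example.com/": ["https://other.example.com/page"],
}


def toy_fetch_links(url):
    """Stand-in for downloading a page and extracting its links."""
    return TOY_WEB.get(url, [])


def crawl(seeds, fetch_links, max_pages=100):
    """Visit pages breadth-first from the seeds, staying on the seeds' hosts.

    The host-based scope rule is an assumption of this sketch, not a
    description of how Archive-It scopes its crawls.
    """
    allowed_hosts = {urlparse(seed).netloc for seed in seeds}
    queue = deque(seeds)   # pages waiting to be visited
    seen = set(seeds)      # pages already queued, so none is visited twice
    captured = []          # pages visited, in order
    while queue and len(captured) < max_pages:
        url = queue.popleft()
        captured.append(url)
        for link in fetch_links(url):
            if link not in seen and urlparse(link).netloc in allowed_hosts:
                seen.add(link)
                queue.append(link)
    return captured


captured = crawl(["https://example.org/"], toy_fetch_links)
for url in captured:
    print(url)
```

In this sketch the crawl stops on its own once every in-scope link has been visited, which mirrors the point above that the crawler stops interacting with a server when the crawl is complete.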

All preserved sites will be prominently labeled as an “archived web page” with information about the capture to avoid confusion with “live” websites. Archived versions of websites may appear to be incomplete. Certain types of content are difficult to capture, such as streaming media, database-driven content, and JavaScript-driven content. Additionally, we do not capture areas of websites that are password-protected.

Parties who have questions or concerns should see the Library’s Take-Down Policy: Addressing Copyright Concerns.

Last modified: 06/14/2018