News Archive

July 1, 2004

The TCP has released a new newsletter for Summer 2004 containing general updates for the EEBO-TCP, Evans-TCP, and ECCO-TCP projects. The project releases this newsletter bi-annually for people interested in updates, how the projects are used, and more.

If you have any questions or comments about the newsletter or would like some paper copies, please contact shawnmar@umich.edu.

August 6, 2004

Oberlin Group Schools join EEBO and EEBO-TCP

Over the summer a large number of Oberlin Group schools (a group of liberal arts colleges throughout the U.S.), including Albion College, Bates College, Bucknell University, Carleton College, Colby College, Earlham College, Middlebury College, Mt. Holyoke College, Trinity College, and Wesleyan University, bought the EEBO product and joined the EEBO-TCP project. Several more may still join. We are grateful for their contributions, which will enable us to create thousands of texts, and we hope to discuss possibilities for integrating these texts and images into undergraduate curricula.

September 2, 2004

Before the JISC's launch event in October, Pat Leon wrote a report discussing the value of EEBO and the TCP. It is available on the JISC website and is also reprinted below.

 

Early English Books Online: The Holy Grail of Online Resources?

Early English Books Online, now made available free to every college and university in the UK, is introducing lecturers, students and researchers to the full possibilities of online resources. Pat Leon reports on this extraordinary resource as well as a new project which will enhance it still further.

In the final chapters of the popular thriller The Da Vinci Code, hero and heroine pin their hopes of finding the Holy Grail on King's College London software sifting through a vast digital library of texts for just a few words. Fiction maybe, but such fine-tuned trawls of vast national archives are fast becoming fact as technology advances. Early English Books Online (EEBO) is one example of this.

The site holds digital page images of more than 125,000 books, pamphlets, treatises, sermons, plays and other works published between 1473 and 1700. The originals are listed in three catalogues - Pollard and Redgrave, Thomason Tracts and Wing - all previously available on microfilm. Variations in typography, spelling and punctuation, however, make word searches difficult. To solve this, in 1999 the Text Creation Partnership was formed by the universities of Michigan and Oxford and the microfilm publisher ProQuest. The target was to key in a fifth of the works as SGML/XML text.

Now JISC, as part of a deal struck earlier this year with ProQuest to make EEBO free to UK universities and colleges, is calling on UK academics and librarians to choose their favourite texts. A US-based advisory panel will review suggestions. Emma Beer is JISC coordinator of a special UK launch on October 25 at the British Library, which together with the US Council on Library and Information Resources is also steering the project. Professor Lisa Jardine will be introducing the day. Beer says: "This is a crucial chance for UK academics to have their say about titles they would like keyed in."

The TCP has attracted US$6 million (£3.3 million) of which JISC has contributed £750,000 on behalf of all UK further and higher education. This compares with US and Canadian universities, which pay anything up to $50,000 (£28,000) each. ProQuest has matched 20 per cent of contributions. The target is $9 million (£5 million).

EEBO users in the UK are enthusiastic about opening the site to a wider, more diverse audience. Justin Champion, Professor of History of Early Modern Ideas at Royal Holloway, University of London, has used EEBO for nearly six years. "Once you get a taste of what research can be like with EEBO you want more. It transforms how you work. I can work at 2am. I can scribble on my printouts. I'm not restricted by library opening times. I've cut my transport costs and time. It's simply more efficient."

Champion's teaching has also changed. "When planning courses I used to take months identifying texts, locating libraries and trying to get the material photocopied. Now I just send students to the URL."

Richard Sugg, an English lecturer, agrees. He discovered EEBO when he moved to Cardiff in 2001. "Cardiff was one of only a handful of universities that subscribed to EEBO," he says.

Expense was the main reason. JISC, however, has bought a license in perpetuity. The only charge to institutions is a hosting fee of between £78 and £2,200 a year according to student numbers.

"EEBO's made a colossal difference to research," Sugg says. "Arnold Hunt recently wrote in Times Literary Supplement that internet resources such as EEBO have turned research 'from a labour-intensive handicraft into a mechanised industry'. Before when writing a book or article you had to make notes of what you needed to check in primary sources in the British Library or wherever, now once you've got your password you just go to EEBO."

In teaching, EEBO changes the nature of courses. "Students can't buy or borrow so many texts. With this they can push the limits of their initiative, imagination and industry in a way that not even the most adventurous could have done before unless they lived near a famous library."

Matthew Steggle, lecturer in English at Sheffield Hallam University and editor of the e-journal Early Modern Literary Studies, says: "My university doesn't have EEBO so my experience is by free trial. It's been a revelation."

Steggle was interested in Renaissance work on the Greek dramatist, Aristophanes. "I pumped in the word 'Aristophanes' and he came up all over the place, not at all where I expected. In sermons, for example, he was cited as an authority on classical ideas of the after-life." His worry is that smaller university departments might not have the clout needed to persuade libraries to buy into this resource. "Pressure needs to come from academics from different departments banding together," he says.

Tim Hitchcock, Professor of 18th-century History at the University of Hertfordshire, has been using EEBO on and off for a couple of years for his research and website on the Old Bailey and more recently for a book on begging. Hitchcock believes that EEBO technology lends itself to the development of what he calls "historical forensic linguistics". "Because the text exists as searchable XML, there is an index of every word that appears. I'd like to see tools for querying these indexes, say, through word proximity statistics. This would revolutionise our understanding of the origins of texts, of the linguistic links between them, and of the influences and copying. This cannot be done on paper, but it could allow us to redraw the intellectual map of early modern Britain."

Such searches are reliant on accurate text input. The TCP has outsourced keyboarding to three companies in India. Two keyboarders simultaneously transcribe texts and add tagging to capture the structure of the work, such as chapters, paragraphs, etc. A third person checks both transcriptions before sending them off to Michigan and Oxford for sample reviewing. If more than one mistake per 20,000 characters is found, texts are sent back. The current mistake rate is one per 100,000 characters.
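The acceptance rule described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic: the function name, the idea of passing in a sampled character count, and the treatment of partial samples are assumptions, not a description of the actual review tools used at Michigan or Oxford.

```python
from fractions import Fraction

# Threshold quoted in the report: a text is sent back if more than
# one mistake per 20,000 characters is found in the sample review.
MAX_ERRORS = 1
PER_CHARACTERS = 20_000


def must_return(errors_found: int, chars_sampled: int) -> bool:
    """Return True when the sampled error rate exceeds 1 per 20,000 characters.

    Hypothetical helper: the real sampling procedure is not described
    in the article, so the inputs here are assumed.
    """
    if chars_sampled <= 0:
        raise ValueError("chars_sampled must be positive")
    # Fractions avoid floating-point rounding at the boundary.
    rate = Fraction(errors_found, chars_sampled)
    return rate > Fraction(MAX_ERRORS, PER_CHARACTERS)
```

Under this reading, a text at the reported current rate of one mistake per 100,000 characters passes comfortably, while two mistakes in a 20,000-character sample would send it back.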

Judith Siefring, who works at Oxford Digital Library on the project, says that some 6,499 texts are already available online. Some 250-plus texts are reviewed per month. "We check and edit the tagging of each text and after that assign types and generally bring the texts up to standard. We have to make editorial changes that take account of the variety of texts, e.g. poetry, prose literature, plays, sermons, almanacs, dictionaries, mathematical treatises, etc."

Siefring welcomes the extra contact with UK academics that JISC involvement is bringing. Across the Atlantic, Shawn Martin, TCP outreach librarian at the University of Michigan, has found that US academics' requests for texts have varied. "They range from canonical works used in undergraduate classes (Spenser, Marlowe, Shakespeare) to obscure titles used only in specific graduate seminars or research projects (treatises on windmills, sermons on specific psalms, works on lexicography)."

Martin's brief involves not only EEBO but Evans Early American Imprints and Eighteenth-Century Collections Online (ECCO). He is anxious that UK audiences do not see these projects as purely commercial. "We try to get as much academic collaboration as possible not only in text selection, but by helping academics with syllabi, scholarly projects, or other initiatives. We have more than 100 libraries internationally supporting this."

But for Royal Holloway's Justin Champion there are still “battles to be fought with academics who like to head off to the research library, bury their head in books, have a coffee break and at some point write up their notes. They say IT’s not for them, but if you don’t use the web nowadays, it’s like walking around in shackles.” Perhaps they just don't believe they'll find their Holy Grail.

TCP Board Meeting
Minutes
October 21, 2004

Attending:

Board Members: William Gosling (Chair, University of Michigan), Betty Bengston (University of Washington), Mark Dimunation (Library of Congress, for Deanna Marcum), Marianne Gaunt (Rutgers University), Ronald Milne (Oxford University), William Miller (Florida Atlantic University), Jeff Moyer (Gale, for Richard Foley), Remmel Nunn (Readex), Mary Sauer-Games (ProQuest Information and Learning), David Stam (Syracuse University), William Walker (University of Miami)

TCP Staff and Guests: Ross Coleman (University of Sydney), Shawn Martin (University of Michigan), Mark Sandler (University of Michigan), Paul Schaffner (University of Michigan), Perry Willett (University of Michigan)

Unable to attend:

Nancy Davenport (CLIR), David Ferriero (New York Public Library), Sarah Michalak (University of Utah), Carole Moore (University of Toronto)

I. Welcome

William Gosling opened the meeting and welcomed new members, including representatives from the two new publishers joining the TCP effort, Remmel Nunn from Readex and Jeff Moyer from Gale, as well as Mary Sauer-Games from ProQuest.

II. Evolution of the TCP Board

Since the TCP has evolved from one project in cooperation with ProQuest to now three projects in cooperation with three different commercial publishers, it is useful to consider how the Board might adapt to accommodate the changing situation. The Board discussed whether it was fitting for it to evolve as a TCP Board concerned with all potential TCP projects (including those to which individual members of the Board may not subscribe) and how its meetings could be structured to address overarching TCP objectives as well as the specifics of individual projects.

The representatives of the three companies began the discussion by highlighting the things they would feel uncomfortable revealing in front of their competitors. These included pricing, their contributions to the TCP, and general marketing strategy. It was agreed that the Board should attempt to structure its meetings so that all members could be present and, if at some point there was a need to divulge sensitive information, the Board could hold an executive session in which that could be discussed.

III. Project Recruitment updates

Mark Sandler then discussed the upcoming event celebrating the JISC launch of EEBO and EEBO-TCP. The JISC is contributing significantly to the project, which will allow us to create thousands of new texts. Its participation will also extend the pool of potential users to all higher and further education institutions in the UK.

Shawn Martin then updated the Board on partner recruitment over the past year, which was quite significant. Over thirty institutions have joined since the last Board meeting. Among those were the Oberlin Group of liberal arts colleges, the Council of Prairie and Pacific University Libraries (COPPUL) in Western Canada, as well as UCLA and UC-Berkeley. Prospects are also very good for the coming year, and TCP continues to grow with recruitment of 30 new partners in both the Evans and ECCO initiatives. Board members also suggested several independent research libraries and large public library systems as potential recruits for TCP as well as some international institutes and organizations.

IV. Budget Review

Discussion of the current TCP Budget then began. The Board considered a new proposal to structure the budget so that TCP allocates our outreach and administration costs across all three projects based on production output rather than itemizing costs individually. It was agreed that this should work for now, but should be reconsidered later because outreach costs for EEBO should be less when the project comes to a close. It was agreed that at some point, the outreach and administration costs should be viewed as “billable hours,” and allocated as such, but in the meantime the budget as presented will work.

The Board then considered several scenarios detailing the future growth of the project. Mark Sandler presented figures showing past growth and gave some projections for future growth. It was agreed that TCP should aim to recruit 28 new partners before the next Board meeting, and that would ensure the creation of 25,000 texts and could present TCP with a stronger case as it goes for second round funding, grant funding, and other sources over the coming years.

 

V. Board Discussion Items

Rights of Use

The Board discussed the increasing number of requests the TCP has been getting to use TCP text in other projects. Shawn Martin presented some examples of previous decisions and asked for board advice on how to go about future ones. The Board agreed that decisions made so far had been easily justifiable and that it would be possible for TCP to sell its text to commercial publishers according to a fee schedule to be developed. It was agreed that TCP staff should draft more specific guidelines and circulate them among the Board for further consideration.

Price Increase

Mark Sandler then presented a proposal to the Board regarding a price increase and some possible ways of implementing it. The Board agreed to a 20% increase to commence on April 1. It was also suggested that we might wish to investigate the impact of this increase on smaller institutions and determine what they might like to see in terms of structure and price.

Multiple editions

Shawn Martin then presented an idea put forth by many faculty requesting second editions of text for inclusion in TCP (rather than selecting only the first) if faculty and scholars offered to pay for them. The Board decided that this should not be a problem if scholars or their institutions wished to pay for the text and TCP staff should take into consideration further administrative costs if needed.

Evolving TCP Model to fit community goals


This discussion came up in several contexts throughout the meeting. TCP has become a large project serving many different scholars and partnering with many new companies. Therefore, it would be beneficial to add new revenue and sustain the project beyond the three commitments currently in place. Nevertheless, this does add complications in administration and current contracts. The Board discussed several possibilities.

Potential grants were discussed to fund publicly accessible texts for the project. It was also suggested that TCP set itself up as a model of potential study for things like digital preservation, collection development, education at both a college and K-12 level, and scholarly access and communication. It was noted that further expansion of the TCP, especially for grant funding, might cause problems with some of our existing contracts with commercial publishers and that any grant funding we receive should be used to expand the project rather than fund current commitments (i.e. go beyond the 25,000 committed to in EEBO).

The Board also discussed the possibility of an ongoing partnership in which members pay a kind of subscription that keeps the TCP project going “for the common good.” It was thought that there might be potential for creating a kind of membership organization at least among the best customers of the TCP. The Board agreed that once EEBO-TCP can approach people from a position of strength, with 25,000 texts due to be completed, it would be beneficial to pursue this as a possible model for continuing TCP growth.

VI. Project Development and Production Updates

ProQuest

Mary Sauer-Games updated the Board on the many new developments with the EEBO product. ProQuest has sold to 43 new institutions this year, has included new material like the Thomason Tracts, and has added 6,500 TCP texts to its database so far. To date, there have been over 61,000 views of TCP text.

Readex

Remmel Nunn then discussed the developments for the Evans product. So far, over 106 institutions have subscribed to Evans and they are further developing the Shaw-Shoemaker database. He also discussed the recent selection task force meeting held in Worcester, MA. The greatest challenge will be to select those texts in the coming months.

Gale

Jeff Moyer then updated the Board on its progress with the ECCO product which contains over 26 million pages and 155,000 volumes. To date, they have 60 customers including 6 Canadian and 11 international institutions. They have also done OCR for the ECCO product and are interested in how the TCP text will work and integrate with their OCR.

Australia

Ross Coleman from the University of Sydney was a guest of the Board in the afternoon to discuss opportunities for the three commercial publishers and TCP in Australia. He detailed the structure of the libraries in Australia, noted that they have a strong tradition of consortial purchasing, and suggested that TCP might be able to maximize its impact by selling jointly with the commercial publishers. It would therefore be best to sell to the consortium rather than to libraries one by one.

Production

Perry Willett and Paul Schaffner then presented the Board with the latest production updates. Perry, who had come from a TCP partner institution and was now in charge of TCP production, remarked on the amazing ability of the staff to meet production targets and goals. Paul discussed what some of those goals were, noting that 8,000 texts have been produced so far and that roughly 14 to 21 books per day are managed through our process. He also discussed the difficulty in deciphering the many unusual characters in early printing.

Outreach

Shawn Martin then updated the Board on the many outreach efforts this year (exclusive of new partner recruitment). In addition to conferences and publications, TCP has piloted new projects at the University of Toronto, National Library of Wales, and the University of Chicago. Scholarly projects around the world are increasingly asking for TCP’s help and partnership in their own projects. TCP now has two teams of School of Information students studying the effects of TCP on education, and initiatives set in motion last year like the Academic Advisory Group and requests for records have been successful. Overall, it has been a very good year for TCP in terms of both recruitment and other outreach.

VII. Planning for next meeting

The meeting adjourned at approximately 3:00 p.m. TCP staff will be in touch with Board members regarding additional materials and plans for the next TCP Board meeting.

October 25, 2004

The Joint Information Systems Committee (JISC) in the U.K. launched both Early English Books Online and the Text Creation Partnership at a recent event at the British Library entitled "Waking Up in the British Library." The event was a great success and raised awareness of this resource among many academics in the U.K. Below is an agenda for the event, and links to the relevant presentations. For more information about the JISC and EEBO-TCP, you can also go to their event site. JISC also put out some excellent press releases that will give TCP much publicity throughout the UK.

"...WAKING UP IN THE BRITISH LIBRARY..."

British Library Conference Centre

25 October 2004

10.00 - 16.00



PROGRAMME

10.00 REGISTRATION and COFFEE

10.25 OPENING REMARKS
JOHN TUCK
Head of British Collections, The British Library

10.30 KEYNOTE SPEAKER
PROF LISA JARDINE
Professor of Renaissance Studies at Queen Mary and
Honorary Fellow of King’s College Cambridge, Director of
AHRB Centre for Lives and Letters

11.00 INTRODUCTION TO EEBO AND TCP

THE TCP PROJECT - STRATEGY
MARK SANDLER

University of Michigan
Collection Development Officer

11.30 REFRESHMENTS

11.45 THE TCP PROJECT – EDITORIAL

JONATHAN BLANEY, EMMA LEESON, JUDITH SIEFRING
University of Oxford
TCP Reviewers

12.15 USING EEBO-TCP IN TEACHING AND RESEARCH
DR MATTHEW STEGGLE

Department of English, Sheffield Hallam University

PROF JUSTIN CHAMPION
Department of History, Royal Holloway, University of London

12.45 QUESTION AND ANSWER PANEL WITH SPEAKERS AND
LORRAINE ESTELLE, Collections Team Manager, JISC

13.00 LUNCH

13.45 HANDS-ON WORKSHOP

15.40 OPEN FORUM

16.00 TEA and CLOSE

 

Mark Sandler, who attended the event on behalf of the TCP, commented that:

'Before too much time slips by, I wanted to convey my thanks to all the participants who took time from their busy schedules to give consideration to the EEBO-TCP initiative. While it is easy to think of JISC as "just another library consortium," in practice it far exceeds that narrow characterisation. It is one thing to serve as a purchasing agent for a group of libraries, but quite another to take responsibility for extending educational opportunity to all students in the UK. Even more impressive is taking responsibility for ensuring that resources, once purchased, are effectively used. JISC is an exemplar for library consortia and cooperative library initiatives everywhere.

As for the "Waking up in the British Library" program, it was extremely gratifying to hear from faculty that it was contributing to their research and teaching. The faculty speakers touched on some critical themes about EEBO and EEBO-TCP. They talked about the democratising effects of making the texts and images widely accessible. At least two of the speakers used the word "transformative" to describe the research and pedagogical impact of searchable text presented in conjunction with page images from original editions. They also mentioned the broad interdisciplinary access that searchable text provides - one no longer need be a specialist in the early modern period to seek out and find relevant resources. And finally, one or more of the speakers noted that even for specialists, this has greatly extended their reach to uncover new information - be it central or contextual to the primary themes of their research. Needless to say, all of these reactions are quite reinforcing to the EEBO-TCP production teams at Michigan and Oxford who always believed they were making an important contribution, and are now hearing that they've been on the right course.

The broad range of access that the JISC negotiated on behalf of the UK will undoubtedly breathe new energy and new ideas into the project. We are eager to capture examples of classroom use of the project from different types of institutions of higher education. We're hoping to see many UK submissions coming forward from the annual student essay contest jointly sponsored by ProQuest and the TCP. We're poised and ready to assist scholars in their research and teaching by moving forward the texts they need, or providing detailed information about search options. And finally, we're open to partnering with many related initiatives underway in the UK, understanding that linking resources is a great service to users in an increasingly complex information world.

We are committed to keeping the project website fresh, providing updates on project output (currently at 7,891 texts online), new texts being added, instructional ideas, meetings, and forms for suggesting useful titles for keyboarding. We look forward to lots of ongoing contact with users in the UK, and hope that the JISC staff can stay involved as well in providing feedback to the project, and continuing to encourage uptake and experimentation. Thank you again for all your efforts in organizing the excellent rollout event at the BL.'

 

November 10, 2004

TCP Hosts University of Michigan Focus Group on Electronic Resources
Summary by Julia Gardner

Attendees: Julia Gardner (Student, School of Information), Shawn Martin (TCP Project Outreach Librarian, University Library), YuFang Lin (Student, School of Information), Steven Mullaney (Professor, Department of English), Devon Persing (Student, School of Information), Eric Rabkin (Professor, Department of English), Rasika Ramesh (Student, School of Information), Mark Sandler (Collection Development Officer, University Library), Kathy Schroeder (Student, School of Information), and James Sweeney (Student, School of Information)

Class use of e-resources
Both faculty members indicated they (and the students) valued ease of access to electronic materials. They indicated that particularly now, with widespread broadband access, they feel comfortable assuming all students can get access to electronic texts at any time. Their comments indicated that, as anticipated, students prefer resources that are always available, easily searched, and accessible from their point of need (i.e. does not require a trip to the library).

One faculty member does not teach in early modern studies, and the texts he works with are not currently available online; however he indicated that if the materials were available electronically he would construct different assignments from the ones he now gives, in order to more fully utilize the options available from electronic text.

Both indicated that they would like to see wireless access in classrooms to facilitate greater use of electronic resources and to support the spontaneous incorporation of such texts into discussion. They felt this environment would enable a more collaborative process among students and professor, with the professor, for example, referencing a passage in the course of discussion and being able to count on some of the students with laptops in class quickly locating the reference. They felt the current lack of connectivity limited the extent and ways in which professors make use of e-resources.

New Uses for e-resources
Both users felt that their fellow professors needed to see something new that can be done using e-resources in order to change their thinking about the role such texts could play in course design, assignments and teaching. Similarly, they felt it was important to show students examples of work that can be done using e-texts that excited them, to show students how they can engage in constructive learning.

One example given was showing students how to construct search sets, the results of which could then lead to new thematic analysis or comparison of the texts under consideration. They emphasized how e-resources can help both students and faculty alike learn to ask new questions, and how certain search functions and results can make the text ask the researcher new questions.

Finally, they addressed the issue of plagiarism, and the concern that e-resources can make plagiarism both difficult and easy to detect. They emphasized the need to demonstrate how inventive use of e-resources can allow the instructor to construct tasks that don’t invite student plagiarism, that are in fact complex, process-oriented tasks. Not only would such assignments be resistant to plagiarism attempts, they also engage the students in performing meaningful research.


Interface design issues
Both stressed that for scholarly research it is vital an e-resource does not close off interpretive possibilities by making decisions prematurely, such as deciding to modernize all spelling. Similarly they don’t want the resource to decide what content is valuable, but would instead want everything from an original source to appear, and to appear in its original form in the electronic version. They did note that for searchable text, issues such as having all the variable spellings were not necessary.

They discussed the cost of information and the tendency, especially among students, to go for what is “cheap,” that is, easily accessible and fast. They pointed out the importance of a resource having accurate recall and precision in returning search results because students will typically go with the first few results they get, and assume they have “searched everything.”

Wish list
Frequency distribution
Collocation ability
Z scores
Speedier load time for EEBO
Vocal element – accessibility issue, way to make text available for the blind
Ability to enter a root term and have all variations returned (without having to truncate root)
Be able to easily incorporate multimedia
– show and listen to clips from Shakespeare plays, compare ways a scene is interpreted in different performances
– be able to click on a passage of text and be taken to video clip of that scene
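
The first three wish-list items (frequency distribution, collocation ability, Z scores) can be illustrated with a short Python sketch. Everything here is an assumption made for illustration: the function names, the simple lowercase tokenizer, the window size, and the particular z-score formulation (one common corpus-linguistics variant) were not specified by the focus group, and the sample sentence is invented rather than taken from a TCP text.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Split text into lowercase alphabetic tokens (a deliberately crude tokenizer)."""
    return re.findall(r"[a-z]+", text.lower())


def frequency_distribution(tokens):
    """Count how often each token occurs."""
    return Counter(tokens)


def collocates(tokens, node, window=2):
    """Count words appearing within `window` tokens on either side of `node`."""
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok != node:
            continue
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                counts[tokens[j]] += 1
    return counts


def z_score(tokens, node, collocate, window=2):
    """Compare observed co-occurrence with what chance would predict.

    Assumed formulation: expected = p * (node frequency * 2 * window),
    where p is the collocate's share of all non-node tokens.
    """
    freq = frequency_distribution(tokens)
    n = len(tokens)
    node_freq = freq[node]
    if node_freq == 0 or n == node_freq:
        return 0.0
    p = freq[collocate] / (n - node_freq)
    expected = p * node_freq * 2 * window
    if expected == 0 or p >= 1:
        return 0.0
    observed = collocates(tokens, node, window)[collocate]
    return (observed - expected) / math.sqrt(expected * (1 - p))


# Invented sample text, for illustration only.
sample = tokenize("the holy grail and the holy book and the grail")
print(frequency_distribution(sample).most_common(3))
print(collocates(sample, "holy", window=1))
print(z_score(sample, "holy", "the", window=1))
```

A positive score suggests that the collocate appears near the node word more often than chance would predict; on a corpus this small the number is not meaningful, and the sketch ignores the spelling variation that the participants also raised.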

User Support
The users had suggestions for both in-person instruction and online tutorials. Online materials include:
• Help button function. Could open a pop-up window that either gives an explanation of the function or term, or that contains a menu of mini-lessons. The user could choose which mini-lesson to take.
• Online tutorial available similar to those offered by Macromedia

In-person instruction involved two levels of training:
• One level would be geared more to undergraduates, familiarizing them with the resources central to their major, for instance, the MLAB.
• A more sophisticated level of instruction for graduate students would enable incoming students to get up to speed as quickly as possible, introduce them to the range of e-resources available, and instruct them in how to use them.

In discussing in-person and online instruction, the users raised the issue of the human resources required to create and maintain online tutorials for materials that might change quickly. As was noted, the online and in-person instruction options are not mutually exclusive, and while some types of instruction needs could be supported through online help, others could better be served by face-to-face interaction.

A third element of user support also emerged in the discussion, pertaining to video use. As mentioned earlier, the professors would like a way to make video material accessible in the same 24/7 manner currently available through electronic reserves for printed text. They raised the possibilities of obtaining site licenses for video, or creating a keyed server to provide streaming video, so that the material would be available only to authenticated students enrolled in that class.

