News Archive
July 1, 2004
The TCP has released a new newsletter
for Summer 2004 containing general updates for the EEBO-TCP,
Evans-TCP, and ECCO-TCP projects. This newsletter is released
bi-annually by the project for people interested in updates, how
the projects are used, and more.
If you have any questions or comments about the newsletter or
would like some paper copies, please contact shawnmar@umich.edu.
August 6, 2004
Oberlin Group Schools join EEBO and EEBO-TCP
Over the summer a large number of Oberlin Group schools (a group of liberal
arts colleges throughout the U.S.), including Albion College, Bates
College, Bucknell University, Carleton College, Colby College,
Earlham College, Middlebury College, Mt. Holyoke College, Trinity
College, and Wesleyan University, bought the EEBO product and
joined the EEBO-TCP project. Several more may still join.
We are grateful for their contributions, which will enable us to
create thousands of texts, and we hope to discuss possibilities
for integrating these texts and images into undergraduate curricula.
September 2, 2004
Before the JISC's launch event in October, Pat Leon wrote a
report discussing the value of EEBO and the TCP. It is available
on the JISC website and is also reprinted below.
Early English Books Online: The Holy Grail of Online Resources?
Early English Books Online, now made available free to every
college and university in the UK, is introducing lecturers,
students and researchers to the full possibilities of online
resources. Pat Leon reports on this extraordinary resource as
well as a new project which will enhance it still further.
In the final chapters of the popular thriller The Da Vinci
Code, hero and heroine pin their hopes of finding the Holy Grail
on King's College London software sifting through a vast digital
library of texts for just a few words. Fiction maybe, but such
fine-tuned trawls of vast national archives are fast becoming
fact as technology advances. Early English Books Online (EEBO)
is one example of this.
The site holds digital page images of more than 125,000 books,
pamphlets, treatises, sermons, plays and other works published
between 1473 and 1700. The originals sit in three catalogues
- Pollard and Redgrave, Thomason Tracts and Wing - all previously
available on microfilm. Variations in typography, spelling and
punctuation, however, make word searches difficult. To solve
this, in 1999 the Text Creation Partnership was formed by the
universities of Michigan and Oxford and microfilmers ProQuest.
The target was to key in a fifth of the works as SGML/XML text.
Now JISC, as part of a deal struck earlier this year with ProQuest
to make EEBO free to UK universities and colleges, is calling
on UK academics and librarians to choose their favourite texts.
A US-based advisory panel will review suggestions. Emma Beer
is JISC coordinator of a special UK launch on October 25 at
the British Library, which together with the US Council on Library
and Information Resources is also steering the project. Professor
Lisa Jardine will be introducing the day. Beer says: "This
is a crucial chance for UK academics to have their say about
titles they would like keyed in."
The TCP has attracted US$6 million (£3.3 million) of
which JISC has contributed £750,000 on behalf of all UK
further and higher education. This compares with US and Canadian
universities, which pay anything up to $50,000 (£28,000)
each. ProQuest has matched 20 per cent of contributions. The
target is $9 million (£5 million).
EEBO users in the UK are enthusiastic about opening the site
to a wider, more diverse audience. Justin Champion, Professor
of History of Early Modern Ideas at Royal Holloway, University
of London, has used EEBO for nearly six years. "Once you
get a taste of what research can be like with EEBO you want
more. It transforms how you work. I can work at 2am. I can scribble
on my printouts. I'm not restricted by library opening times.
I've cut my transport costs and time. It's simply more
efficient."
Champion's teaching has also changed. "When planning courses
I used to take months identifying texts, locating libraries
and trying to get the material photocopied. Now I just send
students to the url."
Richard Sugg, an English lecturer, agrees. He discovered EEBO
when he moved to Cardiff in 2001. "Cardiff was one of only
a handful of universities that subscribed to EEBO," he
says.
Expense was the main reason. JISC, however, has bought a license
in perpetuity. The only charge to institutions is a hosting
fee of between £78 and £2,200 a year according to
student numbers.
"EEBO's made a colossal difference to research,"
Sugg says. "Arnold Hunt recently wrote in Times Literary
Supplement that internet resources such as EEBO have turned
research 'from a labour-intensive handicraft into a mechanised
industry'. Before when writing a book or article you had to
make notes of what you needed to check in primary sources in
the British Library or wherever, now once you've got your password
you just go to EEBO."
In teaching, EEBO changes the nature of courses. "Students
can't buy or borrow so many texts. With this they can push the
limits of their initiative, imagination and industry in a way
that not even the most adventurous could have done before unless
they lived near a famous library."
Matthew Steggle, lecturer in English at Sheffield Hallam University
and editor of the e-journal Early Modern Literary Studies, says:
"My university doesn't have EEBO so my experience is by
free trial. It's been a revelation."
Steggle was interested in Renaissance work on the Greek dramatist,
Aristophanes. "I pumped in the word 'Aristophanes' and
he came up all over the place, not at all where I expected.
In sermons, for example, he was cited as an authority on classical
ideas of the after-life." His worry is that smaller university
departments might not have the clout needed to persuade libraries
to buy into this resource. "Pressure needs to come from
academics from different departments banding together,"
he says.
Tim Hitchcock, Professor of 18th-century History at the University
of Hertfordshire, has been using EEBO on and off for a couple
of years for his research and website on the Old Bailey and
more recently for a book on begging. Hitchcock believes that
EEBO technology lends itself to the development of what he calls
"historical forensic linguistics". "Because the
text exists as searchable XML, there is an index of every word
that appears. I'd like to see tools for querying these indexes,
say, through word proximity statistics. This would revolutionise
our understanding of the origins of texts, of the linguistic
links between them, and of the influences and copying. This
cannot be done on paper, but it could allow us to redraw the
intellectual map of early modern Britain."
Such searches are reliant on accurate text input. The TCP has
outsourced keyboarding to three companies in India. Two keyboarders
simultaneously transcribe texts and add tagging to capture the
structure of the work, such as chapters, paragraphs, etc. A
third person checks both transcriptions before sending them
off to Michigan and Oxford for sample reviewing. If more than
one mistake per 20,000 characters is found, texts are sent back.
The current mistake rate is one per 100,000 characters.
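The acceptance rule described above is simple arithmetic, and a minimal Python sketch may make it concrete. The function and constant names are hypothetical and are not part of any TCP tooling. The only figures taken from the text are the threshold of one mistake per 20,000 characters and the reported rate of one per 100,000. The sketch also assumes the rate is measured on the reviewed sample.

```python
from fractions import Fraction

# Threshold from the article: more than one mistake per 20,000 characters
# in the reviewed sample means the text is sent back.
MAX_ERROR_RATE = Fraction(1, 20_000)


def error_rate(mistakes: int, characters_reviewed: int) -> Fraction:
    """Mistakes per character in the reviewed sample (exact arithmetic)."""
    if characters_reviewed <= 0:
        raise ValueError("characters_reviewed must be positive")
    return Fraction(mistakes, characters_reviewed)


def is_accepted(mistakes: int, characters_reviewed: int) -> bool:
    """A text passes unless its sampled rate exceeds the threshold."""
    return error_rate(mistakes, characters_reviewed) <= MAX_ERROR_RATE
```

Under these assumptions, `is_accepted(1, 100_000)` holds for the current reported rate, while three mistakes in a 40,000-character sample would exceed the threshold and the text would be sent back.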
Judith Siefring, who works at Oxford Digital Library on the
project, says that some 6,499 texts are already available online.
Some 250-plus texts are reviewed per month. "We
check and edit the tagging of each text and after that assign
types and generally bring the texts up to standard. We have
to make editorial changes that take account of the variety of
texts, e.g. poetry, prose literature, plays, sermons, almanacs,
dictionaries, mathematical treatises, etc."
Siefring welcomes the extra contact with UK academics that JISC
involvement is bringing. Across the Atlantic, Shawn Martin,
TCP outreach librarian at the University of Michigan, has found that US
academics' requests for texts have varied. "They range from
canonical works used in undergraduate classes (Spenser, Marlowe,
Shakespeare) to obscure titles used only in specific graduate
seminars or research projects (treatises on windmills, sermons
on specific psalms, works on lexicography)."
Martin's brief involves not only EEBO but Evans Early American
Imprints and Eighteenth-Century Collections Online (ECCO). He
is anxious that UK audiences do not see these projects as purely
commercial. "We try to get as much academic collaboration
as possible not only in text selection, but by helping academics
with syllabi, scholarly projects, or other initiatives. We have
more than 100 libraries internationally supporting this."
But for Royal Holloway's Justin Champion there are still “battles
to be fought with academics who like to head off to the research
library, bury their head in books, have a coffee break and at
some point write up their notes. They say IT’s not for
them, but if you don’t use the web nowadays, it’s
like walking around in shackles.” Perhaps they just don't
believe they'll find their Holy Grail.
TCP Board Meeting Minutes
October 21, 2004
Attending:
Board Members: William Gosling (Chair, University of Michigan),
Betty Bengston (University of Washington), Mark Dimunation (Library
of Congress, for Deanna Marcum), Marianne Gaunt (Rutgers University),
Ronald Milne (Oxford University), William Miller (Florida Atlantic
University), Jeff Moyer (Gale, for Richard Foley), Remmel Nunn
(Readex), Mary Sauer-Games (ProQuest Information and Learning),
David Stam (Syracuse University), William Walker (University
of Miami)
TCP Staff and Guests: Ross Coleman (University of Sydney), Shawn
Martin (University of Michigan), Mark Sandler (University of
Michigan), Paul Schaffner (University of Michigan), Perry Willett
(University of Michigan)
Unable to attend:
Nancy Davenport (CLIR), David Ferriero (New York Public Library),
Sarah Michalak (University of Utah), Carole Moore (University
of Toronto)
I. Welcome
William Gosling opened the meeting and welcomed new members, including
representatives of the two publishers newly joining the TCP effort,
Remmel Nunn from Readex and Jeff Moyer from Gale,
as well as Mary Sauer-Games from ProQuest.
II. Evolution of the TCP Board
Since the TCP has evolved from one project in cooperation with
ProQuest to now three projects in cooperation with three different
commercial publishers, it is useful to consider how the Board
might adapt to accommodate the changing situation. The Board
discussed whether it was fitting for it to evolve as a TCP Board
concerned with all potential TCP projects (including those to
which individual members of the Board may not subscribe) and
how its meetings could be structured to address overarching
TCP objectives as well as the specifics of individual projects.
The representatives of the three companies began the discussion
by highlighting the things they would feel uncomfortable revealing
in front of their competitors. These included pricing, their
contributions to the TCP, and general marketing strategy. It
was agreed that the Board should attempt to structure its meetings
so that all members could be present and, if at some point there
was a need to divulge sensitive information, the Board could
hold an executive session in which that could be discussed.
III. Project Recruitment updates
Mark Sandler then discussed the upcoming event celebrating
the JISC launch of EEBO and EEBO-TCP. The JISC is contributing
significantly to the project, which will allow us to create thousands
of new texts. Its involvement will also extend the pool of potential
users to all higher and further education institutions in the
UK.
Shawn Martin then updated the Board on partner recruitment
over the past year, which was quite significant. More than thirty
institutions have joined since the last Board meeting. Among them
were the Oberlin Group of liberal arts colleges, the Council
of Prairie and Pacific University Libraries (COPPUL) in Western Canada,
and UCLA and UC-Berkeley. Prospects are also very good
for the coming year, and TCP continues to grow with the recruitment
of 30 new partners in both the Evans and ECCO initiatives. Board
members also suggested several independent research libraries
and large public library systems as potential recruits for TCP,
as well as some international institutes and organizations.
IV. Budget Review
Discussion of the current TCP budget then began. The Board
considered a new proposal to structure the budget so that TCP
allocates its outreach and administration costs across all three
projects in proportion to production output rather than itemizing costs
individually. It was agreed that this should work for now but
should be reconsidered later, because outreach costs for EEBO
should fall as that project comes to a close. It was agreed
that at some point the outreach and administration costs should
be viewed as “billable hours” and allocated as
such, but in the meantime the budget as presented will work.
The Board then considered several scenarios detailing the future
growth of the project. Mark Sandler presented figures showing
past growth and gave some projections for future growth. It
was agreed that TCP should aim to recruit 28 new partners before
the next Board meeting. That would ensure the creation of
25,000 texts and could give TCP a stronger case as it
seeks second-round funding, grant funding, and other sources
of support over the coming years.
V. Board Discussion Items
Rights of Use
The Board discussed the increasing number of requests the TCP
has been getting to use TCP text in other projects. Shawn Martin
presented some examples of previous decisions and asked for
board advice on how to go about future ones. The Board agreed
that decisions made so far had been easily justifiable and that
it would be possible for TCP to sell its text to commercial
publishers according to a fee schedule to be developed. It was
agreed that TCP staff should draft more specific guidelines
and circulate them among the Board for further consideration.
Price Increase
Mark Sandler then presented a proposal to the Board for
a price increase and some possible ways of implementing it. The Board
agreed to a 20% increase to commence on April 1. It was also
suggested that we might wish to investigate the impact of this
increase on smaller institutions and determine what they might
like to see in terms of structure and price.
Multiple editions
Shawn Martin then presented a request, put forth by many faculty,
that second editions of texts be included in TCP (rather
than only the first) if faculty and scholars offered
to pay for them. The Board decided that this should not be a
problem if scholars or their institutions wished to pay for
the text, and that TCP staff should take any further
administrative costs into consideration.
Evolving TCP Model to fit community goals
This discussion came up in several contexts throughout the meeting.
TCP has become a large project serving many different scholars
and partnering with many new companies. Therefore, it would
be beneficial to add new revenue and sustain the project beyond
the three commitments currently in place. Nevertheless, this
does add complications in administration and current contracts.
The Board discussed several possibilities.
Potential grants were discussed to fund publicly accessible
texts for the project. It was also suggested that TCP set itself
up as a model of potential study for things like digital preservation,
collection development, education at both a college and K-12
level, and scholarly access and communication. It was noted
that further expansion of the TCP, especially through grant funding,
might cause problems with some of our existing contracts with
commercial publishers and that any grant funding we receive
should be used to expand the project rather than fund current
commitments (i.e. go beyond the 25,000 texts committed to in EEBO).
The Board also discussed the possibility of an ongoing partnership
in which members pay a kind of subscription that keeps the TCP
project going “for the common good.” It was thought
that there might be potential for creating a kind of membership
organization, at least among the best customers of the TCP. The
Board agreed that once EEBO-TCP can approach people from a position
of strength, with 25,000 texts due to be completed, it would
be beneficial to pursue this as a possible model for continuing
TCP growth.
VI. Project Development and Production Updates
ProQuest
Mary Sauer-Games updated the Board on the many new developments
with the EEBO product. They have sold to 43 new institutions
this year, have included new material like the Thomason tracts,
and have added 6,500 TCP texts to their database so far. To
date, they have had over 61,000 views of TCP text.
Readex
Remmel Nunn then discussed the developments for the Evans product.
So far, over 106 institutions have subscribed to Evans and they
are further developing the Shaw-Shoemaker database. He also
discussed the recent selection task force meeting held in Worcester,
MA. The greatest challenge will be to select those texts in
the coming months.
Gale
Jeff Moyer then updated the Board on Gale's progress with the ECCO
product which contains over 26 million pages and 155,000 volumes.
To date, they have 60 customers including 6 Canadian and 11
international institutions. They have also done OCR for the
ECCO product and are interested in how the TCP text will work
and integrate with their OCR.
Australia
Ross Coleman from the University of Sydney was a guest of the
Board in the afternoon to discuss opportunities for the three
commercial publishers and TCP in Australia. He detailed the
structure of the libraries in Australia, noted that they have a
strong tradition of consortial purchasing, and suggested that TCP
might maximize its impact by selling jointly with
the commercial publishers. It would therefore be best to sell
to the consortium rather than to libraries one by one.
Production
Perry Willett and Paul Schaffner then presented the Board with
the latest production updates. Perry, who had come from a TCP
partner institution and was now in charge of TCP production,
remarked on the amazing ability of the staff to meet production
targets and goals. Paul discussed what some of those goals were,
noting that 8,000 texts have been produced so far and that roughly
14 to 21 books per day are managed through our process. He also
discussed the difficulty in deciphering the many unusual characters
in early printing.
Outreach
Shawn Martin then updated the Board on the many outreach efforts
this year (exclusive of new partner recruitment). In addition
to conferences and publications, TCP has piloted new projects
at the University of Toronto, National Library of Wales, and
the University of Chicago. Scholarly projects around the world
are increasingly asking for TCP’s help and partnership
in their own projects. TCP now has two teams of School of Information
students studying the effects of TCP on education, and initiatives
set in motion last year like the Academic Advisory Group and
requests for records have been successful. Overall, it has been
a very good year for TCP in terms of both recruitment and other
outreach.
VII. Planning for next meeting
The meeting adjourned at approximately 3:00 p.m. TCP staff
will be in touch with Board members regarding additional materials
and plans for the next TCP Board meeting.
October 25, 2004
The Joint Information Systems Committee (JISC) in the U.K.
recently launched both Early English Books Online and the Text
Creation Partnership at an event at the British Library
entitled "Waking Up in the British Library." The event
was a great success and raised awareness of this resource among
many academics in the U.K. Below is an agenda for the event,
and links to the relevant presentations. For more information
about the JISC and EEBO-TCP, you can also go to the JISC event
site. JISC also put out some excellent press
releases that will give TCP much publicity
throughout the UK.
"...WAKING UP IN THE BRITISH LIBRARY..."
British Library Conference Centre
25 October 2004
10.00 - 16.00
PROGRAMME
10.00 REGISTRATION and COFFEE
10.25 OPENING REMARKS
JOHN TUCK
Head of British Collections, The British Library
10.30 KEYNOTE SPEAKER
PROF LISA JARDINE
Professor of Renaissance Studies at Queen Mary and
Honorary Fellow of King’s College Cambridge, Director
of AHRB Centre for Lives and Letters
11.00 INTRODUCTION TO EEBO AND TCP
THE TCP PROJECT - STRATEGY
MARK SANDLER
University of Michigan
Collection Development Officer
11.30 REFRESHMENTS
11.45 THE TCP PROJECT – EDITORIAL
JONATHAN BLANEY, EMMA LEESON, JUDITH SIEFRING
University of Oxford
TCP Reviewers
12.15 USING EEBO-TCP IN TEACHING AND RESEARCH
DR MATTHEW STEGGLE
Department of English, Sheffield Hallam University
PROF JUSTIN CHAMPION
Department of History, Royal Holloway, University of London
12.45 QUESTION AND ANSWER PANEL WITH SPEAKERS AND
LORRAINE ESTELLE, Collections Team Manager, JISC
13.00 LUNCH
13.45 HANDS-ON WORKSHOP
15.40 OPEN FORUM
16.00 TEA and CLOSE
Mark Sandler, who attended the event on behalf of the TCP, commented:
'Before too much time slips by, I wanted to convey my thanks
to all the participants who took time from their busy schedules
to give consideration to the EEBO-TCP initiative. While it is
easy to think of JISC as "just another library consortium,"
in practice it far exceeds that narrow characterisation. It
is one thing to serve as a purchasing agent for a group of libraries,
but quite another to take responsibility for extending educational
opportunity to all students in the UK. Even more impressive
is taking responsibility for ensuring that resources, once purchased,
are effectively used. JISC is an exemplar for library consortia
and cooperative library initiatives everywhere.
As for the "Waking up in the British Library" program,
it was extremely gratifying to hear from faculty that it was
contributing to their research and teaching. The faculty speakers
touched on some critical themes about EEBO and EEBO-TCP. They
talked about the democratising effects of making the texts and
images widely accessible. At least two of the speakers used
the word "transformative" to describe the research
and pedagogical impact of searchable text presented in conjunction
with page images from original editions. They also mentioned
the broad interdisciplinary access that searchable text provides
- one no longer need be a specialist in the early modern period
to seek out and find relevant resources. And finally, one or
more of the speakers noted that even for specialists, this has
greatly extended their reach to uncover new information - be
it central or contextual to the primary themes of their research.
Needless to say, all of these reactions are quite reinforcing
to the EEBO-TCP production teams at Michigan and Oxford who
always believed they were making an important contribution,
and are now hearing that they've been on the right course.
The broad range of access that the JISC negotiated on behalf
of the UK will undoubtedly breathe new energy and new ideas
into the project. We are eager to capture examples of classroom
use of the project from different types of institutions of higher
education. We're hoping to see many UK submissions coming forward
from the annual student essay contest jointly sponsored by ProQuest
and the TCP. We're poised and ready to assist scholars in their
research and teaching by moving forward the texts they need,
or providing detailed information about search options. And
finally, we're open to partnering with many related initiatives
underway in the UK, understanding that linking resources is
a great service to users in an increasingly complex information
world.
We are committed to keeping the project website fresh, providing
updates on project output (currently at 7,891 texts online),
new texts being added, instructional ideas, meetings, and forms
for suggesting useful titles for keyboarding. We look forward
to lots of ongoing contact with users in the UK, and hope that
the JISC staff can stay involved as well in providing feedback
to the project, and continuing to encourage uptake and experimentation.
Thank you again for all your efforts in organizing the excellent
rollout event at the BL.'
November 10, 2004
TCP Hosts University of Michigan Focus Group on Electronic
Resources
Summary by Julia Gardner
Attendees: Julia Gardner (Student, School of Information),
Shawn Martin (TCP Project Outreach Librarian, University Library),
YuFang Lin (Student, School of Information), Steven Mullaney (Professor,
Department of English), Devon Persing (Student, School of Information),
Eric Rabkin (Professor, Department of English), Rasika Ramesh
(Student, School of Information), Mark Sandler (Collection Development
Officer, University Library), Kathy Schroeder (Student, School
of Information), and James Sweeney (Student, School of Information)
Class use of e-resources
Both faculty members indicated they (and the students) valued
ease of access to electronic materials. They indicated that particularly
now, with widespread broadband access, they feel comfortable assuming
all students can get access to electronic texts at any time. Their
comments indicated that, as anticipated, students prefer resources
that are always available, easily searched, and accessible from
their point of need (i.e. without requiring a trip to the library).
One faculty member does not teach in early modern studies, and
the texts he works with are not currently available online; however
he indicated that if the materials were available electronically
he would construct different assignments from the ones he now
gives, in order to more fully utilize the options available from
electronic text.
Both indicated that they would like to see wireless access in
classrooms to facilitate greater use of electronic resources and
to support the spontaneous incorporation of such texts into discussion.
They felt this environment would enable a more collaborative process
among students and professor, with the professor, for example,
referencing a passage in the course of discussion and being able
to count on some of the students with laptops in class quickly
locating the reference. They felt the current lack of connectivity
limited the extent and ways in which professors make use of e-resources.
New Uses for e-resources
Both users felt that their fellow professors needed to see something
new that can be done using e-resources in order to change their
thinking about the role such texts could play in course design,
assignments and teaching. Similarly, they felt it was important
to show students examples of work that can be done using e-texts
that excited them, to show students how they can engage in constructive
learning.
One example given was showing students how to construct search
sets, the results of which could then lead to new thematic analysis
or comparison of the texts under consideration. They emphasized
how e-resources can help both students and faculty alike learn
to ask new questions, and how certain search functions and results
can make the text ask the researcher new questions.
Finally, they addressed the issue of plagiarism, and the concern
that e-resources can make plagiarism both difficult and easy to
detect. They emphasized the need to demonstrate how inventive
use of e-resources can allow the instructor to construct tasks
that don’t invite student plagiarism, that are in fact complex,
process-oriented tasks. Not only would such assignments be resistant
to plagiarism attempts, they would also engage the students in performing
meaningful research.
Interface design issues
Both stressed that for scholarly research it is vital that an e-resource
not close off interpretive possibilities by making decisions
prematurely, such as deciding to modernize all spelling. Similarly,
they don’t want the resource to decide what content is valuable,
but would instead want everything from an original source to appear,
and to appear in its original form, in the electronic version.
They did note that, for searchable text, having
all the variant spellings was not necessary.
They discussed the cost of information and the tendency, especially
among students, to go for what is “cheap,” that is,
easily accessible and fast. They pointed out the importance of
a resource having accurate recall and precision in returning search
results because students will typically go with the first few
results they get, and assume they have “searched everything.”
Wish list
• Frequency distribution
• Collocation ability
• Z scores
• Speedier load time for EEBO
• Vocal element – accessibility issue, way to make text available
for the blind
• Ability to enter a root term and have all variations returned
(without having to truncate root)
• Be able to easily incorporate multimedia
– show and listen to clips from Shakespeare plays, compare
ways a scene is interpreted in different performances
– be able to click on a passage of text and be taken to
video clip of that scene
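Three of the wish-list items (frequency distribution, collocation, and Z scores) fit together in a single calculation, sketched below in Python. This is an illustration only, not a feature of EEBO or the TCP. The function name, the whitespace-tokenized input, and the choice of a binomial approximation for the Z score are all assumptions. For each word found within a fixed window of a chosen "node" word, the sketch compares the observed number of co-occurrences with the number expected if words were scattered at random.

```python
import math
from collections import Counter


def collocation_z_scores(tokens, node, window=2):
    """Return {collocate: z-score} for words within `window` tokens of `node`."""
    total = len(tokens)
    freq = Counter(tokens)              # frequency distribution of all words
    node_freq = freq[node]
    if node_freq == 0 or total == node_freq:
        return {}

    # Count how often each other word falls inside the window around the node.
    observed = Counter()
    for i, tok in enumerate(tokens):
        if tok != node:
            continue
        lo, hi = max(0, i - window), min(total, i + window + 1)
        for j in range(lo, hi):
            if j != i and tokens[j] != node:
                observed[tokens[j]] += 1

    span = 2 * window                   # window positions per node occurrence
    scores = {}
    for word, k in observed.items():
        # Chance that any one non-node position holds `word`.
        p = freq[word] / (total - node_freq)
        expected = p * node_freq * span
        variance = expected * (1 - p)   # binomial variance, n = node_freq * span
        if variance > 0:
            scores[word] = (k - expected) / math.sqrt(variance)
    return scores
```

A call such as `collocation_z_scores(text.split(), "holy", window=1)` returns a score for each neighbouring word, and a large positive value suggests the word appears near "holy" more often than chance would predict. Windows at the very start or end of the text are shorter than `span` assumes, so scores for very short texts are only rough.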
User Support
The users had suggestions for both in-person instruction and online
tutorials. Online materials include:
• Help button function. Could open a pop-up window that
either gives an explanation of the function or term, or that contains
a menu of mini-lessons. The user could choose which mini-lesson
to take.
• Online tutorial available similar to those offered by
Macromedia
In-person instruction involved two levels of training:
• One level would be geared more to undergraduates, familiarizing
them with the resources central to their major, for instance,
the MLAB.
• A more sophisticated level of instruction for graduate
students that would enable incoming students to get up to speed as quickly
as possible, introduce them to the range
of e-resources available, and instruct them in how to use them.
In discussing in-person and online instruction, the users raised the issue
of the human resources required to create and maintain online tutorials
for materials that might change quickly. As was noted, the online
and in-person instruction options are not mutually exclusive, and
while some instruction needs could be supported through
online help, others could better be served by face-to-face interaction.
A third element of user support also emerged in the discussion,
pertaining to video use. As mentioned earlier, the professors
would like a way to make video material accessible in the same
24/7 manner currently available through electronic reserves for
printed text. They raised the possibilities of obtaining site
licenses for video, or creating a keyed server to provide streaming
video, so that the material would be available only to authenticated
students enrolled in that class.