Dec: Marsh installed the current version of ImageMagick on
clamato for testing with the MDP access system. Snavely coordinated or
participated in a number of detailed discussions with the MDP group on
the rights database, conventions for naming handles, and mechanisms for
storing metadata on MDP processing and for MDP linking between Mirlyn
and the prototype access system.
Jan: Snavely, working with Wilkin, Willett, and Valentine,
developed and documented a strategy for rights information management
for MDP, and began discussions with DLPS staff on integration with the
access system. Feeman worked with DLPS staff to begin developing
scripts for creating METS files and handles for all MDP items, which
are dependencies of the access system being developed by DLPS staff.
Feb: Snavely completed discussions with DLPS staff on the design
and use of the rights database, and built the database. Feeman completed
initial work on scripts to generate METS files and handles, and created
them for all MDP volumes; continued work on download and validation
scripting (which will be made available to Google partners via networked
CVS set up by Snavely); developed overview documentation of the system;
and began work on scripting to populate the rights database.
Mar: Snavely, in discussions led by Willett, floated the idea
of a workflow involving Google for brittle books, and began to review
storage RFIs. Feeman operationalized download, validation, and ingest
processes, and released the first version of the system, dubbed GROOVE,
via CVS to interested partners.
Apr: Feeman prepared a predictive estimate of GROOVE capacity
for Snavely, and Snavely introduced the workflow idea for brittle books
to Google.
May: Wilkin and Snavely discussed additional network throughput
requirements with ITCS to meet projected volume increases.
June: Marsh and Snavely observed stability problems with the
Apache module chosen to handle usage throttling, and began
troubleshooting. Feeman began examining the source code for the module
to make a performance improvement unrelated to the stability problem.
Snavely added attributes to the rights database to accommodate a
different copyright stance adopted by general counsel. Snavely
discussed security issues with Siraj at Google in the first of a series
of conversations aimed at developing and meeting Google security
requirements for the access system. Snavely conducted a meeting with
local digitization staff to decide how Mirlyn item statistics and
status codes should be used to track local digitization that will go
through Google for ingest, and Feeman conducted a meeting with the same
staff to discuss the automated validation these materials will undergo.
July: Snavely resolved the stability problems with Apache
with several minor code changes and a complete server rebuild, and
Feeman integrated her changes with those. Snavely phased the new server
build into production, clearing the last technical obstacle for
release. Willett researched IP-to-country mapping databases, and
Snavely chose one, installed the required software and weekly update
routines, and provided an explanation of the API to Farber for
integration into the access system. Snavely continued discussions with
Siraj on security of the access system, and was given the OK to release
the system, and a promise of more concrete security guidelines as they
develop. Feeman, Snavely, and Syrigos discussed ideas for a
high-performance QA design interface. Snavely met with Hockett from
ITCom to measure network download performance and sent test results to
Google.
Aug: The access system, dubbed MBooks, was made available to
the public.
Institutional Repository.
Apr: Marsh began to upgrade sambuca to Fedora Core 4, but had to abort
the process due to suspected media errors and subsequent problems.
May: Marsh added repurposed storage to sambuca and Snavely
configured it for use with DSpace.
Aug: Marsh upgraded sambuca to Fedora Core 4, and Snavely,
with assistance from Blanco and Marsh, installed a version of Postgres
compatible with the existing database.
Shibboleth.
Dec: Snavely began discussions with ITCS to create a
lightweight project plan for a Shibboleth pilot project with ProQuest.
Jan: Snavely discussed the foundations of a project plan with
Doster to help ITCS with planning.
May: Snavely met with Doster and McGowan at ITCS to discuss a
draft project plan, and initiated contact with ProQuest.
June: Snavely and Doster met with ProQuest staff to discuss
feasibility and timing for a pilot, and Snavely began discussions with
Internet2 staff to try to identify additional subscribers or resource
providers to help increase the critical mass for participation.
July: ProQuest offered to join Snavely and Doster for a
monthly status report, and the first joint call was scheduled for August.
Aug: All parties shared updates from their organizations and
reiterated their interest in the project, stressing the importance of
additional participants in order to gain critical mass.
Infrastructure
Improvements to storage infrastructure, web server redundancy.
Dec: Marsh continued work on setting up the network storage
server on north campus, but ran into hardware problems with the storage
array and began troubleshooting.
Jan: Marsh solved the problem with the storage array on the
north campus storage server, completed the setup of networked storage,
and began installation and configuration of new web servers.
Feb: Marsh began compiling and installing a new web server
configuration on the new web servers which will include support for
mod_perl, PHP, SSL, and a tunable usage governor to slow or prevent
automated downloading.
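For illustration only, the idea behind a tunable usage governor can be sketched as a sliding-window request limiter. This is not the Apache module Marsh installed; the class name, the default limits, and the injected clock are all assumptions made for the example.

```python
import time
from collections import defaultdict, deque


class UsageGovernor:
    """Sliding-window request limiter keyed by client address (hypothetical)."""

    def __init__(self, max_requests=30, window_seconds=60.0, clock=time.monotonic):
        # Both tunables are illustrative defaults, not values from the report.
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.clock = clock
        self._hits = defaultdict(deque)

    def allow(self, client):
        """Return True and record the request if the client is under its limit."""
        now = self.clock()
        hits = self._hits[client]
        # Drop timestamps that have aged out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False
        hits.append(now)
        return True
```

A server would call `allow()` once per request and delay or refuse the response when it returns False; raising `max_requests` or shortening `window_seconds` loosens the limit.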
Mar: Marsh continued work on the new north campus web
servers, and began troubleshooting storage networking problems observed
after Snavely attempted to put in place NFS tuning parameters currently
in use on central campus.
Apr: Marsh continued to troubleshoot storage networking
problems on the north campus storage server.
May: Marsh completed troubleshooting on the north campus
storage server, copied data to the new server, completed the setup of
dlps11 (the first of two new identical web servers) and began working
with DLPS to test. Snavely installed and configured TSM and started
backups running.
June: Marsh continued testing with DLPS, resolving several
minor problems along the way.
July: Snavely resolved stability problems with the Apache
server used on the new web servers (see above), and Marsh resolved
stability problems with the storage server that were occurring during
backups or any jobs consuming large amounts of memory.
Aug: dlps11 and dlps12, the two new web servers using the
networked storage configuration, were put into production for
non-public text alongside the existing server, dlps8.
COSIGN.
Jan: Snavely coordinated a discussion with DLPS staff on a
single service hostname for DLPS services, and DLPS decided on a name.
The new name will be used in the upcoming rollout of new web servers on
north campus.
Feb: Snavely met with Dueber and Lu to discuss the migration
process to COSIGN, and the impacts on Web Services applications; Snavely
also began planning a new cutover strategy involving a long transition
window in which Lib Auth credentials are obtained via COSIGN
authentication.
Mar: Snavely began work on modifying Library Authentication
to use Kerberos 5 authentication at the request of Guthrie at ITCS (who
is managing the phase-out of Kerberos 4 support in coming months). The
test Apache instance set up for this urgent development will be used to
develop the COSIGN support mentioned above.
Apr: Snavely completed and released the new version of
Library Authentication with Kerberos 5 support; and developed and began
testing a script to generate Library Authentication credentials using
COSIGN authentication which will be used in the transition to COSIGN
for all library services.
May: Snavely completed development of the script to allow
users to log in via COSIGN, adapted the logout script to match, and put
these features into production with a notice of the upcoming change.
The system is staged for cutover. Snavely initiated a number of
discussions with user service points inside and outside the library and
began providing notification to affected patrons.
June: Snavely sent email notification in coordination with
Grad Circ, SPO, and several content providers to over a thousand users
to prepare for the cutover.
July: Snavely cut over authentication services to use
COSIGN as planned without service interruption. All library-hosted
services are now in a transitional state; COSIGN authentication is used
to generate library auth credentials. The next step will be to have
library auth recognize COSIGN credentials directly, which will involve
server-level changes and some adjustments to application logic.
Redundant MySQL.
Dec: Snavely set up TSM backups on soymilk, the MySQL
server on north campus, and synchronized the installed packages on both
cowmilk and soymilk to ensure their configurations are identical.
Jan: Snavely set up replication on the new MySQL servers,
developed monitoring scripts to detect replication problems, and
created and timed a cutover plan. Sheppard, Snavely, and Weise tested
the new servers and coordinated a cutover involving only 35 minutes of
downtime.
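As a rough sketch of what a replication monitoring check might look like (the actual scripts are not reproduced in this report), the following parses the vertical-format output of MySQL's SHOW SLAVE STATUS and reports anything suspicious. The function names and the 300-second lag threshold are assumptions for illustration.

```python
def parse_slave_status(text):
    """Turn 'Field: value' lines from vertical SHOW SLAVE STATUS output into a dict."""
    status = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:  # the row-separator line has no colon and is skipped
            status[key.strip()] = value.strip()
    return status


def replication_problems(status, max_lag_seconds=300):
    """Return a list of human-readable problems; an empty list means healthy."""
    problems = []
    # Both replication threads must be running on the replica.
    for thread in ("Slave_IO_Running", "Slave_SQL_Running"):
        if status.get(thread) != "Yes":
            problems.append(f"{thread} is {status.get(thread, 'missing')}")
    # Seconds_Behind_Master is NULL when replication is stopped or broken.
    lag = status.get("Seconds_Behind_Master", "NULL")
    if not lag.isdigit():
        problems.append("replication lag is unknown")
    elif int(lag) > max_lag_seconds:  # threshold is a hypothetical default
        problems.append(f"replica is {lag}s behind the master")
    if status.get("Last_Error"):
        problems.append("last error: " + status["Last_Error"])
    return problems
```

A scheduled job could feed the output of the mysql client into `parse_slave_status()` and send a notification whenever `replication_problems()` returns a non-empty list.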
July: Snavely and Weise worked together to troubleshoot the
cause of occasional MySQL replication failures. The cause was theorized
to be the type of query used in Image Class, which Weise adjusted.
Snavely also made a few suggestions to optimize the process for
preparing image metadata databases, and Weise found ways to integrate
them for substantial performance improvements.
Production server replacements/upgrades.
Dec: Marsh began meeting with storage vendors to seek
solutions and pricing for a major storage upgrade for the servers that
host Mirlyn and peripheral services.
Jan: Marsh continued to meet with vendors to discuss
additional details and gather pricing information.
Feb: Marsh, using scripting developed by Feeman, began to
quantify I/O performance characteristics on several of the servers that
will use the new storage to confirm requirements will be met.
Mar: Marsh continued to meet with vendors to discuss
configuration details, performance requirements, and pricing; and with
Lewis from CAEN for recommendations and help with evaluating options.
Apr: Marsh discussed final recommendations for a storage
system with Snavely. Snavely consulted with university purchasing and
was advised to conduct a formal RFQ, and given several sample documents
to work from; Marsh and Snavely will work on the RFQ together as
quickly as possible.
May: Marsh ordered and installed an additional storage array
for the MDP storage server, and Feeman configured GROOVE to begin using
it.
May: Snavely drafted the RFQ with assistance from purchasing,
and submitted it. Vendor quotes are due June 30.
July: RFQ responses were received from purchasing.
Responses from additional vendors beyond our original research were
received, so after some back-and-forth to make sure all materials were
received, Marsh began preparing an evaluation of the responses.
Aug: Marsh and Snavely prepared the evaluation of the RFQ
responses, selecting the Pillar Axiom system, and forwarded it to
purchasing; Marsh began finalizing details of the purchase agreement,
including a finalized quotation and testing criteria.
Replace DLXS statistics.
Jan: Feeman began to review the COUNTER and SUSHI
guidelines for statistics systems for electronic research resources.
Mar: Feeman and Snavely met with Dennis, Powell, and
Willett to discuss objectives and plans for the new stats system.
Apr: Feeman began researching tools and options, and
prepared a working design document to track issues and guide
implementation.
May: Feeman finished researching tools and options, concluding
that no options flexible enough for our needs exist. Feeman and Snavely
continued to discuss implementation options. Feeman revised the design
document and developed a draft XML configuration schema for the stats
system.
June: Feeman continued design and development of the stats
system, calling a meeting with interested parties to get comments on
initial design ideas.
July: Feeman continued development of the system, providing
all existing features, and replacing title-level stats with
COUNTER-style reports deliverable in Excel format. Snavely assisted in
the design of the top-level search form to better integrate all features
on one screen.
Aug: Feeman continued development of the system.
Develop an authentication and authorization management
system.
Centralize LIT backup.
Formalize system administration infrastructure.
Dec: Marsh began to develop generic documentation for OS
installation and configuration that will be applicable to both Fedora
and Debian OS distributions.
Revisit logging practices.
DHCP/DNS services.
Development server replacements/upgrades.
Feb: Marsh upgraded clamato to Fedora Core 4.
May: Marsh upgraded sangria from 1 GB to 2.5 GB of memory,
resulting in additional load handling capacity for OAI harvesting and
indexing. The upgrade revealed a problem with SCSI connectivity, which
was resolved by Marsh and Snavely over the next week.
Aug: Marsh upgraded martini, the OCR file server, to Fedora
Core 4 and resolved resulting problems with AFS client software.
Security assessment.
Work flow, processes, and organizational work
Trusted Digital Repository (TDR) compliance and quality
management.
May: Feeman developed a design document for a system to
monitor checksums in the new repository and began developing code.
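The general approach to checksum monitoring can be sketched as follows: record a digest for every file, then periodically recompute and compare. This is a minimal illustration rather than Feeman's design; the function names, the choice of SHA-256, and the in-memory manifest are all assumptions.

```python
import hashlib
import os


def file_checksum(path, algorithm="sha256", chunk_size=1 << 20):
    """Hash a file in chunks so large page images need not fit in memory."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root, algorithm="sha256"):
    """Map each file path (relative to root) to its current checksum."""
    manifest = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            manifest[os.path.relpath(full, root)] = file_checksum(full, algorithm)
    return manifest


def audit(root, manifest, algorithm="sha256"):
    """Compare the files under root against a previously stored manifest."""
    current = build_manifest(root, algorithm)
    return {
        "missing": sorted(set(manifest) - set(current)),
        "unexpected": sorted(set(current) - set(manifest)),
        "changed": sorted(
            path for path in manifest
            if path in current and current[path] != manifest[path]
        ),
    }
```

In practice the manifest would be stored outside the repository it describes, so that damage to the repository cannot also damage the record used to detect it.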
Documentation and standardization of processes.
July: Formalization and standardization of processes will
be a likely result of the ITSS unit security planning (see above).
Improve coordination of data loading.
LIT work requests and scheduling.
LIT on-call service.
Cost model.
Technology trends awareness.
July: Registrations for the September Storage Decisions
conference, recommended by Lewis at CAEN, were approved for Marsh and
Snavely.
Miscellaneous and/or unplanned activity
Dec: Jessica Feeman was hired as a programmer in Core Services,
and will start work in the new year.
Jan: Marsh upgraded the Oxygen XML editor to the newest version
on all servers.
Feb: Snavely, using an inventory of volumes supplied by
Geitgey, began an analysis of all LLMC page images to determine the
extent of problems in rendering derivatives, finding 122650 problem
images on the first pass of analysis.
Feb: Snavely pursued getting console access to servers at the
Duderstadt Center server environment, and Marsh worked with CAEN staff
on the particulars of usage. CAEN staff will no longer provide
non-business-hours support for hardware, so this capability was
critical.
Mar: Marsh installed a rack-mount monitor in the Duderstadt
Center server environment.
Mar: Snavely continued analysis on LLMC page image problems
using JHOVE to determine the validity of the files impacted.
Mar: Marsh reviewed and renewed all Sun support contracts for
the next fiscal year.
Apr: Snavely installed network tuning utilities and began the
install of a high-performance ssh version on quik at the request of
contacts at the Survivors of the Shoah Visual History Foundation.
Apr: Snavely completed the analysis of LLMC page image
problems, concluding that all problems would be resolved by the imminent
move to Linux hosting.
June: Snavely set up the mechanisms to test load-balancing and
failover for Searchtools and the library proxy.
June: Snavely and Munce attended the ITSS unit security liaison
kick-off meeting and will involve Marsh and Syrigos in the program.
July: Marsh and Snavely, along with Munce and Syrigos, attended
the second ITSS unit security liaison workshop. The end goal of the
project will be to create an IT security plan for the library that will
involve some level of security assessment of services, and result in
the creation of new procedures and policies.
July: Marsh migrated all servers using AFS from CAEN to ITCS,
which will simplify account administration. Snavely reworked and
simplified the dot file customization scheme to streamline new user
setup.
Aug: Marsh, Munce, Snavely and Syrigos attended the third ITSS
unit security liaison workshop and continued to inventory servers and
potential risks in the library environment.
Aug: Snavely along with Web Services staff attended a
conference call with Ex Libris to discuss server load testing and
configuration recommendations. Marsh assisted Web Services staff with
load testing, and we await configuration recommendations from Ex Libris.
System Performance:
Arbor Lakes server environment
ethel.umdl (ARC): No down time
ezra.umdl (Verde): No down time
gracie.umdl (Aleph): No down time
hiram.umdl (Zebra): No down time
tequila.umdl (Metalib and SFX): No down time
Central Campus server environment:
belle.umdl (Library Web): No down time
coffee.umdl (Oracle): No down time
cowmilk.umdl (MySQL): No down time
gin.umdl (WebCheckout): No down time
merlot.umdl (DLPS public collections): No down time
sambuca.umdl (Deep Blue): No down time
quik.umdl (Survivors of the Shoah cache): No down time
ting.umdl (library proxy): No down time
North Campus server environment:
dlps6.umdl (Oracle): No down time
dlps7.umdl (Numeric and Geospatial Data): No down time
dlps8.umdl (DLPS non-public collections): No down time
dlps9.umdl (DLPS image collections): No down time
dlps10.umdl (DLPS public collections): No down time
dlps11.umdl (DLPS non-public collections): No down time
dlps12.umdl (DLPS non-public collections): No down time
soymilk.umdl (MySQL): No down time
Data Loading and Archival Statistics
CD/DVDs loaded or reloaded (bitonal page images): 38
CD/DVDs loaded or reloaded and
processed (contone images)