Core Services Cumulative Monthly Report, November 2007

Previous monthly reports

  • Content/Web infrastructure
    • Primary Objectives
      • Finish migration of DLPS web services to redundant networked storage.
        • Dec: Marsh installed new storage on the Hatcher storage server, and Snavely began batch synchronizing ETS and Image Services content to it.
        • Jan: Snavely completed batch synchronization of ETS and Image Services content to the Hatcher storage server and put synchronization routines in place to keep it up to date. Marsh upgraded the storage array from dlps8 (which was retired) with new disks and current firmware, installed it on the Duderstadt Center storage server, and Snavely began batch synchronizing HTI content to it.
        • Feb: Snavely completed batch synchronization of HTI content to the Duderstadt storage server, put synchronization routines in place to keep it up to date, and began discussions with DLPS about workflow changes required to finish the consolidation.
        • Mar: Snavely worked with DLPS to transition the new consolidated server name, quod.lib.umich.edu, into production for HTI resources while maintaining the HTI website; Snavely continued discussions with DLPS about the final steps of consolidation.
        • Apr: Snavely worked with DLPS to plan for consolidating storage on the two storage servers, including determining the authoritative version of content where it was present in both environments, and discussing solutions to workflow impacts.
        • May: Snavely consolidated the data on the two storage servers with no service outage, coordinated changes to loading routines and other processes that update data in production to update both locations, and reconfigured all DLPS web servers to each serve all DLPS content.
        • July: Snavely began configuring and testing two new servers, hefeweizen and porter, for DLPS web service.
        • Aug: Snavely consolidated the historical web server names ets.umdl.umich.edu and images.umdl.umich.edu to the new server name quod.lib.umich.edu and continued setting up the two new DLPS web servers.
        • Sep: Snavely finished setting up the two new DLPS web servers, and Hawkins and Morse conducted testing.
        • Oct: Snavely resolved a stability problem with the new web servers and put them into service.
      • Replace web load-balancing system.
      • Increase scalability of ingest/processing/validation processes with multiple front-end servers and/or SSI clustering.
        • Dec: Marsh and Snavely installed a new blade server chassis that will be used to house multiple new servers, some of which will be dedicated to content ingest and processing.
        • Jan: Feeman and Marsh configured the new blade server chassis and began to install operating systems on the blade servers.
        • Feb: Feeman and Marsh completed the blade server chassis configuration and Marsh began an OS install on the first blade. Feeman prepared detailed throughput metrics for different stages of GROOVE that will be used to tune the aggregate throughput once blade servers are available.
        • Mar: Marsh continued to work on OS installation on the first blade server. Feeman revised throughput metrics for GROOVE.
        • May: Marsh completed OS installations on the first two blade servers, and Feeman made several adaptations to GROOVE for parallel operation and moved all ingest processing to the new blade servers for a substantial increase in throughput.
        • June: Feeman made several improvements in GROOVE for more efficient operation. These additional efficiencies combined with the speed of the new blade servers enabled GROOVE to ingest more data in June than we had ever done in a single month.
        • Aug: Feeman added functionality in GROOVE to better handle input from rights determination, to detect C-size volumes as they are downloaded (using a feed of data from Mirlyn), and to extract samples from C-size volumes for quality checking.
        • Aug: Feeman created custom reports for Karle-Zenith and Willett by extracting information from GRIN.
        • Sep: Feeman, working with Powell, began to adapt GROOVE to handle ingest of legacy DLPS materials.
        • Oct: Feeman began to modify GROOVE to support the use of a namespace identifier in METS file creation and elsewhere internally.
      • Apply automation and processes created for MDP to local Digital Conversion workflows.
      • Migrate to new web statistics system.
        • Dec: Feeman began running the new stats system side-by-side with the old one and ported all non-title-level legacy statistics from the old system to the new. Because title-level statistics data were stored in the old system in a non-portable way, and because the new system provides this information in the form of COUNTER-style reports, title-level data were not ported; instead, the web pages for title-level stats were harvested and will be made available alongside the new system.
        • Jan: Feeman continued to work out small problems in display and tabulation in preparation for making the system viewable as a release candidate.
        • Feb: Feeman added a totalling feature and made several adjustments to COUNTER statistics for closer adherence to the guidelines, and Feeman and Snavely helped Bonn prepare an announcement about the new system for hosting partners.
        • Mar: The new stats system was well-received by hosting partners and a tentative release date was scheduled for mid-April. The review brought a problem to light, and Feeman reloaded and reprocessed a large amount of data to correct it. Feeman and Snavely, prompted by Bonn, began discussions with DLPS on mechanisms to re-introduce session-based statistics, which were abandoned some time ago when DLXS was re-engineered to store session information in web cookies.
        • Apr: The new stats system was released, and the interface to the old stats system was shut down. Tabulation will continue in the old system for at least another month.
        • May: After finding and fixing a problem in the statistics system after release, Feeman re-ported tabulation data for December through May from the old system.
        • July: Feeman began work on enhancements to the stats system for displaying usage of public resources, such as MBooks, to the public, and for including additional totals in custom reports.
        • Aug: Feeman further enhanced new totals features in response to feedback from DLPS staff and prepared documentation that explains the types of reports available and the breakdown of information they contain based on the privilege of the user accessing the system.
        • Sep: Feeman increased the performance of the system through query optimization.
        • Oct: Feeman added the capability for COUNTER Database Report 1 in response to requests for this functionality for BAS. The change is being reviewed and will be put into production once approved.
        • Nov: Feeman rewrote the process for managing log file rotation, and stopped the processing done by the old stats system.
      • Finish transition to CoSign authentication.
    • Secondary Objectives
      • Develop web-based authorization management system.
        • Dec: Feeman and Snavely developed an ordered list of bug fixes to address and new features to be developed for the authentication and authorization management system.
        • Jan: Feeman investigated several Perl modules that may be useful for IP address range manipulation.
        • Feb: Feeman and Snavely worked together to create a development environment for this system and Feeman began investigating bugs and working on new features.
        • Mar: Feeman continued to make progress on the management system, and she and Snavely worked together on approaches for validating IP addresses input by the user and for handling other particulars of IP address editing.
        • Apr: Feeman finished bug fixes and enhancements to the management system, and will review the fixes and new features with Snavely at a future meeting.
        • May: Feeman and Snavely walked through the enhancements to the management system, identifying several small areas for tweaks, and will demonstrate it for Patterson, who will take over primary maintenance responsibilities.
        • July: Feeman demonstrated the management system to Patterson and Snavely, and worked through several examples. Patterson was added to the email list for IP address updates to begin to see the type of requests that are routinely processed.
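The IP address validation discussed in the March entry above can be illustrated with a minimal sketch. This is not the management system's code, which the report says was investigated using Perl modules; Python's standard `ipaddress` module is used here as a stand-in. The function names `parse_ip_range` and `overlaps_existing` are hypothetical, and the addresses in the usage note are reserved documentation ranges rather than real partner networks.

```python
import ipaddress


def parse_ip_range(text):
    """Return an ip_network for a user-supplied address or CIDR range.

    Returns None when the input is not a valid address or range.
    (Hypothetical helper, for illustration only.)
    """
    try:
        # strict=True rejects ranges with host bits set, e.g. 192.0.2.1/24
        return ipaddress.ip_network(text.strip(), strict=True)
    except ValueError:
        return None


def overlaps_existing(candidate, existing):
    """Return True if candidate overlaps any already-registered range."""
    return any(
        candidate.version == net.version and candidate.overlaps(net)
        for net in existing
    )
```

For example, `parse_ip_range("192.0.2.1/24")` returns `None` because host bits are set, while `parse_ip_range("192.0.2.0/24")` returns a network object that can be checked against existing entries with `overlaps_existing`.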
      • Migrate Library Web Services to networked storage.
      • Continue work with ITCS on Shibboleth origin pilot.
        • Dec: Snavely contacted the people who expressed interest in a Shibboleth pilot with ProQuest to determine their level of interest and readiness, and received one or two enthusiastic responses; ProQuest remains somewhat on the fence about implementation. Snavely sought information on other service providers currently supporting Shibboleth from Internet2 and consulted with Dennis and Folger on preferences.
        • Jan: Doster and Snavely discussed strategy, and decided to keep contact with ProQuest but pursue work with other service providers as identified by Folger. Doster and new manager Montague reiterated support for the library's work with Shibboleth.
        • Feb: Snavely contacted JSTOR and Elsevier to determine their interest level in a pilot; JSTOR has just announced Shibboleth support and was responsive to initial questions.
        • Mar: Doster and Snavely discussed ITCS plans for building out production Shibboleth infrastructure, and Snavely shared these plans with Elsevier and JSTOR to begin to estimate when a pilot may take place.
        • Apr: Snavely shared plans for Shibboleth development with ProQuest on a quarterly conference call.
        • Aug: Doster reported to Snavely plans for developing a testing infrastructure at ITCS, and established a proof-of-concept query from a current version of Shibboleth to the access control database.
      • Migrate all web service to Apache 2.
  • Storage
    • Primary Objectives
      • Migrate to new Pillar Axiom SAN in ALDC.
        • Dec: Marsh discovered and resolved a Solaris problem relating to the mapping of SAN LUNs to OS devices.
        • Jan: Marsh, Prettyman, and Snavely developed a plan to migrate clyde from legacy storage to the SAN, and did the move a few days later, collecting transfer rate statistics that will be used to predict the downtime window for future production server migrations.
        • Feb: Marsh installed dual redundant host bus adapters in gracie, connected it to the SAN, and began configuring and testing them. Marsh ordered a host bus adapter for ethel.
        • Mar: After repeated problems with connectivity between gracie and the SAN, Marsh involved Pillar and Sun technical support, and began using ethel to troubleshoot the problems and try out potential solutions.
        • Apr: Marsh migrated ethel from legacy storage to the SAN, and scheduled two outages for gracie to troubleshoot persistent SAN connectivity problems. The troubleshooting eliminated some potential causes of the problems from consideration, but did not yet resolve them.
        • May: Marsh continued troubleshooting SAN connectivity problems on gracie, involving LSI, Pillar, and Sun support.
        • June: Working with Lewis and Sun support, Snavely was able to complete the installation of host bus adapters in gracie successfully, and began troubleshooting SAN connectivity on one of the adapters.
        • July: Working with Lewis and Pillar support, Snavely resolved the issues with SAN connectivity and began configuring and troubleshooting the multipathing configuration.
        • Aug: Snavely continued to work with Pillar support to troubleshoot problems with multipathing.
        • Sep: Snavely arrived at a working multipathing configuration, set it up on all servers connected to the SAN, and successfully moved one filesystem on gracie to the SAN.
        • Oct: Snavely moved the first of three remaining filesystems on gracie to the SAN, but observed substantial performance problems, and after a week of troubleshooting, moved the filesystem back to its old array and began working with the Advanced Solutions Group at Pillar.
        • Nov: After Pillar confirmed that the multipathing software was the cause of the performance problem observed in October, Snavely implemented and tested a new driver and multipathing configuration recommended by Pillar support.
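The January entry above mentions using measured transfer rates to predict downtime windows for later migrations. A rough sketch of that arithmetic follows. The function name, the fixed overhead allowance, and every number in the usage note are assumptions for illustration; the report does not give the actual rates or data sizes.

```python
def estimate_downtime_hours(data_gb, rate_mb_per_s, overhead_hours=0.5):
    """Estimate an outage window from data size and a measured transfer rate.

    data_gb        -- amount of data to move, in gigabytes (1 GB = 1024 MB here)
    rate_mb_per_s  -- transfer rate observed on an earlier migration, in MB/s
    overhead_hours -- assumed fixed allowance for shutdown, checks, and restart
    """
    if rate_mb_per_s <= 0:
        raise ValueError("transfer rate must be positive")
    transfer_seconds = (data_gb * 1024) / rate_mb_per_s
    return transfer_seconds / 3600 + overhead_hours
```

With hypothetical inputs of 450 GB at 128 MB/s, the copy itself takes one hour, so `estimate_downtime_hours(450, 128)` suggests a window of 1.5 hours including the default allowance.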
      • Finish MDP RFP process and design and install sustainable, large-scale storage solution for MDP project.
        • Jan: Snavely developed an evaluation spreadsheet based on criteria laid out in the RFP, received the RFP responses, and began reviewing them. Snavely contacted Bridges and Lewis to seek their help in evaluating options once the responses have been summarized.
        • Feb: Snavely continued preparing summaries of vendor responses and scheduled a meeting for mid-March with Lewis and Marsh.
        • Mar: Snavely, consulting with Lewis and Marsh, selected the top proposals from all responses and pursued presentations from those bidders to be scheduled in April.
        • Apr: Snavely coordinated very productive presentations and question-and-answer sessions with the remaining bidders, and began drafting a best-and-final RFQ to be issued in May which will collect pricing information on final system configurations.
        • May: Snavely issued a best-and-final RFQ, answered several questions from bidders, received the responses, and began making contact with bidder-supplied references.
        • June: Snavely, consulting with Lewis, reviewed the RFQ responses, made a final decision, and began making detailed preparations for installation while the gears turn in purchasing.
        • Aug: Snavely moved two racks to new data center space which will be used to mount the new storage and several accompanying servers.
        • Sep: Feeman and Snavely worked with the local Isilon tech to install and initially configure the new 100 TB NAS cluster.
        • Oct: Funding was approved for a second 100 TB NAS cluster, and Feeman and Snavely again worked with the local Isilon tech to install and initially configure it.
  • System administration
    • Primary Objectives
      • Finish transition of production services to TSM backup service and upgrade existing backup system for development servers.
        • Dec: Snavely upgraded the current development backup server to new hardware to increase performance and reliability.
        • Oct: Snavely replaced a faulty tape drive on the development backup server and reconfigured its internal disks for better performance.
      • Improve remote access capabilities to server resources using serial console servers, network KVM (for blades), and VPN device(s).
        • Jan: Feeman and Marsh configured the network KVM component of the blade server chassis while setting up the chassis itself, as reported above.
      • Replace production and development servers as scheduled.
        • Jan: Marsh upgraded the OS on sangria to Fedora Core 4.
        • Oct: Feeman, Korner, and Patterson installed the OS on the replacement Metalib server; further installation work is pending electrical analysis at Arbor Lakes.
      • Standardize and document system administration processes.
      • Transition to Gigabit Ethernet service in room 10.
        • Feb: DSS coordinated the installation of new switches in room 10.
    • Secondary Objectives
      • Begin to develop centralized system administration mechanisms.
      • Transition to ITCS-supported DHCP.
  • Security
    • Primary Objectives
      • Complete security planning and perform TBD high-priority tasks.
        • Dec: Marsh, Munce, Snavely, and Syrigos completed a first draft of a security plan.
        • Jan: Marsh, Munce, Snavely, and Syrigos began discussing high-priority tasks drawn from the security planning process, and submitted to ITSS the first draft of the security plan and a first draft of an incident response procedure.
        • Feb: Marsh, Munce, Snavely, and Syrigos continued to develop ideas in three areas: a) staff guidelines for security practices targeted toward distinct user types, b) establishing periodic internal audits in areas such as user accounts and firewall maintenance, and c) developing and testing a local process for security incidents.
    • Secondary Objectives
      • Investigate virtual firewall service and request/implement as appropriate.
  • Initiatives
    • Support integration/rearchitecture/massive scaling of locally-developed content systems (Deep Blue, DLXS, MBooks).
    • Delve deeper into SSI clustering and server virtualization; build proof-of-concept for virtual servers on an SSI cluster with an eye toward centralized server resources for sharing among all LIT units.
    • Explore centralized storage for sharing among all LIT units.
  • Miscellaneous, routine, or unplanned activity
    • Dec: Snavely met with CAS and ITCS staff to discuss the dissolution of ITCS WATS and plans for transitioning application support for a WATS-developed e-commerce system for ILLiad to Core Services; Snavely discussed this work with Feeman, who began examining the application and developing local documentation.
    • Jan: Feeman and Snavely met again with CAS staff to discuss the progress on examining and documenting the e-commerce system as well as documenting the ITCS support contacts we will need in order to take on support. Snavely suggested a timeframe for taking on support, and Feeman continued to work on documentation.
    • Feb: Feeman completed documentation of the ILLiad e-commerce system and Core Services took on support for it.
    • Mar: Feeman, Snavely, and Syrigos met to discuss several enhancements to the ILLiad e-commerce system, some proposed by CAS staff, and some suggested by Core Services.
    • Mar: Marsh installed a new temporary storage server for MDP to work alongside the existing one, adding 6 TB of capacity.
    • Apr: Feeman, working with Rennie, added support for JPEG 2000 validation and ingest to GROOVE, and Snavely began coordinating a date to cut over to JPEG 2000 delivery from Google.
    • Apr: Marsh installed and configured buttermilk and cream, two new servers that will be used for Oracle, replacing dlps6 and coffee.
    • May: Snavely and Varnum discussed several upcoming upgrades and projects for Web Services, and made a list of action items. Among them is a new server for Metalib, and Marsh suggested a hardware configuration for that new server.
    • June: After over seven years at the library, Marsh left the university effective June 15.
    • June: Feeman created scripts to extract data from GRIN and build simple reports for Karle-Zenith and Willett.
    • June: Feeman re-wrote large sections of the ILLiad e-commerce system to comply with a change to the Verisign API scheduled for July, and took advantage of the opportunity to increase the security of some areas.
    • June: Snavely reconfigured the existing data preparation server in DLPS, sangria, as a storage server, and set up two new blade servers for data preparation in DLPS. As previously planned with DLPS staff, one server will temporarily serve as the DLXS workshop server for several weeks. Snavely retired the old dedicated DLXS workshop server.
    • July: Feeman began work on changes to the ILLiad e-commerce system to accommodate an ILLiad upgrade in August, and also made some security, cosmetic, and navigational improvements.
    • July: Snavely initiated a project with MAIS Oracle administrators to develop a formal procedure for recovering Aleph from backup.
    • Aug: Snavely met again with MAIS Oracle staff to discuss Oracle backup and recovery.
    • Sep: Snavely installed and configured a new blade server chassis and networking hardware in the MACC data center to begin building the new hosting infrastructure for MBooks.
    • Oct: Snavely received and moved the new Sun Honeycomb CAS storage system to the MACC.
    • Oct: Sebastien Korner was hired as a programmer, and started work on October 29.
    • Nov: Feeman, Korner, and Patterson installed and configured two new web servers and two new MySQL servers at the MACC for use with MBooks.
    • Nov: Feeman received a response from our PCI (Payment Card Industry) self-evaluation of the ILLiad e-commerce system via CAS, and worked with ITCS and Snavely to prepare a response.
    • Nov: Feeman, Korner, and Snavely worked with Sun technicians to power up and configure the Honeycomb system.
  • Service Availability
    • Arbor Lakes Data Center (ALDC)
      • ethel.umdl (ARC): No down time
      • ezra.umdl (Verde): No down time
      • gracie.umdl (Aleph): Down on Sunday, November 4 from 6:00am to 10:30am, on Sunday, November 11 from 6:00am to 9:55am, on Sunday, November 18 from 6:30am to 9:55am, and on Sunday, November 25 from 7:30am to 9:30am to test and implement system configuration changes related to a storage system.
      • hiram.umdl (Zebra): No down time
      • tequila.umdl (Metalib and SFX): Down (SFX only) on Saturday, November 25 from 1:30pm to 1:40pm due to a software malfunction.
    • Duderstadt Center Machine Room (DCMR)
      • dlps6.umdl (Oracle): No down time
      • dlps7.umdl (Numeric and Geospatial Data): No down time
      • dlps11.umdl (DLPS collections): No down time
      • dlps12.umdl (DLPS collections): No down time
      • soymilk.umdl (MySQL): No down time
    • Hatcher Server Room: All non-redundant services were down on Wednesday, November 14 from 6:30am to 8:30am due to a scheduled power outage.
      • belle.umdl (Library Web): No additional down time
      • coffee.umdl (Oracle): No additional down time
      • cowmilk.umdl (MySQL): No additional down time
      • hefeweizen.umdl (DLPS collections and MBooks): Down on Wednesday, November 14 from 6:30am to 12:30pm due to a scheduled power outage, and on Wednesday, December 19 from 7:00am to 8:00am for a storage upgrade. MBooks service was degraded (many books unavailable) from Wednesday, November 14 at 12:30pm to Thursday, November 15 at approximately 8:00am due to problems encountered during the power outage.
      • merlot.umdl (DLPS collections and MBooks): See hefeweizen.umdl
      • porter.umdl (DLPS collections and MBooks): See hefeweizen.umdl
      • quik.umdl (Survivors of the Shoah cache): No additional down time
      • sambuca.umdl (Deep Blue): Down on Wednesday, November 21 from 6:30am to 9:10am to move the server and storage to a different data center.
      • ting.umdl (Library Proxy): No additional down time
  • Data Loading/Archival Statistics
    • CD/DVDs loaded or reloaded (bitonal page images): 10
    • CD/DVDs loaded or reloaded and processed (contone images): 111
    • Removable drives loaded or reloaded: 1
    • CD/DVDs burnt or duplicated: 14