Beyond Google Books: Getting Locally-Digitized Material into HathiTrust

Page from early printed edition of Dante's Divine Comedy with an elaborate border and capital N.

HathiTrust started out with only content digitized by Google, but a goal from early on was to support digitized book material from a variety of sources. One early effort provided partners with a toolkit for preparing content, but it turned out to require more technical effort than was reasonable. We rethought our approach and simplified the requirements for partners while maintaining the same high quality standards for HathiTrust.

Quality in HathiTrust (Re-Posting)

Skew in a Google-digitized volume in HathiTrust

This is a re-posting of a HathiTrust blog post. HathiTrust receives well over a hundred inquiries every month about quality problems with page images or OCR text of volumes in HathiTrust. That’s the bad news. The good news is that in most of these cases, something can be done about it. The post is intended to shed some light on the thinking and practices around quality in HathiTrust.
