Code4Lib Lightning Talk

The HathiTrust Data & Bib API
Sample Perl code that gathers bibliographic data, technical data about the zip file contents, and a zip file containing the document's scan images and OCR output.
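use strict;
use warnings;
use LWP::Simple;

# Assumption: the original does not show how HATHIFILE is opened, so the
# path to the tab-delimited hathifile is taken from the command line here.
my $hathifile = shift @ARGV or die "Usage: $0 <hathifile>\n";
open(my $fh, '<', $hathifile) or die "Cannot open $hathifile: $!";

while ( my $line = <$fh> ) {
  chomp $line;
  my @fields = split(/\t/, $line);

  # Column positions are those of the original sample; check them against
  # the layout of the hathifile you are reading.
  my $access = $fields[2];
  next unless defined $access && $access eq 'allow';

  my $title = $fields[11];
  my $imprint = $fields[12];
  my $recordid = $fields[4];
  my $htid = $fields[0];

  # get() returns the response body, or undef if the request fails.
  my $zipfile = get('http://services.hathitrust.org/api/htd/aggregate/'. $htid);
  my $structure = get('http://services.hathitrust.org/api/htd/structure/'. $htid);
  my $bibdata = get('http://catalog.hathitrust.org/api/volumes/full/recordid/'. $recordid .'.json');
}
close($fh);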
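
The loop above fetches each response into a variable and stops there. As a minimal sketch of one possible next step, the helpers below write a response to disk. Both `save_response` and `safe_name` are hypothetical names introduced for illustration and are not part of the original sample or of any HathiTrust library. The file-naming scheme is likewise an assumption.

```perl
use strict;
use warnings;

# Hypothetical helper: write a fetched response to disk in binary mode.
# LWP::Simple's get() returns undef on failure, so an undefined $content
# is reported with a false return value instead of being written.
sub save_response {
    my ($content, $path) = @_;
    return 0 unless defined $content;
    open(my $out, '>:raw', $path) or die "Cannot write $path: $!";
    print {$out} $content;
    close($out) or die "Cannot close $path: $!";
    return 1;
}

# Hypothetical naming scheme: volume ids can contain characters such as
# ':' and '/', so anything outside a conservative set becomes '_'.
sub safe_name {
    my ($htid) = @_;
    (my $name = $htid) =~ s/[^A-Za-z0-9._-]/_/g;
    return $name;
}
```

Inside the loop, a call such as `save_response($zipfile, safe_name($htid) . '.zip') or warn "No zip for $htid\n";` would keep one zip per volume and flag failed requests without stopping the run.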