Code4Lib Lightning Talk
The HathiTrust Data & Bib API
- First, get lists and identifiers from the hathifiles export.
- Next, walk through the Data API to figure out what we can share.
- Finally, walk through the Bib API for metadata.
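The first step can be sketched on its own, without touching the network. This is a minimal sketch, not the talk's code: the sub names `parse_hathifile_line` and `is_shareable` are hypothetical, and the column positions are simply the ones used in the sample in this talk, so check them against the field list that accompanies the export you download.

```perl
use strict;
use warnings;

# Column positions as used in this talk's sample code (an assumption to
# verify against the field list published with the hathifiles export).
my %COLUMN = ( htid => 0, access => 2, recordid => 4, title => 11, imprint => 12 );

# Turn one tab-delimited hathifile line into a hash of named fields.
sub parse_hathifile_line {
    my ($line) = @_;
    chomp $line;
    my @fields = split /\t/, $line, -1;    # -1 keeps trailing empty columns
    my %volume;
    $volume{$_} = $fields[ $COLUMN{$_} ] for keys %COLUMN;
    return \%volume;
}

# True when the access column says the volume may be shared.
sub is_shareable {
    my ($volume) = @_;
    return defined $volume->{access} && $volume->{access} eq 'allow';
}
```

Naming the columns in one place means a change in the export layout only has to be corrected once.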
Sample Perl code to gather bib data, technical data about the zip file contents, and a zip file containing the document's scan images and OCR output:
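use strict;
use warnings;
use LWP::Simple;

# Path to a downloaded hathifiles export; taking it from the command line is an assumption.
my $path = shift @ARGV or die "usage: $0 hathifile\n";
open(my $hathifile, '<', $path) or die "Cannot open $path: $!";

while ( my $line = <$hathifile> ) {
    chomp $line;
    my @fields = split("\t", $line);
    my $access = $fields[2];
    next unless defined $access && $access eq 'allow';    # keep only volumes we can share
    my $title    = $fields[11];
    my $imprint  = $fields[12];
    my $recordid = $fields[4];
    my $htid     = $fields[0];
    # Data API: zip of scan images and OCR, then technical data about the zip contents.
    my $zipfile   = get('http://services.hathitrust.org/api/htd/aggregate/'. $htid);
    my $structure = get('http://services.hathitrust.org/api/htd/structure/'. $htid);
    # Bib API: bibliographic record as JSON. get() returns undef on failure.
    my $bibdata   = get('http://catalog.hathitrust.org/api/volumes/full/recordid/'. $recordid .'.json');
}
close($hathifile);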
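The sample fetches each response into a variable that goes out of scope at the end of the loop body. A minimal sketch of keeping the results on disk might look like the following. The sub names `safe_name` and `save_response` and the file-naming scheme are hypothetical, chosen for illustration only; they are not a HathiTrust convention.

```perl
use strict;
use warnings;
use File::Spec;

# Make a volume identifier safe to use as a file name. Identifiers can
# contain characters such as ':' and '/', so anything outside a
# conservative set is replaced with '_'. (Illustrative scheme only.)
sub safe_name {
    my ($htid) = @_;
    (my $name = $htid) =~ s/[^A-Za-z0-9._-]/_/g;
    return $name;
}

# Write one fetched response to $dir/<safe htid><suffix>.
# Returns the path written, or undef when there is nothing to write
# (LWP::Simple's get() returns undef when a request fails).
sub save_response {
    my ($dir, $htid, $suffix, $content) = @_;
    return undef unless defined $content;
    my $path = File::Spec->catfile($dir, safe_name($htid) . $suffix);
    open(my $out, '>', $path) or die "Cannot write $path: $!";
    binmode $out;    # zip data must not be newline-translated
    print {$out} $content;
    close($out) or die "Cannot close $path: $!";
    return $path;
}
```

Inside the loop, a call such as `save_response('volumes', $htid, '.zip', $zipfile)` would store the zip file, assuming a `volumes` directory already exists. The same call could store the other two responses; whether those need encoding first depends on how the LWP version in use decodes text content.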


