Learn how to understand the data that you will be working with.
Volumes
Items in HathiTrust are called volumes. A volume is a discrete object that was digitized and cataloged as one unit. In the case of HathiTrust and its collection, volumes are typically books (monographs), but they may also be one issue of a periodical, several issues of a periodical bound and described together, or even a musical score. Keep in mind that a volume may be an anthology containing multiple works! Currently, volumes in HathiTrust start as physical objects that are digitized and added to the HathiTrust Digital Library.
...
HathiTrust volumes are identified via unique HathiTrust IDs. These alphanumeric IDs track volumes across HathiTrust and HTRC systems. For example, one volume of Jane Austen's letters has the volume ID hvd.32044021076179. When viewing a volume in the Digital Library, the volume ID can be found in the URL after "id=". The volume ID can be used to call metadata via HathiTrust's Bibliographic API or to pull volume content via HathiTrust's Data API. Additionally, the volume ID is often present in the file and/or directory name for content pertaining to a specific volume, and it also makes up the (pairtree) directory structure for volumes accessed via HathiTrust dataset requests or the HTRC Extracted Features Dataset.
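As a rough illustration of how a volume ID relates to a Digital Library URL and to a pairtree directory, here is a minimal Python sketch. The function names are hypothetical, and the example URL's host and path are assumed for illustration. The sketch also assumes that the part of the ID before the first period acts as a prefix, that the remainder is split into two-character directory segments, and that only the pairtree single-character substitutions (`/` to `=`, `:` to `+`, `.` to `,`) are needed. Confirm the layout against the dataset you actually receive.

```python
from urllib.parse import urlparse, parse_qs


def volume_id_from_url(url: str) -> str:
    """Return the value of the "id" query parameter from a Digital Library URL."""
    # Some URLs separate parameters with ";" instead of "&", so normalize first.
    query = urlparse(url).query.replace(";", "&")
    return parse_qs(query)["id"][0]


def split_volume_id(volume_id: str) -> tuple[str, str]:
    """Split an ID into (prefix, local ID) at the first period (assumed layout)."""
    prefix, _, local_id = volume_id.partition(".")
    return prefix, local_id


def pairtree_path(volume_id: str) -> str:
    """Build an assumed pairtree-style relative directory path for a volume ID."""
    prefix, local_id = split_volume_id(volume_id)
    # Pairtree single-character substitutions for filesystem-unfriendly characters.
    cleaned = local_id.translate(str.maketrans({"/": "=", ":": "+", ".": ","}))
    # Break the cleaned ID into two-character directory segments.
    segments = [cleaned[i:i + 2] for i in range(0, len(cleaned), 2)]
    return "/".join([prefix, "pairtree_root", *segments, cleaned])


if __name__ == "__main__":
    # Assumed example URL; only the "id=" parameter is taken from the text above.
    url = "https://babel.hathitrust.org/cgi/pt?id=hvd.32044021076179"
    vol_id = volume_id_from_url(url)
    print(vol_id)
    print(pairtree_path(vol_id))
```

For IDs whose local part contains characters such as `:` or `/`, the substitution step keeps each path segment safe to use as a directory name.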
...
A volume's copyright and license status affects how researchers are permitted to interact with the data for that volume. Only public domain volumes are available via the HathiTrust Data API and custom data request process. Additionally, there are access restrictions based on the digitizing agent of volumes in HathiTrust that impact use of the HT Data API and custom data request procedures. Read more here: https://www.hathitrust.org/data. Only public domain volumes are available for research via the HTRC Analytics site and Data Capsules compute environments. HTRC Extracted Features and the HT+Bookworm tool, however, do provide analytic access to derived data from the entire corpus.
...