
HTRC Extracted Features are one of the ways in which users of HTRC's tools can perform non-consumptive analysis of text in the HathiTrust Digital Library (HTDL) corpus. As with most HTRC functions, Extracted Features are available for HTRC Worksets. Worksets are user-created collections of volumes from the HTDL. They can range from small (a single volume) to large (thousands of volumes or more), and at their core they consist simply of a list of HathiTrust volume IDs.

Researchers have several options for creating their workset, including querying the /wiki/spaces/INT/pages/43417814. Researchers who do not yet have a workset and who only want to work with the public domain texts can create a workset in the /wiki/spaces/INT/pages/43418520. This workset, or list of volume IDs, can be created by building a collection of volumes in HTDL and downloading the metadata that contains the volume IDs, or by making use of one of the other metadata sources from HathiTrust. Contact htc-help@hathitrust.org if you need assistance creating a list of volume IDs.
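Because a workset is at its core a list of volume IDs, it can be handled as plain text. The following is a minimal sketch, assuming the IDs are stored one per line; the function name `parse_workset` and the sample IDs are hypothetical placeholders, not real HathiTrust volumes or part of any HTRC tool.

```python
def parse_workset(text):
    """Return the unique, non-empty volume IDs from text, in first-seen order.

    Assumes one volume ID per line (an assumption, not an HTRC requirement).
    """
    seen = set()
    ids = []
    for line in text.splitlines():
        vol_id = line.strip()
        # Skip blank lines and IDs that were already collected.
        if vol_id and vol_id not in seen:
            seen.add(vol_id)
            ids.append(vol_id)
    return ids


# Hypothetical placeholder IDs, used only to illustrate the shape of the list.
sample = "example.0001\n\nexample.0002\nexample.0001\n"
workset = parse_workset(sample)
print(len(workset), "volumes in workset")
```

Removing duplicates and blank lines up front keeps the list suitable for a workset of any size, from one volume to many thousands.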

Download Format

Files

The HTRC Extracted Features files are formatted in JSON. For more information about the fields, see the documentation for each release.
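Since the files are JSON, they can be inspected with any standard JSON parser before consulting the release documentation. The sketch below assumes the file content is a single JSON object; the helper `top_level_fields` and the field names in the sample string are hypothetical and do not reflect the actual Extracted Features schema.

```python
import json


def top_level_fields(json_text):
    """Return the sorted top-level field names of a JSON object."""
    data = json.loads(json_text)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object at the top level")
    return sorted(data)


# Hypothetical sample; real field names are defined in each release's documentation.
sample = '{"exampleField": 1, "anotherField": {"nested": true}}'
fields = top_level_fields(sample)
print(fields)
```

Listing the top-level fields first makes it easier to match a downloaded file against the field descriptions for the release it came from.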

...