



What are worksets?

HTRC worksets are user-created collections of HathiTrust volumes to be treated as data and analyzed using HTRC tools and services. Worksets are curated by researchers, and they can be shared and cited to improve reproducibility. They are a foundational piece of all the work you will do in HTRC Analytics.

Create and browse worksets: https://analytics.hathitrust.org/worksets
Follow a tutorial to create or validate a workset: HTRC Workset Tutorials

...

Download and analyze Extracted Features. If you would like to download the Extracted Features of the volumes in a collection to analyze with your own code, you can use the Extracted Features Download Helper algorithm, which generates a shell script that downloads the Extracted Features for the volumes in a workset using rsync. (See more on downloading Extracted Features here.)
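As a rough illustration of what an rsync-based download involves, the sketch below assembles an rsync command that fetches only the files named in a list. It is not the script the Download Helper generates. The function names, the file names, and the rsync source are hypothetical placeholders; only the rsync options (-a, -v, --files-from) are standard rsync flags.

```python
# Hypothetical placeholder, NOT a confirmed HTRC endpoint: use the source
# given in the script that the Download Helper generates for your workset.
RSYNC_SOURCE = "rsync.example.org::features/"


def clean_paths(relative_paths):
    """Strip whitespace, drop blank entries and duplicates, keep input order."""
    cleaned = []
    for path in relative_paths:
        path = path.strip()
        if path and path not in cleaned:
            cleaned.append(path)
    return cleaned


def build_rsync_command(file_list, dest_dir, source=RSYNC_SOURCE):
    """Return the argument list for an rsync call that fetches only listed files."""
    return [
        "rsync",
        "-av",                        # archive mode, verbose output
        f"--files-from={file_list}",  # file with one relative path per line
        source,
        str(dest_dir),
    ]


if __name__ == "__main__":
    # Illustrative file names only; real names come from the generated script.
    paths = clean_paths(["vol1.json.bz2", "vol1.json.bz2", "vol2.json.bz2"])
    print(paths)
    print(" ".join(build_rsync_command("files.txt", "features")))
```

To run the transfer, one option is to write the cleaned paths to files.txt, one per line, and pass the returned list to subprocess.run(cmd, check=True). Building the command as a list rather than a single string avoids shell quoting problems.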

Access additional features in a data capsule. Volumes in a workset can be analyzed in the HTRC Data Capsule environment using the HTRC Workset Toolkit, a command-line interface. The toolkit streamlines access to the HTRC Data API and includes utilities to pull text data and volume metadata into a capsule. It also lets a researcher point OCR text data to analysis tools that are available in the capsule.

...