Learn the three ways you can create worksets in HTRC Analytics, as well as how to validate a workset and download it to your personal machine.

Browse public worksets

There are many existing worksets in HTRC that you can use instead of creating your own. 

...

  • You can filter worksets by name, or you can narrow the display to your worksets only.
  • You can click on the hyperlinks to see the volumes in the workset and minimal metadata about each volume. You can follow the links in the “Title” field to see the volume in the HathiTrust Digital Library. You can also click the "Download" button to download the HathiTrust volume IDs in the workset.



Create a workset

There are three ways to create a workset in HTRC Analytics:

  1. Import a collection from HathiTrust
  2. Import your selected results from HTRC Workset Builder 2.0
  3. Upload a list of HathiTrust volume identifiers for your volumes of interest

You can create a workset directly from a public HathiTrust collection. There are many existing collections, or you can create your own by following these steps:

...

  • Collections can be created temporarily as you browse, or you can log in to save your collection. Note that the credentials for the HathiTrust Digital Library are different from your HTRC Analytics account and are available only to users at HathiTrust partner institutions, although guest accounts can be created. (See: How to create a guest account.)
    • Note: Occasionally, a volume in your HathiTrust collection is not available via HTRC. Its metadata will still appear in your workset, so the volume will be included if and when it becomes available, but until then it will be excluded from any job you run on the workset.
  • Make sure your HathiTrust collection is public. You will get an error if you try to import a private collection as a workset.
  • In a separate browser tab, go to HTRC Analytics and sign in. 

  • Click "Worksets" from the top menu on the home page.
  • Click the orange "Create A Workset" button toward the top right of the Worksets page.
  • Click the blue "Import from HathiTrust" button.



  • Return to your collection page on HathiTrust, and copy the URL. 



  • Go back to HTRC Analytics, and paste the URL into the field that says "HathiTrust Collection URL."
  • Click "Fetch collection."





  • A workset name will be suggested for you; you can edit it if desired.
  • Add information to the description field if not pre-populated. 
  • Click the checkbox to make your workset private, so that it is accessible to and viewable by you alone, or leave the box unchecked to make your workset public.
  • Click "Create workset."

You can import selected results directly from the HTRC Workset Builder. To do this, follow these steps:

...

  • Choosing this option will generate the same page as above, but with an empty "Selection ID" field. Paste your selection cart ID into this field. The ID is shown in the bar to the left of the "Export Workset" button on the shopping cart view page.




You can create a workset to use with HTRC algorithms by uploading a list of HathiTrust volume IDs.

...

  • Once you have a list of volume IDs, make sure it conforms to the file requirements. Your volume ID list must be in CSV, TSV, or TXT format, and the only thing it must contain is the volume IDs, in the left-most column. Additional fields can be present, but they will be ignored and won't affect the upload or the metadata for your workset. The file should have a header row containing the text "volume" or "id".
  • Click "Worksets" from the top menu on the home page.



  • Click the orange "Create A Workset" button toward the top right of the Worksets page.
  • Click the "Upload File" button.



  • Give your workset a name and description.



  • Upload the file by clicking "Choose File."
  • Click the checkbox to make your workset private, so that it is accessible to and viewable by you alone, or leave the box unchecked to make your workset public.
  • Click "Create workset."
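
The file requirements in the first step above can be checked on your own machine before you upload. The following Python sketch is one possible way to do that and is not part of HTRC Analytics. The function name check_volume_id_file and the sample volume IDs are hypothetical, and the sketch assumes that the header check is case-insensitive and that blank lines can be skipped, neither of which is confirmed here.

```python
import csv
import io


def check_volume_id_file(text, delimiter=","):
    """Return (ids, problems) for the contents of a volume ID list.

    Assumptions (not confirmed by HTRC): the header check is
    case-insensitive, and blank lines are skipped.
    """
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    rows = [row for row in reader if row]  # drop completely blank lines
    if not rows:
        return [], ["file is empty"]

    problems = []
    header = " ".join(rows[0]).lower()
    if "volume" not in header and "id" not in header:
        problems.append('header row should contain "volume" or "id"')

    ids = []
    for number, row in enumerate(rows[1:], start=1):
        volume_id = row[0].strip()  # only the left-most column is used
        if volume_id:
            ids.append(volume_id)
        else:
            problems.append(f"data row {number}: left-most column is empty")
    return ids, problems


# Hypothetical sample: a header row, IDs first, one extra column.
sample = "volume,title\nmdp.39015012345678,Example one\nuc1.b000123456,Example two\n"
ids, problems = check_volume_id_file(sample)
print(ids)
print(problems)
```

For a TSV file, or a TXT file that uses tabs, pass delimiter="\t". An empty list of problems only means the file matches the requirements as described on this page; it does not tell you whether the volumes are available in HTRC.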

Validate a workset

HathiTrust is a dynamic repository: it continues to grow, and, less frequently, items are removed or their access profile changes. To check whether the volumes in your workset are available for analysis using HTRC algorithms or the HTRC Data Capsule environment, you can validate a workset.

...

Validating a workset will show you how many of the volumes in your workset are currently accessible via HTRC algorithms or the HTRC Data Capsule environment. You can download either the volume IDs that are valid or those that are not. You could then upload the valid IDs as a new workset, if you wanted.
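
If you decide to upload the valid IDs as a new workset, the list has to meet the upload file requirements described earlier on this page: volume IDs in the left-most column, under a header row containing "volume" or "id". The Python sketch below shows one way to prepare such a file, assuming you already have the valid IDs as a list. The function name build_upload_text and the sample IDs are hypothetical, and the file you download after validation may already be in a suitable format.

```python
import csv
import io


def build_upload_text(volume_ids):
    """Return single-column CSV text with a "volume" header row.

    Blank and duplicate IDs are dropped; the first occurrence of
    each ID keeps its position.
    """
    seen = set()
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["volume"])
    for volume_id in volume_ids:
        volume_id = volume_id.strip()
        if volume_id and volume_id not in seen:
            seen.add(volume_id)
            writer.writerow([volume_id])
    return buffer.getvalue()


# Hypothetical IDs, with one duplicate that will be dropped.
valid_ids = ["mdp.39015012345678", "uc1.b000123456", "mdp.39015012345678"]
print(build_upload_text(valid_ids), end="")
```

Save the returned text to a .csv file and use it with the file upload steps to create the new workset.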


Download a workset

After you have created a workset, you can download it as a list of volume identifiers in comma-separated values (CSV) format. Because each workset is functionally a list of pointers to content in the HathiTrust Digital Library, the full text of the volumes is not included in the download. If you are interested in receiving a dataset from HathiTrust to do research on your own machine, please refer to the directions for requesting a custom dataset. The volume identifiers in a workset are consistent with the volume identifiers used elsewhere across HathiTrust.

...