* Here we introduce the Extracted Features (EF) functionality, currently in beta release, that has recently been developed by the HathiTrust Research Center (HTRC). This functionality is one of the ways in which HTRC users can perform non-consumptive analysis of subsets of the HathiTrust Digital Library's corpus that they have custom-selected through the HTRC workset mechanism. (Currently, this functionality is available only for the HathiTrust Digital Library's public domain corpus, which consists of slightly fewer than 5 million volumes.)
...
This section shows you how to create a custom workset and then download the advanced and basic EF data files for the volume(s) it contains. Your workset can contain as many volumes as you wish. For the sake of simplicity, however, the example workset in this section consists of a single volume from the HathiTrust Digital Library's public domain collection: a 1920 edition of Buch der Lieder, a book of poems by the German poet Heinrich Heine. (One of the use cases we have posted for the EF approach to non-consumptive text analysis also uses this book by Heine to make its point.)
2.1
...
Navigate to the HTRC Secure HathiTrust Analytics Research Commons (SHARC).
Navigate to the “Create Virtual Machine” tab and fill in the form. Choose an image from the drop-down list, provide a username and password for the VNC session, and choose the number of CPUs and the amount of memory for your virtual machine. Finally, click the "Create VM" button. VM creation usually takes about 1 minute.
2.2 Show a Virtual Machine Status
Navigate to the “Virtual Machines” page to see all the VMs and the operations available for each VM.
Click on the VM ID link to see more details about a VM. The “VM Initial Logging User ID” and “VM Initial Logging Password” are the username and password you use to log in to the VM itself. They are different from the ones you use to open your VNC session to the VM. The “Public IP” and “VNC port” are the details you need to open a VNC session. You can also use the “Public IP” and “SSH port” to log in to the VM through SSH, but this is allowed only in maintenance mode.
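The connection details described above can be turned into client arguments with a few lines of Python. This is a minimal sketch, not part of the HTRC tooling: the function names are hypothetical, the IP address and port numbers are placeholders, and the `host::port` form is an assumption about your VNC viewer (several common viewers accept it for a raw TCP port, but check your client's documentation).

```python
def vnc_target(public_ip: str, vnc_port: int) -> str:
    """Build a VNC target from the "Public IP" and "VNC port" fields.

    Assumes a viewer that accepts the host::port form for a raw TCP port.
    """
    if not 0 < vnc_port < 65536:
        raise ValueError(f"invalid VNC port: {vnc_port}")
    return f"{public_ip}::{vnc_port}"


def ssh_command(vm_user: str, public_ip: str, ssh_port: int) -> list[str]:
    """Build an ssh argument list from the "Public IP" and "SSH port" fields.

    vm_user is the "VM Initial Logging User ID", not the VNC session username.
    SSH logins are allowed only while the VM is in maintenance mode.
    """
    if not 0 < ssh_port < 65536:
        raise ValueError(f"invalid SSH port: {ssh_port}")
    return ["ssh", "-p", str(ssh_port), f"{vm_user}@{public_ip}"]


if __name__ == "__main__":
    # Placeholder values; copy the real ones from the VM details page.
    print(vnc_target("192.0.2.10", 16001))
    print(" ".join(ssh_command("vmuser", "192.0.2.10", 16000)))
```

Keeping the two sets of credentials in separate parameters mirrors the distinction made above: the VNC username and password open the remote desktop session, while the initial logging user ID and password log you in to the VM's operating system.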
2.3 Start a Virtual Machine
You can start a virtual machine from the “Virtual Machines” page. This operation usually takes 2 to 3 minutes. Once the VM has started successfully, more operations become available, e.g., switch, stop, and delete.
Go to the HTRC Secure HathiTrust Analytics Research Commons (SHARC). Click on the “Sign In” link at the upper right corner of the screen.
2.2 Sign in to HTRC SHARC
After Step 1, you will reach the screen shown below. Enter your HTRC username and password in the respective fields, and then click on the “Sign In” button.
2.3 Verify that you are logged in to HTRC
After Step 2, you will arrive at the screen shown below. Verify that your HTRC username appears at the upper right corner, showing that you are successfully logged in to HTRC.
2.4 Log into a Virtual Machine
...
By default, a VM starts in maintenance mode, in which you have network access. To switch to secure mode, click the “Switch to Security” button. When you perform the mode switch, the screen in your VNC session freezes for a short time, after which you can resume your work. Before switching from secure mode back to maintenance mode, eject/unmount the secure volume so that any changes made to it are made permanent.
In the portal, go to "HTRC Data Capsule -> Show Virtual Machines" to reach the page for switching between modes. In maintenance mode, click on the "Switch To Secure Mode" button in the portal to switch to secure mode.
In secure mode, click on the "Switch To Maintenance Mode" button in the portal to switch to maintenance mode.
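The mode rules described above can be summarized as a small state model. The sketch below is illustrative only: `CapsuleModel` and its methods are hypothetical names, not an HTRC API, and the check on the secure volume is modeled here as a hard error even though in practice it is a step you are responsible for performing yourself.

```python
from enum import Enum


class Mode(Enum):
    MAINTENANCE = "maintenance"
    SECURE = "secure"


class CapsuleModel:
    """Toy model of the two VM modes; it does not talk to any real service."""

    def __init__(self) -> None:
        self.mode = Mode.MAINTENANCE        # VMs start in maintenance mode by default
        self.secure_volume_mounted = False  # set by the user of this model

    def ssh_allowed(self) -> bool:
        # SSH logins are allowed only in maintenance mode.
        return self.mode is Mode.MAINTENANCE

    def switch_to_secure(self) -> None:
        if self.mode is not Mode.MAINTENANCE:
            raise RuntimeError("already in secure mode")
        self.mode = Mode.SECURE

    def switch_to_maintenance(self) -> None:
        if self.mode is not Mode.SECURE:
            raise RuntimeError("already in maintenance mode")
        if self.secure_volume_mounted:
            # Modeled as an error so the reminder cannot be skipped.
            raise RuntimeError("eject/unmount the secure volume first")
        self.mode = Mode.MAINTENANCE


if __name__ == "__main__":
    vm = CapsuleModel()
    vm.switch_to_secure()
    vm.secure_volume_mounted = True   # working on the secure volume
    vm.secure_volume_mounted = False  # eject/unmount before leaving secure mode
    vm.switch_to_maintenance()
    print(vm.mode.value)
```

Treating the unmount step as a precondition makes the ordering explicit: changes to the secure volume are made permanent only if it is ejected before the switch back to maintenance mode.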
2.6 Stop a Virtual Machine
...