
Understand the basics of using a data capsule.




The HTRC Data Capsule environment provides individual, secure computing environments for analyzing content in the HathiTrust Digital Library. Researchers create virtual machines (called Capsules) into which they can import HathiTrust text data and then analyze it. Computational analysis must be performed within the secure Data Capsule environment; only the results of that analysis can be exported. Volume text may not be exported outside the HTRC Data Capsule, and data products leaving a Capsule must undergo results review prior to release to ensure they meet the HTRC's policy for non-consumptive data exports.

Use a Capsule: https://analytics.hathitrust.org/staticcapsules
See Data Capsule Specs and Usage Guide: HTRC Data Capsule Specifications and Usage Guide
Follow a tutorial: HTRC Data Capsule Step-by-Step Tutorials


Watch an introductory video about the Capsules

User interfaces shown in this video may be outdated, but step-by-step instructions are up to date.

We're updating our videos to show the latest changes!

Video: DataCapsuleIntro.mp4



Capsule specifications

What's in a Capsule?

Out of the box, Capsules are Ubuntu virtual machines with increased security settings. Researchers can set certain parameters for their Capsule when they create it. Capsules come with standard data analysis tools pre-installed, ranging from Anaconda and R to Voyant Tools, and can be configured with sample public domain data already loaded for testing. Any other data or tools the researcher plans to use must be brought into the Capsule by the researcher. A Capsule is an almost blank slate that can be customized for each researcher's needs!

Kinds of Capsules

There are three kinds of Capsules: Demo Capsules, Research Capsules, and Customized Research Capsules. Researchers can request full-corpus access for their Research and Customized Research Capsules; approval is limited to researchers from HathiTrust member institutions.

Using a Capsule

Creating a Capsule

Capsules operate from the HTRC Analytics website, which requires an HTRC account to log in.

Create an HTRC Analytics account: https://analytics.hathitrust.org/signuppage
Follow a tutorial: HTRC Data Capsule Tutorials

You'll use the site to create and administer your Capsule. 

 

Create a Capsule: https://analytics.hathitrust.org/staticcapsules
Follow a tutorial: Create or convert a Capsule

Research in a Capsule

In HTRC Analytics, you'll have the option to work with your Capsule either via a remote desktop viewer (to see your Capsule's desktop) or a terminal viewer (to interact with your Capsule via a command line interface).

Capsules are intended for researchers who want access to HathiTrust text data in a flexible, individually driven environment. A Capsule can be shared among up to 5 collaborators. Researchers looking for a point-and-click option should explore HTRC Algorithms.

We offer several step-by-step guides for using a Capsule (see links "Data Capsule Specs and Usage Guide" and "Follow a Tutorial" located at the top of this page).

Development details



The HTRC Data Capsule system was prototyped through funding from the Alfred P. Sloan Foundation (2011-2015). The final report (Plale, Prakash, and McDonald, 2015) is listed at the end of this page.

Extension of the HTRC Data Capsule project to larger compute resources and better integration with HTRC worksets was funded by a grant from the Andrew W. Mellon Foundation (2016-2018).

Borders, K., Vander Weele, E., Lau, B., & Prakash, A. (2009, August). Protecting Confidential Data on Personal Computers with Storage Capsules. In Proceedings of the 18th USENIX Security Symposium.

Zeng, J., Ruan, G., Crowell, A., Prakash, A., & Plale, B. (2014, June). Cloud Computing Data Capsules for Non-Consumptive Use of Texts. In Proceedings of the 5th ACM workshop on Scientific cloud computing (pp. 9-16). ACM.

Plale, B., Prakash, A., & McDonald, R. (2015). The Data Capsule for Non-Consumptive Research: Final Report. Available from http://hdl.handle.net/2022/19277