HTRC Release 2.0

News: Release 2.0 (December 2013) contains numerous enhancements and fixes, listed in the Changes Since Last Release column. User-facing feature enhancements are highlighted in red.

Software Service

Software Service Functionality

Changes Since Last Release

Changes Directly Affecting Users?

HTRC-App

A client for retrieving data from the Data API. Mainly used by Meandre workflows.

  • Use HTTP POST instead of HTTP GET

x
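The release notes do not say why the client moved to POST. A common motivation is that a long list of volume IDs can exceed URL length limits when sent as a GET query string. The sketch below only illustrates the idea: the endpoint URL, the `volumeIDs` parameter name, and the pipe delimiter are assumptions, and the request is built but never sent.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical endpoint, used only for illustration.
DATA_API_URL = "https://example.org/data-api/volumes"


def build_volume_request(volume_ids):
    """Build (but do not send) a POST request that carries the IDs in the body."""
    # Assumed parameter name and delimiter; the real API may differ.
    body = urlencode({"volumeIDs": "|".join(volume_ids)}).encode("utf-8")
    return Request(
        DATA_API_URL,
        data=body,
        method="POST",
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )


req = build_volume_request(["vol.001", "vol.002"])
print(req.get_method(), req.full_url)
```

Because the IDs travel in the request body, the URL stays the same length no matter how many volumes are requested.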

HTRC-Compute-Agent

Job submission and monitoring.

New features

  • submission of jobs to PBS-based clusters such as Quarry and BigRed2

  • support for user-submitted CSV files

Technical updates and bug fixes

  • Akka configuration settings, especially those related to performance (e.g. thread pool size)

  • error handling for registry queries

  • specification of paths in configuration files rather than in the code

 

x

x

 

x

x

x
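As a rough illustration of what submitting a job to a PBS-based cluster involves, the sketch below renders a minimal PBS job script. The function name, default resource values, and the example command are hypothetical, and nothing here reflects how the Compute Agent actually generates its scripts.

```python
def render_pbs_script(job_name, command, nodes=1, ppn=1, walltime="01:00:00"):
    """Return the text of a minimal PBS job script (illustrative only)."""
    lines = [
        "#!/bin/bash",
        f"#PBS -N {job_name}",               # job name
        f"#PBS -l nodes={nodes}:ppn={ppn}",  # nodes and processors per node
        f"#PBS -l walltime={walltime}",      # maximum run time
        "cd $PBS_O_WORKDIR",                 # start in the submission directory
        command,
    ]
    return "\n".join(lines) + "\n"


# Hypothetical job name and command, for illustration only.
script = render_pbs_script("htrc-demo", "./run_algorithm.sh workset.csv")
print(script)
```

A script like this would typically be written to a file and handed to `qsub`. Site-specific directives such as queue names are omitted.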

HTRC-Data-DataAPI

  • Retrieve volume and page contents

  • Return token count for volumes/pages on the fly

New features

  • Return token count for volumes/pages. The computation is done on the server side.
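The notes do not describe how the Data API tokenizes text. As a minimal sketch, assuming simple whitespace tokenization, per-page and per-volume counts could be computed like this (the function name and return shape are hypothetical):

```python
def count_tokens(pages):
    """Count whitespace-separated tokens per page and for the whole volume."""
    per_page = [len(text.split()) for text in pages]
    return {"pages": per_page, "volume": sum(per_page)}


counts = count_tokens(["It was the best of times", "", "it was the worst"])
print(counts)
```

Doing this on the server means the client receives only the counts rather than the full page text.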

 

 

HTRC-Data-DeleteNoticeTool

Internal tool for deleting volumes listed in the weekly deletion notice emails sent by HathiTrust

None

x

HTRC-Data-Ingester

Internal service that brings the HathiTrust corpus into the HathiTrust Research Center

None

x

HTRC-Data-LogIngester

Internal tool that collects log information for the Agent, Portal, Data API, and Solr Proxy

  • added support for agent log

x

HTRC-Data-RegistryExtension

Provides the backend storage and retrieval functionality for HTRC:

  • worksets

  • jobs

  • algorithms

  • files

  • updated workset metadata to include whether the workset is public or not

  • added .csv filetype registration for CXF servlet

  • updated schema to reflect changes to properties XML

  • reworked CSV workset export functionality

  • added "volumeCount" workset metadata field

  • added support for extended workset properties in the workset creation and workset volume retrieval operations

x

x

x

x

x

x

HTRC-Data-SolrProxy

A proxy service between users and the actual Solr cluster that protects the index from modification and audits user requests

  • handles errors by returning XML error messages

  • uses two different cores, one for metadata and the other for OCR

x

x
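To make the proxy's role concrete, the sketch below shows one way a read-only check with XML error reporting might look. The core names, the set of allowed handlers, the URL layout, and the error element names are all assumptions for illustration, not the Solr Proxy's actual behavior.

```python
from xml.sax.saxutils import escape

ALLOWED_HANDLERS = {"select"}  # assumption: only read-only search is exposed
CORES = {"meta", "ocr"}        # hypothetical names for the metadata and OCR cores


def check_request(path):
    """Return (allowed, xml_error_or_None) for a path such as '/solr/ocr/select'."""
    parts = [p for p in path.split("/") if p]
    if (
        len(parts) == 3
        and parts[0] == "solr"
        and parts[1] in CORES
        and parts[2] in ALLOWED_HANDLERS
    ):
        return True, None
    # Escape the path so the error body remains well-formed XML.
    message = escape(f"Request not permitted: {path}")
    return False, f"<error><message>{message}</message></error>"


print(check_request("/solr/ocr/select"))
print(check_request("/solr/ocr/update"))
```

An allow-list of handlers is used here rather than a block-list, so any handler not explicitly named is rejected.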

HTRC-Meandre-Components

Meandre components that use the HTRC-App client to retrieve data from the Data API.

  • switched to using Maven

  • cleaned up dead code

  • removed plain-text password authentication in favor of token-based authentication

  • updated to use latest version of DataAPI

  • bug fix: ensured proper close of client connections

x

x

x

x

x

HTRC-Meandre-Flows

Meandre flows that provide the algorithmic functionality, such as token counts, topic modeling, entity extraction, and Dunning log-likelihood analysis.

  • added flows to version control

  • fixed Dunning log-likelihood to output both over- and under-represented data and to provide those outputs in a downloadable format

  • added a Naive Bayes classification algorithm that provides training and testing confusion matrices along with actual and predicted values

x
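For readers unfamiliar with the statistic, the sketch below uses a commonly cited two-corpus formulation of Dunning's log-likelihood (G²), together with a direction flag that distinguishes over- from under-represented words. Whether the HTRC flow uses exactly this form is not stated in these notes; the function and argument names are hypothetical.

```python
import math


def dunning_g2(a, b, c, d):
    """Log-likelihood for one word.

    a, b: occurrences of the word in the analysis and reference corpus.
    c, d: total token counts of the analysis and reference corpus (both > 0).
    Returns (g2, direction), where direction is "over", "under", or "equal"
    relative to the reference corpus.
    """
    # Expected counts if the word were equally frequent in both corpora.
    e1 = c * (a + b) / (c + d)
    e2 = d * (a + b) / (c + d)
    g2 = 0.0
    for observed, expected in ((a, e1), (b, e2)):
        if observed > 0:  # a zero count contributes nothing to the sum
            g2 += observed * math.log(observed / expected)
    g2 *= 2
    rate_a, rate_b = a / c, b / d
    if rate_a > rate_b:
        direction = "over"
    elif rate_a < rate_b:
        direction = "under"
    else:
        direction = "equal"
    return g2, direction


print(dunning_g2(30, 10, 1000, 1000))
print(dunning_g2(10, 30, 1000, 1000))
```

The statistic itself is non-negative, so the direction has to be reported separately, which is why both over- and under-represented outputs are useful.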

HTRC-Security-Auditor

A utility package used by other components to generate audit logs in a more consistent format

None

x

HTRC-Security-OAuth2

OAuth2 authentication-related components:

  • OAuth2 filter to use with web applications

  • WSO2 IS customizations

  • OAuth2 User Information Service for Reverse Lookup Using OAuth2 Access Token

  • OAuth2 Client API for Java Applications

 

None

x

HTRC-Tools-BackupAndRestore

Command-line tool used to back up user accounts and registry contents of HTRC stacks.

This is a new component for this release.

x

HTRC-Tools-UserManager

Command-line tool used to perform user management actions for HTRC (user creation, password changes, etc.)

None

x

HTRC-UI-AuditAnalyzer

A GUI for visualizing statistics from logs collected by HTRC-Data-LogIngester

added agent log analysis

x

HTRC-UI-Blacklight

Workset Builder

  • query documents by catalog and full-text contents

  • filter results by facet

  • manually select and deselect documents

  • save named worksets for future reference and use in Portal

bug fixes:

  • better workflow and messages when an unauthenticated user tries to create a workset

  • fixed display of the loading GIF

tech changes:

  • point to new Solr index

  • point to updated Registry

  • add asset_host so assets can be built automatically

  • improved code branching per environment (extended use of htrc.yml)

feature adds:

  • display custom HTRC metadata fields on volume detail page

  • add a sign up link in the header

  • add a link to the HT page turner on the volume detail page

  • remove non-functioning email and SMS links on volume detail page

 

x

 

x

x

x

x

HTRC-UI-Portal2

  • Browse worksets

  • Upload CSV worksets

  • Browse Algorithms

  • Execute Algorithms

  • Browse algorithm execution results

  • Create and manage user accounts

New features

  • UI improvements

  • Password reset function

  • Display the list of volumes in each workset

  • Display HTRC metadata and the HT page turner for each volume of a workset