...
HT-Bookworm (under development by HTRC and groups from Baylor and Northeastern) and the Data Capsule (which will allow users to run computational methods against protected data) are among the tools being developed to make copyrighted material more usable in the future. For HT-Bookworm to work, its back end will need to be changed from MySQL, which it currently uses, to Solr, which is likely to be more scalable. The plan is also to integrate HT-Bookworm with HTRC worksets, so that one would be able to go in both directions: from the Data Capsule to the workset, and from the workset to the Data Capsule. That is, the user might be able to automatically generate a workset from what they discover using HT-Bookworm. The goal is to make HT-Bookworm work with all the public-domain material in HTRC within a year. If HTRC has obtained the copyrighted data by then, the plan is to integrate that as well.
Interest is likely to grow as more tutorial material becomes available. However, what faculty want to do does not always fit the existing tutorials. A more visual treatment of topic models, such as Termite from Stanford, may be worthwhile.
If someone wants to submit an algorithm to the portal, how is the decision to accept it made? Currently, each submission is decided on a one-on-one basis. Setting up a workflow or process for people to submit their algorithms would be useful.