...
Using the extracted features will require some effort on the part of the user. The number of users who have been waiting to try this is likely to be small. The bottleneck is not so much the HTRC as the small number of users.
Discussion: Once people see what they can do with the features, and see that it is tractable, there will be more acceptance of this.
From RachaelRachel: There’s a tool (“Paper Machine”) that came out recently from the American Sociological Association [?]. They are doing some extraction of policy statements from government documents. They are trying to do data visualization, some kind of word cloud, from it.
Discussion: Are government documents in the HT corpus always clearly marked as such? Not sure.
...