Pre-meeting notes:
-word break (Ted Underwood)
...
-derived stats and informational data (Beth Plale)
Meeting notes:
Ted Underwood (a professor in the Department of English, UIUC) discussed his use case with the HTRC team during the meeting.
...
Loretta summarized the requested extensions:
-count of the number of lines on a page
-count of the number of lines that start with a capitalized token
-use a dictionary to handle hyphens or non-dictionary tokens at the end of a line and the start of the next line, and combine these tokens only if the combined form exists in the dictionary
-add counts for punctuation tokens
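For reference, the requested extensions could be sketched roughly as below. This is only an illustration, not Ted's script and not HTRC code. The function names (join_line_breaks, page_stats), the whitespace tokenization, the lowercase dictionary lookup, and the choice to count punctuation per character are all assumptions made for the sketch.

```python
import string
from collections import Counter

PUNCT = set(string.punctuation)


def join_line_breaks(lines, dictionary):
    """Join the last token of a line with the first token of the next line,
    but only when the combined form is found in `dictionary`.
    Assumption: `dictionary` is a set of lowercase words."""
    token_lines = [ln.split() for ln in lines]
    for i in range(len(token_lines) - 1):
        cur, nxt = token_lines[i], token_lines[i + 1]
        if not cur or not nxt:
            continue
        head, tail = cur[-1], nxt[0]
        # Ignore trailing punctuation on the second half when looking it up.
        core = tail.rstrip(string.punctuation)
        candidate = head.rstrip("-") + core
        # Only consider a join for hyphenated or non-dictionary line endings.
        needs_check = head.endswith("-") or head.lower() not in dictionary
        if needs_check and candidate.lower() in dictionary:
            cur[-1] = head.rstrip("-") + tail
            nxt.pop(0)
    return token_lines


def page_stats(lines, dictionary):
    """Return the per-page counts discussed in the meeting (hypothetical layout)."""
    joined = join_line_breaks(lines, dictionary)
    tokens = [tok for row in joined for tok in row]
    # Assumption: punctuation is counted per character after joining.
    punct = Counter(ch for tok in tokens for ch in tok if ch in PUNCT)
    capitalized = sum(
        1 for ln in lines if ln.split() and ln.split()[0][0].isupper()
    )
    return {
        "line_count": len(lines),  # assumption: blank lines are counted too
        "capitalized_line_count": capitalized,
        "punctuation_counts": dict(punct),
        "tokens": tokens,
    }


if __name__ == "__main__":
    page = ["The implemen-", "tation works, mostly.", "it is fine."]
    words = {"the", "implementation", "works", "mostly", "it", "is", "fine"}
    print(page_stats(page, words))
```

A real version would likely need a proper tokenizer and a larger word list; the sketch only shows the order of the checks (hyphen or non-dictionary ending first, dictionary confirmation before joining).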
Action item:
Sayan and Jiaan will get the Python script from Ted and work on a simpler version of its logic.