...
He explained the similarity with an example from the last user group meeting: how do we study the meaning of the word "creativity"? It became a common word after World War II, but we don't know its meaning, why it became common, or what this means. The tool he developed can help by showing the broad context of the word.
...
High dimensionality in historical text processing: this data set has a very large number of dimensions. He looked at the most frequent words in a given year. One open question is how to make sure we are not biased towards the most informative year.
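As a rough illustration of "most frequent words in a given year", here is a minimal sketch. It assumes the corpus is available as (year, tokens) pairs; the function name `top_words_by_year` and the parameter `k` are made up for this example and are not from the talk.

```python
from collections import Counter, defaultdict


def top_words_by_year(records, k=3):
    """Return the k most frequent lower-cased tokens for each year.

    records: iterable of (year, tokens) pairs (an assumed input shape).
    """
    counts = defaultdict(Counter)
    for year, tokens in records:
        # Accumulate token counts separately for every year.
        counts[year].update(token.lower() for token in tokens)
    # Keep only the words, dropping the counts, in ascending year order.
    return {
        year: [word for word, _ in counter.most_common(k)]
        for year, counter in sorted(counts.items())
    }


if __name__ == "__main__":
    sample = [
        (1950, ["creativity", "art", "creativity"]),
        (1950, ["art", "creativity"]),
        (1960, ["science", "science", "art"]),
    ]
    print(top_words_by_year(sample, k=1))
```

Restricting each year to its top k words is one simple way to cap the number of dimensions, but it does not by itself address the bias question raised above.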
He ignored adverbs but kept verbs, although words such as "were" may not be very important. He also removed words on a hand-crafted stop list. Locality-sensitive hashing was used for scalability. A cluster was used to distribute the processing by year. The job took less than an hour to finish.
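The notes do not say which locality-sensitive hashing scheme was used. The sketch below shows one common variant, MinHash signatures with banding, purely as an assumption, to illustrate how LSH avoids comparing every pair of word contexts. All names (`minhash_signature`, `lsh_candidate_pairs`, `num_hashes`, `bands`) are hypothetical.

```python
import hashlib
from collections import defaultdict


def _hash(token, seed):
    # Seeded 64-bit hash that is stable across runs (unlike built-in hash()).
    data = f"{seed}:{token}".encode("utf-8")
    return int.from_bytes(hashlib.md5(data).digest()[:8], "big")


def minhash_signature(tokens, num_hashes=32):
    """MinHash signature of a non-empty collection of tokens."""
    unique = set(tokens)
    if not unique:
        raise ValueError("tokens must not be empty")
    # For each seeded hash function, keep the smallest hash value.
    return [min(_hash(t, seed) for t in unique) for seed in range(num_hashes)]


def lsh_candidate_pairs(contexts, num_hashes=32, bands=16):
    """Return pairs of keys whose signatures agree on at least one band.

    contexts: dict mapping a key (e.g. a word or a year) to its context tokens.
    """
    rows = num_hashes // bands
    buckets = defaultdict(set)
    for key, tokens in contexts.items():
        sig = minhash_signature(tokens, num_hashes)
        for b in range(bands):
            band = tuple(sig[b * rows:(b + 1) * rows])
            buckets[(b, band)].add(key)
    pairs = set()
    for keys in buckets.values():
        ordered = sorted(keys)
        # Only keys sharing a bucket are compared, not all possible pairs.
        for i, first in enumerate(ordered):
            for second in ordered[i + 1:]:
                pairs.add((first, second))
    return pairs


if __name__ == "__main__":
    contexts = {
        "a": ["art", "genius", "imagination"],
        "b": ["imagination", "art", "genius"],
        "c": ["steel", "coal", "railway"],
    }
    print(lsh_candidate_pairs(contexts))
```

Because each key is hashed independently, the signature step could be split across machines (for example by year, as in the talk) before the buckets are merged; that split is not shown here.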
...