How to do Your Own Topic Modeling
From the post:
In the first Teaching with Technology Tuesday of the fall 2011 semester, David Newman delivered a presentation on topic modeling to a full house in Bass’s L01 classroom. His research concentrates on data mining and machine learning, and he has been working with Yale for the past three years in an IMLS funded project on the applications of topic modeling in museum and library collections. In Tuesday’s talk, David broke down what topic modeling is, how it can be useful, and introduced a tool he designed to make the process accessible to anyone who can use a computer.
The post summarizes what sounds like an interesting presentation on topic modeling (Latent Dirichlet Allocation, or LDA) and links to the software. It has enough detail that you will get the gist even if topic modeling is unfamiliar.
The usual cautions about LDA apply: it can’t model what is not present in the corpus, it works at the document level (too coarse for many purposes), and how you use the software (for example, the number of topics you request and how you preprocess the text) has a dramatic impact on the results. It is a useful tool; just be careful how much you rely on it without checking the results.
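To make the caution about settings concrete, here is a minimal sketch of LDA fitted with a toy collapsed Gibbs sampler in plain Python. It is not the tool from the talk. The corpus, the function name `lda_gibbs`, and the hyperparameter defaults (`alpha`, `beta`, `iterations`) are all assumptions made up for illustration. The point is only that changing the number of topics or the random seed can change which words end up grouped together.

```python
import random
from collections import Counter

# Hypothetical toy corpus: each document is a list of already-tokenized words.
DOCS = [
    "museum painting gallery artist painting".split(),
    "library catalog book archive book".split(),
    "artist gallery exhibit museum sculpture".split(),
    "archive catalog library manuscript book".split(),
    "museum library collection archive exhibit".split(),
]

def lda_gibbs(docs, num_topics, iterations=200, alpha=0.1, beta=0.01,
              seed=0, top_n=3):
    """Toy collapsed Gibbs sampler for LDA; returns the top words per topic."""
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})

    # Count tables updated during sampling.
    doc_topic = [[0] * num_topics for _ in docs]         # topic counts per document
    topic_word = [Counter() for _ in range(num_topics)]  # word counts per topic
    topic_total = [0] * num_topics                       # total words per topic

    # Start from a random topic assignment for every word occurrence.
    assignments = []
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            z = rng.randrange(num_topics)
            zs.append(z)
            doc_topic[d][z] += 1
            topic_word[z][w] += 1
            topic_total[z] += 1
        assignments.append(zs)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the current assignment from the counts.
                z = assignments[d][i]
                doc_topic[d][z] -= 1
                topic_word[z][w] -= 1
                topic_total[z] -= 1

                # Unnormalized conditional probability of each topic for this word.
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][w] + beta)
                    / (topic_total[k] + vocab_size * beta)
                    for k in range(num_topics)
                ]
                z = rng.choices(range(num_topics), weights=weights)[0]

                # Record the newly sampled assignment.
                assignments[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][w] += 1
                topic_total[z] += 1

    # Report the most frequent words currently assigned to each topic.
    return [
        [w for w, count in topic_word[k].most_common(top_n) if count > 0]
        for k in range(num_topics)
    ]

if __name__ == "__main__":
    # Same corpus, different settings: compare the topics by eye.
    for num_topics, seed in [(2, 0), (2, 1), (3, 0)]:
        print(f"topics={num_topics} seed={seed}:",
              lda_gibbs(DOCS, num_topics, seed=seed))
```

Running this prints one list of top words per topic for each setting. On a corpus this small the groupings may shift from run to run, which is exactly why the output of any topic modeling tool is worth checking against the documents themselves.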