Another Word For It Patrick Durusau on Topic Maps and Semantic Diversity

June 29, 2011

Topic Modeling Sarah Palin’s Emails

Filed under: Latent Dirichlet Allocation (LDA),Linguistics — Patrick Durusau @ 9:05 am

Topic Modeling Sarah Palin’s Emails from Edwin Chen.

From the post:

LDA-based Email Browser

Earlier this month, several thousand emails from Sarah Palin’s time as governor of Alaska were released. The emails weren’t organized in any fashion, though, so to make them easier to browse, I did some topic modeling (in particular, using latent Dirichlet allocation) to separate the documents into different groups.

Interesting analysis and promise of more to follow.
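
For anyone who wants to try the grouping step on their own document dump, here is a minimal sketch of latent Dirichlet allocation fitted by collapsed Gibbs sampling, using only the Python standard library. It is not Edwin Chen's code, and the post does not say which implementation he used. The function names (`lda_gibbs`, `dominant_topic`), the hyperparameter values, and the toy documents are all assumptions made up for illustration; the documents are placeholders, not text from the released emails.

```python
import random
from collections import Counter


def lda_gibbs(docs, num_topics, iterations=100, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA to tokenized documents with collapsed Gibbs sampling.

    Returns (doc_topic, topic_word): per-document topic counts and
    per-topic word counts after the final sweep.
    alpha and beta are illustrative smoothing values, not tuned settings.
    """
    rng = random.Random(seed)
    vocab_size = len({word for doc in docs for word in doc})
    doc_topic = [[0] * num_topics for _ in docs]
    topic_word = [Counter() for _ in range(num_topics)]
    topic_total = [0] * num_topics

    # Start from a random topic assignment for every token.
    assignments = []
    for d, doc in enumerate(docs):
        zs = [rng.randrange(num_topics) for _ in doc]
        for word, z in zip(doc, zs):
            doc_topic[d][z] += 1
            topic_word[z][word] += 1
            topic_total[z] += 1
        assignments.append(zs)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, word in enumerate(doc):
                # Remove the token's current assignment from the counts.
                z = assignments[d][i]
                doc_topic[d][z] -= 1
                topic_word[z][word] -= 1
                topic_total[z] -= 1
                # Resample the topic in proportion to how much the document
                # uses each topic and how much each topic uses this word.
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][word] + beta)
                    / (topic_total[k] + vocab_size * beta)
                    for k in range(num_topics)
                ]
                z = rng.choices(range(num_topics), weights=weights)[0]
                # Record the new assignment.
                assignments[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][word] += 1
                topic_total[z] += 1

    return doc_topic, topic_word


def dominant_topic(counts):
    """Index of the topic with the most tokens in one document."""
    return max(range(len(counts)), key=counts.__getitem__)


# Invented placeholder documents, already lowercased and tokenized.
docs = [
    "oil gas pipeline energy lease".split(),
    "pipeline energy gas drilling oil".split(),
    "school budget education teachers funding".split(),
    "education funding school teachers budget".split(),
]

doc_topic, topic_word = lda_gibbs(docs, num_topics=2)
groups = [dominant_topic(counts) for counts in doc_topic]

for doc, group in zip(docs, groups):
    print(group, " ".join(doc))
for k, counts in enumerate(topic_word):
    print("topic", k, [word for word, _ in counts.most_common(3)])
```

Each document lands in the group of its dominant topic, and the most frequent words per topic serve as rough labels for browsing. On a real collection you would want proper tokenization, stop-word removal, and far more documents than this toy set before trusting the groups.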

With a US presidential election next year, there will no doubt be floods of documents, friendly as well as hostile.

Time to sharpen your data extraction tools.
