Another Word For It: Patrick Durusau on Topic Maps and Semantic Diversity

December 20, 2010

The Sensitivity of Latent Dirichlet Allocation for Information Retrieval

Author: Laurence A. F. Park, The University of Melbourne (slides).

Abstract:

It has been shown that the use of topic models for information retrieval provides an increase in precision when used in the appropriate form. Latent Dirichlet Allocation (LDA) is a generative topic model that allows us to model documents using a Dirichlet prior. Using this topic model, we are able to obtain a fitted Dirichlet parameter that provides the maximum likelihood for the document set. In this article, we examine the sensitivity of LDA with respect to the Dirichlet parameter when used for information retrieval. We compare the topic model computation times, storage requirements and retrieval precision of fitted LDA to LDA with a uniform Dirichlet prior. The results show that there is no significant benefit to using fitted LDA over LDA with a constant Dirichlet parameter, hence showing that LDA is insensitive with respect to the Dirichlet parameter when used for information retrieval.
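
As a rough illustration of the comparison the abstract describes, here is a minimal sketch using the gensim library (not the paper's own code): one LDA model trained with a uniform (symmetric) Dirichlet prior and one whose Dirichlet parameter is fitted to the document set during training. The toy corpus and all variable names are invented for the example.

# Minimal sketch, assuming gensim: uniform vs. fitted Dirichlet prior.
from gensim.corpora import Dictionary
from gensim.models import LdaModel

# Toy document set; any tokenized corpus would do.
docs = [
    ["topic", "model", "retrieval", "precision"],
    ["dirichlet", "prior", "parameter", "likelihood"],
    ["document", "retrieval", "topic", "dirichlet"],
]
dictionary = Dictionary(docs)
corpus = [dictionary.doc2bow(doc) for doc in docs]

# Uniform prior: the same constant Dirichlet parameter for every topic.
lda_uniform = LdaModel(corpus, id2word=dictionary, num_topics=2,
                       alpha="symmetric", random_state=0)

# Fitted prior: the Dirichlet parameter is learned from the document
# set during training (gensim's alpha="auto").
lda_fitted = LdaModel(corpus, id2word=dictionary, num_topics=2,
                      alpha="auto", random_state=0)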

Note that "topic" is used in semantic analysis (of various kinds) to mean a set of highly probable words, not in the technical sense of the TMDM or XTM.

Extraction of highly probable words from documents can be useful in the construction of topic maps for those documents.
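
For instance, continuing the gensim sketch above, show_topic() returns the highly probable words for each topic, which could serve as raw material (candidate topic names or occurrences) for a topic map:

# Pull the top-probability words per topic from the fitted model above.
for topic_id in range(lda_fitted.num_topics):
    top_words = lda_fitted.show_topic(topic_id, topn=5)  # (word, probability) pairs
    print(topic_id, [word for word, _ in top_words])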
