Another Word For It Patrick Durusau on Topic Maps and Semantic Diversity

November 22, 2010

A Term Association Inference Model for Single Documents:….

Filed under: Data Mining,Document Classification,Information Retrieval,Summarization — Patrick Durusau @ 6:36 am

A Term Association Inference Model for Single Documents: A Stepping Stone for Investigation through Information Extraction Author(s): Sukanya Manna and Tom Gedeon Keywords: Information retrieval, investigation, Gain of Words, Gain of Sentences, term significance, summarization

Abstract:

In this paper, we propose a term association model which extracts significant terms as well as the important regions from a single document. This model is a basis for a systematic form of subjective data analysis which captures the notion of relatedness of different discourse structures considered in the document, without having a predefined knowledge-base. This is a paving stone for investigation or security purposes, where possible patterns need to be figured out from a witness statement or a few witness statements. This is unlikely to be possible in predictive data mining where the system can not work efficiently in the absence of existing patterns or large amount of data. This model overcomes the basic drawback of existing language models for choosing significant terms in single documents. We used a text summarization method to validate a part of this work and compare our term significance with a modified version of Salton’s [1].

Excellent work that illustrates how re-thinking of fundamental assumptions of data mining can lead to useful results.

Questions:

  1. Create an annotated bibliography of citations to this article.
  2. Citations of items in the bibliography since this paper (2008)? List and annotate.
  3. How would you use this approach with a document archive project? (3-5 pages, no citations)

No Comments

No comments yet.

RSS feed for comments on this post.

Sorry, the comment form is closed at this time.

Powered by WordPress