Another Word For It Patrick Durusau on Topic Maps and Semantic Diversity

March 14, 2012

New index statistics in Lucene 4.0

Filed under: Indexing,Lucene — Patrick Durusau @ 7:35 pm

Mike McCandless writes:

In the past, Lucene recorded only the bare minimal aggregate index statistics necessary to support its hard-wired classic vector space scoring model.

Fortunately, this situation is wildly improved in trunk (to be 4.0), where we have a selection of modern scoring models, including Okapi BM25, Language models, Divergence from Randomness models and Information-based models. To support these, we now save a number of commonly used index statistics per index segment, and make them available at search time.

Mike uses a simple example to illustrate the statistics available in Lucene 4.0.
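
To make the connection between aggregate statistics and scoring concrete, here is a minimal sketch of a BM25-style score computed from a handful of index-wide numbers. It is plain Java and does not call the Lucene API. The class name, method names, and parameter names (docFreq, docCount, sumTotalTermFreq) are hypothetical labels for the kinds of statistics such a model needs, and the k1 = 1.2 and b = 0.75 values are commonly used defaults assumed here, not figures taken from Mike's post.

```java
// Minimal sketch: a BM25-style score built from aggregate index statistics.
// All names are hypothetical; nothing here calls the Lucene API.
final class Bm25Sketch {
    // Commonly used default parameters (an assumption, not from the post).
    static final double K1 = 1.2;
    static final double B = 0.75;

    /** Inverse document frequency from the term's document frequency
     *  and the number of documents that have the field. */
    static double idf(long docFreq, long docCount) {
        return Math.log(1.0 + (docCount - docFreq + 0.5) / (docFreq + 0.5));
    }

    /** Average field length: total term occurrences divided by document count. */
    static double avgFieldLength(long sumTotalTermFreq, long docCount) {
        return (double) sumTotalTermFreq / docCount;
    }

    /** Score of one term in one document.
     *  termFreq and fieldLength are per-document values;
     *  the remaining arguments are index-wide statistics. */
    static double score(int termFreq, int fieldLength,
                        long docFreq, long docCount, long sumTotalTermFreq) {
        double avgLen = avgFieldLength(sumTotalTermFreq, docCount);
        // Length normalization: longer-than-average fields are penalized.
        double norm = K1 * (1.0 - B + B * fieldLength / avgLen);
        return idf(docFreq, docCount) * termFreq * (K1 + 1.0) / (termFreq + norm);
    }

    public static void main(String[] args) {
        // Toy index: 3 documents, 30 term occurrences in total,
        // and a term that appears in exactly 1 document.
        double shortDoc = score(2, 5, 1, 3, 30);
        double longDoc = score(2, 20, 1, 3, 30);
        System.out.println("short document: " + shortDoc);
        System.out.println("long document:  " + longDoc);
    }
}
```

With the same term frequency, the shorter document scores higher than the longer one. That length normalization is only possible when the index records something like the total number of term occurrences per field, which is the sort of statistic the classic vector space model never needed.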
