Another Word For It Patrick Durusau on Topic Maps and Semantic Diversity

March 27, 2012

Result Grouping Made Easier

Filed under: Lucene — Patrick Durusau @ 7:17 pm

Result Grouping Made Easier

From the post:

Lucene has had result grouping for a while now, as a contrib in Lucene 3.x and as a module in the upcoming 4.0 release. In both releases the actual grouping is performed with Lucene Collectors. As a Lucene user you need to use several of these Collectors in searches. However, these Collectors have many constructor arguments, so grouping can become quite cumbersome to use in pure Lucene apps. The example below illustrates this.

(code omitted)

In the above example, basic grouping with caching is used and the group count is also retrieved. As you can see, there is quite a lot of coding involved. Recently a grouping convenience utility has been added to the Lucene grouping module to alleviate this problem. As the code example below illustrates, using the GroupingSearch utility is much easier than interacting with the actual grouping collectors.

Normally the document count is returned as the hit count. However, when groups rather than documents are used as hits, the document count will not work with pagination. For this reason the group count can be used to get correct pagination. The group count returns the number of unique groups matching the query. The group count can in that case be used as the hit count, since the individual hits are groups.
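The pagination point is easy to see with a little arithmetic. What follows is only a minimal sketch in plain Java, not Lucene code: the class `GroupPagination` and its methods `pageCount` and `groupCount` are hypothetical names made up for illustration, and the list of author values stands in for the group field of each matching document.

```java
import java.util.HashSet;
import java.util.List;

// Hypothetical helper, not part of Lucene: shows why paging over groups
// needs the group count rather than the document count.
final class GroupPagination {

    // Number of pages needed to display hitCount hits, pageSize hits per page.
    static int pageCount(int hitCount, int pageSize) {
        if (pageSize <= 0) {
            throw new IllegalArgumentException("pageSize must be positive");
        }
        return (hitCount + pageSize - 1) / pageSize; // ceiling division
    }

    // Number of unique group values among the matching documents.
    // Each list entry is the group field value of one matching document.
    static int groupCount(List<String> groupValuePerDoc) {
        return new HashSet<>(groupValuePerDoc).size();
    }
}
```

Suppose six documents match, with author values smith, smith, jones, smith, lee, jones. With two hits per page, the document count (6) suggests three pages, but there are only three groups to show, so `pageCount(groupCount(authors), 2)` gives two pages. Using the document count would leave the pager pointing at a page with nothing on it.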

There are really two lessons here.

The first lesson is that if you need the GroupingSearch utility, use it.

The second is that Lucene is evolving rapidly enough that, if you are a regular user, you need to monitor developments and releases carefully.
