Flexible Searching with Solr and Sunspot


Mike Pack writes:

Just about every type of datastore has some form of indexing. A typical relational database, such as MySQL or PostgreSQL, can index fields for efficient querying. Most document databases, like MongoDB, contain indexing as well. Indexing in a relational database is almost always done for one reason: speed. However, sometimes you need more than just speed, you need flexibility. That’s where Solr comes in.

In this article, I want to outline how Solr can benefit your project’s indexing capabilities. I’ll start by introducing indexing and expand to show how Solr can be used within a Rails application.

If you are a Ruby fan (or not), this post is a nice introduction to some of the power of Solr for indexing.

At the same time, it is a poster child for what is inflexible about Solr query expansion.

Mike uses the following example for synonyms/query expansion:

# citi is the stem of cities
citi => city

# copi is the stem of copies
copi => copy

Well, that works, no doubt, provided each expansion holds uniformly across the whole body of texts. Depending on the size of the collection, that may or may not be the case: the same string would have to expand the same way everywhere it occurs.

We could say:

# cop is a synonym for the police
cop => police

Meanwhile, elsewhere in the collection we need:

# cop is the stem of copulate
cop => copulate

Without more properties to distinguish the two (or more) cases, we are going to get false positives in one case or the other.
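
To make the point concrete, here is a minimal Ruby sketch. It is not Solr or Sunspot code: the tables, the "subject" labels, and both helper methods are hypothetical names invented for illustration. It only shows that a single global table can hold one expansion per string, while keying the table on one more property keeps the two cases apart.

```ruby
# Hypothetical, context-free expansion table in the spirit of a synonyms file.
# A hash holds one value per key, so "cop" can expand only one way.
GLOBAL_EXPANSIONS = { "cop" => "police" }.freeze

# Hypothetical scoped tables: an extra property (a subject label, assumed
# here to be known for each text) selects which rule applies.
SCOPED_EXPANSIONS = {
  "crime"   => { "cop" => "police" },
  "biology" => { "cop" => "copulate" }
}.freeze

# Expand a term with the global table, ignoring where the term appears.
def expand_globally(term)
  GLOBAL_EXPANSIONS.fetch(term, term)
end

# Expand a term with the table for the given subject.
# Unknown subjects or terms pass through unchanged.
def expand_in_scope(term, subject)
  SCOPED_EXPANSIONS.fetch(subject, {}).fetch(term, term)
end

puts expand_globally("cop")             # => police (wrong for a biology text)
puts expand_in_scope("cop", "crime")    # => police
puts expand_in_scope("cop", "biology")  # => copulate
```

With only the global table, every biology text that contains "cop" is expanded to "police", which is the false positive described above. The scoped version avoids it, but only because something outside the string itself says which case applies.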
