Flexible Searching with Solr and Sunspot.
Mike Pack writes:
Just about every type of datastore has some form of indexing. A typical relational database, such as MySQL or PostgreSQL, can index fields for efficient querying. Most document databases, like MongoDB, contain indexing as well. Indexing in a relational database is almost always done for one reason: speed. However, sometimes you need more than just speed, you need flexibility. That’s where Solr comes in.
In this article, I want to outline how Solr can benefit your project’s indexing capabilities. I’ll start by introducing indexing and expand to show how Solr can be used within a Rails application.
If you are a Ruby fan (or not), this post is a nice introduction to some of the power of Solr for indexing.
At the same time, it is a poster child for what is inflexible about Solr query expansion.
Mike uses the following example for synonyms/query expansion:
# citi is the stem of cities
citi => city
# copi is the stem of copies
copi => copy
Well, that works, no doubt, if those expansions are uniform across a body of texts, that is, if a given string always expands the same way wherever it occurs. Depending on the size of the collection, that may or may not be the case.
We could say:
# cop is a synonym for the police
cop => police
Meanwhile, elsewhere in the collection we need:
# cop is the stem of copulate
cop => copulate
Without more properties to distinguish the two (or more) cases, we are going to get false positives in one case or the other.
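To make the false-positive problem concrete, here is a minimal Python sketch. It is not Solr's or Sunspot's API. The mapping tables, the function names, and the "domain" property are all hypothetical, invented only to illustrate how a single global mapping goes wrong where a mapping keyed on one extra property does not.

```python
# Hypothetical global mapping, in the spirit of a single synonyms file:
# every occurrence of "cop" expands the same way, everywhere.
GLOBAL_SYNONYMS = {"cop": "police"}

# Hypothetical per-domain mappings: one extra property (the domain)
# selects which expansion applies to a given text.
DOMAIN_SYNONYMS = {
    "law_enforcement": {"cop": "police"},
    "biology": {"cop": "copulate"},
}


def expand_global(tokens):
    """Expand tokens with the single collection-wide mapping."""
    return [GLOBAL_SYNONYMS.get(token, token) for token in tokens]


def expand_by_domain(tokens, domain):
    """Expand tokens with the mapping chosen by the text's domain."""
    mapping = DOMAIN_SYNONYMS.get(domain, {})
    return [mapping.get(token, token) for token in tokens]


# Tokens from a (made-up) biology text where "cop" is the stem of copulate.
biology_tokens = ["insect", "cop", "behavior"]

# The global mapping yields a false positive for "police".
print(expand_global(biology_tokens))
# -> ['insect', 'police', 'behavior']

# With the domain as a distinguishing property, the expansion is correct.
print(expand_by_domain(biology_tokens, "biology"))
# -> ['insect', 'copulate', 'behavior']
```

The domain is only one candidate property. Anything that reliably separates the two senses would serve the same purpose.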