A little over 18 months ago we talked to Jake Luciani about Lucandra – a Cassandra-based Lucene backend. Since then Jake has moved away from raw Lucene and married Cassandra with Solr, which is why Lucandra now goes by Solandra. Let’s see what Jake and Solandra are up to these days.
What is the current status of Solandra in terms of features and stability?
Solandra has gone through a few iterations. The first was Lucandra, which partitioned data by term and used Thrift to communicate with Cassandra. This worked for a few big use cases, mainly managing an index per user, and it garnered a number of adopters. But it performed poorly on very large indexes with many dense terms, because of the number and size of the remote calls needed to fulfill a query.
Last summer I started on a new approach based on Solr that would address Lucandra’s shortcomings: Solandra. The core idea of Solandra is to use Cassandra as a foundation for scaling Solr. It achieves this by embedding Solr in the Cassandra runtime and using the Cassandra routing layer to auto-shard an index across the ring (by document). This means good random distribution of data for writes (using Cassandra’s RandomPartitioner) and good search performance, since individual shards can be searched in parallel across nodes (using SolrDistributedSearch). Cassandra is responsible for sharding, replication, failover and compaction. The end user now gets a single scalable search component that scales in the background, without any change to the APIs. And since search functionality is performed by Solr, it will support anything Solr does.
I gave a talk recently on Solandra and how it works: http://blip.tv/datastax/scaling-solr-with-cassandra-5491642
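To make the routing idea in Jake's answer more concrete, here is a toy Python sketch of document-level sharding on a hash ring, followed by a scatter-gather search over the shards. It is not Solandra's code and touches neither Cassandra nor Solr: the `ToyRing` class, its method names, the evenly spaced node tokens and the term-count scoring are all assumptions made for illustration. The only detail borrowed from the real system is the general idea of hashing a key (here with MD5) to pick a position on the ring.

```python
import bisect
import hashlib
from concurrent.futures import ThreadPoolExecutor

RING_SIZE = 2 ** 127  # size of the toy token space (an assumption of this sketch)


def token_for(key: str) -> int:
    """Hash a key with MD5 and fold it into the toy token space."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % RING_SIZE


class ToyRing:
    """Hypothetical in-memory stand-in for a ring of search nodes."""

    def __init__(self, node_names):
        self.nodes = list(node_names)
        # Give each node one evenly spaced token on the ring.
        step = RING_SIZE // len(self.nodes)
        self.tokens = [i * step for i in range(len(self.nodes))]
        # One "shard" (doc_id -> text) per node.
        self.shards = {name: {} for name in self.nodes}

    def owner(self, doc_id: str) -> str:
        """Return the node with the first token >= the document's token, wrapping around."""
        idx = bisect.bisect_left(self.tokens, token_for(doc_id)) % len(self.nodes)
        return self.nodes[idx]

    def index(self, doc_id: str, text: str) -> None:
        """Route a document to exactly one shard based on its id."""
        self.shards[self.owner(doc_id)][doc_id] = text

    def _search_shard(self, node: str, term: str):
        """Score documents in one shard by how often the term occurs."""
        hits = []
        for doc_id, text in self.shards[node].items():
            score = text.lower().split().count(term.lower())
            if score:
                hits.append((score, doc_id))
        return hits

    def search(self, term: str, limit: int = 10):
        """Query every shard in parallel, then merge the partial results by score."""
        with ThreadPoolExecutor(max_workers=len(self.nodes)) as pool:
            partials = list(pool.map(lambda n: self._search_shard(n, term), self.nodes))
        merged = [hit for part in partials for hit in part]
        merged.sort(key=lambda hit: (-hit[0], hit[1]))  # best score first, ties by id
        return [doc_id for _, doc_id in merged[:limit]]


if __name__ == "__main__":
    ring = ToyRing(["node-a", "node-b", "node-c"])
    ring.index("doc-1", "solr on cassandra scales solr")
    ring.index("doc-2", "solr alone")
    ring.index("doc-3", "raw lucene")
    print(ring.search("solr"))
```

Because each document lands on exactly one node, writes spread across the ring, and a query costs one request per shard plus a merge step, rather than one remote call per term as in the term-partitioned design described earlier.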