Another Word For It Patrick Durusau on Topic Maps and Semantic Diversity

September 15, 2011

Solr Digest, Spring-Summer 2011, Part 2: Solr Cloud and Near Real Time Search

Filed under: Lucene,Solr — Patrick Durusau @ 7:51 pm

Solr Digest, Spring-Summer 2011, Part 2: Solr Cloud and Near Real Time Search

Just to tempt you to read the rest of the post:

As promised in Part 1 of Solr Digest, Spring-Summer 2011, in this Part 2 post we’ll summarize what’s new with Solr’s Near-Real-Time Search support and Solr Cloud (if you love clouds and search with some big data on the side, get in touch). Let’s first examine what is being worked on for Solr Cloud and what else is in the queue for the near future. A good overview of what is currently functional can be found in the old Solr Cloud wiki page. Also, there is now another wiki page covering New Solr Cloud Design, which we find quite useful. The individual pieces of Solr Cloud functionality that are being worked on are as follows:

  • Work is still in progress on Distributed Indexing and Shard distribution policy. Patches exist, although they are now over 6 months old, so you can expect to see them updated soon.
  • As part of the Distributed Indexing effort, shard leader functionality deals with leader election and with publishing, in ZooKeeper, the information about which node is the leader of which shard, in order to notify all interested parties. The development is pretty active here and initial patches already exist.
  • At some point in the future, Replication Handler may become cloud aware, which means it should be possible to switch the roles of masters and slaves, master URLs will be able to change based on cluster state, etc. The work hasn’t started on this issue.
  • Another feature Solr Cloud will have is automatic Splitting and migrating of Indices. The idea is that when some shard’s index becomes too large or the shard itself starts having bad query response times, we should be able to split off parts of that index and migrate them to (or merge them with) indices on other (less loaded) nodes. Again, the work on this hasn’t started yet. Once this is implemented, one will be able to split and move/merge indices using the Solr Core Admin as described in SOLR-2593.
  • To achieve more efficiency in search and gain control over where exactly each document gets indexed, you will be able to define a custom shard lookup mechanism. This way, you’ll be able to limit execution of search requests to only those shards that are known to hold target documents, thus making the query more efficient and faster. This, along with the above-mentioned shard distribution policy, is akin to the routing functionality in ElasticSearch.
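To make the last item concrete, here is a minimal sketch of the idea behind a custom shard lookup. It is plain Python, not Solr or ElasticSearch code: `shard_for`, `ShardedIndex`, and the routing keys are hypothetical names invented for illustration, and the hash-modulo policy is just one assumed way a lookup could work.

```python
import hashlib


def shard_for(routing_key: str, num_shards: int) -> int:
    """Map a routing key to a shard number with a stable hash (assumed policy)."""
    digest = hashlib.md5(routing_key.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_shards


class ShardedIndex:
    """Toy in-memory stand-in for a set of shards; not a real Solr API."""

    def __init__(self, num_shards: int):
        # Each shard is a dict of doc_id -> document text.
        self.shards = [dict() for _ in range(num_shards)]

    def add(self, doc_id: str, routing_key: str, text: str) -> None:
        # Index time: the lookup decides which shard stores the document.
        shard = shard_for(routing_key, len(self.shards))
        self.shards[shard][doc_id] = text

    def search(self, term: str, routing_key: str = None):
        # Query time: with a routing key only one shard is consulted;
        # without one, the query fans out to every shard.
        if routing_key is None:
            targets = list(range(len(self.shards)))
        else:
            targets = [shard_for(routing_key, len(self.shards))]
        hits = sorted(
            doc_id
            for shard in targets
            for doc_id, text in self.shards[shard].items()
            if term in text.split()
        )
        return hits, len(targets)


index = ShardedIndex(num_shards=4)
index.add("a1", "user-a", "solr cloud notes")
index.add("a2", "user-a", "zookeeper leader election")
index.add("b1", "user-b", "solr replication handler")

routed_hits, routed_shards = index.search("solr", routing_key="user-a")
all_hits, all_shards = index.search("solr")
print(routed_hits, routed_shards)
print(all_hits, all_shards)
```

The routed query touches a single shard, while the unrouted one visits all four. That difference is the efficiency gain the excerpt describes. Note that a routed search can still return documents stored under other keys that happen to hash to the same shard, so the routing key narrows where to look, not what matches.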

Isn’t that an amazing level of activity? I get tired just reading about it. 😉 Now if it can just be applied as cleverly as it has been written.

BTW, see Part 1 if you are interested.
