Building Scalable Search from Scratch with ElasticSearch by Ram Viswanadha.
From the post:
1 Introduction
Savvy is an online community for the world’s product enthusiasts. Our communities are the product trendsetters that the rest of the world follows. Across the site, our users are able to compare products, ask and answer product questions, share product reviews, and generally share their product interests with one another. Savvy1.com boasts a vibrant community that saves products on the site at the rate of 1 product every second. We wanted to provide a search bar that can search across various entities in the system – users, products, coupons, collections, etc. – and return the results in a timely fashion.
2 Requirements
The search server should satisfy the following requirements:
- Full Text Search: The ability to not only return documents that contain the exact keywords, but also documents that contain words that are related or relevant to the keywords.
- Clustering: The ability to distribute data across multiple nodes for load balancing and efficient searching.
- Horizontal Scalability: The ability to increase the capacity of the cluster by adding more nodes.
- Read and Write Efficiency: Since our application is both read and write heavy, we need a system that allows for high write loads and efficient read times on heavy read loads.
- Fault Tolerant: The loss of any node in the cluster should not affect the stability of the cluster.
- REST API with JSON: The server should support a REST API using JSON for input and output.
At the time, we looked at Sphinx, Solr and ElasticSearch. The only system that satisfied all of the above requirements was ElasticSearch, and — to sweeten the deal — ElasticSearch provided a way to efficiently ingest and index data in our MongoDB database via the River API so we could get up and running quickly.
…
If you need an outline for building a basic ElasticSearch system, this is it!
It has the advantage of introducing you to a number of other web technologies that will be handy with ElasticSearch.
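To make the "REST API with JSON" requirement from the excerpt a little more concrete, here is a minimal sketch (mine, not from the post) of what a single search across several entity types could look like. It only builds the HTTP request and does not send it. The index names (`users`, `products`, `coupons`, `collections`), the field names (`name`, `description`), and the host are assumptions for illustration; the post's actual mappings may differ.

```python
import json
from urllib.request import Request


def build_search_request(keywords, indices, fields, host="http://localhost:9200"):
    """Build (but do not send) a JSON search request spanning several indices.

    The index and field names passed in are hypothetical; substitute the ones
    your cluster actually uses.
    """
    # A comma-separated list of indices in the path searches all of them at once.
    url = f"{host}/{','.join(indices)}/_search"

    # multi_match runs the same full-text query against several fields.
    body = {
        "query": {
            "multi_match": {
                "query": keywords,
                "fields": fields,
            }
        },
        "size": 10,  # return at most ten hits
    }

    return Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_search_request(
    "running shoes",
    indices=["users", "products", "coupons", "collections"],  # assumed names
    fields=["name", "description"],  # assumed names
)
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

Passing `request` to `urllib.request.urlopen` would send it, which of course needs a running cluster at the assumed address.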
Enjoy!