Cloudera Search over Apache HBase: A Story of Collaboration by Steven Noels.
Great background story on the development of triggers and indexing updates for Apache HBase by NGDATA (for their Lily product) and that underlies Cloudera Search.
From the post:
In this most recent edition, we introduced an order of magnitude performance improvement: a cleaner, more efficient, and fault-tolerant code path with no write performance penalty on HBase. In the interest of modularity, we decoupled the trigger and indexing component from Lily, making it into a stand-alone, collaborative open source project that is now underpinning both Cloudera Search HBase support as well as Lily.
This made sense for us, not just because we believe in HBase and its community but because our customers in Banking, Media, Pharma and Telecom have unqualified expectations for both the scalability and resilience of Lily. Outsourcing some part of that responsibility towards the infrastructure tier is efficient for us. We are very pleased with the collaboration, innovation, and quality that Cloudera has produced by working with us and look forward to a continued relationship that combines joint development in a community oriented way with responsible stewardship of the infrastructure code base we build upon.
Our HBase Triggering and Indexing software can be found on GitHub at:
https://github.com/NGDATA/hbase-sep
https://github.com/NGDATA/hbase-indexerDo you have any indexing or update side-effect needs for HBase? Tell us your thoughts on this solution.