HBase Coprocessors – Deploy shared functionality directly on the cluster, 4 November 2011, 10 AM PT, by Lars George.
From the announcement:
The newly added feature of Coprocessors within HBase allows the application designer to move functionality closer to where the data resides. While this sounds like Stored Procedures as known in the RDBMS realm, they have a different set of properties. The distributed nature of HBase adds to the complexity of their implementation, but the client side API allows for an easy, transparent access to their functionality across many servers. This session explains the concepts behind coprocessors and uses examples to show how they can be used to implement data side extensions to the application code.
For background material, you probably want to review:
Advanced HBase by Lars George (courtesy of Alex Popescu’s myNoSQL site). It takes until slide 72 or so to reach coprocessors, but you will learn a lot along the way.
Extending Query support via Coprocessor endpoints, which summarizes the uses of coprocessors as:
Coprocessors can be used for
a) observing server side operations (like the administrative kinds such as Region splits, major-minor compactions, etc.), and
b) client side operations that are eventually triggered on to the Region servers (like CRUD operations).
Another use case is letting the end user to deploy his own code (some user defined functionality) and directly invoking it from the client interface (HTable). The later functionality is called as Coprocessor Endpoints. [I introduced some paragraphing to make this more readable.]
If you have a copy of HBase: The Definitive Guide, review pages 175-199.
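To make the observer/endpoint distinction in the quoted summary a little more concrete, here is a toy model in plain Java. It deliberately does not use the HBase API: `Region`, `PutObserver`, `Endpoint` and `sumAcrossRegions` are hypothetical names invented for this sketch, and a real coprocessor would implement the interfaces HBase provides instead. The observer hooks into a write before it is applied, while the endpoint is user code that runs next to each region's data and returns a partial result for the client to merge.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class CoprocessorSketch {

    /** Hypothetical observer hook, called before a put is applied to a region. */
    interface PutObserver {
        // Returning false vetoes the write.
        boolean prePut(String row, long value);
    }

    /** Hypothetical endpoint: user code executed against one region's data. */
    interface Endpoint<R> {
        R call(Map<String, Long> regionData);
    }

    /** Toy stand-in for a region: a sorted slice of rows plus its observers. */
    static class Region {
        private final TreeMap<String, Long> rows = new TreeMap<>();
        private final List<PutObserver> observers = new ArrayList<>();

        void register(PutObserver observer) {
            observers.add(observer);
        }

        /** Applies the put only if every registered observer allows it. */
        boolean put(String row, long value) {
            for (PutObserver observer : observers) {
                if (!observer.prePut(row, value)) {
                    return false;
                }
            }
            rows.put(row, value);
            return true;
        }

        /** Runs the endpoint "server side", handing it a read-only view of the rows. */
        <R> R exec(Endpoint<R> endpoint) {
            return endpoint.call(Collections.unmodifiableMap(rows));
        }
    }

    /** Toy stand-in for the client: invokes the endpoint on each region and merges the results. */
    static long sumAcrossRegions(List<Region> regions) {
        Endpoint<Long> sum = data -> {
            long partial = 0;
            for (long v : data.values()) {
                partial += v;
            }
            return partial;
        };
        long total = 0;
        for (Region region : regions) {
            total += region.exec(sum);
        }
        return total;
    }

    public static void main(String[] args) {
        Region a = new Region();
        Region b = new Region();
        PutObserver rejectNegative = (row, value) -> value >= 0;
        a.register(rejectNegative);
        b.register(rejectNegative);

        a.put("row-1", 10);
        a.put("row-2", -5); // vetoed by the observer, never stored
        b.put("row-9", 32);

        System.out.println(sumAcrossRegions(Arrays.asList(a, b))); // prints 42
    }
}
```

The point of the endpoint half is that only one number per region travels back to the client, rather than every row. That is the "move functionality closer to where the data resides" idea from the announcement, reduced to its simplest form.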