Using Hive to interact with HBase, Part 1 by Nick Dimiduk.
From the post:
This is the first of two posts examining the use of Hive for interaction with HBase tables. Check back later in the week for the concluding article.
One of the things I’m frequently asked about is how to use HBase from Apache Hive. Not just how to do it, but what works, how well it works, and how to make good use of it. I’ve done a bit of research in this area, so hopefully this will be useful to someone besides myself. This is a topic that we did not get to cover in HBase in Action; perhaps these notes will become the basis for the 2nd edition 😉 These notes are applicable to Hive 0.11.x used in conjunction with HBase 0.94.x. They should be largely applicable to 0.12.x + 0.96.x, though I haven’t tested everything yet.
The Hive project includes an optional library for interacting with HBase. This is where the bridge layer between the two systems is implemented. The primary interface you use when accessing HBase from Hive queries is called the HBaseStorageHandler. You can also interact with HBase tables directly via Input and Output formats, but the handler is simpler and works for most uses.
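To make the storage handler concrete, here is a minimal sketch (not taken from the post) of the kind of Hive DDL that maps a Hive table onto an existing HBase table. The handler class name and the `hbase.columns.mapping` and `hbase.table.name` properties come from Hive's HBase integration; the helper function, the table names, the column names, and the column family `f` are hypothetical and chosen only for illustration. The snippet only builds the statement as a string; running it against a real cluster is left to whatever Hive client you use.

```python
def hbase_backed_table_ddl(hive_table, hbase_table, columns, mapping):
    """Build a Hive DDL statement for an external table backed by HBase.

    columns: list of (hive_column_name, hive_type) pairs.
    mapping: one HBase mapping entry per Hive column, in the same order;
             ':key' stands for the HBase row key, 'family:qualifier' for a cell.
    """
    if len(columns) != len(mapping):
        raise ValueError("each Hive column needs exactly one HBase mapping entry")
    column_defs = ", ".join(f"{name} {hive_type}" for name, hive_type in columns)
    return (
        f"CREATE EXTERNAL TABLE {hive_table} ({column_defs})\n"
        "STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'\n"
        f"WITH SERDEPROPERTIES ('hbase.columns.mapping' = '{','.join(mapping)}')\n"
        f"TBLPROPERTIES ('hbase.table.name' = '{hbase_table}')"
    )


# Hypothetical table and column names, for illustration only.
ddl = hbase_backed_table_ddl(
    hive_table="pagecounts_hbase",
    hbase_table="pagecounts",
    columns=[("rowkey", "STRING"), ("pageviews", "STRING"), ("bytes", "STRING")],
    mapping=[":key", "f:c1", "f:c2"],
)
print(ddl)
```

The generated statement assumes the HBase table already exists, which is why it uses `CREATE EXTERNAL TABLE`; the order of entries in the mapping has to match the order of the Hive columns.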
…
If you want to be on the edge of Hive/HBase interaction, start here.
Be forewarned: much of what is known here lives in folklore, JIRA issues, and the like, but you will be ahead of the less brave.