New in Cloudera Labs: SparkOnHBase by Ted Malaska.
From the post:
Apache Spark is making a huge impact across our industry, changing the way we think about batch processing and stream processing. However, as we progressively migrate from MapReduce toward Spark, we shouldn’t have to “give up” anything. One of those capabilities we need to retain is the ability to interact with Apache HBase.
In this post, we will share the work being done in Cloudera Labs to make integrating Spark and HBase super-easy in the form of the SparkOnHBase project. (As with everything else in Cloudera Labs, SparkOnHBase is not supported and there is no timetable for possible support in the future; it’s for experimentation only.) You’ll learn common patterns of HBase integration with Spark and see Scala and Java examples for each. (It may be helpful to have the SparkOnHBase repository open as you read along.)
…
Is it too late to amend my wish list to include an eighty-hour week with Spark? 😉
This is an excellent opportunity to follow along with lab-quality research on an important technology.
The Cloudera Labs discussion group strikes me as dreadfully underused.
Enjoy!