Crunch for Dummies by Brock Noland
From the post:
This guide is intended to be an introduction to Crunch.
Crunch is used for processing data. Crunch builds on top of Apache Hadoop to provide a simpler interface for Java programmers to process data. In Crunch you create pipelines, not unlike Unix pipelines, such as the command below:
Interesting coverage of Crunch.
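For a rough feel of the pipeline idea, here is a minimal sketch in plain Java streams. It is not the Crunch API and not the command from the post; the class name `PipelineSketch` and the helper `countWords` are made up for illustration. Each stage feeds the next, much as `tr | sort | uniq -c` would on the command line.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

// Illustration only: a word count written as a chain of stages,
// using java.util.stream rather than Crunch or Hadoop.
public class PipelineSketch {

    // Hypothetical helper: split lines into words, drop empty tokens,
    // then count occurrences of each word (sorted by word via TreeMap).
    public static Map<String, Long> countWords(List<String> lines) {
        return lines.stream()
                .flatMap(line -> Arrays.stream(line.split("\\s+"))) // stage 1: tokenize
                .filter(word -> !word.isEmpty())                    // stage 2: drop blanks
                .collect(Collectors.groupingBy(                     // stage 3: group and count
                        word -> word, TreeMap::new, Collectors.counting()));
    }

    public static void main(String[] args) {
        List<String> lines = List.of("to be or not to be", "to each his own");
        // Print "count word" pairs, one per line.
        countWords(lines).forEach((word, n) -> System.out.println(n + " " + word));
    }
}
```

As I understand it, a Crunch pipeline has the same read, transform, aggregate shape, but the stages run as Hadoop jobs instead of in memory.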
I don’t know that I agree with the characterization:
… using Hadoop …. require[s] learning a complex process called MapReduce or a higher level language such as Apache Hive or Apache Pig.
True, using Hadoop means learning MapReduce, Hive, or Pig, but I don’t think of them as being all that complex. Besides, once you have learned them, the benefits are considerable.
But to each his own.
You might also be interested in: Introducing Crunch: Easy MapReduce Pipelines for Hadoop.