Archive for the ‘Scrunch’ Category

A Quick Guide to Hadoop Map-Reduce Frameworks

Thursday, February 7th, 2013

A Quick Guide to Hadoop Map-Reduce Frameworks by Alex Popescu.

Alex has assembled links to guides to MapReduce frameworks.

Thanks Alex!

Hive, Pig, Scalding, Scoobi, Scrunch and Spark

Tuesday, March 27th, 2012

Hive, Pig, Scalding, Scoobi, Scrunch and Spark by Sami Badawi.

From the post:

Comparison of Hadoop Frameworks

I had to do simple processing of log files in a Hadoop cluster. Writing Hadoop MapReduce classes in Java is the assembly code of Big Data. There are several high-level Hadoop frameworks that make Hadoop programming easier. Here is the list of Hadoop frameworks I tried:

  • Pig
  • Scalding
  • Scoobi
  • Hive
  • Spark
  • Scrunch
  • Cascalog

The task was to read log files, join them with other data, and do some statistics on arrays of doubles. Programming this without Hadoop is simple, but it caused me some grief with Hadoop.

This blog post is not a full review, but my first impression of these Hadoop frameworks.
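
To make the task in the excerpt concrete, here is a minimal sketch of the non-Hadoop version in plain Python. It is not taken from Sami's post. The log format (a user id, a tab, then comma-separated doubles), the user-to-region lookup used for the join, and the chosen statistics (count, mean, max) are all assumptions made for illustration.

```python
import statistics


def parse_log_line(line):
    """Parse an assumed 'user_id<TAB>v1,v2,...' line into (user_id, [floats])."""
    user_id, raw_values = line.rstrip("\n").split("\t", 1)
    return user_id, [float(v) for v in raw_values.split(",") if v]


def summarize(log_lines, regions):
    """Join log records to a user -> region lookup and compute per-region stats.

    `regions` stands in for the "other data" in the task; it is a hypothetical
    in-memory dict, so this is an inner join that drops unmatched users.
    """
    by_region = {}
    for line in log_lines:
        if not line.strip():
            continue  # skip blank lines
        user_id, values = parse_log_line(line)
        region = regions.get(user_id)
        if region is None:
            continue  # no matching row in the lookup data
        by_region.setdefault(region, []).extend(values)
    return {
        region: {
            "count": len(vals),
            "mean": statistics.mean(vals),
            "max": max(vals),
        }
        for region, vals in by_region.items()
        if vals
    }


# Hypothetical sample data, only to show the shape of the inputs.
logs = ["u1\t1.0,2.0,3.0", "u2\t10.0", "u1\t4.0", "u3\t7.5"]
regions = {"u1": "east", "u2": "west"}
result = summarize(logs, regions)
print(result)
```

On a single machine the join is a dictionary lookup and the statistics are one pass over a list. In a Hadoop framework the same three steps (parse, join, aggregate) have to be expressed as distributed operations, which is where the frameworks listed above differ.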

Everyone has a favorite use case.

How does your use case fare with different frameworks for Hadoop? (We won’t ever know if you don’t say.)