Archive for the ‘Scoobi’ Category

A Quick Guide to Hadoop Map-Reduce Frameworks

Thursday, February 7th, 2013

A Quick Guide to Hadoop Map-Reduce Frameworks by Alex Popescu.

Alex has assembled links to guides to MapReduce frameworks:

Thanks Alex!

Why Hadoop MapReduce needs Scala

Thursday, March 29th, 2012

Why Hadoop MapReduce needs Scala – A look at Scoobi and Scalding DSLs for Hadoop by Age Mooij.

Fairly sparse slide deck but enough to get you interested enough to investigate what Scoobi and Scalding have to offer.

It may just be me but I find it easier to download the PDFs if I want to view code. The font/color just isn’t readable with online slides. Suggestion: Always allow for downloads of your slides as PDF files.

FYI:

Scoobi

Scalding

Hive, Pig, Scalding, Scoobi, Scrunch and Spark

Tuesday, March 27th, 2012

Hive, Pig, Scalding, Scoobi, Scrunch and Spark by Sami Badawi.

From the post:

Comparison of Hadoop Frameworks

I had to do simple processing of log files in a Hadoop cluster. Writing Hadoop MapReduce classes in Java is the assembly code of Big Data. There are several high level Hadoop frameworks that make Hadoop programming easier. Here is the list of Hadoop frameworks I tried:

  • Pig
  • Scalding
  • Scoobi
  • Hive
  • Spark
  • Scrunch
  • Cascalog

The task was to read log files join with other data do some statistics on arrays of doubles. Programming this without Hadoop is simple, but caused me some grief with Hadoop.

This blog post is not a full review, but my first impression of these Hadoop frameworks.

Everyone has a favorite use case.

How does your use case fare with different frameworks for Hadoop? (We won’t ever know if you don’t say.)