Another Word For It Patrick Durusau on Topic Maps and Semantic Diversity

February 16, 2012

Cascading

Filed under: Cascading,Hadoop,MapReduce — Patrick Durusau @ 7:02 pm

Cascading

Since Cascading got called out today in the graph partitioning posts, I thought it would not hurt to point it out.

From the webpage:

Cascading is a Data Processing API, Process Planner, and Process Scheduler used for defining and executing complex, scale-free, and fault tolerant data processing workflows on an Apache Hadoop cluster. All without having to ‘think’ in MapReduce.

Cascading is a thin Java library and API that sits on top of Hadoop’s MapReduce layer and is executed from the command line like any other Hadoop application.

As a library and API that can be driven from any JVM based language (Jython, JRuby, Groovy, Clojure, etc.), developers can create applications and frameworks that are “operationalized”. That is, a single deployable Jar can be used to encapsulate a series of complex and dynamic processes all driven from the command line or a shell. Instead of using external schedulers to glue many individual applications together with XML against each individual command line interface.

The Cascading API approach dramatically simplifies development, regression and integration testing, and deployment of business critical applications on both Amazon Web Services (like Elastic MapReduce) or on dedicated hardware.
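To make the "without having to 'think' in MapReduce" point a little more concrete, here is a rough sketch of the idea in plain Java. It is not Cascading code and uses none of Cascading's classes; the names `WordCountSketch`, `splitWords`, `countGroups`, and `countWords` are hypothetical, made up for illustration. It only shows the general style the webpage describes: a word count written as a short chain of named steps (split each line, group by word, count each group) rather than as hand-written mapper and reducer classes. It runs in memory on a single JVM, so it says nothing about how Cascading plans or schedules jobs on a Hadoop cluster.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Function;
import java.util.stream.Collectors;

// Hypothetical illustration only: none of these names come from Cascading.
class WordCountSketch {

    // Step 1: split every line into lowercase words, dropping empty tokens.
    static List<String> splitWords(List<String> lines) {
        return lines.stream()
                .flatMap(line -> Arrays.stream(line.toLowerCase().split("\\W+")))
                .filter(word -> !word.isEmpty())
                .collect(Collectors.toList());
    }

    // Step 2: group identical words together and count each group.
    // A TreeMap keeps the result sorted by word.
    static Map<String, Long> countGroups(List<String> words) {
        return words.stream()
                .collect(Collectors.groupingBy(
                        Function.identity(), TreeMap::new, Collectors.counting()));
    }

    // The whole workflow is an ordinary composition of the two steps.
    static Map<String, Long> countWords(List<String> lines) {
        return countGroups(splitWords(lines));
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "to be or not to be",
                "that is the question");
        System.out.println(countWords(lines));
    }
}
```

The workflow reads as a sequence of data operations that can be packaged in one jar and launched from the command line, which is the gist of the description quoted above. For the real thing, see the Cascading documentation.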

