Another Word For It Patrick Durusau on Topic Maps and Semantic Diversity

July 23, 2011

Introduction to Oozie

Filed under: Hadoop,MapReduce,Oozie,Pig — Patrick Durusau @ 3:10 pm

Introduction to Oozie

From the post:

Tasks performed in Hadoop sometimes require multiple Map/Reduce jobs to be chained together to complete their goal. [1] Within the Hadoop ecosystem, there is a relatively new component, Oozie [2], which allows one to combine multiple Map/Reduce jobs into a logical unit of work that accomplishes the larger task. In this article we will introduce Oozie and some of the ways it can be used.

What is Oozie?

Oozie is a Java web application that runs in a Java servlet container (Tomcat) and uses a database to store:

  • Workflow definitions
  • Currently running workflow instances, including instance states and variables

An Oozie workflow is a collection of actions (e.g. Hadoop Map/Reduce jobs, Pig jobs) arranged in a control dependency DAG (Directed Acyclic Graph), which specifies the order in which the actions execute. This graph is specified in hPDL (an XML Process Definition Language).

Workflow management for Hadoop!
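
To make the "control dependency DAG" idea from the quoted post concrete, here is a minimal sketch in plain Python. It is not Oozie's API and not hPDL (Oozie workflows are written in XML). The action names and the execution_order helper are hypothetical and only illustrate how dependencies between actions determine a valid execution order, and why the graph must be acyclic.

```python
from graphlib import TopologicalSorter, CycleError  # standard library, Python 3.9+

# Hypothetical workflow: each key is an action, each value is the set of
# actions that must finish before it may start (its control dependencies).
workflow = {
    "clean-logs": set(),             # e.g. a Map/Reduce job
    "count-visits": {"clean-logs"},  # e.g. a second Map/Reduce job
    "summarize": {"count-visits"},   # e.g. a Pig job
}


def execution_order(dependencies):
    """Return one valid order of actions, or raise ValueError on a cycle."""
    try:
        return list(TopologicalSorter(dependencies).static_order())
    except CycleError as err:
        # err.args[1] holds the list of nodes that form the detected cycle.
        raise ValueError(f"workflow is not acyclic: {err.args[1]}") from err


order = execution_order(workflow)
print(order)
```

A workflow in which two actions depend on each other has no valid order, so the helper rejects it. That is the property the "acyclic" in DAG guarantees.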
