From the post:
Tasks performed in Hadoop sometimes require multiple Map/Reduce jobs to be chained together to complete its goal. [1] Within the Hadoop ecosystem, there is a relatively new component Oozie [2], which allows one to combine multiple Map/Reduce jobs into a logical unit of work, accomplishing the larger task. In this article we will introduce Oozie and some of the ways it can be used.
What is Oozie ?
Oozie is a Java Web-Application that runs in a Java servlet-container – Tomcat and uses a database to store:
- Workflow definitions
- Currently running workflow instances, including instance states and variables
Oozie workflow is a collection of actions (i.e. Hadoop Map/Reduce jobs, Pig jobs) arranged in a control dependency DAG (Direct Acyclic Graph), specifying a sequence of actions execution. This graph is specified in hPDL (a XML Process Definition Language).
Workflow management for Hadoop!