Another Word For It Patrick Durusau on Topic Maps and Semantic Diversity

November 23, 2011

Apache Hadoop 0.23 is Here!

Filed under: Hadoop,MapReduce — Patrick Durusau @ 7:38 pm

Apache Hadoop 0.23 is Here! by Arun Murthy.

Arun isolates two major improvements:

HDFS Federation

HDFS has undergone a transformation to separate out Namespace management from the Block (storage) management to allow for significant scaling of the filesystem – in the current architecture they are intertwined in the NameNode.

More details are available in the HDFS Federation release documentation or in the recent HDFS Federation talk given at Hadoop World 2011 by Suresh Srinivas, a Hortonworks co-founder.
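To make the namespace/block split concrete, here is a toy sketch in Python. It is not Hadoop code: the class names `BlockStorage` and `Namespace`, the pool identifiers, and the tiny block size are all made up for illustration. It assumes, as I understand the Federation design, that each namespace owns its own block pool while the underlying block storage is shared.

```python
import itertools


class BlockStorage:
    """Hypothetical stand-in for the block (storage) layer.

    It knows nothing about paths; it only stores blocks keyed by
    (block pool, block id).
    """

    def __init__(self):
        self._blocks = {}
        self._ids = itertools.count(1)

    def put(self, pool_id, data):
        block_id = next(self._ids)
        self._blocks[(pool_id, block_id)] = data
        return block_id

    def get(self, pool_id, block_id):
        return self._blocks[(pool_id, block_id)]


class Namespace:
    """Hypothetical stand-in for namespace management.

    It maps paths to lists of block ids and delegates all storage
    to the shared BlockStorage, using its own block pool.
    """

    def __init__(self, pool_id, storage):
        self.pool_id = pool_id
        self.storage = storage
        self.files = {}  # path -> list of block ids

    def write(self, path, data, block_size=4):
        # Split the payload into fixed-size blocks (4 bytes here, purely
        # to keep the example small).
        chunks = [data[i:i + block_size] for i in range(0, len(data), block_size)]
        self.files[path] = [self.storage.put(self.pool_id, c) for c in chunks]

    def read(self, path):
        return b"".join(self.storage.get(self.pool_id, b) for b in self.files[path])


# Two independent namespaces sharing one storage layer.
storage = BlockStorage()
users = Namespace("pool-users", storage)
logs = Namespace("pool-logs", storage)
users.write("/a", b"hello world")
logs.write("/a", b"other")
```

The same path `/a` exists in both namespaces without conflict, because each namespace keeps its own path table and its own block pool. That independence is what lets the namespace side be scaled by adding more of them.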

NextGen MapReduce aka YARN

MapReduce has undergone a complete overhaul in hadoop-0.23 with the fundamental change to split up the two major functionalities of the JobTracker, resource management and job scheduling/monitoring, into separate daemons. The idea is to have a global ResourceManager (RM) and per-application ApplicationMaster (AM). An application is either a single job in the classical sense of Map-Reduce jobs or a DAG of jobs. Thus, Hadoop becomes a general purpose data-processing platform where we can support MapReduce and other application execution frameworks such as MPI etc.

More details are available in the YARN release documentation or in the recent YARN presentation given at Hadoop World 2011 by Mahadev Konar, a Hortonworks co-founder.
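The division of labor between the ResourceManager and the ApplicationMaster can be sketched with a toy model in Python. This is not the YARN API: the method names, the idea of counting "containers" as plain integers, and the pretend task execution are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ResourceManager:
    """Global daemon in this sketch: it only tracks and hands out capacity."""

    total_containers: int
    allocated: dict = field(default_factory=dict)  # app_id -> containers held

    def free(self) -> int:
        return self.total_containers - sum(self.allocated.values())

    def allocate(self, app_id: str, wanted: int) -> int:
        # Grant as many containers as are free, up to the request.
        granted = min(wanted, self.free())
        self.allocated[app_id] = self.allocated.get(app_id, 0) + granted
        return granted

    def release(self, app_id: str) -> None:
        self.allocated.pop(app_id, None)


@dataclass
class ApplicationMaster:
    """Per-application daemon in this sketch: schedules and monitors its own tasks."""

    app_id: str
    tasks: list
    done: list = field(default_factory=list)

    def run(self, rm: ResourceManager) -> None:
        pending = list(self.tasks)
        while pending:
            granted = rm.allocate(self.app_id, len(pending))
            if granted == 0:
                raise RuntimeError("no capacity available")
            # Run one wave of tasks in the granted containers
            # (execution is only pretended here).
            batch, pending = pending[:granted], pending[granted:]
            self.done.extend(batch)
            rm.release(self.app_id)


rm = ResourceManager(total_containers=4)
am = ApplicationMaster("job-1", [f"map-{i}" for i in range(10)])
am.run(rm)
```

Note that the `ResourceManager` here never looks at tasks, and the `ApplicationMaster` never looks at cluster-wide capacity. Because job logic lives entirely in the per-application piece, a different kind of application master (for MPI, say) could reuse the same resource manager, which is the point of the quoted paragraph.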

Arun also notes that Hadoop 0.23 is an alpha release, so don’t use it in a production environment (unless you are feeling lucky. Are you?).

More details are in the Hadoop World presentation.

So, in addition to a production-quality Hadoop ecosystem, you are going to need to set up a test Hadoop ecosystem. Well, winter is coming on, and a couple more boxes to heat the office won’t be a bad thing. 😉
