Apache Hadoop 0.23 is Here! by Arun Murthy.
Arun isolates two major improvements:
HDFS Federation
HDFS has undergone a transformation to separate out Namespace management from the Block (storage) management to allow for significant scaling of the filesystem – in the current architecture they are intertwined in the NameNode.
…
More details are available in the HDFS Federation release documentation or in the recent HDFS Federation talk by Suresh Srinivas, a Hortonworks co-founder at Hadoop World, 2011.
NextGen MapReduce aka YARN
MapReduce has undergone a complete overhaul in hadoop-0.23 with the fundamental change to split up the two major functionalities of the JobTracker, resource management and job scheduling/monitoring, into separate daemons. The idea is to have a global ResourceManager (RM) and per-application ApplicationMaster (AM). An application is either a single job in the classical sense of Map-Reduce jobs or a DAG of jobs. Thus, Hadoop becomes a general purpose data-processing platform where we can support MapReduce and other application execution frameworks such as MPI etc.
…
More details are available in the YARN release documentation or in the recent YARN presentation by Mahadev Konar, a Hortonworks co-founder at Hadoop World, 2011.
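To make the split concrete, here is a toy Python sketch of the idea described above: one global object that only hands out cluster capacity, and one object per application that schedules and tracks its own tasks. This is not YARN's actual API. The class names mirror the roles in the announcement, but every method, field, and the notion of a simple "container count" are assumptions made purely for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ResourceManager:
    """Global role (toy model): tracks cluster capacity, knows nothing about tasks."""
    total_containers: int
    allocated: dict = field(default_factory=dict)  # app_id -> containers currently held

    def free(self) -> int:
        return self.total_containers - sum(self.allocated.values())

    def allocate(self, app_id: str, requested: int) -> int:
        # Grant as many containers as are free, up to the request.
        granted = min(requested, self.free())
        self.allocated[app_id] = self.allocated.get(app_id, 0) + granted
        return granted

    def release(self, app_id: str) -> None:
        self.allocated.pop(app_id, None)


@dataclass
class ApplicationMaster:
    """Per-application role (toy model): schedules and monitors its own tasks."""
    app_id: str
    pending_tasks: list
    completed: list = field(default_factory=list)

    def run(self, rm: ResourceManager) -> None:
        while self.pending_tasks:
            granted = rm.allocate(self.app_id, len(self.pending_tasks))
            if granted == 0:
                raise RuntimeError("no free containers")
            batch = self.pending_tasks[:granted]
            self.pending_tasks = self.pending_tasks[granted:]
            self.completed.extend(batch)  # pretend the batch ran to completion
            rm.release(self.app_id)


# Hypothetical cluster with 4 containers shared by two kinds of application.
rm = ResourceManager(total_containers=4)
mr_job = ApplicationMaster("mapreduce-1", [f"map-{i}" for i in range(6)])
mpi_job = ApplicationMaster("mpi-1", ["rank-0", "rank-1"])
mr_job.run(rm)
mpi_job.run(rm)
```

The point of the sketch is that the ResourceManager never looks inside a task list, so a MapReduce job and an MPI-style job can share the same cluster through the same narrow interface.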
Arun also notes that Hadoop 0.23 is an alpha release so don’t use this in a production environment (unless you are feeling lucky. Are you?)
More details are in the Hadoop World presentation.
So, in addition to a production-quality Hadoop ecosystem, you are going to need to set up a test Hadoop ecosystem. Well, winter is coming on, and a couple more boxes to heat the office won’t be a bad thing. 😉