Apache Hadoop 2.0.2-alpha Released!

Apache Hadoop 2.0.2-alpha Released! by Arun Murthy.

From the post:

It gives me great pleasure to announce that the Apache Hadoop community has voted to release Apache Hadoop 2.0.2-alpha.

This is the second (alpha) release of the next generation release of Apache Hadoop 2.x and comes with significant enhancements to both the major components of Hadoop:

  • HDFS HA has undergone significant enhancements since the previous release for NameNode High Availability
  • YARN has undergone significant testing, stabilization, and validation, as it has been heavily battle-tested since the previous release.

These are exciting times indeed for the Apache Hadoop community – personally, this is very reminiscent of the period in 2009 when we finally saw the light at the end of the tunnel during the stabilization of Apache Hadoop 1.x (then called Apache Hadoop 0.20.x). A déjà vu, if you will – albeit of the pleasant kind! Yes, we have a few miles to clock, but it feels like the hardest part is already behind us. At the time of release, YARN has already been deployed on super-sized clusters with 2,000 nodes and 3,600 nodes (totaling to nearly 6,000 nodes) at Yahoo alone*.

Exciting times indeed!

Not unlike a starship fast enough for time dilation to kick in.

Great!

But which way do you go first?

Hadoop 2.0 offers more efficient crunching of data. But efficient crunching of data is a means, not an end.

Which way will you go with Hadoop 2.0?

What questions will you ask that you can’t ask now?

How will you evaluate the answers?
