Set Up a Hadoop/HBase Cluster on EC2 in (About) an Hour by George London.
From the post:
I’m going to walk you through a (relatively) simple set of steps that will get you up and running MapReduce programs on a cloud-based, six-node distributed Hadoop/HBase cluster as fast as possible. This is all based on what I’ve picked up on my own, so if you know of better/faster methods, please let me know in comments!
We’re going to be running our cluster on Amazon EC2, and launching the cluster using Apache Whirr and configuring it using Cloudera Manager Free Edition. Then we’ll run some basic programs I’ve posted on Github that will parse data and load it into Apache HBase.
All together, this tutorial will take a bit over one hour and cost about $10 in server costs.
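To give a feel for the launch step the post describes, here is a minimal sketch (not taken from George's tutorial) that renders the kind of properties file Apache Whirr reads when launching a cluster. The cluster name `hbase-demo`, the helper name `render_whirr_properties`, and the choice of the `noop` role (bare nodes, left for Cloudera Manager to configure) are my assumptions for illustration; the tutorial itself may use different settings.

```python
def render_whirr_properties(cluster_name, node_count, role="noop"):
    """Render a minimal Whirr properties file as text.

    Assumption: every node gets the same role. "noop" asks Whirr to
    start plain instances without installing any services on them.
    """
    if node_count < 1:
        raise ValueError("node_count must be at least 1")

    settings = {
        "whirr.cluster-name": cluster_name,
        # Format is "<count> <role>", e.g. "6 noop" for six bare nodes.
        "whirr.instance-templates": f"{node_count} {role}",
        "whirr.provider": "aws-ec2",
        # Credentials are read from environment variables at launch time,
        # so no secrets are written into the file itself.
        "whirr.identity": "${env:AWS_ACCESS_KEY_ID}",
        "whirr.credential": "${env:AWS_SECRET_ACCESS_KEY}",
    }
    return "\n".join(f"{key}={value}" for key, value in settings.items()) + "\n"


if __name__ == "__main__":
    # Six nodes, matching the cluster size mentioned in the post.
    print(render_whirr_properties("hbase-demo", 6), end="")
```

Saved to a file such as `cluster.properties` (a hypothetical name), output like this would typically be passed to `whirr launch-cluster --config cluster.properties`.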
This is the sort of tutorial that I long to write for topic maps.
There is a longer version of this tutorial here.