Big Data Analytics with R and Hadoop by David Smith.
From the post:
The open-source RHadoop project makes it easier to extract data from Hadoop for analysis with R, and to run R within the nodes of the Hadoop cluster — essentially, to transform Hadoop into a massively-parallel statistical computing cluster based on R. In yesterday’s webinar (the replay of which is embedded below), Data scientist and RHadoop project lead Antonio Piccolboni introduced Hadoop and explained how to write map-reduce statements in the R language to drive the Hadoop cluster.
Something to catch up on over the weekend.
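If the map-reduce pattern itself is new to you, here is a rough sketch of the idea before you watch the webinar. To be clear, this is not RHadoop's API and not what the webinar shows: it is plain Python running in a single process, and every name in it (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) is made up for illustration. On a real cluster, Hadoop would spread the map and reduce calls across nodes and do the grouping step for you.

```python
from collections import defaultdict


def map_phase(records, mapper):
    """Apply the mapper to every record and collect the (key, value) pairs."""
    pairs = []
    for record in records:
        pairs.extend(mapper(record))
    return pairs


def shuffle(pairs):
    """Group values by key (in this sketch, a simple in-memory dictionary)."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped, reducer):
    """Collapse each key's list of values into a single result."""
    return {key: reducer(key, values) for key, values in grouped.items()}


def word_mapper(line):
    """Hypothetical mapper: emit (word, 1) for each word in a line of text."""
    return [(word.lower(), 1) for word in line.split()]


def count_reducer(key, values):
    """Hypothetical reducer: add up the counts emitted for one word."""
    return sum(values)


def word_count(lines):
    """Run the three phases in order on a list of text lines."""
    return reduce_phase(shuffle(map_phase(lines, word_mapper)), count_reducer)


if __name__ == "__main__":
    print(word_count(["R and Hadoop", "Hadoop and R"]))
```

The point of the split is that the mapper and reducer only ever see one record or one key at a time, which is what lets a framework run many of them at once.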
BTW, do you know the difference between “massively-parallel” and “parallel”? I would think the “Connection Machine” was “massively-parallel” for its time, but that was really specialized hardware. Does “massively” mean anything now, or is it just a holdover or a marketing term?