A first look at Spark by Joseph Rickert.
From the post:
Apache Spark, the open-source, cluster computing framework originally developed in the AMPLab at UC Berkeley and now championed by Databricks is rapidly moving from the bleeding edge of data science to the mainstream. Interest in Spark, demand for training and overall hype is on a trajectory to match the frenzy surrounding Hadoop in recent years. Next month's Strata + Hadoop World conference, for example, will offer three serious Spark training sessions: Apache Spark Advanced Training, SparkCamp and Spark developer certification with additional spark related talks on the schedule. It is only a matter of time before Spark becomes a big deal in the R world as well.
If you don't know much about Spark but want to learn more, a good place to start is the video of Reza Zadeh's keynote talk at the ACM Data Science Camp held last October at eBay in San Jose that has been recently posted.
After reviewing the high points of Reza Zadeh's presentation, Joseph points to another four-plus hours of videos on using Spark and R together.
A nice collection for getting started with Spark and for seeing how to use a standard tool (R) with an emerging one (Spark).
I first saw this in a tweet by Christophe Lalanne.