Announcing SparkR: R on Spark [Spark Summit next week – free live streaming]

Announcing SparkR: R on Spark by Shivaram Venkataraman.

From the post:

I am excited to announce that the upcoming Apache Spark 1.4 release will include SparkR, an R package that allows data scientists to analyze large datasets and interactively run jobs on them from the R shell.

R is a popular statistical programming language with a number of extensions that support data processing and machine learning tasks. However, interactive data analysis in R is usually limited as the runtime is single-threaded and can only process data sets that fit in a single machine’s memory. SparkR, an R package initially developed at the AMPLab, provides an R frontend to Apache Spark and using Spark’s distributed computation engine allows us to run large scale data analysis from the R shell.

Read the short news here, or go to the Spark Summit to get the full story. (The code Databricks20 gets a 20% discount.) The summit is next week, June 15 – 17, in San Francisco, so you need to act quickly.

BTW, you can register for free live streaming!

Looking forward to this!
