Another Word For It Patrick Durusau on Topic Maps and Semantic Diversity

May 2, 2015

On The Bleeding Edge – PySpark, DataFrames, and Cassandra

Filed under: Cassandra,Data Frames,Python — Patrick Durusau @ 8:17 pm

On The Bleeding Edge – PySpark, DataFrames, and Cassandra.

From the post:

A few months ago I wrote a post on Getting Started with Cassandra and Spark.

I’ve worked with Pandas for some small personal projects and found it very useful. The key feature is the data frame, which comes from R. Data Frames are new in Spark 1.3 and were covered in this blog post. Till now I’ve had to write Scala in order to use Spark. Since I don’t know Scala very well, I’ve spent a lot of time hunting for Scala libraries whose Python equivalents I could recall in less than a second (JSON being an example).

If you need help deciding whether to read this post, take a look at the Spark SQL and DataFrame Guide to see what you stand to gain.

Enjoy!
