Another Word For It Patrick Durusau on Topic Maps and Semantic Diversity

May 23, 2014

Convert Existing Data into Parquet

Filed under: Data,Parquet — Patrick Durusau @ 7:19 pm

Convert Existing Data into Parquet by Uri Laserson.

From the post:

Learn how to convert your data to the Parquet columnar format to get big performance gains.

Using a columnar storage format for your data offers significant performance advantages for a large subset of real-world queries. (Click here for a great introduction.)

Last year, Cloudera, in collaboration with Twitter and others, released a new Apache Hadoop-friendly, binary, columnar file format called Parquet. (Parquet was recently proposed for the ASF Incubator.) In this post, you will get an introduction to converting your existing data into Parquet format, both with and without Hadoop.

Actually, between Uri’s post and my pointing to it, Parquet has been accepted into the ASF Incubator!

All the more reason to start following this project.
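
If columnar storage is new to you, here is a toy sketch (mine, not from Uri's post) of why a query that reads a single field can favor a column layout. Plain Python lists stand in for storage. The sample records and the `to_columns` helper are invented for illustration, and nothing here reads or writes actual Parquet files.

```python
# Toy illustration only: plain Python lists stand in for on-disk storage.
# This is not Parquet and uses no Parquet library.

# Hypothetical sample records, invented for this sketch.
rows = [
    {"user": "ann", "country": "US", "bytes": 120},
    {"user": "bob", "country": "DE", "bytes": 300},
    {"user": "cho", "country": "US", "bytes": 80},
]


def to_columns(records):
    """Pivot a list of row dicts into a dict of column lists.

    Assumes every record carries the same set of keys.
    """
    columns = {}
    for record in records:
        for name, value in record.items():
            columns.setdefault(name, []).append(value)
    return columns


columns = to_columns(rows)

# Row layout: every full record is visited just to sum one field.
total_from_rows = sum(record["bytes"] for record in rows)

# Column layout: only the "bytes" column is visited.
total_from_columns = sum(columns["bytes"])

print(total_from_rows, total_from_columns)
```

Both sums agree, but the column version never touches `user` or `country`. On disk, that is roughly the saving a columnar format aims for when a query needs only a few fields out of many.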

Enjoy!
