Another Word For It Patrick Durusau on Topic Maps and Semantic Diversity

May 18, 2014

Yahoo Betting on Apache Hive, Tez, and YARN

Filed under: Hadoop YARN,Hive,Tez — Patrick Durusau @ 8:01 pm

Yahoo Betting on Apache Hive, Tez, and YARN

With the usual caveats about test results:

On the other hand, Hive 0.13 query execution times were not only significantly better at higher volumes of data (Fig 3 and 4) but also executed successfully without failing. In our comparisons and observations with Shark, we saw most queries fail with the larger (10TB) dataset. These same queries ran successfully and much faster on Hive 0.13, allowing for better scale. This was extremely critical for us, as we needed a single query and BI solution on the Hadoop grid regardless of dataset size. The Hive solution resonates with our users, as they do not have to worry about learning multiple technologies and discerning which solution to use when. A common solution also results in cost and operational efficiencies from having to build, deploy, and maintain a single solution.

Successful queries at 10TB, with the times to match, should be enough to get your attention. Few of us have data in that range today, but tomorrow, who can say?

Enjoy!

I first saw this in a tweet by Joshua Lande.
