Another Word For It Patrick Durusau on Topic Maps and Semantic Diversity

October 7, 2011

Getting Creative with MapReduce

Filed under: Algorithms,Cascalog,MapReduce — Patrick Durusau @ 6:16 pm

Getting Creative with MapReduce

From the post:

One problem with many existing MapReduce abstraction layers is the utter difficulty of testing queries and workflows. End-to-end tests are maddening to craft in vanilla Hadoop and frustrating at best in Pig and Hive. The difficulty of testing MapReduce workflows makes it scary to change code, and destroys your desire to be creative. A proper testing suite is an absolute prerequisite to doing creative work in big data.

In this blog post, I aim to show how most of the difficulty of writing and testing MapReduce queries stems from the fact that Hadoop confounds application logic with decisions about data storage. These problems are the result of poorly implemented abstractions over the primitives of MapReduce, not problems with the core MapReduce algorithms.

The author advocates the use of Cascalog and its testing suite. Comments?
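To make the quoted point about separating application logic from storage decisions concrete, here is a minimal sketch in plain Python. It is not Cascalog and not Hadoop's API: the names `map_words`, `reduce_counts`, `run_mapreduce`, and `word_count` are hypothetical, invented for illustration. The idea is only that when the query takes any iterable of records, the same logic can be tested against an in-memory list.

```python
from collections import defaultdict
from typing import Callable, Iterable, Iterator


def map_words(line: str) -> Iterator[tuple[str, int]]:
    """Map step: emit (word, 1) for every whitespace-separated word."""
    for word in line.lower().split():
        yield word, 1


def reduce_counts(word: str, counts: Iterable[int]) -> tuple[str, int]:
    """Reduce step: sum the counts collected for one word."""
    return word, sum(counts)


def run_mapreduce(
    records: Iterable[str],
    mapper: Callable[[str], Iterable[tuple[str, int]]],
    reducer: Callable[[str, Iterable[int]], tuple[str, int]],
) -> dict[str, int]:
    """Toy in-process driver: map, group by key, then reduce.

    It knows nothing about where the records come from, which is
    the point of the sketch.
    """
    groups: dict[str, list[int]] = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            groups[key].append(value)
    return dict(reducer(key, values) for key, values in groups.items())


def word_count(lines: Iterable[str]) -> dict[str, int]:
    """The 'query': pure logic over any iterable of lines."""
    return run_mapreduce(lines, map_words, reduce_counts)


# An end-to-end test needs no cluster and no files, only a list.
assert word_count(["big data", "big tests"]) == {"big": 2, "data": 1, "tests": 1}
```

In this sketch a file, a socket, or a list can all be passed as `lines`, so the choice of storage stays outside the query itself.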
