Tuple MapReduce: beyond the classic MapReduce by Pere Ferrera Bertran.
From the post:
In this post we’ll review the MapReduce model proposed by Google in 2004 and propound another one called Tuple MapReduce. We’ll see that this new model is a generalization of the first and we’ll explain what advantages it has to offer. We’ll provide a practical example and conclude by discussing when the implementation of Tuple MapReduce is advisable.
From the conclusion:
In this post we have presented a new MapReduce model, Tuple MapReduce, and we have shown its benefits and virtues. We have generalized it in order to allow joins between different data sources (Tuple-Join MapReduce). We have noted that it allows the same things to be done as the MapReduce we already know, while making it much simpler to learn and use.
We believe that an implementation of Tuple MapReduce would be advisable and that it could act as a replacement for the original MapReduce. This implementation, instead of being comparable to existing high-level tools that have been created on top of MapReduce, would be comparable in efficiency to current implementations of MapReduce.
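To make the idea concrete for myself, here is a toy, in-memory sketch of how I read the model. It is not the author's code, which has not been released yet. I am assuming that records are tuples with named fields, that a job groups on a subset of those fields, and that the tuples inside each group arrive sorted by further fields. Every name below (`run_tuple_mapreduce`, `map_visit`, `reduce_visits`, the `url`/`date`/`clicks` fields) is hypothetical and chosen only for illustration.

```python
from itertools import groupby


def run_tuple_mapreduce(records, map_fn, reduce_fn, group_by, sort_by):
    """Toy single-process model: map to tuples, sort, group, reduce.

    Assumption: group_by must be a prefix of sort_by, so that sorting
    also brings the members of each group together.
    """
    if list(sort_by[:len(group_by)]) != list(group_by):
        raise ValueError("group_by must be a prefix of sort_by")

    def key(fields):
        # Build a sort/group key from the named fields of a tuple (a dict here).
        return lambda t: tuple(t[f] for f in fields)

    # "Map" phase: each input record may emit zero or more tuples.
    tuples = [t for rec in records for t in map_fn(rec)]
    # "Shuffle" phase: a single sort stands in for the distributed sort.
    tuples.sort(key=key(sort_by))

    # "Reduce" phase: one call per distinct combination of group_by values.
    out = []
    for group_key, group in groupby(tuples, key=key(group_by)):
        out.extend(reduce_fn(dict(zip(group_by, group_key)), list(group)))
    return out


def map_visit(line):
    # Hypothetical input format: "url,date,clicks"
    url, date, clicks = line.split(",")
    yield {"url": url, "date": date, "clicks": int(clicks)}


def reduce_visits(group, tuples):
    # Tuples arrive sorted by date within the url group,
    # so the first one carries the earliest date.
    yield {
        "url": group["url"],
        "first_date": tuples[0]["date"],
        "total_clicks": sum(t["clicks"] for t in tuples),
    }


lines = ["a.com,2012-03-02,5", "b.com,2012-03-01,1", "a.com,2012-03-01,2"]
result = run_tuple_mapreduce(
    lines, map_visit, reduce_visits,
    group_by=["url"], sort_by=["url", "date"],
)
```

In plain key/value MapReduce, getting the values of a key in date order usually means packing both fields into a composite key and writing a custom partitioner and comparator. In this sketch that bookkeeping collapses into the `group_by` and `sort_by` arguments, which is the kind of simplification I take the post to be claiming.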
The post promises open source code in the near future.
I have to admit to being interested even without working code, but that interest would quickly turn to excitement upon successful testing of Tuple-Join MapReduce. This is quite definitely the sort of mapping exercise that needs a standardized mapping language. 😉