The Limitation of MapReduce: A Probing Case and a Lightweight Solution
From the post:
While we usually see plenty of papers dealing with applications of the MapReduce programming model, this one, for a change, addresses the model's limitations. It argues that MapReduce lets a program scale up to process very large data sets but constrains its ability to process smaller data items, a property (or limitation, depending on how you see it) the paper terms "one-way scalability." This one-wayness was obviously a requirement for Google, but here the authors turn our attention to how it affects the framework's application to other forms of computation.
The case they build their argument on is a distributed compiler, and their solution is a scaled-"down" parallelization framework called MRLite that handles more moderate volumes of data. The workload characteristics of a compiler are quite different from those of analytical workloads: compilation deals with far more modest volumes of data, albeit with much greater intertwining among the files.
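To make the "one-way scalability" point concrete, here is a back-of-envelope sketch of my own (not the paper's model or MRLite's API) of why a fixed per-job cost that amortizes nicely over a huge analytics run can dominate a compiler-style job with a few hundred small files. All the overhead and work figures are illustrative assumptions, not measurements from the paper.

```python
# Back-of-envelope model of "one-way scalability": a fixed job/launch cost
# amortizes over huge inputs but dominates small, compiler-style workloads.
# All constants below are illustrative assumptions, not numbers from the paper.

JOB_OVERHEAD_S = 20.0   # assumed cost to launch a job (scheduling, task setup, ...)
TASK_OVERHEAD_S = 1.0   # assumed fixed cost per map task

def job_time(n_tasks: int, work_per_task_s: float, n_workers: int) -> float:
    """Rough wall-clock estimate: job launch cost plus waves of parallel tasks."""
    waves = -(-n_tasks // n_workers)  # ceil(n_tasks / n_workers)
    return JOB_OVERHEAD_S + waves * (TASK_OVERHEAD_S + work_per_task_s)

# Analytics-style job: thousands of chunky tasks, so the overhead is noise.
analytics_parallel = job_time(n_tasks=10_000, work_per_task_s=30.0, n_workers=1_000)
analytics_serial = 10_000 * 30.0

# Compiler-style job: a few hundred tiny tasks, so the overhead dominates.
compile_parallel = job_time(n_tasks=300, work_per_task_s=0.05, n_workers=1_000)
compile_serial = 300 * 0.05

print(f"analytics: serial {analytics_serial:,.0f}s -> parallel {analytics_parallel:,.0f}s")
print(f"compile:   serial {compile_serial:.1f}s  -> parallel {compile_parallel:.1f}s")
```

Under these assumptions the big job speeds up by orders of magnitude while the small, interdependent one actually comes out slower than a serial build, which is exactly the gap a lighter-weight framework like MRLite is meant to close.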
All models have limits, so it isn't surprising that MapReduce does as well.
It will be interesting to see if the limitations of MapReduce are mapped out and avoided in “good practice,” or if some other model becomes the new darling until limits are found for it. Only time will tell.