MapReduce is Good Enough? If All You Have is a Hammer, Throw Away Everything That’s Not a Nail! by Jimmy Lin.
Abstract:
Hadoop is currently the large-scale data analysis “hammer” of choice, but there exist classes of algorithms that aren’t “nails”, in the sense that they are not particularly amenable to the MapReduce programming model. To address this, researchers have proposed MapReduce extensions or alternative programming models in which these algorithms can be elegantly expressed. This essay espouses a very different position: that MapReduce is “good enough”, and that instead of trying to invent screwdrivers, we should simply get rid of everything that’s not a nail. To be more specific, much discussion in the literature surrounds the fact that iterative algorithms are a poor fit for MapReduce: the simple solution is to find alternative non-iterative algorithms that solve the same problem. This essay captures my personal experiences as an academic researcher as well as a software engineer in a “real-world” production analytics environment. From this combined perspective I reflect on the current state and future of “big data” research.
Following the abstract:
Author’s note: I wrote this essay specifically to be controversial. The views expressed herein are more extreme than what I believe personally, written primarily for the purposes of provoking discussion. If after reading this essay you have a strong reaction, then I’ve accomplished my goal 🙂
The author needs to work on being “controversial.” He gives away the pose, “throw away everything not a nail,” far too early and too easily.
Without the warnings, flashing lights, etc., the hyperbole might be missed, but not by anyone who would benefit from the substance of the paper.
The paper reflects careful thought on MapReduce and its limitations. It merits a careful and close reading.
I first saw this mentioned by John D. Cook.