While we are talking about MapReduce, I may as well mention a Riak project, Riak Pipe, that went into beta in mid-June of this year.
From Bryan Fink’s announcement:
I’m excited to announce the opening of a new beta-status Basho project today: Riak Pipe.
http://github.com/basho/riak_pipe
Riak Pipe is a new way to distribute work around a Riak cluster.
The README explains much more than I can here, but essentially Riak Pipe allows you to specify work in the form of a chain of function pairs. One function of that pair describes how to produce output from input, and the other describes where in the cluster an input should be processed. Riak Pipe handles the details of ferrying data between workers by building atop Riak Core’s distribution power.
At this point in time Riak Pipe is BETA-status software. We’d like anyone who is interested in it to take a look and send us feedback. Please do not put it into production. We will be continuing to improve Riak Pipe toward a future release date.
We have two plans for Riak Pipe. The first is to power Riak’s MapReduce system with it. We think Riak Pipe provides a cleaner, more manageable subsystem that will provide much easier monitoring, debugging, and general use of MapReduce in Riak. You can see our work toward that goal in the “pipe” branch of Riak KV (start at src/riak_kv_mrc_pipe.erl):
https://github.com/basho/riak_kv/tree/pipe
Our second plan for Riak Pipe is to expand Riak’s MapReduce system with more abilities (imagine a keyed-reduce phase, or additional processing languages), possibly to the extent of providing an entirely separate interface (new query syntax? offline/asynchronous processing?). But for this part, we need your help.
We have some ideas about what external client interfaces might look like. We also have some ideas about what an external processing interface might look like. We’re still in the early phases of creating these, though, so if exploring the riak_pipe repository gives you ideas, please don’t hesitate to get in touch.
And, again, Riak Pipe is BETA software. Basho does not support running it in production at this time.
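To make the "chain of function pairs" idea from the announcement a bit more concrete, here is a toy, single-process sketch in Python. It is not Riak Pipe's API (Riak Pipe is Erlang and runs across a real cluster); the names `run_pipe`, `hash_partition`, and `NUM_PARTITIONS` are my own inventions for illustration. Each stage is a pair: one function turns an input into outputs, the other decides which partition should process that input.

```python
import hashlib
from collections import defaultdict

NUM_PARTITIONS = 8  # hypothetical cluster size, chosen only for this sketch


def hash_partition(item):
    """Placement function: map an input to a partition by hashing it."""
    digest = hashlib.sha1(repr(item).encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % NUM_PARTITIONS


def run_pipe(stages, inputs):
    """Run inputs through a chain of (process, place) function pairs.

    process(item) returns a list of outputs for one input.
    place(item) returns the partition that should handle that input.
    Returns the final outputs plus a trace of (stage, partition, item).
    """
    trace = []
    current = list(inputs)
    for stage_index, (process, place) in enumerate(stages):
        # Route every input to the partition chosen by the placement function.
        queues = defaultdict(list)
        for item in current:
            queues[place(item)].append(item)
        # Each simulated partition processes its own queue; the outputs
        # become the inputs of the next stage in the chain.
        next_inputs = []
        for partition in sorted(queues):
            for item in queues[partition]:
                trace.append((stage_index, partition, item))
                next_inputs.extend(process(item))
        current = next_inputs
    return current, trace


# Two stages: split lines into words, then upper-case each word.
stages = [
    (lambda line: line.split(), hash_partition),
    (lambda word: [word.upper()], hash_partition),
]
results, trace = run_pipe(stages, ["riak pipe", "riak core"])

for stage_index, partition, item in trace:
    print(f"stage {stage_index} on partition {partition}: {item!r}")
print(sorted(results))
```

The point of keeping placement separate from processing is that the same work function can be distributed differently just by swapping its partner. In this sketch both stages hash their input, so identical words always land on the same partition.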