Dataflow Programming:…

Dataflow Programming: Handling Huge Data Loads Without Adding Complexity by Jim Falgout.

From the post:

Because the dataflow operators in a graph work in parallel, the model allows overlapping I/O operations with computation. This is a “whole application” approach to parallelization as opposed to many thread-oriented performance frameworks that focus on hot sections of code such as for loops. This addresses a key problem in processing “big data” for today’s many-core processors: feeding data fast enough to the processors.

While dataflow does this, it scales down easily as well. This distinguishes it from technologies, such as Hadoop and to a lesser extent, Map Reduce, which don’t scale downward well due to their innate complexity.

We’ve discussed how a dataflow architecture exploits multicore. These same principles can be applied to multi-node clusters by extending dataflow queues over networks with a dataflow graph executed on multiple systems in parallel. The compositional model of building dataflow graphs allows for replication of pieces of the graph across multiple nodes. Scaling out extends the reach of dataflow to solve large data problems.
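As a rough illustration of the idea in the excerpt (not code from Falgout's post or from any particular dataflow framework), here is a minimal Python sketch of a three-operator graph joined by bounded queues. The names `source`, `transform`, `sink` and `run_pipeline` are hypothetical, chosen only for this example. Each operator runs in its own thread, so a reader blocked on I/O does not stall the operators downstream of it. Note that CPython threads overlap I/O with computation but do not run CPU-bound Python code in parallel.

```python
import queue
import threading

_DONE = object()  # sentinel marking the end of the stream


def source(records, out_q):
    """Stand-in for an I/O-bound reader: emits records one at a time."""
    for rec in records:
        out_q.put(rec)  # blocks when the queue is full (backpressure)
    out_q.put(_DONE)


def transform(fn, in_q, out_q):
    """Applies fn to each record as soon as it arrives."""
    while True:
        rec = in_q.get()
        if rec is _DONE:
            out_q.put(_DONE)  # pass the end-of-stream marker downstream
            return
        out_q.put(fn(rec))


def sink(in_q, results):
    """Stand-in for a writer: collects records in arrival order."""
    while True:
        rec = in_q.get()
        if rec is _DONE:
            return
        results.append(rec)


def run_pipeline(records, fn, queue_size=4):
    """Wires source -> transform -> sink and runs all three concurrently."""
    q1 = queue.Queue(maxsize=queue_size)
    q2 = queue.Queue(maxsize=queue_size)
    results = []
    operators = [
        threading.Thread(target=source, args=(records, q1)),
        threading.Thread(target=transform, args=(fn, q1, q2)),
        threading.Thread(target=sink, args=(q2, results)),
    ]
    for op in operators:
        op.start()
    for op in operators:
        op.join()
    return results
```

Calling `run_pipeline(range(10), lambda x: x * x)` returns the squares in input order, because each stage is a single thread reading from a FIFO queue. The bounded queues keep a fast reader from running arbitrarily far ahead of a slow consumer. Extending the graph across machines, as the excerpt describes, would mean replacing the in-process queues with network-backed ones, which this sketch does not attempt.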

Read the first comment, by J.P. Morrison, author of Flow-Based Programming, the inspiration for Pipes. (Shamelessly repeated from a post by Marko Rodriguez on the gremlin-users list; Achim first noticed the article.)

Be aware that Amazon lists the Kindle edition for \$29.00 and a hardback edition for \$69.00. Sadly, one reader reports that the book has no index.

3 Responses to “Dataflow Programming:…”

  1. […] like DataFlow Programming… or Flow-Based Programming (FBP) to me. In which case the claim that: It’s just Java. Crunch […]

  2. Paul Morrison says:

    Hi guys! Just wanted to say that the first version of the 2nd Edition of Flow-Based Programming indeed didn’t have an index. Ernesto Compatangelo contacted me to say that his version of the book had lots of those little plastic stickies stuck in it, so I promptly asked him to send me a list of them – which he kindly did. So there is now an index! I think the ebook versions all have indices, but if there are some early buyers of the paperback that need an index, maybe we could work something out … :-)

  3. Patrick Durusau says:

    Excellent! I have written to Amazon to see if the index appears in a later printing. Looking forward to reading it!