Another Word For It Patrick Durusau on Topic Maps and Semantic Diversity

October 10, 2011

Dataflow Programming:…

Filed under: Flow-Based Programming (FBP),Pipes — Patrick Durusau @ 6:18 pm

Dataflow Programming: Handling Huge Data Loads Without Adding Complexity by Jim Falgout.

From the post:

Because the dataflow operators in a graph work in parallel, the model allows overlapping I/O operations with computation. This is a “whole application” approach to parallelization as opposed to many thread-oriented performance frameworks that focus on hot sections of code such as for loops. This addresses a key problem in processing “big data” for today’s many-core processors: feeding data fast enough to the processors.

While dataflow does this, it scales down easily as well. This distinguishes it from technologies, such as Hadoop and to a lesser extent, Map Reduce, which don’t scale downward well due to their innate complexity.

We’ve discussed how a dataflow architecture exploits multicore. These same principles can be applied to multi-node clusters by extending dataflow queues over networks with a dataflow graph executed on multiple systems in parallel. The compositional model of building dataflow graphs allows for replication of pieces of the graph across multiple nodes. Scaling out extends the reach of dataflow to solve large data problems.
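To make the quoted description concrete, here is a minimal sketch of a three-operator dataflow graph in Python. It is not taken from Falgout's article or from any dataflow product: the names `source`, `transform`, `sink`, and `run_graph`, the `_DONE` sentinel, and the `queue_size` parameter are all hypothetical, chosen only for illustration. Each operator runs in its own thread and the operators are joined by bounded queues, so a reader can keep feeding data while a downstream operator computes. In CPython, threads mainly overlap I/O with computation; spreading CPU work across cores would need processes or another runtime.

```python
import queue
import threading

# Sentinel marking end of stream (an assumption of this sketch).
_DONE = object()


def source(items, out_q):
    """Reader operator: pushes each item downstream (stands in for I/O)."""
    for item in items:
        out_q.put(item)  # blocks when the bounded queue is full
    out_q.put(_DONE)


def transform(fn, in_q, out_q):
    """Compute operator: applies fn to each item as soon as it arrives."""
    while True:
        item = in_q.get()
        if item is _DONE:
            out_q.put(_DONE)  # pass the end-of-stream marker along
            return
        out_q.put(fn(item))


def sink(in_q, results):
    """Writer operator: collects whatever reaches the end of the graph."""
    while True:
        item = in_q.get()
        if item is _DONE:
            return
        results.append(item)


def run_graph(items, fn, queue_size=4):
    """Wire source -> transform -> sink and run all three concurrently."""
    q1 = queue.Queue(maxsize=queue_size)
    q2 = queue.Queue(maxsize=queue_size)
    results = []
    operators = [
        threading.Thread(target=source, args=(items, q1)),
        threading.Thread(target=transform, args=(fn, q1, q2)),
        threading.Thread(target=sink, args=(q2, results)),
    ]
    for op in operators:
        op.start()
    for op in operators:
        op.join()
    return results


if __name__ == "__main__":
    print(run_graph(range(5), lambda x: x * x))
```

The bounded queues are the design choice worth noticing: a fast reader cannot run arbitrarily far ahead of a slow operator, which keeps memory use flat whether the input is five items or five billion. Extending the graph to several machines, as the excerpt describes, would amount to replacing these in-process queues with network-backed ones.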

Read the first comment, by J.P. Morrison, author of Flow-Based Programming, the inspiration for Pipes. (Shamelessly repeated from a post by Marko Rodriguez on the gremlin-users list; Achim first noticed the article.)

Be aware that Amazon lists the Kindle edition for \$29.00 and a hardback edition for \$69.00. Sadly, one reader reports that the book has no index.

3 Comments

  1. […] like DataFlow Programming… or Flow-Based Programming (FBP) to me. In which case the claim that: It’s just Java. Crunch […]

    Pingback by Introducing Crunch: Easy MapReduce Pipelines for Hadoop « Another Word For It — October 11, 2011 @ 6:08 pm

  2. Hi guys! Just wanted to say that the first version of the 2nd Edition of Flow-Based Programming indeed didn’t have an index. Ernesto Compatangelo contacted me to say that his version of the book had lots of those little plastic stickies stuck in it, so I promptly asked him to send me a list of them – which he kindly did. So there is now an index! I think the ebook versions all have indices, but if there are some early buyers of the paperback that need an index, maybe we could work something out … 🙂

    Comment by Paul Morrison — October 18, 2011 @ 6:21 pm

  3. Excellent! I have written to Amazon to see if the index appears in a later printing. Looking forward to reading it!

    Comment by Patrick Durusau — October 19, 2011 @ 9:38 am
