The Road to Summingbird: Stream Processing at (Every) Scale by Sam Ritchie.
Description:
Twitter’s Summingbird library allows developers and data scientists to build massive streaming MapReduce pipelines without worrying about the usual mess of systems issues that come with realtime systems at scale.
But what if your project is not quite at “scale” yet? Should you ignore scale until it becomes a problem, or swallow the pill ahead of time? Is using Summingbird overkill for small projects? I argue that it’s not. This talk will discuss the ideas and components of Summingbird that you could, and SHOULD, use in your startup’s code from day one. You’ll come away with a new appreciation for monoids and semigroups and a thirst for abstract algebra.
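For readers who have not met the term, a monoid is just a type with an associative "plus" operation and an identity ("zero") value. The sketch below is not Summingbird's API (Summingbird is a Scala library); it is a minimal Python illustration, with hypothetical names such as `Monoid`, `sum_all`, and `count_words`, of why that property matters at any scale: the same aggregation gives the same answer whether it runs in one batch or as partial results merged later.

```python
from collections import Counter
from dataclasses import dataclass
from functools import reduce
from typing import Callable, Generic, Iterable, TypeVar

T = TypeVar("T")


@dataclass(frozen=True)
class Monoid(Generic[T]):
    """Hypothetical helper: an identity value plus an associative combine."""
    zero: T
    plus: Callable[[T, T], T]

    def sum_all(self, values: Iterable[T]) -> T:
        # Fold every value into the identity using the combine function.
        return reduce(self.plus, values, self.zero)


# Word counts form a monoid: the empty Counter is the identity,
# and Counter addition is associative.
word_counts = Monoid(zero=Counter(), plus=lambda a, b: a + b)


def count_words(line: str) -> Counter:
    # The "map" step: turn one line into a partial result.
    return Counter(line.lower().split())


lines = ["to be or not to be", "that is the question", "to be is to do"]

# One pass over everything (the "batch" view).
batch_total = word_counts.sum_all(count_words(l) for l in lines)

# Two partial aggregates computed separately, then merged
# (the "streaming" or "distributed" view).
first_part = word_counts.sum_all(count_words(l) for l in lines[:2])
second_part = word_counts.sum_all(count_words(l) for l in lines[2:])
merged_total = word_counts.plus(first_part, second_part)
```

Because the combine step is associative, `batch_total` and `merged_total` are equal, so the code written for a single small machine does not have to change when the work is later split across batches or workers.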
A slide deck that will make you regret missing the presentation.
I wasn’t able to find a video of Sam’s presentation at Data Day Texas 2014, but I did find a collection of his presentations, including some videos, at: http://sritchie.github.io/.
Valuable lessons for startups and others.