Twitter Storm: Open Source Real-time Hadoop by Bienvenido David III.
From the post:
Twitter has open-sourced Storm, its distributed, fault-tolerant, real-time computation system, at GitHub under the Eclipse Public License 1.0. Storm is the real-time processing system developed by BackType, which is now under the Twitter umbrella. The latest package available from GitHub is Storm 0.5.2, which is mostly written in Clojure.
Storm provides a set of general primitives for doing distributed real-time computation. It can be used for “stream processing”, processing messages and updating databases in real-time. This is an alternative to managing your own cluster of queues and workers. Storm can be used for “continuous computation”, doing a continuous query on data streams and streaming out the results to users as they are computed. It can also be used for “distributed RPC”, running an expensive computation in parallel on the fly.
See the post for links, details, quotes, etc.
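To make the "stream processing" use case quoted above a bit more concrete, here is a minimal single-process sketch in Python. It is my own illustration, not code from the post and not Storm's API: the names `message_stream`, `split_words`, and `update_counts` are hypothetical, and an in-memory `Counter` stands in for the database. Storm would distribute stages like these across a cluster; this only shows the shape of the idea, with messages flowing through small processing steps that update a store as each one arrives.

```python
from collections import Counter
from typing import Iterable, Iterator, Tuple


def message_stream() -> Iterator[str]:
    """Hypothetical message source standing in for a live feed."""
    yield from [
        "storm is open source",
        "storm is real-time",
        "hadoop is batch",
    ]


def split_words(messages: Iterable[str]) -> Iterator[str]:
    """Processing step: break each incoming message into words."""
    for message in messages:
        yield from message.split()


def update_counts(words: Iterable[str], store: Counter) -> Iterator[Tuple[str, int]]:
    """Processing step: update the store per word and emit the running count."""
    for word in words:
        store[word] += 1
        yield word, store[word]


# An in-memory Counter plays the role of the database being updated.
word_counts: Counter = Counter()

# Each message is processed as it arrives; nothing waits for a full batch.
updates = list(update_counts(split_words(message_stream()), word_counts))
```

Because every stage is a generator, each word's count is updated and emitted before the next message is read, which is the contrast with batch processing that the quoted passage is drawing.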
My bet is that topologies are going to be data set specific. You?
BTW, I don’t think the local coffee shop offers free access to its cluster. Will have to check with them next week.