Microsoft Research’s Naiad Project

Wednesday, May 28th, 2014

Solve the Big Data Problems of the Future: Join Microsoft Research’s Naiad Project by Tara Grumm.

From the post:

Over the past decade, general-purpose big data platforms like Hadoop have brought distributed computing into the mainstream. As people have become accustomed to processing their data in the cloud, they have become more ambitious, wanting to do things like graph analysis, machine learning, and real-time stream processing on their huge data sources.

Naiad is designed to solve this more challenging class of problems: it adds support for a few key primitives – maintaining state, executing loops, and reacting to incoming data – and provides high-performance infrastructure for running them in a scalable distributed system.

The result is the best of both worlds. Naiad runs simple programs just as fast as existing general-purpose platforms, and complex programs as fast as specialized systems for graph analysis, machine learning, and stream processing. Moreover, as a general-purpose system, Naiad lets you compose these different applications together, enabling mashups (such as computing a graph algorithm over a real-time sliding window of a social media firehose) that weren’t possible before.

Who should use Naiad?

We’ve designed Naiad to be accessible to a variety of different users. You can get started right away with Naiad by writing programs using familiar declarative operators based on SQL and LINQ.

For power users, we’ve created low-level interfaces to make it possible to extend Naiad without sacrificing any performance. You can plug in optimized data structures and algorithms, and build new domain-specific languages on top of Naiad. For example, we wrote a graph processing layer on top of Naiad that has performance comparable with (and often better than) specialized systems designed only to process graphs.
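
To make the three primitives from the post concrete (maintaining state, executing loops, and reacting to incoming data), here is a minimal Python sketch of the mashup it mentions: a graph algorithm over a sliding window of a stream. This is not Naiad's API. The class name `SlidingWindowComponents`, the method `on_edge`, and the `window_size` parameter are all hypothetical names invented for illustration. The sketch also recomputes the answer from scratch for every new edge, which is the naive approach; it only shows the shape of the problem, not how Naiad solves it efficiently.

```python
from collections import deque


class SlidingWindowComponents:
    """Hypothetical illustration: connected components over the last N edges."""

    def __init__(self, window_size):
        # Maintained state: a bounded window of the most recent edges.
        self.window = deque(maxlen=window_size)

    def on_edge(self, u, v):
        """React to one incoming edge and return node -> component label."""
        self.window.append((u, v))  # the oldest edge is evicted automatically
        return self._components()

    def _components(self):
        # Every node in the window starts out labelled with its own id.
        labels = {}
        for u, v in self.window:
            labels.setdefault(u, u)
            labels.setdefault(v, v)
        # Loop: push the smaller label across each edge until nothing changes.
        changed = True
        while changed:
            changed = False
            for u, v in self.window:
                low = min(labels[u], labels[v])
                if labels[u] != low or labels[v] != low:
                    labels[u] = labels[v] = low
                    changed = True
        return labels


if __name__ == "__main__":
    stream = [(1, 2), (2, 3), (4, 5)]
    tracker = SlidingWindowComponents(window_size=2)
    for u, v in stream:
        print((u, v), tracker.on_edge(u, v))
```

With a window of two edges, the edge (1, 2) has dropped out by the time (4, 5) arrives, so node 1 disappears from the result and nodes 2 and 3 form one component while 4 and 5 form another.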

Big data geeks and open source supporters should take a serious look at the Naiad Project.

It will take a while, but the real question going forward will be how well you can build on a continuous data substrate.

Or, as Harvey Logan says in Butch Cassidy and the Sundance Kid:

Rules? In a knife fight? No rules!

I would prepare accordingly.