MOA Massively Online Analysis : Real Time Analytics for Data Streams
From the homepage:
What is MOA?
MOA is an open source framework for data stream mining. It includes a collection of machine learning algorithms (classification, regression, and clustering) and tools for evaluation. Related to the WEKA project, MOA is also written in Java, while scaling to more demanding problems.
What can MOA do for you?
MOA performs BIG DATA stream mining in real time, and large scale machine learning. MOA can be easily used with Hadoop, S4 or Storm, and extended with new mining algorithms, and new stream generators or evaluation measures. The goal is to provide a benchmark suite for the stream mining community. Details.
Short tutorials and a manual are available. Enough to get started but you will need additional resources on machine learning if it isn’t already familiar.
A small niggle about documentation. Many projects have files named “tutorial” or in this case “Tutorial1,” or “Manual.” Those files are easier to discover/save, if the project name, version(?), is prepended to tutorial or manual. Thus “Moa-2012-08-tutorial1” or “Moa-2012-08-manual.”
If data streams are in your present or future, definitely worth a look.