Microsoft to open source a big data framework called REEF by Derrick Harris.
From the post:
Microsoft has developed a big data framework called REEF (a graciously simple acronym for Retainable Evaluator Execution Framework) that the company intends to open source in about a month. REEF is designed to run on top of YARN, the next-generation resource manager for Hadoop, and is particularly well suited for building machine learning jobs.
Microsoft Technical Fellow and CTO of Information Services Raghu Ramakrishnan explained REEF and Microsoft’s plans to open source it during a Monday morning keynote at the International Conference on Knowledge Discovery and Data Mining, taking place in Chicago.
YARN is a resource manager developed as part of the Apache Hadoop project that lets users run and manage multiple types of jobs (e.g., batch MapReduce, stream processing with Storm and/or a graph-processing package) atop the same cluster of physical machines. This makes it possible not only to consolidate the number of systems that an organization has to manage, but also to run different types of analysis on top of the same data from the same place. In some cases, the entire data workflow can be carried out on just one cluster of machines.
This is very good news!
In part because it furthers the development of the Hadoop ecosystem.
If you think of TCP/IP as a roadway, consider the value of the goods and services moving along it.
Now think of the Hadoop ecosystem as another roadway.
An interoperable and high-speed roadway for data and data analysis.
Who has user-facing applications that rely on data and data analysis? 😉
Here’s to hoping that MS doubles down on the Hadoop ecosystem!