Another Word For It Patrick Durusau on Topic Maps and Semantic Diversity

November 1, 2011

A Convenient Framework for Efficient Parallel Multipass Algorithms

Filed under: MapReduce,Parallel Programming — Patrick Durusau @ 3:32 pm

A Convenient Framework for Efficient Parallel Multipass Algorithms by Markus Weimer, Sriram Rao, and Martin Zinkevich.

Abstract:

The amount of data available is ever-increasing. At the same time, the available time to learn from the available data is decreasing in many applications, especially on the web. These two trends together with limited improvements in per-cpu speed and hard disk bandwidth lead to the need for parallel machine learning algorithms. Numerous have been proposed in the past (including [1, 3, 4]). Many of them make use of frameworks like MapReduce [2], as it facilitates easy parallelization and provides fault tolerance and data local computation at the framework level. However, MapReduce also introduces some inherent inefficiencies when compared to message passing systems like MPI.

In this paper, we present a computational framework based on Workers and Aggregators for data-parallel computations that retains the simplicity of MapReduce, while offering a significant speedup for a large class of algorithms. We report experiments based on several implementations of Stochastic Gradient Descent (SGD): The well known sequential variant as well as a parallel version inspired by our recent work in [5] which we implemented both in MapReduce and the proposed framework.

The direct passing of messages reminds me of Storm.
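
To make the Workers-and-Aggregators idea from the abstract a bit more concrete, here is a minimal single-process sketch of multipass parallel SGD. It is not the authors' framework or API: the names sgd_worker, aggregate and multipass_sgd are hypothetical, the loss is plain least squares, the workers run one after another rather than in parallel, and I am assuming the aggregation step is a simple average of the workers' weight vectors. The point is only the shape of the computation: each worker makes a sequential SGD pass over its own shard, an aggregator combines the results, and the combined model is fed back for the next pass.

```python
import random


def sgd_worker(shard, weights, lr):
    """Hypothetical worker: one sequential SGD pass over its own data shard
    (least-squares loss), starting from the shared weights."""
    w = list(weights)
    for x, y in shard:
        pred = sum(wi * xi for wi, xi in zip(w, x))
        err = pred - y
        # Gradient step for 0.5 * (pred - y)^2
        w = [wi - lr * err * xi for wi, xi in zip(w, x)]
    return w


def aggregate(models):
    """Hypothetical aggregator: component-wise average of the workers' weights
    (an assumed aggregation rule, not taken from the paper)."""
    n = len(models)
    return [sum(column) / n for column in zip(*models)]


def multipass_sgd(shards, dim, passes, lr):
    """Repeat: every worker trains on its shard, then the aggregator averages."""
    weights = [0.0] * dim
    for _ in range(passes):
        # In a real system these calls would run in parallel, one per worker.
        models = [sgd_worker(shard, weights, lr) for shard in shards]
        weights = aggregate(models)
    return weights


# Synthetic, noise-free data: y = 2*x0 - 3*x1, split across 4 pretend workers.
rng = random.Random(0)
data = []
for _ in range(400):
    x = [rng.uniform(-1, 1), rng.uniform(-1, 1)]
    data.append((x, 2.0 * x[0] - 3.0 * x[1]))
shards = [data[i::4] for i in range(4)]

weights = multipass_sgd(shards, dim=2, passes=20, lr=0.1)
print(weights)
```

In MapReduce each pass of that outer loop would be a separate job, with the model written out and read back in between. As I read the abstract, the proposed framework is aimed at avoiding that per-pass overhead while keeping the programming model about this simple.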

Comments?
