Archive for the ‘Bulk Synchronous Parallel (BSP)’ Category

Graph processing platform Apache Giraph reaches 1.0

Friday, May 10th, 2013

Graph processing platform Apache Giraph reaches 1.0

From the post:

Used by Facebook and Yahoo, the Apache Giraph project for distributed graph processing has released version 1.0. This is the first new version since the project left incubation and became a top-level project in May 2012, though for some reason it has yet to make it to the Apache index of top level projects.

Giraph allows social graphs and other richly interconnected data structures with many billions of edges to be analysed using hundreds of machines. It is inspired by the Bulk Synchronous Parallel abstract computer model and the Google Pregel system for large scale graph-processing. The developers of Giraph say that unlike those systems, Giraph is an open source, scalable platform built atop of the Apache Hadoop infrastructure which has no single point of failure by design. The documentation includes an introduction to Giraph’s iterative graph processing and how to implement graph processing functions in Java. The Giraph project has seen contributions from Yahoo!, Twitter, Facebook and LinkedIn and from academic institutions around the world.

It’s a little early to be downloading software for the weekend but why not? 😉

Enjoy!
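For readers new to the Pregel-style model that the post says inspired Giraph, here is a minimal single-process sketch in Python. It is not Giraph's Java API and not code from the post; the function name `pregel_sssp` and the edge-list layout are assumptions made for illustration. It computes single-source shortest paths by having vertices exchange messages in rounds ("supersteps"), and it stops once no messages remain.

```python
from collections import defaultdict


def pregel_sssp(edges, source):
    """Shortest paths from `source`, computed superstep by superstep.

    `edges` maps each vertex to a list of (neighbor, weight) pairs.
    This is a hypothetical illustration, not Giraph's actual API.
    """
    # Collect every vertex, including ones that only appear as targets.
    vertices = set(edges)
    for neighbors in edges.values():
        for neighbor, _ in neighbors:
            vertices.add(neighbor)

    dist = {v: float("inf") for v in vertices}
    inbox = {source: [0]}  # messages to be delivered in the next superstep

    while inbox:  # the computation halts when no messages are in flight
        outbox = defaultdict(list)
        for vertex, messages in inbox.items():
            best = min(messages)
            if best < dist[vertex]:
                # The vertex improved its value, so it notifies its neighbors.
                dist[vertex] = best
                for neighbor, weight in edges.get(vertex, []):
                    outbox[neighbor].append(best + weight)
        # End of superstep: messages become visible only in the next round.
        inbox = outbox

    return dist


if __name__ == "__main__":
    graph = {"a": [("b", 1), ("c", 4)], "b": [("c", 2)], "c": []}
    print(pregel_sssp(graph, "a"))
```

In a real deployment the vertices would be partitioned across many machines and the hand-off from `outbox` to `inbox` would be a network exchange followed by a global barrier; the loop above only mimics that ordering on one machine.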

Exploring Apache Hama

Friday, November 16th, 2012

Exploring Apache Hama

From the post:

Apache Hama is one of the under-hyped projects in the Hadoop ecosystem but gaining a lot of traction steadily with the efforts of its committers. “Apache Hama is a pure BSP (Bulk Synchronous Parallel) computing framework on top of HDFS (Hadoop Distributed File System) for massive scientific computations such as matrix, graph and network algorithms.”

A summary of resources on Apache Hama.

You won’t learn Hama over the weekend, but you can make a start on a new skill to list on LinkedIn.

Graph Exploration with Apache Hama

Sunday, April 10th, 2011

Graph Exploration with Apache Hama

From the website:

Hey guys,

I’ve been busy for a tiny bit of time, but I finished the graph exploration algorithm with Apache Hama recently. This post is about the BSP portation of this post.
So I already told in this post how BSP basically works. Now I’m going to tell you what you can do with it in terms of graph exploration. Last post I did this with MapReduce, so let’s go and get into Hama!

Thomas Jungblut on graph exploration.

Take the time. It will be time well spent.
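As a rough idea of what BSP-style graph exploration can look like, here is a small Python sketch that labels connected components by propagating the smallest vertex id in rounds. It is not Thomas Jungblut's code and does not use Hama's Java API; the function name `connected_components` and the adjacency layout are assumptions, and the graph is assumed to be undirected with every vertex present as a key.

```python
def connected_components(adjacency):
    """Label each vertex with the smallest vertex id in its component.

    `adjacency` maps every vertex to its neighbors (undirected graph,
    each edge listed in both directions). Hypothetical illustration only.
    """
    label = {v: v for v in adjacency}
    active = set(adjacency)  # vertices that changed in the previous round

    while active:
        # Communication phase: active vertices send their label to neighbors.
        inbox = {}
        for vertex in active:
            for neighbor in adjacency[vertex]:
                inbox.setdefault(neighbor, []).append(label[vertex])

        # The barrier is implicit here: all messages are collected above
        # before any vertex processes them below.
        active = set()
        for vertex, messages in inbox.items():
            smallest = min(messages)
            if smallest < label[vertex]:
                label[vertex] = smallest
                active.add(vertex)

    return label


if __name__ == "__main__":
    graph = {1: [2], 2: [1, 3], 3: [2], 4: [5], 5: [4], 6: []}
    print(connected_components(graph))
```

Vertices that end up with the same label belong to the same component; the number of rounds grows with the diameter of the largest component.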

Hama

Sunday, April 3rd, 2011

Hama

An Apache Incubator project that describes itself as:

Hama is a distributed computing framework based on BSP (Bulk Synchronous Parallel) computing techniques for massive scientific computations.

A somewhat better explanation appears on the Hama blog, in answer to the question “How will Hama BSP different from Pregel?”:

Hama BSP is a computing engine, based on BSP model, like a Pregel, and it’ll be compatible with existing HDFS cluster, or any FileSystem and Database in the future. However, we believe that the BSP computing model is not limited to a problems of graph; it can be used for widely distributed software such as Map/Reduce. In addition to a field of graph, there are many other algorithms, which have similar problems with graph processing using Map/Reduce. Actually, the BSP model has been researched for many years in the field of matrix computation, too. http://blogs.apache.org/hama/

Wikipedia has a short article on Bulk synchronous parallel (BSP) computing techniques with some references.
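For a concrete picture of the model, a BSP superstep has three parts: local computation, communication, and a barrier that every worker must reach before the next superstep begins. The Python sketch below imitates that with threads and `threading.Barrier` from the standard library. It is only an illustration under assumptions (the name `bsp_global_sum` and the one-queue-per-worker inbox layout are mine), not how Hama is implemented, and it expects at least one chunk.

```python
import threading
from queue import Queue


def bsp_global_sum(chunks):
    """Every worker learns the sum of all chunks in two supersteps.

    Hypothetical illustration of the BSP pattern using threads.
    """
    n = len(chunks)
    inboxes = [Queue() for _ in range(n)]  # one inbox per worker
    barrier = threading.Barrier(n)
    results = [None] * n

    def worker(rank):
        # Superstep 1, local computation: sum this worker's own data.
        partial = sum(chunks[rank])
        # Superstep 1, communication: send the partial sum to every worker.
        for box in inboxes:
            box.put(partial)
        # Barrier: nobody proceeds until all messages have been sent.
        barrier.wait()
        # Superstep 2: each inbox now holds exactly n partial sums.
        results[rank] = sum(inboxes[rank].get() for _ in range(n))

    threads = [threading.Thread(target=worker, args=(r,)) for r in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    print(bsp_global_sum([[1, 2], [3], [4, 5, 6]]))
```

The barrier is what makes the second phase safe: because no worker reads its inbox until every worker has finished sending, there is no need for finer-grained coordination between individual senders and receivers.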