Another Word For It Patrick Durusau on Topic Maps and Semantic Diversity

February 9, 2014

Write and Run Giraph Jobs on Hadoop

Filed under: Cloudera,Giraph,Graphs,Hadoop,MapReduce — Patrick Durusau @ 7:52 pm

Write and Run Giraph Jobs on Hadoop by Mirko Kämpf.

From the post:

Create a test environment for writing and testing Giraph jobs, or just for playing around with Giraph and small sample datasets.

Apache Giraph is a scalable, fault-tolerant implementation of graph-processing algorithms in Apache Hadoop clusters of up to thousands of computing nodes. Giraph is in use at companies like Facebook and PayPal, for example, to help represent and analyze the billions (or even trillions) of connections across massive datasets. Giraph was inspired by Google’s Pregel framework and integrates well with Apache Accumulo, Apache HBase, Apache Hive, and Cloudera Impala.

Currently, the upstream “quick start” document explains how to deploy Giraph on a Hadoop cluster with two nodes running Ubuntu Linux. Although this setup is appropriate for lightweight development and testing, using Giraph with an enterprise-grade CDH-based cluster requires a slightly more robust approach.

In this how-to, you will learn how to use Giraph 1.0.0 on top of CDH 4.x using a simple example dataset, and run example jobs that are already implemented in Giraph. You will also learn how to set up your own Giraph-based development environment. The end result will be a setup (not intended for production) for writing and testing Giraph jobs, or just for playing around with Giraph and small sample datasets. (In future posts, I will explain how to implement your own graph algorithms and graph generators as well as how to export your results to Gephi, the “Adobe Photoshop for graphs”, through Impala and JDBC for further inspection.)

The first in a series of posts on Giraph.

This is great stuff!

It should keep you busy during your first conference call and/or staff meeting on Monday morning.

Monday won’t seem so bad. 😉

No Comments

No comments yet.

RSS feed for comments on this post.

Sorry, the comment form is closed at this time.

Powered by WordPress