Another Word For It Patrick Durusau on Topic Maps and Semantic Diversity

January 25, 2011

Wukong, Bringing Ruby to Hadoop – Post

Filed under: Hadoop — Patrick Durusau @ 10:53 am

Wukong, Bringing Ruby to Hadoop

From the post:

Wukong is hands down the simplest (and probably the most fun) tool to use with hadoop. It especially excels at the following use case:

You’ve got a huge amount of data (let that be whatever size you think is huge). You want to perform a simple operation on each record. For example, parsing out fields with a regular expression, adding two fields together, stuffing those records into a data store, etc etc. These are called map only jobs. They do NOT require a reduce. Can you imagine writing a java map reduce program to add two fields together? Wukong gives you all the power of ruby backed by all the power (and parallelism) of hadoop streaming. Before we get into examples, and there will be plenty, let’s make sure you’ve got wukong installed and running locally.
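To make the "add two fields together" case concrete, here is a minimal sketch of a map-only job written as a plain Ruby script for Hadoop streaming. It deliberately does not use Wukong's own classes, so it shows only the underlying idea of reading records on standard input and writing results to standard output. The record layout (id, then two integers, separated by tabs) and the names `add_fields` and `run` are assumptions made for illustration; they are not taken from the quoted post.

```ruby
#!/usr/bin/env ruby
# Map-only sketch for Hadoop streaming: add two integer fields per record.
# Assumed (hypothetical) record layout: id<TAB>a<TAB>b

INTEGER = /\A-?\d+\z/

# Returns "id<TAB>sum" for a well-formed record, or nil if either
# numeric field is missing or is not an integer.
def add_fields(line)
  id, a, b = line.chomp.split("\t", -1)
  return nil unless [a, b].all? { |field| INTEGER.match?(field.to_s) }
  "#{id}\t#{a.to_i + b.to_i}"
end

# Streams records from input to output, silently skipping malformed lines.
def run(input, output)
  input.each_line do |line|
    result = add_fields(line)
    output.puts(result) if result
  end
end

# Hadoop streaming feeds records on STDIN and collects STDOUT.
run($stdin, $stdout) if __FILE__ == $PROGRAM_NAME
```

Saved under a hypothetical name such as `add_fields.rb`, the script can be tried locally with `cat input.tsv | ruby add_fields.rb` before it goes anywhere near a cluster. Because there is no reduce step, a Hadoop streaming run would set the number of reducers to zero.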

Authoring a topic map involves more than the final act of assembling it. Any number of preparatory steps may be needed before that point, and Wukong is one more tool to assist with them.
