Another Word For It Patrick Durusau on Topic Maps and Semantic Diversity

June 9, 2012

Hadoop Streaming Support for MongoDB

Filed under: Hadoop,Javascript,MapReduce,MongoDB,Python,Ruby — Patrick Durusau @ 7:13 pm


From the post:

MongoDB has some native data processing tools, such as the built-in JavaScript-oriented MapReduce framework, and a new Aggregation Framework in MongoDB v2.2. That said, there will always be a need to decouple persistence and computational layers when working with Big Data.

Enter MongoDB+Hadoop: an adapter that allows Apache’s Hadoop platform to integrate with MongoDB.

[graphic omitted]

Using this adapter, it is possible to use MongoDB as a real-time datastore for your application while shifting large aggregation, batch processing, and ETL workloads to a platform better suited for the task.

[graphic omitted]

Well, the engineers at 10gen have taken it one step further with the introduction of the streaming assembly for Mongo-Hadoop.

What does all that mean?

The streaming assembly lets you write MapReduce jobs in languages like Python, Ruby, and JavaScript instead of Java, making it easy for developers who are familiar with MongoDB and popular dynamic programming languages to leverage the power of Hadoop.

I like that, “…popular dynamic programming languages…” 😉

Any improvement that increases usability without requiring a religious conversion (to a programming language that is not your favorite) is a good move.
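
For readers who have not written a MapReduce job in a dynamic language, here is a minimal sketch of the general shape in plain Python. It is not the Mongo-Hadoop streaming API: the function names (mapper, reducer, run_job), the "lang" field, and the sample documents are all hypothetical, and the sort-and-group step only imitates the shuffle that Hadoop would perform between the map and reduce phases.

```python
from itertools import groupby
from operator import itemgetter


def mapper(documents):
    """Map phase: emit one (key, 1) pair per document.

    The 'lang' field is a hypothetical example field, not part of any real schema.
    """
    for doc in documents:
        yield doc["lang"], 1


def reducer(key, values):
    """Reduce phase: collapse all values for one key into a single output document."""
    return {"_id": key, "count": sum(values)}


def run_job(documents):
    """Run map, then a local stand-in for the shuffle (sort and group by key), then reduce."""
    pairs = sorted(mapper(documents), key=itemgetter(0))
    return [
        reducer(key, (value for _, value in group))
        for key, group in groupby(pairs, key=itemgetter(0))
    ]


# Hypothetical input documents, shaped like records read from a collection.
sample_docs = [
    {"user": "a", "lang": "python"},
    {"user": "b", "lang": "ruby"},
    {"user": "c", "lang": "python"},
    {"user": "d", "lang": "javascript"},
]

results = run_job(sample_docs)
for row in results:
    print(row)
```

In a real streaming job the framework, not run_job, would feed documents to the mapper and grouped values to the reducer; only the two small functions are the part you would write yourself.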
