Hadoop Streaming Support for MongoDB
From the post:
MongoDB has some native data processing tools, such as the built-in JavaScript-oriented MapReduce framework, and a new Aggregation Framework in MongoDB v2.2. That said, there will always be a need to decouple the persistence and computational layers when working with Big Data.
Enter MongoDB+Hadoop: an adapter that allows Apache’s Hadoop platform to integrate with MongoDB.
[graphic omitted]
Using this adapter, it is possible to use MongoDB as a real-time datastore for your application while shifting large aggregation, batch processing, and ETL workloads to a platform better suited for the task.
[graphic omitted]
Well, the engineers at 10gen have taken it one step further with the introduction of the streaming assembly for Mongo-Hadoop.
What does all that mean?
The streaming assembly lets you write MapReduce jobs in languages like Python, Ruby, and JavaScript instead of Java, making it easy for developers who are familiar with MongoDB and popular dynamic programming languages to leverage the power of Hadoop.
I like that, “…popular dynamic programming languages…” 😉
Any improvement that increases usability without requiring a religious conversion (switching to a programming language that is not your favorite) is a good move.
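To give a feel for what a MapReduce job written in Python instead of Java might look like, here is a minimal sketch of a mapper and reducer that count documents per value of a field. It is only an illustration of the map/reduce shape, not the adapter's actual API: the `language` field, the function names, and the `run_locally` driver are all hypothetical, and the plumbing that would feed MongoDB documents to the script and collect its output under Hadoop is left out entirely.

```python
from collections import defaultdict


def mapper(documents):
    """Emit one {"_id": key, "count": 1} pair per input document.

    The "language" field is a made-up example field, not part of any real schema.
    """
    for doc in documents:
        yield {"_id": doc["language"], "count": 1}


def reducer(key, values):
    """Sum the counts emitted for a single key."""
    total = sum(value["count"] for value in values)
    return {"_id": key, "count": total}


def run_locally(documents):
    """Hypothetical stand-in for the framework: group mapper output by key,
    then call the reducer once per key, in sorted key order."""
    grouped = defaultdict(list)
    for pair in mapper(documents):
        grouped[pair["_id"]].append(pair)
    return [reducer(key, values) for key, values in sorted(grouped.items())]


if __name__ == "__main__":
    sample = [
        {"language": "python"},
        {"language": "ruby"},
        {"language": "python"},
    ]
    for result in run_locally(sample):
        print(result)
```

The mapper and reducer are plain functions over dictionaries, which is roughly the appeal of the streaming approach: the job logic stays in a dynamic language while Hadoop does the distribution.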