Netflix open sources its Hadoop manager for AWS by Derrick Harris.
From the post:
Netflix runs a lot of Hadoop jobs on the Amazon Web Services cloud computing platform, and on Friday the video-streaming leader open sourced its software to make running those jobs as easy as possible. Called Genie, it’s a RESTful API that makes it easy for developers to launch new MapReduce, Hive and Pig jobs and to monitor longer-running jobs on transient cloud resources.
In the blog post detailing Genie, Netflix’s Sriram Krishnan makes clear a lot more about what Genie is and is not. Essentially, Genie is a platform as a service running on top of Amazon’s Elastic MapReduce Hadoop service. It’s part of a larger suite of tools that handles everything from diagnostics to service registration.
It is not a cluster manager or workflow scheduler for building ETL processes (e.g., processing unstructured data from a web source, adding structure and loading into a relational database system). Netflix uses a product called UC4 for the latter, but it built the other components of the Genie system.
It’s not very futuristic to say that AWS (or something very close to it) will be your next utility bill.
Like water, gas, cable or electricity, it will be an auto-pay setup on your bank account.
What will you say when clients ask if the service you are building for them is hosted on AWS?
Are you going to say your servers are more reliable? That you don’t “trust” Amazon?
Both may be true, but how will you make that case without sounding like you are selling something the client doesn’t need?
As the price of cloud computing drops, those questions are going to become common.