5 Pitfalls To Avoid With Hadoop by Syncsort, Inc.
From the registration page:
Hadoop is a great vehicle to extract value from Big Data. However, relying only on Hadoop and common scripting tools like Pig, Hive and Sqoop to achieve a complete ETL solution can hinder success.
Syncsort has worked with early adopter Hadoop customers to identify and solve the most common pitfalls organizations face when deploying ETL on Hadoop.
- Hadoop is not a data integration tool
- MapReduce programmers are hard to find
- Most data integration tools don’t run natively within Hadoop
- Hadoop may cost more than you think
- Elephants don’t thrive in isolation
Before you give up your email and phone number for the “free ebook,” be aware it is a promotional piece for Syncsort DMX-h.
That isn’t a bad thing, but if you are expecting something different, you will be disappointed.
The observations are trivially true. They amount to saying that Hadoop lacks a user-facing interface, pre-written routines for data integration, and the other tools that data integration users normally expect.
OK, but a hammer doesn’t come with blueprints, nails, wood, etc., and those omissions aren’t “pitfalls.”
It’s the nature of a hammer that those “extras” need to be supplied.
You can either do that piecemeal or you can use a single source (the equivalent of Syncsort DMX-h).
Syncsort should be on your short list of data integration options to consider, but let’s avoid loose talk about Hadoop. There is enough of that in the uninformed mainstream media.