Another Word For It Patrick Durusau on Topic Maps and Semantic Diversity

March 25, 2013

5 Pitfalls To Avoid With Hadoop

Filed under: Data Integration,Hadoop,MapReduce — Patrick Durusau @ 3:41 pm

5 Pitfalls To Avoid With Hadoop by Syncsort, Inc.

From the registration page:

Hadoop is a great vehicle to extract value from Big Data. However, relying only on Hadoop and common scripting tools like Pig, Hive and Sqoop to achieve a complete ETL solution can hinder success.

Syncsort has worked with early adopter Hadoop customers to identify and solve the most common pitfalls organizations face when deploying ETL on Hadoop.

  1. Hadoop is not a data integration tool
  2. MapReduce programmers are hard to find
  3. Most data integration tools don’t run natively within Hadoop
  4. Hadoop may cost more than you think
  5. Elephants don’t thrive in isolation

Before you give up your email and phone number for the “free ebook,” be aware it is a promotional piece for Syncsort DMX-h.

That isn’t a bad thing, but if you are expecting something different, you will be disappointed.

The observations are trivially true and amount to this: Hadoop does not come with a user-facing interface, pre-written routines for data integration, or the other tools that data integration users normally expect.

OK, but a hammer doesn’t come with blueprints, nails, or wood either, and those aren’t “pitfalls.”

It’s the nature of a hammer that those “extras” need to be supplied.

You can either do that piecemeal or you can use a single source (the equivalent of Syncsort DMX-h).

Syncsort should be on your short list of data integration options to consider, but let’s avoid loose talk about Hadoop. There is enough of that in the uninformed mainstream media.
