If I were Gartner, I could get IBM to support my stating the obvious. I would have to dress it up by repeating a lot of other obvious things but that seems to be the role for some “analysts.”
If you need proof of that claim, consider this report: Hadoop Is Not a Data Integration Solution. Really? Did any sane person familiar with Hadoop think otherwise?
The “key” findings from the report:
- Many Hadoop projects perform extract, transform and load workstreams. Although these serve a purpose, the technology lacks the necessary key features and functions of commercially-supported data integration tools.
- Data integration requires a method for rationalizing inconsistent semantics, which helps developers rationalize various sources of data (depending on some of the metadata and policy capabilities that are entirely absent from the Hadoop stack).
- Data quality is a key component of any appropriately governed data integration project. The Hadoop stack offers no support for this, other than the individual programmer’s code, one data element at a time, or one program at a time.
- Because Hadoop workstreams are independent — and separately programmed for specific use cases — there is no method for relating one to another, nor for identifying or reconciling underlying semantic differences.
All true, all obvious and all a function of Hadoop’s design. It never had data integration as a requirement, so finding that it doesn’t do data integration isn’t a surprise.
If you switch between “commercially-supported data integration tools,” you will still be working “…one data element at a time,” because common data integration tools don’t capture their own semantics. That means you can’t re-use your prior data integration with one tool when you transition to another. Does that sound like vendor lock-in?
Odd that Gartner didn’t mention that.
Perhaps that’s stating the obvious as well.
A topic mapping of your present data integration solution will enable you to capture and re-use your investment in its semantics, with any data integration solution.
Did I hear someone say “increased ROI”?
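For the curious, here is a rough sketch of what "capturing the semantics" outside any one tool could look like. It is not a topic map engine and not any vendor's API: the `Subject` class, the `merge` and `field_map` helpers, the example.org identifiers and the field names are all hypothetical, invented for illustration. The only idea borrowed from topic maps is that statements sharing a subject identifier get merged, whichever source they came from.

```python
from dataclasses import dataclass, field

# Hypothetical subject identifiers; any stable, tool-neutral URI would do.
CUSTOMER = "http://example.org/subject/customer-id"
ORDER_DATE = "http://example.org/subject/order-date"

@dataclass
class Subject:
    identifier: str                            # what the field is about
    names: set = field(default_factory=set)    # (source, field name) pairs
    notes: list = field(default_factory=list)  # why we think they match

def merge(statements):
    """Merge statements that share a subject identifier."""
    merged = {}
    for s in statements:
        target = merged.setdefault(s.identifier, Subject(s.identifier))
        target.names |= s.names
        target.notes.extend(s.notes)
    return merged

def field_map(merged, source):
    """Derive a field-name-to-subject mapping for one source."""
    return {
        name: ident
        for ident, subj in merged.items()
        for (src, name) in subj.names
        if src == source
    }

# Mappings recorded while working with two made-up systems.
statements = [
    Subject(CUSTOMER, {("crm", "cust_no")}, ["numeric key in CRM export"]),
    Subject(CUSTOMER, {("billing", "CustomerID")}, ["same key, zero padded"]),
    Subject(ORDER_DATE, {("billing", "ord_dt")}, ["ISO 8601 date"]),
]

merged = merge(statements)
billing_fields = field_map(merged, "billing")
```

The point of the sketch is that `merged` belongs to you, not to the integration tool. Move to a different tool and you regenerate its configuration from the recorded subjects and notes instead of rediscovering them one data element at a time.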