Another Word For It Patrick Durusau on Topic Maps and Semantic Diversity

November 9, 2012

Matching MDM with Hadoop: Think of the Possibilities [Danger! Danger! Will Robinson!]

Filed under: Hadoop,MDM — Patrick Durusau @ 4:08 pm

Matching MDM with Hadoop: Think of the Possibilities by Loraine Lawson.

From the post:

I’m always curious about use cases with Hadoop, mostly because I feel there’s a lot of unexplored potential still.

For example, could Hadoop make it easier to achieve master data management’s goal of a “single version of the customer” from large datasets? During a recent interview with IT Business Edge, Ciaran Dynes said the idea has a lot of potential, especially when you consider that customer records from, say, banks can have up to 150 different attributes.

Hadoop can allow you to explore as many dimensions and attributes as you want, he explained.

“They have every flavor of your address and duplications of your address, for that matter, in that same record,” Dynes, Talend’s senior director of product management and product marketing, said. “What Hadoop allows you to consider is, ‘Let’s put it all up there for the problems that they’re presenting like a single version of the customer.’”
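As a toy illustration of why "every flavor of your address" makes a single version of the customer hard to compute, here is a minimal sketch of address normalization before matching (the field names and abbreviation table are my invention, not anything from the post):

```python
# Hypothetical sketch (invented field names and abbreviation table): collapse
# "every flavor" of a customer's address into one canonical form before matching.

import re
from collections import defaultdict

ABBREVIATIONS = {"st": "street", "ave": "avenue", "rd": "road", "apt": "apartment"}

def normalize_address(addr: str) -> str:
    """Lowercase, strip punctuation, and expand common abbreviations."""
    tokens = re.sub(r"[^\w\s]", " ", addr.lower()).split()
    return " ".join(ABBREVIATIONS.get(t, t) for t in tokens)

records = [
    {"name": "J. Smith", "address": "123 Main St., Apt 4"},
    {"name": "John Smith", "address": "123 MAIN STREET APARTMENT 4"},
]

# Group records by normalized address: spelling variants collapse to one key.
groups = defaultdict(list)
for record in records:
    groups[normalize_address(record["address"])].append(record)

print(len(groups))  # 1: both spellings resolve to the same canonical address
```

Even this trivial normalizer shows where the semantic trouble starts: the abbreviation table itself is a judgment call, and 150 attributes means 150 such judgment calls.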

Dynes also thinks we’re still exploring the edges of Hadoop’s potential to change information management.

“We genuinely think it is going to probably have a bigger effect on the industry than the cloud,” he said. “It’s opening up possibilities that we didn’t think we could look at in terms of analytics, would be one thing. But I think there’s so many applications for this technology and so many ways of thinking about how you integrate your entire company that I do think it’ll have a profound effect on the industry.”

When I hear the phrase “…single version of the customer…” I think of David Loshin’s “A Good Example of Semantic Inconsistency” (my pointer, with comments).

David illustrates that “customer” is a term fraught with complexity.

Having a bigger gun doesn’t make a moving target easier to hit.

It can do more damage unintentionally than good.

January 7, 2012

First Look — Talend

Filed under: Data Integration,Data Management,MDM,Talend — Patrick Durusau @ 4:03 pm

First Look — Talend

From the post:

Talend has been around for about 6 years and the original focus was on “democratizing” data integration – making it cheaper, easier, quicker and less maintenance-heavy. They originally wanted to build an open source alternative for data integration. In particular they wanted to make sure that there was a product that worked for smaller companies and smaller projects, not just for large data warehouse efforts.

Talend has 400 employees in 8 countries and 2,500 paying customers for their Enterprise product. Talend uses an “open core” philosophy where the core product is open source and the enterprise version wraps around this as a paid product. They have expanded from pure data integration into a broader platform with data quality and MDM and a year ago they acquired an open source ESB vendor and earlier this year released a Talend branded version of this ESB.

I have the Talend software but need to spend some time working through the tutorials, etc. The plan is a review from the perspective of subject identity and re-use of subject identification.

It may help me to simply start posting as I work through the software, rather than waiting to create an edited review of the whole, which I could always fashion from the pieces if that looked useful.

Watch for the start of my review of Talend this next week.

December 28, 2011

MDM Goes Beyond the Data Warehouse

Filed under: Master Data Management,MDM — Patrick Durusau @ 9:32 pm

MDM Goes Beyond the Data Warehouse

Rich Sherman writes:

Enterprises are awash with data from customers, suppliers, employees and their operational systems. Most enterprises have data warehousing (DW) or business intelligence (BI) programs, which sometimes have been operating for many years. The DW/BI programs frequently do not provide the consistent information needed by the business because of multiple and often inconsistent lists of customers, prospects, employees, suppliers and products. Master data management (MDM) is the initiative that is needed to address the problem of inconsistent lists or dimensions.

The reality is that for many years, whether people realized it or not, the DW has served as the default MDM repository. This happened because the EDW had to reconcile and produce a master list of data for every data subject area that the business needs for performing enterprise analytics. Years before the term MDM was coined, MDM was referred to as reference data management. But DW programs have fallen short of providing effective MDM solutions for several reasons.

Interesting take on the problems faced in master data management projects. (Yes, I added index entries for MDM and “master data management.” People might look under one and not the other.)

It occurs to me that the transition toward a master data list includes understanding the data systems that will eventually migrate to it. Topic maps could play a useful role in creating the mapping to the master system, as well as in finding commonalities among the other systems to be migrated.

Documenting the master system with a topic map would give such a project a leg up, as they say, on its eventual migration to some other system.

And there are always alien data systems, different from the internal MDM system (assuming that comes to pass), which could also be mapped into the master system using topic maps. I say “assuming that comes to pass” about MDM systems because “reference data management,” had it been implemented, would already have solved the problems that MDM faces today.
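A crude sketch of the mapping role described above: a topic-map-like table that records which field in each source system identifies the same subject as a master-system field (all system and field names here are hypothetical):

```python
# Hypothetical sketch: a mapping from field names in several source systems
# to a single master identifier. System and field names are invented.

MASTER_MAP = {
    "cust_id":     {"crm": "customer_number", "billing": "acct_holder_id"},
    "postal_code": {"crm": "zip",             "billing": "post_code"},
}

def to_master(system: str, record: dict) -> dict:
    """Translate one source-system record into master-system field names."""
    out = {}
    for master_field, sources in MASTER_MAP.items():
        src_field = sources.get(system)
        if src_field in record:
            out[master_field] = record[src_field]
    return out

print(to_master("crm", {"customer_number": 42, "zip": "02139"}))
# {'cust_id': 42, 'postal_code': '02139'}
```

The point of keeping such a map explicit, rather than burying it in ETL code, is that the same table documents the master system and survives its eventual migration.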

IT services are not regarded as a project with a defined end point. After all, users expect IT services every day. And such services are necessary for any enterprise to conduct business.

Perhaps data integration should move from a “project” orientation to a “process” orientation, so that continued investment and management of the integration process is ongoing and not episodic. That would create a base for in-house expertise at data integration and a continual gathering of information and expertise to anticipate data integration issues, instead of trying to solve them in hindsight.
