Archive for the ‘Master Data Management’ Category

Master Indexing and the Unified View

Monday, March 25th, 2013

Master Indexing and the Unified View by David Loshin.

From the post:

1) Identity resolution – The master data environment catalogs the set of representations that each unique entity exhibits in the original source systems. Applying probabilistic aggregation and/or deterministic rules allows the system to determine that the data in two or more records refers to the same entity, even if the original contexts are different.

2) Data quality improvement – Linking records that share data about the same real-world entity enable the application of business rules to improve the quality characteristics of one or more of the linked records. This doesn’t specifically mean that a single “golden copy” record must be created to replace all instances of the entity’s data. Instead, depending on the scenario and quality requirements, the accessibility of the different sources and the ability to apply those business rules at the data user’s discretion will provide a consolidated view that best meets the data user’s requirements at the time the data is requested.

3) Inverted mapping – Because the scope of data linkage performed by the master index spans the breadth of both the original sources and the collection of data consumers, it holds a unique position to act as a map for a standardized canonical representation of a specific entity to the original source records that have been linked via the identity resolution processes.

In essence this allows you to use a master data index to support federated access to original source data while supporting the application of data quality rules upon delivery of the data.

It’s been a long day but does David’s output have all the attributes of a topic map?

  1. Identity resolution – Two or more representatives the same subject
  2. Data quality improvement – Consolidated view of the data based on a subject and presented to the user
  3. Inverted mapping – Navigation based on a specific entity into original source records

Comments?

The Ironies of MDM [Master Data Management/Muti-Database Mining]

Saturday, November 24th, 2012

A survey on mining multiple data sources by T. Ramkumar, S. Hariharan and S. Selvamuthukumaran.

Abstract:

Advancements in computer and communication technologies demand new perceptions of distributed computing environments and development of distributed data sources for storing voluminous amount of data. In such circumstances, mining multiple data sources for extracting useful patterns of significance is being considered as a challenging task within the data mining community. The domain, multi-database mining (MDM) is regarded as a promising research area as evidenced by numerous research attempts in the recent past. The methods exist for discovering knowledge from multiple data sources, they fall into two wide categories, namely (1) mono-database mining and (2) local pattern analysis. The main intent of the survey is to explain the idea behind those approaches and consolidate the research contributions along with their significance and limitations.

I can’t reach the full article, yet, but it sounds like one that merits attention.

I was struck by the irony of MDM, which some data types would expand to be “Master Data Management,” is read here to mean, “Multi-Database Mining.”

To be sure, “Master Data Management” can be useful, but be mindful that non-managed data lurks just outside your door.

MDM: It’s Not about One Version of the Truth

Wednesday, October 31st, 2012

MDM: It’s Not about One Version of the Truth by Michele Goetz.

From the post:

Here is why I am not a fan of the “single source of truth” mantra. A person is not one-dimensional; they can be a parent, a friend, a colleague and each has different motivations and requirements depending on the environment. A product is as much about the physical aspect as it is the pricing, message, and sales channel it is sold through. Or, it is also faceted by the fact that it is put together from various products and parts from partners. In no way is a master entity unique or has a consistency depending on what is important about the entity in a given situation. What MDM provides are definitions and instructions on the right data to use in the right engagement. Context is a key value of MDM.

When organizations have implemented MDM to create a golden record and single source of truth, domain models are extremely rigid and defined only within a single engagement model for a process or reporting. The challenge is the master entity is global in nature when it should have been localized. This model does not allow enough points of relationship to create the dimensions needed to extend beyond the initial scope. If you want to now extend, you need to rebuild your MDM model. This is essentially starting over or you ignore and build a layer of redundancy and introduce more complexity and management.

The line:

The challenge is the master entity is global in nature when it should have been localized.

stopped me cold.

What if I said:

“The challenge is a subject proxy is global in nature when it should have been localized.”

Would your reaction be the same?

Shouldn’t subject identity always be local?

Or perhaps better, have you ever experienced a subject identification that wasn’t local?

We may talk about a universal notion of subject but even so we are using a localized definition of universal subject.

If a subject proxy is a container for local identifications, thought to be identifications of the same subject, need we be concerned if it doesn’t claim to be a universal representative for some subject? Or is it sufficient that it is a faithful representative of one or more identifications, thought by some collector to identify the same subject?

I am leaning towards the latter because it jettisons the doubtful baggage of universality.

That is a subject may have more than one collection of local identifications (such collections being subject proxies), none of which is the universal representative for that subject.

Even if we think another collection represents the same subject, merging those collections is a question of your requirements.

You may not want to collect Twitter comments in Hindi about Glee.

Your topic map, your requirements, your call.

PS: You need to read Michele’s original post to discover what could entice management to fund an MDM project. Interoperability of data isn’t it.

30 MDM Customer Use Cases (Master Data Management in action)

Friday, March 16th, 2012

30 MDM Customer Use Cases (Master Data Management in action)

Jakki Geiger writes:

Master Data Management (MDM) has been used by companies for more than eight years to address the challenge of fragmented and inconsistent data across systems. Over the years we’ve compiled quite a cadre of uses cases across industries and strategic initiatives. I thought this outline of the 30 most common MDM initiatives may be of interest to those of you who are just getting started on your MDM journey.

Although these organizations span different industries, face varied business problems and started with diverse domains, you’ll notice that revenue, compliance and operational efficiency are the most common drivers of MDM initiatives. The impetus is to improve the foundational data that’s used for analysis and daily operations. (Click on the chart to make it larger.)

Curious what you make of the “use cases” in the charts?

They are all good goals but I am not sure I would call them “use cases.”

Take HealthCare under Marketing, which reads:

To improve the customer experience and marketing effectiveness with a better understanding of members, their household relationships and plan/policy information.

Is that a use case? For master data management?

The Wikipedia entry on master data management says in part:

At a basic level, MDM seeks to ensure that an organization does not use multiple (potentially inconsistent) versions of the same master data in different parts of its operations, which can occur in large organizations. A common example of poor MDM is the scenario of a bank at which a customer has taken out a mortgage and the bank begins to send mortgage solicitations to that customer, ignoring the fact that the person already has a mortgage account relationship with the bank. This happens because the customer information used by the marketing section within the bank lacks integration with the customer information used by the customer services section of the bank. Thus the two groups remain unaware that an existing customer is also considered a sales lead. The process of record linkage is used to associate different records that correspond to the same entity, in this case the same person.

Other problems include (for example) issues with the quality of data, consistent classification and identification of data, and data-reconciliation issues.

Can you find any “use cases” in the Infomatica post?

BTW, topic maps avoid “inconsistent” data without forcing you to reconcile and update all your data records. (Inquire.)

Structure, Semantics and Master Data Models

Friday, March 9th, 2012

Structure, Semantics and Master Data Models by David Loshin.

From the post:

Looking back at some of my Informatica Perspectives posts over the past year or so, I reflected on some common themes about data management and data governance, especially in the context of master data management and particularly, master data models. As both the tools and the practices around MDM mature, we have seen some disillusionment in attempts to deploy an MDM solution, with our customers noting that they continue to hit bumps in the road in the technical implementation associated with both master data consolidation and then with publication of shared master data.

Almost every issue we see can be characterized into one of three buckets:

What do you think about David’s three buckets? Close? Far away?


David continued this line of postings:

Master Data Model Alternatives – Part 2 March 12, 2012.

Master Data Consolidation Versus Master Data Sharing: Modeling Matters! – Part 3 March 19, 2012.

Considerations for Multi-Domain Master Data Modeling – Part 4 March 26, 2012.

MDM Goes Beyond the Data Warehouse

Wednesday, December 28th, 2011

MDM Goes Beyond the Data Warehouse

Rich Sherman writes:

Enterprises are awash with data from customers, suppliers, employees and their operational systems. Most enterprises have data warehousing (DW) or business intelligence (BI) programs, which sometimes have been operating for many years. The DW/BI programs frequently do not provide the consistent information needed by the business because of multiple and often inconsistent lists of customers, prospects, employees, suppliers and products. Master data management (MDM) is the initiative that is needed to address the problem of inconsistent lists or dimensions.

The reality is that for many years, whether people realized it or not, the DW has served as the default MDM repository. This happened because the EDW had to reconcile and produce a master list of data for every data subject area that the business needs for performing enterprise analytics. Years before the term MDM was coined, MDM was referred to as reference data management. But DW programs have fallen short of providing effective MDM solutions for several reasons.

Interesting take on the problems faced in master data management projects. (Yes, I added index entries for MDM and “master data management.” People might look under one and not the other.)

It occurs to me that there may be transitions towards a master data list that includes understanding data systems that will eventually migrate to the master system. Topic maps could play a useful role in creating the mapping to the master system as well as finding commonalities in other systems to be migrated to the master system.

Documenting the master system with a topic map would give such a project one leg up as they say on its eventual migration to some other system.

And there are always alien data systems that have different data systems from the internal MDM system (assuming that comes to pass), which could also be mapped into the master system using topic maps. I say “assuming that comes to pass” about MDM systems because the “reference data management” if implemented, would have already solved the problems that MDM faces today.

IT services are not regarded as a project with a defined end point. After all, users expect IT services every day. And such services are necessary for any enterprise to conduct business.

Perhaps data integration should move from a “project” orientation to a “process” orientation, so that continued investment and management of the integration process is ongoing and not episodic. That would create a base for in-house expertise at data integration and a continual gathering of information and expertise to anticipate data integration issues, instead of trying to solve them in hindsight.