What is Master Data Management?

by Nicolae Guse 18. December 2011 22:54

Before going into the Master Data Management topic in detail, I think we should take a step back and try to understand how the concept of Master Data appeared, and where it fits inside a company's IT infrastructure.

Since I’ve always liked a good story, let’s think a bit about the following one:

·         There was once a small company, driven by a few enthusiastic people who really believed they could make a great product, one that could really have an impact. At the beginning, the company's IT system was very simple, covering just their basic needs, and it allowed them to move fast on the market, since it had to handle only a few scenarios.

Since money was a scarce commodity at the beginning, some of the IT tools used were freeware products, some were bought, and some were built by the very people who put the company together. It's that stage where everyone is doing everything in order to kick-start the whole business. This means that, at this initial stage, the IT system doesn't have to be perfect, only good enough: just the bare necessities you strictly need in order to build and sell a great product.

·         The company starts to grow, since its initial product sales are very good. Then they realize that they can expand their portfolio in order to manufacture even more great products. And this means hiring more people, buying more equipment and facing more management challenges. Because when your company grows, you must make sure that your organization changes to accommodate that growth, which typically means that your IT system will have to accommodate this rapid growth as well. Since the company now has more money in the bank, they can really start to hire professionals to take care of the IT system.

They can also buy other tools but, typically, strictly when they need them, because most of the cash is now going into the expansion of the product portfolio and into sustaining the company's growth. When you need something new, and you'll need it very often, you'll go to your IT team and ask them to do their magic. It really doesn't matter if sometimes they'll have to re-invent the wheel, when there may be more generic tools on the market, because you can't afford those tools yet. This is the stage where you have all those distinct teams formed within the company: Marketing, Accounting, Finance, Production, Sales. Each one of these teams is made up of fresh people with a lot of ideas, which need to be implemented ASAP. And this is when those IT professionals come into the picture: since everyone has a project, you'll start to have one application created more or less for each project initiated by each team. Which means that, in the end, you'll get the tools you need to sustain the company's growth pretty fast. And yes, they may not be as consistent as you would like, and the way in which these applications share information could definitely be improved but, at this stage, they do their job.

·         The company has grown to become one of the emerging names on the market. Their products are appreciated by their customers, who perceive them as one of the trendsetters. You've gone from a small company to a medium-sized one, which has now expanded to several countries. You realize that there are a lot more opportunities to exploit, and now you have the cash to go that one step further. It's one of those moments when your organization needs to be resized once more, in order to confront these new challenges. And this is when you start to realize that the IT tools which helped you expand from small to medium-sized are not really prepared to make the next leap. This is when you really start to invest in the IT infrastructure, by buying a respectable ERP and really investing in Business Intelligence tools. Because you've reached the stage where you need to make informed decisions, and there are just too many things going on.

Still, you have a problem when you realize that the same business concepts, like Customers and Products, have subtle differences in their implementation across the various tools. Much too often, all these tools have their own set of referentials, which means that there is no magical way to propagate changes consistently through all the applications managed by the IT team. When you were smaller, that was considered good enough, only to realize now that the same system is no longer enough to sustain the company's growth. There are a lot of inconsistencies between the actual values of the same referential data across applications. And you waste a lot of time handling these inconsistencies. Something must be done, since all these problems can really slow down the whole company's growth.

And this is where the concept of Master Data Management (MDM) comes into place. Master Data Management is all about the centralized management of the critical referential data which must have a common vision across the company. Referentials like Customers and Products are great examples of such master data. You should be aware that not all the referential data within your company is master data. If some of this data is department-specific, and it doesn't make sense to share it with other departments, then it is not master data. What really separates master data from other regular referential data is the fact that this data is shared by different departments, and that it really needs to be perfectly synchronized between all the applications that use it.

When speaking about implementing a Master Data Management solution within the IT system, we are really speaking about the following components:

1.       Master Data Management Application

2.       Master Data Management Distribution

3.       Master Data Management Consumption

4.       Master Data Management Reporting

Master Data Management Application


It's a specialized software tool used to manage all aspects of your master data. Ideally, it should be the only point of entry for all the master data within your company. Normally, implementing such a tool involves more or less the following stages:

1.       Selecting the Software Vendor and the Implementation company

2.       Determining the referential data which will become master data

3.       Determine who the business owner is for each implemented hierarchy - there is a lot of work to be performed by these owners in order to mitigate the differences between each department's particular view of the same concept

4.       Clean up the referential data before the data load - this is where those business owners will really prove quite handy

5.       Determine which master data will be managed by the Master Data Management Application and which will continue to be managed by other applications. For example, you might have something like a specialized Contract application, with very complex rules implemented. You'll want that application to remain the single point of entry for the Contract information, but you'll also want this information in the Master Data Management Application, which will distribute it to all other applications. To handle this scenario, Acquisition Flows will have to be developed.

6.       Implementing Data Validation and Governance rules - this will help you avoid data consistency issues.

7.       Application initialization - make sure that the application supports initialization based on text files for all implemented hierarchies. This will be very useful later on, when you'll need to perform mass updates and going through the standard application interface, performing the modifications row by row, is really not an option.
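The text-file initialization from stage 7 can be sketched as a simple bulk load. This is only an illustrative sketch, not any specific MDM product's API: the `product_hierarchy` table, its columns, and the semicolon-delimited file layout are all hypothetical assumptions, and SQLite stands in for the application's database.

```python
import csv
import io
import sqlite3

# Hypothetical sketch: initialize a master data hierarchy from a delimited
# text file, upserting in bulk instead of editing row by row in the UI.
conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE product_hierarchy (
    code TEXT PRIMARY KEY, label TEXT, parent_code TEXT)""")

# Stand-in for the initialization text file (assumed format).
init_file = io.StringIO(
    "code;label;parent_code\n"
    "FOOD;Food;\n"
    "FOOD-DAIRY;Dairy;FOOD\n"
    "FOOD-BAKERY;Bakery;FOOD\n")

rows = [(r["code"], r["label"], r["parent_code"] or None)
        for r in csv.DictReader(init_file, delimiter=";")]

# One bulk statement replaces thousands of manual interface edits;
# re-running the same file simply refreshes the existing rows.
conn.executemany(
    "INSERT OR REPLACE INTO product_hierarchy (code, label, parent_code) "
    "VALUES (?, ?, ?)", rows)
conn.commit()
```

The same load can be re-run for a mass update: regenerate the file, feed it through the same flow, and every hierarchy node is refreshed in one pass.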

Master Data Management Distribution


It's the layer of the Master Data Management solution in charge of:

1.       Importing the Master Data from the MDM Application and storing it in a Central Master Data Repository used for distribution to all destination applications. Data will be loaded on an incremental basis.

2.       Adapting the Master Data to the Destination Application data model - a critical requirement for a successful Master Data Management implementation is to be able to distribute the MDM data with minimal or no modifications to the destination applications.

This is essential in order to isolate the destination applications from the complexity of the MDM data model, which, if not handled, would result in dramatic changes to the destination applications. Basically, the destination applications want to receive the data in exactly the same format as before. This means that it's the direct responsibility of the Distribution Layer to adapt the Master Data Model to the specifics of each application. Sometimes, the destination application will accept minimal modifications, in the form of staging tables, as close as possible to the final application tables. These Staging Tables will be used as a destination by the Distribution Layer.

3.       Exporting the Master Data to the Destination Applications - responsible for the distribution of the data to the destination applications. It must be very flexible in order to handle distribution in very different formats:

·         Relational Database systems: Microsoft SQL Server; Oracle

·         Flat text files

·         XML

·         Web Services

This flexibility of the Export layer extends beyond handling different formats, since it also must handle different distribution schedules. Some of the applications will want to receive data as soon as it has been modified in the Master Data Management Application, while others will want to receive the updated data in just one batch, once a day. Additionally, the Export Layer should also integrate a Data Compare Mechanism, which validates that all the data sent has really reached the Destination Application. Only after this Data Validation should the Destination Application be notified that an updated set of data is available for integration.
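The Data Compare Mechanism described above can be sketched as a fingerprint comparison: hash what was sent, hash what landed at the destination, and notify only on a match. This is a minimal illustrative sketch; the function names and row shapes are my own assumptions, not part of any particular MDM product.

```python
import hashlib

def fingerprint(rows):
    """Order-independent digest of a batch of rows: (row count, SHA-256)."""
    digest = hashlib.sha256()
    for row in sorted(map(repr, rows)):
        digest.update(row.encode("utf-8"))
    return len(rows), digest.hexdigest()

def validate_and_notify(sent_rows, received_rows, notify):
    """Compare sent vs. received data; notify the destination only on success."""
    if fingerprint(sent_rows) == fingerprint(received_rows):
        notify("Master data batch is complete and ready for integration.")
        return True
    return False  # missing or altered rows: do NOT trigger integration

# Hypothetical usage: two customer rows pushed to a destination application.
sent = [("C001", "Customer A"), ("C002", "Customer B")]
received = list(reversed(sent))          # arrival order doesn't matter
ok = validate_and_notify(sent, received, print)
```

In practice the "received" side would be read back from the destination's staging tables; the point is that the notification step is gated behind the comparison, not fired blindly after the export.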

There are a lot of challenges related to the Master Data Management Distribution, and one of the most critical ones is the choice of the distribution method, since it has a very important impact over the data consistency:

1.       PUSH - data is pushed by the Distribution Layer to all destination applications. No application has direct access to the Central Master Data Repository.

a.       Advantages: 

                                                   i.      The distributed data has the best consistency, since it's distributed only AFTER the Import from Master Data Management Application has been finished

                                                 ii.      The work of adapting the MDM Data Model to the Destination application is made by the MDM Team, which knows best the logic of the application, based on templates provided by the Destination application teams.

b.      Disadvantages:

                                                   i.      Involves direct access of the Master Data Management Distribution layer to the destination application tables. The creation of Staging Tables in the destination application database and the development of integration flows will involve some development work on the part of the destination application teams, even if they receive the data in the same format they've requested. It's the most restrictive distribution model.

2.       PULL - data is pulled by the Destination Applications directly from the Central Master Data Repository.

a.       Advantages:

                                                   i.      You don't have to address the complexity of adapting the Master Data to the Destination Application data model in the Distribution Layer. It's up to each application development team to adapt this data, as they know best what's happening in there, while the MDM team does not.

b.      Disadvantages:

                                                   i.      The Destination application teams may know their application very well, but they'll have a very limited knowledge of the Master Data Management Data Model, which constitutes a significant risk to data quality.

                                                 ii.      Data Consistency issues are very likely to occur when the import of Master Data from the MDM Application into the Central Master Data Repository overlaps with the import performed by the Destination Applications

3.       PUSH & PULL - data is simultaneously pushed by the MDM Distribution Layer to some destination applications, while other applications pull data directly from the Central Master Data Repository.

a.       Advantages: The most flexible distribution model of all

b.      Disadvantages: The most complex distribution model of all

Normally, this type of distribution model would be your target model, since it will allow some destination applications to get access to the MDM data very fast, with little or no development work from the MDM team. However, the biggest challenge will be to ensure data consistency, due to the intersection of the PULL operations with an ongoing Master Data import into the Central Master Data Repository. Data from an ongoing Master Data import will have to be hidden from the Destination Applications until all necessary tables have been refreshed, and only then is the final data published. Also, all destination applications using the PULL approach will have to first extract the Latest Public Version of the Master Data, and use only this version to perform their data extracts. Any failure to implement such a mechanism will result in inconsistent data.
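The version-gating mechanism described above can be sketched in a few lines: an in-flight import writes into a hidden staging area, and PULL consumers only ever read the Latest Public Version, which is switched atomically at publish time. The class and method names here are illustrative assumptions, not an actual product API.

```python
# Hypothetical sketch of version-gated publishing in the Central Master
# Data Repository: an in-flight import stays invisible to PULL consumers
# until it is explicitly published as the new Latest Public Version.
class MasterDataRepository:
    def __init__(self):
        self.versions = {}          # version number -> {hierarchy: rows}
        self.public_version = None  # the Latest Public Version
        self._staging = None        # in-flight import, hidden from consumers

    def begin_import(self):
        self._staging = {}

    def load(self, hierarchy, rows):
        self._staging[hierarchy] = rows   # not visible to consumers yet

    def publish(self):
        version = (self.public_version or 0) + 1
        self.versions[version] = self._staging
        self.public_version = version     # single atomic switch for readers
        self._staging = None

    def pull(self, hierarchy):
        """PULL consumers extract only from the Latest Public Version."""
        return self.versions[self.public_version][hierarchy]

# Hypothetical usage: a second import is in flight but not yet published.
repo = MasterDataRepository()
repo.begin_import()
repo.load("customers", ["C001", "C002"])
repo.publish()
repo.begin_import()
repo.load("customers", ["C001", "C002", "C003"])  # hidden until publish()
```

At this point `repo.pull("customers")` still returns the two published customers; the third appears only after the next `publish()`, which is exactly the isolation the PULL consumers need.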

Master Data Management Consumption


It's the layer of the Master Data Management solution in charge of integrating the Master Data into the destination application's final tables. Any way you put it, this is the job handled by the Destination Application's development team, working from data either pushed into the Staging Tables by the MDM team, or pulled by the application directly into the final application tables. This is where the rubber really meets the road, where the application users really see the changes from the Master Data Management Application and say IT WORKS!
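The staging-to-final integration performed by the destination team can be sketched as a simple upsert. The table names and columns below are illustrative assumptions, with SQLite standing in for the destination application's database.

```python
import sqlite3

# Hypothetical sketch of the Consumption layer: master data pushed into a
# staging table is integrated into the final application table.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE stg_customer (code TEXT PRIMARY KEY, name TEXT);
    CREATE TABLE customer     (code TEXT PRIMARY KEY, name TEXT);

    -- Existing application data, plus a fresh batch in staging.
    INSERT INTO customer     VALUES ('C001', 'Old Name');
    INSERT INTO stg_customer VALUES ('C001', 'New Name'),
                                    ('C002', 'Customer B');
""")

# Integrate: update existing customers, insert new ones, then clear staging.
conn.executescript("""
    INSERT OR REPLACE INTO customer SELECT code, name FROM stg_customer;
    DELETE FROM stg_customer;
""")

final = dict(conn.execute("SELECT code, name FROM customer ORDER BY code"))
```

After the run, `C001` carries the updated name and `C002` has been added, which is the moment the application users actually see the master data change land.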

Master Data Management Reporting


There are a lot of people within your company who need access to the hierarchies implemented in the Master Data Management Application, and only a small portion of them have the rights to create or modify master data. For all those people who need just read access to the Master Data, a reporting solution should be built, using the tools deployed by the BI Department. In this way, you avoid over-complicating access to the Master Data Management Application, while offering the final users the familiarity and flexibility of the standard reporting tool.


As a conclusion to this article, I would say that Master Data Management is essential within a company, ensuring data consistency for the critical referential data through the entire IT system. Even if, sometimes, you’ll get nostalgic about the days when things were simple enough not to need Master Data Management in the first place.

And now, back to you: Do you have a Master Data Management solution implemented in your company? What were the main challenges related to this implementation?

As this is the last article for this year, I wish you a Merry Christmas and a Happy New Year. And don't forget: as long as you have passion in your life, things will work out fine.

See you next year.





