In the current zoo of data storage and analysis systems within a single organization, it is sometimes necessary to assemble reference data: data that is reliable, synchronized across subsystems, normalized, deduplicated, and cleaned. The need arises in organizations where departments have worked independently for a long time, each collecting data from its own systems in its own format.
The technological answer to this problem is to introduce an MDM system. MDM (Master Data Management) is essentially a set of processes, standards, and rules for handling and storing data uniformly across the organization. Its output is the so-called gold records, which represent entities (what counts as an entity depends on the business) and the relations between them. Gold records are the reference records: reliable data that can be used for further analytics.
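To make the idea of a gold record more concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the `SourceRecord` fields, the subsystem names, matching on a normalized email, and the "first non-empty value wins" survivorship rule. A real MDM system would use rules agreed with the business.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SourceRecord:
    # Hypothetical shape of a customer record coming from one subsystem.
    source: str
    name: str
    email: str
    phone: Optional[str] = None


def normalize_email(email: str) -> str:
    # Normalization step: trim whitespace and ignore letter case.
    return email.strip().lower()


def build_gold_records(records):
    """Group records by normalized email and merge each group into one gold record."""
    gold = {}
    for rec in records:
        key = normalize_email(rec.email)
        merged = gold.setdefault(
            key, {"email": key, "name": None, "phone": None, "sources": []}
        )
        # Assumed survivorship rule: the first non-empty value wins.
        if merged["name"] is None and rec.name.strip():
            merged["name"] = " ".join(rec.name.split()).title()
        if merged["phone"] is None and rec.phone:
            merged["phone"] = rec.phone
        # Keep lineage so we know which subsystems contributed.
        merged["sources"].append(rec.source)
    return list(gold.values())


records = [
    SourceRecord("crm", "  alice  smith", "Alice@Example.com"),
    SourceRecord("billing", "Alice Smith", "alice@example.com ", phone="+1-555-0100"),
    SourceRecord("support", "Bob Jones", "bob@example.com"),
]

gold_records = build_gold_records(records)
for record in gold_records:
    print(record)
```

The two Alice records from different subsystems collapse into one gold record that carries the phone number from billing and remembers both sources, while Bob stays a separate entity.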
In an ideal setup, all data from all sources passes through the MDM system, and consumer systems then use the resulting data.
In practice, it is rarely that simple. I have only a little experience building such systems, but I can say that difficulties always come up: the business processes of a particular organization get in the way, deduplication rules are not as simple as they seem, you often end up making copies of data that is supposed to be the single reference, and so on.
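A small sketch of why duplicate rules are harder than they look. It uses `difflib.SequenceMatcher` from the Python standard library as a stand-in for a matching rule; the helper names, the sample values, and the 0.85 threshold are assumptions chosen for illustration, not a recommended configuration.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    # Compare names after basic normalization (case and extra whitespace).
    def norm(s: str) -> str:
        return " ".join(s.lower().split())

    return SequenceMatcher(None, norm(a), norm(b)).ratio()


def is_probable_duplicate(a: str, b: str, threshold: float = 0.85) -> bool:
    # Assumed rule: treat two names as the same entity above the threshold.
    return similarity(a, b) >= threshold


pairs = [
    ("John Smith", "Jon Smith"),         # likely a typo of the same person
    ("John Smith", "Joan Smith"),        # possibly two different people
    ("Acme Corp.", "ACME Corporation"),  # likely the same company
]

for left, right in pairs:
    print(left, "|", right, "|", is_probable_duplicate(left, right))
```

With this threshold the rule merges both "Jon" and "Joan" into "John Smith", yet keeps the two spellings of the company apart. Lowering the threshold fixes the company and makes the person problem worse, which is why real deduplication rules usually combine several attributes and often include manual review.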