This article introduces the notion of meta-model on which DataGalaxy is built:
Why is this notion fundamental? How is it used in DataGalaxy?
What is the DataGalaxy meta-model?
One of DataGalaxy's ambitions is to provide a consistent view of data in its environment. This means being able to link the different viewpoints on the data and the different uses made of it.
For many historical and organizational reasons, data management is a highly fragmented field. Several actors legitimately work on the data:
- Some people, usually on the IT side, manage the storage of data, i.e. how it is modeled and implemented in databases.
- Others manage the flows, i.e. how data moves between databases: format changes, consolidation, linking...
- Quite often, operational departments hold the functional understanding of the data, i.e. its meaning, use cases, validity...
- Finally, data services (BI, Analytics, etc.) manage the provision of data, aggregated or not, and of indicators in reports.
However, the growing number of actors, together with the increasing volume and diversity of data, reduces each stakeholder's ability to understand the whole. Building a comprehensive and understandable view of the data becomes very complex.
The meta-model therefore provides gateways between these different uses of the data and thus creates a coherent view of the data in its environment.
Why link information to each other?
It is useful to know where the data is stored or to have a map of the available reports. But far greater value lies in the ability to connect these different points of view and to combine them with others, such as business meaning or transformations.
Thus, knowing that the table T_Prosp in the Third-Party database holds the ID_NAME information is interesting, but linking that information to other information adds more value. For example:
- Linked to the glossary module, we would know that it is the Name of the prospect, i.e. a person whose identifiers were retrieved during a visit to our website.
- Linked to the processing module, we would know that it is consolidated with our clients' database in order to check for duplicates.
- Linked to the usage module, we would know that this information is used in an extraction made available to our partners.
The objective is therefore to link information across the different modules available in DataGalaxy in order to:
- Have a 360° vision of the data: meaning, storage, processing, exploitation...
- Link the different concepts together.
- Identify management impacts.
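As a purely illustrative sketch, the T_Prosp example above can be represented as a small graph of linked objects. The names `CatalogObject`, `LinkGraph` and `view_360` are hypothetical and do not come from DataGalaxy; only the module names and the example objects are taken from the text.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class CatalogObject:
    """Hypothetical catalog entry: the module it belongs to and its name."""
    module: str  # "dictionary", "glossary", "processing" or "usage"
    name: str


class LinkGraph:
    """Undirected links between catalog objects (illustration only)."""

    def __init__(self):
        self._links = defaultdict(set)

    def link(self, a, b):
        # A link can be followed in both directions.
        self._links[a].add(b)
        self._links[b].add(a)

    def view_360(self, obj):
        """Group the objects directly linked to obj by module."""
        view = defaultdict(list)
        for other in self._links.get(obj, ()):
            view[other.module].append(other.name)
        return {module: sorted(names) for module, names in view.items()}


# Objects from the example in the text.
field = CatalogObject("dictionary", "Third-Party.T_Prosp.ID_NAME")
term = CatalogObject("glossary", "Name of the prospect")
dedup = CatalogObject("processing", "Consolidation with the clients' database")
extract = CatalogObject("usage", "Extraction made available to partners")

graph = LinkGraph()
for other in (term, dedup, extract):
    graph.link(field, other)

print(graph.view_360(field))
```

Starting from the field, one lookup returns its meaning, its processing and its usage. Starting from the glossary term instead leads back to the field, which is what makes it possible to follow a change from one module to the others.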
How to manage these gateways?
The solution lets you manage links at a very fine level of granularity. This granularity must be adapted according to:
- Your level of maturity in data management: if this is your first mapping, starting with a field-to-field mapping may take a lot of time for proportionally less added value than a coarse mapping by large blocks ("big potatoes"). You may also quickly become discouraged!
- Your ability to keep these links up to date: data is constantly evolving, and maintaining a field-to-field mapping can quickly become a major burden. Maintainability is therefore a major criterion to take into account.
- The need expressed by the solution's use cases: there is no need to map every field finely to the flows if the goal is above all to know the business meaning of a given piece of data.
- The expected added value: there is no need to link databases that are not very strategic, for example.
DataGalaxy allows you to link:
- The physical location of the data with its meaning, flows and uses, i.e. the objects of the dictionary module with those of the three other modules (glossary, processing and usage).
- Data definitions with their exploitation, i.e. the objects of the usage module with those of the glossary module.
The relationships between objects are explained in this article, and how to create them in this one.
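The two families of links listed above can be written down as a small lookup, shown here as a sketch only. The constant `LISTED_PAIRS` and the function `is_listed_link` are hypothetical names, and the set contains only the pairs mentioned in this article; the actual meta-model may allow other relationships.

```python
DICTIONARY = "dictionary"
GLOSSARY = "glossary"
PROCESSING = "processing"
USAGE = "usage"

# Unordered module pairs mentioned in this article (assumption: not exhaustive).
LISTED_PAIRS = {
    frozenset({DICTIONARY, GLOSSARY}),    # physical location <-> meaning
    frozenset({DICTIONARY, PROCESSING}),  # physical location <-> flows
    frozenset({DICTIONARY, USAGE}),       # physical location <-> uses
    frozenset({USAGE, GLOSSARY}),         # exploitation <-> definitions
}


def is_listed_link(module_a, module_b):
    """Return True if the article lists a link between the two modules."""
    return frozenset({module_a, module_b}) in LISTED_PAIRS


print(is_listed_link(GLOSSARY, DICTIONARY))
print(is_listed_link(USAGE, GLOSSARY))
```

Using unordered pairs reflects the fact that a link can be read from either side: a dictionary object linked to a glossary term is also a glossary term linked to a dictionary object.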