In this section we explain what a central facts model is and why you need it. You will find an overview of the strategies to build the Central Facts Layer and the steps you need to follow to create a central facts model.
The main purpose of the central facts model is to enable normalized, temporal data storage, to integrate data across different data sources, and to offer options for creating derived facts. The central facts model is implemented using the Unified Anchor Model (UAM), an anchor-style model with two main constructs: Anchor and Context. An Anchor can be an independent entity or a dependent entity.
To capture the data history of an entity, the UAM splits the natural business key(s) from the entity's attributes. The Anchor keeps the identifier of the entity, i.e., the business key attributes, while the Context captures the descriptive attributes and/or relationships of the entity. A Context is functionally dependent on ("belongs to") its Anchor.
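The anchor/context split can be sketched in a few lines of code. This is an illustrative sketch only, not i-refactory output: the names `SupplierAnchor` and `SupplierContext` and the attributes are hypothetical, chosen to show how the business key is isolated from the time-varying descriptive attributes.

```python
from dataclasses import dataclass
from datetime import date

@dataclass(frozen=True)
class SupplierAnchor:
    supplier_nbr: str        # natural business key only; never changes

@dataclass
class SupplierContext:
    anchor: SupplierAnchor   # functionally dependent on ("belongs to") its Anchor
    name: str                # descriptive attributes...
    city: str
    valid_from: date         # ...tracked over time

anchor = SupplierAnchor("001")
history = [
    SupplierContext(anchor, "Acme Ltd", "Utrecht", date(2020, 1, 1)),
    SupplierContext(anchor, "Acme Ltd", "Amsterdam", date(2023, 6, 1)),
]
# The anchor stays stable; the data history lives entirely in context rows.
```

Because the anchor row is immutable, a change such as a supplier moving to another city produces a new context row rather than an update, which is what makes the storage temporal.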
When modelling the central facts model, you are supported by a dedicated i-refactory toolbox embedded in SAP PowerDesigner. In this toolbox, an Anchor is a hub (independent entity) or a link (dependent entity), and a Context is a satellite.
One important design decision when using the i-refactory with multiple data sources is whether to have one or multiple central facts models in the Central Facts Layer. It is advisable to make this choice consciously when the i-refactory is initially set up.
In general, you have three main options for setting up your model(s) within the Central Facts Layer:

1. One central facts model covering all data sources.
2. One central facts model per logical validation model.
3. One central facts model per group of logical validation models.

Option 1 is suitable when very few sources are in scope, changes are maintained by a small central team, and it is acceptable that all individual source facts are coupled in one schema and one deployment. Option 2 offers the most flexibility and allows deployments per logical view of the source. Option 3 is a variant of Option 2 for cases where only a grouping of different logical validation models satisfies the consumption request.
If you choose to have multiple central facts models (CFMs), there is one additional design decision, concerning the relation between the logical validation models and the central facts models. One possible approach is to build a central facts model for each of the available logical validation models. In this case, entities from distinct logical validation models are represented in distinct CFMs, and for each CFM there is a distinct generic data access model. However, since one central facts model can be related to several logical validation models, it is also possible to combine two or more logical validation models in the same central facts model.
In both scenarios, you should evaluate whether a Concept Integration Model is required. In general, when entities from two or more data sources represent the same real-world concept, it is advisable to create a Concept Integration Model. A Concept Integration Model is itself a central facts model, residing in the Central Facts Layer, and it is the place where you find the integration patterns (key roots) of your project. It is not mandatory, but it is a best practice to define one as a separate model when you work with several data sources with overlapping data, or when the integration patterns should be shared among team members.
{example} When to use an integration pattern?
Consider two data source systems (A and B) that both provide facts about suppliers, where suppliers in both data sources share the same identifier (for example, Supplier Nbr). If there is overlap between the suppliers from source A and the suppliers from source B, you can use an integration pattern (key root) to record, for example, that supplier '001' in source system A is the same as supplier '001' in source system B.
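The key-root idea from the example can be sketched as a small lookup structure. This is a hypothetical illustration of the concept, not the i-refactory implementation; the function `register` and the dictionary layout are assumptions made for the sketch.

```python
# Key root: one shared identity for suppliers delivered by two source systems.
key_root: dict[str, dict] = {}   # supplier_nbr -> integrated supplier record

def register(source: str, supplier_nbr: str, attrs: dict) -> dict:
    """Map a source-specific supplier onto the shared key root entry."""
    record = key_root.setdefault(
        supplier_nbr, {"supplier_nbr": supplier_nbr, "sources": {}}
    )
    record["sources"][source] = attrs   # keep source-specific facts side by side
    return record

a = register("A", "001", {"name": "Acme Ltd"})
b = register("B", "001", {"name": "ACME Limited"})
assert a is b   # supplier '001' from A and from B resolve to the same key root
```

The point of the sketch is that both sources contribute facts to a single integrated record, so downstream consumers see one supplier rather than two.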
Regardless of the strategy you choose (one or multiple CFMs), you can follow these steps to create a CFM. In both cases, you create the entities of the CFM based on the entities available in the logical validation models.
When you work with multiple data sources and you choose to have a separate model for the integration patterns, then you can create a Concept Integration Model.
After you've created and checked the central facts model:
The main purpose of the Logical Validation Layer (LVL) is to transform the data received from external data sources to fit into the logical data model structure. It is also responsible for validating deliveries. The Logical Validation Layer is also known as the Historical Staging In (HSTGIN) Layer.
A schema is a set of database objects, such as tables, views, triggers, and stored procedures. In some databases a schema is called a namespace. A schema always belongs to one database. However, a database may have one or multiple schemas. A database administrator (DBA) can set different user permissions for each schema.

Each database represents tables internally as `<schema_name>.<table_name>`, for example `tpc_h.customer`. A schema helps to distinguish between tables belonging to different data sources. For example, two tables in two schemas can share the same name: `tpc_h.customer` and `complaints.customer`.
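The schema-qualified naming above can be demonstrated with SQLite, which exposes attached databases as schemas. The table and column names mirror the `tpc_h.customer` / `complaints.customer` example; the data is made up for illustration.

```python
import sqlite3

con = sqlite3.connect(":memory:")
# Attach two extra in-memory databases; each acts as a named schema.
con.execute("ATTACH DATABASE ':memory:' AS tpc_h")
con.execute("ATTACH DATABASE ':memory:' AS complaints")

# The same table name can exist in both schemas;
# the <schema_name>.<table_name> prefix disambiguates them.
con.execute("CREATE TABLE tpc_h.customer (customer_nbr TEXT, name TEXT)")
con.execute("CREATE TABLE complaints.customer (customer_nbr TEXT, complaint TEXT)")

con.execute("INSERT INTO tpc_h.customer VALUES ('001', 'Acme Ltd')")
con.execute("INSERT INTO complaints.customer VALUES ('001', 'Late delivery')")

rows = con.execute("SELECT name FROM tpc_h.customer").fetchall()
print(rows)  # -> [('Acme Ltd',)]
```

Both `customer` tables coexist without conflict because each lives in its own schema, which is exactly why schemas are useful for separating data sources.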
A Physical Data Model (PDM) represents how data will be implemented in a specific database.
{note} The i-refactory uses four PDMs: the technical staging model, the logical validation model, the central facts model and the generic access model. Each of these models is implemented in its own database schema, which is used to store data from external and internal data sources.