In this section we explain what a logical validation model is and why you need it. You will also find an overview of how to create a logical validation model.
A logical validation model is in line with what the business wants in terms of language, structure, identification and quality. It specifies the required transformations on the source data to adhere to the business requirements, adds timestamps (transaction time) to the data, and adds constraints and business rules to validate the data.
You create a logical validation model for each technical staging model in the technical layer. This way, all data received from the technical layer will be validated and transformed to be in compliance with the business requirements. The logical validation model differs from the technical staging model in four ways:
BUT000
for an entity that represents a concept named as Customers
. In general, these technical terms are meaningless and the business doesn't use them when they talk about their product. In the logical validation model, entities and attributes names should reflect the business terms.{info} It is important to talk with the stakeholders and domain experts, i.e. the business, to identify the terms and names that should be used to describe the data.
Identifying key: each entity in a logical validation model must have an identifier, while in a technical staging model this is not mandatory. A sales order for example is identified by its order number.
Normalization: in a technical staging model, concepts can be hidden in tables. This means that one single table in a technical model can hold information regarding more than one concept. In the TPCH-H technical staging model, for example, the attributes CLERK
and CLERKKEY
are part of the Orders
table. In this case, the table Orders
holds information about orders but also about clerks.
In the logical validation model, to reflect the reality of the business, it might be necessary to split tables and to create new tables. Suppose, for example, that you want to identify Clerk
as an employee, who can exist separate from Orders
. To reflect this decision, you include a new entity, called Clerk
, in the logical validation model.
Constraints: a technical staging model can be a weak typed model, without foreign keys and other relationships between entities. In the logical validation model, you should add constraints to validate the quality of the data according to the business requirements.
{info} Constraint validation is one of the key features of i-refactory, which allows you to ensure that data delivered can be trusted. The results of the constraint validation phase are recorded and can be used in the communication with the delivering tenant or for statistical purposes.
There are two common ways to create a logical validation model:
Generate from a logical data model (LDM): when a logical data model is available, you can generate the logical validation data model from the LDM. This speeds up your work and ensures you use the exact definitions as specified in the LDM. Another advantage is having lineage on all physical data models. For example, if you alter an attribute in one model, you can evaluate the impact of this change in the other model.
Create manually: when there is no LDM available, you need to manually create the logical validation model based on the business requirements. For this, you need to understand the business data request (the Need) and you also need to perform a source data analysis (the Supply).
{info} During the creation of the logical validation model, you can make implementation choices different from the ones made for the LDM. This means, for example, that the LDM can be more explicit in specifying concepts, such as defining subtypes with only inherited attributes. In the logical validation model, the subtype could be implemented as an attribute in an entity.
After you've created and checked the logical validation model:
The main purpose of the Logical Validation Layer (LVL) is to transform the data received from external data sources to fit into the logical data model structure. It is also responsible for validating deliveries. The Logical Validation Layer is also known as the Historical Staging In (HSTGIN) Layer.
A schema is a set of database objects, such as tables, views, triggers, stored procedures, etc. In some databases a schema is called a namespace
. A schema always belongs to one database. However, a database may have one or multiple schema's. A database administrator (DBA) can set different user permissions for each schema.
Each database represents tables internally as <schema_name>.<table_name>
, for example tpc_h.customer
. A schema helps to distinguish between tables belonging to different data sources. For example, two tables in two schema's can share the same name: tpc_h.customer
and complaints.customer
.
A Physical Data Model (PDM) represents how data will be implemented in a specific database.
{note} The i-refactory uses four PDMs: technical staging model, logical validation model, central facts model and generic access model. Each one of these models is implemented as an additional database, which is used to store data from external and internal data sources.