Install i-refactory Server

2023-10-19

In this section we will guide you through the setup and configuration of the i-refactory server, the web application and optionally the OAuth2 server. The Solution Overview briefly introduces what these components do and how they interact.

Before you begin

Before you continue installing i-refactory server please make sure the following prerequisites are met.

1. A host system is available

You should have system administrator access to a host system on which NodeJS is supported. The host system should have at least the following minimal resources.

  • Minimal 2 Gbyte of internal memory. In extreme high load scenario's you should increase the amount of memory.
  • A dedicated fast quad core CPU.
  • A low latency network connection to your database server.

{info} As a best practice you should install a single i-refactory server per (virtual) host.

2. Required SSL certificates are available

Our i-refactory servers run as HTTPS servers. You need to have a valid SSL certificate and SSL key file available. Our servers need read access to these files. You should check with your security officer regarding the policies of generating SSL certificates.

For development and testing you could create a self signed certificate. If you have openssl available here is an example how to create a certificate for the localhost domain.

openssl req \
    -newkey rsa:2048 \
    -x509 \
    -nodes \
    -keyout server.key \
    -new \
    -out server.crt \
    -subj /CN=localhost \
    -reqexts SAN \
    -extensions SAN \
    -config <(cat /System/Library/OpenSSL/openssl.cnf \
        <(printf '[SAN]\nsubjectAltName=DNS:localhost')) \
    -sha256 \
    -days 365

This command generates two files server.key and server.crt.

The client operating system or browser now needs to have the CA certificate added to its list of trusted root authority certificates. The instructions vary by operating system and browser but instructions for a few major clients are listed below. For all these steps the 'certificate' referred to is the 'server.crt'.

Client Instruction
Windows Right click the server.crt certificate file and select 'Install Certificate'. Follow the prompts to add the certificate to the trust store either for the current user only or all users of the computer.
Mac Open KeyChain and drag the file server.crt into KeyChain. Set the certificate to trusted.
Linux - Ubuntu sudo cp ~/server.crt/usr/local/share/ca-certificates/
sudo update-ca-certificates

See Ubunbtu help for more information.

3. You have access to the SQL Server instance

Your DBA should provide you with:

  • The hostname and port of the SQL Server instance where the i-refactory metadata repository is installed.
  • The application username and password with which you can connect to the SQL Server instance.
  • The name of the database where the i-refactory metadata repository is installed.

4. OAuth2 Server

The i-Refactory requires an Identity and Access Management (IAM) system that supports the OAuth2 protocol for the authentication and authorization. In the absence of an external IAM system, the i-Refactory comes with a slim OAuth2 server.

External Identity and Access Management system

When using an external Identity and Access Management system, it needs to support the OAuth2 protocol to integrate it with the i-Refactory. For the configuration you need to require the following information:

  • Authorization endpoint URI. This is the URI of your OAuth2 server we will use in our web application for an authorization request.
  • Token endpoint URI. This is the URI of your OAuth2 server we will use for a token request.
  • The public key file, the location and the encryption method used, or the JSON Web Key Set (JWKS) uri containing the public keys. Our i-refactory server needs access to the public key(s) so it can decrypt the access token and check it's validity.
  • Whether the authorization server mandates the use of the Proof Key for Code Exchange (PKCE) extension in the Authorization code grant flow.
  • The scope value the authorization server expects in order to request the permissions (roles) of the user.
  • Which access token claim the UI server should utilize for displaying the authenticated user
  • (optional) The expected value that the OAuth server sets in the issuer & audience claim of the access token.
  • (optional) The Sign out endpoint URI. This is the URI of the OAuth2 server we will use for a sign out request.

i-Refactory OAuth2 server

For using the i-refactory OAuth2 server we need to have read access to a private and public RSA key pair.

There are many ways to create RSA keys. OpenSSL is one of the most popular libraries for key creation and management:

# Generate a private key
openssl genpkey -algorithm RSA -out private_key.pem -pkeyopt rsa_keygen_bits:2048
# Derive the public key from the private key
openssl rsa -pubout -in private_key.pem -out public_key.pem

The private key file private_key.pem will be used in our OAuth2 server to sign the JWT token. The publicy key file public_key.pem will be used in our i-refactory server to check if the generated JWT token is not tampered with.

5. NodeJS

The i-refactory server, i-refactory OAuth2 server and i-refactory web application are all build in NodeJS. NodeJS applications can be run on almost any platform. All of these components run as HTTPS servers.

The i-refactory servers always run against a Long Term Support release of NodeJS. In our Release Notes Overview you can see which NodeJS version is required for the given i-refactory server release.

You can download the LTS version of NodeJS at Download NodeJS. To download a previous LTS version check this Download Previous Releases of NodeJS.

Follow the install instructions for your platform.

You can check if NodeJS is available by entering the following command in your terminal:

node --version

It should return the installed LTS version, for example: v12.14.1.

6. ODBC driver

The i-refactory uses an ODBC for the file based export functionality.

Microsoft provides a installation guide for the SQL Server ODBC driver (manager) for various platforms Microsoft ODBC Driver for SQL Server.

Follow the install instructions for your platform.

You can check if ODBC Driver Manager and the ODBC driver for SQL Server is available by entering the following command in your terminal:

odbcinst -q -d

It should return the installed ODBC for SQL Server version, for example: ODBC Driver 18 for SQL Server.

7. Configure the DSN

The connection properties for the ODBC driver are specified in the DSN. The i-Refactory will reference to a DSN to establish the connection to the database.

A DSN can be on system, user or file based level defined. The different types are explained in the Microsoft documentation: Documentation DSN

Depended on the Operating System and additional installed tools the DSN can be configured using a GUI. If not installed, it can configured using a text editor. The following command can be used to obtain the location of the DSN config files.

odbcinst -j

The documentation of DSN parameters for SQL Server are specified on Microsoft ODBC DSN parameters.

An example of the DSN configuration,

[dsn-sql-server-db]                     # Name of the DSN
Driver=ODBC Driver 18 for SQL Server    # The used SQL Server driver.
Server=sql-server-db                    # The host address of the SQL Server
APP=iRefactory export                   # Name of the application.
ApplicationIntent=ReadOnly              # Informs SQL Server that it can be routed to a read-only replica (if exists). 
Database=IR_GDAL                        # The database to connect to.
TrustServerCertificate=yes              # To trust self-signed certificates. 

i-Refactory installation steps

1. Create a folder

You should create a folder on your host filesystem where we are going to install the i-refactory server software and where you can store your configuration files and optionally store your SSL certificates and required key files.

The folder structure we suggest:

i-refactory\
    config\
    crypto\
    dist\

In the config folder you should store your configuration files.

In the crypto folder you should store the SSL certificate and key file and the public RSA key file.

2. Unpack i-refactory server folder

From the provided installation zip file unpack the folder i-refactory-engine to the folder i-refactory\dist.

3. Create a configuration file

Our i-refactory server requires a configuration file. The configuration file is a JSON document which contains mandatory and optional configuration settings.

We have provided a sample configuration file which you can find in i-refactory\dist.

  • config.example.json: This configuration file can be used as a template. Copy it to a convenient location and filename.

4. Modify the configuration file

With a text editor of choice edit your configuration file. When you have opened the file you will see something like this:

{
    "boot": {
        "database": {
            "code": "IREFACTORY",
            "description": "i-refactory metadata repository",
            "connection": {
                "driverConnectionProperties": {
                    "server": "",
                    "authentication": {
                        "options": {
                            "userName": "",
                            "password": ""
                        }
                    }
                }
            }
        }
    },
    "httpRestApi": {
        "enabled": true,
        "https": {
            "port": 3000,
            "host": "localhost",
            "key": "crypto/ssl.key",
            "cert": "crypto/ssl.crt"
        },
        "accessToken": {
            "publicKey": "crypto/key_public.pem",
            "signatureAlgorithm": "RS256"
        }
    },
    ...

The configuration file is in JSON format.

Follow the steps below to change your configuration setting.

Step a. Set the connection parameters for SQL Server

In your editor go to the section "boot". You should see something similar like below.

{
    "boot": {
        "database": {
            "code": "IREFACTORY",
            "description": "i-refactory metadata repository",
            "connection": {
                "driverConnectionProperties": {
                    "server": "",
                    "authentication": {
                        "options": {
                            "userName": "",
                            "password": ""
                        }
                    }
                }
            }
        }
    }
}

Change the following configuration settings:

parameter instruction
code Change the default value to the name of the database where the i-refactory metadata repository is installed.
server Set the value to hostname where your SQL Server Instance is running.
userName Set the value to the SQL Server login name who is authorized to access the i-refactory metadata repository.
password Set the value to the SQL Server login password.

Step b. Configure the i-refactory rest API server settings

In your editor go to section "httpRestApi" You should see something similar like below.

{
    "httpRestApi": {
        "https": {
            "host": "localhost",
            "port": 3000,
            "key": "",
            "cert": ""
        },
        "accessToken": {
            "publicKey": "crypto/key_public.pem",
            "signatureAlgorithm": "RS256"
        }
    }
}

Change the following configuration settings:

parameter instruction
host Set the hostname of the i-refactory rest API server. Defaults to localhost.
port Set the port number of the i-refactory server. Defaults to 3000.
key Specify the file location including the filename of the SSL key. You should use your generated or provided SSL key filename.
cert Specify the file location including the filename of the SSL certificate. You should use your generated or provided SSL certificate filename.
publicKey Specify the file location including the filename of the public key filename with which we can verify the validity of the OAuth2 token provided. You should use your generated public key filename for this.
jwksUri* As an alternative for the publickey parameter, specify the JSON Web Key Set (JWKS) url. The i-Refactory will automatically extract and refresh the public key(s) from the provided url.**
issuer* Optional, specify the expected value for the issuer claim in the access token that is provided to the i-Refactory API server. Tokens with an invalid value will be rejected.
audience* Optional, specify the expected value for the audience claim in the access token that is provided to the i-Refactory API server. Tokens with an invalid value will be rejected.
signatureAlgorithm Specify the signature algorithm used. If you use the i-refactory provided OAuth2 server you should keep the default setting RS256.

{note} *Parameters are not applicable for the i-Refactory OAuth2 server and should be left undefined. Use these parameters only in combination with an external OAuth server when applicable. {note} **When the address can only be reached through a proxy server, set the https_proxy environment variable with the address of the proxy server.

Microsoft Azure AD integration

Guidelines for setting the accessToken section of the config file when integrating the i-Refactory server with Microsoft Azure AD.

The example below is based on the v2 common Well-Known Configuration Endpoint of Microsoft, replace with your own values. Remove the comments when used in your config file, JSON does not support comments.

{
    "httpRestApi": {
        ...,
        "accessToken": {
            "jwksUri": "https://login.microsoftonline.com/common/discovery/v2.0/keys", // The value for the key "jwks_uri" in your Well-Known Configuration Endpoint.
            "issuer": "https://login.microsoftonline.com/{tenantid}/v2.0", // The value for the key "issuer", in your Well-Known Configuration Endpoint.
            "audience": "{Application (client) ID}", // The application/client ID of the `i-Refactory API` app registration in Microsoft Azure AD.
            "signatureAlgorithm": "RS256" // The value for the key "id_token_signing_alg_values_supported" in your Well-Known Configuration Endpoint, assuming there is only a single algorithm is listed.
        }
    }
}

For more information visit the Microsoft Azure AD Documentation: https://learn.microsoft.com/en-us/azure/active-directory/develop/access-tokens

Step c. Configure the Swagger server settings

In your editor go to section "httpRestApi" You should see something similar like below.

{
    "swaggerServer": {
        "enabled": true,
        "https": {
            "host": "localhost",
            "port": 3001,
            "key": "",
            "cert": ""
        }
    }
}

Change the following configuration settings:

parameter instruction
enabled If you set this value to true a Swagger server is started. You can access the Swagger documentation of our i-refactory Rest API via the browser. If you set this value to false we do not start a Swagger server and you cannot use your browser to view the i-refactory Rest API specification.
host Set the hostname of the Swagger server. Defaults to localhost.
port Set the port number of the i-refactory server. Defaults to 3001.
key Specify the file location including the filename of the SSL key. You should use your generated or provided SSL key filename. You can use the same SSL key as for the i-refactory Rest API server or opt for a different one.
cert Specify the file location including the filename of the SSL certificate. You should use your generated or provided SSL certificate filename. You can use the same SSL key as for the i-refactory Rest API server or opt for a different one.

Step d. Configure the UI server settings

In your editor go to section "httpRestApi" You should see something similar like below.

{
    "uiServer": {
        "enabled": true,
        "https": {
            "host": "localhost",
            "port": 3002,
            "key": "",
            "cert": ""
        },
        "apiUrl": "https://localhost:3000",
        "authorizationServer": {
            "clientId": "i-refactory-ui",
            "pkceEnabled": false,
            "scopeName": "roles",
            "claimName": "username",
            "authorizationEndPointUri": "https://localhost:3003/authorize",
            "tokenEndPointUri": "https://localhost:3003/token"
        }
    }
}

Change the following configuration settings:

parameter instruction
enabled If you set this value to true a UI server is started. You can access the i-refactory web application via the browser. If you set this value to false we do not start a UI server and you cannot use our i-refactory web application.
host Set the hostname of the UI server. Defaults to localhost.
port Set the port number of the UI server. Defaults to 3002.
key Specify the file location including the filename of the SSL key. You should use your generated or provided SSL key filename. You can use the same SSL key as for the i-refactory Rest API server or opt for a different one.
cert Specify the file location including the filename of the SSL certificate. You should use your generated or provided SSL certificate filename. You can use the same SSL key as for the i-refactory Rest API server or opt for a different one.
apiURL The web application sends request to the i-refactory Rest API server. It needs to know the URL where to reach the i-refactory Rest API server. Set the value to https:\\ + the host you specified in httpRestApi + ':' + the port you specified in httpRestApi.
clientId The web application needs to be identified by a unique id. This is required for OAuth2. Use the default value i-refactory-ui or change the value to a new string value. You will need this unique id later on when configuring OAuth2.
pkceEnabled* Specify whether the authorization server mandates the use of the Proof Key for Code Exchange (PKCE) extension in the Authorization code grant flow. Defaults to false.
scopeName* Specify the scope value that the authorization server requires in order to request the permissions (roles) of the user. Defaults to roles.
claimName Specify the access token claim the UI server should utilize for displaying the authenticated user. Defaults to username.
authorizationEndPointUri Specify the url of the OAuth2 server authorization end point.
tokenEndPointUri Specify the url of the OAuth2 server token end point.
signOutEndPointUri* Specify the url of the OAuth2 server sign out end point.

{note}*Parameters are not applicable for the i-Refactory OAuth2 server and should be left undefined. Use these parameters only in combination with your own OAuth server when applicable.

Microsoft Azure AD integration

Guidelines for setting the authorizationServer section of the config file when integrating the i-Refactory server with Microsoft Azure AD:

  • The clientId is the Application (client) ID of the i-Refactory GUI App registrations page in Microsoft Azure AD.
  • The pkceEnabled should be set to true, since Microsoft mandates the PKCE extension for the authorization code grant flow.
  • The scopeName should be set to {Application ID URI}/.default. The Application ID URI can be retrieved from the Expose an API page of the i-Refactory API App registrations section in Microsoft Azure AD.
  • The claimName should be set to access token claim the UI server should utilize for displaying the authenticated user.
  • The authorizationEndPointUri, tokenEndPointUri and signOutEndPointUri parameters can be obtained from the Well-Known Configuration Endpoint of your tenant in Microsoft. Visit https://login.microsoftonline.com/{tenantid}/v2.0/.well-known/openid-configuration with the tenant ID of your Microsoft Azure AD environment.

The example below is based on the v2 common Well-Known Configuration Endpoint of Microsoft, replace with your own values. Remove the comments when used in your config file, JSON does not support comments.

{
    "uiServer": {
        ...,
        "authorizationServer": {
            "clientId": "{Application (client) ID}", // The application/client ID of the `i-Refactory GUI` app registration in Microsoft Azure AD.
            "pkceEnabled": true, // Microsoft mandates the PKCE extension for the authorization code grant flow.
            "scopeName": "{Application ID URI}/.default", // The application/client ID of the `i-Refactory API` app registration in Microsoft Azure AD.
            "claimName": "email", // As example the claim "email" is used, this does require the `email` API permission of Microsoft Graph.
            "authorizationEndPointUri": "https://login.microsoftonline.com/common/oauth2/v2.0/authorize", // The value for the key "authorization_endpoint", in your Well-Known Configuration Endpoint.
            "tokenEndPointUri": "https://login.microsoftonline.com/common/oauth2/v2.0/token", // The value for the key "token_endpoint", in your Well-Known Configuration Endpoint.
            "signOutEndPointUri": "https://login.microsoftonline.com/common/oauth2/logout" // The value for the key "end_session_endpoint", in your Well-Known Configuration Endpoint.
        }
    }
}

Step e. Configure the export delivery settings

In your editor go to azureCredentials and exportSettings. You should see something similar like below.

{
    "azureCredential": {
        "clientId": "",
        "clientSecret": "",
        "tenantId": ""
    },
    "exportSettings": {
        "dataSourceName": "",
        "options": {
            "authentication": {
                "userName": "",
                "password": ""
            },
            "odbc2parquetBinaryPath": "",
            "maxColumnSize": 16000,
            "maxMemorySize": "2GB",
            "maxRowGroupSize": 1000000
        }
    },
}

The exportResourceSettings are required from Release 3.8, and will be defaulted as shown in the JSON example above. The export functionality uses a multipart-upload of data to files in order to limit the effect on available system resources. The following configuration settings are required :

parameter instruction
dataSourceName The name that is used to request a connection to an ODBC data source. The DSN registration contain the connection details.
userName When the selected authentication method in the DSN requires a username, this property must be specified. Any configured username in the DSN is overwritten.
password When the selected authentication method in the DSN requires a password, this property must be specified. Any configured password in the DSN is overwritten.
odbc2parquetBinaryPath Depending on the operating system and system architecture, the correct binary must be used. The system tries to automatically select the required binary. The available binaries are: darwin x64 & arm64, linux x64 & arm64, and win32 x64. Any other combination needs to be configured, by specifying the path to the required binary.
maxColumnSize Used to restrict size of columns with types of unlimited column size like nvarchar(max), varchar(max) and varbinary(max). The default is set to 16.000.
maxRowGroupSize This settings defines the maximum number of rows per row group of an entity. The default is set to 1.000.000
maxMemorySize The required amount of memory is calculated based on the number of rows to export, or max row group size if exceeded or when row count is disabled. This setting caps the memory allocation by reducing the row group size to fit within the specified memory size. As default set to 2GB, but can be adjusted using SI units.

{note} *Contact i-Refact support how to obtain a binary for other Operating System and system architecture.

Only if you want to use the option to export files to an Azure Container, it is required to configure azureCredentials. The access to Azure Containers is managed through OAuth with Service Principal authentication method. Change the following configuration settings:

parameter instruction
tenantID The identification of the Azure Tenant ID, also known as directory ID
clientId The client that represent the i-refactory application with enough rights to create folders and write files to container(s) within Azure tenant
clientSecret The client secret, if you need some help you can find it here

Step f. Configure the application settings

There are also some application settings not provided in the config example file. The majority of these settings are advanced settings that require in-depth knowledge of how the application works.

We advise to only opt for modifying two specific settings as shown here :

{
    "application": {
        "modifierIdProperty": "sub",
        "disabledRoles": [],
    }
}
parameter default instruction
modifierIdProperty sub The property in the OAuth2 token which should be used as the string to register in acm_modifier_id. The default setting for modifierIdProperty (sub) is by default not a valid claim in a Microsoft Azure Access token : either include the sub as a custom claim or alter this property to a claim that is present in the access token (for instance: oid).
disabledRoles empty array The roles that should be disabled. The disabled roles are ignored when it's checked if a user is authorized to execute a request. Developer role is currently the only allowed role to disable.
Advanced application settings

Below you find additional advanced settings. The default value is generally the best option. When considering changes it is strongly advised to consult i-Refact support.

parameter instruction
name The name of the application.
interfaceCode The code of the interface for which the i-refactory server should be started.
shutdownCheckInterval The amount of time in milliseconds to wait between subsequent checks whether the application can be stopped, when a graceful shutdown of the application is requested.
cache.cachedEntities A list of the entities that are cached in memory.
cache.keepDeletedRecordsTime The amount of time in milliseconds that a deleted record is kept in memory to assure that changes that have been committed shortly after each other are processed in the correct order.
lockManager.lockWaitTimeOut The amount of time in milliseconds to wait for a lock. If the lock cannot be obtained within the specified time period an error is returned to the requestor.
lockManager.lockWaitTimeOutCheckInterval The amount of time in milliseconds to wait between subsequent checks to obtain a lock.
lockManager.lockPendingWarningInterval The amount of time in milliseconds after which a log message is written to the logfile if a pending lock request is not honoured and not timed out

Constraint violation actions are applicable to certain constraint categories. Not all combinations of constraint categories and violation actions are allowed.

An attribute must have a value, whatever that value may be. It must not be NULL.

A data type of an attribute defines what value an attribute can hold. The data type specifies what type of mathematical, relational, or logical operations can be applied to it without causing an error.

An attribute datatype constraint is the most basic constraint type. It checks for the datatypes we support and have implemented.

For example, we check for string, length of string, integer, date, etc. In the following figure you can see the supported data types by PowerDesigner.

Image is omitted: Supported data types

Constraints can be violated and there are some actions that can be performed when a violation occurs. The possible actions are: EMPTY COLUMN, NO ACTION and SKIP ROW.

An attribute value constraint is an expression that is evaluated. The person who writes the expression is responsible for the correctness of it. The expression should be formulated in a positive way and lead to a Boolean answer. If the expression validates to True, than the value is correct.

Examples

  • The values in attribute X has to be bigger than 10: X > 10
  • The email address has to be in a certain pattern: email address LIKE '%_@_%.__%'

A Concept Integration Model is also a central facts model on which you place integration patterns. It is not required to create a concept integration model, but it can be very useful.

Every constraint is assigned to a constraint classification.

The main purposes of the Generic Data Access Layer (GDAL) are to provide logical perspectives for data consumption and to manage CRUD actions.

A generic data access model is a virtual data model that acts as an interface bridge between consumer applications and the central fact storage.

Every attribute is assigned to an attribute classification.

An entity record constraint checks whether an attribute meets the requirements set by another attribute belonging to the same entity.

The main purpose of the Logical Validation Layer (LVL) is to transform the data received from external data sources to fit into the logical data model structure. It is also responsible for validating deliveries. The Logical Validation Layer is also known as the Historical Staging In (HSTGIN) Layer.

The logical validation model is the representation of a single external data source in a logical format. It represent how data delivered by a specific tenant should be transformed, temporalized and validated in the {popup}logical validation layer. The logical validation model is also known as Historical Staging model (HSTGIN).

Multi-active attributes are attributes that contain a business key to provide multiple context records at the same time. For example: a customer has multiple types of phone numbers. “Home”, “Work” and “Mobile”. In that case we add a dependent entity on customer with key “Phone Nbr Type”. This is to prepare for the CFPL multi-active key on customer.

The main purpose of the Technical Staging Layer (TSL) is to create a common starting point for further data processing. It receives data delivered from external data sources and temporally stores them in a database. The input data should be in a tabular format (rows and columns).

Bi-temporal attribute is an attribute that changes over time: they follow a valid timeline. For example, a Part may have a price valid for December and a price valid for January.

Every entity is assigned to an entity classification and to a parent entity classification. The possible values for entity classification are: ALTERNATE KEY CONTEXT, ATTRIBUTE CONTEXT, GENERALIZATION,HELPER, REFERENCE CONTEXT, STABLE, STABLE DEPENDENT and STABLE INDEPENDENT

Entity Set Constraint An entity set constraint can be used to perform a check concerning values of two or more attributes that belong to different entities or to perform a check concerning the value of an attribute with respect to a set of values.

A Set Constraint Helper is a helper in the logical validation model. It is the implementation of a set constraint. The helper returns the records of an entity for a given set constraint, where the instances of this entity do not meet the definition of this set constraint.

The business requirements describe how data should be delivered for the data consumers (end users or applications) in terms of concepts, relationships between concepts and constraints to validate the data. These requirements can be described in a logical data model, for example.

A Business Rule Helper is a helper in the central facts model. It is a set-based calculation of derived facts. You need to use a Business Rule Helper if you want to make a calculation and want to keep a transaction history of the results of this calculation. You use the existing entities from the central facts model as input. The results of the helper must be materialized in 'regular' fact entities, such as Anchors and Contexts, to make them accessible in the Generic Data Access Layer.

Closed Open means that the timeline is valid from (vanaf in Dutch) the supplied valid start date until - but not including - (tot in Dutch) the supplied valid end date. In practice, this means that the start date of a valid time record is equal to the end date of the previous valid time record.

You need to create context-based entities when a set of data may be delivered within the boundaries of a parent context. A context-based entity applies when:

  • At least 2 entities are delivered.
  • A context relationship exists between these 2 entities. One entity is the parent context of the other entity.
  • The parent context entity is delivered as a delta and the child entity is delivered as a full set.

You need to create context-based entities when a set of data may be delivered within the boundaries of a parent context. A context-based entity applies when:

  • At least 2 entities are delivered.
  • A context relationship exists between these 2 entities. One entity is the parent context of the other entity.
  • The parent context entity is delivered as a delta and the child entity is delivered as a full set.

The Management Model contains the PowerDesigner objects for the Unified Anchor Modelling (UAM). When a UAM object is created, a so-called PowerDesigner replica of the corresponding Management Model object is created. This means that certain properties such as metadata columns and column stereotypes are configured in the Management Model and cannot be changed. The replication settings specify which elements of an object can be changed after creating a replica from the template object. It is possible to override the replication settings of an UAM object and change a specific property.

The temporal atomic type describes the datatype of the temporal attributes|

The main purposes of the Central Facts Layer (CFL) is to store data historically. It can also integrate data from different sources. The Central Facts Layer is also known as Central Facts Persistency Layer (CFPL)

The central facts persistence implementation model is the representation of facts in an anchorized data model with the ability to integrate multiple logical models.

In the context of i-refactory, data transformation refers to operations involved in turning raw data readily useful and closer to the business requirements.

Integration patterns are used to integrate entities from different data models. If two or more entities from different data models share the same business key, you can use the Integration Pattern named Key Root. It is a good practice to capture integration patterns in a separate model, named Concept Integration Model.

An attribute is mandatory when its value can not be empty (NULL).

A Physical Data Model (PDM) represents how data will be implemented in a specific database.

{note} The i-refactory uses four PDMs: technical staging model, logical validation model, central facts model and generic access model. Each one of these models is implemented as an additional database, which is used to store data from external and internal data sources.

Reverse engineering is the process of reconstructing a physical and/or Entity Relationship (ER) model from an existing data source. The purpose of reverse engineering is to avoid manual work as much as possible.

Architecture layer

The core of the i-refactory architecture has four layers: TSTGIN, LVL, CFL and GDAL. There are also two auxiliary layers: UCLVL and EXT.

If an entity has one or more attributes that changes over time and you want to keep track of when a attribute is valid at a certain transaction time, then you have a special case of a regular dependent entity, called bi-temporal entity. The bi-temporal entity stores historical data with two timelines. The primary key of the bi-temporal entity is composed by the primary key of the parent entity and the valid start date attribute. The attribute that changes over the valid time is called a bi-temporal attribute.

If an entity has one or more attributes that changes over time and you want to keep track of when a attribute is valid at a certain transaction time, then you have a special case of a regular dependent entity, called bi-temporal entity. The bi-temporal entity stores historical data with two timelines. The primary key of the bi-temporal entity is composed by the primary key of the parent entity and the valid start date attribute. The attribute that changes over the valid time is called a bi-temporal attribute.

A delivery agreement is a contract between a Tenant and a Logical Implementation Model or Generic Data Access model. An agreement has a duration. The delivery agreement set the architecture layer (interface) where the data should be ingested as well as the default settings to be applied to the deliveries.

A dependency mapping is a mapping between a helper (or BR helper) and a source entity used in the query of the helper. The helper and the source entity must belong to the same model.

  • Default dependency is set on entity level (source entity to helper entity)
  • To allow lineage on attribute level, via the Mapping editor, you could manually add the dependency on attribute level.

An Independent Entity is an entity that implements an Anchor for a Business Key that ‘stands alone’ e.g. that does not contain a reference to another Entity.

An Independent Entity is an entity that implements an Anchor for a Business Key that ‘stands alone’ e.g. that does not contain a reference to another Entity.

A Logical Data Model (LDM) matches the language, structure and quality of the business, regardless of the physical data implementation. The Logical Data Model reflects the business requirements.

A delivery may be considered as "untrusted" if deletes of data in the Logical Validation Layer have taken place and the processing of new deliveries cannot 100% rely (trust) on having enough statistics and data available to detect logical deletes, to determine the exact delta and to execute set based validations.

A delivery may be considered as "untrusted" if deletes of data in the Logical Validation Layer have taken place and the processing of new deliveries cannot 100% rely (trust) on having enough statistics and data available to detect logical deletes, to determine the exact delta and to execute set based validations.

A Dependent Entity is an entity that implements an Anchor for a Business Key that ‘depends’ in its existence on another Entity. A Dependent Entity contains Business Key fields of which at least one is a foreign key (FK).

A Dependent Entity is an entity that implements an Anchor for a Business Key that ‘depends’ in its existence on another Entity. A Dependent Entity contains Business Key fields of which at least one is a foreign key (FK).

The transaction time in i-refactory is different from what is commonly understood by transaction time. Transaction time is usually seen as the moment when a fact was stored in the database. In the i-refactory, the transaction time is the time, as dictated by the source system, not by the i-refactory database.

The Attribute type links the attribute to one of the existing interfaces.

Computed columns are columns whose content is computed from values in other columns in the table.

Functional date A functional date or time is a point in time and is defined by a user. An example is an order date or date of birth.

The technical model (also known as Technical Staging In model: TSTGIN) is a representation of how exactly one delivery from a specific data source will be processed in the technical staging layer.

Generalization is the process of extracting shared characteristics from two or more classes (hyponyms), and combining them into a generalized superclass (hypernym). For example: an 'employee' and a 'customer' are both 'persons'.

The Mapping Editor provides a graphical interface for creating and viewing mappings between models. It provides a global view of all the mappings related to the entities of a given model, allowing you to quickly identify those which are mapped and not mapped.

When a certain fact can change over time and you need to capture when that fact is valid in the real world, you can add a valid start date and a valid end date to the entity.

A valid time tells us in which period a record is valid. While a functional date represents just one point in time, the valid time has a begin and an end date, for example:

  • For Order item 123, a Retail price of 10.00 was valid from 2019-01-01 to 2019-06-01.
  • For Order item 123, a Retail price of 12.00 was valid from 2019-06-01 to 2020-01-01.

Alternate key is an attribute or a group of attributes whose values uniquely identify every record in an entity, but which is not the primary key

Candidate key

A candidate key consists of one or more attributes and meets the following requirements:

  • Unique: The value of the key defines uniquely one instance of a concepts. There are no double values.
  • Non-volatile: (Almost) doesn't change.
  • Minimal: Contains only the elements needed.

There are two kinds of candidate keys:

  • primary key
  • alternative key

Normalization is the process of decomposing tables in a database in order to reduce data redundancy and improve data integrity.

A strongly typed model is a model in which each all attributes have a predefined data type, for example: integers, doubles, date.

Surrogate Key A surrogate key is a system generated unique identifier that does not have any contextual or business meaning.

Business Key

A business key is an unique identifier that has business meaning and exists in the real world outside of the database. It consists of a column or a set of columns that already exists in a table. A business key is also known as a natural key

A Key Root Hub is an integration concept that must be used when the exact same business concept or independent business key occurs in different models. The Hubs for this independent business key in the different UAM models are all subtypes of the Keyroot Hub.

A relationship shows how two entities are related to one another. For example, a customer can place an order, and a order can have a customer.

Every Attribute has an atomic type (data type) which is linked to the attribute type of that attribute.

The cardinality shows how many instances of an entity can take place in a relationship.

The cardinality shows how many instances of an entity can take place in a relationship.

An enumeration consists of the list of values that a given attribute should adhere to.

{example} An order can have different statuses, such as shipped,packing,created anddone. Other statuses are not allowed.

Foreign Key

A foreign key is an attribute or a set of attributes that refers to the primary key of another entity. The original entity containing the primary key is called the 'parent' entity and the entity containing the foreign key is called the 'child' entity.

A natural key is an unique identifier that has business meaning and exists in the real world outside of the database. It consists of an column or a set of columns that already exists in a table. A natural key is also known as a business key

The primary key is an assigned key that consists of a minimal set of attributes to uniquely specify an instance of a record. The attribute or a combination of attributes should meet the following characteristics:

  • Unique: The attribute values of the key uniquely identify one instance of a concept. There are no duplicate instances.
  • Non-volatile: The key does not change.
  • Mandatory: All values are filled; there are no NULL values.

It is good practice to choose a primary key that also meet the following characteristic:

  • Safe: Doesn't contain private or sensitive information, such as a social security number.

Constraints are related to the other elements depending of the type of the constraint. Certain constraints are associated to attributes, entities, helper entities, unique keys or relationships between entities.

An attribute may be assigned to one or more entities (ex: acm_exists_ind) and an entity may have several attributes

Each layer may have one or more interfaces. The amount of interfaces depend on how many tenants and delivery agreements have been configured.

Namespace is what in the terminology of SQL Server is called database schema.|

A Delivery is a container that holds the specification of what is actually pushed to the i-refactory platform. This specification consists of a list of entities.

A Delivery is a container that holds the specification of what is actually pushed to the i-refactory platform. This specification consists of a list of entities.

Key Root A Key Root is a central repository for Business Keys. A Key Root ensures that similar records out of different data sources are identified by both the same Business Key as the Surrogated Key.

Context

A Context is a temporal table with a transaction start and end date. The Context tracks all changes of the context attributes related to a business key in the transaction time. This means that every change of an attribute value in a source system leads to a new record in the Context. The old record is end dated with the load date and the new record is start dated with the load date.

Hyponym is a term that denotes a subcategory of a more general class. For example: 'cat' and 'dog' are a hyponyms of 'animal'.

A mapping establishes relationships between concepts of separate data models. It creates a link between entities and attributes from a source model to related entities and attributes in the target model. A source model should precede the target model in the i-refactory architecture.

oasi_bk is an abbreviation for One Attribute Set Interface (OASI) with business keys. A normal view in the generic data access layer (GDAL) consists of the surrogate key, foreign key and attributes. The oasi_bk-view in the GDAL is a view where the business key(s) are also shown.

A subtype is a subgroup of an entity. You can create a subtype if a group of instances share some attributes and relationships that only exist for that group. For example, entity Customer can have a subtype Company and a subtype Person. They share the common attribute customer number, and can have some attributes of their own. Such as birth date for a Person. The entity Customer is called a supertype.

A subtype:

  • inherits all attributes of the supertype
  • inherits all relationships of the supertype
  • usually has one or more own attributes
  • can have subtypes of its own

Anchor: Independent Entity

An Independent Entity is an entity that implements an Anchor for a Business Key that ‘stands alone’ e.g. that does not contain a reference to another Entity.

Anchor: Dependent Entity

A Dependent Entity is an entity that implements an Anchor for a Business Key that ‘depends’ in its existence on another Entity.

A domain will help you to identify the types of information in your model. It defines the set of values for which a column is valid. A domain can specify a data type, length, precision, mandatoriness, check parameters, and business rules. It can be applied to multiple columns, which makes it easier to standardize data characteristics for columns in different tables.

Each interface may have one or more entities and one entity belongs to only one interface. An entity belongs to an i-refactory data model.

Each interface may have one or more entities and one entity belongs to only one interface. An entity belongs to an i-refactory data model.

A helper entity creates a derived entity and can be used when you need to transform, filter, or calculate data. The purpose of a helper differs per model:

  • Technical model: a helper is used to transform data.
  • Logical validation model: a helper is an implementation of a set constraint (Set Constraint Helper).
  • Central facts model: a helper is used for a set-based calculation of derived facts (Business Rule Helper).

HSTGIN is the abbreviation of Historical STaging IN. It is an older term to indicate the Logical Validation Model or Logical Validation Layer.

A schema is a set of database objects, such as tables, views, triggers, stored procedures, etc. In some databases a schema is called a namespace. A schema always belongs to one database. However, a database may have one or multiple schema's. A database administrator (DBA) can set different user permissions for each schema.

Each database represents tables internally as <schema_name>.<table_name>, for example tpc_h.customer. A schema helps to distinguish between tables belonging to different data sources. For example, two tables in two schema's can share the same name: tpc_h.customer and complaints.customer.

A Tenant is a delivering party for a dataset or datarecord as agreed in the Delivery Agreement.

TSTGIN is the abbreviation of Technical STaging IN. It is an older term to indicate the Technical Model or Technical Staging Layer.

An index organizes data in a way that improves the speed of data retrieval from a database. To maintain the index data structure, there is a cost of additional writes and storage space.

An index organizes data in a way that improves the speed of data retrieval from a database. To maintain the index data structure, there is a cost of additional writes and storage space.

The acronym CRUD stands for create, read, update, and delete. These are the four basic functions of persistent storage.

OLAP is a acronym for Online Analytical Processing. OLAP is category of software tools which provide analysis of data for business decisions. It uses complex queries to analyze aggregated historical data from OLTP systems.The primary objective is data analysis and not data processing.

OLTP is a acronym for Online transaction processing. OLTP captures, stores, and processes data from transactions in real time. Its primary objective is data processing and not data analysis.

A hub or independent entity is an entity that implements an Anchor for a business key that ‘stands alone’ e.g. that does not contain a reference to another entity. An independent entity contains business key fields, that show up as alternate key (AK), and the primary key (PK) is its surrogate key (ID).

A key is a combination of one or more attributes of an entity that uniquely defines one instance of that entity.