Feature | Summary |
---|---|
Skipped Row Option | With this option the user can choose whether existing data is used in set validation for rows that are rejected on basic constraint checks. |
When rows of a delivery are rejected on basic constraints, the i-refactory by default tries to fetch the known data of these rejected rows from previous deliveries. This data is then used to perform set validations on the rows in the current delivery that were not rejected.
With the option `includeExistingDataForSkippedRowsInSetValidation` this default behavior can be changed for deliveries on this interface in the Logical Validation Layer (HSTGIN). If this flag is set to 'NO', the i-refactory will not fetch the data of skipped rows for set validation.
This option can be set on Interface, Delivery Agreement and Delivery level. On Interface level the option can be set to 'YES' or 'NO'; by default it is set to 'YES'. This setting is inherited from Interface to Delivery Agreement and from Delivery Agreement to Delivery, but can be overridden on both levels.
Delivery Agreements and Deliveries have a computed option, `includeExistingDataForSkippedRowsInSetValidationConclusion`, which by default equals the value of `includeExistingDataForSkippedRowsInSetValidation` on the corresponding Interface. This conclusion can be overridden per Delivery Agreement by setting `includeExistingDataForSkippedRowsInSetValidation` on a Delivery Agreement to 'YES' or 'NO' (see table below). For Deliveries it works the same: `includeExistingDataForSkippedRowsInSetValidation` on the Delivery overrides the Delivery Agreement's setting for that particular Delivery.
The table below shows how `includeExistingDataForSkippedRowsInSetValidationConclusion` is computed for a Delivery Agreement.
Delivery Agreement value | Interface value | Delivery Agreement Conclusion |
---|---|---|
DEFAULT | YES | YES |
DEFAULT | NO | NO |
YES | NO | YES |
NO | YES | NO |
The table below shows how `includeExistingDataForSkippedRowsInSetValidationConclusion` is computed for a Delivery.
Delivery value | Delivery Agreement value | Delivery Conclusion |
---|---|---|
DEFAULT | YES | YES |
DEFAULT | NO | NO |
YES | NO | YES |
NO | YES | NO |
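To make the inheritance rule in the tables above concrete, the sketch below shows how such a conclusion could be derived for a Delivery Agreement. It only illustrates the DEFAULT fallback logic; the table and column names (deliveryAgreement, interface, interfaceId) are hypothetical and do not reflect the actual i-refactory repository model.

```sql
-- Illustrative sketch only: table and column names are hypothetical.
SELECT da.id,
       CASE
         WHEN da.includeExistingDataForSkippedRowsInSetValidation = 'DEFAULT'
           THEN i.includeExistingDataForSkippedRowsInSetValidation  -- inherit from the Interface
         ELSE da.includeExistingDataForSkippedRowsInSetValidation   -- explicit override wins
       END AS includeExistingDataForSkippedRowsInSetValidationConclusion
FROM deliveryAgreement AS da
JOIN interface AS i ON i.id = da.interfaceId;
```

The same fallback applies one level down: a Delivery with value DEFAULT takes the conclusion of its Delivery Agreement, while an explicit YES or NO on the Delivery wins.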
The transactional write interface on a Generic Data Access Model assumed that only simple set-based insert, update and delete (through updates) statements were executed on the writable views. However, some unit tests failed because in some cases a SQLServer merge statement with mixed updates and deletes was executed on this API. Due to the complexity of the generated trigger code we prevented mixed updates and deletes.
In this release we fixed this limitation. You now have the ability to issue updates and (logical/physical) deletes in a single statement.
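As an illustration, the sketch below shows the kind of statement that previously failed and is now supported: a single MERGE against a writable view that both updates and deletes rows. The view, source table and column names are hypothetical and only serve to show the mixed update/delete pattern.

```sql
-- Illustrative sketch only: object and column names are hypothetical.
MERGE dbo.customer_view AS tgt            -- a writable view on the Generic Data Access Model
USING dbo.customer_staging AS src
  ON tgt.customer_id = src.customer_id
WHEN MATCHED AND src.is_deleted = 1 THEN
  DELETE                                  -- delete (logical or physical, depending on the view)
WHEN MATCHED THEN
  UPDATE SET tgt.customer_name    = src.customer_name,
             tgt.customer_segment = src.customer_segment;
```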
/acmDatadel/deliveryAgreement and /acmDatadel/Delivery
New properties:
- `includeExistingDataForSkippedRowsInSetValidation`: Determines if existing rows (from previous deliveries) should be merged into the dataset for set validation in case a row delivered in the current delivery is skipped due to a basic constraint violation. Only relevant for logical implementation models. The value by default will be set to YES if no specific value is provided.

/acmDatadef/interface, /acmDatadel/deliveryAgreement and /acmDatadel/Delivery
New properties:
- `includeExistingDataForSkippedRowsInSetValidation`: Determines if existing rows (from previous deliveries) should be merged into the dataset for set validation in case a row delivered in the current delivery is skipped due to a basic constraint violation. Only relevant for logical implementation models. The value by default will be set to YES if no specific value is provided.

/acmDatadef/interface and /acmDatadef/deliverableInterface
New properties:
- `includeExistingDataForSkippedRowsInSetValidation` with default value YES. Can be set to YES or NO.

/acmDatadel/deliveryAgreement
New properties:
- `includeExistingDataForSkippedRowsInSetValidation` with default value DEFAULT.
- `includeExistingDataForSkippedRowsInSetValidationConclusion`: the final calculated conclusion based on the setting on interface and agreement.

/acmDatadel/Delivery
New properties:
- `includeExistingDataForSkippedRowsInSetValidation` with default value DEFAULT.
- `includeExistingDataForSkippedRowsInSetValidationConclusion`: the final calculated conclusion based on the setting on agreement and delivery.

The i-refactory runtime engine runs on NodeJS LTS versions 10, 12 and 14.
None.
When creating a delivery you can now specify that CSV files should be loaded automatically into the technical staging tables.
You can set values in the property `csvLoaderSpecification`. When this property is set, the i-refactory engine will automatically load the supplied CSV files from the location specified in `uncPath` into the appropriate technical staging tables for this specific delivery.
The property `csvLoaderSpecification` has the following schema:
csvLoaderSpecificationschema = {
$id: '/acmDatadel/delivery/csvLoaderSpecification',
type: 'object',
properties: {
uncPath: { type: 'string' },
rowTerminator: { type: 'string', default: '0x0A' },
fieldTerminator: { type: 'string', default: ',' },
codePage: { type: 'string', default: '65001', nullable: false },
firstRowIsHeader: { type: 'boolean', default: true },
datafileType: { enum: ['char', 'widechar'], default: 'char' },
fileExtension: { type: 'string', default: 'csv' },
},
additionalProperties: false,
required: ['uncPath'],
}
For a detailed description of these properties please check the OpenAPI schema specification. If you set a value (the only required property is `uncPath`) the i-refactory engine will automatically load the data; otherwise you need to load the data into the appropriate tables with your own tooling.
The i-refactory engine will immediately start loading the relevant files in parallel when auto-load is applicable. If a required file cannot be read, or if a file cannot be loaded with BULK INSERT, the task will be set to FAILED (which you can monitor in the web app). The i-refactory engine does not move, remove or otherwise write to the specified `uncPath`.
You have the ability to REJECT the delivery while still in the loading phase. In this case already running technical load tasks will be completed but other technical load tasks will not be scheduled or processed. The delivery will be set in a REJECTING state and eventually to REJECTED.
This is an example of a delivery with csvLoaderSpecification properties set:
{
"architectureLayerCode": "HSTGIN",
"interfaceCode": "TPC_H_HSTGIN",
"tenantCode": "TPCH",
"receivedDt": "2018-01-01T00:00:01",
"sentDt": "2018-01-01T00:00:00",
"snapshotDt": "2018-01-01T00:00:00",
"isComplete": true,
"constraintsToValidate": "ALL",
"violatedConstraintMaxSampleSize": 50,
"logicalValidationDeliveryDeleteType": "LOGICAL",
"deliveredEntity": [],
"csvLoaderSpecification": {
"uncPath": "//Mac/home/i-refact-demo-tpch/deliveredData/orderSystem/delivery-report-load",
"codePage": "raw",
"fieldTerminator": "|",
"firstRowIsHeader": false,
"fileExtension": "tbl"
}
}
After successful acceptance of the delivery the monitor shows the generated load tasks for the technical staging tables. In the example monitor you can see that some of them have succeeded while others are processing or scheduled. The number of parallel load tasks depends on the concurrency settings.
You can look into the load task and check the generated SQLServer code and csvLoaderSpecification settings:
/**
* i-refactory version : 3.7.0
* Generated for entityUpdate(auditId) : 16
* SourceArchitetureLayer : EXT
* SourceInterface : TPC_H_TSTGIN
* SourceEntity : IR_TSTGIN.tpc_h.LINEITEM
* SourceEntityClassification : STABLE DEPENDENT
* TargetArchitetureLayer : TSTGIN
* TargetInterface : TPC_H_TSTGIN
* TargetEntity : IR_TSTGIN.tpc_h.LINEITEM
* TargetEntityClassification : STABLE DEPENDENT
* CsvLoaderSpecification.uncPath : /var/opt/home/i-refact-demo-tpch/deliveredData/orderSystem/delivery-report-load
* CsvLoaderSpecification.rowTerminator : 0x0A
* CsvLoaderSpecification.fieldTerminator : |
* CsvLoaderSpecification.codePage : raw
* CsvLoaderSpecification.firstRowIsHeader : false
* CsvLoaderSpecification.datafileType : char
* CsvLoaderSpecification.fileExtension : tbl
**/
TRUNCATE TABLE "IR_TSTGIN"."tpc_h"."LINEITEM";
BULK INSERT "IR_TSTGIN"."tpc_h"."LINEITEM"
FROM '/var/opt/home/i-refact-demo-tpch/deliveredData/orderSystem/delivery-report-load/LINEITEM.tbl'
WITH (TABLOCK, DATAFILETYPE='char', FORMAT='CSV', FIRSTROW=1, FIELDTERMINATOR='|', ROWTERMINATOR='0x0A', CODEPAGE='raw')
;
Making use of the autoload option has some restrictions:
- If you specify `csvLoaderSpecification` on SQLServer version 2017 or 2016, a functional error is returned.
- You only need to specify `uncPath`; the other properties will be initialized with their default values.
If your CSV files are encoded in another codepage or use a different field or row terminator, you need to set the appropriate values when creating the delivery.
If your files are UTF-16 encoded they should have a byte order mark. You also need to set `datafileType` to `widechar` and change `rowTerminator` to `0x0A00` or another appropriate rowTerminator value.
Although the SQLServer documentation states that it complies with the RFC 4180 standard, there is an issue when UTF-16 files are provided: the last line of a file needs to have a rowTerminator (end of file alone does not work). If the last line does not have a rowTerminator, this last line will not be loaded.
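The sketch below gives a rough idea of how the widechar and rowTerminator settings would surface in the generated load statement for a UTF-16 encoded file, assuming the other options keep the values from the example above. The exact statement generated by the i-refactory engine may differ.

```sql
-- Illustrative sketch only: the actual generated statement may differ.
BULK INSERT "IR_TSTGIN"."tpc_h"."LINEITEM"
FROM '/var/opt/home/i-refact-demo-tpch/deliveredData/orderSystem/delivery-report-load/LINEITEM.tbl'
WITH (TABLOCK, DATAFILETYPE='widechar', FORMAT='CSV', FIRSTROW=1,
      FIELDTERMINATOR='|', ROWTERMINATOR='0x0A00');
```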
Loading CSV files on SQLServer running on Linux has additional limitations: the `codePage` needs to be set to `raw`.
The Tedious driver which we use to connect to SQLServer supports the following connection types:
If files need to be accessed on, for instance, a file share, there are limitations in how these files can be accessed.
To allow for delegated access to another server a connection based on Kerberos is required. Unfortunately the Tedious driver does not support Kerberos based authentication. We are currently investigating the ability to connect through ODBC with Kerberos authentication support. For bulk loading data we will then use the ODBC based connection while still relying on the Tedious driver for the rest of the application.
The i-refactory engine stores data over time. For the GDAL current views (CRUD) and GDAL business key views, which read the data in the fact store, we need to apply a filter to retrieve only the latest version that is not logically removed.
The generated query in the view made use of a self join to determine which versioned record is the actual (or latest) version. This query is quite expensive due to the join it requires.
In this release we changed the query such that it uses the value of `acm_end_dt` to determine the latest version.
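A minimal sketch of the difference is shown below. The table name, the logical-delete indicator and the open end-date sentinel are assumptions for illustration only; the actual generated view code differs.

```sql
-- Illustrative sketch only: names and the end-date sentinel are assumptions.

-- Old approach: self join (via an aggregated subquery) to find the latest version per key.
SELECT c.*
FROM ctx.customer AS c
JOIN (SELECT customer_id, MAX(acm_start_dt) AS max_start_dt
      FROM ctx.customer
      GROUP BY customer_id) AS latest
  ON  latest.customer_id  = c.customer_id
  AND latest.max_start_dt = c.acm_start_dt
WHERE c.acm_is_deleted = 0;

-- New approach: an open-ended acm_end_dt directly identifies the latest version,
-- so no join is needed.
SELECT c.*
FROM ctx.customer AS c
WHERE c.acm_end_dt = '9999-12-31'
  AND c.acm_is_deleted = 0;
```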
Queries (especially when retrieving large sets) now on average consume 75% less CPU, the number of logical reads is reduced by 90% and the elapsed time is reduced by 78%.
The read performance improvements do come at a small cost of additional storage and write throughput. An additional index is automatically created on each context table which tracks changes over time. This index is automatically created upon importing a fact based model in case this index does not yet exist.
Issue | Summary |
---|---|
datatype truncation | Applying where conditions on datetime columns in the REST API resulted in truncation to scale 1 when a more precise scale value was supplied (e.g. milliseconds). |
flush transaction errors | When SQLServer reported a primary key violation error, it also reported the record that caused the violation. When this reported record contained multibyte characters, the i-refactory engine was not able to store the error message, resulting in a flush transaction error. The error message property in the database was of datatype varchar and has now been changed to nvarchar where applicable. |
Issue | Summary |
---|---|
Auxiliary key in Generic Data Access Model | Attributes for independent auxiliary relations were added to the auxiliary key when a dependent auxiliary reference was added afterwards. This issue is now fixed. |
Name to code conversion | The name to code conversion was applied to key columns in a dependent anchor which were not a reference to another anchor. This resulted in unwanted overrides of the code convention of a column. |
Export issues | Export to a temporary directory did not work as expected regarding the timestamp added and used in scripts. |