IBM Support

Poor performance during READ DATA SOURCE step of Cube Build.

Troubleshooting


Problem

With a relatively small amount of data, the READ DATA SOURCE step takes a very long time to complete. This occurs regardless of whether the data source is a Cognos Package, an IQD, or a flat file.

Symptom

Compared to other Transformer models with equivalent volumes of data, the affected model can take 10 times longer to read the same volume of data. The same model may even show a large difference in read time between two different sets of data (prod, uat, dev, etc.). The problem also occurs when only generating categories. It may be limited to specific data sources or specific Dimensions. An example of poor data read performance: 100,000 rows take 4 hours to read.

Cause

When a data source contains non-unique data that populates Levels with the Unique and Move properties set, read times increase because Transformer performs Category Moves to maintain Category uniqueness. Multiple drill-down paths and Custom Views multiply the effect and degrade performance further.

Diagnosing The Problem

Identify the source of poor cube build performance by analyzing the build log file. Check the number of records processed and the READ DATA SOURCE timing.

Example:
4:21:59 PM Timing, OPEN DATA SOURCE,00:00:00
9:25:03 PM End processing 90004 records from uat.csv
9:25:03 PM Timing, READ DATA SOURCE,05:03:04
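
When a log contains many data sources, the check described above can be scripted. The following Python sketch is illustrative only: it assumes the log lines follow exactly the layout of the three example lines above (real Transformer logs may contain other fields or formats), and the function name summarize_read_step is hypothetical, not part of any Cognos tooling.

```python
import re
from datetime import timedelta

# Patterns derived only from the three example log lines; real logs may differ.
TIMING_RE = re.compile(r"Timing, (?P<step>[A-Z ]+),(?P<h>\d+):(?P<m>\d+):(?P<s>\d+)")
RECORDS_RE = re.compile(r"End processing (?P<count>\d+) records from (?P<source>\S+)")


def summarize_read_step(log_lines):
    """Return (source, records, read_seconds, records_per_second) tuples."""
    results = []
    pending = None  # (source, record count) waiting for its READ DATA SOURCE timing
    for line in log_lines:
        rec = RECORDS_RE.search(line)
        if rec:
            pending = (rec.group("source"), int(rec.group("count")))
            continue
        tim = TIMING_RE.search(line)
        if tim and tim.group("step") == "READ DATA SOURCE" and pending:
            elapsed = timedelta(
                hours=int(tim.group("h")),
                minutes=int(tim.group("m")),
                seconds=int(tim.group("s")),
            ).total_seconds()
            source, count = pending
            # Guard against a zero-second read to avoid dividing by zero.
            rate = count / elapsed if elapsed else float("inf")
            results.append((source, count, int(elapsed), rate))
            pending = None
    return results


SAMPLE_LOG = [
    "4:21:59 PM Timing, OPEN DATA SOURCE,00:00:00",
    "9:25:03 PM End processing 90004 records from uat.csv",
    "9:25:03 PM Timing, READ DATA SOURCE,05:03:04",
]

for source, count, seconds, rate in summarize_read_step(SAMPLE_LOG):
    print(f"{source}: {count} records in {seconds} s ({rate:.1f} records/s)")
```

For the example log, this reports roughly 5 records per second. A rate that low for a flat file suggests the time is being spent on category processing rather than on reading the data.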

Check the Dimensions associated with the slow data source for Levels with properties Unique and Move checked.

Test by unchecking the Move option on those Levels and then generating categories for the slow data source. You can right-click on the data source in Transformer and select 'Generating Categories for Selected Data Source'. If the data is read quickly and you get an error indicating attempts to create a category in more than one path (for example, "TR2318 Transformer has detected 101372 attempts to create a category in more than one path."), then the Category Moves are the cause of the slow read.

Resolving The Problem

The problem can be attributed either to data that is not properly conformed or to Levels being incorrectly set to Unique. The solution depends on the business requirement.

If the data should contain Categories with the same name in different paths, then the Levels should not be set to Unique.

If the Categories should be unique, then determine why the data source contains non-unique data.
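
One way to start that investigation for a flat-file source is to look for values in the Unique level's column that appear under more than one parent value. The Python sketch below is a minimal illustration, not a Transformer feature: the column names Region and City, the sample rows, and the function name find_multi_parent_categories are all hypothetical, and it assumes the parent and child levels each map to a single column in the source.

```python
import csv
import io
from collections import defaultdict


def find_multi_parent_categories(rows, parent_col, child_col):
    """Return child values that occur under more than one parent value."""
    parents = defaultdict(set)
    for row in rows:
        parents[row[child_col]].add(row[parent_col])
    return {child: sorted(p) for child, p in parents.items() if len(p) > 1}


# Hypothetical sample data: "Springfield" appears under two different regions.
SAMPLE_CSV = """Region,City,Sales
East,Springfield,100
West,Springfield,250
East,Boston,300
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
conflicts = find_multi_parent_categories(rows, "Region", "City")
print(conflicts)  # {'Springfield': ['East', 'West']}
```

Each value reported this way is a candidate for a Category Move (or a TR2318 error when Move is unchecked). Whether it is a data error or a legitimate duplicate depends on the business requirement.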

Product: Cognos Business Intelligence (SSEP7J)
Business Unit: IBM Software w/o TPS (BU059)
Component: Transformer
Platform: AIX (PF002), HP-UX (PF010), Solaris (PF027), Windows (PF033)
Version: 10.2; 10.2.1; 10.2.2
Edition: All Editions
Line of Business: Data and AI (LOB10)

Document Information

Modified date:
15 June 2018

UID

swg21616176