APAR status
Closed as new function.
Error description
Improve the performance of BigQuery connector read when using a select statement, by providing a new GCS staging option.
Local fix
N/A
Problem summary
Improve the performance of BigQuery connector read when using a select statement.
Problem conclusion
The following properties are added in the source context to improve read performance when "Generate SQL" is set to No.
1. Use GCS staging - Yes/No.
2. Database name - This is an optional property. If it is specified, the temporary staging table is created under this project ID.
3. Schema name - This is a mandatory property. The temporary staging table is created under this schema.
4. Google cloud storage bucket - This is a mandatory property, required to perform a read operation using the GCS staging option. This bucket is used as the staging area to store temporary files created during the read process.
5. File name prefix - This is an optional property that specifies the prefix of the temporary file names created in the Google cloud storage bucket.
6. File part size - This is an optional integer property that specifies the part size, in MB, at which a file is split. The default value is 50. This property can be adjusted to achieve higher performance for larger datasets. The heap size property should be modified based on the part size used: if a larger file part size is specified, the heap size should be increased accordingly.
Along with the above properties, an additional property is added on the target side to improve write performance:
1. File part size - This is an optional integer property that specifies the part size, in MB, at which a file is split. The default value is 50. This property can be adjusted to achieve higher performance for larger datasets. The heap size property should be modified based on the part size used: if a larger file part size is specified, the heap size should be increased accordingly.
Note:
1. The Use GCS staging option is recommended when the select statement returns a large number of records.
2. The recommended File part size is 50. The value can be tuned based on the total record size to improve job performance.
Limitation: Currently, the decimal and numeric data types are not supported with the GCS staging approach.
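The relationship between the mandatory and optional source-side properties can be sketched as a small consistency check. This is an illustration only, not the connector's API: the class name, field names, and messages are hypothetical, and the sketch assumes that the mandatory properties only apply when Use GCS staging is set to Yes and that File part size must be a positive integer.

```python
from dataclasses import dataclass
from typing import List, Optional

DEFAULT_FILE_PART_SIZE_MB = 50  # default value stated in this APAR


@dataclass
class GcsStagingReadOptions:
    """Hypothetical container for the source-side properties (not connector API)."""
    use_gcs_staging: bool = False
    schema_name: Optional[str] = None       # mandatory (assumed: only with staging on)
    gcs_bucket: Optional[str] = None        # mandatory (assumed: only with staging on)
    database_name: Optional[str] = None     # optional project ID for the staging table
    file_name_prefix: Optional[str] = None  # optional prefix for temporary files
    file_part_size_mb: int = DEFAULT_FILE_PART_SIZE_MB

    def validate(self) -> List[str]:
        """Return a list of problems; an empty list means the options are consistent."""
        problems: List[str] = []
        if not self.use_gcs_staging:
            # Assumption: staging properties are ignored when staging is off.
            return problems
        if not self.schema_name:
            problems.append("Schema name is mandatory")
        if not self.gcs_bucket:
            problems.append("Google cloud storage bucket is mandatory")
        # Assumption: the part size must be a positive whole number of MB.
        if not isinstance(self.file_part_size_mb, int) or self.file_part_size_mb <= 0:
            problems.append("File part size must be a positive integer (MB)")
        return problems
```

Calling `validate()` on an instance with staging enabled but no schema name returns a single message about the missing schema, while the optional properties can be left unset.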
This APAR also includes the changes for the following:
1. Increase data block size when writing to BigQuery - using File part size property
2. Report total rows modified after the Before/After SQL statement execution.
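As a rough aid for tuning File part size, the arithmetic below estimates how many parts a given data volume would be split into. It is a sketch under a stated assumption: that data is split strictly at the configured part size. The function name is hypothetical and the connector's actual splitting behavior may differ.

```python
import math


def estimate_part_count(total_size_mb: float, file_part_size_mb: int = 50) -> int:
    """Estimate the number of parts, assuming a strict split at the part size."""
    if file_part_size_mb <= 0:
        raise ValueError("file_part_size_mb must be a positive integer")
    if total_size_mb <= 0:
        return 0
    # Any remainder smaller than one part still needs its own part.
    return math.ceil(total_size_mb / file_part_size_mb)


# Roughly 1 GB of data at the default part size versus a larger part size.
default_parts = estimate_part_count(1024)      # 21 parts at 50 MB
larger_parts = estimate_part_count(1024, 200)  # 6 parts at 200 MB
```

Under this assumption a larger part size means fewer, bigger parts, which is consistent with the guidance to increase the heap size when the part size is increased.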
Temporary fix
Comments
APAR Information
APAR number
JR63409
Reported component name
WIS DATASTAGE
Reported component ID
5724Q36DS
Reported release
B71
Status
CLOSED PER
PE
NoPE
HIPER
NoHIPER
Special Attention
NoSpecatt / Xsystem
Submitted date
2021-03-05
Closed date
2021-03-18
Last modified date
2021-03-18
APAR is sysrouted FROM one or more of the following:
APAR is sysrouted TO one or more of the following:
Fix information
Fixed component name
WIS DATASTAGE
Fixed component ID
5724Q36DS
Applicable component levels
Line of Business: Data and AI (LOB10)
Business Unit: IBM Software w/o TPS (BU059)
Product: IBM InfoSphere DataStage (SSVSEF)
Platform: Platform Independent (PF025)
Version: 11.7
Document Information
Modified date:
19 March 2021