The following considerations must be noted while optimizing
the Netezza connector jobs:
Action column support
When the Action
Column write mode is used, you can designate a column
in the target Netezza connector as the action column.
Table 1. Action
column support
Action Value |
SQL Operation |
I |
Insert |
U |
Update |
D |
Delete |
R |
Replace (Delete then insert) |
M |
Merge (update or insert if record does not exist) |
When a processing stage or a join stage is pushed into a target
Netezza connector configured with the Action Column write
mode, a sequence of DELETE/UPDATE/INSERT statements is generated in
the After-SQL field in the target connector.
These SQL statements combine the processing logic or the join logic
with the original DELETE/UPDATE/INSERT operations implied by the Action
Column write mode.
Unique key column for update support
When
the Use unique key column option is set to Yes in
the target Netezza connector, Balanced Optimization generates additional SQL expressions
in the target SQL update statement. This ensures that only one row
from each group of rows with the same value in the unique key column
is used in the update statement.
Temporary Work Table support and Staging table management
Netezza
connector uses Temporary Work Tables (TWT) as a staging area for rows
received from the input link of a target Netezza connector. The TWT
tables can be managed in Automatic and Existing modes. In the Automatic
mode, the TWT tables are automatically created by the Netezza connector
at run time. In the Existing mode, the TWT name is explicitly specified
in the TWT Name property in the connector. InfoSphere DataStage Balanced
Optimization supports both Automatic and Existing modes in different
situations.
Scenario 1: The Push all processing
into the database option is selected and applied in the optimization When
the
Push all processing into the database option
is selected and applied in optimization, the Temporary Work Tables
are managed in one of the following method depending on the mode selected:
Table 2. TWT support in Automatic and Existing modes for scenario 1
TWT mode in the original job |
How the Temporary Work Tables are managed |
Existing |
- TWT Name is not modified unless the Balanced Optimization staging table name is
set to a different table name. If the Balanced Optimization staging table name is
set to a different table name, the staging table is used as the new
TWT table in the optimized job. The staging table must have the same
schema as the TWT table specified in the original table. If the Separate
connection for TWT option is used, the staging table must
reside in the same database as the original Temporary Work Table.
- All other fields in TWT table properties area in the target connector
are copied from the original job to the optimized job regardless of
whether a staging table name is set or not.
- The final TWT table must exist before the optimized job is run.
Balanced Optimization does not automatically create or drop it.
|
Automatic |
- Every property in the TWT table properties area in the target
connector is copied from the original job to the optimized job regardless
of whether a Balanced Optimization staging table name is set or not.
- A TWT table is automatically created and dropped by the Netezza
connector. Balanced Optimization does not use the TWT table in the generated SQL queries.
|
Scenario 2: The Push processing to database
targets option and the Push data reduction
processing to database targets option (when available) are selected and applied in the optimizationIn
this scenario, if the Use separate connection for TWT property
is set to Yes in the target connector, a dummy
table name BALOP_DUMMY_TARGET_TABLE is used as the target table name
in the target connector of the optimized job. This dummy table must
exist before the optimized job is run. It can be a table of any schema,
such as a single column of integer type without any rows. All the
optimized jobs generated by Balanced Optimization use this dummy table name when the Use
separate connection for TWT option is set to Yes in
the target connector. No data is inserted into this table when an
optimized job is run.
If the
Push processing to
database targets option and the
Push data reduction
processing to database targets option (when available)
are selected and applied in optimization, the Temporary Work Tables
are managed in one of the following method depending on the mode selected:
Table 3. TWT support in Automatic and Existing modes for scenario 2
TWT mode in the original job |
How the Temporary Work Tables are managed |
Existing |
- TWT Name is copied from the original job to the optimized job
unless the staging table name property is set to a different table
name.
- The TWT table schema must match the input link schema of the target
connector in the optimized job. If the name does not match, an error
might occur at run time. InfoSphere DataStage Balanced Optimization
compares the input link schema of the target connectors between the
original job and the optimized job. If they are different, the TWT
table specified in the original job becomes invalid. If a staging
table name is not specified in the staging table name property, an
error is raised and the particular optimization step is abandoned.
- All other properties in the TWT properties area in the target
connector are copied directly from the original job to the optimized
job regardless of whether a Balanced Optimization staging table name is set or not.
- The resulting TWT table must exist before the optimized job is
run. InfoSphere DataStage Balanced Optimization does not automatically
create or drop it.
|
Automatic |
One of the following situation might occur:
- If the Separate connection for TWT option
is not specified in the target connector in the original job:
- If a staging table name is not specified, a staging table name
is automatically generated and used as the TWT table in the optimized
job. The TWT table is automatically created before the combined SQL
queries are executed, and automatically dropped after the SQL queries
are executed.
- If a staging table name is specified, then the staging table is
used as the TWT table. The TWT table is always automatically created
before the combined SQL queries are executed, and automatically dropped
after the SQL queries are executed.
- If the Separate connection for TWT option
is specified in the target connector in the original job, a staging
table name must be specified in the Staging table name property.
- If the staging table name is not specified, an error is raised,
and the optimization step is abandoned.
- If the staging table name is specified, the staging table is used
as the TWT table, and the TWT mode in the target connector in the
optimized job is set to Existing. It must exist in the separate TWT
database before the optimized job is run. In this scenario, Balanced Optimization
does not automatically create or drop the staging table.
|