IDAX.STD_NORM - Standardize or normalize columns of the input table

Use this stored procedure to standardize or normalize the columns of the input table, and to write the transformed columns to the output table.

Note: This feature is available starting from Db2® version 11.5.4.

Authorization

The privileges held by the authorization ID of the statement must include the IDAX_USER role.

Syntax

IDAX.STD_NORM(in parameter_string varchar(32672))

Parameter descriptions

parameter_string
Mandatory one-string parameter that contains pairs of <parameter>=<value> entries that are separated by a comma.
Data type: VARCHAR(32672)
The following list shows the parameter values:
intable
Mandatory.
The name of the input table.
Data type: VARCHAR(256)
incolumn
Mandatory.
The columns of the input table that are to be considered.
These columns are separated by a semi-colon (;).
You can append the following signs to each column:
<column>:L
Leaves the column unchanged.
The output table contains the same column with the same values.
<column>:S
Standardizes the column.
The output table contains an STD_<column> column that contains the standardized input values. A standardized value S is equal to (x-mean)/stddev, where x is the input value, mean is the mean value of the column, and stddev is the standard deviation of the column. The standardized values range from minus infinity to plus infinity.
<column>:N
Normalizes the column.
<column>:U
Specifies that the column has the length of one unit.
Additionally, you can append the following signs to two columns:
<col1>/<col2>:C
Normalizes the <col1> column and the <col2> column to be a unit vector in the space.
The output table contains an NRC_<rank>_<col1> column and an NRC_<rank>_<col2> column, where rank is the rank of this transformation in the list of :C transformations or :V transformations, beginning with 1.
Both columns contain their own normalized input values. The normalized value U1 for <col1> is equal to x/sqrt(x^2+y^2), where x is the input value of <col1>, and y is the input value of <col2>.
The normalized value U2 for <col2> is equal to y/sqrt(x^2+y^2), where x is the input value of <col1>, and y is the input value of <col2>. The normalized values range from -1 to 1.
<col1>/<col2>:V
Divides the values of the columns by the length of the longest row vector.
Normalizes the <col1> column and the <col2> column according to the longest row vector of the data.
The output table contains an NRM_<rank>_<col1> column and an NRM_<rank>_<col2> column, where rank is the rank of this transformation in the list of :C transformations or :V transformations, beginning with 1.
Both columns contain their own normalized input values. The normalized value V1 for <col1> is equal to x/sqrt(maxssq), where x is the input value of <col1> and maxssq is the maximum value of the square sum of the values of <col1> and <col2>.
The normalized value V2 for <col2> is equal to y/sqrt(maxssq), where y is the input value of <col2>, and maxssq is the maximum value of the square sum of the values of <col1> and <col2>. The normalized values range from -1 to 1.
The output table contains the id column and the output column as specified by the incolumn parameter. If the by parameter is specified, an additional by column is added to indicate for which group the transformation is calculated.
The naming convention of the output columns is limited. One input column with the same suffix :L, :S, :N, or :U must not appear twice in the incolumnparameter.
Data type: VARCHAR(ANY)
id
Mandatory.
The column of the input table that identifies a unique instance ID.
Data type: VARCHAR(128)
outtable
Mandatory.
The output table that contains the modified data.
Data type: VARCHAR(256)
by
Optional.
The column of the input table that splits the data into groups for which transformation is to be done.
Default: none
Data type: VARCHAR(128)
view
Optional.
A flag that indicates whether the output object should be stored as a view (true) or as a table (false).
Default: false
Data type: BOOLEAN

Returned information

VARCHAR(1000) the number of transformed columns in the output table. Columns that are specified with the suffix :L are ignored.

Example

CALL IDAX.STD_NORM('intable=IRIS, incolumn=petallength:S;petallength:L;petallength:N;petallength:U;sepallength/petallength:C, id=id, outtable=IRIS_NORM1');
CALL IDAX.STD_NORM('intable=IRIS, incolumn=petallength:S;petallength:L;petallength:N;petallength:U;sepallength/petallength:C, id=id,outtable=IRIS_NORM2, by=class');