Use this stored procedure to build a linear regression model.
Note: This feature is available starting from Db2® version 11.5.4.
Authorization
The privileges held by the authorization ID of the statement must include the IDAX_USER role.
Restrictions
- The input columns must contain at least one continuous column.
- The total number of input columns, that is, the number of continuous columns plus the number of
nominal columns, must not be greater than 78.
- The number of distinct values for each nominal input column must not exceed 25.
- This stored procedure is not available on Linux® on IBM z Systems® and AIX® systems.
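The column-count and distinct-value restrictions can be checked with ordinary queries before the procedure is called. The following is a minimal sketch, not part of the procedure itself. It assumes the ADULT_TRAIN table and the WORKCLASS and EDUCATION columns from the example on this page, and that the table name is stored in uppercase in the catalog. Adjust the names, and add a TABSCHEMA predicate, for your own data.

```sql
-- Count the columns of the input table (assumed name: ADULT_TRAIN).
-- The id and target columns are part of this count but are not input columns.
SELECT COUNT(*) AS column_count
FROM SYSCAT.COLUMNS
WHERE TABNAME = 'ADULT_TRAIN';

-- Count the distinct values of candidate nominal columns.
-- Each result must not exceed 25.
SELECT COUNT(DISTINCT WORKCLASS) AS workclass_values,
       COUNT(DISTINCT EDUCATION) AS education_values
FROM ADULT_TRAIN;
```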
Syntax
IDAX.LINEAR_REGRESSION(in parameter_string varchar(32672))
Parameter descriptions
- parameter_string
- Mandatory one-string parameter that contains pairs of
<parameter>=<value> entries that are separated by a comma.
- Data type: VARCHAR(32672)
- The following list shows the parameter values:
- model
- Mandatory.
- The name of the linear regression model that is to be built.
- Data type: VARCHAR(64)
- intable
- Mandatory.
- The name of the input table.
- Data type: VARCHAR(128)
- id
- Mandatory.
- The column of the input table that identifies a sequence ID.
- Data type: VARCHAR(128)
- target
- Mandatory.
- The target column that is to be predicted.
- Data type: VARCHAR(128)
- incolumn
- Mandatory.
- The columns of the input table that have specific properties. Separate multiple columns with a
semicolon (;).
- Data type: VARCHAR(32000)
- coldeftype
- Optional.
- The default type of the input table columns.
- Allowed values are nom and cont.
- If this parameter is not specified, numeric columns are continuous, and all other columns are
nominal.
- Default: none
- Data type: VARCHAR(4)
- coldefrole
- Optional.
- The default role of the input table columns.
- Allowed values are input and ignore.
- Default: input
- Data type: VARCHAR(8)
- colPropertiesTable
- Optional.
- The input table where properties of the columns of the input table are stored.
- If this parameter is not specified, the column properties of the input table are detected
automatically.
- Default: none
- Data type: VARCHAR(128)
- intercept
- Optional.
- A flag that indicates whether the model is built with an intercept value.
- Allowed values are true and false.
- Default: true
- Data type: VARCHAR(5)
- usesvdsolver
- Optional.
- A flag that indicates whether singular value decomposition is to be forced for the model
calculation.
- Singular value decomposition returns results even if the input table forms a singular matrix,
so the results might not be correct.
- Set the value to false unless the modeling process for an input table reports a singular matrix
problem.
- Allowed values are true and false.
- Default: false
- Data type: VARCHAR(5)
- calculatediagnostics
- Optional.
- A flag that indicates whether diagnostics information is to be computed.
- Computing diagnostics requires an additional data scan with the computed model. Because this
process is expensive, do not activate it if performance has priority.
- Allowed values are true and false.
- Default: false
- Data type: VARCHAR(5)
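As an illustration of the optional flags, the following sketch rebuilds a model with singular value decomposition forced, as might be tried after a run reports a singular matrix problem. The model name adult_linreg_svd is hypothetical. The table and column names are taken from the example on this page, and the listed columns are assumed to be numeric, so that at least one input column is continuous.

```sql
-- Hypothetical model name; all other names come from the example on this page.
-- usesvdsolver=true forces singular value decomposition, so review the results carefully.
CALL IDAX.LINEAR_REGRESSION('model=adult_linreg_svd, intable=adult_train, id=id, target=age, coldefrole=ignore, incolumn=FNLWGT;CAPITAL_GAIN;HOURS_PER_WEEK, usesvdsolver=true');
```

For input tables that do not report a singular matrix problem, leave usesvdsolver at its default of false.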
Returned information
If the linear regression stored procedure completes successfully, the following string is returned
as a result set, where t means true.
Result set 1
--------------
LINEAR_REGRESSION
-----------------
t
1 record(s) selected.
Return Status = 0
Example
CALL IDAX.LINEAR_REGRESSION('model=adult_linreg, intable=adult_train, id=id, target=age, coldefrole=ignore,
incolumn=WORKCLASS;FNLWGT;EDUCATION;MARITAL_STATUS;OCCUPATION;RELATIONSHIP;SEX;CAPITAL_GAIN;CAPITAL_LOSS;HOURS_PER_WEEK;
INCOME, calculatediagnostics=false, intercept=true');