IDAX.LINEAR_REGRESSION - Build a linear regression model

Use this stored procedure to build a linear regression model.

Note: This feature is available starting from Db2® version 11.5.4.

Authorization

The privileges held by the authorization ID of the statement must include the IDAX_USER role.

Restrictions

  1. The input columns must contain at least one continuous column.
  2. The total number of input columns, that is, the number of continuous columns plus the number of nominal columns, must not exceed 78.
  3. The number of distinct values for each nominal input column must not exceed 25.
  4. This stored procedure is not available on Linux® on IBM z Systems® and AIX® systems.
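
Before building a model, it can help to check the distinct-value restriction on nominal columns with an ordinary query. The following is a minimal sketch, assuming the adult_train table and the WORKCLASS and EDUCATION columns from the Example section; substitute your own table and nominal column names.

```sql
-- Count the distinct values in two nominal input columns.
-- Table and column names are borrowed from the Example section (assumption).
-- Each count must not exceed 25 for the column to be usable as a nominal input.
SELECT COUNT(DISTINCT WORKCLASS) AS workclass_values,
       COUNT(DISTINCT EDUCATION) AS education_values
FROM adult_train;
```

A column that reports more than 25 distinct values would need to be ignored or otherwise prepared before it is used as a nominal input.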

Syntax

IDAX.LINEAR_REGRESSION(in parameter_string varchar(32672))

Parameter descriptions

parameter_string
Mandatory single-string parameter that contains <parameter>=<value> pairs that are separated by commas.
Data type: VARCHAR(32672)
The following list describes the parameters:
model
Mandatory.
The name of the linear regression model that is to be built.
Data type: VARCHAR(64)
intable
Mandatory.
The name of the input table.
Data type: VARCHAR(128)
id
Mandatory.
The column of the input table that identifies a sequence ID.
Data type: VARCHAR(128)
target
Mandatory.
The target column that is to be predicted.
Data type: VARCHAR(128)
incolumn
Mandatory.
The columns of the input table that have specific properties. The column entries are separated by a semicolon (;).
Data type: VARCHAR(32000)
coldeftype
Optional.
The default type of the input table columns.
Allowed values are nom and cont.
If this parameter is not specified, numeric columns are continuous, and all other columns are nominal.
Default: none
Data type: VARCHAR(4)
coldefrole
Optional.
The default role of the input table columns.
Allowed values are input and ignore.
Default: input
Data type: VARCHAR(8)
colPropertiesTable
Optional.
The input table where properties of the columns of the input table are stored.
If this parameter is not specified, the column properties of the input table are detected automatically.
Default: none
Data type: VARCHAR(128)
intercept
Optional.
A flag that indicates whether the model is built with an intercept value.
Allowed values are true and false.
Default: true
Data type: VARCHAR(5)
usesvdsolver
Optional.
A flag that indicates whether singular value decomposition is to be forced for the model calculation.
Singular value decomposition produces results even if the input table is a singular matrix, so the results might not be correct.
Leave the value set to false unless the modeling process for an input table reports a singular matrix problem.
Allowed values are true and false.
Default: false
Data type: VARCHAR(5)
calculatediagnostics
Optional.
A flag that indicates whether diagnostics information is to be computed.
Computing diagnostics requires an additional data scan with the computed model. As this process is expensive, do not activate it if performance has priority.
Allowed values are true and false.
Default: false
Data type: VARCHAR(5)

Returned information

If the linear regression stored procedure completes successfully, the following string is returned as a result set, where t means true.

  Result set 1
  --------------

  LINEAR_REGRESSION
  -----------------
  t                

  1 record(s) selected.

  Return Status = 0

Example

CALL IDAX.LINEAR_REGRESSION('model=adult_linreg, intable=adult_train, id=id, target=age, coldefrole=ignore,
incolumn=WORKCLASS;FNLWGT;EDUCATION;MARITAL_STATUS;OCCUPATION;RELATIONSHIP;SEX;CAPITAL_GAIN;CAPITAL_LOSS;HOURS_PER_WEEK;
INCOME, calculatediagnostics=false, intercept=true');
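
As a further illustration, the following sketch shows how a build might be retried with singular value decomposition forced, which the usesvdsolver description recommends only after the modeling process reports a singular matrix problem. The model name adult_linreg_svd and the reduced column list are assumptions made for this illustration; the other parameters follow the pattern of the example above.

```sql
-- Hypothetical retry after a singular matrix problem was reported.
-- adult_linreg_svd is an illustrative model name, not one defined in this documentation.
-- usesvdsolver=true forces singular value decomposition; results might not be correct.
CALL IDAX.LINEAR_REGRESSION('model=adult_linreg_svd, intable=adult_train, id=id, target=age, coldefrole=ignore, incolumn=FNLWGT;CAPITAL_GAIN;CAPITAL_LOSS;HOURS_PER_WEEK, usesvdsolver=true');
```

Because the solver returns results even for a singular input, review the resulting model before relying on it.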