DB2 Version 9.7 for Linux, UNIX, and Windows

Identity column load considerations

The load utility can be used to load data into a table containing an identity column whether or not the input data has identity column values.

If no identity-related file type modifiers are used, the utility works according to the following rules:

If the identity column is GENERATED ALWAYS, an identity value is generated for a table row whenever the corresponding row in the input file is missing a value for the identity column, or a NULL value is explicitly given. If a non-NULL value is specified for the identity column, the row is rejected (SQL3550W).
If the identity column is GENERATED BY DEFAULT, the load utility makes use of user-supplied values, if they are provided; if the data is missing or explicitly NULL, a value is generated.

The load utility does not perform any extra validation of user-supplied identity values beyond what is normally done for values of the identity column's data type (that is, SMALLINT, INT, BIGINT, or DECIMAL). Duplicate values are not reported.

In most cases the load utility cannot guarantee that identity column values are assigned to rows in the same order that these rows appear in the data file. Because the assignment of identity column values is managed in parallel by the load utility, those values are assigned in arbitrary order. The exceptions to this are as follows:

In single-partition databases, rows are not processed in parallel when CPU_PARALLELISM is set to 1. In this case, identity column values are implicitly assigned in the same order that rows appear in the data file parameter.
In multi-partition databases, identity column values are assigned in the same order that the rows appear in the data file if the identity column is in the distribution key and if there is a single partitioning agent (that is, if you do not specify multiple partitioning agents or the anyorder file type modifier).

When loading a table in a partitioned database where the table has an identity column in the partitioning key and the identityoverride modifier is not specified, the SAVECOUNT option cannot be specified. When there is an identity column in the partitioning key and identity values are being generated, restarting a load from the load phase on at least one database partition requires restarting the whole load from the beginning of the load phase, which means that there can't be any consistency points.

Note: A load RESTART operation is not permitted if all of the following criteria are met:

The table being loaded is in a partitioned database environment, and it contains at least one identity column that is either in the distribution key or is referenced by a generated column that is part of the distribution key.
The identityoverride modifier is not specified.
The previous load operation that failed included loading database partitions that failed after the load phase.

A load TERMINATE or REPLACE operation should be issued instead.

There are three mutually exclusive ways you can simplify the loading of data into tables that contain an identity column: the identitymissing, the identityignore, and the identityoverride file type modifiers.

Loading data without identity columns
The identitymissing modifier makes loading a table with an identity column more convenient if the input data file does not contain any values (not even NULLS) for the identity column. For example, consider a table defined with the following SQL statement:

   create table table1 (c1 varchar(30),
                        c2 int generated by default as identity,
                        c3 decimal(7,2),
                        c4 char(1))

If you want to load TABLE1 with data from a file (load.del) that has been exported from a table that does not have an identity column, see the following example:

   Robert, 45.2, J
   Mike, 76.9, K
   Leo, 23.4, I

One way to load this file would be to explicitly list the columns to be loaded through the LOAD command as follows:

   db2 load from load.del of del replace into table1 (c1, c3, c4)

For a table with many columns, however, this syntax might be cumbersome and prone to error. An alternate method of loading the file is to use the identitymissing file type modifier as follows:

   db2 load from load.del of del modified by identitymissing
      replace into table1

This command would result in the three columns in the data file being loaded into c1, c3, and c4 of TABLE1. A value will be generated for each row in c2.

Loading data with identity columns
The identityignore modifier indicates to the load utility that even though the input data file contains data for the identity column, the data should be ignored, and an identity value should be generated for each row. For example, a user might want to load TABLE1, as defined above, from a data file (load.del) containing the following data:

   Robert, 1, 45.2, J
   Mike, 2, 76.9, K   
   Leo, 3, 23.4, I

If the user-supplied values of 1, 2, and 3 are not used for the identity column, you can issue the following LOAD command:

   db2 load from load.del of del method P(1, 3, 4)
      replace into table1 (c1, c3, c4)

Again, this approach might be cumbersome and prone to error if the table has many columns. The identityignore modifier simplifies the syntax as follows:

   db2 load from load.del of del modified by identityignore
      replace into table1

Loading data with user-supplied values
The identityoverride modifier is used for loading user-supplied values into a table with a GENERATED ALWAYS identity column. This can be quite useful when migrating data from another database system, and the table must be defined as GENERATED ALWAYS, or when loading a table from data that was recovered using the DROPPED TABLE RECOVERY option on the ROLLFORWARD DATABASE command. When this modifier is used, any rows with no data (or NULL data) for the identity column are rejected (SQL3116W). You should also note that when using this modifier, it is possible to violate the uniqueness property of GENERATED ALWAYS columns.In this situation, perform a load TERMINATE operation, followed by a subsequent load INSERT or REPLACE operation.