Known issues for Db2 and Db2 Warehouse

Important: IBM Cloud Pak® for Data Version 4.8 will reach end of support (EOS) on 31 July 2025. For more information, see the Discontinuance of service announcement for IBM Cloud Pak for Data Version 4.X.

Upgrade to IBM Software Hub Version 5.1 before IBM Cloud Pak for Data Version 4.8 reaches end of support. For more information, see Upgrading from IBM Cloud Pak for Data Version 4.8 to IBM Software Hub Version 5.1.

The following known issues apply to the Db2 and Db2 Warehouse services.

Db2 logs using full disk capacity cause the Db2 service to become unusable

Applies to: 4.8.4 and later

Problem

Even when the cluster is idle, the Db2 storage can use the entire disk capacity, which disrupts all cluster operations, such as functional tests, backup and restore, and upgrades.

Symptoms

From within the db2u or etcd pod, running df -k shows that disk utilization for the archivelogs, activelogs, and blumeta0 directories is at 100% (or close to it).
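For example, from inside the db2u pod (the mount points shown here are assumptions based on the default paths that are referenced elsewhere in this topic; adjust them for your deployment):

df -k /mnt/logs/archive /mnt/logs/active /mnt/blumeta0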

Workaround

To bring the Db2 deployment back to an active and healthy state, complete the following steps:

  1. Manually delete the oldest archive logs in /mnt/logs/archive/db2inst1/BLUDB/NODE0000/LOGSTREAM0000/C0000000.
  2. Manually delete the .dump.bin files in the DIAGPATH directory.
  3. Deactivate and then reactivate the Db2 database. (Steps 1 to 3 are sketched in the example after this list.)
  4. Prune the transaction logs by following the steps in Managing Db2 transaction logs.
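The following is a minimal sketch of steps 1 to 3, run from inside the db2u pod as db2inst1. The database name BLUDB, the archive log path, and the number of logs to delete are assumptions based on the defaults shown in this topic; adjust them for your deployment.

# Step 1: remove only the oldest archive log files (here, the 20 oldest)
cd /mnt/logs/archive/db2inst1/BLUDB/NODE0000/LOGSTREAM0000/C0000000
ls -1tr S*.LOG | head -n 20 | xargs -r rm -f

# Step 2: find the diagnostic path, then remove the .dump.bin files in it
db2 get dbm cfg | grep DIAGPATH              # note the DIAGPATH value
find <DIAGPATH> -name "*.dump.bin" -delete   # replace <DIAGPATH> with that value

# Step 3: deactivate and then reactivate the database
db2 deactivate db BLUDB
db2 activate db BLUDB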

After you complete these steps, no directory is at 100% utilization, which allows both the db2u and etcd pods to come up and keep running.

Unable to authenticate db2u pod connection

Applies to: 4.8.0 and later

Problem

When reinstalling the Db2 service on Cloud Pak for Data, an issue can occur while authenticating a connection. This is due to a mismatch of the issuer ID between certificate secrets used by the db2u pod.

Symptoms

You see an error on the zen-database-core pod that is similar to the following output:

" level=error msg="Service instance 'db2oltp-wkc' database ping check failed" func="zen-databases-core/pkg/impl/operator.(Db2Connection).pingDb" file="/go/src/zen-databases-core/pkg/impl/operator/dbConnection.go:72" error="SQLDriverConnect: {08001} [IBM][CLI Driver] 
SQL30081N A communication error has been detected. Communication protocol being used: "SSL". Communication API being used: "SOCKETS". Location where the error was detected: "SOCKETS".
Communication function detecting the error: "sqlccSSLSocketSetup". Protocol specific error code(s): "414", "", "*". SQLSTATE=08001\n"

The following commands show the issuer ID mismatch between the internal-tls secret, which uses the cs-ca-certificate issuer, and the db2aaservice-internal-tls secret, which uses the zen-ca-certificate issuer:

oc get secret internal-tls -o jsonpath='{.data.ca\.crt}' | base64 -d | openssl x509 -noout -dates -issuer -subject
notBefore=Dec 27 09:31:31 2023 GMT
notAfter=Dec 26 09:31:31 2025 GMT
issuer=CN = cs-ca-certificate
subject=CN = cs-ca-certificate
oc get secret db2aaservice-internal-tls -o jsonpath='{.data.ca\.crt}' | base64 -d | openssl x509 -noout -dates -issuer -subject
notBefore=Aug 22 14:20:46 2022 GMT
notAfter=Aug 21 14:25:46 2025 GMT
issuer=CN = zen-ca-certificate
subject=CN = zen-ca-certificate
Workaround
  1. Update your certificate secret.
    1. Run the following command to edit your certificate:
      oc edit certificate db2aaservice-internal-tls
    2. Add test as a new entry under spec.dnsNames.
    3. Run the following command to verify that a new certificate request was generated:
      oc get certificaterequest | grep db2
  2. Run the following command to verify your changes:
    oc get secret db2aaservice-internal-tls -o jsonpath='{.data.ca\.crt}' | base64 -d | openssl x509 -noout -dates -issuer -subject
  3. Restart your db2u pod to confirm that the connection is successful.
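A minimal way to restart the pod is to delete it so that it is re-created. This sketch assumes that the db2u pod is managed by a StatefulSet and that ${db2_podname} holds the name of your db2u pod:

oc -n ${PROJECT_CPD_INST_OPERANDS} delete pod ${db2_podname}
oc -n ${PROJECT_CPD_INST_OPERANDS} get pod ${db2_podname}   # re-run until the pod is Running and Ready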

Connecting as cpadmin fails after IAM is disabled

Applies to: 4.8.0 to 4.8.4

Fixed in: 4.8.8

Problem
After IAM is disabled, some platform admin users fail to connect to the Db2 and Db2 Warehouse deployments as cpadmin.
Symptoms

Initially, the IAM integration for Cloud Pak for Data was enabled, and the Db2 connections worked normally. You then disabled the IAM integration by using the steps in Configuring Cloud Pak for Data to use the embedded LDAP integration. Now, you cannot connect to Db2 instances, whether they are existing deployments that were created before IAM was disabled or new deployments that were created after IAM was disabled.

The connection from the terminal fails with USERNAME AND/OR PASSWORD INVALID:

$ db2 connect to BLUDB user cpadmin using <password>
SQL30082N  Security processing failed with reason "24" ("USERNAME AND/OR
PASSWORD INVALID").  SQLSTATE=08001

In the db2diag log, you see an error from the IBMIAMauth::verify_cp4d_auth_iam function:

2024-03-01-02.24.45.239952+000 I4283174E595          LEVEL: Error
PID     : 14632                TID : 140006895118080 PROC : db2sysc 0
INSTANCE: db2inst1             NODE : 000            DB   : BLUDB
APPHDL  : 0-7714
HOSTNAME: c-db2oltp-1707901157071494-db2u-0
EDUID   : 52                   EDUNAME: db2agent (BLUDB) 0
FUNCTION: DB2 UDB, bsu security, sqlexLogPluginMessage, probe:20
DATA #1 : String with size, 170 bytes
IBMIAMauth::verify_cp4d_auth_iam: Fail to get response code, retval = 6, http_code = 400, jwt = Failed to get access token, invalid token request parameters server_error
Workaround
  1. Exec into the db2u pod for your deployment by running the following command:
    oc -n ${PROJECT_CPD_INST_OPERANDS} exec -it ${db2_podname} -- bash
  2. Switch to the db2inst1 profile:
    su - db2inst1
  3. Unset the DB2_BEDROCK_ROUTE environment variable:
    unset DB2_BEDROCK_ROUTE
  4. Stop wolverine:
    sudo sv stop wolverine
  5. Refresh the db2u security plug-in and restart Db2:
    db2stop force && ipclean -a
    /bin/bash -c "source /db2u/scripts/include/db2_functions.sh && refresh_db2_sec_plugin"
    db2start
    You can now connect to your Db2 database as cpadmin.
Note: Apply this change again if the pod restarts.
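A quick way to check whether the workaround is still in effect after a pod restart is to attempt a connection as cpadmin from inside the pod as db2inst1; if the connection fails again with SQL30082N, repeat the workaround:

db2 connect to BLUDB user cpadmin using <password>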

auto_rotate_cert.sh causes the Db2 connection to hang

Applies to: 4.8.4

Fixed in: 4.8.8

Problem
This issue occurs only on s390x clusters. The auto_rotate_cert.sh script can hang, which in turn can cause Db2 ODBC connections to hang, so operations that rely on those connections stall or fail.
Workaround

You need to disable auto_rotate_cert.sh by removing its Administrative Task Scheduler (ATS) task. You must then run the script manually to rotate the certificates when they are updated, and on fresh deployments.

  1. To check whether auto_rotate_cert.sh is hanging, use this command:
    ps -aux | grep auto_rotate_cert.sh
  2. If auto_rotate_cert.sh is hanging, you can halt the process by using this command:
    kill -9 <process_id>
  3. To disable the ATS task, exec into your db2u catalog node as db2inst1 and enter:
    db2 -v "CALL SYSPROC.ADMIN_TASK_REMOVE('ROTATE_DB2_SSL_CERTS', NULL)"
    
  4. To rotate the certificates, exec into your db2u catalog node as db2inst1 and enter:

    auto_rotate_cert.sh
Note: The ATS task will be enabled again when the fix is released.
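To confirm that the ATS task was removed, you can query the SYSTOOLS.ADMIN_TASK_LIST administrative view from the catalog node as db2inst1. This check is a suggestion and is not part of the documented workaround:

db2 -x "SELECT NAME FROM SYSTOOLS.ADMIN_TASK_LIST WHERE NAME = 'ROTATE_DB2_SSL_CERTS'"
# No rows returned means that the scheduled rotation task was removed.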

Databases crash due to an npm EACCES error

Applies to: 4.8.0 and 4.8.1

Fixed in: 4.8.2

Problem
Some users have experienced their Db2 and Db2 Warehouse deployments returning CrashLoopBackOff status.
Symptoms
Your database deployment returned a CrashLoopBackOff status during one of the following procedures:
  • Backing up from an offline backup.
  • Upgrading the database instance or service.
  • Installing the service.
  • Restarting the database deployment.
Running the command oc logs <zen-databases pod name> returns output similar to the following example:
    npm ERR! code EACCES
    npm ERR! syscall mkdir
    npm ERR! path /.npm
    npm ERR! errno -13
    npm ERR!
    npm ERR! Your cache folder contains root-owned files, due to a bug in
    npm ERR! previous versions of npm which has since been addressed.
    npm ERR!
    npm ERR! To permanently fix this problem, please run:
    npm ERR!   sudo chown -R 1000740000:0 "/.npm"
Workaround
  1. Log in to your instance by running the following command:
    ./cpd-cli manage login-to-ocp --server=https://${CPDSERVER}:6443 -u kubeadmin -p ${PASSWORD}
    • Replace CPDSERVER with the hostname of the Red Hat OpenShift cluster API server for your instance.
    • Replace PASSWORD with the password for the kubeadmin user.
  2. Install and enable the resource specification injection (RSI).
    1. Run the following command to install RSI:
      ./cpd-cli manage install-rsi --cpd_instance_ns=${PROJECT_CPD_INST_OPERANDS}
    2. Run the following command to enable RSI:
      ./cpd-cli manage enable-rsi --cpd_instance_ns=${PROJECT_CPD_INST_OPERANDS}
  3. Create a JSON file and save it in the cpd-cli work directory so that it is available as /tmp/work/zen-databases.json in the next step.
    1. You can use the vi editor to create the JSON file:
      vi zen-databases.json
    2. Copy the following content to use in your JSON file:
      [{"op":"add","path":"/spec/containers/0/env/-","value":{"name":"npm_config_cache","value":"/tmp"}}]
  4. Apply the RSI patch:
    ./cpd-cli manage create-rsi-patch --cpd_instance_ns=${PROJECT_CPD_INST_OPERANDS} --patch_name=zen-database-set-env --patch_type=rsi_pod_env_var --patch_spec=/tmp/work/zen-databases.json --spec_format=json --include_labels=component:zen-databases --state=active
  5. Verify that the RSI patch was successfully completed by running the following command:
    ./cpd-cli manage get-rsi-patch-info --cpd_instance_ns=${PROJECT_CPD_INST_OPERANDS} --patch_name=rsi-zen-database-set-env
    See the following example output:
    [SUCCESS] 2023-10-10T15:34:04.313629Z You may find output and logs in the /usr/local/bin/cpd-cli-tools/cpd-cli-linux-EE-13.0.2-30/cpd-cli-workspace/olm-utils-workspace/work directory. 
    [SUCCESS] 2023-10-10T15:34:04.313692Z The get-rsi-patch-info command ran successfully.
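Optionally, after the zen-databases pod restarts, you can confirm that the environment variable was injected. This check is a suggestion that uses the component:zen-databases label from the patch command:

oc -n ${PROJECT_CPD_INST_OPERANDS} get pods -l component=zen-databases -o yaml | grep -A 1 npm_config_cache
# Expected output includes name: npm_config_cache followed by value: /tmp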

Problem with Q Replication

Important: All Q Replication users on Cloud Pak for Data version 4.8.0 and later must complete this workaround to avoid potential data loss.

Applies to: 4.8.0 and 4.8.1

Fixed in: 4.8.2

Problem
Two Db2 registry variables are missing on all Db2 deployments on Cloud Pak for Data version 4.8.0 and later releases.
Workaround
  1. Exec into the db2u pod for your deployment by running the following command:
    oc -n ${PROJECT_CPD_INST_OPERANDS} exec -it ${db2_podname} -- bash
  2. Switch to the db2inst1 profile:
    su - db2inst1
  3. Add the following registry variables to your deployment by running the following commands:
    db2set DB2CHECKCLIENTINTERVAL=0 IMMED
    db2set DB2_CDE_ENABLE_FIX_FOR_REPLICATION_ISSUE_37229=true
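To confirm that both registry variables are set, you can check the db2set output. This verification is a suggestion and is not part of the documented workaround:

db2set -all | grep -E "DB2CHECKCLIENTINTERVAL|DB2_CDE_ENABLE_FIX_FOR_REPLICATION_ISSUE_37229"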

Db2 instance stuck in pending state

Applies to: 4.8.0 and 4.8.1

Fixed in: 4.8.2

Problem
Users on some AMD processors might experience processing delays with their Db2 instance due to the GSKit tool hanging while configuring an SSL connection. For more information on the GSKit, see GSKit Hot Topics.
Symptoms
  • Progress halts while seeding the random number generator.
  • Db2uCluster remains in NotReady state.
  • The restore-morph job has not appeared for some time.
Workaround
  1. Exec into the pod.
    1. Find your db2ucluster resource name:
      oc get db2ucluster --all-namespaces
      Assign it to the environment variable DB2_CR:
      DB2_CR=<db2ucluster resource name>
    2. Find the name of your Db2 pod for your deployment:
      db2_podname=$(oc -n ${PROJECT_CPD_INST_OPERANDS} get po --selector name=dashmpp-head-0 | grep ${DB2_CR} | awk '{print $1}')
    3. Run the exec command to exec into the pod:
      oc -n ${PROJECT_CPD_INST_OPERANDS} exec -it ${db2_podname} -- bash
  2. Confirm that your deployment is affected by this issue:
    cd $SUPPORTDIR
    cat db2u_morph*
    Compare your output with the following result. If it is similar, your deployment is affected by this issue.
    echo '[2023-11-15 14:19:16,249] - db2_restoremorph_functions.sh:162(restore_db) - INFO: RESTORE DATABASE BLUDB FROM '''/mnt/blumeta0/db2/backup''' ON '''/mnt/blumeta0/db2/databases''' DBPATH ON '''/mnt/blumeta0/db2/databases''' INTO BLUDB NEWLOGPATH '''/mnt/logs/active/BLUDB''' REDIRECT ENCRYPT WITHOUT ROLLING FORWARD'
  3. Run the following commands to fix the issue:
    touch /db2u/tmp/.pause_probe
    sv stop db2u
    su - db2inst1
    su - db2inst1 -c "db2_kill"
    db2_install_path="/opt/ibm/db2/V*"
    ls -lad ${db2_install_path} 2>/dev/null || db2_install_path="/opt/ibm/db2"
    iccsig_file=$(ls -1 ${db2_install_path}/lib64/gskit/C/icc/icclib/ICCSIG.txt)
    sudo chmod o+w "${iccsig_file}"
    sudo echo "ICC_SHIFT=3" >> "${iccsig_file}"
    sudo -E /db2u/db2u_root_entrypoint.sh
    After the /db2u/db2u_root_entrypoint.sh script completes, the db2ucluster returns to the Ready state.
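To confirm the state change, you can check the custom resource by using the DB2_CR variable that was set earlier in this procedure. This check is a suggestion:

oc -n ${PROJECT_CPD_INST_OPERANDS} get db2ucluster ${DB2_CR}
# Expect the STATE column to report Ready.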

Connect statement on Db2 Warehouse MPP hangs after manage_snapshots --action suspend

After running the manage_snapshots --action suspend command to suspend Db2 write operations, the db2_all db2 connect to dbname or manage_snapshots --action resume commands might hang.

The manage_snapshots --action suspend and manage_snapshots --action resume commands can be executed explicitly while performing a snapshot backup with Db2 Warehouse container commands or as part of Backing up and restoring an entire deployment.

The db2_all db2 connect to dbname command is executed in the manage_snapshots --action resume script.

Symptoms

The manage_snapshots --action resume command hangs at connect to BLUDB:

oc exec -it c-db2wh-crd-mpp-2x2-separate-db2u-0 -- manage_snapshots --action resume
Defaulted container "db2u" out of: db2u, init-labels (init), init-kernel (init)

connect to BLUDB
Workaround
  1. Locate the catalog node pod:
    oc get po -l name=dashmpp-head-0
  2. Run the exec command as the db2inst1 user to access an interactive shell inside the container:
    oc exec -it c-db2wh-1639609960917075-db2u-0 -- su - db2inst1
  3. Issue a db2 connect command to the database.
    db2 connect to BLUDB
    If the command hangs, repeat steps 1-3 in another terminal.
  4. When the connect command is successful, issue the manage_snapshots --action resume command:
    manage_snapshots --action resume

Unable to connect to Db2 through the Db2Rest add-on

Applies to: 4.8.0, 4.8.1, 4.8.2, 4.8.3

Fixed in: 4.8.4

The GSKit does not create the SSL keystore and keystash files correctly, which causes GSKit commands to fail. This happens because the Db2Rest entrypoint script references a file that does not exist in the rest pod (db2_encryption_functions.sh).

When you try to connect to your Db2 database instance using the Db2Rest add-on pod, you might get an error message from the CLI driver and be unable to connect to your Db2 database instance.

Workaround

To resolve the problem, see Unable to connect to Db2 through Db2Rest.

Enabling archiveToDb impacts performance

Applies to: 4.7.0 and later

When you enable audit logging and set archiveToDb to true, the Db2 audit facility keeps the loaded images after the LOAD task finishes. Keeping these images requires a large amount of disk space in the /mnt/bludata0/db2/copy or /mnt/bludata0/scratch/db2/copy path. The AUDIT.* tables also display repetitive log entries.

Workaround
To resolve the problem, disable the archiveToDb parameter if you enable the db2u audit facility.
  1. Update the db2ucluster CR.
    1. Find your db2ucluster resource name.
      oc get db2ucluster -n ${PROJECT_CPD_INST_OPERANDS}
    2. Update your db2ucluster resource:
      oc edit db2ucluster <db2u_cluster_name> -n ${PROJECT_CPD_INST_OPERANDS} -o yaml
    3. Confirm that archiveToDb is set to false, for example:
      spec:
        addOns:
          audit:
            archiveToDb: false
  2. Update the audit setup and stored procedure inside the container.
    1. Exec into the db2u pod for your deployment by running the following command:
      oc -n ${PROJECT_CPD_INST_OPERANDS} exec -it ${db2_podname} -- bash
    2. Switch to the db2inst1 profile:
      su - db2inst1
    3. Update the audit setup with --archive-to-db set to false, for example:
      python3 /db2u/script/installaudit.py --archive-to-db false
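To confirm the change in the custom resource, you can read the field directly. This check is a suggestion, where <db2u_cluster_name> is the resource name from step 1:

oc -n ${PROJECT_CPD_INST_OPERANDS} get db2ucluster <db2u_cluster_name> -o jsonpath='{.spec.addOns.audit.archiveToDb}'
# Expected output: false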

Issues when creating a Db2 connection with Cloud Pak for Data credentials

When you create a Db2 connection in the web console, an error can occur if you select the Cloud Pak for Data authentication checkbox. To work around this issue, enter your Cloud Pak for Data credentials in the User name and Password fields, and do not select the Cloud Pak for Data authentication checkbox.

Db2 post restore hook fails during restore operation 1

Symptoms
The backup log indicates the following message:
...
time=2022-06-06T11:00:28.035568Z level=info msg=   status: partially_succeeded
time=2022-06-06T11:00:28.035572Z level=info msg=   nOpResults: 70
time=2022-06-06T11:00:28.035585Z level=info msg=   postRestoreViaConfigHookRule on restoreconfig/analyticsengine-br in namespace wkc (status=succeeded)
time=2022-06-06T11:00:28.035589Z level=info msg=   postRestoreViaConfigHookRule on restoreconfig/lite in namespace wkc (status=succeeded)
time=2022-06-06T11:00:28.035593Z level=info msg=   postRestoreViaConfigHookRule on restoreconfig/db2u in namespace wkc (status=error)
...
time=2022-06-06T11:00:28.035601Z level=info msg=   postRestoreViaConfigHookJob on restoreconfig/wkc in namespace wkc (status=timedout)
...
Either the c-db2oltp-iis-db2u or the c-db2oltp-wkc-db2u pod does not progress beyond the following output:
....
+ db2licm_cmd=/mnt/blumeta0/home/db2inst1/sqllib/adm/db2licm
+ /mnt/blumeta0/home/db2inst1/sqllib/adm/db2licm -a /db2u/license/db2u-lic
Resolution

Delete the affected db2u pods and then check that the pods are up and running.
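For example, assuming the affected pod is c-db2oltp-wkc-db2u-0 (substitute the pod name from your deployment):

oc delete pod c-db2oltp-wkc-db2u-0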

oc get pod | grep -E "c-db2oltp-iis-db2u|c-db2oltp-wkc-db2u"

Run the post restore hook again.

cpd-cli oadp restore posthooks --include-namespaces wkc --log-level=debug --verbose

Db2 post restore hook fails during restore operation 2

Symptoms
The restore log indicates the following message:
...
* ERROR: Database could not be activated
Failed to restart write resume and/or active database
...
Resolution

Delete the affected db2u pods and then check that the pods are up and running.

oc get pod | grep -E "c-db2oltp-iis-db2u|c-db2oltp-wkc-db2u"

Run the post restore hook again.

cpd-cli oadp restore posthooks --include-namespaces wkc --log-level=debug --verbose