Known issues and limitations for watsonx Assistant

The following known issues and limitations apply to watsonx Assistant.

The deploy-knative-eventing command fails with the error: multiNamespace InstallModeType not supported

Applies to: 5.1.x

Problem
This issue arises from the interaction between the namespace scoping approach that the deploy-knative-eventing installation uses and the default behavior of the IBM Namespace Scope Operator when you run the setup-instance-topology command.
Solution
To remove the error messages, modify the operator setting to allow the MultiNamespace InstallModeType:
  1. Edit the Namespace Scope Operator ClusterServiceVersion (CSV):
    oc edit csv ibm-namespace-scope-operator.v${CPD_VERSION} -n ibm-knative-events
  2. Set supported to true for the MultiNamespace install mode:
      
      installModes:
      - supported: true
        type: MultiNamespace
  3. Delete the old namespace-scope-operator pod.
    
    nss_op_pod=$(oc get pods -n ibm-knative-events -l name=ibm-namespace-scope-operator --no-headers | awk '{print $1}')
    oc delete pod $nss_op_pod
  4. Rerun the following command:
    cpd-cli manage setup-instance-topology --release=${VERSION} --cpd_operator_ns=ibm-knative-events --cpd_instance_ns=knative-eventing --license_acceptance=true
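To verify the change, you can read the install modes back from the CSV (a minimal check; the jsonpath expression assumes the installModes structure that is shown in step 2):
  oc get csv ibm-namespace-scope-operator.v${CPD_VERSION} -n ibm-knative-events -o jsonpath='{.spec.installModes[?(@.type=="MultiNamespace")].supported}'
The command prints true when the MultiNamespace install mode is enabled.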

watsonx Assistant webhook is not working

Applies to: 5.1.x

Problem
Webhooks cannot be invoked programmatically in watsonx Assistant because self-signed certificates are blocked and callouts to private IP addresses are restricted.
Solution
Apply the following patch to allow self-signed certificates and private IP addresses in watsonx Assistant:
oc patch wa $INSTANCE_NAME --type='merge' -p='{"configOverrides":{"dialog":{"extra_vars": {"DIALOG_FEATURE_SELF_SIGNED_CERTIFICATES_IN_WEBHOOKS":true,"BLOCKED_CALLOUT_IPS":"[]"}}}}'
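To confirm that the override was applied, you can read it back from the custom resource (a minimal check; it assumes that configOverrides is a top-level field of the wa resource, as in the patch above):
  oc get wa $INSTANCE_NAME -o jsonpath='{.configOverrides.dialog.extra_vars}'
The output should list DIALOG_FEATURE_SELF_SIGNED_CERTIFICATES_IN_WEBHOOKS and BLOCKED_CALLOUT_IPS.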

The deploy-knative-eventing command fails with the error: no matching resources found

Applies to: 5.1.3

Problem
When you run the cpd-cli manage deploy-knative-eventing command, it fails with the error no matching resources found after the message deployment.apps/kafka-controller condition met. This issue arises because no pods with the label app=kafka-broker-dispatcher are present.
Solution
  1. Exec into the Docker container that runs olm-utils.
    docker exec -it olm-utils-play-v3 bash
  2. Check for the line that is to be removed.
    cat /opt/ansible/bin/deploy-knative-eventing | grep kafka-broker-dispatcher
    Output:
    oc wait pods -n knative-eventing --selector app=kafka-broker-dispatcher --for condition=Ready --timeout=60s
  3. Remove the line.
    sed -i '/kafka-broker-dispatcher/d' /opt/ansible/bin/deploy-knative-eventing
  4. Verify whether the line is removed.
    cat /opt/ansible/bin/deploy-knative-eventing | grep kafka-broker-dispatcher
    The command returns no output, which confirms that the line is removed.
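Optionally, you can confirm the root cause that is described in the Problem section: listing pods by the app=kafka-broker-dispatcher label should return no resources, which is why the removed oc wait command could never succeed.
  oc get pods -n knative-eventing -l app=kafka-broker-dispatcher
You can then rerun the cpd-cli manage deploy-knative-eventing command with the same options that you used before.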
    

Reset test run in Evaluate response settings fails with 500 error

Applies to: 5.1.2 and 5.1.3

Problem
When you reset the test run from the Evaluate response settings page, it fails with a 500 Internal Server Error, and the UI hangs.
Solution
If the page hangs after the error, go to a different page and then return to the same page, or refresh the page.

Preview page not available when watsonx Assistant is created by using the API

Applies to: 5.1.0, 5.1.1, and 5.1.2

Problem
The Preview page is not accessible when watsonx Assistant is created using the API.

wa-system-entities pods fail with Error and ContainerStatusUnknown statuses

Applies to: 5.1.1

Problem
The wa-system-entities pods fail with Error and ContainerStatusUnknown statuses. In both cases, the pods exit with error code 137 (OOMKilled) and the message: "Pod ephemeral local storage usage exceeds the total limit of containers 200Mi".
Solution
  1. Set environment variables.
    export PROJECT_CPD_INSTANCE=<namespace where watsonx Assistant is running>
    export INSTANCE=<watsonx Assistant Instance Name>  # Normally "wa"
  2. Verify the values.
    echo $PROJECT_CPD_INSTANCE
    echo $INSTANCE
  3. Apply the patch using variables.
    cat <<EOF | oc apply -f -
    apiVersion: assistant.watson.ibm.com/v1
    kind: TemporaryPatch
    metadata:
      name: wa-system-entities-ephemeral-storage
      namespace: $PROJECT_CPD_INSTANCE
    spec:
      apiVersion: assistant.watson.ibm.com/v1
      kind: WatsonAssistantClu
      name: $INSTANCE
      patchType: patchStrategicMerge
      patch:
        system-entities:
          deployment:
            spec:
              template:
                spec:
                  containers:
                  - name: system-entities
                    resources:
                      limits:
                        ephemeral-storage: 400Mi
                      requests:
                        ephemeral-storage: 300Mi
    EOF
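To verify that the patch took effect, you can check the ephemeral-storage limit on the deployment (a minimal check; it assumes that the deployment is named wa-system-entities, matching the pod names above):
  oc -n $PROJECT_CPD_INSTANCE get deploy wa-system-entities -o jsonpath='{.spec.template.spec.containers[?(@.name=="system-entities")].resources.limits.ephemeral-storage}'
The command should print 400Mi after the operator reconciles the patch.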
    

Cronjob gets stuck during the watsonx Assistant upgrade from any prior release to Version 5.1.0

Applies to: 5.1.0

Problem
The cronjob for the EDB Postgres database continues to use the old configuration after watsonx Assistant is upgraded to Version 5.1.0, due to changes in how the resources are created. As a result, the upgrade fails.
Solution
  1. Delete the existing cronjob and the jobs that it created so that a new cronjob is created with the correct configuration.
    oc delete job -l component=store-cronjob,service=conversation
    oc delete cronjob -l component=store-cronjob,service=conversation
  2. Delete the cleanup job to restart the cleanup.
    oc delete job -l component=cleanup,service=conversation
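The operator re-creates the cronjob with the new configuration. To verify, you can list the resources by the same labels that are used above (this assumes that the re-created resources carry the same labels):
  oc get cronjob -l component=store-cronjob,service=conversation
  oc get job -l component=cleanup,service=conversation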

watsonx Assistant instance cannot be opened and returns either Error 404 or 502

Applies to: 5.1.0 and later

Problem
You cannot open an instance of watsonx Assistant; the request fails with either Error 404 Not Found or Error 502 Bad Gateway.
Solution
  1. Get the list of Watson gateway custom resources (CRs).
    oc get watsongateway -o yaml
    Example output:
    nginxDirectives:
    - proxy_buffer_size       32k;
    - proxy_busy_buffers_size 32k;
    - proxy_buffers           8 32k;
  2. Get the maximum value of each of the three directives from the nginx pods.
    • Get the nginx pod name from the operand namespace.
      export PROJECT_CPD_INST_OPERANDS=cpd  # Change to the namespace where watsonx Assistant is installed
      
      oc get po -n ${PROJECT_CPD_INST_OPERANDS} | grep nginx
      
      Example output:
      
      ibm-nginx-788bc899f6-6hv95                                        2/2     Running     0                 17d
      ibm-nginx-788bc899f6-cwsg9                                        2/2     Running     0                 17d
      
      export nginx_pod=ibm-nginx-788bc899f6-6hv95 
    • Get the buffer values from any one of the nginx pods.
      
      oc -n ${PROJECT_CPD_INST_OPERANDS} exec -it $nginx_pod -- nginx -T | grep proxy_buffers
      Example output:
      Defaulted container "ibm-nginx-container" out of: ibm-nginx-container, zen-objstore-mirror-container, zen-objstore-init-container (init)
       proxy_buffers  4 256k;
       proxy_buffers  8 512k;
       proxy_buffers  4 256k;
       proxy_buffers  4 256k;
       proxy_buffers  4 256k;
       proxy_buffers 8 64k;
       proxy_buffers 8 64k;
       proxy_buffers 8 64k;
      
      oc -n ${PROJECT_CPD_INST_OPERANDS} exec -it $nginx_pod -- nginx -T | grep proxy_buffer_size
      Example output:
      
      Defaulted container "ibm-nginx-container" out of: ibm-nginx-container, zen-objstore-mirror-container, zen-objstore-init-container (init)
      proxy_buffer_size 8k;
       proxy_buffer_size  512k;
       proxy_buffer_size  256k;
       proxy_buffer_size  256k;
       proxy_buffer_size  256k;
       proxy_buffer_size  256k;
      proxy_buffer_size 8k;
       proxy_buffer_size 64k;
       proxy_buffer_size 64k;
       proxy_buffer_size 64k;
      proxy_buffer_size 8k;
      oc -n ${PROJECT_CPD_INST_OPERANDS} exec -it $nginx_pod -- nginx -T | grep proxy_busy_buffers_size
      Example output:
      Defaulted container "ibm-nginx-container" out of: ibm-nginx-container, zen-objstore-mirror-container, zen-objstore-init-container (init)
       proxy_busy_buffers_size 256k;
       proxy_busy_buffers_size 512k;
       proxy_busy_buffers_size 64k;
      
  3. Take the maximum value of each of the three directives and substitute the values in the following ZenExtension definition.
    Note: proxy_busy_buffers_size must be equal to or greater than the larger of proxy_buffer_size and the size of a single buffer in proxy_buffers. For example, with proxy_buffers 8 512k and proxy_buffer_size 512k, proxy_busy_buffers_size must be at least 512k.
    cat <<EOF | oc apply -f -
    apiVersion: zen.cpd.ibm.com/v1
    kind: ZenExtension
    metadata:
      name: proxy-buffers-extension
    spec:
      proxy-buffer.conf: |
        proxy_buffers 8 512k;
        proxy_buffer_size 512k;
        proxy_busy_buffers_size 512k;
      extensions: | 
        [
          {
            "extension_point_id": "zen_front_door",
            "extension_name": "proxy-buffers-extension",
            "details": {
              "upstream_conf": "proxy-buffer.conf"
            }
          }
        ]
    EOF
  4. Check whether all the ZenExtension resources are in the Completed state. This process takes approximately 15 minutes to complete.
    oc get zenextension -n $PROJECT_CPD_INST_OPERANDS
  5. Check whether the zenservice is in the Completed state.
    oc get zenservice -n $PROJECT_CPD_INST_OPERANDS
  6. Check that the nginx pods are in the Running state.
    oc get po -n $PROJECT_CPD_INST_OPERANDS | grep nginx
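After the nginx pods are running, you can confirm that the new values are active by repeating the checks from step 2 (re-export nginx_pod with a current pod name first, because the pods might have restarted):
  oc -n ${PROJECT_CPD_INST_OPERANDS} exec -it $nginx_pod -- nginx -T | grep proxy_buffer_size
The output should now include the values that you set in the ZenExtension, for example proxy_buffer_size 512k;.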

Opening bookmarked URL of watsonx Assistant tooling page returns error

Applies to: 5.1.0

Problem
When you open the watsonx Assistant tooling page from a bookmarked URL, it returns an error with code 500.
Solution
Log in to IBM Software Hub and then access the watsonx Assistant tooling page.

Preview page not available for Watson Discovery integration

Applies to: 5.1.0

Problem
The Preview page does not appear when you integrate Watson Discovery in watsonx Assistant.

Postgres pod goes to CrashLoopBackOff status after the upgrade

Applies to: 5.1.0 and later

Problem
When you upgrade watsonx Assistant, one of the Postgres pods goes into the CrashLoopBackOff state. This issue occurs because the data on that pod's persistent volume is corrupted.
Solution
  1. Run the following command to find the watsonx Assistant Postgres pod in the CrashLoopBackOff state.
    oc get pods --no-headers | grep -Ev "Comp|0/0|1/1|2/2|3/3|4/4|5/5|6/6|7/7|8/8" | grep wa-postgres
    The output looks like this:
    wa-postgres-3 0/1 CrashLoopBackOff 115 (2m30s ago) 9h
  2. Run the following command to identify if the Postgres pod is the primary pod:
    oc get cluster | grep wa-postgres
    The output looks like this:
    wa-postgres         2d20h   3           3       Cluster in healthy state   wa-postgres-1
    Where wa-postgres-1 is the primary pod.
    Tip: If the primary instance is in CrashLoopBackOff status, do the steps in the Postgres cluster in bad state topic.
  3. Delete the non-primary pod and its PersistentVolumeClaim (PVC) to create a new pod that syncs with the primary pod.
    Warning: Do not delete a primary pod because doing so can lead to database downtime and potential data loss.
    Important: Ensure that the EDB operator is running before you delete the pod and its PVC.
    oc delete pod/wa-postgres-3 pvc/wa-postgres-3
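After you delete the pod and its PVC, the EDB operator creates a replacement pod that resynchronizes from the primary. You can watch the recovery with the same commands that are used in steps 1 and 2:
  oc get pods | grep wa-postgres
  oc get cluster | grep wa-postgres
The cluster reports "Cluster in healthy state" when the new replica has rejoined.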