Online upgrade of IBM Cloud Pak for AIOps (CLI method)

Use these instructions to upgrade an online starter or production deployment of IBM Cloud Pak® for AIOps 4.6.0 or later to 4.7.0.

This procedure can be used on an online deployment of IBM Cloud Pak for AIOps 4.6.0 or later, and can still be used if the deployment has had hotfixes applied. If you have an offline deployment, follow the instructions in Upgrading IBM Cloud Pak for AIOps (offline).
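
To confirm the version that you are upgrading from before you start, you can list the installed operator ClusterServiceVersion (CSV). A minimal sketch, assuming that your IBM Cloud Pak for AIOps subscription is deployed in the cp4aiops project (adjust the namespace to match your deployment):

oc get csv -n cp4aiops | grep ibm-aiops-orchestrator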

Before you begin

Notes:

  • Ensure that you are on a version of Red Hat OpenShift that your current and target versions of IBM Cloud Pak for AIOps both support. If you already have a qualifying version of Red Hat OpenShift but you want to upgrade it, then complete the IBM Cloud Pak for AIOps upgrade first. For more information, see Guidance for upgrades that require a Red Hat OpenShift upgrade.
  • Ensure that you are logged in to your Red Hat OpenShift cluster with oc login for any steps that use the Red Hat OpenShift command-line interface (CLI).
  • Red Hat OpenShift requires a user with cluster-admin privileges for the operations in this procedure.
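
An optional sketch for verifying that you are logged in with a user that has cluster-admin privileges (not part of the documented procedure):

# Show the current user, then check whether it can perform any action on any resource
oc whoami
oc auth can-i '*' '*' --all-namespaces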

Warnings:

  • Custom patches, labels, and manual adjustments to IBM Cloud Pak for AIOps resources are lost when IBM Cloud Pak for AIOps is upgraded, and must be manually reapplied after upgrade. For more information, see Manual adjustments are not persisted.
  • If you previously increased the size of a PVC directly, then you must follow the correct procedure that is supplied in Resizing storage to ensure that the size is updated by the operator. Failure to do so before upgrading IBM Cloud Pak for AIOps causes the operator to attempt to restore a lower default value for the PVC, and causes an error in your IBM Cloud Pak for AIOps deployment.
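
A sketch for reviewing your current PVC sizes before you upgrade, assuming that the PROJECT_CP4AIOPS environment variable is set to your IBM Cloud Pak for AIOps namespace (it is set in 1. Ensure cluster readiness):

oc get pvc -n ${PROJECT_CP4AIOPS} -o custom-columns=NAME:.metadata.name,REQUESTED:.spec.resources.requests.storage,ACTUAL:.status.capacity.storage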

Restrictions:

  • You cannot use these instructions to upgrade deployments of IBM Cloud Pak for AIOps 4.5.1 or earlier. For more information, see Upgrade paths.
  • The upgrade cannot be removed or rolled back.

Upgrade procedure

Follow these steps to upgrade your online IBM Cloud Pak for AIOps deployment.

  1. Ensure cluster readiness
  2. Configure the new catalogs
  3. Maintain custom IR Flink configuration
  4. Maintain custom Flink task manager configuration
  5. Update the operator subscription
  6. Verify the deployment
  7. Post upgrade actions

1. Ensure cluster readiness

Recommended: Take a backup before upgrading. For more information, see Backup and restore.

  1. Ensure that your cluster still meets all of the prerequisites for deployment. For more information, see Planning.

    Note: IBM Cloud Pak for AIOps 4.7.0 requires 6 more CPUs than IBM Cloud Pak for AIOps 4.6.0 and 4.6.1.

  2. If you still have waiops_var.sh from when you installed IBM Cloud Pak for AIOps, then run the following command from the directory that contains the script to set the environment variables that are used later.

    . ./waiops_var.sh
    

    If you do not have waiops_var.sh, then run the following commands to set the environment variables that you need for upgrade.

    export PROJECT_CP4AIOPS=<project>
    export INSTALL_MODE_NAMESPACE=<install_namespace>
    

    Where

    • <project> is the namespace (project) that your IBM Cloud Pak for AIOps subscription is deployed in.
    • <install_namespace> is ${PROJECT_CP4AIOPS} if your deployment is namespace scoped, or openshift-operators if your deployment has a cluster-wide scope.
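
    If you are not sure which scope applies, one optional check (a sketch, not part of the documented procedure) is to look for the ibm-aiops-orchestrator subscription in openshift-operators. If the subscription exists there, your deployment is cluster scoped; otherwise it is namespace scoped.

    oc get subscription.operators.coreos.com ibm-aiops-orchestrator -n openshift-operators
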
  3. Run the IBM Cloud Pak for AIOps prerequisite checker script.

    Run the prerequisite checker script to ensure that your Red Hat OpenShift Container Platform cluster is correctly set up for an IBM Cloud Pak for AIOps upgrade.

    Download the prerequisite checker script from github.com/IBM, and run it with the following command:

    ./prereq.sh -n ${PROJECT_CP4AIOPS} --ignore-allocated
    

    Important: The prerequisite checker script might show inadequate resources in the Resource Summary because the script does not account for resources already being in use by the upgrading deployment. This can be ignored, as can the following message: [ FAIL ] Small or Large Profile Install Resources.

    Example output:

    # ./prereq.sh -n cp4aiops --ignore-allocated
    [INFO] Starting IBM Cloud Pak for AIOps prerequisite checker v4.7...
    
    CLI: oc
    
    [INFO] =================================Platform Version Check=================================
    [INFO] Checking Platform Type....
    [INFO] You are using Openshift Container Platform
    [INFO] OCP version 4.16.7 is compatible but only nodes with AMD64 architectures are supported at this time. 
    [INFO] =================================Platform Version Check=================================
    
    [INFO] =================================Storage Provider=================================
    [INFO] Checking storage providers
    [INFO] No IBM Storage Fusion Found... Skipping configuration check.
    
    [INFO] No Portworx StorageClusters found with "Running" or "Online" status. Skipping configuration check for Portworx.
    [INFO] Openshift Data Foundation found.
    [INFO] No IBM Cloud Storage found... Skipping configuration check for IBM Cloud Storage Check.
    
    Checking Openshift Data Foundation Configuration...
    Verifying if Red Hat Openshift Data Foundation pods are in "Running" or "Completed" status
    [INFO] Pods in openshift-storage project are "Running" or "Completed"
    [INFO] ocs-storagecluster-ceph-rbd exists.
    [INFO] ocs-storagecluster-cephfs exists.
    [INFO] No warnings or failures found when checking for Storage Providers.
    [INFO] =================================Storage Provider=================================
    
    [INFO] =================================Cert Manager Check=================================
    [INFO] Checking for Cert Manager operator
    
    [INFO] Successfully functioning cert-manager found.
    
    CLUSTERSERVICEVERSION             NAMESPACE
    ibm-cert-manager-operator.v4.2.8  ibm-cert-manager
    
    [INFO] =================================Cert Manager Check=================================
    
    [INFO] =================================Licensing Service Operator Check=================================
    [INFO] Checking for Licensing Service operator
    
    [INFO] Successfully functioning licensing service operator found.
    
    CLUSTERSERVICEVERSION          NAMESPACE
    ibm-licensing-operator.v4.2.8  ibm-licensing
    
    [INFO] =================================Licensing Service Operator Check=================================
    
    [INFO] =================================Starter or Production Install Resources=================================
    [INFO] Checking for cluster resources
    
    [INFO] ==================================Resource Summary=====================================================
    [INFO]                                                     Nodes     |     vCPU       |  Memory(GB)
    [INFO] Starter (Non-HA) Base (available/required)       [  9 / 3 ]   [  144 / 47 ]    [  289 / 123 ]
    [INFO]     (+ Log Anomaly Detection & Ticket Analysis)  [  9 / 3 ]   [  144 / 55 ]    [  289 / 136 ]
    
    [INFO] Production (HA) Base (available/required)        [  9 / 6 ]   [  144 / 136 ]   [  289 / 310 ]
    [INFO]     (+ Log Anomaly Detection & Ticket Analysis)  [  9 / 6 ]   [  144 / 162 ]   [  289 / 368 ]
    [INFO] ==================================Resource Summary=====================================================
    [INFO] Cluster currently has resources available to create a Starter (Non-HA) install of Cloud Pak for AIOps
    
    [INFO] =================================Prerequisite Checker Tool Summary=================================
          [  PASS  ] Platform Version Check 
          [  PASS  ] Storage Provider
          [  PASS  ] Starter (Non-HA) Base Install Resources
          [  FAIL  ] Production (HA) Base Install Resources
          [  PASS  ] Cert Manager Operator Installed
          [  PASS  ] Licensing Service Operator Installed
    [INFO] =================================Prerequisite Checker Tool Summary=================================
    

    Note: If you are not using IBM Cloud Pak® foundational services Cert Manager, then ignore any errors that are returned by the Cert Manager check.

  4. Delete any evicted connector-orchestrator pods.

    1. Run the following command to check if there are any evicted connector-orchestrator pods.

      oc get pods -n ${PROJECT_CP4AIOPS} | grep connector-orchestrator
      
    2. Clean up any evicted connector-orchestrator pods.

      If the previous command returned any pods with a STATUS of Evicted, then run the following command to delete each of them.

      oc delete pod -n ${PROJECT_CP4AIOPS} <connector_orchestrator>
      

      Where <connector_orchestrator> is a pod returned in the previous step.
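
      Optionally, a one-line alternative (a sketch; the -r flag assumes GNU xargs) that finds and deletes all evicted connector-orchestrator pods in a single pass:

      oc get pods -n ${PROJECT_CP4AIOPS} --no-headers | grep connector-orchestrator | grep Evicted | awk '{print $1}' | xargs -r oc delete pod -n ${PROJECT_CP4AIOPS}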

2. Configure the new catalogs

From IBM Cloud Pak for AIOps 4.7.0, the ibm-operator-catalog is no longer used. IBM Cloud Pak® foundational services Cert Manager and License Service are deployed by using their own independent catalogs, and a new catalog called ibm-aiops-catalog is used for the remaining IBM Cloud Pak for AIOps operators. For more information, see What's New.

  1. Run the following command to create the new IBM Cloud Pak for AIOps, IBM Cloud Pak® foundational services Cert Manager and License Service catalog sources in the openshift-marketplace namespace.

    cat << EOF | oc apply -f -
    apiVersion: operators.coreos.com/v1alpha1
    kind: CatalogSource
    metadata:
      name: ibm-aiops-catalog
      namespace: openshift-marketplace
    spec:
      displayName: ibm-aiops-catalog
      publisher: IBM Content
      sourceType: grpc
      image: icr.io/cpopen/ibm-aiops-catalog@sha256:9c3abcfcff17f2dfb28efa697de35a7b6bdbade58f1412a81d90283c6b93fe38
    ---
    apiVersion: operators.coreos.com/v1alpha1
    kind: CatalogSource
    metadata:
      name: ibm-cert-manager-catalog
      namespace: openshift-marketplace
    spec:
      displayName: ibm-cert-manager
      publisher: IBM
      sourceType: grpc
      image: icr.io/cpopen/ibm-cert-manager-operator-catalog
    ---
    apiVersion: operators.coreos.com/v1alpha1
    kind: CatalogSource
    metadata:
      name: ibm-licensing-catalog
      namespace: openshift-marketplace
    spec:
      displayName: IBM License Service Catalog
      publisher: IBM
      sourceType: grpc
      image: icr.io/cpopen/ibm-licensing-catalog
    EOF
    
  2. Verify that the ibm-aiops-catalog, ibm-cert-manager-catalog and ibm-licensing-catalog CatalogSource objects are in the output that is returned by the following command:

    oc get CatalogSources -n openshift-marketplace
    

    Example output:

    oc get CatalogSources -n openshift-marketplace
    NAME                     DISPLAY                      TYPE   PUBLISHER   AGE
    ibm-aiops-catalog        ibm-aiops-catalog            grpc   IBM         2m 
    ibm-cert-manager-catalog ibm-cert-manager             grpc   IBM         2m 
    ibm-licensing-catalog    IBM License Service Catalog  grpc   IBM         2m  
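
    Optionally, you can also confirm that each new catalog source is serving content by checking its connection state; a sketch for the ibm-aiops-catalog (the expected output is READY):

    oc get catalogsource ibm-aiops-catalog -n openshift-marketplace -o jsonpath='{.status.connectionState.lastObservedState}'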
    

  3. Migrate your subscriptions to use the new catalogs.

    Run the following script from the command line:

    #!/bin/bash
    
    CLI="oc"
    AIOPS_SUB_PACKAGES=(
        "aimanager-operator"
        "ibm-common-service-operator"
        "aiopsedge-operator"
        "asm-operator"
        "cloud-native-postgresql"
        "ibm-aiops-orchestrator"
        "ibm-automation-elastic"
        "ibm-automation-flink"
        "ibm-events-operator"
        "ibm-commonui-operator-app"
        "ibm-iam-operator"
        "ibm-zen-operator"
        "ibm-redis-cp"
        "ibm-secure-tunnel-operator"
        "ibm-watson-aiops-ui-operator"
        "ibm-aiops-ir-ai"
        "ibm-aiops-ir-core"
        "ibm-aiops-ir-lifecycle"
        "ibm-odlm"
    )
    
    AIOPS_SUBS=($(${CLI} get subscriptions.operators.coreos.com -n "${PROJECT_CP4AIOPS}" --no-headers=true -o jsonpath="{.items[*].metadata.name}"))
    
    # Update AIOps
    for sub in "${AIOPS_SUBS[@]}"; do
        sub_package="$(${CLI} get subscriptions.operators.coreos.com -n "${PROJECT_CP4AIOPS}" "${sub}" -o jsonpath="{.spec.name}")"
        if [[ " ${AIOPS_SUB_PACKAGES[*]} " =~ [[:space:]]${sub_package}[[:space:]] ]]; then
            ${CLI} patch subscriptions.operators.coreos.com -n "${PROJECT_CP4AIOPS}" "${sub}" --type='json' -p='[{"op": "replace", "path": "/spec/source", "value": "'ibm-aiops-catalog'"}]'
        fi
    done
    
    # Update Cert Manager
    ${CLI} patch subscriptions.operators.coreos.com -n "ibm-cert-manager" "ibm-cert-manager-operator" --type='json' -p='[{"op": "replace", "path": "/spec/source", "value": "'ibm-cert-manager-catalog'"}]'
    
    # Update License Service
    ${CLI} patch subscriptions.operators.coreos.com -n "ibm-licensing" "ibm-licensing-operator-app" --type='json' -p='[{"op": "replace", "path": "/spec/source", "value": "'ibm-licensing-catalog'"}]'
    

    Note: During the upgrade of Cloud Pak for AIOps, Kubernetes jobs might fail and rerun. If a job succeeds on the second or third attempt, there can be one or two pods in the Error state and one pod in the Completed state. If a job fails repeatedly, the attempt is abandoned; use the logs from the failed pods to determine the cause of the failure. When you have determined the cause, you can delete the job, and the operator recreates it to reattempt the operation.
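
    If you need to investigate a job that fails repeatedly, a minimal sketch (the job name is a placeholder) for listing jobs, reviewing the logs of a failed job, and deleting it so that the operator re-creates it:

    # List jobs, inspect the logs of the failing job, then delete it so that the operator re-creates it
    oc get jobs -n ${PROJECT_CP4AIOPS}
    oc logs -n ${PROJECT_CP4AIOPS} job/<job_name>
    oc delete job -n ${PROJECT_CP4AIOPS} <job_name>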

3. Maintain custom IR Flink configuration

If an IBM Sales representative or Business Partner used the custom sizing tool to supply you with a custom profile ConfigMap that customizes the IR FlinkCluster, then use these steps to update it. Otherwise, skip this section and proceed to 4. Maintain custom Flink task manager configuration.

Before v4.7.0, custom profiles modified the IR FlinkCluster resources through the overrides and statefulsets fields. From v4.7.0, a new FlinkDeployment resource is used, and you must add new configuration to your custom profile in addition to the old configuration before you upgrade.

  1. Run the following command to copy the contents of your custom profile ConfigMap into a temporary file called profiles.yaml.

    oc get configmap $(oc get installation.orchestrator.aiops.ibm.com -n ${INSTALL_MODE_NAMESPACE} -o jsonpath='{.items[0].status.customProfileConfigmap}') -o jsonpath='{.data.profiles}' > profiles.yaml
    
  2. Run the command vi profiles.yaml to open the file in an editor. If your YAML file contains the name: ir-lifecycle-operator property, then proceed with the following steps. If you do not see this property in the YAML file, then skip the rest of these steps and proceed to 5. Update the operator subscription.

  3. The custom sizing for the IR FlinkCluster is defined under the name: ir-lifecycle-operator property. Other custom sizing and other properties might also be listed. The name property within the statefulSets section controls which StatefulSet is affected, and the name property in the containers section controls which container within that StatefulSet is affected. The values that are shown in the following example YAML are the default values. The jobManager and taskManager properties provide limited customization options; they duplicate values that are in statefulSets and might not be listed at the same time.

    - name: ir-lifecycle-operator
      spec:
        lifecycleservice:
          customSizing:
            jobManager:
              resourceLimitsCPU: "1"
              resourceLimitsMemory: 768Mi
              resourceRequestsCPU: 100m
              resourceRequestsMemory: 350Mi
            statefulSets:
            - containers:
              - limits:
                  cpu: "1"
                  ephemeral-storage: 500Mi
                  memory: 768Mi
                name: jobmanager
                requests:
                  cpu: 100m
                  ephemeral-storage: 50Mi
                  memory: 350Mi
              - limits:
                  cpu: "1"
                  ephemeral-storage: 500Mi
                  memory: 512Mi
                name: tls-proxy
                requests:
                  cpu: 100m
                  ephemeral-storage: 50Mi
                  memory: 256Mi
              name: eventprocessor-ep-jobmanager
              replicas: 2
            - containers:
              - limits:
                  cpu: "2"
                  ephemeral-storage: 3Gi
                  memory: 4Gi
                name: taskmanager
                requests:
                  cpu: "1"
                  ephemeral-storage: 3Gi
                  memory: 4Gi
              - limits:
                  cpu: "1"
                  ephemeral-storage: 3Gi
                  memory: 512Mi
                name: ibm-lifecycle-policy-sync
                requests:
                  cpu: 100m
                  ephemeral-storage: 3Gi
                  memory: 256Mi
              name: eventprocessor-ep-taskmanager
              replicas: 6
            taskManager:
              resourceLimitsCPU: "2"
              resourceLimitsMemory: 4Gi
              resourceRequestsCPU: "1"
              resourceRequestsMemory: 4Gi
          overrides:
            eventprocessor:
              flink:
                properties:
                  taskmanager.memory.task.heap.size: 1638M
    
  4. The custom sizing structure in the preceding step needs to be converted to the following structure:

    • Copy the eventprocessor-ep-jobmanager and eventprocessor-ep-taskmanager entries from statefulSets to deployments. Change their names to flink and flink-taskmanager respectively. Change the container names jobmanager and taskmanager to flink-main-container.
    • If the jobManager and taskManager properties exist, then use the values to set the cpu and memory properties in the next step.
    • If the jobManager and taskManager properties exist, then copy the overrides.eventprocessor.flink.properties properties into overrides.flinkdeployment.properties.

    The default values are shown in the following example:

    • Deployment: flink
    • container: flink-main-container
    • requests.memory: increased from 350Mi to 768Mi. If you have a custom value for this setting, then you can increase it accordingly.
    - name: ir-lifecycle-operator
      spec:
        lifecycleservice:
          customSizing:
            jobManager:
              ...
            deployments:
            - containers:
              - limits:
                  cpu: "1"                  # Could be from jobManager.resourceLimitsCPU
                  ephemeral-storage: 500Mi
                  memory: 768Mi             # Could be from jobManager.resourceLimitsMemory
                name: flink-main-container  # This was jobmanager
                requests:
                  cpu: 100m                 # Could be from jobManager.resourceRequestsCPU
                  ephemeral-storage: 50Mi
                  memory: 768Mi             # Could be from jobManager.resourceRequestsMemory
              - limits:
                  cpu: "1"
                  ephemeral-storage: 500Mi
                  memory: 512Mi
                name: tls-proxy
                requests:
                  cpu: 100m
                  ephemeral-storage: 50Mi
                  memory: 256Mi
              name: flink                    # This was eventprocessor-ep-jobmanager
              replicas: 2
            - containers:
              - limits:
                  cpu: "2"                   # Could be from taskManager.resourceLimitsCPU
                  ephemeral-storage: 3Gi
                  memory: 4Gi                # Could be from taskManager.resourceLimitsMemory
                name: flink-main-container   # This was taskmanager
                requests:
                  cpu: "1"                   # Could be from taskManager.resourceRequestsCPU
                  ephemeral-storage: 3Gi
                  memory: 4Gi                # Could be from taskManager.resourceRequestsMemory
              - limits:
                  cpu: "1"
                  ephemeral-storage: 3Gi
                  memory: 512Mi
                name: ibm-lifecycle-policy-sync
                requests:
                  cpu: 100m
                  ephemeral-storage: 3Gi
                  memory: 256Mi
              name: flink-taskmanager        # This was eventprocessor-ep-taskmanager
              replicas: 6
            statefulSets:
              ...
            taskManager:
              ...
          overrides:
            eventprocessor:
              ...
            flinkdeployment:
              properties:
                taskmanager.memory.task.heap.size: 1638M
    
  5. Update your custom profile ConfigMap with the contents of profiles.yaml.

    oc set data configmap $(oc get installation.orchestrator.aiops.ibm.com -n ${INSTALL_MODE_NAMESPACE} -o jsonpath='{.items[0].status.customProfileConfigmap}')  --from-file=profiles=profiles.yaml
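
    Optionally, you can confirm that the ConfigMap was updated by printing its contents again and checking for the new deployments and overrides.flinkdeployment entries; a verification sketch:

    oc get configmap $(oc get installation.orchestrator.aiops.ibm.com -n ${INSTALL_MODE_NAMESPACE} -o jsonpath='{.items[0].status.customProfileConfigmap}') -o jsonpath='{.data.profiles}'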
    

4. Maintain custom Flink task manager configuration

Use the following steps to determine if your deployment has a custom configuration for the Flink task manager replica count. This would have been configured after installation using the steps in Increasing data streaming capacity.

  1. Run the following command to determine if your deployment has a custom configuration for the Flink task manager replica count.

    oc get subscriptions.operators.coreos.com ibm-aiops-orchestrator -n ${INSTALL_MODE_NAMESPACE} -o jsonpath='{.spec.config.env[?(@.name=="FLINK_TASK_MGR_REPLICAS")].value}' 
    

    If the command returns nothing, then skip the rest of this step and proceed to section 5, Update the operator subscription.

  2. Run the following steps to maintain your custom Flink task manager replica count during upgrade.

    If an IBM Sales representative or Business Partner did not use the custom sizing tool to supply you with a custom profile ConfigMap, then run the following commands to apply one:

    export FLINK_TASK_MGR_REPLICAS=$(oc get subscriptions.operators.coreos.com ibm-aiops-orchestrator -n ${INSTALL_MODE_NAMESPACE} -o jsonpath='{.spec.config.env[?(@.name=="FLINK_TASK_MGR_REPLICAS")].value}')
    
    cat << EOF | oc apply -f -
    apiVersion: v1
    kind: ConfigMap
    metadata:
      labels:
        app.kubernetes.io/instance: aiops
        app.kubernetes.io/managed-by: ibm-aiops-orchestrator
        app.kubernetes.io/name: custom-sized-profiles
        app.kubernetes.io/part-of: ibm-aiops
        aiops-custom-size-profile-version: 4.7.0
      name: aiops-custom-size-profile
      namespace: ${INSTALL_MODE_NAMESPACE}
    data:
      # WARNING: Modifications to this ConfigMap may cause your AIOps installation to become unstable.
      profiles: |
        generatedfor: HA
        cp4waiops-eventprocessor:
          flink:
            taskmanager:
              replicas: ${FLINK_TASK_MGR_REPLICAS}
    EOF
    

    If an IBM Sales representative or Business Partner used the custom sizing tool to supply you with a custom profile ConfigMap, then run the following commands to update it:

    1. Determine the required number of Flink task manager replicas.

      oc get subscriptions.operators.coreos.com ibm-aiops-orchestrator -n ${INSTALL_MODE_NAMESPACE} -o jsonpath='{.spec.config.env[?(@.name=="FLINK_TASK_MGR_REPLICAS")].value}'
      
    2. Run the following command to copy the contents of your custom profile ConfigMap into a temporary file called profiles2.yaml.

      oc get configmap $(oc get installation -n ${INSTALL_MODE_NAMESPACE} -o jsonpath='{.items[0].status.customProfileConfigmap}') -o jsonpath='{.data.profiles}' > profiles2.yaml
      
    3. Edit profiles2.yaml and adjust the value of flink.taskmanager.replicas to match the value that you obtained in substep 1.

      Example excerpt:

      cp4waiops-eventprocessor:
        flink:
          taskmanager:
            replicas: <Flink task manager replica count>
      

      Where <Flink task manager replica count> is the value that you obtained in substep 1.

    4. Update your custom profile ConfigMap with the contents of profiles2.yaml.

      oc set data configmap $(oc get installation -n ${INSTALL_MODE_NAMESPACE} -o jsonpath='{.items[0].status.customProfileConfigmap}')  --from-file=profiles=profiles2.yaml
      

5. Update the operator subscription

  1. Update the spec.channel value of the IBM Cloud Pak for AIOps subscription to the release that you want to upgrade to, v4.7.

    oc patch subscription.operators.coreos.com ibm-aiops-orchestrator -n ${INSTALL_MODE_NAMESPACE} --type=json -p='[{"op": "replace", "path": "/spec/channel", "value": "v4.7"}]'
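
    Optionally, you can confirm that the channel was updated by reading it back; a quick check (the expected output is v4.7):

    oc get subscription.operators.coreos.com ibm-aiops-orchestrator -n ${INSTALL_MODE_NAMESPACE} -o jsonpath='{.spec.channel}'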
    
  2. If your deployment is installed in AllNamespaces mode, then run the following command to refresh the connectors' secret:

    oc delete secret cp4waiops-connectors-deploy-cert-secret -n "${PROJECT_CP4AIOPS}"
    

    For more information about installation modes, see Operator installation mode.

6. Verify the deployment

6.1 Check the deployment

Run the following command to check that the PHASE of your deployment is Updating.

oc get installations.orchestrator.aiops.ibm.com -n ${PROJECT_CP4AIOPS}

Example output:

NAME           PHASE     LICENSE    STORAGECLASS   STORAGECLASSLARGEBLOCK   AGE
ibm-cp-aiops   Updating  Accepted   rook-cephfs    rook-ceph-block          3m

It takes around 60-90 minutes for the upgrade to complete (subject to the speed with which images can be pulled). When the upgrade is complete and successful, the PHASE of your installation changes to Running. If the phase does not change to Running, then use the following command to find out which components are not ready:

oc get installation.orchestrator.aiops.ibm.com -o yaml -n ${PROJECT_CP4AIOPS} | grep 'Not Ready'

Example output:

lifecycleservice: Not Ready
zenservice: Not Ready

To see details about why a component is Not Ready, run the following command, where <component> is the component that is not ready, for example zenservice.

oc get <component> -o yaml -n ${PROJECT_CP4AIOPS}
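
For example, if zenservice is reported as Not Ready, you can list the instances and then inspect the resource directly (the instance name in your cluster might differ):

oc get zenservice -n ${PROJECT_CP4AIOPS}
oc get zenservice -o yaml -n ${PROJECT_CP4AIOPS}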

(Optional) You can also download and run a status checker script to see information about the status of your deployment. For more information about how to download and run the script, see github.com/IBM.

If the installation fails, or is not complete and is not progressing, then see Troubleshooting installation and upgrade and Known Issues to help you identify any installation problems.

Important: Wait for the deployment to enter a Running phase before continuing to the next step.

6.2 Check the version

Run the following command and check that the VERSION that is returned is 4.7.0.

oc get csv -l operators.coreos.com/ibm-aiops-orchestrator.${INSTALL_MODE_NAMESPACE} -n ${INSTALL_MODE_NAMESPACE}

Example output:

oc get csv -l operators.coreos.com/ibm-aiops-orchestrator.cp4aiops -n cp4aiops

NAME                           DISPLAY                  VERSION  REPLACES                       PHASE
ibm-aiops-orchestrator.v4.7.0  IBM Cloud Pak for AIOps  4.7.0    ibm-aiops-orchestrator.v4.6.1  Succeeded

7. Post upgrade actions

  1. If you previously set up backup or restore on your deployment, then you must follow the instructions in Upgrading IBM Cloud Pak for AIOps backup and restore artifacts.

  2. If the EXPIRY_SECONDS environment variable was set for configuring log anomaly alerts, the environment variable was not retained in the upgrade. After the upgrade is completed, set the environment variable again. For more information about setting the variable, see Configuring expiry time for log anomaly alerts.
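
    As an illustration only, one way to set an environment variable on a workload with the OpenShift CLI is oc set env; the deployment name here is a placeholder, and the correct resource and value for EXPIRY_SECONDS are described in Configuring expiry time for log anomaly alerts:

    oc set env deployment/<log_anomaly_deployment> -n ${PROJECT_CP4AIOPS} EXPIRY_SECONDS=<seconds>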

  3. If you have a metric integration configured that stops working after upgrade, then you must follow the instructions in After upgrade, a metric integration goes into a failed state.

  4. If upgrade does not complete because the lifecycletrigger component is stuck, then follow the instructions in Upgrade does not complete because the lifecycletrigger component is stuck.

  5. (Optional) You can use the following steps to remove unnecessary data from your Cloud Pak for AIOps environment:

    Note: Use the following steps if high availability (HA) is enabled for your Cloud Pak for AIOps deployment.

    1. Switch to the project (namespace) where Cloud Pak for AIOps is deployed.

      oc project <namespace>
      
    2. Verify the health of your Cloud Pak for AIOps deployment:

      oc get installation.orchestrator.aiops.ibm.com -o go-template='{{ $i:=index .items 0 }}{{ range $c,$s := $i.status.componentstatus }}{{ $c }}{{ ": "}}{{ $s }}{{ "\n" }}{{ end }}'
      

      All the components need to be in Ready status.

    3. Delete the zookeeper data by running the following four commands:

      oc exec iaf-system-zookeeper-0 -- /opt/kafka/bin/zookeeper-shell.sh 127.0.0.1:12181 deleteall /flink/aiops/ir-lifecycle
      
      oc exec iaf-system-zookeeper-0 -- /opt/kafka/bin/zookeeper-shell.sh 127.0.0.1:12181 deleteall /flink/aiops/ir-lifecycle2
      
      oc exec iaf-system-zookeeper-0 -- /opt/kafka/bin/zookeeper-shell.sh 127.0.0.1:12181 deleteall /flink/aiops/ir-lifecycle3
      
      oc exec iaf-system-zookeeper-0 -- /opt/kafka/bin/zookeeper-shell.sh 127.0.0.1:12181 deleteall /flink/aiops/cp4waiops-eventprocessor
      
    4. Delete the Issue Resolution (IR) lifecycle metadata by running the following commands:

      img=$(oc get csv -o jsonpath='{.items[?(@.spec.displayName=="IBM AIOps AI Manager")].spec.install.spec.deployments[?(@.name=="aimanager-operator-controller-manager")].spec.template.metadata.annotations.olm\.relatedImage\.opencontent-minio-client}')
      
      minio=$(oc get flinkdeployment aiops-ir-lifecycle-flink -o jsonpath='{.spec.flinkConfiguration.s3\.endpoint}')
      
      oc delete job --ignore-not-found aiops-clean-s3
      cat <<EOF | oc apply --validate -f -
      apiVersion: batch/v1
      kind: Job
      metadata:
        name: aiops-clean-s3
      spec:
        backoffLimit: 6
        parallelism: 1
        template:
          metadata:
            labels:
              component: aiops-clean-s3
            name: clean-s3
          spec:
            affinity:
              nodeAffinity:
                requiredDuringSchedulingIgnoredDuringExecution:
                  nodeSelectorTerms:
                  - matchExpressions:
                    - key: kubernetes.io/arch
                      operator: In
                      values:
                      - amd64
            containers:
            - command:
              - /bin/bash
              - -c
              - |-
                echo "Connecting to Minio server: $minio"
                try=0
                while true; do
                  mc alias set aiopss3 $minio \$(cat /config/accesskey) \$(cat /config/secretkey)
                  if [ \$? -eq 0 ]; then break; fi
                  try=\$(expr \$try + 1)
                  if [ \$try -ge 30 ]; then exit 1; fi
                  sleep 2
                done
                /workdir/bin/mc rm -r --force aiopss3/aiops-ir-lifecycle/high-availability/ir-lifecycle
                x=\$?
                /workdir/bin/mc ls aiopss3/aiops-ir-lifecycle/high-availability
                exit \$x
              image: $img
              imagePullPolicy: IfNotPresent
              name: clean-s3
              resources:
                limits:
                  cpu: 500m
                  memory: 512Mi
                requests:
                  cpu: 200m
                  memory: 256Mi
              securityContext:
                allowPrivilegeEscalation: false
                capabilities:
                  drop:
                  - ALL
                privileged: false
                readOnlyRootFilesystem: false
                runAsNonRoot: true
              volumeMounts:
              - name: s3-credentials
                mountPath: /config
              - name: s3-ca
                mountPath: /workdir/home/.mc/certs/CAs
            volumes:
            - name: s3-credentials
              secret:
                secretName: aimanager-ibm-minio-access-secret
            - name: s3-ca
              secret:
                items:
                - key: ca.crt
                  path: ca.crt
                secretName: aimanager-certificate-secret
            restartPolicy: Never
            serviceAccount: aimanager-workload-admin
            serviceAccountName: aimanager-workload-admin
      EOF
      
    5. Check the status of the job:

      oc get po -l component=aiops-clean-s3
      

      Verify that the status shows as Completed.
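
      If the pod does not reach the Completed status, you can review its logs; after a successful run, you can optionally delete the finished job. A sketch:

      # Review the logs of the clean-s3 pod, then remove the finished job
      oc logs -l component=aiops-clean-s3
      oc delete job aiops-clean-s3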