Upgrading subsystems on native Kubernetes

Upgrade your API Connect subsystems to the latest version on native Kubernetes.

Before you begin

  • To ensure operator compatibility, upgrade the API Connect management subsystem first and the DataPower gateway subsystem second. This requirement applies to all upgrade scenarios.

    The Gateway subsystem remains available during the upgrade of the Management, Portal, and Analytics subsystems.

  • Verify that your installed Kubernetes version is supported by the API Connect version that you are upgrading to; for example, you can check the cluster version with the kubectl command shown below. See IBM API Connect Version 10 software product compatibility requirements. If your Kubernetes version is older than the minimum supported version for your target API Connect version, upgrade Kubernetes first.
    Note: If you are upgrading from an older API Connect version, it might not support the minimum Kubernetes version that is required by the latest API Connect release. If so, one or more intermediate upgrades of API Connect might be required so that API Connect is always running on a supported Kubernetes version.
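
    For example, the standard kubectl version check reports the server (cluster) version, which you can compare against the compatibility requirements:

      kubectl version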

Procedure

  1. Complete the prerequisites:
    1. Ensure that you are upgrading from a supported version and that you have reviewed the upgrade requirements and limitations. See Upgrade considerations on native Kubernetes.
    2. Verify that the pgcluster is healthy:
      1. Get the name of the pgcluster:
        kubectl get pgcluster -n <APIC_namespace>

        The response displays the name of the postgres cluster running in the specified namespace.

      2. Check the status of the pgcluster:
        kubectl get pgcluster <pgcluster_name> -n <APIC_namespace> -o yaml | grep status -A 2 | tail -n3

        The response for a healthy pgcluster looks like the following example, where the state is Initialized:

        status:
            message: Cluster has been initialized
            state: pgcluster Initialized
      Important: If the pgcluster returns any other state, it is not healthy and an upgrade will fail.
      • If there are any ongoing backup or restore jobs, wait until they complete and then check the status again (one way to list such jobs is shown below). Do not proceed with the upgrade until the status is Initialized.
      • If all of the background jobs complete but the pgcluster remains unhealthy, contact IBM Support for assistance.
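      One way to list in-progress backup or restore jobs is to query the Kubernetes jobs in the namespace. The filter terms below are an assumption based on typical Postgres operator job naming, so adjust them to match your environment:

        kubectl get jobs -n <APIC_namespace> | grep -Ei 'backup|restore|backrest'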
    3. Back up the current deployment. Wait until the backup completes before starting the upgrade.
      • Do not start an upgrade if a backup is scheduled to run within a few hours.
      • Do not perform maintenance tasks such as rotating key-certificates, restoring from a backup, or starting a new backup, at any time while the upgrade process is running.
    4. If you used any microservice image overrides in the management CR during a fresh install, remove the image overrides prior to upgrade.
      Important: Any image overrides that remain in the management CR are removed automatically by the operator during the upgrade. You can apply them again after the upgrade is complete.
  2. Run the pre-upgrade health check:
    Attention: This is a required step. Failure to run this check before upgrading could result in problems during the upgrade.

    If you have a 2DCDR deployment, run this check on the active site only. You can run the check on the warm-standby after it is made stand-alone. See Steps for Management upgrades.

    1. Verify that the apicops utility is installed by running the following command to check the current version of the utility:
      apicops --version

      If the response indicates that apicops is not available, install it now. See The API Connect operations tool: apicops in the API Connect documentation.
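
      If you need to install apicops, a minimal sketch for Linux follows. The release asset name apicops-v10-linux is an assumption; check the apicops GitHub releases page for the current file name:

        # Asset name is assumed; verify it on the apicops GitHub releases page
        curl -LO https://github.com/ibm-apiconnect/apicops/releases/latest/download/apicops-v10-linux
        chmod +x apicops-v10-linux
        sudo mv apicops-v10-linux /usr/local/bin/apicops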

    2. Run the following command to set the KUBECONFIG environment variable:
      export KUBECONFIG=</path/to/kubeconfig>
    3. Run the following command to execute the pre-upgrade script:
      apicops version:pre-upgrade -n <namespace>

      If the system is healthy, the results will not include any errors.

      Note: This command asks for the Cloud Manager admin password in order to make a call to the platform API. If you do not have this password, or the platform API URL is not accessible from your apicops location, then this part of the pre-upgrade check can be skipped by adding the argument --no-topology. The topology check output that is produced by this API call is typically only required for debugging purposes, when the apicops version:pre-upgrade returns output that suggests the system is unhealthy.
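
      For example, to run the health check without the topology call:

        apicops version:pre-upgrade -n <namespace> --no-topology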
  3. Obtain the API Connect files from IBM Fix Central.

    From the IBM Fix Central site, download the Docker image-tool file of the API Connect subsystems. Next, you will upload the image-tool file to your Docker local registry. If necessary, you can populate a remote container registry with repositories. Then you can push the images from the local registry to the remote registry.

    You will also download the Kubernetes operators, API Connect Custom Resource (CR) templates, and Certificate Manager, for use during deployment configuration.

    The following files are used for deployment on native Kubernetes:

    IBM® API Connect <version> for Containers
      Docker images for all API Connect subsystems.
    IBM® API Connect <version> Operator Release Files for Containers
      Kubernetes operators and API Connect Custom Resource (CR) templates.
    IBM® API Connect <version> Toolkit for <operating_system_type>
      Toolkit command-line utility. Packaged standalone, or with API Designer or Loopback:
      • IBM® API Connect <version> Toolkit for <operating_system_type>
      • IBM® API Connect <version> Toolkit with Loopback for <operating_system_type>
      • IBM® API Connect <version> Toolkit Designer with Loopback for <operating_system_type>
      Not required during initial installation. After installation, you can download it directly from the Cloud Manager UI and API Manager UI. See Installing the toolkit.
    IBM® API Connect <version> Local Test Environment
      Optional test environment. See Testing an API with the Local Test Environment.
    IBM® API Connect <version> Security Signature Bundle File
      Checksum files that you can use to verify the integrity of your downloads.
  4. Next, upload the image files that you obtained from Fix Central in Step 3.
    1. Load the image-tool image for the new version into your Docker local registry:
      docker load < apiconnect-image-tool-<version>.tar.gz 

      Ensure that the registry has sufficient disk space for the files.

    2. If your Docker registry requires repositories to be created before images can be pushed, create the repositories for each of the images listed by the image tool. If your Docker registry does not require creation of repositories, skip this step and go to Step 4.c.
      1. Run the following command to get a list of the images from image-tool:
        docker run --rm apiconnect-image-tool-<version> version --images
      2. From the output of each entry of the form <image-name>:<image-tag>, use your Docker registry repository creation command to create a repository for <image-name>.
        For example in the case of AWS ECR the command would be for each <image-name>:
        aws ecr create-repository --repository-name <image-name>
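        For example, a minimal loop (AWS ECR assumed) that creates one repository per <image-name> reported by the image tool:

          for image in $(docker run --rm apiconnect-image-tool-<version> version --images | cut -d: -f1); do
            aws ecr create-repository --repository-name "$image"
          done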
    3. Upload the image:
      • If you do not need to authenticate with the docker registry, use:
        docker run --rm apiconnect-image-tool-<version> upload <registry-url>
      • Otherwise, if your docker registry accepts authentication with username and password arguments, use:
        docker run --rm apiconnect-image-tool-<version> upload <registry-url> --username <username> --password <password>
      • Otherwise, such as with IBM Container Registry, if you need the image-tool to use your local Docker credentials, first authenticate with your Docker registry, then upload images with the command:
        docker run --rm -v ~/.docker:/root/.docker --user 0 apiconnect-image-tool-<version> upload <registry-url>


  5. If needed, delete old Postgres client certificates.
    If you are upgrading from 10.0.1.x or 10.0.4.0-ifix1, or if you previously installed any of those versions before upgrading to 10.0.5.x, there might be old Postgres client certificates. To verify, run the following command:
    kubectl -n <namespace> get certs | grep db-client

    For example, if you see that both -db-client-apicuser and apicuser exist, apicuser is the older certificate and is no longer in use. Remove the old certificates by running one of the following commands, depending on which of the old certificates are left on your system:

    kubectl -n <namespace> delete certs apicuser pgbouncer primaryuser postgres replicator

    or:

    kubectl -n <namespace> delete certs apicuser pgbouncer postgres replicator
  6. If you are upgrading from v10.0.4-ifix3, or from v10.0.1.7-eus (or higher), and you want to retain your analytics data, you must export it before running the upgrade.
  7. If you are upgrading from v10.0.4-ifix3, or from v10.0.1.7-eus (or higher): Disassociate and delete your Analytics services.
    1. In Cloud Manager, click Topology.
    2. In the section for the Availability Zone that contains the Analytics service, locate the Gateway service that the Analytics service is associated with.
    3. Click the actions menu, and select Unassociate analytics service.
      Remember to disassociate each Analytics service from all Gateways.
    4. In the section for the Availability Zone that contains the Analytics services, locate each Analytics service and click Delete.
  8. Download and decompress IBM API Connect <version> Operator Release Files for Containers.

    Make a directory called helper_files and extract the contents of helper_files.zip from the release_files.zip into helper_files.
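
    For example, assuming both archives use the names shown:

      mkdir helper_files
      unzip release_files.zip helper_files.zip
      unzip helper_files.zip -d helper_files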

  9. Apply the new CRDs from the version you just extracted:
    kubectl apply -f ibm-apiconnect-crds.yaml
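
    You can confirm that the new CRDs are registered with a generic check, for example:

      kubectl get crds | grep apiconnect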
  10. Apply the new DataPower Operator YAML into the namespace where the DataPower Operator is running.
    1. Verify that the correct namespace is referenced in the ibm-datapower.yaml file:

      If your product was not deployed in the namespace called default, open the ibm-datapower.yaml file in a text editor and replace all instances of default with the appropriate name of the namespace where you deployed the product.

    2. Specify the location of the datapower-operator image:

      Open the ibm-datapower.yaml file in a text editor and locate the image: key in the containers section of the deployment file (immediately after imagePullSecrets:). Replace the value of the image: key with the location of the datapower-operator image, either uploaded to your own registry or pulled from a public registry.

    3. Run the following command:
      kubectl apply -f ibm-datapower.yaml -n <namespace>

      The Gateway CR goes to Pending state when the operator is updated, and then changes to Running after installation of the API Connect operator in the next step.

      Note: There is a known issue on Kubernetes version 1.19.4 or higher that can cause the DataPower operator to fail to start. In this case, the DataPower Operator pods can fail to schedule, and will display the status message: no nodes match pod topology spread constraints (missing required label). For example:
      0/15 nodes are available: 12 node(s) didn't match pod topology spread constraints (missing required label), 
      3 node(s) had taint {node-role.kubernetes.io/master: }, that the pod didn't tolerate.

      You can work around the issue by editing the DataPower operator deployment and re-applying it, as follows:

      1. Delete the DataPower operator deployment, if deployed already:
        kubectl delete -f ibm-datapower.yaml -n <namespace>
      2. Open ibm-datapower.yaml, and locate the topologySpreadConstraints: section. For example:
        topologySpreadConstraints:
        - maxSkew: 1
          topologyKey: zone
          whenUnsatisfiable: DoNotSchedule
      3. Replace the values for topologyKey: and whenUnsatisfiable: with the corrected values shown in the example below:
        topologySpreadConstraints:
        - maxSkew: 1
          topologyKey: topology.kubernetes.io/zone
          whenUnsatisfiable: ScheduleAnyway
      4. Save ibm-datapower.yaml and deploy the file to the cluster:
        kubectl apply -f ibm-datapower.yaml -n <namespace>
  11. Upgrade cert-manager to version 1.12.9.
    1. Run the following command to back up the existing certificates and issuers to a file called backup.yaml:
      kubectl get --all-namespaces -oyaml issuer,clusterissuer,cert,secret > backup.yaml
    2. Run the following command to determine the version of the existing cert-manager:
      kubectl get crds certificates.cert-manager.io -o jsonpath='{.metadata.labels.app\.kubernetes\.io\/version}'
    3. If the current version is older than 1.12.9, run the following command to upgrade cert-manager to version 1.12.9:
      kubectl apply -f helper_files/cert-manager-1.12.9.yaml
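
    After the apply completes, you can re-run the version check from step 11.b to confirm that the reported version is now 1.12.9, and list the cert-manager pods (the cert-manager namespace is the default for this deployment YAML):

      kubectl get pods -n cert-manager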
  12. If you used customized internal certificates and are upgrading from 10.0.5.1: The helper_files/custom-certs-internal.yaml includes a new certificate called dbClientPrimaryuser. This new certificate must be created before you update the Management CR.
    Follow the instructions in Generate custom internal certificates to update the helper_files/custom-certs-internal.yaml file to match your namespace and site name, and then apply the new yaml file with the following command:
    kubectl apply -f custom-certs-internal.yaml -n <namespace>
    Alternatively, add the new certificate to your existing custom-certs-internal.yaml and apply it, updating the namespace to match your deployment:
    ---
    apiVersion: cert-manager.io/v1
    kind: Certificate
    metadata:
      name: db-client-primaryuser
      labels: {
        app.kubernetes.io/instance: "management",
        app.kubernetes.io/managed-by: "ibm-apiconnect",
        app.kubernetes.io/name: "db-client-primaryuser"
      }
    spec:
      commonName: primaryuser
      secretName: db-client-primaryuser
      dnsNames:
        - "*.<namespace>"
        - "*.<namespace>.svc"
        - "primaryuser.<namespace>.svc"
        - "primaryuser"
      issuerRef:
        name: ingress-issuer
      usages:
      - "client auth"
      - "signing"
      - "key encipherment"
      duration: 17520h # 2 years
      renewBefore: 720h # 30 days
      privateKey:
        rotationPolicy: Always
    ---
    
  13. Clean up the webhook configuration before deploying the newer API Connect operator:
    1. Run the following command to get the webhook configuration:
      kubectl get mutatingwebhookconfiguration,validatingwebhookconfiguration | grep ibm-apiconnect
    2. Use the kubectl delete command to delete the webhook configuration; for example:
      kubectl delete mutatingwebhookconfiguration ibm-apiconnect-mutating-webhook-configuration
      kubectl delete validatingwebhookconfiguration ibm-apiconnect-validating-webhook-configuration
      
  14. Apply the new API Connect operator YAML into the namespace where the API Connect operator is running.
    • For single namespace deployment:
      1. If the operator is not running in the default namespace, open the ibm-apiconnect.yaml file in a text editor, and then replace all references to default with the name of the namespace where you deployed API Connect.
        Note: Skip this step if you are using Operator Lifecycle Manager (OLM).
      2. Open ibm-apiconnect.yaml in a text editor. Replace the value of each image: key with the location of the apiconnect operator images (from the ibm-apiconnect container and the ibm-apiconnect-init container), either uploaded to your own registry or pulled from a public registry.
      3. Run the following command:
        kubectl apply -f ibm-apiconnect.yaml -n <namespace>
    • For multi-namespace deployment:
      1. Locate and open the newly downloaded ibm-apiconnect-distributed.yaml in a text editor. Then, find and replace each occurrence of $OPERATOR_NAMESPACE with the desired namespace for the deployment.
      2. Also in ibm-apiconnect-distributed.yaml, locate the image: keys in the containers sections of the deployment YAML, immediately below imagePullSecrets:. Replace the REPLACE-DOCKER-REGISTRY placeholder values of the image: keys with the Docker registry host location of the API Connect operator image (either uploaded to your own registry or pulled from a public registry).
      3. Install ibm-apiconnect-distributed.yaml with the following command:
        kubectl apply -f ibm-apiconnect-distributed.yaml
  15. Verify that the ibm-datapower-operator and the ibm-apiconnect operators are restarted.
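    For example, assuming the default deployment names ibm-apiconnect and datapower-operator (adjust them to match your environment):

      kubectl rollout status deployment/ibm-apiconnect -n <namespace>
      kubectl rollout status deployment/datapower-operator -n <namespace>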
  16. Ensure that the apiconnect operator recreated the necessary microservices:

    Be sure to complete this step before attempting to upgrade the operands.

    kubectl get apic -n <namespace>
  17. Upgrade the operands (subsystems). As noted in Before you begin, upgrade the Management subsystem first and the DataPower gateway subsystem second.
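    Typically, an operand upgrade is triggered by updating the version field in the subsystem CR to the target release. As an illustrative sketch only (the CR name m1 and the patch approach are assumptions; follow the subsystem-specific upgrade steps for your release), a Management subsystem upgrade could be started with:

      kubectl patch managementcluster m1 -n <namespace> --type merge -p '{"spec":{"version":"<target_version>"}}'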
  18. Verify that the upgraded subsystems report as Running and the RECONCILED VERSION displays the new version of API Connect.

    Run the following command:

    kubectl get apic --all-namespaces

    Example response:

    NAME                                                      READY   STATUS    VERSION    RECONCILED VERSION   AGE
    analyticscluster.analytics.apiconnect.ibm.com/analytics   8/8     Running   10.0.5.7   10.0.5.7-1074        121m

    NAME                                     PHASE     READY   SUMMARY                           VERSION    AGE
    datapowerservice.datapower.ibm.com/gw1   Running   True    StatefulSet replicas ready: 1/1   10.0.5.7   100m

    NAME                                     PHASE     LAST EVENT   WORK PENDING   WORK IN-PROGRESS   AGE
    datapowermonitor.datapower.ibm.com/gw1   Running                false          false              100m

    NAME                                            READY   STATUS    VERSION    RECONCILED VERSION   AGE
    gatewaycluster.gateway.apiconnect.ibm.com/gw1   2/2     Running   10.0.5.7   10.0.5.7-1074        100m

    NAME                                                 READY   STATUS    VERSION    RECONCILED VERSION   AGE
    managementcluster.management.apiconnect.ibm.com/m1   16/16   Running   10.0.5.7   10.0.5.7-1074        162m

    NAME                                             READY   STATUS    VERSION    RECONCILED VERSION   AGE
    portalcluster.portal.apiconnect.ibm.com/portal   3/3     Running   10.0.5.7   10.0.5.7-1074        139m
    Important: If you need to restart the deployment, wait until all Portal sites complete the upgrade. Run the following commands to check the status of the sites:
    1. Log in as an admin user:
      apic login -s <server_name> --realm admin/default-idp-1 --username admin --password <password>
    2. Get the portal service ID and endpoint:
      apic portal-services:get -o admin -s <management_server_endpoint> \
                   --availability-zone availability-zone-default <portal-service-name> \
                   --output - --format json
    3. List the sites:
      apic --mode portaladmin sites:list -s <management_server_endpoint> \
                   --portal_service_name <portal-service-name> \
                   --format json

      Any sites currently upgrading display the UPGRADING status; any site that completed its upgrade displays the INSTALLED status and the new platform version. Verify that all sites display the INSTALLED status before proceeding.

      For more information on the sites command, see apic sites:list and Using the sites commands.

    4. After all sites are in INSTALLED state and have the new platform listed, run:
      apic --mode portaladmin platforms:list -s <server_name> --portal_service_name <portal_service_name>
      

      Verify that the new version of the platform is the only platform listed.

      For more information on the platforms command, see apic platforms:list and Using the platforms commands.

  19. If you are upgrading from v10.0.4-ifix3, or upgrading from v10.0.1.7-eus (or higher): Enable analytics as explained in Enabling Analytics after upgrading.
  20. If you are upgrading from v10.0.5.3 or earlier, then you must update your ingress certificates.
    • If all API Connect subsystems are in the same namespace, then update the certificates with the ingress-issuer-v1.yaml from your target release.
      kubectl apply -f helper_files/ingress-issuer-v1.yaml -n <namespace>
    • For a multi-namespace deployment, use the helper_files/multi-ns-support files and follow steps 2.b to 2.f in Deploying operators in a multi-namespace API Connect cluster.
  21. Optional: Install the latest version of the optional components API Connect Toolkit and API Connect Local Test Environment after you complete the upgrade of the subsystems.
  22. Optional: Configure additional features related to inter-subsystem communication security, such as CA verification and JWT security: Optional post-upgrade steps for upgrade to 10.0.5.3 (or later) from earlier 10.0.5 release.