Known issues in foundational services

Get a quick overview of the known issues for the available foundational services.

Table 1. Known issues
Service
Description More information
IAM In foundational services version 3.23 and later, when you list the users in a group in the Platform UI console by using Azure SCIM integration, the username in the group might be displayed as undefined undefined. It is a limitation from Azure. Currently, no workaround is available.
IAM When you log in to the Platform UI console by using the SAML option, the login page is displayed twice. After you provide the login details, the login page is displayed again instead of the console home page. The second time, you do not need to provide the details again; click Login and the console home page is displayed. It is a known limitation. Currently, no workaround is available.
IAM Before you register the OIDC clients by using the IdP V3 API, you need to log in to the third-party ID provider. You can then register the OIDC clients in the application. While registering, use the cp-console URL as the application URL and https://<cp-console-url>/ibm/api/social-login/redirect/<name of the oidc> as the redirect URL. However, you might face an issue when you open the cp-console in a browser: when you click the configured ID provider name, you might not be redirected to the authentication page of that IdP. To troubleshoot the issue, see OIDC registration fails to update.
IAM If you configure LDAP and SAML after upgrading foundational services from version 3.11 (or earlier) to version 3.12, 3.13, 3.14, 3.15, or 3.16, the UI displays both the LDAP and SAML options even after the identity provider (IdP) configuration. To establish the link, upgrade from the current foundational services version to version 3.17.
IAM The IAM certificates are not updated even after the service is upgraded. For instance, in foundational services version 3.6, the certificates are created with ClusterIssuer, and in version 3.13, the certificates are created with Issuer. If you upgrade the cluster from foundational services version 3.6 to version 3.13, the IAM certificates are not updated to use Issuer; they still reference the ClusterIssuer. This issue occurs only when you upgrade from version 3.6.4. From version 3.6.5 onwards, the IAM certificates are updated when the service is upgraded. To resolve this issue, complete the following steps:
  • Get the depleted certificate by using the following command: oc get certificates.v1alpha1.certmanager.k8s.io --selector='certmanager.k8s.io/issuer-name=cs-ca-clusterissuer'
  • Delete the depleted certificate by using the following command: oc get certificates.v1alpha1.certmanager.k8s.io --selector='certmanager.k8s.io/issuer-name=cs-ca-clusterissuer' --no-headers | xargs oc delete certificates.v1alpha1.certmanager.k8s.io
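The two commands above can be combined into one small helper. The following is a minimal sketch, not an IBM-provided script: the function name delete_clusterissuer_certs is hypothetical, and it uses -o name instead of --no-headers so that only resource names, and no other output columns, are passed to oc delete. It assumes that oc is already logged in to the cluster and that the current project is the one that holds the certificates.

```shell
# Hypothetical helper: delete certificates that still reference the ClusterIssuer.
# Assumes `oc` is on PATH and already logged in to the target cluster.
delete_clusterissuer_certs() {
  selector='certmanager.k8s.io/issuer-name=cs-ca-clusterissuer'
  # -o name prints one "<resource>/<name>" entry per line.
  names=$(oc get certificates.v1alpha1.certmanager.k8s.io --selector="$selector" -o name)
  if [ -z "$names" ]; then
    echo "No certificates reference cs-ca-clusterissuer."
    return 0
  fi
  # Word splitting on $names is intentional: one argument per certificate.
  oc delete $names
}
```

Running delete_clusterissuer_certs with no arguments lists the matching certificates and deletes them in a single oc delete call.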
IAM The iam-onboarding job does not complete on a bare metal cluster because the auth-idp pod is not fully running. The platform-auth-service container, which runs in the auth-idp pod and depends on the Liberty REST API, restarts repeatedly due to an application error. The JVM (Java virtual machine) fails because many javacore dump files are generated in the container, which stops the Liberty process. The JVM failure is specific to bare metal clusters where the HugePages feature is integrated into the Linux kernel. If HugePages are enabled, the JVM uses HugePages by default. Normally, if physical HugePages are not available, the operating system provides small 4k pages. In the container, if the pod environment is not allowed to access HugePages, the operating system terminates the process that uses HugePages. Currently, no workaround is available to complete the iam-onboarding job because you cannot edit the files inside the container.
IAM LDAP user names are case-sensitive. You must use the name exactly the way it is configured in your LDAP directory.
IAM A SAML user with Platform UI administrator permission has only the viewer role set in IAM. You must assign roles individually to SAML users in IAM.
IAM The OpenShift group does not synchronize when a user is added to or removed from an LDAP group. An OpenShift group is created when you add the LDAP group to teams. When a user is added to or removed from an LDAP group on the LDAP server side, the OpenShift group is not updated by any process or thread in IAM. To resolve this issue, delete and re-add the LDAP group to teams to recreate the OpenShift group with the latest members.
IAM The OpenShift users are not removed when you remove them from the LDAP group. An OpenShift group is created when you add the LDAP group to teams. An OpenShift user is created when you add an LDAP user to teams, or when this LDAP user logs in to the IBM Cloud Pak console. When a user is removed from an LDAP group on the LDAP server side, the OpenShift group is not updated by any process or thread in IAM. An OpenShift user or group is deleted only if this user or group is deleted from teams. To resolve this issue, delete and re-add the LDAP group to teams to recreate the OpenShift group with the latest members, and manually delete the OpenShift user. To delete the user, use the following command: oc delete user <user_id>.
IAM The SAML identity provider is removed from the SAML configuration when you upgrade from OCP version 4.10 to 4.12. To resolve the issue, complete the following steps:
1. Restart the MongoDB and auth pods.
oc delete pod -n ibm-common-services -l app=icp-mongodb
2. Verify that the pods are running.
oc get pod -n ibm-common-services | egrep 'NAME|icp-mongodb'
If the pod status shows as Running, proceed with the next step.
3. Delete the auth pods.
oc delete pod -n ibm-common-services -l k8s-app=auth-idp
oc delete pod -n ibm-common-services -l k8s-app=auth-pap
oc delete pod -n ibm-common-services -l k8s-app=auth-pdp
4. Verify the pod status.
oc get pod -n ibm-common-services | egrep 'NAME|auth-idp|auth-pap|auth-pdp'
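The verification in steps 2 and 4 can also be scripted instead of read by eye. The following is a minimal sketch under stated assumptions: the function name pods_running is hypothetical, and it relies on the default oc get pod column layout, where STATUS is the third column. It succeeds only if at least one pod name starts with the given prefix and every such pod reports Running.

```shell
# Hypothetical helper: succeed only if every pod whose name starts with the
# given prefix is in the Running status (and at least one such pod exists).
# Assumes the default `oc get pod` columns: NAME READY STATUS RESTARTS AGE.
pods_running() {
  prefix=$1
  namespace=${2:-ibm-common-services}
  oc get pod -n "$namespace" --no-headers |
    awk -v p="$prefix" '
      index($1, p) == 1 { seen = 1; if ($3 != "Running") bad = 1 }
      END { if (seen && !bad) exit 0; exit 1 }
    '
}
```

For example, pods_running icp-mongodb && echo "MongoDB is ready" prints the message only after the MongoDB pods are back in the Running status.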
Installer - Federal Information Processing Standard (FIPS) When you enable FIPS compliance for a service by using the CommonService CR, the configuration is not propagated in the OperandConfig. If you use the OperandConfig to add the configuration, the configuration might be removed after an upgrade. To work around this issue, enable FIPS-compliance in the OperandConfig by adding spec.fipsEnabled: true. If you upgrade foundational services, you must re-enable FIPS-compliance in the OperandConfig.
Installer OLM is unable to generate new installation plans for updates or new installations. For more information about the issue and the steps to resolve the issue, see OLM is unable to generate new install plans.
Installer The issue of incorrect configuration of the Platform API service in the CommonService custom resource is fixed in foundational services version 3.18.x. For more information, see IBM Cloud Pak foundational services InstallPlan fails during installation or upgrade.
Installer In foundational services version 3.15.x, an operator fails to upgrade and continuously shows UpgradePending status in a namespace. To resolve the issue, complete the steps in Operator shows UpgradePending status in a namespace.
Installer For foundational services versions 3.13 and earlier, if the ClusterServiceVersion (CSV) of the IBM NamespaceScope Operator Restricted shows the Pending status, you need to manually correct the typographical error in the CSV. To resolve the issue, complete the steps in IBM NamespaceScope Operator Restricted CSV shows pending status.
Installer In OpenShift clusters with multitenant isolation mode, each project is isolated by default. Network traffic is not allowed between pods or services in different projects. To resolve the issue, complete the steps in Disabling network isolation.
Installer After you upgrade foundational services, you might see that some of the operator pods are in CrashLoopBackOff status. This is because of an Operator Lifecycle Manager (OLM) known issue. For more information about the issue and the steps to resolve the issue, see Operator upgrade fails - OLM known issue.
Installer After you upgrade the foundational services installer from Version 3.2.4 directly to Version 3.6.3, or first to Version 3.5.6 and then from Version 3.5.6 to Version 3.6.x, management-ingress pods do not start. To resolve the issue, complete the steps in Management ingress pods fail after upgrade from a Helm release.
Installer - IAM If an OpenShift user named admin exists, it collides with the IBM Cloud Pak foundational services default user admin. To resolve the issue, rename the IBM Cloud Pak foundational services default username if an admin username exists in OpenShift. For more information, see Changing the default admin username.
Installer When upgrading to foundational services version 3.6.4, you encounter an issue with the upgrade of the Operand Deployment Lifecycle Manager operator. The upgrade of foundational services fails. To resolve this issue, complete the steps in Operand Deployment Lifecycle Manager cannot update when upgrading to version 3.6.4.
Installer When you install or upgrade foundational services, you might see that some of the operators are in a Pending, Unknown, or Can't Update status. This is because of an Operator Lifecycle Manager (OLM) known issue. For more information about the issue and the steps to resolve the issue, see the following topics:
Installer - MongoDB When you install foundational services 3.8.x on OpenShift 4.6 or 4.7 on Linux on Power (ppc64le) or IBM Z architecture, one of the MongoDB pods hangs. To resolve the issue, complete the steps in MongoDB pod hangs after installation.
Installer When you install foundational services in an Azure environment with Azure storage, foundational services pods do not start. To resolve this issue, get the scc.uid from the installation namespace before creating the custom Azure storage class. For more information, see Using Azure File storage class.
Installer When you set the approval strategy for foundational services 3.8.x and 3.9.x to Manual, either the CommonService custom resource status is never Succeed or the OperandRequest instance status is never Running. To resolve the issue, complete the steps in CommonService custom resource or OperandRequest instance failing under Manual approval strategy.
Installer - IAM When you install or upgrade to foundational services 3.11.x, the iam-config-job pod goes into the CrashLoopBackOff status and the deployment stalls. The following error is displayed in the iam-config-job log: 000. The iam-config-job uses the zen-console route. This issue occurs when the zen-console route cannot be accessed, which might be caused by a network policy or a DNS issue. To avoid this issue, ensure that the zen-console route is accessible.
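One way to confirm that the route is reachable before you install or upgrade is to request it with curl. The following is a minimal sketch: the function name route_reachable is hypothetical, and the route URL is passed in as an argument because the route host is not given here. curl reports an HTTP status of 000 when it receives no response at all, which would be consistent with the 000 in the job log, although how the job itself produces that value is an assumption.

```shell
# Hypothetical check: succeed if the given URL returns any HTTP response.
# curl writes 000 as the status code when no response is received
# (for example, a DNS failure or a blocked connection).
route_reachable() {
  url=$1
  code=$(curl -k -s -o /dev/null --max-time 10 -w '%{http_code}' "$url")
  echo "HTTP status: $code"
  [ "$code" != "000" ]
}
```

Call it as route_reachable "https://<zen-console-route-host>" from a machine or pod that is subject to the same network policy and DNS configuration as the cluster workloads.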
OpenShift Container Platform upgrade If you configured Multitenancy account quota enforcement in your cluster in a previous release, upgrade of your OpenShift Container Platform version might fail. To resolve the issue, first disable Multitenancy account quota enforcement, then upgrade your OpenShift Container Platform version.
Installer - EDB Postgres When you install foundational services version 3.12.x with EDB Postgres in a namespace other than ibm-common-services, the create-postgres-license-config job fails because of BackoffLimitExceeded. To resolve this issue, complete the steps in Deploying the EDB Postgres operator in a custom namespace.
Installer - EDB Postgres When you upgrade foundational services version 3.12.x or older with EDB Postgres in a namespace other than ibm-common-services, cloud-native-postgresql does not work correctly. To resolve this issue, complete the steps in Incorrect EDB Postgres operator environment variable configuration.
Installer When you install foundational services, cloud-native-postgresql is installed with the certified-operators catalogsource. To resolve this issue, see cloud-native-postgresql is installed with certified-operators CatalogSource.
Installer When you install foundational services in an air-gapped environment, the ibm-zen-cpp-operator image fails to be pulled after running the mirroring command with the latest CASE. The issue exists on certain registries such as Quay. To resolve this issue, see ibm-zen-cpp-operator image fails to mirror during an air-gapped installation.
Installer After upgrading an OpenShift cluster to OpenShift version 4.15.x via the OpenShift console, the foundational services operator CSV fails with the following message: install strategy failed: rolebindings.rbac.authorization.k8s.io "ibm-common-service-operator-service-auth-reader". To resolve this issue, see Install strategy fails after upgrading OpenShift to 4.15.x.
MongoDB When you install foundational services, the use of NFS storage and self-defined persistent volumes has extra restrictions that might stop some of your workloads. For example, the MongoDB deployment might not run properly.
Audit logging service 3.6.x With the HTTP support for audit logging technology preview, a service can generate audit records and post in any namespace if it is sending data in insecure mode. If the service posts audit records over TLS, the service must be running in the ibm-common-services namespace. In Audit logging 3.7.0, the Rsyslog sidecar must run in the same namespace as your fluentd instance.
Audit logging service 3.7.0 Services that use the audit logging Rsyslog sidecar must run in the same namespace as the fluentd instance configured for the service.
Cert-manager If there are two cert-managers on your cluster, your Certificates might not be in the ready status. You must uninstall one of the cert-managers. See Problem when you install two different cert-managers.
Cert-manager Pods that have certificate secrets mounted are restarted about every 10 hours in foundational services 3.13. As a temporary fix, scale down the ibm-cert-manager-operator pod. See Pods restarted regularly every 10 hours for more information. This bug is fixed in foundational services 3.14.
Cert-manager The self-signed CA certificate that is used by IBM Cloud Pak foundational services and created by the cert-manager service has a duration of 90 days. The CA certificate is refreshed by cert-manager, but the leaf certificates that use the CA certificate must be manually refreshed. Check the expiration date of the CA certificate, refresh the CA certificate before it expires, and then renew the leaf certificates. The CA certificate duration can also be updated. See Refreshing IBM Cloud Pak foundational services internal CA certificate.
Cert-manager In foundational services version 3.6.x and earlier, the self-signed CA certificate that is used by IBM Cloud Pak foundational services and created by the cert-manager service has a duration of 90 days. The CA certificate is refreshed by cert-manager, but the leaf certificates that use the CA certificate must be manually refreshed. Extend the duration of the CA certificate, refresh the CA certificate, and then renew the leaf certificates. See Refreshing foundational services internal certificates.
Cert-manager In foundational services version 3.7.x and 3.8.x, you might experience the following symptoms:
- cert-manager-cainjector pod restarts many times.
- Certificates are not ready.
- cert-manager-webhook pod has errors about bad TLS certificate in the logs.
To resolve this problem, see Certificate manager experiences CPU resource issues.
Cert-manager After upgrading foundational services from the EUS version to version 3.16.0, you might have issues with creating Issuers and Certificates, with an error message that contains the following phrase: failed calling webhook webhook.certmanager.k8s.io. To resolve this problem, see Cannot create Issuers or Certificates after upgrade.
Events operator When upgrading Events operator from previous versions, a Zookeeper pod ends up in a CrashLoopBackOff state. To resolve this problem, see Zookeeper pod hangs in a CrashLoopBackOff state.
Events operator Events operator is periodically printing the following message: Failed to acquire lock during the reconciliation process, and it is timing out. This might indicate that the lock was not properly released due to an error. To resolve the problem, restart the Events operator to release the lock.
platform-api Starting in foundational services version 3.8.0, secrets are constantly being created by the Platform API operator. Eventually thousands of unnecessary secrets are created, overwhelming applications that watch secrets. This situation impacts the performance of the cluster overall. This issue is fixed in foundational services version 3.10.0. When it is not possible for you to upgrade to foundational services version 3.10.0, see Secrets constantly created by Platform API operator for information about how to work around the problem.
Platform UI - installer version 3.12.0 The Administration panel is available under the Administration category in the main navigation on Platform UI. You can manually remove the ibm-commonui-bindinfo-common-webui-ui-extensions config map from the IBM Cloud Pak namespace.
Platform UI Upgrade of Platform UI (zen) operand fails. To resolve this problem, see Upgrade of Platform UI (zen) operand fails.
OpenShift Container Platform upgrade and Logging A failure occurs when you upgrade from OCP version 4.6 to 4.7. The issue appears to be defective behavior on the OCP side, which assigns the logging SCC to apiserver. To work around the issue, run oc delete scc logging-elk-filebeat-ds, and then manually restart the apiserver pods that are in CrashLoopBackOff status.
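The manual restart can be sketched as a pair of shell functions. This is a minimal sketch under stated assumptions: the function names are hypothetical, the namespace that holds the apiserver pods is passed in as an argument because it is not named here, and the pods are assumed to be recreated by their controller after deletion. It also assumes the default oc get pod column layout, where STATUS is the third column.

```shell
# Hypothetical helper: print the names of pods in CrashLoopBackOff in a namespace.
# Assumes the default `oc get pod` columns: NAME READY STATUS RESTARTS AGE.
crashloop_pods() {
  namespace=$1
  oc get pod -n "$namespace" --no-headers |
    awk '$3 == "CrashLoopBackOff" { print $1 }'
}

# Restart those pods by deleting them one at a time.
# Assumption: each pod is managed by a controller that recreates it.
restart_crashloop_pods() {
  namespace=$1
  for pod in $(crashloop_pods "$namespace"); do
    oc delete pod -n "$namespace" "$pod"
  done
}
```

Run crashloop_pods <namespace> first to review which pods would be affected, and then run restart_crashloop_pods <namespace> after the SCC is deleted.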