Troubleshooting installation on OpenShift
Review the following known issues and troubleshooting tips if you encounter a problem while installing API Connect on OpenShift, including as a component of IBM Cloud Pak for Integration (CP4I).
V10.0.7.0: Failed integration-ibm-cloud-native-postgresql CatalogSource on ROKS 4.14 and OpenShift Container Platform 4.15
The API Connect operator creates the EDB catalog source in the same namespace as the API Connect operator.
Status:
  Message: couldn't ensure registry server - error ensuring pod: : error creating new pod: integration-ibm-cloud-native-postgresql-: pods "integration-ibm-cloud-native-postgresql-hnjbn" is forbidden: violates PodSecurity "restricted:v1.24": allowPrivilegeEscalation != false (container "registry-server" must set securityContext.allowPrivilegeEscalation=false), unrestricted capabilities (container "registry-server" must set securityContext.capabilities.drop=["ALL"]), runAsNonRoot != true (pod or container "registry-server" must set securityContext.runAsNonRoot=true), seccompProfile (pod or container "registry-server" must set securityContext.seccompProfile.type to "RuntimeDefault" or "Localhost")
  Reason: RegistryServerError
The problem occurs when the namespace is set to enforce the restricted pod security admission policy with the pod-security.kubernetes.io/enforce: restricted label.
ROKS 4.14 and some OpenShift Container Platform versions such as 4.15 have enforce set to restricted.
To resolve the problem, patch the integration-ibm-cloud-native-postgresql CatalogSource to use the restricted security context constraint:
oc patch CatalogSource integration-ibm-cloud-native-postgresql --type merge --patch '{"spec":{"grpcPodConfig":{"securityContextConfig":"restricted"}}}'
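The check and the patch described above can be combined in a small script. The following is a minimal sketch, not an IBM-supplied tool: the function names get_enforce_level and patch_catalogsource_if_restricted are hypothetical, and the -n option assumes that the CatalogSource is in the namespace that you pass as the argument (the operator namespace).

```shell
# Hypothetical helper: print the pod security "enforce" level of a namespace.
# Dots inside the label key are escaped so that JSONPath treats it as one key.
get_enforce_level() {
  oc get namespace "$1" \
    -o jsonpath='{.metadata.labels.pod-security\.kubernetes\.io/enforce}'
}

# Hypothetical helper: apply the CatalogSource patch only when the namespace
# enforces the "restricted" level. Assumes the CatalogSource is in namespace $1.
patch_catalogsource_if_restricted() {
  if [ "$(get_enforce_level "$1")" = "restricted" ]; then
    oc patch CatalogSource integration-ibm-cloud-native-postgresql -n "$1" \
      --type merge \
      --patch '{"spec":{"grpcPodConfig":{"securityContextConfig":"restricted"}}}'
  else
    echo "Namespace $1 does not enforce 'restricted'; no patch applied."
  fi
}
```

For example, run patch_catalogsource_if_restricted <namespace> after logging in to the cluster with oc.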
One or more pods are in CrashLoopBackOff or Error state and report a certificate error in the logs
- javax.net.ssl.SSLHandshakeException: Received fatal alert: certificate_unknown
- Error: unable to verify the first certificate
- ERROR: openssl verify failed to verify the Portal CA tls.crt, ca.crt chain signed the Portal Server tls.crt cert
To resolve the problem:
- Use apicops (v10 version 0.10.57+ required) to validate the certificates in the system:
  apicops upgrade:stale-certs -n <namespace>
- If any certificate that is managed by cert-manager fails the validation, delete the stale certificate secret:
  oc delete secret <stale-secret> -n <namespace>
  Cert-manager automatically generates a new certificate to replace the one you deleted.
- Use apicops again to make sure that all certificates can be verified successfully:
  apicops upgrade:stale-certs -n <namespace>
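When several secrets fail validation, the delete and re-validate steps above can be scripted. The following is a minimal sketch under stated assumptions: refresh_stale_certs is a hypothetical helper name, the secret names are supplied by you after reading the apicops report (the script does not parse apicops output), and only the two commands already shown above are used.

```shell
# Hypothetical helper, not part of apicops or oc.
# Usage: refresh_stale_certs <namespace> <stale-secret> [<stale-secret> ...]
refresh_stale_certs() {
  ns="$1"
  shift
  # Delete each stale secret; cert-manager is expected to regenerate it.
  for secret in "$@"; do
    oc delete secret "$secret" -n "$ns" || return 1
  done
  # Re-run the validation to confirm that the certificates now verify.
  apicops upgrade:stale-certs -n "$ns"
}
```

Cert-manager might need a short time to issue the replacement certificates, so if the final validation still reports failures, run apicops upgrade:stale-certs -n <namespace> again after a brief wait.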
You see the denied: insufficient scope error during an air-gapped deployment
Problem: You encounter the denied: insufficient scope message while mirroring images during an air-gapped installation or upgrade.
Reason: This error occurs when a problem is encountered with the entitlement key that is used for obtaining images.
Solution: Obtain a new entitlement key by completing the following steps:
- Log in to the IBM Container Library.
- In the Container software library, select Get entitlement key.
- After the Access your container software heading, click Copy key.
- Copy the key to a safe location.
Apiconnect operator pod fails
Problem: During installation (or upgrade), the apiconnect operator fails with the following message:
panic: unable to build API support: unable to get Group and Resources: unable to retrieve the complete list of server APIs: packages.operators.coreos.com/v1: the server is currently unable to handle the request
goroutine 1 [running]:
github.ibm.com/velox/apiconnect-operator/operator-utils/v2/apiversions.GetAPISupport(0x0)
operator-utils/v2/apiversions/api-versions.go:89 +0x1e5
main.main()
ibm-apiconnect/cmd/manager/main.go:188 +0x4ee
- The apiconnect operator pod is in CrashLoopBackOff status.
- Kube apiserver pods log the following information:
  E1122 18:02:07.853093 18 available_controller.go:437] v1.packages.operators.coreos.com failed with: failing or missing response from https://10.128.0.3:5443/apis/packages.operators.coreos.com/v1: bad status from https://10.128.0.3:5443/apis/packages.operators.coreos.com/v1: 401
- The IP address logged here belongs to the package server pod in the openshift-operator-lifecycle-manager namespace.
- Package server pods log the following error messages, which show that the /apis/packages.operators.coreos.com/v1 API call is rejected with a 401 status:
  E1122 18:10:25.614179 1 authentication.go:53] Unable to authenticate the request due to an error: x509: certificate signed by unknown authority
  I1122 18:10:25.614224 1 httplog.go:90] verb="GET" URI="/apis/packages.operators.coreos.com/v1" latency=161.243µs resp=401 UserAgent="Go-http-client/2.0" srcIP="10.128.0.1:41370":
- The problem is intermittent.
- If you find the exact symptoms as described, the solution is to delete the package server pods in the openshift-operator-lifecycle-manager namespace.
- New package server pods log the 200 Success message for the same API call.
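Deleting the package server pods can be sketched as follows. This is an illustration only: restart_packageserver is a hypothetical helper name, and the app=packageserver label selector is an assumption about how the pods are labeled on your cluster. Confirm the pod names or labels first with oc get pods -n openshift-operator-lifecycle-manager --show-labels, and delete the pods by name if the label differs.

```shell
# Hypothetical helper: delete the package server pods so that the deployment
# re-creates them, then list the replacement pods.
# ASSUMPTION: the pods carry the label app=packageserver.
restart_packageserver() {
  olm_ns="openshift-operator-lifecycle-manager"
  oc delete pods -n "$olm_ns" -l app=packageserver || return 1
  oc get pods -n "$olm_ns" -l app=packageserver
}
```

After the new pods are running, check their logs for the 200 Success message on the /apis/packages.operators.coreos.com/v1 call.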
Disabling the Portal web endpoint check
The following example errors might appear in the admin container logs of the portal-www pod if the Portal web endpoint cannot be reached:
An error occurred contacting the provided portal web endpoint: example.com
The provided Portal web endpoint example.com returned HTTP status code 504
In this instance, you can disable the Portal web endpoint check so that the Developer Portal service can be created successfully.
- On Kubernetes, OpenShift, and IBM® Cloud Pak for Integration:
  Add the following section to the Portal custom resource (CR) template:
    spec:
      template:
      - containers:
        - env:
          - name: PORTAL_SKIP_WEB_ENDPOINT_VALIDATION
            value: "true"
          name: admin
        name: www