Troubleshooting
Problem
This document describes the general information and diagnostic data needed to start troubleshooting issues with EDB Postgres (CloudNativePG) components that are used with IBM Cloud Pak operators. When you open a case, include the diagnostics collected by following this document.
Resolving The Problem
General Diagnostic Information
These items are the general diagnostics, which are helpful in most situations regardless of component.
Note: oc commands are interchangeable with kubectl.
1: Provide a detailed description of the problem and your environment
- Provide a detailed description of your issue. Include screen captures and re-create steps if possible.
- Has this problem always been an issue, or is it an issue that started only after an upgrade or changed configuration?
- What is the business impact? Do we need to be aware of any deadlines impacted by the issue?
- What are the CloudNativePG operator, Cloud Pak, and OCP versions?
- Provide details on which Cloud Pak component has issues (including the pod names and namespace).
- Provide a reference to the documentation being followed for the failing operation.
- Is this environment development, test, or production?
- On which platform do you run the Cloud Pak (for example, OpenShift on-prem, OpenShift on IBM Cloud Public, AWS, or Azure)?
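The version question in the list above can usually be answered from the command line. The following is a minimal sketch, not an official procedure: the function name collect_versions and the default file name versions.txt are hypothetical, and the grep pattern assumes that the operator's ClusterServiceVersion name contains "postgres". Adjust both for your environment.

```shell
# Hypothetical helper: save the oc client/server versions and the installed
# operator versions (CSVs) that mention "postgres" to a single file.
collect_versions() {
  outfile="${1:-versions.txt}"
  {
    echo "## oc version"
    oc version
    echo "## operator versions (CSV) matching 'postgres'"
    # Assumption: the operator CSV name contains "postgres".
    oc get csv -A | grep -i postgres || echo "no matching CSV found"
  } > "$outfile" 2>&1
}

# Usage (requires a logged-in oc session):
#   collect_versions versions.txt
```

Attach the resulting file to the case together with your problem description.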
2: Collecting the cnp cluster report
2.1 Install the cnp plugin. Skip this step if the plugin is already installed.
EDB Postgres for Kubernetes provides a plugin for kubectl to manage a PostgreSQL cluster in Kubernetes. The plugin also works with the OpenShift environment for oc clients.
https://www.enterprisedb.com/docs/postgres_for_kubernetes/latest/cnp-plugin/
curl -sSfL https://github.com/EnterpriseDB/kubectl-cnp/raw/main/install.sh | sudo sh -s -- -b /usr/local/bin
Installation Tips:
- Make sure that "/usr/local/bin" is in your $PATH so that the `oc` command recognizes the `kubectl-cnp` plugin.
- Run "oc cnp --help" to test whether the oc command recognizes the installed kubectl-cnp plugin.
- Use the same release as your EDB Postgres for Kubernetes operator, or a newer one. Many of the commands are not available on cnpg 1.15 and older versions.
- The "kubectl" and "oc" commands are interchangeable.
- For an AirGap installation, see the instructions to download the RPM packages, and use yum to install them in the disconnected environment.
- See the EnterpriseDB website for details on installing the EDB Postgres for Kubernetes plugin.
2.2 Collect the cnp cluster status. The status command provides an overview of the status of your cnp cluster. The Postgres cluster name varies between the Cloud Pak operators that use the database. You can find the name of the PostgreSQL cluster by using the following command.
$ oc get cluster -A
NAMESPACE NAME AGE INSTANCES READY STATUS PRIMARY
cp4data wa-dwf-ibm-mt-dwf-pg 293d 3 3 Cluster in healthy state wa-dwf-ibm-mt-dwf-pg-6
cp4data wa-postgres 126d 3 3 Cluster in healthy state wa-postgres-1
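The cluster listing above can be reduced to namespace and name pairs, which are the two values needed by the status and report commands. This is a minimal sketch; the function name list_cnp_clusters is hypothetical, and it assumes the column layout shown in the sample output (NAMESPACE first, NAME second).

```shell
# Print "<namespace> <name>" for every cluster in `oc get cluster -A` output.
# Reads the table on stdin and skips the header row.
list_cnp_clusters() {
  awk 'NR > 1 && NF >= 2 { print $1, $2 }'
}

# Usage (requires a logged-in oc session):
#   oc get cluster -A | list_cnp_clusters
```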
Run the status command and save the output.
oc cnp status <cnp-cluster-name> --verbose -n <cnp-namespace> > cnp-cluster-name-status.txt
Replace cnp-cluster-name and cnp-namespace with the values for the problem cluster.
2.3 Collect the Cluster report
oc cnp report cluster <cnp-cluster-name> -f <cluster-name>-report.zip -n <cnp-namespace> --logs
Replace cnp-cluster-name and cnp-namespace with the values for the problem cluster.
For version 1.18.7, add the option "-ojson" to make sure that the logs are collected.
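Steps 2.2 and 2.3 can be wrapped in one small function so that both output files are named after the cluster. This is a sketch under stated assumptions: the function name collect_cnp_cluster is hypothetical, the commands and flags are the ones given in this document, and the "-ojson" option for version 1.18.7 is not added automatically.

```shell
# Hypothetical helper: run the status and report commands from steps 2.2
# and 2.3 for one cluster, naming the output files after the cluster.
collect_cnp_cluster() {
  cluster="${1:-}"
  namespace="${2:-}"
  if [ -z "$cluster" ] || [ -z "$namespace" ]; then
    echo "usage: collect_cnp_cluster <cnp-cluster-name> <cnp-namespace>" >&2
    return 1
  fi
  # Step 2.2: save the verbose status output.
  oc cnp status "$cluster" --verbose -n "$namespace" > "${cluster}-status.txt"
  # Step 2.3: write the cluster report, including logs, to a zip file.
  oc cnp report cluster "$cluster" -f "${cluster}-report.zip" -n "$namespace" --logs
}

# Usage (requires the cnp plugin and a logged-in oc session):
#   collect_cnp_cluster wa-postgres cp4data
```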
2.4 Collect the EDB PostgreSQL operator report
2.4.1 Find the namespace where the postgresql-operator-controller-manager pod is running
oc get pods -A | grep postgresql-operator-controller-manager
Example output:
ibm-common-services postgresql-operator-controller-manager-1-18-1-77c64c8684-qbk2w 1/1 Running 0 4d13h
2.4.2 Run the following command to collect the postgresql-operator-controller report. Replace <cnp-operator-namespace> with the namespace returned by the previous command.
Note: <cnp-operator-namespace> is the namespace where the postgresql-operator-controller-manager pod is running. The cnp operator might be installed in a different namespace than the one where the EDB cluster instances are running.
oc cnp report operator -n <cnp-operator-namespace> -f <cnpOperator-name>-report.zip --logs
Note: For version 1.18.7, add the option "-ojson" to make sure that the logs are collected.
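The namespace lookup in step 2.4.1 can be scripted so that the result feeds directly into the report command. This is a minimal sketch; the function name find_cnp_operator_namespace and the report file name in the usage comment are hypothetical, and the pattern assumes the pod name starts with postgresql-operator-controller-manager as in the example output.

```shell
# Print the namespace (first column) of the first operator pod found in
# `oc get pods -A` output read from stdin.
find_cnp_operator_namespace() {
  awk '$2 ~ /^postgresql-operator-controller-manager/ { print $1; exit }'
}

# Usage (requires the cnp plugin and a logged-in oc session):
#   ns=$(oc get pods -A | find_cnp_operator_namespace)
#   oc cnp report operator -n "$ns" -f cnp-operator-report.zip --logs
```

If the function prints nothing, no matching pod was found, and the namespace must be determined manually.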
2.5 Collect all the operator versions and pod status
oc get csv -A > all-csv.txt
oc get pods -A > all-pods.txt
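Before uploading, it can be useful to scan the saved pod list for pods that are not healthy. The following sketch is a triage aid only, not part of the required data collection: the function name list_unhealthy_pods is hypothetical, and it assumes the default `oc get pods -A` column order, in which STATUS is the fourth column. Pods that are Running but not ready are not reported.

```shell
# Print the header row plus every pod whose STATUS column is neither
# Running nor Completed. Reads the named file(s), or stdin if none is given.
list_unhealthy_pods() {
  awk 'NR == 1 || ($4 != "Running" && $4 != "Completed")' "$@"
}

# Usage:
#   list_unhealthy_pods all-pods.txt
```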
2.6 Collect the Custom Resource for the Cloud Pak operator, which manages the cnpg cluster
For example:
For Watson Discovery
oc get wd wd -o yaml -n <WD-namespace> > watsondiscoverCR.yaml
For BTS operator
oc get bts cp4ba-bts -n <bts-namespace> -o yaml > bts-CR.yaml
3: Collecting the data to diagnose installation and upgrade-related issues
For installation and upgrade-related issues, collect the Common Services must-gather data by using the following script:
cat > cp-must-gather-CP-CS.sh << 'EOT'
#!/bin/bash
export MY_CLOUDPAK_NAMESPACES=cp4d,cp4ba
export MUST_GATHER_IMAGE=icr.io/cpopen/cpfs/must-gather:latest
export CLOUDPAK_NAMESPACES=common-service,ibm-common-services,openshift-operators,openshift-operator-lifecycle-manager,openshift-marketplace,$MY_CLOUDPAK_NAMESPACES
export MUST_GATHER_MODULES=overview,system,failure,cloudpak,route
oc adm must-gather --image=$MUST_GATHER_IMAGE -- gather -m $MUST_GATHER_MODULES -n $CLOUDPAK_NAMESPACES
EOT
- Replace "cp4d,cp4ba" with the namespaces that your cluster uses for the Cloud Pak operators.
- If the cluster is offline (AirGap environment), use the locally mirrored registry: MUST_GATHER_IMAGE=[LOCAL_REGISTRY:5000]/cpopen/cpfs/must-gather:latest
- See "Cloudpak and common services mustgather" for additional options.
- Make the shell script that the previous command created executable, and run it to collect the support data.
chmod +x cp-must-gather-CP-CS.sh
./cp-must-gather-CP-CS.sh
A tar.gz file named cloudpak-must-gather-xxx.tar.gz is generated under the must-gather.local.xxx/quay-io-opencloudio-must-gather-xxxx directory.
- Upload the cloudpak-must-gather-xxx.tar.gz file. You do not need to archive the whole directory.
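Because the archive sits several directories deep, a short search can locate it. This is a minimal sketch; the function name find_must_gather_archive is hypothetical, and it assumes only the cloudpak-must-gather-*.tar.gz file name pattern described above.

```shell
# Print the path of every cloudpak must-gather archive found under the
# given directory (defaults to the current directory).
find_must_gather_archive() {
  find "${1:-.}" -type f -name 'cloudpak-must-gather-*.tar.gz'
}

# Usage, from the directory where the must-gather script was run:
#   find_must_gather_archive .
```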
Document Location
Worldwide
Product Synonym
common services, Cloud Pak foundational services
Document Information
Modified date:
14 June 2024
UID
ibm17001335