IBM Support

Mustgather: Collecting data to diagnose issues with EDB Postgres (CloudNativePG)

Troubleshooting


Problem

This document describes the general information and diagnostic data needed to start troubleshooting issues with EDB Postgres (CloudNativePG) components that are used with IBM Cloud Pak operators. When you open a case, include the diagnostics collected by following this document.

Resolving The Problem

General Diagnostic Information
 
These items are the general diagnostics, which are helpful in most situations regardless of component.   
Note: oc commands are interchangeable with kubectl.
 

1: Provide a detailed description of the problem and your environment

  • Provide a detailed description of your issue. Include screen captures and steps to re-create the problem if possible.
  • Has this problem always been an issue, or is it an issue that started only after an upgrade or changed configuration?
  • What is the business impact? Do we need to be aware of any deadlines impacted by the issue?
  • What are the CloudNativePG operator, Cloud Pak, and OCP versions?
  • Provide details on which Cloud Pak component has issues (including the pod names and namespace).
  • Provide a reference to the documentation being followed for the failing operation.
  • Is this environment development, test, or production?
  • On which platform do you run the Cloud Pak (for example, OpenShift on-prem, OpenShift on IBM Cloud Public, AWS, or Azure)?
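
The OpenShift version asked for in the list above can be captured in a file and attached to the case. The following is a minimal sketch, not an official collection script. The function name and the default output file name are illustrative assumptions, and `oc get clusterversion` applies to OpenShift clusters only.

```shell
#!/bin/sh
# Sketch: record the oc client/server versions and the OpenShift cluster version.
# collect_versions and the default file name are hypothetical, chosen for illustration.
collect_versions() {
    out="${1:-environment-versions.txt}"
    {
        echo "== oc version =="
        oc version
        echo "== cluster version =="
        oc get clusterversion
    } > "$out" 2>&1
    # Print the name of the file that was written.
    echo "$out"
}

# Usage (requires a logged-in oc session):
#   collect_versions environment-versions.txt
```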
2: Collecting the cnp cluster report
 
2.1 Install the cnp plugin. Skip this step if you have already installed it.
EDB Postgres for Kubernetes provides a plugin for kubectl to manage a PostgreSQL cluster in Kubernetes. The plugin also works with oc clients in an OpenShift environment.
https://www.enterprisedb.com/docs/postgres_for_kubernetes/latest/cnp-plugin/
        

     curl -sSfL https://github.com/EnterpriseDB/kubectl-cnp/raw/main/install.sh | sudo sh -s -- -b /usr/local/bin
      
       Installation tips:
            - Make sure that "/usr/local/bin" is in your $PATH so that the `oc` command can recognize the `kubectl-cnp` plugin.
            - Run "oc cnp --help" to test whether the oc command recognizes the installed kubectl-cnp plugin.
            - Use a plugin release that is the same as or newer than your EDB Postgres for Kubernetes operator. Many of the commands are not available in versions 1.15 and older.
            - The "kubectl" and "oc" commands are interchangeable.
            - For an air-gapped installation, see the instructions to download the RPM packages, and use yum to install them in the disconnected environment.
            - See the EnterpriseDB website for details on installing the EDB Postgres for Kubernetes plugin.
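
The first two tips can be checked before you run any plugin command. The following is a minimal sketch. The function names are illustrative, and the binary name kubectl-cnp is an assumption based on the plugin name used in this document.

```shell
#!/bin/sh
# Return 0 if directory $1 is an entry of the colon-separated list $2
# (defaults to $PATH when $2 is not given).
path_contains() {
    case ":${2-$PATH}:" in
        *":$1:"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Report whether the plugin binary can be found.
# Assumption: the installed binary is named kubectl-cnp.
check_cnp_plugin() {
    if ! path_contains /usr/local/bin; then
        echo "/usr/local/bin is not in PATH" >&2
        return 1
    fi
    if ! command -v kubectl-cnp >/dev/null 2>&1; then
        echo "kubectl-cnp was not found in PATH" >&2
        return 1
    fi
    echo "kubectl-cnp found at $(command -v kubectl-cnp)"
}

# Usage:
#   check_cnp_plugin && oc cnp --help
```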
            
2.2 Collect the cnp cluster status. The status command provides an overview of the status of your cnp cluster.

The Postgres cluster name varies between the Cloud Pak operators that use the database. You can find the name of the PostgreSQL cluster by using the following command.
 

$ oc get cluster -A

NAMESPACE   NAME                   AGE    INSTANCES   READY   STATUS                     PRIMARY
cp4data     wa-dwf-ibm-mt-dwf-pg   293d   3           3       Cluster in healthy state   wa-dwf-ibm-mt-dwf-pg-6
cp4data     wa-postgres            126d   3           3       Cluster in healthy state   wa-postgres-1
Run the status command and save the output to a file.

     oc cnp status <cnp-cluster-name> --verbose -n <cnp-namespace> > <cnp-cluster-name>-status.txt
Replace <cnp-cluster-name> and <cnp-namespace> with the values for the problem cluster.
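
If several clusters are affected, or you are not sure which one is at fault, the two commands above can be combined. The following is a minimal sketch that saves the status of every cluster returned by `oc get cluster -A`. The function name and the <namespace>-<cluster>-status.txt file naming are illustrative assumptions.

```shell
#!/bin/sh
# Sketch: save `oc cnp status` output for every cluster that `oc get cluster -A` lists.
# collect_all_cnp_status and the output file naming are hypothetical.
collect_all_cnp_status() {
    # --no-headers drops the column header row; column 1 is the namespace, column 2 the name.
    oc get cluster -A --no-headers | while read -r ns name _rest; do
        [ -n "$name" ] || continue
        # </dev/null keeps the inner command from consuming the loop's input.
        oc cnp status "$name" --verbose -n "$ns" > "${ns}-${name}-status.txt" 2>&1 </dev/null
        echo "${ns}-${name}-status.txt"
    done
}

# Usage (requires a logged-in oc session and the cnp plugin):
#   collect_all_cnp_status
```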
           
2.3 Collect the cluster report

     oc cnp report cluster <cnp-cluster-name> -f <cnp-cluster-name>-report.zip -n <cnp-namespace> --logs
Replace <cnp-cluster-name> and <cnp-namespace> with the values for the problem cluster.
For version 1.18.7, add the option "-ojson" to make sure that the logs are collected.

2.4 Collect the EDB PostgreSQL operator report
    
2.4.1 Find the namespace where the postgresql-operator-controller-manager pod is running.

oc get pods -A | grep postgresql-operator-controller-manager

 
Example output:
ibm-common-services   postgresql-operator-controller-manager-1-18-1-77c64c8684-qbk2w           1/1     Running                  0                  4d13h
 
2.4.2 Run the following command to collect the postgresql-operator-controller-manager report. Replace <cnp-operator-namespace> with the namespace found in step 2.4.1.

     oc cnp report operator -n <cnp-operator-namespace>  -f  <cnpOperator-name>-report.zip  --logs 
Note: <cnp-operator-namespace> is the namespace where the postgresql-operator-controller-manager pod is running. The cnp operator might be installed in a different namespace from the one where the EDB cluster instances are running.
Note: For version 1.18.7, add the option "-ojson" to make sure that the logs are collected.
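
Steps 2.4.1 and 2.4.2 can be chained so that the namespace does not have to be copied by hand. The following is a minimal sketch. The function names and the report file name are illustrative assumptions, and any extra arguments (for example "-ojson") are passed through unchanged.

```shell
#!/bin/sh
# Print the namespace of the first pod whose name starts with
# postgresql-operator-controller-manager (prints nothing if no pod matches).
find_cnp_operator_namespace() {
    oc get pods -A --no-headers |
        awk '$2 ~ /^postgresql-operator-controller-manager/ {print $1; exit}'
}

# Collect the operator report from the detected namespace.
# The report file name is a hypothetical choice for illustration.
collect_cnp_operator_report() {
    ns=$(find_cnp_operator_namespace)
    if [ -z "$ns" ]; then
        echo "postgresql-operator-controller-manager pod not found" >&2
        return 1
    fi
    oc cnp report operator -n "$ns" -f "cnp-operator-${ns}-report.zip" --logs "$@"
}

# Usage (requires a logged-in oc session and the cnp plugin):
#   collect_cnp_operator_report
#   collect_cnp_operator_report -ojson    # for version 1.18.7
```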
2.5 Collect all the operator versions and pod statuses
  
     oc get csv -A  > all-csv.txt 
     oc get pods -A  > all-pods.txt 
 
2.6 Collect the custom resource for the Cloud Pak operator that manages the cnp cluster
For example:
     For Watson Discovery

     oc get wd wd -o yaml -n <WD-namespace> > watsondiscoverCR.yaml
     For the BTS operator
  
     oc get bts cp4ba-bts -n <bts-namespace> -o yaml > bts-CR.yaml
3: Collecting the data to diagnose installation and upgrade-related issues

For installation and upgrade-related issues, collect the Common Services must-gather. The following command creates the collection script:

cat > cp-must-gather-CP-CS.sh << 'EOT'
#!/bin/bash
export MY_CLOUDPAK_NAMESPACES=cp4d,cp4ba
export MUST_GATHER_IMAGE=icr.io/cpopen/cpfs/must-gather:latest
export CLOUDPAK_NAMESPACES=common-service,ibm-common-services,openshift-operators,openshift-operator-lifecycle-manager,openshift-marketplace,$MY_CLOUDPAK_NAMESPACES
export MUST_GATHER_MODULES=overview,system,failure,cloudpak,route
oc adm must-gather --image=$MUST_GATHER_IMAGE -- gather -m $MUST_GATHER_MODULES -n $CLOUDPAK_NAMESPACES
EOT

 
  • Replace "cp4d,cp4ba" with the namespaces that your cluster uses for the Cloud Pak operators.
  • If the cluster is offline (air-gapped environment), use the locally mirrored registry: MUST_GATHER_IMAGE=[LOCAL_REGISTRY:5000]/cpopen/cpfs/must-gather:latest
  • Make the shell script created by the command above executable, and run it to collect the support data.
   
chmod +x cp-must-gather-CP-CS.sh
./cp-must-gather-CP-CS.sh
A compressed archive named cloudpak-must-gather-xxx.tar.gz is generated under the must-gather.local.xxx/quay-io-opencloudio-must-gather-xxxx directory.
  • Upload the cloudpak-must-gather-xxx.tar.gz file. There is no need to compress the whole directory.
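
Because the generated directory names contain run-specific suffixes, it can help to search for the archive instead of typing the path. The following is a minimal sketch. The function name is an illustrative assumption, and the search patterns follow the file and directory names described above.

```shell
#!/bin/sh
# Sketch: list must-gather archives below a base directory (default: current directory).
# find_must_gather_archives is a hypothetical helper name.
find_must_gather_archives() {
    base="${1:-.}"
    find "$base" -type f -path '*must-gather.local.*' -name 'cloudpak-must-gather-*.tar.gz'
}

# Usage, from the directory where the script was run:
#   find_must_gather_archives
```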

Document Location

Worldwide


Product Synonym

common services, Cloud Pak foundational services

Document Information

Modified date:
14 June 2024

UID

ibm17001335