IBM Support

Mustgather: Collecting data to diagnose issues with IBM Cloud Pak for Business Automation

Troubleshooting


Problem

This document describes the general information and diagnostic data needed to start troubleshooting issues related to container components included in IBM Cloud Pak for Business Automation (CP4BA). When you open a case for Cloud Pak for Business Automation, include the diagnostics collected by following this document.
Note: This product was previously called Cloud Pak for Automation.

Resolving The Problem

When you contact support for assistance with a Cloud Pak for Business Automation issue, collect the following troubleshooting data.

General Diagnostic Information
These items are the general diagnostics, which are useful in most situations regardless of component. 
Important: When you run the diagnostic commands, run them from an empty collection directory to make it easy to package the files. Run the commands from the project or namespace containing Cloud Pak for Business Automation or use the -n <namespace> flag with all oc commands.
Note: oc commands are interchangeable with kubectl.
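
The advice about working from an empty collection directory can be sketched as two small helper functions. This is only an illustration: the function names and the cp4ba-mustgather directory prefix are assumptions, not part of the product.

```shell
# Sketch only: helper names and the default directory prefix are illustrative assumptions.

# Create a new, empty, timestamped collection directory and print its path.
new_collection_dir() {
    dir="${1:-cp4ba-mustgather}-$(date +%Y%m%d-%H%M%S)"
    mkdir "$dir" && printf '%s\n' "$dir"
}

# Package a collection directory into <dir>.tar.gz next to the directory.
package_collection() {
    dir="${1%/}"
    tar -czf "$dir.tar.gz" -C "$(dirname "$dir")" "$(basename "$dir")"
}

# Example usage (commented out so that defining the helpers has no side effects):
#   collect_dir=$(new_collection_dir) && cd "$collect_dir"
#   ... run the oc commands from this document ...
#   cd .. && package_collection "$collect_dir"
```

Using a timestamp in the directory name keeps repeated collections separate, so each archive attached to the case contains only one run.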

1: Provide a detailed description of the problem and your environment

  • Provide a detailed description of your issue. Include screen captures and re-create steps if possible.
    Is it an intermittent or recreatable issue? Has this problem always been an issue or is it an issue that started only after a change occurred?
    What is the business impact? Do we need to be aware of any deadlines impacted by the issue?
    Provide details on which component of the Cloud Pak has issues.
  • Provide a reference to the documentation being followed for the failing operation.
  • Is this environment development, test, or production?
  • Which platform setup are you using (OpenShift, OpenShift on IBM Cloud Public, other Kubernetes platform)?
  • What is the database type and version?
     

2: Gather configuration information

Option 1:
If you are using CP4BA 23.0.1 or later, an improved must-gather image is available that gathers similar information and more. For details on building a command tailored to your setup and problem, see Gathering deployment information and logs from Cloud Pak for Business Automation. The Option 1 commands that follow can still be used on any version to gather general information.

Option 1 gets all the information about resources in the namespace. It additionally gets the output of oc logs for each pod.
oc get icp4acluster -o yaml > CP4BAconfig.yaml
cp4ba_namespace=$(cat CP4BAconfig.yaml|awk '/namespace:/{print $2}')
oc adm must-gather --image=icr.io/cpopen/cpfs/must-gather:latest -- gather -m automationfoundation -n ${cp4ba_namespace:?}
Important: The cp4ba_namespace variable must contain the Cloud Pak namespace that you want to collect data from. If an icp4acluster CR exists, the preceding commands parse the namespace from it; otherwise, set the variable to the appropriate value yourself. In an air-gapped setup, make sure that you have pushed the latest version of the must-gather image to your local repository. The command requires cluster admin access.
Generally, this collection takes 5-10 minutes and produces a 25-50MB gzip file.
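
Because the awk expression prints every line that contains namespace:, the variable can end up empty or hold several values. The following sketch checks that the saved CR file yields exactly one namespace before you rely on it. The function names are hypothetical, and the check assumes the namespace appears as a plain "namespace: value" line in the YAML.

```shell
# Sketch only: function names are hypothetical helpers, not part of oc or CP4BA.

# Print the distinct namespace values found in a saved CR YAML file, one per line.
list_cr_namespaces() {
    awk '$1 == "namespace:" {print $2}' "$1" | sort -u
}

# Print the namespace only when exactly one distinct value is found; fail otherwise.
pick_cr_namespace() {
    found=$(list_cr_namespaces "$1")
    count=$(printf '%s\n' "$found" | grep -c . || true)
    if [ "$count" -eq 1 ]; then
        printf '%s\n' "$found"
    else
        echo "Expected exactly one namespace in $1, found $count; set cp4ba_namespace manually." >&2
        return 1
    fi
}

# Example usage (commented out; requires CP4BAconfig.yaml from the commands above):
#   cp4ba_namespace=$(pick_cr_namespace CP4BAconfig.yaml) || echo "Set cp4ba_namespace yourself"
```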

If you are having issues with foundational services in the ibm-common-services namespace or in other namespaces, run the command again with that namespace to collect data from it. For example:
oc adm must-gather --image=icr.io/cpopen/cpfs/must-gather:latest -- gather -m automationfoundation -n ibm-common-services
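
If several namespaces are involved, the same command can be repeated in a loop. A minimal sketch, assuming a hypothetical helper named gather_namespaces and that you are logged in with cluster admin access:

```shell
# Sketch only: gather_namespaces is a hypothetical wrapper around the command shown above.
# It runs the must-gather once per namespace and stops at the first failure.
gather_namespaces() {
    for ns in "$@"; do
        oc adm must-gather --image=icr.io/cpopen/cpfs/must-gather:latest -- gather -m automationfoundation -n "$ns" || return 1
    done
}

# Example usage (commented out; replace the first argument with your CP4BA namespace):
#   gather_namespaces "${cp4ba_namespace:?}" ibm-common-services
```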

Option 2:
Option 2 provides some basic configuration data if you cannot run Option 1. It gathers far less data, and depending on the problem, support will likely need to request more configuration data as the investigation progresses.

##Provide the OpenShift and Kubernetes version information:
oc version > version.txt
kubectl version >> version.txt

##Provide the operator version information:
oc get csv > operatorInfo.txt
oc get csv -n ibm-common-services >> operatorInfo.txt

##Provide the custom resource (CR) YAML used by the operators to configure the environment:
oc get icp4acluster -o yaml > cp4baConfig.yaml
oc get content -o yaml > contentConfig.yaml

##If you have the needed permissions, collect information about the nodes:
oc get nodes -o wide > nodes.txt
oc get mcp > mcp.txt

##Collect information about the pod statuses
oc get pods > pods.txt

##Collect information about the pod containers
oc get pods -o jsonpath="{..image}" > containerInfo.txt

##Gather route configuration
oc get route > routes.txt

##Collect the defined secrets
oc get secrets > secrets.txt

##Collect the defined persistent volume claims
oc get pvc > pvcs.txt

##For installation or upgrade problems, get the job information:
oc get jobs > jobs.txt

##Collect the description and log of any pod you are having issues with.
##Replace <pod-name> with the name of the pod you need to collect data for:

##oc describe pod <pod-name> > describe-<pod-name>.txt
##oc logs <pod-name> > log-<pod-name>.log
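
When several pods are affected, the describe and log commands can be run in a loop. The sketch below is one possible approach, not an official tool: the function names are hypothetical, and the filter assumes the default column layout of oc get pods (NAME, READY, STATUS as the first three columns).

```shell
# Sketch only: function names are hypothetical; column positions assume the default
# `oc get pods --no-headers` layout (NAME READY STATUS RESTARTS AGE).

# Read `oc get pods --no-headers` output on stdin and print the names of pods that
# are neither Completed nor fully ready and Running.
unhealthy_pods() {
    awk '{
        split($2, ready, "/")
        if ($3 == "Completed") next
        if ($3 != "Running" || ready[1] != ready[2]) print $1
    }'
}

# Read pod names on stdin and save the description and logs of each pod,
# using the same file names as the commands above.
collect_pod_data() {
    while read -r pod; do
        oc describe pod "$pod" > "describe-$pod.txt" < /dev/null
        oc logs "$pod" --all-containers > "log-$pod.log" < /dev/null
    done
}

# Example usage (commented out; requires an active oc login):
#   oc get pods --no-headers | unhealthy_pods | collect_pod_data
```

The filter is deliberately broad; review pods.txt as well, because a pod that shows as Running and ready can still be the one with the problem.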

3: Collect Operator logs 

If you are having issues during the deployment by the operator (often during install or upgrade), then collect the operator logs:
  • Get the pod logs with this command:
    oc logs $operator_pod_name > $operator_pod_name.log
  • If the issue is with CP4BA's cp4a, foundation, or content operators, get the logs from the recently completed reconciles:
    oc cp $operator_pod_name:/tmp/ansible-operator/runner/ ./operator_logs/
Where $operator_pod_name is the name of the operator pod you are concerned with. For more details, see the installation troubleshooting page.
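
If you are not sure which operator pod is involved, you can collect the logs from all of them. The following sketch assumes that operator pod names contain the word operator, which is a naming assumption and not something this document guarantees. The function names are hypothetical, and the reconcile logs are copied into one subdirectory per pod rather than directly into ./operator_logs/.

```shell
# Sketch only: function names are hypothetical, and matching on "operator" in the
# pod name is an assumption about naming in your namespace.

# Read `oc get pods -o name` output on stdin; print bare names of likely operator pods.
operator_pods() {
    sed 's|^pod/||' | grep -- 'operator' || true
}

# Read operator pod names on stdin; save each pod log and try to copy its reconcile logs.
collect_operator_logs() {
    while read -r operator_pod_name; do
        oc logs "$operator_pod_name" > "$operator_pod_name.log" < /dev/null
        # The runner path is listed above for the cp4a, foundation, and content operators;
        # other pods may not have it, so a failed copy is tolerated.
        oc cp "$operator_pod_name:/tmp/ansible-operator/runner/" "./operator_logs/$operator_pod_name/" < /dev/null || true
    done
}

# Example usage (commented out; requires an active oc login):
#   oc get pods -o name | operator_pods | collect_operator_logs
```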
 
4: Collect scripting diagnostics (install/upgrade issues)
If you are having install or upgrade issues and you used the product scripts, provide the script files from the client machine where the cert-kubernetes scripts were run.
Provide a zip/tar of the files located at cert-kubernetes/scripts/.
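
One way to create that archive is sketched below. The function name and the default archive name are illustrative assumptions; the first argument is the path to your cert-kubernetes directory.

```shell
# Sketch only: package_scripts and the default archive name are illustrative assumptions.

# Archive the scripts directory found under a cert-kubernetes checkout.
#   $1: path to the cert-kubernetes directory (default: ./cert-kubernetes)
#   $2: output archive name (default: cert-kubernetes-scripts.tar.gz)
package_scripts() {
    root="${1:-cert-kubernetes}"
    out="${2:-cert-kubernetes-scripts.tar.gz}"
    [ -d "$root/scripts" ] || { echo "No scripts directory under $root" >&2; return 1; }
    tar -czf "$out" -C "$root" scripts
}

# Example usage (commented out; run from the parent of cert-kubernetes):
#   package_scripts cert-kubernetes
```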

5: Collect Browser data for UI issues

 
For console or web application usage issues, capture the following browser data:

6: Collect data when reporting a security vulnerability

If you are reporting a possible security vulnerability or asking about an existing CVE, see IBM Cloud Pak for Business Automation Security Vulnerability Policy.
This document includes details on our general policies and what to provide when reporting an issue.

For CVE issues reported by a scanning tool, we can address your problem most quickly if you provide the following data for each item reported by vulnerability scanning: the image with digest, the file and file path with the vulnerability, the version detected, the CVE number, and the tool used.
Note: If you are concerned about CVEs, use and scan only images from the latest interim fix (iFix). Before you report a finding, confirm that the flagged file actually belongs to the software named in the reported CVE; findings where it does not are not expected to be reported.

Component-Specific Diagnostics
For issues related to a particular Cloud Pak container component, we recommend reviewing the MustGather or troubleshooting page for that component.

What to do next

  1. Review the diagnostics collected at the time of the problem to try to determine the source of the problem.
     
  2. Check these locations for known issues:
  3. After you gather all the needed information and diagnostics, add them to your case. Alternatively, you can upload files to ECuRep. For more information, see Enhanced Customer Data Repository (ECuRep) - Overview.

Document Location

Worldwide

Document Information

Modified date:
27 June 2024

UID

ibm16120897