IBM Support

Mustgather: Collecting data to diagnose issues with IBM Automation Foundation

Troubleshooting


Problem

This document describes the general information and diagnostic data needed to start troubleshooting issues with IBM Automation Foundation. If you open a case related to IBM Automation Foundation, include the diagnostics collected by following this document.

Resolving The Problem

When support is needed for an IBM Automation Foundation issue, collect the following troubleshooting data.

General Diagnostic Information

These items are the general diagnostics, which are useful in most situations regardless of component. 

1: Provide a detailed description of the problem and your environment

  • Provide a detailed description of your issue. Include screen captures and steps to re-create the issue if possible.
    Is the issue intermittent or reproducible? Has the problem always occurred, or did it start only after a change? What is the business impact? Are any deadlines affected by the issue?
  • Provide details on which component has issues.
  • Provide a reference to the documentation being followed for the failing operation.
  • Is this environment development, test, or production?
  • Which platform setup are you using (OpenShift, OpenShift on IBM Cloud Public, other Kubernetes platform)?
     

2: Gather configuration information and logs

Option 1:
Gather the output of the IBM Automation Foundation oc adm must-gather command:
oc adm must-gather --image=icr.io/cpopen/cpfs/must-gather:latest -- gather -m automationfoundation -n <cloud pak namespace>
Where <cloud pak namespace> is the project in which IAF is installed.
Provide the compressed output stored to: ./must-gather.local.<number>/icr-io-cpopen-cpfs-must-gather-<hash>/
This command gathers detailed information about every object in the IAF namespace, along with other information about the IAF setup and the environment.
Note: Because this is an oc adm command, it requires a cluster admin user. If no one with those permissions can run it, use Option 2. The -n parameter is required and must specify a single namespace. In an air-gapped setup, make sure that the latest version of the must-gather image is pushed to your local repository.
Generally, this collection takes 5-10 minutes and produces a 25-50 MB gzip file.
If you are having issues with foundational services in the ibm-common-services namespace, also run the collection for that namespace:
oc adm must-gather --image=icr.io/cpopen/cpfs/must-gather:latest -- gather -m automationfoundation -n ibm-common-services
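If the output directory is not already compressed, it can be packaged before it is attached to the case. The following is a minimal sketch, assuming a POSIX shell with tar available. The function name package_must_gather and the archive naming are illustrative assumptions and are not part of the IBM tooling.

```shell
# Hypothetical helper (not part of oc or the must-gather image):
# compress every must-gather.local.* directory found under the given
# directory (default: current directory) into a .tar.gz archive.
package_must_gather() {
  base_dir="${1:-.}"
  found=0
  for dir in "$base_dir"/must-gather.local.*; do
    # Skip an unexpanded glob and any archives left by earlier runs.
    [ -d "$dir" ] || continue
    name=$(basename "$dir")
    tar -czf "$base_dir/$name.tar.gz" -C "$base_dir" "$name" || return 1
    echo "Created $base_dir/$name.tar.gz"
    found=1
  done
  if [ "$found" -eq 0 ]; then
    echo "No must-gather.local.* directory found in $base_dir" >&2
    return 1
  fi
}

# Usage, after the oc adm must-gather command completes:
#   package_must_gather .
```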
Option 2:
Manually gather a smaller subset of general information about the namespace objects and pods. This option gathers much less than the must-gather command, so depending on the output, we might request more information. Run the diagnostic commands from an empty collection directory so that the files are easy to package. Run them from the project or namespace that contains the problematic IBM Automation Foundation (IAF) deployment, or include the -n <namespace> option.
  • If you are using OpenShift, provide the output of this command:
    oc version > version.txt
  • Collect Custom Resource (CR) information
    oc get AutomationUIConfig -o yaml > AutomationUIConfig.yaml
    oc get Cartridge -o yaml > Cartridge.yaml
    oc get ZenService -o yaml > ZenService.yaml
    
    When using AutomationBase:
    oc get AutomationBase -o yaml > AutomationBase.yaml
    oc get CartridgeRequirements -o yaml > CartridgeRequirements.yaml
    oc get EventProcessor -o yaml > EventProcessor.yaml
    oc get Kafka -o yaml > Kafka.yaml
    oc get KafkaClaim -o yaml > KafkaClaim.yaml
    
    When using Insights Engine:
    oc get InsightsEngine -o yaml > InsightsEngine.yaml
  • Collect information about the pod statuses:
    oc get pods > pods.txt
  • Collect information about the pod containers:
    oc get pods -o jsonpath="{..image}" > containerInfo.txt
  • On OpenShift, gather route configuration:
    oc get route > routes.txt
    Note: If needed, more detailed route configuration can be retrieved with the -o yaml option.
  • Collect the defined secrets:
    oc get secrets > secrets.txt
  • Collect the defined configmaps:
    oc get configmaps > configmaps.txt
  • Collect the defined persistent volume claims:
    oc get pvc > pvcs.txt
  • Collect the description and log of any pod you are having issues with:
    oc describe pod <pod-name> > describe-<pod-name>.txt
    oc logs <pod-name> > log-<pod-name>.txt
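The custom resource commands above can be wrapped in a loop so that a resource kind that is not installed does not stop the rest of the collection. This is a sketch only, assuming oc is already logged in to the cluster. The function name collect_iaf_crs and the .err file naming are illustrative assumptions; the list of kinds is copied from the commands above.

```shell
# Hypothetical helper: write one YAML file per custom resource kind.
# Pass a namespace as the first argument, or omit it to use the current project.
collect_iaf_crs() {
  ns="${1:-}"
  for kind in AutomationUIConfig Cartridge ZenService \
              AutomationBase CartridgeRequirements EventProcessor \
              Kafka KafkaClaim InsightsEngine; do
    # oc fails for kinds that are not installed; keep the error text
    # in <kind>.err and continue with the next kind.
    if oc get "$kind" ${ns:+-n "$ns"} -o yaml > "$kind.yaml" 2> "$kind.err"; then
      rm -f "$kind.err"
    else
      rm -f "$kind.yaml"
      echo "Skipped $kind (see $kind.err)" >&2
    fi
  done
}

# Usage, from an empty collection directory:
#   collect_iaf_crs <namespace>
```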
If the issue is related to installation, also gather this information:
  • On OpenShift, collect information about the jobs:
    oc get jobs > jobs.txt
  • Collect the logs from the IAF operator pods:
    oc get pods | grep "iaf-.*-controller-manager"
    For each operator pod:
    oc logs <pod-name> > log-<pod-name>.txt
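The per-pod step for the operator logs can be scripted so that no operator pod is missed. The following is a sketch under the same assumptions as the manual commands (oc is logged in and the pod names match the pattern shown above). The function name collect_iaf_operator_logs is an illustrative assumption.

```shell
# Hypothetical helper: save the log of every IAF operator pod.
# Pass a namespace as the first argument, or omit it to use the current project.
collect_iaf_operator_logs() {
  ns="${1:-}"
  # -o name prints one entry per pod in the form pod/<pod-name>.
  oc get pods ${ns:+-n "$ns"} -o name |
    grep -E 'iaf-.*-controller-manager' |
    while read -r pod; do
      pod_name="${pod#pod/}"   # strip the "pod/" prefix for the file name
      oc logs "$pod" ${ns:+-n "$ns"} > "log-$pod_name.txt"
    done
}

# Usage, from the collection directory:
#   collect_iaf_operator_logs <namespace>
```

If an operator pod runs more than one container, oc logs might need the -c <container> option for that pod.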

3: Collect browser data for UI issues

 
For console or web application usage issues, capture the following browser data:

Component-Specific Mustgathers
For issues related to a particular IBM Automation Foundation component, we recommend reviewing the mustgather for that component.

What to do next

  1. Review the diagnostics collected at the time of the problem to try to determine its source.
     
  2. Check these locations for known issues:
  3. After you gather all the needed information and diagnostics, add them to your case. Alternatively, you can upload files to ECURep. For more information, see Enhanced Customer Data Repository (ECURep) - Overview.

Document Location

Worldwide

Document Information

Modified date:
06 February 2024

UID

ibm16427035