IBM Support

Mustgather: Collecting data to diagnose issues with IBM Business Automation Studio

Product Documentation


Abstract

This document describes the general information and diagnostic data needed to start troubleshooting issues related to IBM Business Automation Studio containers, which are included in IBM Cloud Pak for Business Automation. Include the diagnostics that you collect by following this document when you open a case for Business Automation Studio in Cloud Pak for Business Automation.

Content

Overview of Business Automation Studio diagnostic information


General diagnostic information


As needed diagnostic information


Detailed diagnostic collection steps

Use these detailed steps to gather different types of data for BA Studio. When you run the diagnostic commands, run them from an empty collection directory so that the files are easy to package. Run the commands from the project or namespace that contains BA Studio, or add the -n <namespace> flag to every oc command.
Note: oc commands are interchangeable with kubectl.
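
The setup described above can be sketched with two small shell helpers. This is only an illustration: the function names new_collection_dir and ocn, the BAS_NAMESPACE variable, and the directory naming scheme are assumptions, not part of the product.

```shell
# Hypothetical helpers; the names and the BAS_NAMESPACE variable are illustrative.

# Create an empty, timestamped collection directory and print its path.
new_collection_dir() {
  local dir="bas-mustgather-$(date +%Y%m%d-%H%M%S)"
  mkdir -p "$dir" && printf '%s\n' "$dir"
}

# Run any oc command against the namespace that contains BA Studio.
ocn() {
  oc -n "${BAS_NAMESPACE:?set BAS_NAMESPACE to the BA Studio namespace}" "$@"
}

# Example usage (commented out so that nothing runs when the file is sourced):
#   export BAS_NAMESPACE=cp4baNS
#   cd "$(new_collection_dir)"
#   ocn get pods
```

With a wrapper such as ocn, the namespace is passed on every call, so the commands do not depend on the currently selected project.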

1: Provide a detailed description of the problem and your environment

  • Provide a detailed description of your issue. Include screen captures and re-create steps if possible.
    Is it an intermittent or recreatable issue? Has this problem always been an issue or only started after a change occurred?
    What is the business impact? Do we need to be aware of any deadlines impacted by the issue?
  • Provide a reference to the documentation being followed for the failing operation
  • Which platform are you using (OpenShift, managed OpenShift, other Kubernetes platform)?
  • For Business Automation Studio, what is the database type and version?


2: Gather the configuration information

Gather the general configuration data
oc get icp4acluster -oyaml > CP4BAconfig.yaml
oc get content -oyaml > ContentCR.yaml

oc adm must-gather --image=icr.io/cpopen/cpfs/must-gather:latest -- gather -m automationfoundation -n <cloud pak namespace>
The -n parameter is required and must be a single namespace. In an air-gapped setup, make sure that you have pushed the latest version of the must-gather image to your local repository. The command requires cluster admin access. A collection typically takes 5 - 10 minutes and produces a 25 - 50 MB gzip file, although times can vary greatly depending on the environment. If you have trouble with this command, see item 2 of the main Cloud Pak MustGather for more details.
In CP4BA 23.0.1 and later, there is an improved mustgather image that can gather more targeted information. This can allow for some alternative collections to get data specific to certain issues. For more details, see Gathering deployment information and logs from Cloud Pak for Business Automation.
Here are some options that include BA Studio config data and logs where cp4baNS is the Cloud Pak namespace and 23.0.2 is the appropriate version tag.
Workflow Authoring:
oc adm must-gather --image=icr.io/cpopen/cp4ba/icp4a-must-gather:23.0.2 -- gather -m cp4ba -p workflow_authoring -n cp4baNS
Workflow Process Service Authoring:
oc adm must-gather --image=icr.io/cpopen/cp4ba/icp4a-must-gather:23.0.2 -- gather -m cp4ba -p wfps_authoring -n cp4baNS
Business Applications Designer:
oc adm must-gather --image=icr.io/cpopen/cp4ba/icp4a-must-gather:23.0.2 -- gather -m cp4ba -p application -n cp4baNS
Automation Document Processing Authoring:
oc adm must-gather --image=icr.io/cpopen/cp4ba/icp4a-must-gather:23.0.2 -- gather -m cp4ba -p document_processing -n cp4baNS
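
Because the four commands differ only in the pattern name, they can be generated from one helper. The following is a minimal sketch; the function name bas_must_gather_cmd and its argument order are assumptions, while the image, options, and pattern names are taken from the list above. It only prints the command so that you can review it before running it.

```shell
# Print the must-gather command for one BA Studio authoring pattern.
# The function name and argument order are illustrative.
bas_must_gather_cmd() {
  local pattern="$1" namespace="$2" tag="$3"
  case "$pattern" in
    workflow_authoring|wfps_authoring|application|document_processing) ;;
    *) echo "unknown pattern: $pattern" >&2; return 1 ;;
  esac
  printf 'oc adm must-gather --image=icr.io/cpopen/cp4ba/icp4a-must-gather:%s -- gather -m cp4ba -p %s -n %s\n' \
    "$tag" "$pattern" "$namespace"
}

# Example usage:
#   bas_must_gather_cmd workflow_authoring cp4baNS 23.0.2
```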

3: Log and Tracing data for WebSphere Liberty 
For Business Automation Studio usage issues, follow these steps to enable IBM WebSphere Liberty tracing on the container.
  1. Edit the icp4acluster CR that the operator uses to create the bastudio pods.
    Modify the trace_specification property in the bastudio logs section of the YAML and set the following trace string.
    spec:
      ...
      bastudio_configuration:
        ...
        logs:
          trace_specification: '*=info:com.ibm.bpm.rest.*=all:com.ibm.bpmsdk.*=all:com.ibm.bpm.socialbus.*=all'
    Update the CR with the new configuration by using your preferred method. For example, you can use the edit command.
    oc edit icp4acluster

    Note: It can take a long time (up to the length of an operator reconcile cycle) for the change to be recognized and the studio configuration to be updated. You can grep the log file for traceSpecification to see when the trace settings change.

  2. Optional: To have the changes applied immediately, modify the configmap ending with bastudio-overrides-configmap.
    This configmap should contain a trace-specification.xml file. Edit the settings of this file to match what was used in the CR file.

  3. Re-create your issue and gather the BA Studio log files. You can use the following command to gather the logs, where <pod-name> is one of the BA Studio pods.
    oc cp <pod-name>:/logs/application/BAS ./BAS
    Note: The logs can also be gathered directly from the associated persistent volume (PV).
  4. Disable the trace by setting trace_specification back to "*=info" and applying the changes again.
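
As an alternative to editing the CR interactively, the trace string can be set and reset with oc patch. The following is a sketch under assumptions: the function names are illustrative, <cr-name> stands for the name of your icp4acluster CR, and the patch path simply mirrors the YAML shown in step 1.

```shell
# Build a JSON merge patch that sets the BA Studio trace specification.
# Assumes the trace string contains no double quotes or backslashes.
bas_trace_patch() {
  printf '{"spec":{"bastudio_configuration":{"logs":{"trace_specification":"%s"}}}}\n' "$1"
}

# Apply the patch to a named icp4acluster CR (cr_name is a placeholder).
set_bas_trace() {
  local cr_name="$1" trace="$2"
  oc patch icp4acluster "$cr_name" --type=merge -p "$(bas_trace_patch "$trace")"
}

# Example usage:
#   set_bas_trace <cr-name> '*=info:com.ibm.bpm.rest.*=all:com.ibm.bpmsdk.*=all:com.ibm.bpm.socialbus.*=all'
#   ... re-create the issue, then copy the logs ...
#   oc cp <pod-name>:/logs/application/BAS ./BAS
#   set_bas_trace <cr-name> '*=info'
```

The trace strings are single-quoted so that the shell does not expand the * characters.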

4: Export of your application

If the issue is specific to a certain application, provide an export of that application.
For more information, see Exporting Projects.

5: Collect Operator logs

If you are having issues during the deployment by the CP4BA operator, then collect the operator logs:
oc cp $operator_pod_name:/tmp/ansible-operator/runner/ ./operator_logs/
 
Where $operator_pod_name is the name of the operator pod you are concerned with (for example ibm-cp4a-operator). Generally you should provide this for the cp4a and content operators. For more information, see the installation troubleshooting page.
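
When logs are needed from more than one operator, the copy can be repeated per pod. This is a minimal sketch; the function name collect_operator_logs and the per-pod subdirectory layout are assumptions, and the pod names must be looked up first (for example with oc get pods | grep operator).

```shell
# Copy the Ansible runner logs from each named operator pod into
# ./operator_logs/<pod-name>/ . Pod names are passed as arguments.
collect_operator_logs() {
  local pod
  for pod in "$@"; do
    mkdir -p "operator_logs/$pod"
    oc cp "$pod:/tmp/ansible-operator/runner/" "operator_logs/$pod/" \
      || echo "warning: could not copy logs from $pod" >&2
  done
}

# Example usage:
#   collect_operator_logs <cp4a-operator-pod> <content-operator-pod>
```

Keeping one subdirectory per pod avoids files from different operators overwriting each other.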

6: Collect Browser data for UI issues

For console or web application usage issues, capture the following browser data:

7: Gathering verbose:gc, javacores and heapdumps

 
For issues related to performance, hangs, JVM crashes, or memory, we need dumps from the Liberty servers.
 
  1. Determine the names of the BA Studio server pods by using the get pods command.
    oc get pods | grep bastudio-deployment
  2. If dumps need to be generated, you can use the Liberty server dump commands to create them. Use the javadump command to generate javacores for each BA Studio server pod. Include the option --include=heap or --include=system to generate heapdumps or system core dumps. For example, the following command generates a javacore and heapdump for the pod.

    oc exec <podname> -- bash -c "server javadump --include=heap"
    Note: If a BA Studio Liberty server JVM crashes, then dumps will be generated as well.
  3. Provide the dumps by tarring the files in the BA Studio dump persistent volume (bastudio-dump-pvc). You can run this command on any BA Studio pod that shares the dump PVC.

    oc cp <pod-name>:output/dump ./BAS/dump

    Note: You can also get the files by directly accessing the related persistent volume.
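
Steps 1 and 2 can be combined to generate dumps on every BA Studio pod in one pass. The following is a sketch only; the function names bas_pods and dump_all_bas_pods are assumptions, and the pod filter is the same bastudio-deployment name match used in step 1.

```shell
# List BA Studio pod names (one per line), using the same name filter as step 1.
bas_pods() {
  oc get pods -o name | grep bastudio-deployment | sed 's|^pod/||'
}

# Trigger a javacore plus heapdump on every BA Studio pod.
dump_all_bas_pods() {
  local pod
  for pod in $(bas_pods); do
    echo "Generating dumps on $pod"
    oc exec "$pod" -- bash -c "server javadump --include=heap"
  done
}

# Example usage:
#   dump_all_bas_pods
```

Because the pods share the dump PVC, the dump directory needs to be copied only once, from any one of the pods, after the loop finishes.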

Enabling verbose:gc and other JVM dump options.

  1. Update the CR to include the needed JVM options and point the logs at an appropriate location.
    bastudio_configuration:
      jvm_customize_options: -verbose:gc -Xverbosegclog:/logs/application/BAS/verbosegc/verbosegc.%Y%m%d.%H%M%S.%pid.txt,20,10000 -Xdump:stack:events=allocation,filter=#25m
    These options enable verbose:gc, send the log files to the logging PVC under the BAS/verbosegc directory, and enable a stack dump for object allocations larger than 25 MB.
  2. The operator rolls out the changes. To confirm the change or speed up the process, you can view or edit the icp4adeploy-bas-credential-secret. The jvm.options key in this secret contains the settings. If you change the secret, the pods must be restarted to pick up the new settings.
  3. Once enabled, the logs can be gathered from the logging PV as described in item 3 of this MustGather.
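
To confirm that the operator has rolled out the options, the jvm.options key can be read and decoded from the secret named in step 2. This is a minimal sketch; the function names are illustrative, and it assumes a base64 command that accepts -d for decoding.

```shell
# Print the decoded jvm.options value from the BA Studio credential secret.
# The secret and key names are the ones given in step 2.
show_bas_jvm_options() {
  oc get secret icp4adeploy-bas-credential-secret \
    -o jsonpath='{.data.jvm\.options}' | base64 -d
}

# Succeeds if verbose GC is present in the configured options.
bas_verbosegc_enabled() {
  show_bas_jvm_options | grep -q -- '-verbose:gc'
}

# Example usage:
#   bas_verbosegc_enabled && echo "verbose:gc is configured"
```

The backslash in jvm\.options escapes the dot so that jsonpath treats it as part of the key name rather than as a path separator.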

8: Gathering resource registry data

 
If you have an issue with the interaction between BA Studio and resource registry, then gather the following data:
  • Get a dump of the resource registry contents. Run this command from one of the resource registry pods.
    etcdctl --cacert=/shared/resources/tls/ca-cert.pem --user=root:<root password> --insecure-skip-tls-verify get "" --from-key
    The root password can be determined by checking the secret icp4adeploy-rr-admin-secret.
  • Enable this trace string in addition to any other needed tracing when recreating the issue:
    com.ibm.bpm.dbaregistry.*=all:com.ibm.bpm.resourceregistry.*=all:com.ibm.bpm.serviceregistry.*=all:com.ibm.bpm.bas.registry.*=all
    See item 3 for more details on enabling trace.
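
Combining the registry trace components with the trace string from item 3 is a matter of joining them with colons. The following sketch shows one way to build the combined string; the function names are assumptions, and the result is meant to be used as the trace_specification value in the CR.

```shell
# Join trace components with ':' into one Liberty trace specification.
join_trace_spec() {
  local IFS=':'
  printf '%s\n' "$*"
}

# Item 3 usage trace plus the resource registry trace components.
# The components are quoted so that the shell does not expand the * characters.
registry_trace_spec() {
  join_trace_spec \
    '*=info' \
    'com.ibm.bpm.rest.*=all' 'com.ibm.bpmsdk.*=all' 'com.ibm.bpm.socialbus.*=all' \
    'com.ibm.bpm.dbaregistry.*=all' 'com.ibm.bpm.resourceregistry.*=all' \
    'com.ibm.bpm.serviceregistry.*=all' 'com.ibm.bpm.bas.registry.*=all'
}

# Example usage:
#   registry_trace_spec
```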
     
 

What to do next

  1. Review the log files and traces at the time of the problem to try to determine the source of the problem.
     
  2. Check these locations for known issues:
  3. After you have gathered all the needed information and diagnostics, add them to your case. Alternatively, you can upload files to ECuRep. For more information, see Enhanced Customer Data Repository (ECuRep) - Overview.
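
If the data was gathered into a single collection directory, it can be packaged into one archive before it is attached to the case. This is a minimal sketch; the function name package_collection and the archive naming are assumptions.

```shell
# Package a collection directory into <dir>.tar.gz next to it
# and print the archive name.
package_collection() {
  local dir="${1%/}"
  [ -d "$dir" ] || { echo "not a directory: $dir" >&2; return 1; }
  tar -czf "$dir.tar.gz" -C "$(dirname "$dir")" "$(basename "$dir")" \
    && printf '%s\n' "$dir.tar.gz"
}

# Example usage:
#   package_collection ./my-collection-dir
```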

Document Location

Worldwide


Document Information

Modified date:
06 February 2024

UID

ibm11078569