IBM Support

Business Automation Insights is not updating data: most common problems and techniques for troubleshooting event emission

Troubleshooting


Problem

We have implemented Business Automation Insights (BAI) and it is not updating. What can I do to troubleshoot this issue?

Symptom

You do not see data in your Business Performance Center (BPC) dashboard or Kibana dashboard, even though you confirmed that new data was created from a BPM event emitter source.

Cause

Many possible causes can create this situation. The following diagram shows the overall system architecture and helps narrow down where to look.
[Diagram: overall BAI system architecture]

Environment

The techniques described here are as general as possible, although some of them are easier to apply on particular BAI releases.

Diagnosing The Problem

Any of the integrated components can report its own error messages, so it is best to take a systematic approach and check each location where events can get stuck. Assume that you created a BPMN event in BAW, where tracking groups are already set up, but the event does not appear in the BPC or Kibana dashboard. The event must be sitting somewhere. As the diagram shows, event generation starts at the event emitters; in this example, BAW emits events to the Kafka topic. The Kafka topic runs inside BAI and receives the events. The Flink BPM job processes and transforms the events and puts them into Elasticsearch (ES). Kibana or BPC then reads the data from ES, and an end user creates and views dashboards based on the available data.
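The systematic approach described above amounts to walking the stages in order and finding the first one that has not seen any new events. The following is a minimal Python sketch of that idea. The stage names mirror the description above, and the counts are hypothetical values that you would collect by hand at each stage; they are not read from any BAI API.

```python
# Stages in the order an event travels, as described in the text above.
PIPELINE = [
    "BAW event emitter (JMS queue)",
    "Kafka topic",
    "Flink BPM job",
    "Elasticsearch",
    "Kibana / BPC dashboard",
]


def first_stalled_stage(events_seen):
    """Return the first stage that saw no new events, or None if all did.

    events_seen maps a stage name to the number of new events observed
    at that stage (hypothetical, manually collected values).
    """
    for stage in PIPELINE:
        if events_seen.get(stage, 0) == 0:
            return stage
    return None


# Hypothetical observation: events reached Kafka but nothing came out of Flink.
observed = {
    "BAW event emitter (JMS queue)": 120,
    "Kafka topic": 120,
    "Flink BPM job": 0,
    "Elasticsearch": 0,
    "Kibana / BPC dashboard": 0,
}
print(first_stalled_stage(observed))
```

With these sample numbers, the investigation would continue at the Flink BPM job rather than at the emitter or the dashboards.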

Resolving The Problem

For this exercise, we start by troubleshooting the event emitters, using the BPMN event emitter from BAW as an example. You can find out more about specific emitters at the following URL:
https://www.ibm.com/docs/en/cloud-paks/cp-biz-automation/21.0.x?topic=server-configuring-event-emitters

The first place to look is the queue point on the BAW machine where the event was emitted, because the cause might be with the JMS queue. In the WebSphere Application Server administrative console, expand the Service integration section and select the Service Integration Bus Browser link. Expand the nested sections and click Queue Points. In this example, I purposely stopped my BAI server, and events can be seen accumulating in monitorDestomation.bus.
[Screenshot: Queue Points view in the Service Integration Bus Browser, with events accumulating]

If you find yourself in this situation, immediately disable the emission of events, because the destination will quickly reach its high limit for the number of messages. Reaching the high limit on the queue would cause BAW to stop without the ability to restart the server. For more details on this situation, see the technote "How to start Business Automation Workflow (BAW) after getting the message CWSIP0291W".
If monitorDestomation.bus shows a queue depth of 0, then Kafka received the events and you can move on to the next step, troubleshooting Flink.
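The decision described above (empty queue, accumulating queue, or queue at its high limit) can be written down as a small helper. This is only an illustrative Python sketch: the depth and the high message threshold are values that you read manually from the console, the value 50000 in the example is a placeholder rather than a confirmed setting of your destination, and the `warn_ratio` cut-off is an arbitrary assumption, not an IBM default.

```python
def assess_queue_point(depth, high_message_threshold, warn_ratio=0.8):
    """Classify a queue point depth read from the Service Integration Bus Browser.

    depth and high_message_threshold are read manually from the console.
    warn_ratio is an arbitrary illustrative cut-off (assumption).
    """
    if depth < 0 or high_message_threshold <= 0:
        raise ValueError("depth must be >= 0 and threshold must be > 0")
    if depth == 0:
        return "empty: events were delivered, continue with Flink"
    if depth >= high_message_threshold:
        return "full: high limit reached, see CWSIP0291W"
    if depth >= warn_ratio * high_message_threshold:
        return "critical: disable event emission now"
    return "accumulating: events are not being consumed"


# Hypothetical readings against a placeholder threshold of 50000 messages.
for depth in (0, 1200, 45000, 50000):
    print(depth, "->", assess_queue_point(depth, 50000))
```

Any nonzero depth that keeps growing is a reason to disable event emission; the sketch only makes the urgency explicit.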

After the emission of events is disabled, check the DEF configuration, the status of the messaging engine, and the configuration of the Java™ Message Service (JMS) resources.

At run time, BPMN processes emit events in native JSON format by using the Dynamic Event Framework (DEF) mechanism.
1.    Events are sent to a dedicated JMS queue.
2.    These events are consumed by the BPM event emitter.
3.    The BPM event emitter formats them into raw events and sends them to Apache Kafka (or IBM Event Streams).
Check the BAIConfigure.properties file to validate that BAW and BAI are configured to connect correctly: https://www.ibm.com/docs/en/cloud-paks/cp-biz-automation/23.0.1?topic=workflow-configuring-bpm-event-emitter
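A first pass over the properties file can be automated by checking that the settings you expect are present and not empty. The following Python sketch handles only simple key=value lines (no line continuations or ":" separators). The key names in it are hypothetical placeholders, not the real BAIConfigure.properties keys; replace them with the property names listed in the linked documentation.

```python
def parse_properties(text):
    """Parse simple key=value lines, skipping blank lines and # or ! comments."""
    props = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line[0] in "#!":
            continue
        key, separator, value = line.partition("=")
        if separator:
            props[key.strip()] = value.strip()
    return props


def missing_or_empty(props, required_keys):
    """Return the required keys that are absent or have an empty value."""
    return [key for key in required_keys if not props.get(key)]


# Hypothetical placeholder keys; substitute the documented property names.
REQUIRED_KEYS = ["example.bootstrap.servers", "example.topic", "example.user"]

sample_text = """
# Hypothetical excerpt, for illustration only
example.bootstrap.servers = kafka.example.com:9092
example.topic =
"""

problems = missing_or_empty(parse_properties(sample_text), REQUIRED_KEYS)
print(problems)
```

To check a real file, read it with `open(path, encoding="utf-8").read()` and pass the text to `parse_properties`. A clean result only shows that the values exist, not that they are correct.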
If you believe that BAIConfigure.properties is correct, continue troubleshooting the BPM event emitter by using the following link: https://www.ibm.com/docs/en/cloud-paks/cp-biz-automation/23.0.1?topic=troubleshooting-bpm-event-emitter
Troubleshooting Flink
To identify problems that affect the Flink jobs, inspect the logs of the job manager and task manager containers and look at the Flink web interface. With BAI4S, the Flink web interface should start by default after you run the ./bai-start --acceptLicense command, and it is available by default at https://machinename.xyz.yourcompany.com:8081

Depending on your configuration, make sure that the correct number of jobs is running. The following screenshot shows some jobs running:
[Screenshot: Flink web interface showing running jobs]
In my configuration, I know that at least six jobs should be running in this environment. Because I am running BAI4S, I can restart BAI (bai-stop followed by bai-start) to see whether more jobs get into a running state. After the restart, I recheck the Flink web interface:
[Screenshot: Flink web interface after the restart]
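The same check can be scripted against the Flink REST API, which is served on the same port as the web interface and exposes a job overview at /jobs/overview. The Python sketch below counts the jobs in the RUNNING state. It assumes that the endpoint is reachable from your machine with a trusted certificate and without additional authentication, which might not hold in your environment. The job names in the sample payload are hypothetical.

```python
import json
import urllib.request


def fetch_overview(base_url):
    """Fetch the job overview from the Flink REST API (GET /jobs/overview)."""
    url = base_url.rstrip("/") + "/jobs/overview"
    with urllib.request.urlopen(url, timeout=10) as response:
        return json.load(response)


def running_jobs(overview):
    """Return the sorted names of jobs whose state is RUNNING."""
    return sorted(
        job["name"]
        for job in overview.get("jobs", [])
        if job.get("state") == "RUNNING"
    )


def enough_jobs_running(overview, expected_minimum):
    """True if at least expected_minimum jobs are in the RUNNING state."""
    return len(running_jobs(overview)) >= expected_minimum


# Hypothetical payload shaped like a /jobs/overview response.
sample = {
    "jobs": [
        {"jid": "a1", "name": "example-job-1", "state": "RUNNING"},
        {"jid": "b2", "name": "example-job-2", "state": "RESTARTING"},
        {"jid": "c3", "name": "example-job-3", "state": "RUNNING"},
    ]
}
print(running_jobs(sample))
print(enough_jobs_running(sample, expected_minimum=6))

# Against a live system (not executed here):
# overview = fetch_overview("https://machinename.xyz.yourcompany.com:8081")
```

The expected minimum of six matches the environment described above; use the number that applies to your own configuration.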
If you do not see the correct number of Flink jobs running, your last savepoint might be corrupted. On a production server, consult IBM Support. On a test environment, you can try restarting from a checkpoint or savepoint: https://www.ibm.com/docs/en/cloud-paks/cp-biz-automation/23.0.1?topic=tolerance-restarting-from-checkpoint-savepoint
If this does not resolve the issue, continue troubleshooting Flink. Flink troubleshooting for BAI4S is described at: https://www.ibm.com/docs/en/cloud-paks/cp-biz-automation/19.0.x?topic=troubleshooting-after-bai-start-completes#concept_vlc_kmv_zjb__flink
If you are running BAI on CP4BA, the job manager log might report errors similar to the following:
org.apache.flink.runtime.jobmanager.scheduler.NoResourceAvailableException: 
Could not allocate all requires slots within timeout of 300000 ms. 
Slots required: 8, slots allocated: 0
In that case, review the IBM documentation on troubleshooting Apache Flink jobs in CP4BA.
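As a quick first check when scanning a job manager log, the slot counts in an error like the one quoted above can be extracted to see how many task slots are missing. This Python sketch relies only on the message format shown in the excerpt; other Flink versions might word the message differently, so treat the pattern as an assumption.

```python
import re

# Pattern based on the log excerpt quoted in this document.
SLOTS_PATTERN = re.compile(r"Slots required:\s*(\d+),\s*slots allocated:\s*(\d+)")


def slot_shortfall(log_text):
    """Return required minus allocated slots, or None if no match is found."""
    match = SLOTS_PATTERN.search(log_text)
    if match is None:
        return None
    required, allocated = int(match.group(1)), int(match.group(2))
    return required - allocated


sample_log = (
    "org.apache.flink.runtime.jobmanager.scheduler.NoResourceAvailableException: \n"
    "Could not allocate all requires slots within timeout of 300000 ms. \n"
    "Slots required: 8, slots allocated: 0\n"
)
print(slot_shortfall(sample_log))
```

For the excerpt above the shortfall is 8, meaning that none of the required slots could be allocated, which points at the task managers rather than at a single job.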

 

Document Location

Worldwide


Document Information

Modified date:
21 July 2023

UID

ibm17010877