IBM Support

Disabling super groups causes the collater pod to crash

Troubleshooting


Problem

After super grouping is disabled, the deployment-ibm-hdm-analytics-dev-collater-aggregationservice pod restarts repeatedly because it runs out of memory.

Symptom

The logs indicate that the collater builds a list of all events and groups, which causes the out-of-memory (OOM) error. Just before the OOM error, the collater prints its cache, which contains 869417 elements when counted by occurrences of the eid string. The actual cache might be larger, because the log rolled over while the cache was being printed. The last few lines of the log are:

2023-05-08 06:03:46 DEBUG Graph:260 - Graph--> Adding the mapping between groups and events for dismantling the supergroup.
2023-05-08 06:03:50 DEBUG PollerTask:71 - normalizerConnectionEnabled is enabled, getting the super grouping enablement and rank list from normalizer...
2023-05-08 06:03:50 INFO  NormalizerAgg:52 - Connecting to normalizer with url: http://noihybrid-ibm-hdm-analytics-dev-normalizer-aggregationservice:5600/api/aggregation/v1/configuration
JVMDUMP039I Processing dump event "systhrow", detail "java/lang/OutOfMemoryError" at 2023/05/08 06:03:52 - please wait.
JVMDUMP032I JVM requested System dump using '/app/core.20230508.060352.1.0001.dmp' in response to an event
JVMDUMP030W Cannot write dump to file /app/core.20230508.060352.1.0001.dmp: Permission denied
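
The check described above (counting eid strings and locating the OutOfMemoryError in the log) can be sketched in Python as follows. This is only an illustrative sketch, not an IBM-provided tool: the function name summarize_collater_log is hypothetical, and it assumes that each cached element appears in the log with exactly one standalone "eid" token, which the log excerpt does not confirm.

```python
import re
from typing import Iterable

# Assumption: every cached element is printed with one standalone "eid" token.
EID_PATTERN = re.compile(r"\beid\b")
OOM_MARKER = "java/lang/OutOfMemoryError"
DUMP_WRITE_FAILED_ID = "JVMDUMP030W"  # message ID seen in the log excerpt


def summarize_collater_log(lines: Iterable[str]) -> dict:
    """Count eid tokens and locate the first OutOfMemoryError in a log."""
    eid_count = 0
    oom_line = None
    dump_write_failed = False
    for number, line in enumerate(lines, start=1):
        eid_count += len(EID_PATTERN.findall(line))
        if oom_line is None and OOM_MARKER in line:
            oom_line = number  # 1-based line number of the first OOM event
        if line.startswith(DUMP_WRITE_FAILED_ID):
            dump_write_failed = True
    return {
        "eid_count": eid_count,
        "oom_line": oom_line,
        "dump_write_failed": dump_write_failed,
    }
```

To run it against a saved copy of the pod log, pass an open file object, for example summarize_collater_log(open(path, errors="replace")), where path is wherever the log was saved. Because the log can roll over while the cache is being printed, treat the resulting count as a lower bound on the cache size.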

Diagnosing The Problem

Resolving The Problem

This issue is resolved in Netcool Operations Insight (NOI) version 1.6.9.

Document Location

Worldwide


Document Information

Modified date:
04 September 2023

UID

ibm17028498