Troubleshooting
Problem
After super grouping is disabled, the deployment-ibm-hdm-analytics-dev-collater-aggregationservice pod keeps restarting because of an out-of-memory (OOM) error.
Symptom
The logs suggest that the collater builds a list of all events and groups, which causes the OOM. Just before the OOM, the collater prints its cache, which contains 869417 elements based on a count of the eid string. The cache might contain more elements, because the log rolled over while the cache was being printed. The last few lines of the log are:
2023-05-08 06:03:46 DEBUG Graph:260 - Graph--> Adding the mapping between groups and events for dismantling the supergroup.
2023-05-08 06:03:50 DEBUG PollerTask:71 - normalizerConnectionEnabled is enabled, getting the super grouping enablement and rank list from normalizer...
2023-05-08 06:03:50 INFO NormalizerAgg:52 - Connecting to normalizer with url: http://noihybrid-ibm-hdm-analytics-dev-normalizer-aggregationservice:5600/api/aggregation/v1/configuration
JVMDUMP039I Processing dump event "systhrow", detail "java/lang/OutOfMemoryError" at 2023/05/08 06:03:52 - please wait.
JVMDUMP032I JVM requested System dump using '/app/core.20230508.060352.1.0001.dmp' in response to an event
JVMDUMP030W Cannot write dump to file /app/core.20230508.060352.1.0001.dmp: Permission denied
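The element count quoted in the symptom came from counting the eid string in the collater log. A minimal Python sketch of that kind of count follows. It assumes that every cache element printed to the log contains the literal token eid exactly once; the function names and the default token are hypothetical and are not part of the product. The OOM marker string is taken from the JVMDUMP039I line above.

```python
from typing import Iterable, Tuple

# Marker copied from the JVMDUMP039I message in the log excerpt above.
OOM_MARKER = "java/lang/OutOfMemoryError"


def count_cache_entries(lines: Iterable[str], token: str = "eid") -> Tuple[int, bool]:
    """Count occurrences of `token` and report whether the OOM marker appears.

    Assumption: each printed cache element contains `token` once. A plain
    substring count also matches `token` inside longer words, so the result
    is an estimate, not an exact element count.
    """
    total = 0
    saw_oom = False
    for line in lines:
        total += line.count(token)
        if OOM_MARKER in line:
            saw_oom = True
    return total, saw_oom


def count_cache_entries_in_file(path: str, token: str = "eid") -> Tuple[int, bool]:
    """Stream a log file line by line so that a large log is not loaded into memory."""
    with open(path, "r", errors="replace") as handle:
        return count_cache_entries(handle, token)
```

For example, calling count_cache_entries_in_file on a saved copy of the collater log returns the estimated element count and a flag that shows whether the OutOfMemoryError dump message is present. If the log rolled over while the cache was being printed, the count is a lower bound.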
Diagnosing The Problem
Resolving The Problem
This issue is resolved in Netcool Operations Insight (NOI) version 1.6.9.
Document Location
Worldwide
Document Information
Modified date:
04 September 2023
UID
ibm17028498