APAR status
Closed as program error.
Error description
We can confirm that there is a memory leak in the portal-tunnel pods. We believe this is the cause of the toggling from passive to progressing to passive, which happens each time the oomkiller kills the process with the memory leak.

There are 2 instances of the operator changing haMode to "progressing to passive". Both look like:

{"level":"info","ts":1649776092.7196248,"logger":"controllers.PortalCluster","msg":"Current portal pods in multi DC mode","porta p1-sit","spec.ha.mode":"passive","status.hamode":"passive","dbSts":"passive","wwwSts":"passive","db":"all(Ready/Total): 3/3 passive(Ready/Total): 3/3","www":"all(Ready/Total): 2/3 passive(Ready/Total): 2/3","nginx":"all(Ready/Total): 3/3 passive(Ready/Total): 3/3"}
{"level":"info","ts":1649776092.7196639,"logger":"controllers.PortalCluster","msg":"UpdatingHAMode","portalcluster":"apic-sit/p ressingto passive (ready for traffic)"}

Each time it is due to one of the www pods failing a ready check (www reports 2/3 ready in the log above), so it would help if the customer can provide the www logs.

There is no error from the -tunnel pods before they restart, and I am pretty sure they are getting OOM killed because the memory limit is too small. It is set to 256MB in the profile but it should be 512MB. Please integrate the following section into the portal CR in both DCs and see if it stops the RESTARTS number growing for the -tunnel pods and stops the state toggling:

spec:
  template:
  - containers:
    - name: server
      resources:
        limits:
          memory: 512Mi
    name: tunnel

Note that you will already have a spec section and may already have a template section, so please integrate the above YAML into both portal CRs rather than adding a second spec or template key.

This change will be integrated into a future release of API Connect.
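The integration step described above (adding the tunnel override to a spec that may already contain a template section) can be sketched in Python on plain dictionaries. This is only an illustration of the merge logic, not an IBM tool: the function name `add_tunnel_memory_limit`, the `profile` value, and the existing `www`/`admin` template entry are hypothetical placeholders, and loading or saving the CR itself is left out.

```python
import copy


def add_tunnel_memory_limit(spec, limit="512Mi"):
    """Return a copy of a portal CR spec with the tunnel memory override merged in.

    Existing template entries are preserved; the input dict is not modified.
    """
    merged = copy.deepcopy(spec)

    # Reuse the existing template list if there is one, otherwise create it.
    templates = merged.setdefault("template", [])

    # Find the template entry named "tunnel", or append a new one.
    tunnel = next((t for t in templates if t.get("name") == "tunnel"), None)
    if tunnel is None:
        tunnel = {"name": "tunnel", "containers": []}
        templates.append(tunnel)

    # Find the "server" container inside that entry, or append a new one.
    containers = tunnel.setdefault("containers", [])
    server = next((c for c in containers if c.get("name") == "server"), None)
    if server is None:
        server = {"name": "server"}
        containers.append(server)

    # Set only the memory limit, leaving any other resource settings alone.
    server.setdefault("resources", {}).setdefault("limits", {})["memory"] = limit
    return merged


# Hypothetical existing spec that already has a template section.
existing_spec = {
    "profile": "example-profile",  # placeholder, not a real profile name
    "template": [
        {"name": "www", "containers": [{"name": "admin"}]},  # placeholder entry
    ],
}

merged_spec = add_tunnel_memory_limit(existing_spec)
```

The merge is written to be repeatable: running it against a spec that already carries the tunnel override only rewrites the memory value and does not add a duplicate entry.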
Local fix
Problem summary
The portal-tunnel pods had a memory leak that caused the oomkiller to kill the ws-tunnel process, resulting in a brief disconnect between the DCs.
Problem conclusion
The memory leak is now fixed.
Temporary fix
Comments
APAR Information
APAR number
LI82618
Reported component name
API CONNECT ENT
Reported component ID
5725Z2201
Reported release
A0X
Status
CLOSED PER
PE
NoPE
HIPER
NoHIPER
Special Attention
NoSpecatt / Xsystem
Submitted date
2022-05-12
Closed date
2022-07-12
Last modified date
2022-09-08
APAR is sysrouted FROM one or more of the following:
APAR is sysrouted TO one or more of the following:
Fix information
Fixed component name
API CONNECT ENT
Fixed component ID
5725Z2201
Applicable component levels
Document Information
Modified date:
08 September 2022