APAR status
Closed as program error.
Error description
The customer is running an Integration Node in an environment with limited memory, whilst writing an excessive number of archive statistics, resource statistics and activity log files to a slow remote file system.
IIB has a single thread that writes the statistics data to an in-memory queue. If statistics data is written to the file system more slowly than statistics objects are created, these objects build up on the queue and consume memory. In this environment, statistics objects are added to the in-memory queue faster than they are written to the file system, so the Integration Node runs out of memory, hence the abend in ImbAbend::newHandler().
This APAR introduces a new environment variable that lets a customer limit the size (number of statistics context objects) the queue can reach if these statistics objects are to be written out. If the size of the queue exceeds this limit when the objects come to be written, the statistics data is thrown away instead of being written out. This means there will be gaps in the statistics data when the remote file system is slow but, if the limit is configured correctly, the abends should be avoided.
The appropriate value depends on the customer's unique IIB environment. There is one context object per flow per interval, so the number of flows across all of the Integration Servers makes a difference. The size of the context objects varies a lot; for example, the context object for node statistics contains an entry per node. A suggested starting value is 100:
MQSI_STATISTICS_MAX_QUEUE_SIZE=100
Once the environment variable has been set, the product checks the value and, if the size of the in-memory queue exceeds it, the records are discarded for that interval instead of being written out. In that case, a warning is logged to say that the statistics records have been dropped:
BIP9945W "Queue depth is over configured maximum so dropping statistics records"
It is the responsibility of the customer to tune the value to suit their requirements based on their Integration Node environment. If the warning message appears all the time, increase the MQSI_STATISTICS_MAX_QUEUE_SIZE value by a small increment and retry until the warning appears less frequently while the abends are still avoided.
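The drop behaviour described above can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the product's implementation: the function name, the per-interval rates and the choice to clear the whole queue when the limit is exceeded are all hypothetical (the APAR text says only that the records are discarded for that interval).

```python
from collections import deque


def simulate(intervals, created_per_interval, written_per_interval,
             max_queue_size=None):
    """Simulate an in-memory statistics queue with a slow writer.

    Returns a tuple (peak_queue_depth, written, dropped).
    All rates are hypothetical inputs, not measured product values.
    """
    queue = deque()
    peak = written = dropped = 0
    for interval in range(intervals):
        # One context object per flow per interval is queued.
        for flow in range(created_per_interval):
            queue.append((interval, flow))
        peak = max(peak, len(queue))
        # Assumed policy: if the queue is over the limit when the writer
        # runs, discard the queued records instead of writing them.
        if max_queue_size is not None and len(queue) > max_queue_size:
            dropped += len(queue)
            queue.clear()
            continue
        # Otherwise the slow writer drains as many records as it can.
        for _ in range(min(written_per_interval, len(queue))):
            queue.popleft()
            written += 1
    return peak, written, dropped


unbounded = simulate(10, 50, 20)
bounded = simulate(10, 50, 20, max_queue_size=100)
print(unbounded, bounded)
```

With these made-up rates, the unbounded run peaks at 320 queued objects after 10 intervals, while a limit of 100 holds the peak at 110 at the cost of 330 dropped records, which mirrors the trade-off between memory use and gaps in the statistics data.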
Local fix
Problem summary
****************************************************************
USERS AFFECTED: All users of IBM Integration Bus and App Connect Enterprise who write statistics logs to a remote file system.
Platforms affected: z/OS, MultiPlatform
****************************************************************
PROBLEM DESCRIPTION: In an environment where the Integration Node has limited memory and an excessive number of log files are being written to a slow remote file system, the Integration Node may encounter an out-of-memory abend.
IIB/ACE has a single thread that writes the statistics data to an in-memory queue. If statistics data is written to the file system more slowly than statistics objects are created, these objects build up on the queue and consume memory. In such an environment, statistics objects are added to the in-memory queue faster than they are written to the file system, so the Integration Node runs out of memory, hence the abend in ImbAbend::newHandler().
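As a rough worked example of why the backlog leads to an out-of-memory condition, the following sketch estimates how many intervals pass before queued objects reach a memory budget. Every number and name here is a hypothetical assumption; real context object sizes vary a lot and are not given in this APAR.

```python
def intervals_until_limit(created_per_interval, written_per_interval,
                          object_bytes, memory_limit_bytes):
    """Estimate intervals until the queued backlog reaches a memory budget.

    Returns None when the writer keeps up and no backlog accumulates.
    Assumes a constant (hypothetical) object size and constant rates.
    """
    growth = created_per_interval - written_per_interval
    if growth <= 0:
        return None
    bytes_per_interval = growth * object_bytes
    # Ceiling division: the first interval at which the backlog
    # reaches or exceeds the budget.
    return -(-memory_limit_bytes // bytes_per_interval)


# 50 objects created and 20 written per interval, 1 MB each, 512 MB budget.
print(intervals_until_limit(50, 20, 1_000_000, 512_000_000))  # 18
```

The point of the estimate is only that any sustained gap between creation rate and write rate grows the queue linearly, so a memory-limited Integration Node will eventually hit its limit unless the queue is bounded.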
Problem conclusion
This APAR introduces a new environment variable that lets a customer limit the size (number of statistics context objects) the queue can reach if these statistics objects are to be written out. If the size of the queue exceeds this limit when the objects come to be written, the statistics data is thrown away instead of being written out. This means there will be gaps in the statistics data when the remote file system is slow but, if the limit is configured correctly, the abends should be avoided.
The appropriate value depends on the customer's unique IIB/ACE environment. There is one context object per flow per interval, so the number of flows across all of the Integration Servers makes a difference. The size of the context objects varies a lot; for example, the context object for node statistics contains an entry per node. A suggested starting value is 100:
MQSI_STATISTICS_MAX_QUEUE_SIZE=100
Once the environment variable has been set, the product checks the value and, if the size of the in-memory queue exceeds it, the records are discarded for that interval instead of being written out. In that case, a warning is logged to say that the statistics records have been dropped:
BIP9945W "Queue depth is over configured maximum so dropping statistics records"
It is the responsibility of the customer to tune the value to suit their requirements based on their Integration Node environment. If the warning message appears frequently, increase the MQSI_STATISTICS_MAX_QUEUE_SIZE value by a small increment and retry until the warning appears less often whilst the abend is still avoided.
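The tuning loop described above could be supported by a small helper that counts BIP9945W warnings in collected log lines and proposes the next value to try. This is a sketch only: the log line layout, the increment of 10 and the 10% tolerance are hypothetical assumptions, since the APAR says only to increase the value "by a small increment" when the warning is seen frequently.

```python
def count_dropped_warnings(log_lines, message_id="BIP9945W"):
    """Count log lines that contain the dropped-statistics warning ID."""
    return sum(1 for line in log_lines if message_id in line)


def suggest_limit(current_limit, warnings_seen, intervals_observed,
                  step=10, tolerated_ratio=0.1):
    """Propose the next MQSI_STATISTICS_MAX_QUEUE_SIZE value to try.

    step and tolerated_ratio are hypothetical tuning knobs, not
    values documented by the product.
    """
    if intervals_observed <= 0:
        raise ValueError("intervals_observed must be positive")
    if warnings_seen / intervals_observed > tolerated_ratio:
        # Warning seen too often: raise the limit by a small increment.
        return current_limit + step
    return current_limit


# Hypothetical log excerpt; the real log layout may differ.
lines = [
    'BIP9945W "Queue depth is over configured maximum so dropping statistics records"',
    "some other message",
    'BIP9945W "Queue depth is over configured maximum so dropping statistics records"',
]
warnings = count_dropped_warnings(lines)
print(suggest_limit(100, warnings, intervals_observed=10))  # 110
```

Any increase should still be checked against available memory, because a larger limit allows more context objects to accumulate before they are dropped.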
---------------------------------------------------------------
The fix is targeted for delivery in the following PTFs:
Version    Maintenance Level
v10.1      10.1.0.1
v11.0      11.0.0.25
v12.0      12.0.12.0
The latest available maintenance can be obtained from:
http://www-01.ibm.com/support/docview.wss?rs=849&uid=swg27006041
If the maintenance level is not yet available, information on its planned availability can be found on:
http://www-1.ibm.com/support/docview.wss?rs=849&uid=swg27006308
---------------------------------------------------------------
Temporary fix
Comments
APAR Information
APAR number
PH50284
Reported component name
IIB Z/OS
Reported component ID
5655AB100
Reported release
A00
Status
CLOSED PER
PE
NoPE
HIPER
NoHIPER
Special Attention
NoSpecatt / Xsystem
Submitted date
2022-10-19
Closed date
2024-01-29
Last modified date
2024-01-29
APAR is sysrouted FROM one or more of the following:
APAR is sysrouted TO one or more of the following:
Fix information
Fixed component name
IIB Z/OS
Fixed component ID
5655AB100
Applicable component levels
Document Information
Modified date:
29 January 2024