A fix is available
APAR status
Closed as program error.
Error description
REASON=00F30908 was added by APAR PH27803 to indicate that End of Memory and Termination processing detected an unexpected loop in the VLCAAACE control block chain.

When an address space ends, IBM MQ has processing to ensure that all connections have completed before cleaning up the state relating to the address space. As part of this processing, there is a loop which checks the state of the remaining connections (ACE control blocks) and waits for 1 second before checking again.

In the reported case, there was no loop in the control block chain. Rather, a job had been hung for days due to the problem fixed by APAR PH50958. Because the hung thread did not end and the queue manager was not restarted, the 1-second loop continued.

Within this main (outer) loop is an inner loop that runs the chain of ACE control blocks to check the state of each remaining block. A counter in the inner loop is intended to detect a loop in the chain and abend the queue manager if more than 5,000,000 (5 million) elements in the chain are processed. For the intended loop-detection behavior, the counter should be set to zero each time before the inner loop starts. However, the counter is wrongly initialized before the start of the outer loop, so it is never reset between passes.

Because the hung thread causes the outer loop to be repeated once a second, the counter gradually increases over time and reaches the defined limit after approximately 57 days. MQ then incorrectly believes that there is a loop in the ACE chain and abends the queue manager with reason 00F30908. The loop-checking logic is not intended to detect a hung thread, so the abend is not appropriate in those circumstances.
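The counter placement described above can be illustrated with a minimal Python sketch. This is not the MQ implementation (which lives in the modules listed under Modules/Macros); the names `scan_until_abend`, `LOOP_LIMIT`, and `reset_per_pass` are hypothetical, and the 1-second wait is only noted in a comment so that the sketch runs instantly.

```python
LOOP_LIMIT = 5_000_000  # element limit quoted in the error description


def scan_until_abend(chain, limit, max_passes, reset_per_pass):
    """Return the outer-loop pass on which the loop check trips, or None.

    chain          -- iterable standing in for the remaining ACE blocks
    reset_per_pass -- True models the intended behavior (counter zeroed
                      before each inner loop); False models the defect
                      (counter zeroed once, before the outer loop).
    """
    visited = 0  # defective placement: initialized before the outer loop
    for pass_no in range(1, max_passes + 1):
        if reset_per_pass:
            visited = 0  # intended placement: reset before each inner loop
        for _ace in chain:
            visited += 1
            if visited > limit:
                return pass_no  # stands in for the 00F30908 abend
        # The real processing waits 1 second here before checking again.
    return None


# One hung connection, small limit so the effect is visible immediately.
buggy = scan_until_abend(["hung"], limit=10, max_passes=100, reset_per_pass=False)
fixed = scan_until_abend(["hung"], limit=10, max_passes=100, reset_per_pass=True)

# Assuming one element counted per 1-second pass, the real limit
# corresponds to this many whole days.
days_to_limit = int(LOOP_LIMIT / 86_400)
```

With the counter reset on every pass, a short chain never reaches the limit however long the thread stays hung, while a chain that really does loop still trips the check on a single pass.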
Local fix
Until the fix for PH50958 can be applied, if a job is hung, recycle the queue manager before the hang has lasted 57 days.
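As a rough aid for planning the recycle, the following Python sketch estimates when the limit would be reached. It assumes that the counter increases by one per remaining ACE on each 1-second pass and that the time spent scanning is negligible; `latest_recycle_time`, `wait_start`, and `remaining_aces` are hypothetical names, and the result is an estimate only, not a value reported by MQ.

```python
from datetime import datetime, timedelta

LOOP_LIMIT = 5_000_000  # element limit quoted in the error description


def latest_recycle_time(wait_start, remaining_aces=1, limit=LOOP_LIMIT):
    """Estimate when the loop counter would reach the limit.

    wait_start     -- when termination processing began waiting on the
                      hung connection (assumed known to the operator)
    remaining_aces -- ACE blocks scanned per pass (assumption: each one
                      adds 1 to the counter once per second)
    """
    seconds_to_limit = limit // remaining_aces
    return wait_start + timedelta(seconds=seconds_to_limit)


deadline = latest_recycle_time(datetime(2023, 1, 1))
```

Under these assumptions, more than one remaining connection shortens the interval proportionally, so recycling well ahead of the estimate is the safer choice.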
Problem summary
****************************************************************
* USERS AFFECTED: All users of IBM MQ for z/OS Version 9       *
*                 Release 2 Modification 0 and Release 3       *
*                 Modification 0.                              *
****************************************************************
* PROBLEM DESCRIPTION: If a thread is hung in MQ, then under   *
*                      certain circumstances the QMGR could    *
*                      terminate with reason 00F30908          *
*                      approximately 57 days later. This will  *
*                      be accompanied by a dump being          *
*                      scheduled.                              *
****************************************************************
Incorrect checking can result in a hung thread being incorrectly interpreted as a loop in a control block chain. The QMGR responds to this scenario by terminating the QMGR with reason 00F30908.
Problem conclusion
The control block loop detection code has been corrected so that a hung thread no longer causes the QMGR to terminate with reason 00F30908.
Temporary fix
Comments
APAR Information
APAR number
PH57260
Reported component name
IBM MQ Z/OS V9
Reported component ID
5655MQ900
Reported release
200
Status
CLOSED PER
PE
NoPE
HIPER
YesHIPER
Special Attention
NoSpecatt / Xsystem
Submitted date
2023-09-29
Closed date
2023-11-06
Last modified date
2023-12-02
APAR is sysrouted FROM one or more of the following:
APAR is sysrouted TO one or more of the following:
UI94299 UI94300
Modules/Macros
CSQ3SSI1 CSQ3SSI2
Fix information
Fixed component name
IBM MQ Z/OS V9
Fixed component ID
5655MQ900
Applicable component levels
Fix is available
Select the PTF appropriate for your component level. You will be required to sign in. Distribution on physical media is not available in all countries.
Document Information
Modified date:
04 December 2023