A fix is available
APAR status
Closed as program error.
Error description
Original APAR title: MQ Z/OS: A CHANNEL REMAINS IN THE STOPPING STATE DUE TO THE RESUME FROM A LATCH SUSPEND NOT TAKING PLACE CORRECTLY

A suspended SRB (service request block) was marked as resumed but was still paused. A very small timing window has been found in the suspend/resume code that caused the resume not to take effect.

In the reported case, the symptom was that a receiver channel had CHSTATUS attributes that included STATUS(STOPPING), SUBSTATE(MQICALL), and STOPREQ(YES). STOP CHANNEL(channel-name) MODE(FORCE) did not end it. An attempt to start the channel from the sender queue manager resulted in:

AMQ9558E: The remote channel '<channel-name>' on host '<ip-address>' is not currently available.

CSQX558E is the z/OS equivalent to AMQ9558E. The message in the CHIN log was:

CSQX514E CSQXRESP Channel <channel-name> is active

This was despite the QMGR definition having settings of ADOPTCHK(ALL) and ADOPTMCA(ALL). From the TCP/IP perspective, the socket on the z/OS side was in the CLOSEWAIT (CLOSWT) state.

In a dump, the thread for the channel was in commit processing (modules CSQMCCMT and CSQRUC01) for batch confirmation. The commit processing was scheduled on a CHIN adapter TCB. The adapter scheduled an SRB to CSQRUCA3 to complete the commit. The adapter subsequently suspended in CSQVSRX waiting for this request to complete. The SRB tried to obtain the IVSA.csObjectLatch latch for a queue and suspended in CSQVXLT0 due to latch contention. Normally a latch wait is short, but in this case, the wait was for hours or days. The latch was no longer held, yet the waiting thread did not wake up from its suspend state.

Additional symptoms and keywords:
--------------------------------
CLOSWT CLOSE_WAIT CLOSE-WAIT

Symptoms can vary based on the function that was not resumed. Channels, IMS, Db2, and other jobs can be affected.

In one case, the suspend came from CSQJW101 for a write to the active log. The I/O completed, but the waiter was not successfully resumed.
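As a rough aid for spotting the reported channel symptom, the following Python sketch checks a line of KEYWORD(VALUE) text for the three CHSTATUS attribute values named above. The function names and the sample input are hypothetical, and the parser assumes only the simple KEYWORD(VALUE) notation used in this APAR; it does not handle nested parentheses. A match only indicates that a channel is consistent with the reported symptom, not that this defect is the cause.

```python
import re

# Attribute values reported in this APAR for the hung receiver channel.
HUNG_SIGNATURE = {"STATUS": "STOPPING", "SUBSTATE": "MQICALL", "STOPREQ": "YES"}


def parse_attributes(text):
    """Parse simple KEYWORD(VALUE) pairs into a dict keyed by upper-case keyword."""
    pairs = re.findall(r"([A-Za-z]+)\(([^()]*)\)", text)
    return {key.upper(): value.strip() for key, value in pairs}


def matches_hung_signature(text):
    """Return True if all three attribute values from the APAR are present."""
    attrs = parse_attributes(text)
    return all(
        attrs.get(key, "").upper() == expected
        for key, expected in HUNG_SIGNATURE.items()
    )


if __name__ == "__main__":
    # Hypothetical sample line, written in the notation used in this APAR.
    sample = "STATUS(STOPPING) SUBSTATE(MQICALL) STOPREQ(YES)"
    print(matches_hung_signature(sample))
```

The other reported symptoms (the CSQX514E message and the socket in CLOSWT state) still need to be confirmed separately.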
Local fix
To clear the hung thread, a recycle of the QMGR and CHIN is needed. Use STOP MODE(FORCE) if necessary. If shutdown does not complete, cancel the address spaces, starting with the CHIN or other hung job first.
Problem summary
****************************************************************
* USERS AFFECTED: All users of IBM MQ for z/OS Version 9       *
*                 Release 1 Modification 0,                    *
*                 Release 2 Modification 0 and                 *
*                 Release 3 Modification 0.                    *
****************************************************************
* PROBLEM DESCRIPTION: A task running inside MQ is suspended,  *
*                      for example while waiting for a latch,  *
*                      and is not correctly woken up when the  *
*                      task should resume processing.          *
****************************************************************

During CSQVSUSP processing IEAVPSE reported that the provided Pause Element Token (PET) was stale. CSQVSUSP updated the ROB to provide a valid PET and retried the IEAVPSE request, causing it to be suspended. However CSQVRESM was already in the process of resuming the same ROB using the same stale PET. Under rare timing conditions this CSQVRESM doesn't detect that the PET to use has changed, and does not release the PE. This results in the suspended task remaining hung.
Problem conclusion
CSQVRESM is changed to appropriately handle the stale PET when the reported timing condition occurs.
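The timing window described in the problem summary can be illustrated with a small deterministic model. This is a sketch under stated assumptions, not the MQ or z/OS code: the PauseService class, the dictionary standing in for the ROB, and the re-check loop are all hypothetical. The APAR does not describe how CSQVRESM was changed, so the re-check shown here is only one plausible way to handle a token that changed underneath the resumer.

```python
class PauseService:
    """Simplified stand-in for a pause/release service (not the z/OS API)."""

    def __init__(self):
        self._next_token = 1
        self._valid = set()
        self.paused_on = None  # token the waiter is currently paused on

    def allocate(self):
        token = self._next_token
        self._next_token += 1
        self._valid.add(token)
        return token

    def invalidate(self, token):
        self._valid.discard(token)

    def pause(self, token):
        """Return False if the token is stale, otherwise park the waiter."""
        if token not in self._valid:
            return False
        self.paused_on = token
        return True

    def release(self, token):
        """Wake the waiter only if it is paused on exactly this token."""
        if token in self._valid and self.paused_on == token:
            self.paused_on = None
            self._valid.discard(token)
            return True
        return False


def waiter_resumed(resumer_rechecks):
    """Replay the interleaving from the problem summary; True if the waiter woke."""
    svc = PauseService()
    rob = {"pet": svc.allocate()}
    svc.invalidate(rob["pet"])  # the token is stale (why is outside this sketch)

    # Resumer reads the token from the ROB, then is delayed.
    seen = rob["pet"]

    # Suspender: pause reports a stale token, so it installs a new one and retries.
    if not svc.pause(rob["pet"]):
        rob["pet"] = svc.allocate()
        svc.pause(rob["pet"])

    # Resumer continues with the token it read earlier.
    woke = svc.release(seen)
    # Assumed handling: if the release failed and the ROB token changed, retry.
    while resumer_rechecks and not woke and rob["pet"] != seen:
        seen = rob["pet"]
        woke = svc.release(seen)
    return svc.paused_on is None


if __name__ == "__main__":
    print("without re-check:", waiter_resumed(False))
    print("with re-check:   ", waiter_resumed(True))
```

In this model the waiter stays parked when the resumer uses only the token it read first, and wakes when the resumer notices that the token in the ROB has changed and retries the release.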
Temporary fix
Comments
APAR Information
APAR number
PH50958
Reported component name
IBM MQ Z/OS V9
Reported component ID
5655MQ900
Reported release
100
Status
CLOSED PER
PE
NoPE
HIPER
YesHIPER
Special Attention
NoSpecatt / Xsystem
Submitted date
2022-11-16
Closed date
2023-06-16
Last modified date
2023-09-29
APAR is sysrouted FROM one or more of the following:
APAR is sysrouted TO one or more of the following:
UI92293 UI92294 UI92295
Modules/Macros
CSQVSRX
Fix information
Fixed component name
IBM MQ Z/OS V9
Fixed component ID
5655MQ900
Applicable component levels
R100 PSY UI92295
UP23/07/15 P F307 ¢
R200 PSY UI92294
UP23/07/15 P F307 ¢
R300 PSY UI92293
UP23/07/15 P F307 ¢
Fix is available
Select the PTF appropriate for your component level. You will be required to sign in. Distribution on physical media is not available in all countries.
Document Information
Modified date:
29 September 2023