IBM Support

PH39958: MQ Z/OS: PLR HANGS ON PEER QMGRS LEADING TO S026 AND ABNORMAL QUEUE MANAGER TERMINATION

A fix is available

APAR status

  • Closed as program error.

Error description

  • PLR hangs on peer qmgrs, leading to S026-08118001 /
    S026-08118002 abends and abnormal queue manager termination,
    following a failed PLR attempt when the owning queue manager
    restarts.
    It follows abend 602 which occurs during qmgr startup if the
    AMSM address space fails to start.
    Following its restart, CSQ1 attempted to perform PLR for
    its connection to the MUTUAL structure. This failed due to an
    IxlRsnCodeHeldBySys return code when attempting to lock a list
    header, resulting in the connection to the structure being
    disconnected with REASON=FAILURE.
    The other qmgrs received a DiscConnFail event for CSQ1's
    connection to MUTUAL and attempted to perform PLR. However, they
    required an ENQ held by CSQ1, which would not be released until
    CSQ1 ended or disconnected from the admin structure for another
    reason.
    As a result the structure task for MUTUAL on each of the other
    queue managers hung waiting for the ENQ until terminated by XCF.
    

Local fix

Problem summary

  • ****************************************************************
    * USERS AFFECTED: All users of IBM MQ for z/OS Version 9       *
    *                 Release 1 Modification 0,                    *
    *                 Release 2 Modification 0 and                 *
    *                 Release 3 Modification 0.                    *
    ****************************************************************
    * PROBLEM DESCRIPTION: Abend S026-08118001, followed later by  *
    *                      abend S026-08118002 and abnormal        *
    *                      queue manager termination S6C6 occurs   *
    *                      due to ENQ contention when a peer queue *
    *                      manager fails Peer Level Recovery of    *
    *                      its own connection to an application    *
    *                      structure during startup.               *
    ****************************************************************
    A queue manager terminated abnormally while shared queue
    operations were in flight. Any peer queue managers connected to
    the structure attempted Peer Level Recovery (PLR) for the failed
    connection, but were unable to perform recovery because a
    list's lock was held by the system.
    When the terminated queue manager restarted, it connected to the
    structure and attempted PLR. This also failed for the
    same reason, and the queue manager disconnected, indicating
    REASON=FAILURE.
    The connected peers detected this failure and attempted to
    start PLR. However, this required an ENQ that was held by the
    disconnecting queue manager until termination.
    This resulted in the structure task of each connected peer
    hanging until terminated by XCF.
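
    The contention described above can be pictured with a small toy
    model. This is only an illustrative sketch, not IBM MQ code: the
    Enq class, the simulate_hang function and the peer names CSQ2 and
    CSQ3 are hypothetical, and the model assumes only what the text
    states, namely that the restarting queue manager keeps the ENQ
    after its failed PLR and disconnect.

```python
# Toy model of the ENQ contention. All names here are hypothetical
# illustrations and do not correspond to IBM MQ internals.
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Enq:
    """An exclusive serialization resource with one holder at a time."""
    holder: Optional[str] = None
    waiters: List[str] = field(default_factory=list)

    def request(self, requester: str) -> bool:
        """Grant the ENQ if it is free; otherwise queue the requester."""
        if self.holder is None:
            self.holder = requester
            return True
        self.waiters.append(requester)
        return False


def simulate_hang(restarting_qmgr: str, peers: List[str]) -> List[str]:
    """Return the peers left waiting for the ENQ."""
    enq = Enq()
    # The restarting queue manager obtains the ENQ and, per the APAR text,
    # holds it until it terminates.
    enq.request(restarting_qmgr)
    # Its own PLR fails and it disconnects with REASON=FAILURE, but the ENQ
    # is not released at that point. Each peer now starts PLR and needs the
    # same ENQ, so every peer ends up waiting.
    return [peer for peer in peers if not enq.request(peer)]


blocked = simulate_hang("CSQ1", ["CSQ2", "CSQ3"])
```

    In this model every peer passed to simulate_hang ends up in the
    waiters list, which corresponds to the structure task on each peer
    hanging until XCF terminates it.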
    

Problem conclusion

  • Connected peers will no longer attempt PLR when a queue manager
    fails PLR for its own connection during startup, preventing
    the hang condition.
    PLR for the connection will be retried when the owning queue
    manager next attempts to connect to the structure.
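
    The changed behaviour can be summarised as a decision rule. The
    sketch below is a hedged illustration under stated assumptions,
    not the code shipped in the PTFs: the DisconnectEvent type, its
    failed_own_plr_at_startup flag and the peer_should_attempt_plr
    function are hypothetical names, and treating a non-failure
    disconnect as needing no peer recovery is an assumption of the
    model.

```python
# Hypothetical decision rule illustrating the fix. None of these names
# come from IBM MQ; they exist only for this sketch.
from dataclasses import dataclass


@dataclass(frozen=True)
class DisconnectEvent:
    qmgr: str                        # queue manager whose connection ended
    reason: str                      # for example "FAILURE"
    failed_own_plr_at_startup: bool  # assumed flag: owner's own PLR failed


def peer_should_attempt_plr(event: DisconnectEvent) -> bool:
    """Decide whether a connected peer starts PLR for this connection."""
    if event.reason != "FAILURE":
        # Assumption of this sketch: only failed connections need recovery.
        return False
    # After the fix, peers skip PLR when the owner failed PLR for its own
    # connection during startup. Recovery is retried when the owning queue
    # manager next attempts to connect to the structure.
    return not event.failed_own_plr_at_startup


startup_failure = DisconnectEvent("CSQ1", "FAILURE", True)
ordinary_failure = DisconnectEvent("CSQ1", "FAILURE", False)
```

    With these inputs, peer_should_attempt_plr returns False for
    startup_failure and True for ordinary_failure, so peers never
    request the ENQ in the case that previously caused the hang.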
    

Temporary fix

Comments

APAR Information

  • APAR number

    PH39958

  • Reported component name

    IBM MQ Z/OS V9

  • Reported component ID

    5655MQ900

  • Reported release

    100

  • Status

    CLOSED PER

  • PE

    NoPE

  • HIPER

    YesHIPER

  • Special Attention

    NoSpecatt / Xsystem

  • Submitted date

    2021-08-18

  • Closed date

    2022-07-21

  • Last modified date

    2022-10-07

  • APAR is sysrouted FROM one or more of the following:

  • APAR is sysrouted TO one or more of the following:

    UI81009 UI81010 UI81600

Modules/Macros

  • CSQECLOS CSQESEX  CSQESTE
    

Fix information

  • Fixed component name

    IBM MQ Z/OS V9

  • Fixed component ID

    5655MQ900

Applicable component levels

  • R100 PSY UI81010

       UP22/07/01 P F206

  • R200 PSY UI81009

       UP22/07/01 P F206

  • R300 PSY UI81600

       UP22/08/03 P F208

Fix is available

  • Select the PTF appropriate for your component level. You will be required to sign in. Distribution on physical media is not available in all countries.

Document Information

Modified date:
07 October 2022