IBM Support

PH54116: MQ Z/OS: AN RRS JOB THAT IS CONNECTED TO MQ AND ABENDED IS WAITING IN CSQ3EPX CSECT CSQ3RRSX ROUTINE 23/06/01 PTF PECHANGE

A fix is available

APAR status

  • Closed as program error.

Error description

  • A hang occurred in MQ for a job using RRS coordination after
    the job abended or was cancelled.
    
    From a dump at the time of the hang, the RRS perspective is:
    CTXCEMGR issued a PC 3506 to RRS to process its end context
    syncpoint exit.  RRS then queued a request to the RRS address
    space to process syncpoint for the associated UR and suspended
    the requestor until syncpoint completes.
    
    There was 1 interest from the MQ Resource Manager. RRS called
    MQ, which had not returned. For the RRS TCB, the linkage stack
    entries show RRS did a PC 81F06, and then there was a PC 30D
    from MQ Load Module CSQ3EPX CSECT CSQ3RRSX routine
    EBACTL_RETRY1. MQ is waiting for ECB DIEWA.lECB.  The RRS TCB
    was looping in EBACTL_RETRY1 waiting for the EBACTL flag to be
    set for a thread.
    
    In the reported case, the job was cancelled or killed while the
    related MQ thread was making an MQ API call. Because the work
    came from an RRS batch application, the task underwent an EB
    switch. As part of this call an execution unit switch (EUS) was
    required, and the task was suspended awaiting completion of the
    request. The task was killed (abended S422) from USS, and
    recovery routine EUS1FRRE got control. The EUS did not complete
    in the next 2 seconds, so the EBDR flag was set to indicate that
    recovery should be deferred until End of Task (EOT). The
    recovery routine percolated.
    
    A critical recovery exit in CSQMCPRH, which would have turned
    on EBACTL in the context ACE EB, was not called. This later
    leads to looping in CSQ3RRSX and the address space failing to
    terminate.
    
    The problem only exists when EBDR is set in the context ACE EB,
    which only happens for certain timing windows when a task
    abends (or is cancelled) while waiting for an execution unit
    switch.
    
    This problem is a regression caused by APAR PH38111.
    
    
    Additional symptoms:
    -------------------
    ABEND422 ABENDS422
    
    In the reported case, the RRS application affected was
    Financial Transaction Manager for SWIFT Services for z/OS
    (FTM). The problem can affect other RRS applications.
    
    For the reported FTM case, symptoms included:
    
      DNFF4008E ou-service LT 'ltname': Attempt to initialize
      SFD failed; reason code='Old SFD process found for this LT'.
    
      We issued abort + abort force for the LT (logical terminal),
      but the unix broker task was still running.
    
      We tried to cancel the SFD (SWIFTNet FIN Daemon) broker job,
      but the task was still running.
    
      We then moved the LTs to another LPAR, but the broker job
      on the original LPAR still had the instance.DNF_FSM_SLS.lt
      queue open, so we got this error on the other LPAR:
    
      DNFF4400E ou-service LT 'ltname': MQ 'OPEN' operation on
      queue 'instance.DNF_FSM_SLS.lt' of queue manager 'qmgr-name'
      failed; Reason code='2042';
      error text='SLS input queue open error'.
    
      Reason code 2042 means MQRC_OBJECT_IN_USE
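
When triaging logs for the FTM symptom above, it can help to pull the
MQ reason code out of the DNFF4400E message text. The following is a
minimal sketch, assuming the message has been joined onto one line and
keeps the Reason code='nnnn' form quoted above. The names REASON_NAMES
and extract_reason are hypothetical, and the table holds only the one
reason code named in this APAR.

```python
import re

# Excerpt only: 2042 is the single reason code named in this APAR.
REASON_NAMES = {2042: "MQRC_OBJECT_IN_USE"}

# Matches the "Reason code='2042'" fragment of the quoted message.
REASON_RE = re.compile(r"Reason code='(\d+)'")


def extract_reason(message):
    """Return (code, name) from a message line, or None if no code is found.

    Hypothetical helper for illustration; not part of MQ or FTM.
    """
    match = REASON_RE.search(message)
    if match is None:
        return None
    code = int(match.group(1))
    return code, REASON_NAMES.get(code, "UNKNOWN")


if __name__ == "__main__":
    sample = (
        "DNFF4400E ou-service LT 'ltname': MQ 'OPEN' operation on "
        "queue 'instance.DNF_FSM_SLS.lt' of queue manager 'qmgr-name' "
        "failed; Reason code='2042'; "
        "error text='SLS input queue open error'."
    )
    print(extract_reason(sample))
```

A result of MQRC_OBJECT_IN_USE on the SLS input queue, together with a
cancelled broker job that is still running, is consistent with the
hang described in this APAR, but it is not proof of it on its own.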
    

Local fix

  • The problem of the hung RRS job was resolved with an IPL.
    

Problem summary

  • ****************************************************************
    * USERS AFFECTED: All users of IBM MQ for z/OS Version 9       *
    *                 Release 1 Modification 0 and Version 9       *
    *                 Release 2 Modification 0 and Version 9       *
    *                 Release 3 Modification 0.                    *
    ****************************************************************
    * PROBLEM DESCRIPTION: An RRS application that is connected to *
    *                      MQ hangs after abending or being        *
    *                      canceled.                               *
    ****************************************************************
    After an RRS application that is connected to MQ abends or is
    canceled, it may hang.
    
    The hang is caused by a loop that waits for the connected
    application to finish its work inside the Queue Manager's
    Address Space. However, the internal state that reflects when
    the connected RRS application is no longer operating inside the
    Queue Manager's Address Space was not being set correctly,
    resulting in the hang.
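
The wait loop described above can be modelled in a few lines. This is
a simplified sketch and not the MQ implementation: the class and
function names are hypothetical, the EBACTL and EBDR flags are
represented by a threading.Event and a boolean, and the retry loop is
bounded so that the example terminates, whereas the loop described in
this APAR waits indefinitely.

```python
import threading


class ExecutionBlock:
    """Hypothetical stand-in for the context ACE EB."""

    def __init__(self):
        self.ebactl = threading.Event()  # models the EBACTL flag
        self.ebdr = False                # models the EBDR flag


def task_abends(eb, eus_completed_in_time, exit_is_driven):
    """Model a task abending while suspended for an execution unit switch."""
    if not eus_completed_in_time:
        eb.ebdr = True  # recovery is deferred until End of Task
    if exit_is_driven:
        eb.ebactl.set()  # the recovery exit turns on EBACTL


def syncpoint_wait(eb, max_retries=5, interval=0.01):
    """Model the retry loop; bounded here only so the sketch ends."""
    for _ in range(max_retries):
        if eb.ebactl.wait(interval):
            return "completed"
    return "hung"


def simulate(eus_completed_in_time, fixed):
    """Run one scenario and report whether the waiter completes."""
    eb = ExecutionBlock()
    # Assumed model of the defect: the exit is skipped only when
    # recovery was deferred (EBDR set) and the fix is not applied.
    exit_is_driven = fixed or eus_completed_in_time
    task_abends(eb, eus_completed_in_time, exit_is_driven)
    return syncpoint_wait(eb)


if __name__ == "__main__":
    print(simulate(eus_completed_in_time=False, fixed=False))
    print(simulate(eus_completed_in_time=False, fixed=True))
```

In this model only the combination of deferred recovery and the
missing exit leaves the waiter without its flag, which mirrors the
narrow timing window described in the error description.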
    

Problem conclusion

  • The code has been corrected so that the internal state that
    reflects when the connected RRS application is no longer
    operating inside the Queue Manager's Address Space is set
    correctly, preventing the connected RRS application from
    hanging.
    

Temporary fix

Comments

APAR Information

  • APAR number

    PH54116

  • Reported component name

    IBM MQ Z/OS V9

  • Reported component ID

    5655MQ900

  • Reported release

    100

  • Status

    CLOSED PER

  • PE

    YesPE

  • HIPER

    YesHIPER

  • Special Attention

    NoSpecatt / Xsystem

  • Submitted date

    2023-04-24

  • Closed date

    2023-08-07

  • Last modified date

    2023-09-01

  • APAR is sysrouted FROM one or more of the following:

  • APAR is sysrouted TO one or more of the following:

    UI93081 UI93082 UI93083

Modules/Macros

  • CSQ3ID30 CSQCECTX CSQM148M CSQMCLMT CSQMCPRH CSQVEOT1 CSQVEUS2
    CSQVEUS3
    

Fix information

  • Fixed component name

    IBM MQ Z/OS V9

  • Fixed component ID

    5655MQ900

Applicable component levels

  • R100 PSY UI93083

       UP23/08/19 P F308

  • R200 PSY UI93082

       UP23/08/19 P F308

  • R300 PSY UI93081

       UP23/08/19 P F308
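
For sites that script their maintenance checks, the release-to-PTF
pairs listed above can be held in a small lookup. This is a sketch
only: the table repeats the three pairs from this APAR, and the
function name ptf_for_release is hypothetical.

```python
# Release level to PTF, as listed under "Applicable component levels".
PTF_BY_RELEASE = {
    "100": "UI93083",
    "200": "UI93082",
    "300": "UI93081",
}


def ptf_for_release(release):
    """Return the PTF for a release such as 'R100' or '100', else None.

    Hypothetical helper for illustration only.
    """
    key = release.strip().upper().lstrip("R")
    return PTF_BY_RELEASE.get(key)


if __name__ == "__main__":
    print(ptf_for_release("R200"))
```

A result of None means the release is not one of the three levels
listed in this APAR.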

Fix is available

  • Select the PTF appropriate for your component level. You will be required to sign in. Distribution on physical media is not available in all countries.

Document Information

Modified date:
28 September 2023