IBM Support

PH47266: ABN=5C6-00E20045,M=CSQGFRCV,LOC=CSQILPLM.CSQIUOWA+00000662

A fix is available

APAR status

  • Closed as program error.

Error description

  • During a QMGR restart, the queue manager fails with abend
    5C6-00E20045 because of a partially deleted shared queue. When a
    task then tries to open the queue, the error causes the task to
    loop. The looping task is part of a DISPLAY QUEUE command running
    on a DB2 server task. Checkpoint processing then runs and issues
    a query to DB2; the query happens to be assigned to the same DB2
    server task and is queued behind the looping task. As a result,
    checkpoints were no longer being written, while the tests
    continued to do large amounts of persistent/recoverable work.
    The QMGR was then cancelled, leaving a huge backlog of
    outstanding recovery processing since the last successful
    checkpoint. On the next startup, the volume of this recovery
    processing resulted in various memory problems.
    

Local fix

Problem summary

  • ****************************************************************
    * USERS AFFECTED: All users of IBM MQ for z/OS Version 9       *
    *                 Release 2 Modification 0 and                 *
    *                 Release 3 Modification 0.                    *
    ****************************************************************
    * PROBLEM DESCRIPTION: When MQ checkpointing has stalled a     *
    *                      Queue Manager could get into an         *
    *                      unrecoverable situation after the logs  *
    *                      containing the most recent checkpoint   *
    *                      have been reused.                       *
    ****************************************************************
    If logging activity continues for long enough after MQ checkpoint
    processing has stalled, the log containing the most recent
    checkpoint could be reused. This could leave the Queue Manager in
    an unrecoverable situation, as there would be no checkpoint from
    which to rebuild the Queue Manager.
    

Problem conclusion

  • A new message, CSQJ169E, has been added to indicate when the last
    checkpoint is no longer contained on any of the active logs.
    When this scenario is detected, the message is output during
    active log switch processing to indicate to the user that
    checkpoint processing may have stalled. Action may need to be
    taken to ensure that a new checkpoint is taken, to prevent the
    Queue Manager from proceeding into an unrecoverable situation.
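
The condition that the new message reports can be pictured as a simple range test: does the RBA of the last checkpoint fall inside the RBA range of any active log? The following is a minimal illustrative sketch only, not the Queue Manager's implementation. The class `ActiveLog`, the function names, the data set labels, and all RBA values are hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ActiveLog:
    """One active log data set (hypothetical model, not an MQ structure)."""
    name: str       # illustrative label only
    start_rba: int  # first RBA held in this log
    end_rba: int    # last RBA held in this log (inclusive)


def rba(hex_text: str) -> int:
    """Convert an RBA written as hexadecimal text into an integer."""
    return int(hex_text, 16)


def checkpoint_in_active_logs(checkpoint_rba: int, logs: list[ActiveLog]) -> bool:
    """Return True if the checkpoint RBA lies within any active log's range."""
    return any(log.start_rba <= checkpoint_rba <= log.end_rba for log in logs)


# Made-up RBA ranges for two active logs.
logs = [
    ActiveLog("LOGCOPY1.DS01", rba("000000030000"), rba("000000037FFF")),
    ActiveLog("LOGCOPY1.DS02", rba("000000038000"), rba("00000003FFFF")),
]

# A checkpoint that is still on an active log.
healthy = checkpoint_in_active_logs(rba("000000039A00"), logs)

# A checkpoint whose log has already been reused (the situation the
# message is intended to flag).
stalled = checkpoint_in_active_logs(rba("00000001C000"), logs)

print("checkpoint on active log:", healthy)
print("stale checkpoint on active log:", stalled)
```

In this sketch the second case corresponds to the situation in which the last checkpoint exists only on an archive log, if it exists at all.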
    
    The IBM Documentation is updated as follows:
    
    Both the V930 & V920 doc pages below will have new entries for
    message CSQJ169E:
    IBM MQ
     -Reference
      -Messages and reason codes
       -IBM MQ for z/OS messages, completion, and reason codes
        -Messages for IBM MQ for z/OS
         -Recovery log manager messages (CSQJ...)
    
    CSQJ169E
        LAST CHECKPOINT NOT FOUND IN ACTIVE LOG COPY & WITH
        STARTRBA=&, CHECKPOINT RBA=&.
    
        Explanation
    
    During active log switch processing, the last checkpoint was not
    found on any active logs. This could leave the Queue Manager in
    an unrecoverable position if there are insufficient archive logs
    available to find the required recovery point during restart
    processing. This may indicate that checkpoint processing has
    stalled or is not completing in a timely manner, and should be
    investigated.
    
        System action
    
    Log switch processing will continue.
    
        System programmer response
    
    You may be able to re-establish checkpointing by stopping and
    restarting the Queue Manager. If checkpointing is stalled, the
    STOP QMGR command may not be able to shut down the Queue Manager
    normally. If this happens, then the Queue Manager may need to be
    cancelled. Before doing so, ensure that the logs from the
    restart RBA onwards are available. The restart RBA can be found
    using the DISPLAY USAGE command.
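
One way to reason about "logs from the restart RBA onwards are available" is as an interval-coverage check: the RBA ranges of the logs you still hold must cover everything from the restart RBA up to the highest written RBA with no gap. The sketch below assumes you have already transcribed those ranges yourself, for example from print log map (CSQJU004) output. The function name `covers_from` and every RBA value are hypothetical, and the code does not read any MQ data set.

```python
def covers_from(restart_rba: int, high_rba: int,
                ranges: list[tuple[int, int]]) -> bool:
    """Return True if the inclusive (start, end) RBA ranges jointly cover
    restart_rba..high_rba without a gap."""
    needed = restart_rba  # lowest RBA not yet known to be covered
    for start, end in sorted(ranges):
        if start > needed:
            break  # gap: nothing holds the RBA we need next
        if end >= needed:
            needed = end + 1  # this log extends the covered region
        if needed > high_rba:
            return True
    return needed > high_rba


# Made-up ranges for three consecutive logs.
all_logs = [(0x1000, 0x1FFF), (0x2000, 0x2FFF), (0x3000, 0x3FFF)]

# The same inventory with the middle log missing.
missing_middle = [(0x1000, 0x1FFF), (0x3000, 0x3FFF)]

complete = covers_from(0x1800, 0x3FFF, all_logs)
incomplete = covers_from(0x1800, 0x3FFF, missing_middle)

print("all logs present:", complete)
print("middle log missing:", incomplete)
```

If a check of this kind fails, the sketch suggests that restart processing could need a log that is no longer available, which is the situation to rule out before cancelling the Queue Manager.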
    
    If it appears that checkpointing has stalled, then take a dump
    of the Queue Manager Address Space and contact your IBM support
    center for assistance to help understand why checkpointing may
    have stalled.
    
    If checkpointing does not appear to have stalled, then an
    alternative reason for this situation might be that the Queue
    Manager's active logs are too small for the current workload and
    checkpoint processing is not completing within the lifespan of
    one active log.
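
As a rough back-of-envelope aid for the sizing case, the time before the log holding a checkpoint is reused can be estimated from the log size, the number of active logs, and the logging rate, and then compared with how long a checkpoint takes. This is a simplified model under stated assumptions, not an IBM formula: it assumes a steady logging rate and that the active logs are reused in rotation, and it takes the conservative view that a log is reused once the other logs have filled. The function names and all figures are hypothetical.

```python
def seconds_until_log_reused(log_size_bytes: int, active_log_count: int,
                             log_rate_bytes_per_sec: float) -> float:
    """Conservative estimate of the time before a just-filled active log
    is reused: the time needed to fill the remaining (count - 1) logs."""
    if log_rate_bytes_per_sec <= 0:
        raise ValueError("logging rate must be positive")
    if active_log_count < 2:
        raise ValueError("need at least two active logs")
    return log_size_bytes * (active_log_count - 1) / log_rate_bytes_per_sec


def checkpoint_at_risk(checkpoint_seconds: float, reuse_seconds: float) -> bool:
    """True if a checkpoint takes longer than the estimated reuse time."""
    return checkpoint_seconds > reuse_seconds


# Hypothetical figures: four logs of 1,000,000,000 bytes each,
# written at 5,000,000 bytes per second.
reuse = seconds_until_log_reused(1_000_000_000, 4, 5_000_000)

print("estimated seconds until reuse:", reuse)
print("900 s checkpoint at risk:", checkpoint_at_risk(900, reuse))
print("120 s checkpoint at risk:", checkpoint_at_risk(120, reuse))
```

Under this model, larger or more numerous active logs lengthen the reuse time, which is why undersized logs are a plausible alternative explanation for the message.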
    

Temporary fix

Comments

APAR Information

  • APAR number

    PH47266

  • Reported component name

    IBM MQ Z/OS V9

  • Reported component ID

    5655MQ900

  • Reported release

    200

  • Status

    CLOSED PER

  • PE

    NoPE

  • HIPER

    NoHIPER

  • Special Attention

    NoSpecatt / Xsystem

  • Submitted date

    2022-06-15

  • Closed date

    2023-09-21

  • Last modified date

    2023-11-01

  • APAR is sysrouted FROM one or more of the following:

  • APAR is sysrouted TO one or more of the following:

    UI93668 UI93669 UI93670 UI93671 UI93672 UI93673 UI93674 UI93675
    UI93676 UI93677 UI93678 UI93679

Modules/Macros

  • CSQFJDIC CSQFJDIE CSQFJDIF CSQFJDIK CSQFJDIU CSQFLTXC CSQFLTXE
    CSQFLTXF CSQFLTXK CSQFLTXU CSQFMTXC CSQFMTXE CSQFMTXF CSQFMTXK
    CSQFMTXU CSQJW307
    

Fix information

  • Fixed component name

    IBM MQ Z/OS V9

  • Fixed component ID

    5655MQ900

Applicable component levels

  • R200 PSY UI93674

       UP23/10/10 P F310

  • R201 PSY UI93675

       UP23/10/10 P F310

  • R202 PSY UI93676

       UP23/10/10 P F310

  • R203 PSY UI93677

       UP23/10/10 P F310

  • R204 PSY UI93678

       UP23/10/10 P F310

  • R205 PSY UI93679

       UP23/10/10 P F310

  • R300 PSY UI93668

       UP23/10/10 P F310

  • R301 PSY UI93669

       UP23/10/10 P F310

  • R302 PSY UI93670

       UP23/10/10 P F310

  • R303 PSY UI93671

       UP23/10/10 P F310

  • R304 PSY UI93672

       UP23/10/10 P F310

  • R305 PSY UI93673

       UP23/10/10 P F310

Fix is available

  • Select the PTF appropriate for your component level. You will be required to sign in. Distribution on physical media is not available in all countries.

Document Information

Modified date:
02 November 2023