IBM Support

IV71217: NODE DOWN IN CAA CLUSTER DUE TO CONFIGRM MEMORY LEAK

APAR status

  • Closed as fixed if next.

Error description

  • ***************************************************************
    * USERS AFFECTED:
    * Systems running rsct.core.rmc 3.1.5.0 through 3.1.5.8,
    * as well as rsct.core.rmc 3.2.0.0 through 3.2.0.4.
    * This includes AIX 6.1 TL9 and AIX 7.1 TL3, and VIOS 2.2.3.
    * Other AIX levels can be affected if RSCT has been updated
    * independently of AIX.
    ***************************************************************
    * PROBLEM DESCRIPTION:
    * Starting in rsct.core.rmc 3.1.5.0, a slow memory leak in
    * IBM.ConfigRM under CAA can lead to a cluster service
    * shutdown, which causes a node failure in both PowerHA v7
    * (halt) and VIOS SSP (system panic).
    * The leak occurs as long as CAA is active, regardless of
    * what PowerHA or SSP is doing, and only on the node
    * operating as the ConfigRM Group leader.  The GL node
    * can be identified in "lssrc -ls IBM.ConfigRM".
    * A reboot is guaranteed to reset the situation.  Time to
    * failure after a fresh boot is estimated at between 4 and 8
    * months.  None of the failure records from the field
    * retained the time of the last reboot, so a precise figure
    * is not known.
    ***************************************************************
    * RECOMMENDATION:
    * The fix for RSCT 3.1.5 is available via RSCT APAR IV66606.
    * An interim fix for RSCT 3.1.5 is available from either:
    * ftp://aix.software.ibm.com/aix/ifixes/iv66606/
    * https://aix.software.ibm.com/aix/ifixes/iv66606/
    *
    * The fix for RSCT 3.2.0 is available via RSCT APAR IV69760.
    * The fix for RSCT 3.2.0 will also ship with:
    * AIX 6.1 TL9 SP5, AIX 7.1 TL3 SP5, and VIOS 2.2.3.5.
    * An interim fix for RSCT 3.2.0 is available from either:
    * ftp://aix.software.ibm.com/aix/ifixes/iv69760/
    * https://aix.software.ibm.com/aix/ifixes/iv69760/
    ***************************************************************
    * NOTICE:
    * Several older interim fix packages have been made available
    * through various IBM support paths prior to the release of
    * the official interim fixes above.
    *
    * Customers holding any of those older packages that have
    * not yet been installed should discard the unused package
    * in favor of the one available in the links above.
    *
    * Customers who already have any other IV66606 package
    * installed should check the state of IBM.ConfigRM in lssrc;
    * if the subsystem is active, then no further action is
    * needed.  If the subsystem is not running, contact IBM
    * support for assistance in replacing the package, since the
    * absence of IBM.ConfigRM may cause emgr removal checks to fail.
    * In particular, customers with the "IV66606.1" package
    * installed on RSCT 3.1.5.0 or 3.1.5.1 are known to be
    * exposed to this problem.
    ***************************************************************
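
    The affected level ranges listed under USERS AFFECTED can be
    checked mechanically.  The following is a minimal sketch, not an
    IBM-provided tool: the names AFFECTED_RANGES, parse_level and
    is_affected are hypothetical, and it assumes the rsct.core.rmc
    level has already been read on the node (for example from
    "lslpp -L rsct.core.rmc") and is passed in as a string.

```python
# Affected rsct.core.rmc level ranges (inclusive), copied from the
# USERS AFFECTED text of this APAR. The names here are illustrative only.
AFFECTED_RANGES = [
    ((3, 1, 5, 0), (3, 1, 5, 8)),
    ((3, 2, 0, 0), (3, 2, 0, 4)),
]


def parse_level(level):
    """Convert a fileset level such as '3.1.5.3' into a tuple of four ints."""
    parts = tuple(int(p) for p in level.strip().split("."))
    if len(parts) != 4:
        raise ValueError("expected a four-part level, got %r" % level)
    return parts


def is_affected(level):
    """Return True if the level falls inside one of the affected ranges."""
    v = parse_level(level)
    # Tuples compare element by element, so this is a numeric range check.
    return any(low <= v <= high for low, high in AFFECTED_RANGES)


if __name__ == "__main__":
    for sample in ("3.1.5.3", "3.1.5.9", "3.2.0.4", "3.2.0.5"):
        print(sample, "affected" if is_affected(sample) else "not affected")
```

    This sketch compares fileset levels only.  It does not detect
    whether an interim fix has been applied, so a node at an affected
    level may already be protected.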
    

Local fix

Problem summary

Problem conclusion

Temporary fix

  • *********
    * HIPER *
    *********
    

Comments

  • This APAR is being closed FIN. This means that a solution to
    this APAR is expected to be delivered from IBM in a release
    (if any) to be available within the next 24 months.
    

APAR Information

  • APAR number

    IV71217

  • Reported component name

    AIX 610 STD EDI

  • Reported component ID

    5765G6200

  • Reported release

    610

  • Status

    CLOSED FIN

  • PE

    NoPE

  • HIPER

    YesHIPER

  • Submitted date

    2015-03-19

  • Closed date

    2015-03-19

  • Last modified date

    2015-05-12

  • APAR is sysrouted FROM one or more of the following:

  • APAR is sysrouted TO one or more of the following:

    IV71219

Fix information

Applicable component levels

  • R610 PSY

       UP

Document Information

Modified date:
17 December 2021