IBM Support

IJ09741: CAA: UNEXPECTED CLCONFD STARTUP ON STOPPED NODE LEADS TO DEADLOCK (APPLIES TO AIX 7100-04)

APAR status

  • Closed as program error.

Error description

  • After a reboot of a STOPPED node, a manual startup of
    clconfd can result in a deadlock which prevents the node
    from starting CAA cluster services.
    
    The syslog shows the following messages repeating:
    
    Sep 18 16:00:46 aix1aiib19p caa:info unix:
    kcluster_lock.c      wait_on_node_bringup    480
    Active node count = 0 rc = -1 code = 455
    Sep 18 16:00:46 aix1aiib19p caa:err|error unix:
    kcluster_lock.c get_storage_nodecnt     105
    NDD_NODE_CNT returned rc=16
    Sep 18 16:00:47 aix1aiib19p caa:err|error unix:
    kcluster_lock.c get_storage_nodecnt     105
    NDD_NODE_CNT returned rc=16
    Sep 18 16:00:47 aix1aiib19p caa:err|error unix:
    kcluster_lock.c get_storage_nodecnt     105
    NDD_NODE_CNT returned rc=16
    Sep 18 16:00:48 aix1aiib19p caa:info unix:
    kcluster_lock.c      get_storage_nodecnt     132
    NDD_NODE_CNT rc 16 count -1 line 113
    Sep 18 16:00:48 aix1aiib19p caa:info unix:
    kcluster_lock.c      count_active_nodes      393      rc
    -1 code 314 num_nodes_active 0 up_node_cnt 0 db_node_cnt
    0
    Sep 18 16:00:48 aix1aiib19p caa:info unix:
    kcluster_lock.c      count_active_nodes      395
    num_local_nodes_active 0 local_up_node_cnt 0
    local_db_node_cnt 0
    Sep 18 16:00:48 aix1aiib19p caa:info unix:
    kcluster_lock.c      wait_on_node_bringup    480
    Active node count = 0 rc = -1 code = 455
    Sep 18 16:00:48 aix1aiib19p caa:err|error unix:
    kcluster_lock.c get_storage_nodecnt     105
    NDD_NODE_CNT returned rc=16
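
    The repeating messages above can be used as a rough signature when
    scanning a saved syslog file. The following is a minimal sketch, not an
    IBM-supplied tool: the function names, the SAMPLE text, and the
    min_repeats threshold are assumptions made for illustration, and only
    the two message strings are taken from the excerpt.

```python
import re
from collections import Counter

# Message fragments copied from the syslog excerpt in this APAR.
SIGNATURES = {
    "ndd_node_cnt_rc16": re.compile(re.escape("NDD_NODE_CNT returned rc=16")),
    "no_active_nodes": re.compile(re.escape("Active node count = 0 rc = -1")),
}

# Illustrative sample built from the excerpt (joined onto single lines).
SAMPLE = (
    "Sep 18 16:00:46 aix1aiib19p caa:info unix: kcluster_lock.c "
    "wait_on_node_bringup 480 Active node count = 0 rc = -1 code = 455\n"
    "Sep 18 16:00:46 aix1aiib19p caa:err|error unix: kcluster_lock.c "
    "get_storage_nodecnt 105 NDD_NODE_CNT returned rc=16\n"
    "Sep 18 16:00:47 aix1aiib19p caa:err|error unix: kcluster_lock.c "
    "get_storage_nodecnt 105 NDD_NODE_CNT returned rc=16\n"
    "Sep 18 16:00:47 aix1aiib19p caa:err|error unix: kcluster_lock.c "
    "get_storage_nodecnt 105 NDD_NODE_CNT returned rc=16\n"
)


def count_signatures(log_text):
    """Count how often each signature fragment occurs in the log text."""
    counts = Counter()
    for name, pattern in SIGNATURES.items():
        counts[name] = len(pattern.findall(log_text))
    return counts


def looks_like_ij09741(log_text, min_repeats=3):
    """Heuristic check: repeated rc=16 errors plus a zero active node count.

    min_repeats is an arbitrary threshold chosen for this sketch.
    """
    counts = count_signatures(log_text)
    return (
        counts["ndd_node_cnt_rc16"] >= min_repeats
        and counts["no_active_nodes"] >= 1
    )


if __name__ == "__main__":
    print(dict(count_signatures(SAMPLE)))
    print(looks_like_ij09741(SAMPLE))
```

    A match is only a hint that the node may be in this state; the same
    messages could appear for other reasons.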
    

Local fix

  • Run "stopsrc -s clconfd" or "kill -9 clconfd_pid" (where
    clconfd_pid is the process ID of clconfd), followed by
    "clmgr online node START_CAA=yes".
    
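
    The local fix is a two-step sequence: stop clconfd, then bring the node
    online with CAA started. As a sketch only, the sequence could be wrapped
    as shown here. The helper names and the dry_run flag are hypothetical;
    the commands themselves are copied verbatim from the local fix, with no
    arguments added.

```python
import subprocess


def local_fix_commands(clconfd_pid=None):
    """Return the local-fix command sequence as argv lists.

    Without a PID, clconfd is stopped through the SRC ("stopsrc -s clconfd").
    With a PID, the "kill -9" alternative from the local fix is used instead.
    """
    if clconfd_pid is None:
        stop = ["stopsrc", "-s", "clconfd"]
    else:
        stop = ["kill", "-9", str(clconfd_pid)]
    # Copied as written in the APAR; nothing is added to the command.
    start = ["clmgr", "online", "node", "START_CAA=yes"]
    return [stop, start]


def apply_local_fix(clconfd_pid=None, dry_run=True):
    """Print the commands (dry run) or execute them in order."""
    commands = local_fix_commands(clconfd_pid)
    for argv in commands:
        if dry_run:
            print("would run:", " ".join(argv))
        else:
            # check=True stops the sequence if a command fails.
            subprocess.run(argv, check=True)
    return commands


if __name__ == "__main__":
    apply_local_fix()  # dry run by default; nothing is executed
```

    Defaulting to a dry run keeps the sketch safe to try on a system that
    is not affected; pass dry_run=False only on the node being recovered.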

Problem summary

  • An operation to start a CAA node fails because the lock
    cannot be obtained.
    

Problem conclusion

  • Do not attempt to JOIN a STOPPED node to the CAA cluster.
    

Temporary fix

Comments

  • 7100-04 - use AIX APAR IJ09741
    7100-05 - use AIX APAR IJ09764
    7200-01 - use AIX APAR IJ13807
    7200-02 - use AIX APAR IJ13519
    7200-03 - use AIX APAR IJ12372
    

APAR Information

  • APAR number

    IJ09741

  • Reported component name

    AIX V7.1

  • Reported component ID

    5765H4000

  • Reported release

    710

  • Status

    CLOSED PER

  • PE

    NoPE

  • HIPER

    NoHIPER

  • Special Attention

    NoSpecatt / Xsystem

  • Submitted date

    2018-09-27

  • Closed date

    2018-09-28

  • Last modified date

    2019-11-14

  • APAR is sysrouted FROM one or more of the following:

  • APAR is sysrouted TO one or more of the following:

    IJ09764 IJ09871 IJ12372 IJ13519 IJ13807

Fix information

  • Fixed component name

    AIX V7.1

  • Fixed component ID

    5765H4000

Applicable component levels

  • R710 PSY U879895

       UP19/07/15 I 1000

Document Information

Modified date:
20 April 2022