All nodes in the IBM Fusion 2.9.0 rack go into a monitoring degraded state with the following error message and condition in each node monitoring CR:
- lastTransitionTime: <timestamp>
  message: 'Failed to get NetworkSwitchCR info: stale switch cr entries, switch <switch-name> lastTransitionTime time greater than <integer-number> minutes'
  observedGeneration: 1
  reason: immMonitoring
  status: 'True'
  type: Error
environmentReadiness:
- category: ComputeHealth
  message: Failed to get hardware monitoring data for the node <compute-node-name>.
  messageArgs:
  - <compute-node-name>
  messageCode: BMYMO0002W
  messageType: WARNING
Note: Node upsize and configuration might also fail because of the same issue.
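To see which switch the error refers to, the message quoted above can be searched for in a saved copy of a node monitoring CR. The following is a minimal sketch, not an IBM-provided tool: the function name stale_switch_names and the file name in the usage comment are hypothetical, and the function only matches the message text shown above.

```shell
# Reads YAML text on stdin and prints each switch name that appears in a
# "stale switch cr entries" error message, one name per line, without repeats.
# Prints nothing if the message is not present.
stale_switch_names() {
  sed -n 's/.*stale switch cr entries, switch \([^ ]*\) lastTransitionTime.*/\1/p' | sort -u
}

# Usage (the file name is a placeholder for a CR you have saved as YAML):
#   stale_switch_names < node-monitoring-cr.yaml
```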
Workaround
Follow these steps to resolve the issue:
- Take a backup of the network switch CRs. To download the switch CR instances:
  - Log in to the OpenShift web console.
  - Go to Administration > CustomResourceDefinitions.
  - Enter "switch" in the search bar and select the Switch CRD from the search results.
  - Go to the Instances tab.
  - Click each switch instance and go to the YAML tab.
  - Click Download to save the switch instance.
- Run the following commands using the OpenShift CLI:
    oc project ibm-spectrum-fusion-ns
    oc get switch
  Example output:
    NAME              AGE
    hspeed1-isfdeld   6d7h
    hspeed2-isfdeld   6d8h
    mgmt1-isfdeld     6d7h
    mgmt2-isfdeld     6d8h
- Delete the hspeed1 and hspeed2 switch CRs using the following command:
    oc delete switch <hspeed-name>
  For example:
    oc delete switch hspeed1-isfdeld
    oc delete switch hspeed2-isfdeld
- Delete the network operator pod using the following commands:
    oc get pods | grep network
    oc delete pods <network-pod-name>
- Run the following command to check whether the hspeed1 and hspeed2 switch CRs are recreated:
    oc get switch -w
- After the hspeed1 and hspeed2 switch CRs are created, use the following command to delete the mgmt1 and mgmt2 switch CRs:
    oc delete switch <mgmt-name>
  For example:
    oc delete switch mgmt1-isfdeld
    oc delete switch mgmt2-isfdeld
- Delete the network operator pod using the following commands:
    oc get pods | grep network
    oc delete pods <network-pod-name>
- Run the following command to check whether the mgmt1 and mgmt2 switch CRs are recreated:
    oc get switch -w
- After all the switch CRs are recreated, the nodes should return to the Normal state within a few minutes.
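The steps above can also be sketched as shell functions, which may help when the procedure has to be repeated. This is an illustrative sketch under stated assumptions, not an IBM-provided script: the function names, the backup directory, and the 300-second timeout in the usage comments are hypothetical, and the network operator pod is selected by the same "network" name match used in the manual steps. The backup function is a CLI alternative to the web console download and writes one YAML file per switch CR. Nothing runs until a function is called.

```shell
#!/usr/bin/env bash
# Sketch only: wraps the manual workaround steps in functions.

NS="ibm-spectrum-fusion-ns"   # namespace used in the manual steps

# Save every switch CR to <dir>/<cr-name>.yaml.
backup_switches() {
  local dir="$1" name
  mkdir -p "$dir" || return 1
  for name in $(oc get switch -n "$NS" -o name); do
    # "-o name" prints "<resource>/<cr-name>"; keep only the CR name for the file.
    oc get "$name" -n "$NS" -o yaml > "$dir/${name##*/}.yaml" || return 1
  done
}

# Delete the named switch CRs, then delete the network operator pod
# so that the operator restarts and recreates them.
recreate_switches() {
  local name pod
  for name in "$@"; do
    oc delete switch "$name" -n "$NS" || return 1
  done
  # Same selection as `oc get pods | grep network` in the manual steps.
  for pod in $(oc get pods -n "$NS" -o name | grep network); do
    oc delete "$pod" -n "$NS" || return 1
  done
}

# Poll every 5 seconds until all named switch CRs exist again.
# Returns 1 if they are still missing after roughly <timeout> seconds.
wait_for_switches() {
  local timeout="$1"
  shift
  local waited=0 name missing
  while :; do
    missing=0
    for name in "$@"; do
      oc get switch "$name" -n "$NS" >/dev/null 2>&1 || missing=1
    done
    [ "$missing" -eq 0 ] && return 0
    [ "$waited" -ge "$timeout" ] && return 1
    sleep 5
    waited=$((waited + 5))
  done
}

# Usage, following the order of the manual steps (names from the example output):
#   backup_switches ./switch-backup
#   recreate_switches hspeed1-isfdeld hspeed2-isfdeld
#   wait_for_switches 300 hspeed1-isfdeld hspeed2-isfdeld
#   recreate_switches mgmt1-isfdeld mgmt2-isfdeld
#   wait_for_switches 300 mgmt1-isfdeld mgmt2-isfdeld
```

Keeping the high-speed and management pairs as separate calls preserves the order of the manual procedure, in which the mgmt switch CRs are deleted only after the hspeed switch CRs have been recreated.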
Document Information
Modified date: 12 March 2025
UID: ibm17185468