Removing and replacing a node canister
Use this procedure to remove a faulty node canister and replace it with a new node canister.
You can remove the parts from the faulty node canister and reinstall them into the new node canister.
You can also use this procedure when you need to replace parts inside the node canister.
About this task
Notes:
- Ensure the FRU part number (P/N) of the replacement part matches that of the failed node canister, or is an approved substitute. The FRU P/N is identified on the label of the canister and on the FRU packaging.
- Do not operate the control enclosure with a node canister removed for longer than 16 minutes. Operating for longer than this period might cause the enclosure to shut down due to overheating.
- No tools are required to complete this task. Do not remove or loosen any screws.
- Use care when you remove a node canister from the control enclosure. The node canister is long and its center of gravity is far forward. It can be helpful to have a lift or other sturdy, flat surface ready to receive the node canister during removal.
- If the node canister is being replaced because of a failure to boot, refer to Resolving a problem with failure to boot.
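The 16-minute limit in the notes above can be tracked with a small helper. The following Python sketch is illustrative only: the function names are hypothetical and are not part of any system tooling, and the only value taken from this procedure is the 16-minute limit.

```python
from datetime import datetime, timedelta

# Limit taken from the note above: no more than 16 minutes with a canister removed.
MAX_REMOVED = timedelta(minutes=16)


def reinstall_deadline(removed_at: datetime) -> datetime:
    """Return the latest time by which a canister should be back in the slot."""
    return removed_at + MAX_REMOVED


def time_remaining(removed_at: datetime, now: datetime) -> timedelta:
    """Return the time left before the limit is reached (never negative)."""
    return max(reinstall_deadline(removed_at) - now, timedelta(0))
```

For example, a canister removed at 10:00 should be replaced by 10:16, and at 10:10 six minutes remain.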
Procedure
- Review the Event Log to identify the faulty node canister.
- Review Procedure: Understanding system volume dependencies to identify any volume dependencies on the node canister.
- Follow Procedure: Powering off a node canister to verify that the hosts will not lose access to data in volumes.
- From the rear of the control enclosure, label each cable and remove it from the node canister.
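Recording which labeled cable was attached to which port makes the later reconnection step easier to verify. The following Python sketch shows one way to keep and check such a record; the port names and cable labels are hypothetical placeholders, not identifiers used by the system.

```python
def cabling_mismatches(recorded: dict[str, str], reconnected: dict[str, str]) -> list[str]:
    """Return the ports whose cable label is missing or differs from the record."""
    return sorted(
        port for port, label in recorded.items()
        if reconnected.get(port) != label
    )


# Hypothetical port names and cable labels, for illustration only.
recorded = {"eth1": "A1", "eth2": "A2", "fc1": "B1"}
reconnected = {"eth1": "A1", "eth2": "B1", "fc1": "A2"}
print(cabling_mismatches(recorded, reconnected))  # ['eth2', 'fc1']
```

An empty list means that every recorded port has its original cable attached again.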
Removing the faulty node canister
- Remove the new node canister from its packaging. Ensure that the FRU P/N of the replacement node canister matches that of the failed node canister or that the new P/N is an approved substitute.
- Remove the faulty node canister from the control enclosure, as described in Reseating a node canister in the control enclosure, and place it on a flat, level surface.
- Remove the covers from the faulty and replacement node canisters and set them aside, as described in Removing and replacing the cover of a node canister.
- Complete the following procedures to remove parts from the faulty node canister and install them in the replacement canister.
Replacing the new node canister
- Replace the cover of the new node canister, as described in Removing and replacing the cover of a node canister.
- Install the new node canister into the control enclosure, as described in Reseating a node canister in the control enclosure.
- Reconnect the cables that were removed in step 4 to the appropriate ports in the replacement node canister.
- If the node canister was communicating with other node canisters by using RDMA over Ethernet, use the Service Assistant Tool or the sainfo lsnodeip command to check whether the node IP configuration was lost. If needed, use the Service Assistant Tool or the satask chnodeip command to set the node IP address.
- Connect directly to the CLI of the replacement node canister by using one of the following methods:
  - Through the technician port (DHCP) at 192.168.0.1
  - Through the service IP address on Ethernet port 1, if known (use a blank USB key to retrieve the address if needed)
  If you cannot connect, refer to https://www.ibm.com/docs/en/flashsystem-9x00/8.5.x?topic=rp-resolving-problem-failure-boot-2
  After you connect, issue the sainfo lsservicenodes command to verify the node status.
  Note: Node error code 545 is expected. For more information, see 545. If error 545 is present, issue the satask chbootdrive -replacecanister command to update the drives to match the serial number of the new node canister. The node automatically reboots and joins the cluster.
  To help identify the node canister, the inside of the release levers is labeled with the serial number.
- Use the management GUI or service assistant GUI to check that the node canister is online (or is Active) in the system.
- Review the management GUI to determine that all errors are resolved.
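The service CLI portion of the steps above can be sketched as follows. This Python example only builds the command lines and does not run anything. The sainfo and satask commands and the technician port address come from this procedure; the SSH transport, the superuser account name, and the helper function names are assumptions made for illustration, so substitute the access method and credentials that apply to your system.

```python
TECHNICIAN_PORT_IP = "192.168.0.1"  # technician port address from the procedure
SERVICE_USER = "superuser"          # assumption: replace with your own account


def ssh_argv(cli_command: str, host: str = TECHNICIAN_PORT_IP,
             user: str = SERVICE_USER) -> list[str]:
    """Build an argument list for one service CLI command over SSH (not executed)."""
    return ["ssh", f"{user}@{host}", cli_command]


def follow_up_commands(error_codes: set[str]) -> list[str]:
    """Given node error codes read from sainfo lsservicenodes, list follow-up commands."""
    if "545" in error_codes:
        # Error 545 is expected after a canister replacement, per the note above.
        return ["satask chbootdrive -replacecanister"]
    return []


# Check the node status first, then act on the error codes that were observed.
print(ssh_argv("sainfo lsservicenodes"))
for command in follow_up_commands({"545"}):
    print(ssh_argv(command))
```

Keeping command construction separate from execution lets the planned commands be reviewed before anything is sent to the node.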