Abstract
With Db2 Analytics Accelerator maintenance level 7.5.10.1 or later, the use of 'RDMA over Converged Ethernet' (RoCE) Express cards for the cluster network in Db2 Analytics Accelerator on Z multi-node installations is supported and highly recommended. The cluster network is used for inter-node communication.
RoCE Express cards provide the following benefits when compared with HiperSockets:
- RoCE cards consume significantly less CPU resources than HiperSockets; thus, more of the power of the configured IFLs is used for workload execution.
- RoCE cards increase the available CPU capacity for workload execution by ~10% compared with HiperSockets.
- RoCE cards enhance the operational system stability in high workload situations.
This document describes how to configure a cluster network based on RoCE Express cards for a Db2 Analytics Accelerator on Z multi-node installation. If the multi-node installation currently uses HiperSockets for the cluster network, the document also describes the steps needed to migrate from the HiperSockets-based cluster network configuration to a RoCE-based one.
Content
![Multi-node installation with RoCE cards for cluster network](/support/pages/system/files/inline-images/RoceCardArchitecture.png)
The Appendix of this document contains screenshots that show where to find the FIDs on the HMC and in the DPM.
- name: The name of the network interface definition. It must be the same for all LPARs, for example "roce".
- ipv4: The IP address of the network interface of this LPAR. The IP address must be different for each LPAR. The IP addresses of all LPARs of a multi-node cluster must all be in the same subnet.
- mode: The bonding mode. Use "balance-rr".
- FID: The function ID of a RoCE card port for the LPAR. Each RoCE card has two ports, and each port has its own FID. The function IDs per RoCE card and port are different for each LPAR.
- Note: If a defined FID contains upper-case characters, for example "000A", then provide them as lower-case characters in the JSON configuration file, for example "000a". See APAR PH57620 for more information.
{
  "version": "7.5.11",
  "accelerator_name": "IDAAONZ",
  "accelerator_description": "Accelerator with RoCE cards",
  "accelerator_type": "multi-node",
  "db2_pairing_ipv4": "10.20.92.119/24",
  "network_interface_bindings": {
    "mgmt_nw": "activation-profile",
    "db2_nw": "db2",
    "cluster_nw": "roce"
  },
  "runtime_environments": [
    {
      "cpc_name": "A100",
      "head": {
        "lpar_name": "LP01",
        "network_interfaces": [
          {
            "name": "db2",
            ...
          },
          {
            "name": "roce",
            "ipv4": "192.168.101.229/24",
            "bond_settings": {
              "mode": "balance-rr",
              "slaves": [
                {
                  "FID": "0000"
                },
                {
                  "FID": "0001"
                },
                {
                  "FID": "0002"
                },
                {
                  "FID": "0003"
                }
              ]
            }
          }
        ],
        ....
      },
      ....
    }
    ...
}
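The constraints listed for the RoCE interface definition (same name on all LPARs, a distinct IP address per LPAR within one subnet, bonding mode "balance-rr", lower-case FIDs) can be checked before the file is uploaded. The following Python sketch is not part of the product. `check_roce_interfaces` is a hypothetical helper, and it assumes that you have already collected the "roce" interface definition of each LPAR into a list.

```python
import ipaddress


def check_roce_interfaces(interfaces):
    """Return a list of problems found in the per-LPAR RoCE interface definitions.

    `interfaces` is assumed to hold one dict per LPAR, shaped like the
    "roce" entry in the sample configuration.
    """
    problems = []

    # The interface name must be identical on all LPARs.
    names = {nic["name"] for nic in interfaces}
    if len(names) > 1:
        problems.append(f"interface names differ across LPARs: {sorted(names)}")

    # Each LPAR needs its own address, and all addresses must share one subnet.
    parsed = [ipaddress.ip_interface(nic["ipv4"]) for nic in interfaces]
    if len({p.network for p in parsed}) > 1:
        problems.append("IP addresses are not all in the same subnet")
    if len({p.ip for p in parsed}) != len(parsed):
        problems.append("IP addresses are not unique per LPAR")

    for nic in interfaces:
        bond = nic["bond_settings"]
        if bond["mode"] != "balance-rr":
            problems.append(f"unexpected bonding mode {bond['mode']!r}")
        for slave in bond["slaves"]:
            fid = slave["FID"]
            # FIDs must be given in lower case (see APAR PH57620).
            if fid != fid.lower():
                problems.append(f"FID {fid!r} must be lower case")

    return problems
```

An empty result means that none of the checked rules is violated. It does not prove that the FIDs exist on the CPC.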
- Do not remove the HiperSockets interface definition so that the "network_interface_bindings" section initially continues to refer to the HiperSockets cluster network.
- Add the network interface definitions for the RoCE cards to each node in the "runtime_environments" section as a currently unused network.
{
  "version": "7.5.11",
  "accelerator_name": "IDAAONZ",
  "accelerator_description": "Accelerator with RoCE cards",
  "accelerator_type": "multi-node",
  "db2_pairing_ipv4": "10.20.92.119/24",
  "network_interface_bindings": {
    "mgmt_nw": "activation-profile",
    "db2_nw": "db2",
    "cluster_nw": "cluster"
  },
  "runtime_environments": [
    {
      "cpc_name": "A100",
      "head": {
        "lpar_name": "LP01",
        "network_interfaces": [
          {
            "name": "db2",
            ...
          },
          {
            "name": "cluster",
            "ipv4": "192.168.100.229/24",
            "device": "0.0.0008"
          },
          {
            "name": "roce",
            "ipv4": "192.168.101.229/24",
            "bond_settings": {
              "mode": "balance-rr",
              "slaves": [
                {
                  "FID": "0000"
                },
                {
                  "FID": "0001"
                },
                {
                  "FID": "0002"
                },
                {
                  "FID": "0003"
                }
              ]
            }
          }
        ],
        ....
      },
      ....
    }
    ...
}
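In this intermediate state, the cluster network binding still points to the HiperSockets interface, while the RoCE interface is defined on every node but not yet bound. The following Python sketch illustrates a check for that state. It is an illustration only: `iter_nodes` and `staged_for_migration` are hypothetical helpers, and the sketch assumes that a node is any JSON object that carries a "network_interfaces" list, so that it does not depend on the key names used for data nodes.

```python
def iter_nodes(obj):
    """Yield every dict that carries a "network_interfaces" list."""
    if isinstance(obj, dict):
        if isinstance(obj.get("network_interfaces"), list):
            yield obj
        for value in obj.values():
            yield from iter_nodes(value)
    elif isinstance(obj, list):
        for item in obj:
            yield from iter_nodes(item)


def staged_for_migration(config, new_name="roce"):
    """Return True if `new_name` is defined on all nodes but not yet bound."""
    bound = config["network_interface_bindings"]["cluster_nw"]
    if bound == new_name:
        return False  # the binding has already been switched

    nodes = list(iter_nodes(config))
    if not nodes:
        return False  # nothing to check

    for node in nodes:
        names = {nic["name"] for nic in node["network_interfaces"]}
        # Each node needs both the currently bound and the new interface.
        if bound not in names or new_name not in names:
            return False
    return True
```

The function takes the configuration as a parsed dictionary, for example the result of `json.load` on a complete configuration file.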
![RoCE interface availability with green status on head node](/support/pages/system/files/inline-images/RoceVerifyHead.png)
![RoCE interface availability with green status on data node](/support/pages/system/files/inline-images/RoCEVerifyData.png)
- In the Admin UI, open the Diagnosis panel and select the Test your network tab.
- Starting from the head node, run ping commands to check the RoCE card cluster network IP address of each data node.
- Continue with data1 node. Run a ping command to check the head node and the remaining data nodes by using the cluster network IP address of each node.
- Continue with the remaining data nodes until you have pinged each node from each node on the cluster network IP address.
- Edit the JSON configuration file again and, in the "network_interface_bindings" section, change "cluster_nw":"cluster" to "cluster_nw":"roce". Note that "cluster" and "roce" are only the sample names used in the JSON snippets in Step 2. You can use different names in your JSON configuration file.
- Upload and apply the changed JSON configuration file using the Admin UI.
- Perform a Reset of the accelerator (do not select the wipe check box!) in the Admin UI to apply the new RoCE card cluster network across all accelerator nodes. This step incurs a short accelerator outage until all services have been started again. If the reset fails, perform a Shutdown on the Admin UI, followed by a re-activation of all accelerator LPARs to initiate a new restart.
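The binding change in the configuration file can be sketched in Python as follows. This is a minimal illustration, not an IBM-provided tool: `switch_cluster_binding` and `interface_names_per_node` are hypothetical helpers, the configuration is assumed to be available as a parsed dictionary, and a node is assumed to be any object with a "network_interfaces" list. The upload and the Reset are still done in the Admin UI.

```python
import copy


def interface_names_per_node(obj):
    """Collect one set of interface names for every node-like dict."""
    found = []
    if isinstance(obj, dict):
        nics = obj.get("network_interfaces")
        if isinstance(nics, list):
            found.append({nic["name"] for nic in nics})
        for value in obj.values():
            found.extend(interface_names_per_node(value))
    elif isinstance(obj, list):
        for item in obj:
            found.extend(interface_names_per_node(item))
    return found


def switch_cluster_binding(config, new_name="roce"):
    """Return a copy of `config` whose cluster network is bound to `new_name`."""
    per_node = interface_names_per_node(config)
    # Refuse to switch if any node lacks the new interface definition.
    if not per_node or any(new_name not in names for names in per_node):
        raise ValueError(f"interface {new_name!r} is not defined on every node")

    updated = copy.deepcopy(config)  # leave the caller's dictionary untouched
    updated["network_interface_bindings"]["cluster_nw"] = new_name
    return updated
```

Working on a copy keeps the original dictionary available in case you want to compare both versions before uploading the file.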
Optionally, you can remove the HiperSockets "network_interfaces" definition from the JSON configuration file for all nodes and upload the file again. Following the example used in Step 2, the following snippet shows the HiperSockets "network_interfaces" definition to be removed from the "runtime_environments" section of all nodes.
{
  "name": "cluster",
  "ipv4": "192.168.100.229/24",
  "device": "0.0.0008"
},
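If the configuration file covers many nodes, the removal can be scripted. The following Python sketch shows one possible approach under stated assumptions: `remove_interface` is a hypothetical helper, the configuration is a parsed dictionary, and every object with a "network_interfaces" list is treated as a node. As a safeguard, the sketch refuses to remove an interface that is still referenced in "network_interface_bindings".

```python
import copy


def remove_interface(config, name="cluster"):
    """Return a copy of `config` without the interface `name` on any node."""
    bindings = config["network_interface_bindings"]
    if name in bindings.values():
        raise ValueError(f"{name!r} is still bound; switch the binding first")

    updated = copy.deepcopy(config)

    def prune(obj):
        # Walk the structure and filter every "network_interfaces" list.
        if isinstance(obj, dict):
            nics = obj.get("network_interfaces")
            if isinstance(nics, list):
                obj["network_interfaces"] = [
                    nic for nic in nics if nic.get("name") != name
                ]
            for value in obj.values():
                prune(value)
        elif isinstance(obj, list):
            for item in obj:
                prune(item)

    prune(updated)
    return updated
```

The result can be written back with `json.dump` and uploaded through the Admin UI as described above.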
![image-20230913113104-1](/support/pages/system/files/inline-images/image-20230913113104-1.png)
![image-20230913113148-2](/support/pages/system/files/inline-images/image-20230913113148-2.png)
![image-20230913113212-3](/support/pages/system/files/inline-images/image-20230913113212-3.png)
![image-20230913113225-4](/support/pages/system/files/inline-images/image-20230913113225-4.png)
![image-20230913114250-1](/support/pages/system/files/inline-images/image-20230913114250-1.png)
![image-20230913114907-2](/support/pages/system/files/inline-images/image-20230913114907-2.png)
Document Information
Modified date: 26 March 2024
UID: ibm17031391