High availability setup

High availability setups for applications on Linux® on IBM Z® or LinuxONE typically use path redundancy for network connections.

Identifying suitable PCI functions

To avoid outages during PCI network adapter maintenance, use redundant paths through PCI functions. Suitable PCI functions have different values in /sys/bus/pci/devices/<pci_id>/pfip/segment*, where <pci_id> is the function address. The pfip/segment* attributes provide an abstract indication of the path that is used to access the PCI function, with segment0 having the highest significance. Comparing the segment values of two or more PCI functions indicates the degree of isolation between them. If possible, choose PCI functions with a high degree of isolation. Read more information about segments and other PCI device information.

The following example shows the segment0 values of the PCI functions with function addresses 000d:00:00.0 and 00b5:00:00.0:
# cat /sys/bus/pci/devices/000d:00:00.0/pfip/segment0
0x01
# cat /sys/bus/pci/devices/00b5:00:00.0/pfip/segment0
0x03
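
The comparison can also be scripted. The following is a minimal sketch, not an official tool. It assumes that a difference in a more significant segment indicates greater isolation, and that the segment numbers are single digits so that the shell glob returns them in order of significance. The function name first_differing_segment and the SYSFS_PCI variable are hypothetical and are introduced only for this illustration.

```shell
# Sketch only: print the most significant pfip segment at which two PCI
# functions differ. SYSFS_PCI is a hypothetical override that lets you point
# the function at a test directory instead of the real sysfs tree.
SYSFS_PCI="${SYSFS_PCI:-/sys/bus/pci/devices}"

first_differing_segment() {
    fds_a="$SYSFS_PCI/$1/pfip"
    fds_b="$SYSFS_PCI/$2/pfip"
    fds_compared=0
    # The glob sorts lexically, which matches the order of significance
    # as long as the segment numbers are single digits (an assumption).
    for fds_file in "$fds_a"/segment*; do
        fds_name="${fds_file##*/}"
        [ -r "$fds_file" ] && [ -r "$fds_b/$fds_name" ] || continue
        fds_compared=1
        if [ "$(cat "$fds_file")" != "$(cat "$fds_b/$fds_name")" ]; then
            echo "$fds_name"
            return 0
        fi
    done
    if [ "$fds_compared" -eq 0 ]; then
        echo "no pfip segments found" >&2
        return 2
    fi
    # All segments that could be compared hold the same values.
    echo "identical"
    return 1
}
```

For the two functions in the preceding example, first_differing_segment 000d:00:00.0 00b5:00:00.0 prints segment0, because the values already differ in the most significant segment.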

Bonding for PCI network interface redundancy

On Linux, you can use the bonding device driver to create bonded interfaces. For more information about bonded interfaces, see Linux Channel Bonding Best Practices and Recommendations. That publication describes bonding of OSA-Express based interfaces, but its descriptions of the bonding device driver also apply to PCI based interfaces. In particular, as for OSA-Express, the BONDING_MODULE_OPTS specification must include the fail_over_mac option. The name of the configuration variable can vary by distribution.
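
Because the configuration variable differs between distributions, it can help to confirm the setting on the running bonded interface instead. The following minimal sketch reads the fail_over_mac attribute that the bonding driver exposes in sysfs. It assumes that the attribute reads as a name followed by a number, for example "active 1", and it does not judge which value other than none suits your setup. The function name check_fail_over_mac and the SYSFS_NET variable are hypothetical.

```shell
# Sketch only: report the fail_over_mac setting of a bonded interface.
# SYSFS_NET is a hypothetical override that lets you point the function
# at a test directory instead of the real sysfs tree.
SYSFS_NET="${SYSFS_NET:-/sys/class/net}"

check_fail_over_mac() {
    cfm_file="$SYSFS_NET/$1/bonding/fail_over_mac"
    if [ ! -r "$cfm_file" ]; then
        echo "$1: not a bonded interface or bonding attributes unavailable" >&2
        return 2
    fi
    # The attribute is assumed to read as "<name> <number>", e.g. "active 1".
    cfm_name=""
    read -r cfm_name cfm_rest < "$cfm_file" || true
    case "$cfm_name" in
        none|"")
            echo "$1: fail_over_mac is not set"
            return 1
            ;;
        *)
            echo "$1: fail_over_mac=$cfm_name"
            return 0
            ;;
    esac
}
```

For example, check_fail_over_mac bond0 returns a non-zero exit status if the option is missing, which makes the function usable in a post-configuration check.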

Path redundancy for SMC-R connections

Use SMC-R link groups to guard against link failure in SMC-R connections. SMC-R automatically creates link groups for PCI functions with matching PNET IDs.

To safeguard against failure of the TCP/IP connection, use a bonded interface that combines paths through two different OSA-Express adapters.

Figure 1. HA setup for SMC-R

The graphic is explained in the text that follows.

Figure 1 shows a Linux instance with four paths to an external LAN.

PCI functions with a high level of isolation (see Identifying suitable PCI functions) are selected. In Linux, these paths result in network interfaces eno181 and eno13.

The paths through the two OSA-Express adapters result in network interfaces eth0 and eth1. These two interfaces are bonded into an interface bond0.

In its IOCDS, the hardware configuration assigns the same PNET ID, PNET1, to eno181, eno13, eth0, and eth1. This common PNET ID associates the four interfaces, and by extension also the bonded interface bond0. The two PCI based interfaces form an SMC link group.

A connection that is initiated through bond0 has a redundant TCP/IP connection. The SMC link group provides failover for RDMA traffic.

Situations to consider

External network issues
Even if the PCI network interface is fully operational, communication with the target system can fail due to external network problems, such as faulty switches or routers. Use standard mechanisms like Spanning Tree Protocol (STP) and dynamic routing to mitigate single points of failure in the network. These mechanisms are outside the scope of this document.
Carrier loss
If a physical link of a PCI network adapter is lost or inactive (for example, due to cable disconnection or the loss of an optical signal), all interfaces on that adapter report NO-CARRIER status. In a bonding configuration, the bond interface automatically switches to an alternate interface until the carrier is restored. You can simulate this condition for testing by disconnecting the network cable or disabling the corresponding port on the hardware switch.
PCI device recovery
Device recovery can be initiated by firmware, by the device driver, or manually by using the zpcictl command. For more details about zpcictl, see Device Drivers, Features, and Commands: Chapter 39 - PCI Express support.
A recovery action includes a controlled shutdown and a subsequent re-enabling of the device. As a result, the network interfaces corresponding to the PCI function can be destroyed and re-created. For a successful recovery upon recreation, ensure that configuration settings like IP addresses, bond, and VLAN are set up persistently in your network management tools.
PCI function in standby state
A PCI function can be configured offline either from the SE or HMC or by the owning Linux system. Firmware-initiated code updates can include setting PCI functions offline and online again. Note that when a PCI function is configured offline, the corresponding network interface is destroyed. The network interface is re-created when the PCI function is configured online again. For a successful recovery, ensure that configuration settings like IP addresses, bond, and VLAN are set up persistently in your network management tools.
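
When you test these situations, it is useful to see at a glance whether an interface is missing (destroyed during recovery or while the PCI function is in standby state), reports NO-CARRIER, or is up, and which member the bond currently uses. The following minimal sketch derives this from the standard carrier and bonding/active_slave attributes in sysfs. The function names iface_state and bond_report and the SYSFS_NET variable are hypothetical, and the interface names in the usage note are taken from Figure 1 only as an illustration.

```shell
# Sketch only: summarize the state of a bond and its expected members.
# SYSFS_NET is a hypothetical override that lets you point the functions
# at a test directory instead of the real sysfs tree.
SYSFS_NET="${SYSFS_NET:-/sys/class/net}"

iface_state() {
    # A destroyed interface has no sysfs entry at all.
    if [ ! -e "$SYSFS_NET/$1" ]; then
        echo "missing"
        return 0
    fi
    # Reading "carrier" fails while the interface is administratively down,
    # so a read error is reported as "unknown".
    is_val=$(cat "$SYSFS_NET/$1/carrier" 2>/dev/null) || is_val=""
    case "$is_val" in
        1) echo "up" ;;
        0) echo "NO-CARRIER" ;;
        *) echo "unknown" ;;
    esac
}

bond_report() {
    br_bond="$1"
    shift
    # active_slave names the member that currently carries the traffic.
    echo "$br_bond active: $(cat "$SYSFS_NET/$br_bond/bonding/active_slave" 2>/dev/null)"
    # The expected members are passed explicitly, because a destroyed
    # interface no longer appears in the bond's own member list.
    for br_if in "$@"; do
        echo "$br_if: $(iface_state "$br_if")"
    done
}
```

For example, bond_report bond0 eth0 eth1 prints one line for the bond and one line per expected member. Running it before and after a simulated carrier loss shows whether the bond switched to the alternate interface.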