NFS tiebreaker

The NFS tiebreaker resolves tie situations based on reserve files that are stored on an NFS v4 server. The same NFS server can be used for multiple System Automation for Multiplatforms clusters. If one server is used for multiple NFS tiebreakers, each tiebreaker needs a reserve file with a unique name.

In a cluster split situation, no more than one node can have quorum or pending quorum at any time. If the node that obtained quorum fails afterward, the other nodes automatically try to obtain quorum based on the challenger-defender protocol.
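The idea behind reserve-based arbitration can be illustrated with a minimal sketch: whichever contender creates the reserve file exclusively wins the tie. This is a conceptual illustration only, with hypothetical local paths; the actual challenger-defender protocol is implemented internally by System Automation for Multiplatforms against the file on the NFS server.

```shell
#!/bin/sh
# Conceptual sketch only: exclusive creation of a reserve file decides
# the tie. The path and file name here are hypothetical.
RESERVE=${RESERVE:-/tmp/tb-demo/NFS_reserve_file_demo}
mkdir -p "$(dirname "$RESERVE")"

# noclobber makes the redirection fail if the file already exists,
# so exactly one contender succeeds in creating it.
if (set -o noclobber; echo "$(hostname)" > "$RESERVE") 2>/dev/null; then
    echo "reserve acquired: quorum granted"
else
    echo "reserve already held: quorum denied"
fi
```

Running the sketch a second time, while the file still exists, takes the "quorum denied" branch, which mirrors why deleting a reserve file during a split is dangerous.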

The NFS server can be on any system that supports running NFS v4. If you use an NFS server that is compliant with the newer NFS v4.1 or pNFS standard for System Automation for Multiplatforms tiebreakers, make sure that the replication and failover capabilities of the NFS server are disabled. Use the NFS server for System Automation for Multiplatforms tiebreaker purposes only.

NFS v4 client libraries must be installed on all System Automation for Multiplatforms cluster nodes.
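A quick way to confirm that a node's NFS v4 client support works is to attempt a mount of the tiebreaker export by hand. The server name, export path, and mount point below are placeholders, not values defined by the product; the script only prints the mount command unless explicitly told to run it.

```shell
#!/bin/sh
# Illustrative pre-check: server name, export, and mount point are
# placeholders for your own environment.
NFS_SERVER=${NFS_SERVER:-nfsquorum.example.com}
NFS_EXPORT=${NFS_EXPORT:-/tiebreaker}
MOUNT_POINT=${MOUNT_POINT:-/mnt/tb-check}

CMD="mount -t nfs4 $NFS_SERVER:$NFS_EXPORT $MOUNT_POINT"
echo "$CMD"

# Execute the mount only when explicitly requested; it needs root
# privileges and a reachable NFS v4 server.
if [ "${RUN_MOUNT:-0}" = "1" ]; then
    mkdir -p "$MOUNT_POINT"
    $CMD && echo "NFS v4 mount OK" && umount "$MOUNT_POINT"
fi
```

If the manual mount fails, the missing client libraries or a blocked network path would also prevent System Automation for Multiplatforms from establishing the tiebreaker mount.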

An example scenario for using an NFS tiebreaker is a three-site setup. Two sites host a set of two-node clusters, and the tiebreaker is placed on the third site. A disk tiebreaker cannot be used because it requires a SAN setup that does not necessarily span all three sites. It is also not possible to make any assumptions about the network topology, so no network device on the third site can be chosen as the destination address for the network tiebreaker. In this case, the third site can host the NFS v4 server that is used as the tiebreaker.

If the NFS quorum server is down or not accessible during a cluster split, no cluster node gets quorum. This behavior is similar to the disk tiebreaker, where no node gets quorum if the disk device fails or is unreachable. Make sure that the NFS quorum server runs permanently and reliably.
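Because an unreachable quorum server leaves no node with quorum, it is worth probing the server's availability from each cluster node as part of regular monitoring. A minimal sketch, assuming a hypothetical host name and the default NFS TCP port 2049:

```shell
#!/bin/bash
# Simple reachability probe; the host name is a placeholder. NFS v4
# serves on TCP port 2049 by default. A failure here means the
# tiebreaker would be unavailable in a split situation.
NFS_SERVER=${NFS_SERVER:-nfsquorum.example.com}

if timeout 5 bash -c "exec 3<>/dev/tcp/$NFS_SERVER/2049" 2>/dev/null; then
    echo "$NFS_SERVER: NFS port reachable"
else
    echo "$NFS_SERVER: NFS port NOT reachable"
fi
```

This checks only TCP connectivity, not that the export is mountable; a periodic test mount gives a stronger guarantee.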

System Automation for Multiplatforms mounts the NFS file system at various stages on the cluster nodes, but not periodically:
Initialize
The mount is established when the NFS tiebreaker is set as the active tiebreaker, during the Initialize operation. The same happens during domain or node startup. If the mount fails, the node might be unable to join the domain.
Reserve
During the Reserve operation, before the reserve file is accessed, the NFS mount is checked and re-established if needed.
Terminate
The NFS file system is unmounted during the Terminate operation, which runs when the NFS tiebreaker is no longer the active tiebreaker, or when the domain or node is stopped.
Note: The existence of the reserve file is crucial in case of a cluster split, and deleting the reserve file can cause both nodes in a cluster to be granted quorum. Use a naming scheme for these files that allows a direct association between the reserve file and the cluster that uses it. For example, NFS_reserve_file_SAP_HA_sapnode1_sapnode2_DO_NOT_REMOVE clearly states the purpose of the file, the name of the cluster, and the names of the nodes that use the reserve file. If the file was deleted, activate the default operator tiebreaker, create the file again, and then activate the NFS tiebreaker again. For more information about the operator tiebreaker, see Configuring the tiebreaker.
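Such a name can be assembled mechanically from the cluster and node names. The values below are the ones used in the example above and stand in for your own environment:

```shell
#!/bin/sh
# Compose a reserve file name that encodes purpose, cluster, and nodes.
# CLUSTER, NODE1, and NODE2 are the example's values; substitute yours.
CLUSTER=SAP_HA
NODE1=sapnode1
NODE2=sapnode2

RESERVE_FILE="NFS_reserve_file_${CLUSTER}_${NODE1}_${NODE2}_DO_NOT_REMOVE"
echo "$RESERVE_FILE"
# → NFS_reserve_file_SAP_HA_sapnode1_sapnode2_DO_NOT_REMOVE
```

Creating the file on the NFS export with this name (for example, with touch) makes its purpose obvious to anyone browsing the server, which reduces the risk of accidental deletion.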