GPFS file systems in the high availability configuration

IBM® General Parallel File System (GPFS™) software is used to share file systems across the system. The high availability (HA) configuration includes redundant GPFS network shared disk (NSD) servers and enforces dependencies between HA resources and GPFS file systems.

Multiple GPFS clusters are defined in the system:

Management GPFS cluster
This GPFS cluster is called MANAGEMENT.GPFS and contains the management host and the standby management host. It contains the file systems required for the warehouse tools and the database performance monitor, and cross mounts the cluster-wide file systems (/db2home, /dwhome, and /stage) served by the foundation GPFS cluster.
Foundation GPFS cluster
This GPFS cluster is called FOUNDATION.GPFS, and contains the administration hosts in the first roving high availability (HA) group and the data hosts in the second roving HA group. The foundation GPFS cluster provides the cross-cluster file systems to all of the other GPFS clusters in the environment.
Additional GPFS clusters
Each additional GPFS cluster contains the hosts in each subsequent roving HA group and is called hagroup#.GPFS, where # starts at 3 and increments by 1 for each additional HA group in the cluster.
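
To confirm which GPFS cluster a particular host belongs to, and which hosts that cluster contains, you can run the mmlscluster command as root on that host, for example:
/usr/lpp/mmfs/bin/mmlscluster
The exact output format depends on the GPFS version, but it includes the GPFS cluster name (for example, FOUNDATION.GPFS) and the list of member hosts with their node designations.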

To make the GPFS file systems highly available, two hosts in each high availability (HA) group are assigned as the GPFS NSD servers and run concurrently. The two hosts that act as the GPFS NSD servers are attached to the external storage and serve the GPFS file systems over the internal application network to the other hosts in the HA group. If a host that is assigned as a GPFS NSD server fails, the other GPFS NSD server remains operational and continues to serve the file systems to the client hosts. The client hosts are the remaining hosts in the HA group.
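
To verify which hosts are assigned as the NSD servers for the GPFS file systems, you can run the mmlsnsd command as root on one of the hosts in the GPFS cluster, for example:
/usr/lpp/mmfs/bin/mmlsnsd
For each file system, the output lists the disks that back it and the NSD server hosts that serve those disks; the exact format depends on the GPFS version.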

The following file systems are managed by GPFS and shared with all hosts in the system:
  • /db2home: Instance home file system
  • /dwhome: User home file system
  • /stage: Scratch space
The following file systems are managed by GPFS and shared with all hosts in the same roving HA group:
  • /db2fs/bcuaix/NODEnnnn: Containers for permanent table spaces, where nnnn represents the database partition number
  • /bkpfs/bcuaix/NODEnnnn: Backup file system for database partition nnnn
  • /db2path/bcuaix/NODEnnnn: Database directory file system for database partition nnnn
The following file systems are managed by GPFS and shared between the management host and the standby management host:
  • /opmfs: File system that is used by the database performance monitor
  • /usr/IBM/dwe/appserver_001: File system that is used by warehouse tools applications
The management host and the standby management host are assigned as the GPFS NSD servers for the high availability configuration for the management hosts.
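
To check which hosts currently have each GPFS file system mounted, you can run the mmlsmount command as root, for example:
/usr/lpp/mmfs/bin/mmlsmount all -L
The -L option lists, for each GPFS file system, the hosts on which that file system is currently mounted.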

Each GPFS cluster is mapped to an HA group, except for the foundation GPFS cluster, which spans two HA groups. The first HA group contains the administration host and the standby administration host. The second HA group contains all of the data hosts in the foundation GPFS cluster. As a rule, all of the hosts within the same HA group are able to mount all of the database partition file systems associated with that HA group. Exclusion files are configured so that the file systems used by the database partitions in the first HA group are mounted only on the administration hosts, and the file systems used by the database partitions in the second HA group are mounted only on the data hosts in the second HA group.

For example, assume that, for a system with the standard configuration, the first three active data hosts (host003, host004, and host005) and one standby data host (host006) are contained in the same roving HA group. Each active data host runs ten database partitions, so the file systems mounted across the hosts in the HA group are for database partitions 6 to 35. The exclusion files for the foundation GPFS cluster are configured so that only the host003, host004, host005, and host006 hosts mount the following file systems for database partitions 6 to 35, where nnnn represents the database partition number:
  • /db2fs/bcuaix/NODEnnnn
  • /db2path/bcuaix/NODEnnnn
  • /bkpfs/bcuaix/NODEnnnn
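To verify that these file systems are mounted only on the hosts in the HA group, you can check the mount state on those hosts from the management host, for example:
dsh -n host003,host004,host005,host006 "df -g /db2fs/bcuaix/NODE0006"
This example assumes the host names used above and checks the file system for database partition 6 (NODE0006); you can substitute any of the file systems and database partition numbers that belong to the HA group.
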
Figure 1 shows the file systems that are mounted on all hosts in the system, the file systems that are mounted on the management hosts, and the file systems that are mounted on all hosts in a core warehouse roving HA group. In each HA pair or roving HA group, two hosts must be defined as the NSD servers and two hosts must be defined as quorum managers.
Figure 1. GPFS file systems

The HA configuration for the core warehouse instance includes dependencies between the database partition resources and the /db2home instance home file system that is managed by GPFS. These dependencies prevent IBM Tivoli® System Automation for Multiplatforms (Tivoli SA MP) from attempting to start the database partitions if the /db2home file system is not mounted. The dependencies also trigger a failover if a failure occurs on the instance home file system on an active host.
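
To review the resource groups and the dependency relationships that are defined in the Tivoli SA MP domain, you can run the following commands as root on one of the core warehouse hosts, for example:
lssam
lsrel
The lssam command shows the resource groups and their current states, and the lsrel command lists the defined relationships, including the dependencies between the database partition resources and the GPFS file system resources; the exact resource and relationship names depend on your configuration.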

The GPFS file system mounts are also managed by the Tivoli SA MP software. During normal operations, the GPFS software keeps the GPFS file systems mounted on the appropriate hosts. For example, if a host is rebooted, the GPFS software automatically mounts the managed file system resources. However, if a GPFS file system is not automatically mounted by the GPFS software, or if the GPFS file system is unmounted, the Tivoli SA MP software detects that the file system is not mounted and automatically mounts it before attempting to start the core warehouse database.
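
If you need to mount the GPFS file systems on a host manually, for example after maintenance, you can run the mmmount command as root on that host:
/usr/lpp/mmfs/bin/mmmount all
This example attempts to mount all of the GPFS file systems on that host; to mount a single file system, specify its GPFS device name instead of all.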

To determine the state of the GPFS software on all hosts in the system, run the following command as root on the management host:
dsh -n ${ALL} "/usr/lpp/mmfs/bin/mmgetstate -Y | tail -1" | sort
The command should return an active state for each host, similar to the following sample output:
host01: mmgetstate::0:1:::host01:1:active:1:2:2:quorum node:(undefined):
host02: mmgetstate::0:1:::host02:1:active:1:4:5:quorum node:(undefined):
host03: mmgetstate::0:1:::host03:2:active:1:2:2:quorum node:(undefined):
host04: mmgetstate::0:1:::host04:2:active:1:4:5:quorum node:(undefined):
host05: mmgetstate::0:1:::host05:3:active:1:4:5:quorum node:(undefined):
host06: mmgetstate::0:1:::host06:4:active:1:4:5:quorum node:(undefined):
host07: mmgetstate::0:1:::host07:5:active:1:4:5::(undefined):
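
If the state for a host is reported as down or arbitrating instead of active, you can typically start the GPFS software on that host by running the mmstartup command as root, for example:
/usr/lpp/mmfs/bin/mmstartup -N host07
In this example, host07 is the host whose GPFS state is not active; substitute the appropriate host name for your system.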