GPFS logs

The GPFS log is a repository of error conditions that are detected on each node, as well as operational events such as file system mounts. The GPFS log is the first place to look when you start debugging abnormal events. Because GPFS is a cluster file system, events that occur on one node might affect system behavior on other nodes, and the GPFS logs on all nodes can contain relevant data.

The GPFS log can be found in the /var/adm/ras directory on each node. The GPFS log file is named mmfs.log.date.nodeName, where date is the time stamp when the instance of GPFS started on the node and nodeName is the name of the node. The latest GPFS log file can be found by using the symbolic file name /var/adm/ras/mmfs.log.latest.

The GPFS log from the prior startup of GPFS can be found by using the symbolic file name /var/adm/ras/mmfs.log.previous. All other files have a time stamp and node name appended to the file name.
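For example, to watch the current log in real time or to review the log from the prior startup (the paths are the symbolic file names described above; the viewer commands are just one way to read them):

# Follow the current GPFS log on this node
tail -f /var/adm/ras/mmfs.log.latest

# Review the log from the prior startup of GPFS
less /var/adm/ras/mmfs.log.previous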

At GPFS startup, log files that have not been accessed during the last 10 days are deleted. If you want to save old log files, copy them elsewhere before they age out.
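For example, a minimal way to preserve the older logs, assuming the date-stamped file names begin with the year (the archive directory is only an illustration):

# Copy date-stamped GPFS logs to an archive directory before they are deleted
mkdir -p /var/adm/ras/archive
cp -p /var/adm/ras/mmfs.log.2* /var/adm/ras/archive/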

Many GPFS log messages can be sent to syslog on Linux®. The systemLogLevel attribute of the mmchconfig command determines which GPFS log messages are sent to syslog. For more information, see mmchconfig command.
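For example, to send messages of severity error and higher to syslog (error is one of the accepted severity levels; see the mmchconfig documentation for the full list):

mmchconfig systemLogLevel=error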

This example shows the normal operational messages that appear in the GPFS log file on a Linux node:

2022-07-26_13:59:55.090-0400: runmmfs starting (1135)
2022-07-26_13:59:55.096-0400: [I] Removing old /var/adm/ras/mmfs.log.* files:
2022-07-26_13:59:55.212-0400: runmmfs: [I] Unloading modules from /lib/modules/3.10.0-514.el7.x86_64/extra
2022-07-26_13:59:55.312-0400: runmmfs: [I] Loading modules from /lib/modules/3.10.0-514.el7.x86_64/extra
Module Size Used by
mmfs26 2842309 0
mmfslinux 824803 1 mmfs26
tracedev 48529 2 mmfs26,mmfslinux
2022-07-26_13:59:56.321-0400: runmmfs: [I] Invoking /usr/lpp/mmfs/bin/mmfsd
2022-07-26_13:59:56.331-0400: [I] This node has a valid advanced license
2022-07-26_13:59:56.330-0400: [I] Initializing the fast condition variables at 0x7F5821FA5E00 ...
2022-07-26_13:59:56.331-0400: [I] mmfsd initializing. {Version: 5.1.4.0 Built: May 20 2022 16:49:25} ...
2022-07-26_13:59:56.353-0400: [I] Tracing in overwrite mode
2022-07-26_13:59:56.353-0400: [I] Cleaning old shared memory ...
2022-07-26_13:59:56.353-0400: [I] First pass parsing mmfs.cfg ...
2022-07-26_13:59:56.353-0400: [I] Enabled automated long waiter detection.
2022-07-26_13:59:56.353-0400: [I] Enabled automated long waiter debug data collection.
2022-07-26_13:59:56.353-0400: [I] Enabled automated expel debug data collection.
2022-07-26_13:59:56.353-0400: [I] Verifying minimum system memory configurations.
2022-07-26_13:59:56.353-0400: [I] The system memory configuration is 64235 MiB
2022-07-26_13:59:56.353-0400: [I] The daemon memory configuration hard floor is 1536 MiB
2022-07-26_13:59:56.353-0400: [I] Initializing the main process ...
2022-07-26_13:59:56.358-0400: [I] CreateLocalConnTab: err 0
2022-07-26_13:59:56.359-0400: [I] Second pass parsing mmfs.cfg ...
2022-07-26_13:59:56.386-0400: [I] Calling parsing /var/mmfs/gen/local.cfg ...
2022-07-26_13:59:56.386-0400: [I] Initializing NUMA support ...
2022-07-26_13:59:56.387-0400: [I] NUMA loaded platform library libnuma.
2022-07-26_13:59:56.387-0400: [I] NUMA BIOS/platform support for NUMA is enabled.
2022-07-26_13:59:56.387-0400: [I] NUMA Cgroup Version V1; is default cgroup? yes; is system cgroup? yes; Cgroup path: /sys/fs/cgroup/systemd/system.slice/gpfs.service
2022-07-26_13:59:56.387-0400: [I] NUMA mmfsd running on CPUs 0-31 with NUMA policy MPOL_DEFAULT
2022-07-26_13:59:56.387-0400: [I] NUMA discover num numa nodes 2 num numa mem nodes 2 numa_num_configured_cpus 32 get_nprocs_conf 32 numa_max_node 1
2022-07-26_13:59:56.387-0400: [I] NUMA discover system CPUs present 0-31
2022-07-26_13:59:56.387-0400: [I] NUMA discover system CPUs online 0-31
2022-07-26_13:59:56.387-0400: [I] NUMA discover system NUMA nodes has_memory 0-1
2022-07-26_13:59:56.387-0400: [I] NUMA discover system NUMA nodes has_normal_memory 0-1
2022-07-26_13:59:56.387-0400: [I] NUMA discover cpuset cpus count 32
2022-07-26_13:59:56.387-0400: [I] NUMA discover cpuset node count 2
2022-07-26_13:59:56.387-0400: [I] NUMA discover cpuset nodes 0-1
2022-07-26_13:59:56.387-0400: [I] NUMA discover cpuset node 0 with CPUs 0-7,16-23
2022-07-26_13:59:56.387-0400: [I] NUMA discover cpuset node 1 with CPUs 8-15,24-31
2022-07-26_13:59:56.387-0400: [I] NUMA discover cpuset nodes max NUMA distances of 11 for nodes with vCPUs and 11 for all nodes.
2022-07-26_13:59:56.387-0400: [I] GPFS vCPU limits include all vCPUs that Linux sees as online or possibly online via hot add, ht/smt changes, etc.
2022-07-26_13:59:56.387-0400: [I] GPFS detected 32 vCPUs.
2022-07-26_13:59:56.387-0400: [I] GPFS detected NUMA Complexity Metric values of 2 for nodes with vCPUs and 2 for all nodes.
2022-07-26_13:59:56.387-0400: [I] NUMA discover node 0 online CPUs: 0-7,16-23
2022-07-26_13:59:56.387-0400: [I] NUMA discover node 1 online CPUs: 8-15,24-31
2022-07-26_13:59:56.387-0400: [I] NUMA discover system online CPUs: 0-31
2022-07-26_13:59:56.387-0400: [I] NUMA discover cpuset node 0 (normal) with memory: 32739 MiB, RDMA devices: No, CPUs: 0-7,16-23.
2022-07-26_13:59:56.387-0400: [I] NUMA discover cpuset node 1 (normal) with memory: 32768 MiB, RDMA devices: No, CPUs: 8-15,24-31.
2022-07-26_13:59:56.387-0400: [I] NUMA discover cpuset node -1 (system) with memory: 65507 MiB, RDMA devices: No, CPUs: 0-31.
2022-07-26_13:59:56.387-0400: [I] Initializing User Counter support ...
2022-07-26_13:59:56.388-0400: [I] User counters CPU limit 2048 CPUs; found 32 system CPUs
2022-07-26_13:59:56.388-0400: [I] Initializing the page pool ...
2022-07-26_14:00:05.983-0400: [I] Initializing the mailbox message system ...
2022-07-26_14:00:05.984-0400: [I] Initializing encryption ...
2022-07-26_14:00:06.020-0400: [I] Encryption: loaded crypto library: GSKit FIPS context (ver: 8.6.0.0).
2022-07-26_14:00:06.020-0400: [I] Encryption key cache expiration time = 0 (cache does not expire).
2022-07-26_14:00:06.020-0400: [I] Initializing the thread system ...
2022-07-26_14:00:06.020-0400: [I] Creating threads ...
2022-07-26_14:00:06.026-0400: [I] Initializing inter-node communication ...
2022-07-26_14:00:06.027-0400: [I] Creating the main SDR server object ...
2022-07-26_14:00:06.027-0400: [I] Initializing the sdrServ library ...
2022-07-26_14:00:06.029-0400: [I] Initializing the ccrServ library ...allowRemoteConnections=1, noAuthentication=0
2022-07-26_14:00:06.045-0400: [I] proactiveReconnect is enabled by default
2022-07-26_14:00:06.045-0400: [I] Initializing the cluster manager ...
2022-07-26_14:00:06.458-0400: [I] Initializing the token manager ...
2022-07-26_14:00:06.641-0400: [I] Initializing network shared disks ...
2022-07-26_14:00:06.830-0400: [I] Register client command RPC handlers (15 handlers)
2022-07-26_14:00:07.176-0400: [I] VERBS RDMA not starting because configuration option verbsRdma is not enabled
2022-07-26_14:00:07.178-0400: [I] Starting CCR server (mmfsd) ...
2022-07-26_14:00:07.189-0400: [D] PFD load: mostRecent: 0 seq: 464621 (8192)
2022-07-26_14:00:07.189-0400: [D] PFD load: nextToWriteIdx: 1 seq: 464620 (8192)
2022-07-26_14:00:07.223-0400: [I] Initializing compression libraries handlers...
2022-07-26_14:00:07.225-0400: [I] Listening for local client connections on fd 6 in pid 1753
2022-07-26_14:00:07.725-0400: [N] Connecting to 192.168.118.154 node0 <c0p0>:[0]
2022-07-26_14:00:07.742-0400: [I] Connected to 192.168.118.154 node0 <c0p0>:[0]
2022-07-26_14:00:07.765-0400: [I] Accepted and connected to 192.168.118.154 node0 <c0p0>:[1]
2022-07-26_14:00:07.858-0400: [I] Node 192.168.118.154 (node0) is now the Group Leader.
2022-07-26_14:00:07.869-0400: [I] Calling user exit script mmClusterManagerRoleChange: event clusterManagerTakeOver, Async command /usr/lpp/mmfs/bin/mmsysmonc.
2022-07-26_14:00:07.879-0400: [N] mmfsd ready
2022-07-26_14:00:07.882-0400: [I] Calling user exit script mmMountFs: event mount, Async command /usr/lpp/mmfs/lib/mmsysmon/sendRasEventToMonitor.
2022-07-26_14:00:07.981-0400: mmcommon mmfsup invoked. Parameters: 192.168.118.153 192.168.118.154 all
2022-07-26_14:00:07.995-0400: [I] sendRasEventToMonitor: Successfully sent a file system event to the monitor. Event code=999306
2022-07-26_14:00:08.126-0400: [I] Accepted and connected to 192.168.118.155 node2 <c0n2>:[0]
2022-07-26_14:00:08.152-0400: [N] Connecting to 192.168.118.155 node2 <c0n2>:[1]
2022-07-26_14:00:08.162-0400: [I] Connected to 192.168.118.155 node2 <c0n2>:[1]
2022-07-26_14:00:08.172-0400: [I] Calling user exit script mmSysMonGpfsStartup: event startup, Async command /usr/lpp/mmfs/bin/mmsysmoncontrol.
2022-07-26_14:00:08.174-0400: [I] Calling user exit script mmSinceShutdownRoleChange: event startup, Async command /usr/lpp/mmfs/bin/mmsysmonc.

You can use the mmcommon logRotate command to rotate the GPFS log without shutting down and restarting the daemon. After the mmcommon logRotate command is issued, /var/adm/ras/mmfs.log.previous contains the messages that occurred since the previous startup of GPFS or the last run of mmcommon logRotate. The /var/adm/ras/mmfs.log.latest file starts over at the point in time that the mmcommon logRotate command was run.
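For example, run on the node whose log you want to rotate:

# Rotate the GPFS log without restarting the daemon
mmcommon logRotate
# /var/adm/ras/mmfs.log.latest now starts fresh;
# the prior messages are in /var/adm/ras/mmfs.log.previous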

The amount of time it takes to start GPFS varies with the size and complexity of your system configuration. If you cannot access a mounted file system, examine the log file for error messages.
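For example, assuming the bracketed severity tags that appear in GPFS log messages (such as [W] for warnings, [E] for errors, and [X] for fatal conditions), a quick scan for problems might look like this:

# Show warning, error, and fatal messages from the current log
grep -E '\[[WEX]\]' /var/adm/ras/mmfs.log.latest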