File system maintenance mode

Use file system maintenance mode to enable an IBM Storage Scale file system maintenance window.

Overview

Use file system maintenance mode whenever you perform maintenance on either NSD disks or NSD servers that might result in NSDs becoming unavailable. You cannot change any user files or file system metadata while the file system is in maintenance mode. As a result, the system does not mark NSD disks or NSD server nodes as down when I/O failures occur on disks that are unavailable because of maintenance. Administrators can then complete administrative actions on the NSD disks or NSD server nodes.

IBM Storage Scale file system operations that must internally mount the file system cannot be used while the file system is in maintenance mode. Other administrative operations that only check file system information, such as the mmlsfs and mmlsdisk commands, remain available.

Using file system maintenance mode

You can move the file system into maintenance mode to prevent unexpected or unwanted disk I/O operations in the file system when maintenance actions are applied to either the NSD disk systems or file system server nodes. If you do not move the file system into maintenance mode, I/O failures caused by unavailable NSD disks or server nodes might result in disks being marked as down. Any disks that are marked as down must be manually started by using the mmchdisk command, which might take significant time for a large file system.

Additionally, no ordering assurance exists for the IBM Storage Scale nodes when you start or shut down nodes across the cluster. If the NSD servers are shut down before the client nodes or started after them, some NSD disks might also be marked down when I/O operations are run on those NSD server nodes. Unless the file system is in maintenance mode, you must manually control the shutdown or startup sequence for cluster nodes to avoid disk down events.

You can move the file system into maintenance mode before you shut down nodes, or before you mount the file system during startup, so that you no longer need to control the order in which nodes are shut down or started. When a file system is remotely mounted and accessed, move it into maintenance mode before you shut down the NSD servers in the home cluster. Users of the remote file system might not be aware of the home cluster status, and I/O operations that they initiate from the remote cluster might also cause file system disks to be marked down.
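
As an illustration of the manual recovery that maintenance mode is meant to avoid, the following shell sketch lists the disks that are not up and ready and then starts them. The function name recover_down_disks is hypothetical and used only for this example. Check the mmlsdisk and mmchdisk options against the documentation for your release before you rely on them.

```shell
# Sketch: restart disks that were marked down after an unplanned I/O failure.
# recover_down_disks is a hypothetical helper name, not a product command.
recover_down_disks() {
    fs="$1"
    [ -n "$fs" ] || { echo "usage: recover_down_disks <fsName>" >&2; return 2; }

    # Show the disks that are not both available (up) and ready.
    mmlsdisk "$fs" -e || return 1

    # Start all disks that are down. This can take a long time on a large file system.
    mmchdisk "$fs" start -a
}
```

For example, recover_down_disks fs1 runs both commands against a file system named fs1 (a placeholder name).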

Setting up file system maintenance mode

You can enable, disable, or check the status of file system maintenance mode:
  • To enable or disable file system maintenance mode, enter the following command:
     mmchfs <fsName> --maintenance-mode yes [--wait] | no
  • To check the status of file system maintenance mode, enter the following command:
     mmlsfs <fsName> --maintenance-mode
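
In a script, the status check can be reduced to a yes or no answer. The following is a minimal sketch that assumes the mmlsfs output contains a line whose first field is the flag name --maintenance-mode and whose second field is the value, beginning with yes or no in any letter case. That layout is an assumption, so verify it against the output on your system. The function names parse_maintenance_mode and in_maintenance_mode are hypothetical.

```shell
# ASSUMPTION: the status line looks like
#   " --maintenance-mode <value>   <description>"
# with the value in the second whitespace-separated field.
parse_maintenance_mode() {
    # Read mmlsfs output on stdin and print the value in lowercase.
    # Exit with status 1 if no matching line is found.
    awk '$1 == "--maintenance-mode" { print tolower($2); found = 1; exit }
         END { if (!found) exit 1 }'
}

in_maintenance_mode() {
    # Usage: in_maintenance_mode <fsName>
    # Returns 0 if maintenance mode is on, 1 if it is off,
    # and 2 if the state could not be determined.
    state="$(mmlsfs "$1" --maintenance-mode | parse_maintenance_mode)" || return 2
    case "$state" in
        yes*) return 0 ;;
        *)    return 1 ;;
    esac
}
```

A script might then use the check as a guard, for example: if in_maintenance_mode fs1; then echo "fs1 is in maintenance mode"; fi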

Before you enter the mmchfs command to enable file system maintenance mode, make sure that you unmount the file system on the local and remote clusters. Additionally, long-running commands such as mmrestripefs must complete because they internally mount the file system. If you do not want to wait for these conditions yourself, specify the --wait parameter. With --wait, the mmchfs command waits for existing mounts and long-running commands, and moves the file system into maintenance mode after all of them complete.

To perform maintenance on NSD disks or server nodes, complete the following steps:
  1. Unmount the file system from all nodes, including remote cluster nodes. Enter the following command:
    mmumount <fsName> -a
  2. Check whether any pending internal mounts exist. Enter the following command:
    mmlsmount <fsName> -L
  3. Enter the following command to enable maintenance mode:
    mmchfs <fsName> --maintenance-mode yes
    Remember: If you use the --wait parameter with the mmchfs command, file system maintenance mode is enabled automatically after you unmount the file system from all local and remote nodes.
  4. Complete any needed maintenance on the NSDs or server nodes. Maintenance tasks on NSDs or server nodes include these tasks:
    • You can restart the NSD servers.
    • You can stop any access to NSDs.
    • You can shut down the entire cluster safely when the file system is in maintenance mode.
    Note: File system mounts and other management operations that internally mount the file system, such as mmmount and mmrestripefs, cannot run in this state:
    mmmount <fsName> 
    Mon Jul 23 06:02:49 EDT 2018: 6027-1623 
    mmmount: Mounting file systems ... 
    mount: permission denied 
    mmmount: 6027-1639 Command failed. Examine previous error messages to determine cause.
    mmrestripefs <fsName> -b 
    This file system is undergoing maintenance and cannot be either mounted or changed. 
    mmrestripefs: 6027-1639 Command failed. Examine previous error messages to determine cause.
  5. After maintenance is complete, end maintenance mode and then resume normal file system operations such as mmmount. End maintenance mode only after the NSD disks and NSD servers are operational:
    mmchfs <fsName> --maintenance-mode no
    You can run offline fsck to check file system consistency before you end file system maintenance mode.
CAUTION:
  • If you shut down either the NSD servers or the whole cluster, it is considered maintenance on NSD disks or servers and must be done under maintenance mode.
  • If no NSD disks or NSD server nodes are available for a specified file system, the file system maintenance mode state cannot be retrieved because it is stored with the stripe group descriptor. Additionally, you cannot take the file system out of maintenance mode in this scenario.
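
The procedure above can be sketched as two shell functions, one to open the maintenance window (steps 1 to 3) and one to close it (step 5). The function names begin_maintenance and end_maintenance are hypothetical, and the optional --wait argument is simply passed through to mmchfs. The -a option on mmmount, which mounts the file system on all nodes, is a choice made for this example. Treat this as a starting point, not a tested runbook.

```shell
# Hypothetical helpers that wrap the documented command sequence.

begin_maintenance() {
    # Usage: begin_maintenance <fsName> [--wait]
    fs="$1"
    [ -n "$fs" ] || { echo "usage: begin_maintenance <fsName> [--wait]" >&2; return 2; }

    mmumount "$fs" -a  || return 1    # step 1: unmount from all nodes
    mmlsmount "$fs" -L || return 1    # step 2: review any remaining mounts

    # Step 3: enable maintenance mode, optionally letting mmchfs wait for
    # existing mounts and long-running commands to finish.
    if [ "${2:-}" = "--wait" ]; then
        mmchfs "$fs" --maintenance-mode yes --wait
    else
        mmchfs "$fs" --maintenance-mode yes
    fi
}

end_maintenance() {
    # Usage: end_maintenance <fsName>
    # Step 5: call this only after the NSD disks and NSD servers are operational.
    fs="$1"
    [ -n "$fs" ] || { echo "usage: end_maintenance <fsName>" >&2; return 2; }

    mmchfs "$fs" --maintenance-mode no || return 1
    mmmount "$fs" -a    # -a mounts on all nodes; omit it to mount locally only
}
```

Step 4, the maintenance work itself, happens between the two calls and is deliberately left out of the sketch.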

Running the fsck service action while the file system is in maintenance mode

The offline fsck service action can be run while the file system is in maintenance mode. Maintenance mode provides a dedicated window for checking file system consistency when:
  • The offline fsck service action cannot be started while the file system is being used.
  • The offline fsck service action cannot be started because of an unexpected file system mount or other interfering management operation.
Do not run the following commands while the file system is in maintenance mode:
  • mmmount
  • mmrestripefs
  • mmdelfs
  • mmdefragfs
  • mmadddisk
  • mmdeldisk
  • mmrpldisk
  • mmchdisk
  • mmcrsnapshot
  • mmdelsnapshot
  • mmcrfileset
  • mmdelfileset
  • mmchfileset
  • mmchqos
  • mmchpolicy
  • mmquotaon
  • mmquotaoff
  • mmedquota
  • mmdefedquota
  • mmdefquotaon
  • mmdefquotaoff
  • mmsanrepairfs
  • mmputacl
Note: These commands fail if you run them while the file system is in maintenance mode.
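
An automation script can consult the list above before it attempts a command during a maintenance window. The following sketch only checks a command name against that list. The variable BLOCKED_IN_MAINTENANCE and the function is_blocked_in_maintenance are hypothetical names, and the list must be kept in sync with the documentation for your release.

```shell
# Commands that the list above says fail in maintenance mode.
BLOCKED_IN_MAINTENANCE="mmmount mmrestripefs mmdelfs mmdefragfs mmadddisk mmdeldisk
mmrpldisk mmchdisk mmcrsnapshot mmdelsnapshot mmcrfileset mmdelfileset mmchfileset
mmchqos mmchpolicy mmquotaon mmquotaoff mmedquota mmdefedquota mmdefquotaon
mmdefquotaoff mmsanrepairfs mmputacl"

is_blocked_in_maintenance() {
    # Usage: is_blocked_in_maintenance <command name or path>
    # Returns 0 if the command is on the list, 1 otherwise.
    name="${1##*/}"    # strip any leading directory, such as /usr/lpp/mmfs/bin/
    for cmd in $BLOCKED_IN_MAINTENANCE; do
        [ "$cmd" = "$name" ] && return 0
    done
    return 1
}
```

A wrapper could combine this check with a maintenance mode status check and skip or postpone any listed command until the window is closed.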
