Configuration 2: 5148-22L protocol nodes ordered standalone and added to an existing 5148 ESS (PPC64LE)

In this configuration, protocol nodes are ordered for attachment to an existing, previously installed ESS. The EMS node, I/O server nodes, and protocol nodes have their OS, kernel, systemd, network manager, firmware, and OFED levels kept in synchronization, because xCAT running on the EMS node is used to manage and coordinate these levels. It is recommended, but not mandatory, to match IBM Spectrum Scale levels between the ESS and the protocol nodes.
Note: All protocol nodes in a cluster must be at the same code level.

A) Starting point and what to expect

  • An ESS is already installed and running (EMS and 2 I/O server nodes + possibly additional nodes)
  • A cluster has already been created.

    Run mmlscluster to check.

  • A GUI is active on the EMS node and it has been logged into.

    Run systemctl status gpfsgui to check.

  • The Performance Monitoring collector is configured and running on the EMS node.

    Run systemctl status pmcollector and mmperfmon config show to check.

  • Protocol nodes may or may not already exist on this cluster.
  • The ESS is at code level 5.3.1.1 or later.

    Run /opt/ibm/gss/install/rhel7/ppc64le/installer/gssinstall -V to check.

  • An xCAT OS image specific for CES exists (rhels7.4-ppc64le-install-ces).
    Run the following command to verify.
    # lsdef -t osimage
  • Newly ordered protocol nodes come with the authentication prerequisites (sssd, ypbind, openldap-clients, and krb5-workstation) pre-installed.
  • A default deploy template exists for the protocol nodes.

    Check for /var/tmp/gssdeployces.cfg.default on the EMS node

  • New standalone protocol node orders arrive with OS, kernel, OFED, iprraid, and firmware pre-loaded. This is verified in step B7.
  • New standalone protocol node orders arrive with an IBM Spectrum Scale Protocols code package in /root. This is verified in steps G and H.
  • New standalone protocol nodes do not have any GPFS RPMs installed on them.
  • Hardware and software call home may already be configured on the existing ESS system. Call home is reconfigured after deployment of the protocol nodes.
Important: Before proceeding:
  • Protocol nodes must be cabled up to the ESS switch for use with the xCAT network and the FSP network. For more information, see Figure 2.
  • Protocol nodes can be in the powered off state at this point.
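Before moving to section B, the prerequisite checks listed above can be run together from the EMS node. A minimal sketch, using only the commands already named in this section:
  # mmlscluster                                   ## cluster exists
  # systemctl status gpfsgui                      ## GUI is active on the EMS node
  # systemctl status pmcollector                  ## performance monitoring collector is running
  # mmperfmon config show                         ## collector configuration
  # /opt/ibm/gss/install/rhel7/ppc64le/installer/gssinstall -V    ## ESS code level is 5.3.1.1 or later
  # lsdef -t osimage                              ## rhels7.4-ppc64le-install-ces image exists
  # ls /var/tmp/gssdeployces.cfg.default          ## default CES deploy template exists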

B) Protocol node OS deployment

  1. Verify that the protocol nodes are cabled up to the ESS xCAT network and FSP network.
  2. On the EMS node, find the /var/tmp/gssdeployces.cfg.default file and copy it to /var/tmp/gssdeployces.cfg.
    The default CES template is pre-filled so that the only fields needing customization to match the current cluster are the following (a hypothetical excerpt follows this list):
    • DEPLOYMENT_TYPE: If these are your first protocol nodes, the type must be CES. If you are adding more protocol nodes to an ESS system that already has protocol nodes, use the type ADD_CES. Read the tips in the gssdeployces.cfg file carefully because using the incorrect deployment type and filling out the configuration file incorrectly could result in rebooting or reloading of any existing protocol nodes that might be a part of the cluster. Read all on-screen warnings.
    • EMS_HOSTNAME
    • EMS_MGTNETINTERFACE
    • SERVERS_UID
    • SERVERS_PASSWD
    • SERVERS_SERIAL: Change the serial numbers to match each protocol node being added.
    • SERVERS_NODES: Separate each desired protocol node name with a space.
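    For illustration only, a hypothetical excerpt of the customized fields might look like the following. The values are placeholders, and the exact syntax is defined by the comments in the default template itself:
      DEPLOYMENT_TYPE=CES                        ## or ADD_CES if protocol nodes already exist in the cluster
      EMS_HOSTNAME=ems1                          ## example EMS host name
      EMS_MGTNETINTERFACE=enP3p9s0f0             ## example management network interface name
      SERVERS_UID=XXXXX                          ## credentials for the new nodes, per the template comments
      SERVERS_PASSWD=XXXXX
      SERVERS_SERIAL="XXXXXXX XXXXXXX XXXXXXX"   ## one serial number per protocol node being added
      SERVERS_NODES="prt01 prt02 prt03"          ## desired protocol node names, separated by spaces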
  3. Configure /etc/hosts on the EMS node to list the protocol nodes.
    Note: This /etc/hosts helps during network setup in this step.
    Here is an example of the IP, FQDN, and hostname configured for EMS, IO, and 3 protocol nodes.
    • The EMS node is 192.168.45.20.
    • The I/O server nodes are 192.168.45.21 → 192.168.45.30.
    • The protocol nodes are 192.168.45.31 → X.
    192.168.45.20 ems1.gpfs.net ems1
    192.168.45.21 gssio1.gpfs.net gssio1
    192.168.45.22 gssio2.gpfs.net gssio2
    192.168.45.31 prt01.gpfs.net prt01
    192.168.45.32 prt02.gpfs.net prt02
    192.168.45.33 prt03.gpfs.net prt03
    172.31.250.3 ems1-hs.gpfs.net ems1-hs
    172.31.250.1 gssio1-hs.gpfs.net gssio1-hs
    172.31.250.2 gssio2-hs.gpfs.net gssio2-hs
    172.31.250.11 prt01-hs.gpfs.net prt01-hs
    172.31.250.12 prt02-hs.gpfs.net prt02-hs
    172.31.250.13 prt03-hs.gpfs.net prt03-hs
    Note: If the /etc/hosts file is already set up on the EMS node, copy it to the protocol node(s) first and then modify it. Each protocol node must have the same /etc/hosts file.
  4. Detect and add the protocol node objects to xCAT as follows.
    /var/tmp/gssdeploy -o /var/tmp/gssdeployces.cfg
    Proceed through all steps. Protocol nodes should be listed in xCAT afterwards.
    # lsdef
    ems1  (node)
    gssio1  (node)
    gssio2  (node)
    prt01  (node)
    prt02  (node)
    prt03  (node)
  5. Deploy the OS, kernel, systemd, netmgr, OFED, and IPR as follows.
    This is a decision point with two options, depending on your requirements.
    • Option 1: All standalone protocol nodes come preinstalled with OS, kernel, systemd, netmgr, OFED, and IPR. Now that the protocol nodes are discovered by xCAT, they can be set to boot from their hard drives, without reinstalling anything. If these preloaded levels are sufficient, then proceed with these steps. This option is quicker than option 2.
      1. Power off all protocol nodes.
        rpower ProtocolNodesList off
      2. Set the protocol node(s) to HD boot from the EMS node.
        rsetboot ProtocolNodesList hd
      3. Power on the protocol nodes.
        rpower ProtocolNodesList on
    • Option 2: It is also possible to completely wipe and reload the protocol nodes if desired.
      Remember: If you already have existing and active protocol nodes in the cluster, you must be very careful about which xCAT group is used and whether your gssdeployces.cfg file has a DEPLOYMENT_TYPE of CES or ADD_CES. Read the tips in the configuration file carefully, and read all on-screen warnings.
      Run the following commands to proceed.
      1. Reload the protocol nodes with the same levels of OS, kernel, systemd, netmgr, OFED, and IPR existing on the EMS / IO nodes.
        /var/tmp/gssdeploy -d /var/tmp/gssdeployces.cfg
      2. Proceed through all steps of the gssdeploy command. Protocol node installation progress can be watched using rcons Node.
        Note: Unlike on I/O server nodes, this step does not install any GPFS RPMs on protocol nodes except gpfs.gss.tools.
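    As a concrete illustration of the Option 1 commands, using the three example protocol node names from earlier in this section (an xCAT group or any other noderange can be substituted):
      # rpower prt01,prt02,prt03 off        ## power off the protocol nodes
      # rsetboot prt01,prt02,prt03 hd       ## set them to boot from their hard drives
      # rpower prt01,prt02,prt03 on         ## power them back on
      # rcons prt01                         ## optional: watch a node console (also useful during an Option 2 reload)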
  6. If ADD_CES was used to add protocol nodes to a cluster that had existing or active protocol nodes, they would have been added to an xCAT group other than ces_ppc64. These protocol nodes must be moved to the ces_ppc64 group using these steps, run from the EMS node.
    1. Check to see which xCAT group was used for adding protocol nodes. Replace the configuration file name with the one used for gssdeploy -o and -d.
      # cat /var/tmp/gssdeployces.cfg | grep GSS_GROUP
    2. Move the added protocol nodes to the ces_ppc64 group.
      # chdef GroupNameUsed groups=all,ces_ppc64
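    A quick check that the move worked, using standard xCAT commands and the example node names from this section:
      # nodels ces_ppc64                    ## lists the members of the ces_ppc64 group
      # lsdef prt01 -i groups               ## shows the groups attribute of one protocol node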
  7. Once deployed, run gssinstallcheck or gssstoragequickcheck to verify the nodes are in a healthy state.
    1. Run gssstoragequickcheck -G ces_ppc64 to verify network adapter types, slots, and machine type model of the protocol nodes.
    2. Run gssinstallcheck -G ces_ppc64 to verify code and firmware levels on the protocol nodes.

C) Decide which adapter(s) to use for the GPFS network(s) vs CES protocol network(s)

It is recommended to plan for separation of the GPFS and CES networks, both by subnet and by card.

If adding protocol nodes to an existing or active protocol setup, in most cases it is recommended to match configurations of both the GPFS network and CES protocol networks to the existing protocol nodes. If planning a stretch cluster, or configurations in which not all protocol nodes see the same CES network, refer to IBM Spectrum Scale Knowledge Center.

Note: Before proceeding, protocol nodes must be cabled up to the GPFS cluster network and to the CES network.

D) Configure network adapters to be used for GPFS

Customer networking requirements are site-specific. The use of bonding to increase fault tolerance and performance is advised, but guidelines for doing so are not provided in this document. Consult with your local network administrator before proceeding further. Before creating network bonds, carefully read ESS networking considerations.

Make sure that the protocol nodes' high-speed network IPs and host names are present in /etc/hosts on all nodes.

Here is an example excerpt from /etc/hosts, showing the -hs suffix IPs and host names to be used for the GPFS cluster configuration.
172.31.250.3 ems1-hs.gpfs.net ems1-hs
172.31.250.1 gssio1-hs.gpfs.net gssio1-hs
172.31.250.2 gssio2-hs.gpfs.net gssio2-hs
172.31.250.11 prt01-hs.gpfs.net prt01-hs
172.31.250.12 prt02-hs.gpfs.net prt02-hs
172.31.250.13 prt03-hs.gpfs.net prt03-hs
Note:
  • All nodes must be able to resolve all IPs, FQDNs, and host names, and ssh-keys must work.
  • If the /etc/hosts file is already set up on the EMS node, copy it to the protocol node(s) first and then modify it. Each protocol node must have the same /etc/hosts file.
To set up bond over IB, run the following command.
gssgennetworks -G ces_ppc64 --create --ipoib --suffix=-hs --mtu 4092

In this example, MTU is set to 4092. The default MTU is 2048 (2K) and the gssgennetworks command supports 2048 (2K) and 4092 (4K) MTU. Consult your network administrator for the proper MTU setting.

To set up bond over Ethernet, run the following command.
gssgennetworks -N ems1,gss_ppc64 --suffix=-hs --create-bond

For information on InfiniBand issues with multiple fabrics, see Infiniband with multiple fabric in Customer networking considerations.
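After a bond is created, a quick sanity check can be run on each protocol node. This is a sketch only, and it assumes the bond device is named bond0, which might differ on your system:
cat /proc/net/bonding/bond0         ## bonding mode and slave interface status
ip addr show bond0                  ## high-speed IP assigned to the bond
ping -c 3 ems1-hs                   ## reachability of another node's high-speed interface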

E) Configure network adapters to be used for CES protocols

Before deploying protocols, it is important to understand the customer network and protocol access requirements. CES protocols use a pool of CES IPs which float between nodes, providing redundancy in the case of node failure or degradation. The CES IPs are assigned and aliased by IBM Spectrum Scale to an adapter on each protocol node that has a matching predefined route and subnet. It is important that each protocol node has a base network adapter or bonded group of network adapters or ports with an established IP and routing so that CES IPs can be assigned by IBM Spectrum Scale code during protocol deployment.
Note: CES IPs are never assigned using ifcfg or nmcli commands. This is handled by the IBM Spectrum Scale code.
The following must be taken into account when planning this network:
  • Bandwidth requirements per protocol node (how many ports per bond, bonding mode, and adapter speed)
  • Redundancy of each protocol node, if needed. This determines the bonding mode used.
  • Authentication domain and DNS. This determines the subnet(s) required for each protocol node.
  • Are VLAN tags needed?
  • Set aside 1 IP per protocol node, per desired CES subnet. You will be using these when configuring the CES base adapter(s) on each protocol node. These IPs must be set up for forward and reverse DNS lookup.
  • Set aside a pool of CES IPs for later use. These IPs must be in DNS and be set up for both forward and reverse DNS lookup. You will not be assigning these IPs to network adapters on protocol nodes.
  • Prepare to configure each protocol node to point to the authentication domain or DNS. You need to do this manually using ifcfg or nmcli commands and by verifying /etc/resolv.conf after the settings have taken effect. When deployed from an EMS node, each protocol node might already have a default domain of gpfs.net present in /etc/resolv.conf and the ifcfg files. This default domain can be removed so that it does not interfere with the authentication setup and DNS for protocols.

Proceed with either configuring the CES protocol adapters manually using ifcfg or nmcli commands or by using gssgennetworks. The gssgennetworks command cannot be used if your CES protocol network requires VLAN tags nor does it set up additional domain or DNS servers.

When the network is configured on each protocol node, verify it using the mmnetverify command with these actions (a sketch follows this list):
  • Make sure all protocol nodes can ping each other's base CES network by IP, host name, and FQDN.
  • Make sure all protocol nodes can ping the authentication server by IP, host name, and FQDN.
  • Make sure the authentication server can ping each protocol node's base CES network by IP, host name, and FQDN.
  • Spot check the desired NFS, SMB, or OBJ clients, external to the GPFS cluster, and verify that they can ping each protocol node's base CES network by IP, host name, and FQDN.
  • Even though the CES IP pool is not yet set up, because protocols are not deployed, double check that each protocol node can resolve each CES IP or host name using nslookup.
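A minimal sketch of these checks from one protocol node. The CES base and authentication host names and the CES IP shown here are hypothetical placeholders for whatever names and addresses your site assigns; plain ping and nslookup are used because the new nodes are not yet members of the cluster:
ping -c 3 prt02-ces.example.com        ## another protocol node's CES base name (hypothetical)
ping -c 3 authserver.example.com       ## the authentication server (hypothetical)
nslookup ces-pool-name1.example.com    ## forward lookup of a CES pool host name (hypothetical)
nslookup 10.11.12.101                  ## reverse lookup of the corresponding CES IP (hypothetical)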

For an example showing how CES IP aliasing relies upon an established base adapter with proper subnets or routing, see CES IP aliasing to network adapters on protocol nodes.

For an example of CES-IP configuration that can be performed after deployment of protocols, see Configuring CES protocol service IP addresses.

F) Create a CES shared root file system for use with protocol nodes

If you already have an existing or active protocol setup, cesSharedRoot should already exist. In that case, skip this step.

CES protocols require a shared file system to store configuration and state data. This file system is called CES shared root; it is a replicated file system that is recommended to be between 4 GB and 10 GB in size. The following ESS command automatically creates this file system with the recommended size and mirroring.
  • Run the following command from the EMS node.
    # gssgenvdisks --create-vdisk --create-nsds --create-filesystem --crcesfs
    # mmmount cesSharedRoot -N ems1-hs

    A file system named cesSharedRoot with a mount path of /gpfs/cesSharedRoot is created and mounted. Later in these steps, the IBM Spectrum Scale installation toolkit is pointed to this file system to use when deploying protocols.
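    The new file system can then be verified from the EMS node with standard IBM Spectrum Scale commands, for example:
    # mmlsfs cesSharedRoot              ## file system attributes
    # mmlsmount cesSharedRoot -L        ## nodes that have cesSharedRoot mounted
    # df -h /gpfs/cesSharedRoot         ## mount point and size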

G) Download the IBM Spectrum Scale protocols package (version 5.0.1.2 or later) on one of the protocol nodes

An example package name is: Spectrum_Scale_Protocols_Data_Management-5.0.1.2-ppc64LE-Linux-install

Each protocol node ships with an IBM Spectrum Scale protocols package in /root. The version and license of this package match the ESS version that the protocol node was ordered with.

  • If the package is of the desired version and license, proceed with extraction.
  • If a different level is desired, proceed to IBM Fix Central to download and replace this version.
    If replacing this version, the following rules apply:
    • The IBM Spectrum Scale version must be 5.0.1.2 or later.
    • The CPU architecture must be PPC64LE.
    • The package must be a protocols package (The title and the file name must specifically contain Protocols).
Note: If Option 2 was specified in step B5 and the protocol node was reloaded, then there will be no Spectrum Scale protocols package in /root. It will need to be downloaded.
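To check whether a protocols package is already present in /root on a protocol node, a simple listing is enough; the file name pattern follows the example package name shown above:
ls -l /root/Spectrum_Scale_Protocols_*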

H) Extract the IBM Spectrum Scale protocols package

  • Enter the following at the command prompt: /root/Spectrum_Scale_Protocols_Data_Management-5.0.1.2-ppc64LE-Linux-install
  • By default, the package is extracted to /usr/lpp/mmfs/5.0.1.2/.

I) Configure the IBM Spectrum Scale installation toolkit

  1. Change directory to the installation toolkit directory:
    cd /usr/lpp/mmfs/5.0.1.2/installer
    View the installation toolkit usage help as follows.
    /usr/lpp/mmfs/5.0.1.2/installer/spectrumscale -h
  2. Set up the installation toolkit by specifying which local IP to use for communicating with the rest of the nodes. Preferably, this should be the same IP used for the GPFS network. Set the toolkit to ESS mode.
    /usr/lpp/mmfs/5.0.1.2/installer/spectrumscale setup -s IP_Address -st ess
  3. Populate the installation toolkit configuration file with the current cluster configuration by pointing it to the EMS node.
    /usr/lpp/mmfs/5.0.1.2/installer/spectrumscale config populate -N EMSNode

    There are limits to the config populate functionality. If it does not work, simply add the EMS node to the installation toolkit and continue.

    View the current cluster configuration as follows.
    /usr/lpp/mmfs/5.0.1.2/installer/spectrumscale node list
    /usr/lpp/mmfs/5.0.1.2/installer/spectrumscale config gpfs
    Note: ESS I/O nodes do not get listed in the installation toolkit node list.
  4. Configure the details of the protocol nodes to be added to the cluster. If adding protocol nodes to an existing or active protocol setup, make sure to add all existing protocol nodes and configuration details. Note that a successful config populate operation from step 3 would have already performed this action.
    ./spectrumscale node add ems1-hs.gpfs.net -e -a -g   ## No need to perform this step 
                                                         ## if the config populate ran without error
    ./spectrumscale node add prt01-hs.gpfs.net -p     ## Add a protocol node
    ./spectrumscale node add prt02-hs.gpfs.net -p     ## Add a protocol node
    ./spectrumscale node add client01-hs.gpfs.net     ## Example of a non-protocol client node, if desired 
    ./spectrumscale node add nsd01-hs.gpfs.net        ## Example of a non-ESS nsd node, if desired
    ./spectrumscale enable smb                        ## If you'd like to enable and use the SMB protocol 
                                                      ## (it will be installed regardless) 
    ./spectrumscale enable nfs                        ## If you'd like to enable and use the NFS protocol
                                                      ## (it will be installed regardless)
    ./spectrumscale enable object                     ## If you'd like to enable and use the Object protocol 
                                                      ## (it will be installed regardless) 
    ./spectrumscale config protocols -e CESIP1,CESIP2,CESIP3   ## Input the CES IPs set aside in step (E) of
                                                               ## this procedure. Toolkit assigns IPs listed. 
    ./spectrumscale config protocols -f cesSharedRoot -m /gpfs/cesSharedRoot  ## FS name and mount point for  
                                                                              ## CES shared root, previously 
                                                                              ## set up during step (F) 
    ./spectrumscale config object -e <endpoint IP or hostname>  ## This address should be an RRDNS or similar address
                                                                ## that resolves to the pool of CES IP addresses.
    ./spectrumscale config object -o Object_Fileset             ## This fileset will be created during deploy
    ./spectrumscale config object -f ObjectFS -m /gpfs/ObjectFS ## This must point to an existing FS 
                                                                ## create the FS on EMS if it doesn't already exist
    ./spectrumscale config object -au admin -ap -dp             ## Usernames and passwords for Object
    
    ./spectrumscale config perfmon -r on              ## Turn on performance sensors for the protocol nodes.  
                                                      ## EMS GUI picks up sensor data once protocols are deployed 
    ./spectrumscale node list                         ## Lists out the node config (ESS IO nodes never show up here)
    ./spectrumscale config protocols                  ## Shows the protocol config
    For more information, see IBM Spectrum Scale installation toolkit.

J) Installation phase of IBM Spectrum Scale installation toolkit

  1. Run the installation toolkit installation precheck.
    ./spectrumscale install --precheck
  2. Run the installation toolkit installation procedure.
    ./spectrumscale install
The installation toolkit performs the following actions, and it can be rerun in the future to:
  • Install GPFS, call home, performance monitoring, and license RPMs on each node specified to the installation toolkit. The EMS and I/O server nodes are not acted upon by the installation toolkit.
  • Add nodes to the cluster (protocol, client, NSD).
  • Add non-ESS NSDs, if desired.
  • Start GPFS and mount all file systems on the newly added nodes.
  • Configure performance monitoring sensors.
  • Set client or server licenses.

GPFS configuration parameters such as pagepool, maxFilesToCache, verbsPorts need to be set up manually. You can do this after completing the installation phase or after completing the deployment phase. For more information about these parameters, see GPFS configuration parameters for protocol nodes.

K) Deployment phase of IBM Spectrum Scale installation toolkit

  1. Run the installation toolkit deployment precheck.
    ./spectrumscale deploy --precheck
  2. Run the installation toolkit deployment procedure.
    ./spectrumscale deploy
The installation toolkit performs the following actions during deployment, and it can be rerun in the future to:
  • Install SMB, NFS, and object RPMs on each protocol node specified to the installation toolkit.
  • Enable one or more protocols.
  • Assign CES IPs. IBM Spectrum Scale code aliases these IPs to the CES base network adapter configured during step E.
  • Enable authentication for file or object.
  • Create additional file systems using non-ESS NSD nodes. You must run installation first to add more non-ESS NSDs.
  • Add additional protocol nodes. You must run installation first to add more nodes and then run deployment for the protocol specific piece.

L) Tune the protocol nodes as desired

Protocol nodes should already be tuned with the same tuned and sysctl settings, and udev rules as the I/O server nodes. For more information, see OS tuning for RHEL 7.4 PPC64LE protocol nodes.

At this point, the main tuning settings to be aware of include:
  • RDMA. If IB RDMA is in use (check using mmlsconfig verbsRDMA), issue mmlsconfig and verify that the verbsPorts parameter refers to the correct ports on each protocol node.
  • pagepool. Use mmlsconfig to view the pagepool settings of the EMS and I/O server nodes. The protocol nodes do not have pagepool defined at this point. Define pagepool using the mmchconfig -N cesNodes pagepool=XX command.

    Where XX is typically 25% to 50% of the system memory. For more information, see GPFS configuration parameters for protocol nodes.

  • maxFilesToCache. Use mmlsconfig to view the maxFilesToCache settings of the EMS and I/O server nodes. The protocol nodes do not have maxFilesToCache defined at this point. Define maxFilesToCache using mmchconfig -N cesNodes maxFilesToCache=XX command.

    Where XX is typically 2M for protocol nodes in a cluster containing ESS. For more information, see GPFS configuration parameters for protocol nodes.
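A hedged example of applying these settings; the pagepool value below is illustrative only and must be sized against the actual memory of your protocol nodes, and maxFilesToCache changes typically take effect only after GPFS is restarted on those nodes:
mmlsconfig pagepool                            ## current pagepool settings per node class
mmlsconfig maxFilesToCache                     ## current maxFilesToCache settings
mmchconfig pagepool=64G -N cesNodes            ## example only: assumes roughly 256 GB of memory per protocol node
mmchconfig maxFilesToCache=2000000 -N cesNodes ## approximately 2M, the typical value noted above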

M) GUI configuration

  1. Open the existing EMS GUI in a web browser using the URL https://EssGuiNode, where EssGuiNode is the host name or IP address of the management server node. If the GUI is not yet set up, perform step 8 in this procedure.
  2. The GUI monitors protocol node software automatically, and all newly added protocol nodes should now appear in the Nodes section of the HOME page. Hardware monitoring of protocol nodes must be configured within the GUI panels as follows.
    1. Click Monitoring > Hardware > Edit Rack Components.

      The Edit Rack Components wizard is displayed.

    2. On the Welcome screen, select Yes, discover new servers and enclosures first and then click Next.

      This might take a few minutes. After the detection is complete, click OK.

    3. Click Next through the screens allowing edits of various rack components.

      The Other Servers screen shows the protocol nodes.

    4. On the Other Servers screen, select a rack and location for each protocol node and then click Next.
    5. Click Finish on the Summary screen.
The protocol nodes are now fully configured for hardware monitoring in the GUI.

N) Call home setup

Now that protocol nodes are deployed, call home needs to be configured.

  1. Check if call home has already been configured and if so, record the settings. Reconfiguring call home might require entering the settings again.
    • Check the hardware call home settings on the EMS node.
      gsscallhomeconf --show -E ems1
    • Check the software call home setup.
      mmcallhome info list
  2. Set up call home from the EMS node.
    gsscallhomeconf -N ems1,gss_ppc64,ces_ppc64 --suffix=-hs -E ems1 --register all --crvpd
For more information, see call home documentation in Elastic Storage Server: Problem Determination Guide.
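After registration, the same commands used in step 1 can be rerun to confirm the updated configuration:
gsscallhomeconf --show -E ems1
mmcallhome info list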

O) Movement of quorum or management function to protocol nodes and off EMS or I/O nodes

Quorum and management functions can be resource intensive. In an ESS cluster that also contains additional nodes, such as protocol nodes, it is recommended to move these functions to the protocol nodes. For information on changing node designations, see the mmchnode command.
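A hedged sketch of such a move, using mmchnode and the example node names from this section. Quorum layout is cluster-specific, so plan the final set of quorum nodes before removing the role from any existing node:
mmchnode --quorum --manager -N prt01-hs,prt02-hs,prt03-hs   ## give quorum and manager roles to the protocol nodes
mmchnode --nonquorum --client -N gssio1-hs,gssio2-hs        ## optionally remove the roles from the I/O server nodes
mmlscluster                                                 ## verify the resulting designations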