ESS 5000 software deployment preparation

Install an ESS software package and deploy the storage servers by using the following information. The goal is to create a cluster that allows client or protocol nodes to access the file systems.

POWER9 EMS stack

Item                      Version
IBM Spectrum® Scale       5.0.5.14
Operating system          Red Hat® Enterprise Linux® 8.2
ESS                       6.0.2.6
Kernel                    4.18.0-193.80.1.el8_2
Systemd                   239-31.el8_2.8
Network Manager           1.22.8-9.el8_2
GNU C Library             glibc-2.28-164.el8.ppc64le.rpm
Mellanox OFED             MLNX_OFED_LINUX-4.9-4.1.7.2
OFED firmware levels:
  • MT27500 = 10.16.1200
  • MT4099 = 2.42.5000
  • MT26448 = 2.9.1326
  • MT4103 = 2.42.5000
  • MT4113 = 10.16.1200
  • MT4115 = 12.28.2006
  • MT4117 = 14.31.1014
  • MT4119 = 16.31.1014
  • MT4120 = 16.31.1014
  • MT4121 = 16.31.1014
  • MT4122 = 16.31.1014
ESA                       4.5.7-0
Ansible®                  2.9.27-1
Podman                    1.6.4
Container OS              Red Hat UBI 8.4
xCAT                      2.16.3 (not used in the customer-shipped image; only for SCT)
Firmware RPM              gpfs.ess.firmware-6.0.0-23.ppc64le.rpm
System firmware           FW950.092 (FW950.45)
Boot drive adapter IPR    19512c00
Boot drive firmware
  • Firmware: 9F23
  • Host adapter driver: 38.00.00.00
  • Host adapter firmware: 16.00.11.00
1Gb NIC firmware
  • Driver: tg3
  • Version: 3.137
  • Firmware version: 5719-v1.24i
Support RPM
Network adapter
  • MT27500 = 10.16.1200
  • MT4099 = 2.42.5000
  • MT26448 = 2.9.1326
  • MT4103 = 2.42.5000
  • MT4113 = 10.16.1200
  • MT4115 = 12.28.2006
  • MT4117 = 14.31.1014
  • MT4119 = 16.31.1014
  • MT4120 = 16.31.1014
  • MT4121 = 16.31.1014
  • MT4122 = 16.31.1014

ESS 5000 software stack

Component                 Version
Operating system          Red Hat Enterprise Linux 8.2 PPC64LE
Container OS              Red Hat Enterprise Linux 8.4 UBI
IBM Spectrum Scale        5.0.5.14
Kernel                    kernel-4.18.0-193.80.1.el8_2
Systemd                   239-31.el8_2.8
Network Manager           1.22.8-9.el8_2
OFED                      MLNX_OFED_LINUX-4.9-4.1.7.2
OFED firmware levels:
  • MT27500 = 10.16.1200
  • MT4099 = 2.42.5000
  • MT26448 = 2.9.1326
  • MT4103 = 2.42.5000
  • MT4113 = 10.16.1200
  • MT4115 = 12.28.2006
  • MT4117 = 14.31.1014
  • MT4119 = 16.31.1014
  • MT4120 = 16.31.1014
  • MT4121 = 16.31.1014
  • MT4122 = 16.31.1014
Ansible                   2.9.27-1
Podman                    1.6.4
System firmware           FW950.092 (FW950.45)
ESA                       esagent.pLinux-4.5.7-0
Enclosure firmware
  • 5U92 = E558
  • 4U106 = 5266
ndctl                     ndctl-65-1.el8.rpm
OPAL PRD                  opal-prd-3000.0-1.el8
IPR                       19512c00
Boot drive firmware       9F23
Host adapter driver       38.00.00.00
Host adapter firmware     16.00.11.00
xCAT                      2.16.3
Firmware RPM              gpfs.ess.firmware-6.0.0-23.ppc64le.rpm
Support RPM
  • gpfs.gnr.support-essbase-1.0.0-3.noarch.rpm
  • gpfs.gnr.support-ess3000-1.0.0-3.noarch.rpm
  • gpfs.gnr.support-ess3200-1.0.0-2.noarch.rpm
  • gpfs.gnr.support-ess5000-1.0.0-3.noarch.rpm
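
To cross-check a deployed ESS 5000 node against this stack, you can query a few of the listed components directly. The following is a minimal sketch, assuming the node is already installed at the levels shown above; the expected values are taken from the table and might differ on newer releases:

    uname -r                              # Kernel, expected 4.18.0-193.80.1.el8_2
    rpm -q systemd NetworkManager glibc   # Systemd, Network Manager, and GNU C Library levels
    ofed_info -s                          # OFED level, expected MLNX_OFED_LINUX-4.9-4.1.7.2
    rpm -q gpfs.ess.firmware              # Firmware RPM level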

Prerequisites

  • This document (ESS Software Quick Deployment Guide)
  • SSR completes physical hardware installation and code 20.
    • SSR uses Worldwide Customized Installation Instructions (WCII) for racking, cabling, and disk placement information.
    • SSR uses the ESS Hardware Guide for hardware checkout and setting IP addresses.
  • Worksheet notes from the SSR
  • The latest ESS tgz downloaded to the EMS node from IBM Fix Central (if a newer version is available).
    • Data Access Edition or Data Management Edition: The edition must match the order. If it does not match your order, open a ticket with IBM® Service.
  • High-speed switch and cables have been run and configured.
  • Low-speed host names are ready to be defined based on the IP addresses that the SSR has configured.
  • High-speed host names (suffix of low speed) and IP addresses are ready to be defined.
  • Container host name and IP address are ready to be defined in the /etc/hosts file.
  • Host and domain name (FQDN) are defined in the /etc/hosts file.
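
For reference, the following is a minimal sketch of the /etc/hosts entries that these prerequisites describe. All names and addresses are hypothetical (ems1, essio1, essio2, the -hs high-speed suffix, the cems0 container name, and the example.com domain are placeholders); substitute the values planned for your site:

    # Low-speed (management) host names
    192.168.45.20   ems1.example.com        ems1
    192.168.45.21   essio1.example.com      essio1
    192.168.45.22   essio2.example.com      essio2
    # High-speed host names (low-speed name plus a suffix)
    172.16.45.21    essio1-hs.example.com   essio1-hs
    172.16.45.22    essio2-hs.example.com   essio2-hs
    # Container host name and IP address
    192.168.45.80   cems0.example.com       cems0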

What is in the /home/deploy directory on the EMS node?

  • ESS 5000 tgz used in manufacturing (may not be the latest)
  • ESS 3000 tgz used in manufacturing (may not be the latest)
  • ESS 3200 tgz used in manufacturing (may not be the latest)
  • Red Hat Enterprise Linux 8.2 PPC64LE ISO (POWER9™ EMS)
  • Red Hat Enterprise Linux 7.9 PPC64LE ISO (POWER8® EMS)
    • This ISO is not needed for deployment but it is provided to restore the EMS node in case of a failure.

Support for signed RPMs

ESS or IBM Spectrum Scale RPMs are signed by IBM.

The PGP key is located in /opt/ibm/ess/tools/conf:
    -rw-r-xr-x 1 root root 907 Dec 1 07:45 SpectrumScale_public_key.pgp
You can check whether an ESS or IBM Spectrum Scale RPM is signed by IBM as follows.
  1. Import the PGP key.
    rpm --import /opt/ibm/ess/tools/conf/SpectrumScale_public_key.pgp
  2. Verify the RPM.
    rpm -K RPMFile
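
For example, to verify the firmware RPM that is listed in the software stack above (the file name is only an example, and the exact output depends on the rpm version; a correctly signed package reports its digests and signatures as OK):

    rpm --import /opt/ibm/ess/tools/conf/SpectrumScale_public_key.pgp
    rpm -K gpfs.ess.firmware-6.0.0-23.ppc64le.rpm
    gpfs.ess.firmware-6.0.0-23.ppc64le.rpm: digests signatures OK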

ESS 3000 and ESS 5000 server networking requirements

In any scenario, you must have an EMS node and a management switch. The management switch must be split into two VLANs:
  • Management VLAN
  • Service/FSP VLAN
You also need a high-speed switch (IB or Ethernet) for cluster communication.

ESS 3000

POWER8 or POWER9 EMS

It is recommended to buy a POWER9 EMS with ESS 3000. If you have a legacy environment (POWER8), it is recommended to migrate to IBM Spectrum Scale 5.1.x.x and use the POWER9 EMS as the single management server.
  • If you are adding ESS 3000 to a POWER8 EMS:
    • An additional connection for the container to the management VLAN must be added. A C10-T2 cable must be run to this VLAN.
    • A public/campus connection is required in C10-T3.
    • A management connection must be run from C10-T1 (this should already be in place if you are adding to an existing POWER8 EMS with legacy nodes).
  • If you are using an ESS 3000 with a POWER9 EMS:
    • C11-T1 must be connected on the EMS to the management VLAN.
    • Port 1 on each ESS 3000 canister must be connected to the management VLAN.
    • C11-T2 must be connected on the EMS to the FSP VLAN.
    • HMC1 must be connected on the EMS to the FSP VLAN.

ESS 5000

POWER9 EMS support only

EMS must have the following connections:
  • C11-T1 to the management VLAN
  • C11-T2 to the FSP VLAN
  • HMC1 to the FSP VLAN
ESS 5000 nodes must have the following connections:
  • C11-T1 to the management VLAN
  • HMC1 to the FSP VLAN

Code version

ESS provides the following version of the code:
  • Data Management Edition

    ess5000_6.0.2.6_0503-14_dme_ppc64le.tgz

  • Data Access Edition

    ess5000_6.0.2.6_0503-14_dae_ppc64le.tgz

Note: The version shown here might not be the GA version that is available on IBM Fix Central. It is recommended to go to IBM Fix Central and download the latest code.
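
To confirm which edition and level are staged on the EMS node, list the deployment directory. The file name shown here is the Data Access Edition package from the list above and is only an example:

    ls /home/deploy
    ess5000_6.0.2.6_0503-14_dae_ppc64le.tgz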

POWER8 + POWER9 considerations

  • If both POWER8 and POWER9 EMS nodes are in an environment, it is recommended that you use only the POWER9 EMS for management functions (containers, GUI, ESA, collector).
  • Only a single instance of all management services is recommended and solely on the POWER9 EMS.
  • POWER8 only needs to exist as a management node if you are mixing a non-container-based release (5.3.x) with a container-based release (6.x.x.x).
  • It is recommended that all nodes in the storage cluster contain the same ESS release and IBM Spectrum Scale version.
  • It is recommended that you upgrade to the latest level before adding a building block.

Starting system state

The SSR ensures that the following setup tasks are done:
  1. Each node's management IP addresses are set up (typically 192.168.x.x/24). By default, these are blank.
  2. Each node's HMC1 IP addresses are set up (typically 10.0.0.x/24). By default, these are blank.
  3. The EMS FSP interface (C11-T2) has an IP address set on the same HMC1 subnet, typically 10.0.0.x/24.
  4. Each node and storage enclosure have clean hardware.
  5. All nodes ping over the FSP and management interfaces (see the example checks after this list).
    • On the EMS, this means pinging from T2 to the other nodes' HMC1 interfaces.
    • The EMS can ping the management interfaces of the I/O nodes or POWER9 protocol nodes (T1 interfaces).
  6. Check IBM Fix Central for ESS 5000 and determine whether a newer version is available. If so, download it to the EMS node and place it in /home/deploy. Remove any .tgz files that might remain from manufacturing.
  7. The SSR has passed notes on to the installation worksheet. These notes might contain helpful information that was encountered during code 20 of the hardware.
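
The following is a minimal sketch of the connectivity checks in item 5, run from the EMS node. The addresses are hypothetical (10.0.0.2 for an I/O node FSP (HMC1) interface and 192.168.45.21 for its management interface); substitute the addresses that the SSR configured:

    ping -c 3 10.0.0.2        # FSP VLAN: EMS C11-T2 to the I/O node HMC1 interface
    ping -c 3 192.168.45.21   # Management VLAN: EMS C11-T1 to the I/O node management interface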

Other notes

  • The following tasks must be complete before starting a new installation (tasks done by manufacturing and the SSR):
    • SSR has ensured all hardware is clean, and IP addresses are set and pinging over the proper networks (through the code 20 operation).
    • /etc/hosts is blank.
    • The ESS 5000 tgz file (for the correct edition) is in the /home/deploy directory. If upgrade is needed, download from Fix Central and replace.
    • Network bridges are cleared.
    • Images and containers are removed.
    • SSH keys are cleaned up and regenerated.
    • All installed firmware, the operating system, and the RAID array (RAID 10) are at the correct levels for the latest ESS 5000 version available at the time of shipping (manufacturing task).
    • All code levels are the latest available at the time of manufacturing ship.
  • The customer must make sure that the high-speed connections are cabled and the switch is ready before starting.
  • All node names and IP addresses in this document are examples.
  • If you change the root password, it should be the same on each node, if possible. The default password is ibmesscluster. It is recommended to change the password after deployment is completed.
  • Each server's IPMI and ASMI passwords (POWER® nodes only) are set to the server serial number. Consider changing these passwords when the deployment is complete.

ESS best practices

  • ESS 6.x.x.x uses a new embedded license. It is important to know that installation of any Red Hat packages outside of the deployment upgrade flow is not supported. The container image provides everything required for a successful ESS deployment. If additional packages are needed, contact IBM for possible inclusion in future versions.
  • For ESS 3000, consider enabling TRIM support. This is outlined in detail in IBM Spectrum Scale RAID Administration. By default, ESS 3000 only allocates 80% of the available space. Consult with IBM development to determine whether going beyond 80% makes sense for your environment, that is, if you are not concerned about the performance implications of this change.
  • You must set up a campus or additional management connection before deploying the container.
  • If running with a POWER8 and a POWER9 EMS in the same environment, it is best to move all containers to the POWER9 EMS. If there is a legacy PPC64LE system in the environment, it is best to migrate all nodes to ESS 6.1.x.x and decommission the POWER8 EMS altogether. This way you do not need to run multiple ESS GUI instances.
  • If you have a POWER8 EMS, you must upgrade the EMS by using the legacy flow if there are xCAT based PPC64LE nodes in the environment (including protocol nodes). If there are just an ESS 3000 system and a POWER8 EMS, you can upgrade the EMS from the ESS 3000 container.
  • If you are migrating the legacy nodes to ESS 6.1.x.x on the POWER8 EMS, you must first uninstall xCAT and all dependencies. It is best to migrate over to the POWER9 EMS if applicable.
  • You must be at ESS 5.3.7 (Red Hat Enterprise Linux 7.7 / Python3) or later to run the ESS 3000 container on the POWER8 EMS.
  • You must run the essrun config load command against all the storage nodes (including EMS and protocol nodes) in the cluster before enabling admin mode central or deploying the protocol nodes by using the installation toolkit (see the example after this list).
  • If you are running a stretch cluster, you must ensure that each node has a unique hostid. The hostid might be non-unique if the same IP addresses and host names are being used on both sides of the stretch cluster. Run gnrhealthcheck before creating recovery groups when adding nodes in a stretch cluster environment. You can manually check the hostid on all nodes as follows:
    mmdsh -N { NodeClass | CommaSeparatedListofNodes } hostid

    If the hostid on any node is not unique, you must fix it by running genhostid on that node. These steps must be done when creating a recovery group in a stretch cluster.

  • Consider placing your protocol nodes in file system maintenance mode before upgrades. This is not a requirement but you should strongly consider doing it. For more information, see File system maintenance mode.
  • Do not try to update the EMS node while you are logged in over the high-speed network. Update the EMS node only through the management or the campus connection.
  • After adding an I/O node to the cluster, run the gnrhealthcheck command to ensure that there are no issues, such as duplicate host IDs, before creating vdisk sets. Duplicate host IDs cause issues in the ESS environment.
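
The following is a minimal sketch of the essrun config load step that is referenced above, assuming hypothetical node names (ems1, essio1, essio2) and the default root password. Run it from the deployment container, and confirm the exact syntax against the deployment guide for your ESS release:

    essrun -N essio1,essio2,ems1 config load -p ibmesscluster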