Installation and upgrade related information and checklists
Review the following installation and upgrade related information before you install or upgrade Elastic Storage Server (ESS).
- Component versions for this release
- Supported editions on each architecture
- ESS best practices and support statements
- Obtaining the required Red Hat Enterprise Linux and ESS code
- Customer networking considerations
- Supported upgrade paths
- Support for hardware call home
- Pre-installation checklist
- Post-installation checklist
- Other topics
- Sample installation and upgrade flow
New features and enhancements in ESS 5.3.0
- gssgennetworks support for IB bonding
- gssdeploy improvements (PPC64LE discovery / genesis)
- gssutils overhaul
- Support for setting MTU in gssgennetworks
Component versions for this release
- Supported architectures: PPC64BE and PPC64LE
- IBM Spectrum Scale™: 5.0.0.1
- xCAT: 2.13.9
- HMC: 860 SP2
- System firmware: SV860_138(FW860.42)
- Red Hat Enterprise Linux: 7.3
- Kernel: 3.10.0-514.44.1
- Systemd: 219-42.el7_4.10
- Network Manager: 1.8.0-11.el7_4
- OFED: MLNX_OFED_LINUX-4.1-4.1.6.1
- IPR: 17518300
- ESA: 4.2.0-9
Supported editions on each architecture
The following ESS editions are supported on the available architectures.
PPC64BE
- Standard Edition
- Advanced Edition
- Data Management Edition
PPC64LE
- Standard Edition
- Data Management Edition
ESS best practices and support statements
- It is advised that you disable autoload before performing normal maintenance operations or upgrades.
mmchconfig autoload=no
Once the maintenance operation (or upgrade) is complete, re-enable autoload.
mmchconfig autoload=yes
- By default, file systems must be mounted only on the management server node (EMS). Do not mount the file system on any ESS node other than the EMS, where the primary GUI runs. The mount on the EMS is mandatory for the GUI to function correctly.
- It is advised that you disable automount for file systems when performing an upgrade to ESS 5.3.0.1 or later.
mmchfs Device -A no
Device is the device name of the file system.
- Do not configure more than 5 failure groups in a single file system.
- All InfiniBand devices must be set to CONNECTED_MODE=no.
- If you have 40Gb adapters, enable flow control on your switch.
- RDMA over Ethernet (RoCE) is not supported.
- Sudo on the ESS nodes is not supported.
- Enabling the firewall on any ESS node is not supported.
- Enabling SELinux on any ESS node is not supported.
- Running any additional service or protocols on any ESS node is not supported.
- Move quorum and cluster or file system management function off of the ESS nodes where possible.
- All nodes must be at the same level prior to adding a building block. Therefore, upgrade the existing ESS building blocks before adding the new one.
- You must take down the GPFS cluster to run firmware updates in parallel.
- Do not independently update IBM Spectrum Scale (or any other component) on any ESS node unless specifically advised to do so by L2 service. Normally this is needed only to resolve an issue. Under normal circumstances, upgrade only with the tested bundles.
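The autoload advice above (disable before maintenance, re-enable once it is complete) can be sketched as a small Python wrapper. This is only an illustration: the helper names `run_command` and `autoload_disabled` are hypothetical, and the only commands it issues are the two `mmchconfig` invocations shown in this list.

```python
import subprocess
from contextlib import contextmanager


def run_command(argv):
    """Run a command and raise CalledProcessError if it exits non-zero."""
    subprocess.run(argv, check=True)


@contextmanager
def autoload_disabled(run=run_command):
    """Disable autoload around a maintenance block (hypothetical helper).

    Autoload is re-enabled only when the block finishes without raising.
    If the maintenance step fails, autoload stays off so that an operator
    can decide what to do next.
    """
    run(["mmchconfig", "autoload=no"])
    yield
    run(["mmchconfig", "autoload=yes"])
```

A maintenance script would place its steps inside `with autoload_disabled():`. Passing a different `run` callable makes it possible to dry-run the sequence without touching the cluster.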
Obtaining the required Red Hat Enterprise Linux and ESS code
- Red Hat Enterprise Linux 7.3 ISO
18402 3370316 rhel-server-7.3-ppc64-dvd.iso
45464 3114204 rhel-server-7.3-ppc64le-dvd.iso
- Network manager version: 1.8.0-11.el7_4
40869 8769 netmgr-RHBA-2017-2925-BE.tar.gz
31336 8660 netmgr-RHBA-2017-2925-LE.tar.gz
- Systemd version: 219-42.el7_4.10
56649 7212 systemd-530-RHBA-2018-0416-BE.tar.gz
24625 7149 systemd-530-RHBA-2018-0416-LE.tar.gz
- Kernel version: 3.10.0-514.44.1
2013 63700 kernel-530-RHSA-2018-0399-BE.tar.gz
5436 63544 kernel-530-RHSA-2018-0399-LE.tar.gz
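The number pairs listed with each file resemble the output of the coreutils `sum` command (a 16-bit BSD checksum followed by a count of 1 KiB blocks). That reading is an assumption, not something this document states. Under that assumption, a minimal Python sketch of the same calculation looks like this; the function names are hypothetical.

```python
def bsd_sum(chunks):
    """Return (checksum, blocks) for an iterable of bytes objects."""
    checksum = 0
    size = 0
    for chunk in chunks:
        size += len(chunk)
        for byte in chunk:
            # Rotate the 16-bit checksum right by one bit, then add the byte.
            checksum = (checksum >> 1) + ((checksum & 1) << 15)
            checksum = (checksum + byte) & 0xFFFF
    blocks = (size + 1023) // 1024  # round up to whole 1 KiB blocks
    return checksum, blocks


def read_chunks(path, chunk_size=1 << 20):
    """Yield a file's contents in chunks so large images are not loaded at once."""
    with open(path, "rb") as handle:
        while True:
            chunk = handle.read(chunk_size)
            if not chunk:
                return
            yield chunk


def matches_listing(path, expected_checksum, expected_blocks):
    """Compare a downloaded file against the pair listed in this document."""
    return bsd_sum(read_chunks(path)) == (expected_checksum, expected_blocks)
```

For example, `matches_listing("rhel-server-7.3-ppc64le-dvd.iso", 45464, 3114204)` would check the PPC64LE ISO. A pure Python loop is slow on multi-gigabyte images, so in practice running the `sum` command directly and comparing by eye is likely faster.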
On ESS 5.3.0.x systems shipped from manufacturing, these items can be found on the management server node in the /home/deploy directory.
If you are a member of IBM, you must contact ESS development or L2 service to obtain the code directly.
ESS 5.3.0.x can be downloaded from IBM® FixCentral.
The ESS software archive is available in different versions for both the PPC64BE and PPC64LE architectures.
ESS_STD_BASEIMAGE-5.3.0.1-ppc64-Linux.tgz
ESS_ADV_BASEIMAGE-5.3.0.1-ppc64-Linux.tgz
ESS_DM_BASEIMAGE-5.3.0.1-ppc64-Linux.tgz
ESS_STD_BASEIMAGE-5.3.0.1-ppc64le-Linux.tgz
ESS_DM_BASEIMAGE-5.3.0.1-ppc64le-Linux.tgz
tar -xvf ESS_STD_BASEIMAGE-5.3.0.1-ppc64-Linux.tgz
The BASEIMAGE tar file contains the following files that get extracted with the preceding command:
- ESS_5.3.0.1_ppc64_Release_note_Standard.txt: This file contains the release notes for the latest code.
- gss_install-5.3.0.1_ppc64le_standard_20180412T022648Z.tgz: This .tgz file contains the ESS code.
- gss_install-5.3.0.1_ppc64le_standard_20180412T022648Z.md5: This .md5 file is used to check the integrity of the .tgz file.
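The integrity check that the .md5 file enables can be sketched in Python with the standard `hashlib` module. The sketch assumes the .md5 file follows the usual `md5sum` layout, with the hexadecimal digest as the first whitespace-separated token. That layout is an assumption, and the function names are hypothetical.

```python
import hashlib


def md5_of_chunks(chunks):
    """Return the hexadecimal MD5 digest of an iterable of bytes objects."""
    digest = hashlib.md5()
    for chunk in chunks:
        digest.update(chunk)
    return digest.hexdigest()


def expected_md5(md5_text):
    """Extract the digest from md5sum-style text (assumed: digest comes first)."""
    return md5_text.split()[0].lower()


def archive_is_intact(chunks, md5_text):
    """True when the computed digest equals the one recorded in the .md5 text."""
    return md5_of_chunks(chunks) == expected_md5(md5_text)
```

To check a real archive, read the .tgz file in binary mode in chunks, read the .md5 file as text, and pass both to `archive_is_intact`.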
Customer networking considerations
Review the information about switches and switch firmware that were used to validate this ESS release. For information about available IBM networking switches, see the IBM networking switches page on IBM Knowledge Center.
Switch MTM | Switch description | Switch FW |
8828-E36/E37 | Mellanox SB7700 36port EDR | 3.6.5011 |
8831-F36/F37 | Mellanox SX6036 36port FDR | 3.6.5011 |
8831-NF2 | Mellanox SX1710 36Port 40GbE | 3.6.5011 |
- SSH to the switch.
- Issue the following commands.
For example:
# en
# show version
Example output:
login as: admin
Mellanox MLNX-OS Switch Management
Using keyboard-interactive authentication.
Password:
Last login: Mon Mar 5 12:03:14 2018 from 9.3.17.119
Mellanox Switch
io232 [master] >
io232 [master] > en
io232 [master] # show version
Product name: MLNX-OS
Product release: 3.4.3002
Build ID: #1-dev
Build date: 2015-07-30 20:13:19
Target arch: x86_64
Target hw: x86_64
Built by: jenkins@fit74
Version summary: X86_64 3.4.3002 2015-07-30 20:13:19 x86_64
Product model: x86
Host ID: E41D2D52A040
System serial num: Defined in system VPD
System UUID: 03000200-0400-0500-0006-000700080009
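A captured `show version` output can be compared with the firmware level used to validate this release. The Python sketch below pulls out the "Product release" value and compares it numerically. Treating the validated level (3.6.5011) as a minimum is an assumption made for illustration, and the function names are hypothetical.

```python
import re

VALIDATED_FIRMWARE = "3.6.5011"  # level listed for the switches in this section


def parse_product_release(show_version_output):
    """Return the 'Product release' value, or None when it is not present."""
    match = re.search(r"Product release:\s*(\S+)", show_version_output)
    return match.group(1) if match else None


def version_tuple(version):
    """Turn a dotted version such as '3.4.3002' into (3, 4, 3002)."""
    return tuple(int(part) for part in version.split("."))


def firmware_is_at_least(found, required=VALIDATED_FIRMWARE):
    """Compare dotted versions numerically, field by field."""
    return version_tuple(found) >= version_tuple(required)
```

With the example output in this section, `parse_product_release` yields `3.4.3002`, which is older than the validated level, so `firmware_is_at_least` returns `False`.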
InfiniBand with multiple fabrics
- Use gssgennetworks to properly set up IB or Ethernet bonds on the ESS system.
- Create a cluster.
- Run mmfsadm test verbs config | grep verbsPorts
mmfs verbsPorts: mlx5_0/1/4 mlx5_1/1/7
In this example output, adapter mlx5_0 port 1 is connected to fabric 4, and adapter mlx5_1 port 1 is connected to fabric 7.
Then use mmchconfig to modify verbsPorts for each node or node class to take the subnet into account.
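The adapter/port/fabric reading of the verbsPorts line can be made explicit with a short Python parser. This sketch handles only the three-field form shown in the example output and rejects anything else. The type and function names are hypothetical.

```python
from typing import NamedTuple


class VerbsPort(NamedTuple):
    adapter: str
    port: int
    fabric: int


def parse_verbs_ports(line):
    """Parse a line such as 'mmfs verbsPorts: mlx5_0/1/4 mlx5_1/1/7'."""
    # Keep only the text after the first colon, if there is one.
    _, colon, value = line.partition(":")
    if not colon:
        value = line
    ports = []
    for token in value.split():
        fields = token.split("/")
        if len(fields) != 3:
            raise ValueError(f"expected adapter/port/fabric, got {token!r}")
        adapter, port, fabric = fields
        ports.append(VerbsPort(adapter, int(port), int(fabric)))
    return ports


def adapters_by_fabric(ports):
    """Group adapter names by fabric number."""
    grouped = {}
    for entry in ports:
        grouped.setdefault(entry.fabric, []).append(entry.adapter)
    return grouped
```

Grouping by fabric gives a quick view of which adapters share a fabric before the verbsPorts value is changed with mmchconfig.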
Supported upgrade paths
- ESS version 5.1.x and 5.2.x to version 5.3.x on PPC64BE.
- ESS version 5.1.x and 5.2.x to version 5.3.x on PPC64LE.
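The upgrade paths listed above can be encoded as a simple lookup, which is handy in a pre-upgrade script. This Python sketch covers only the paths stated in this section (5.1.x or 5.2.x to 5.3.x, on either architecture), so it reports anything else as unsupported. The names are hypothetical.

```python
SUPPORTED_SOURCE_SERIES = {(5, 1), (5, 2)}  # from the list in this section
TARGET_SERIES = (5, 3)


def series(version):
    """Return the (major, minor) pair of a dotted version such as '5.2.0'."""
    major, minor = version.split(".")[:2]
    return int(major), int(minor)


def upgrade_supported(current, target):
    """True when current -> target matches one of the listed upgrade paths."""
    return series(current) in SUPPORTED_SOURCE_SERIES and series(target) == TARGET_SERIES
```

For example, `upgrade_supported("5.2.0", "5.3.0.1")` is true, while a 5.0.x starting level is reported as unsupported because it does not appear in the list.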
Support for hardware call home
 | PPC64BE | PPC64LE |
Call home when disk needs to be replaced | X | X |
Enclosure call home | Unsupported | Unsupported |
Server call home | Through HMC | Unsupported |
Pre-installation checklist
Obtain access to the required RHEL components (contact IBM ESS development or L2 Service for access). | |
Obtain the kernel, systemd, Network Manager, and RHEL ISO packages (provided by ESS development or L2 Service) and the ESS tarball (FixCentral). Verify that the checksums match what is listed in this document. Also ensure that you have the packages for the correct architecture (PPC64LE or PPC64BE). | |
Download and read the latest ESS Quick Deployment Guide and browse the related ESS 5.3 documentation in IBM Knowledge Center. | |
Obtain the customer RHEL license. | |
Contact the local SSR and ensure that all hardware checks have been completed. Make sure all hardware found to have any issues has been replaced. | |
If the 1Gb switch is not included in the order, contact the local network administrator to ensure isolated xCAT and FSP VLANs are in place. | |
Develop an inventory and plan for how to upgrade, install, or tune the client nodes. | |
Upgrade the HMC to SP2 if doing a PPC64BE installation. This can be done concurrently. | |
Consider talking to the local network administrator regarding ESS switch best practices, especially the prospect of upgrading the high-speed switch firmware at some point prior to moving the system into production, or before an upgrade is complete. For more information, see Customer networking considerations. | |
Review Elastic Storage Server: Command Reference. | |
Review ESS FAQ and ESS best practices. | |
Review the ESS 5.3.0 known issues. | |
Ensure that all client node levels are compatible with the ESS version. If needed, prepare to update the client node software on site and possibly other items such as the kernel and the network firmware or driver. | |
Power down the storage enclosures, or remove the SAS cables, until the gssdeploy -x operation is complete. |
Post-installation checklist
Call home has been set up and tested. | |
GUI has been set up and demonstrated to the customer. | |
GUI SNMP alerts have been set up, if desired. | |
The customer RHEL license is registered and active. | |
No issues have been found with mmhealth, the GUI, gnrhealthcheck, gssinstallcheck, or serviceable events. | |
No SAS width/speed issues have been found. | |
Client nodes are properly tuned. | |
It is advised that you turn on autoload to enable GPFS to recover automatically in case of a daemon problem. | |
Connect all nodes to Red Hat Network (RHN). | |
Update any security-related errata from RHN if the customer desires (yum -y security). | |
Ensure that you have saved a copy of the xCAT database to a secure location. | |
Install or upgrade the protocols. For more information, see Upgrading a cluster containing ESS and protocol nodes. | |
Ensure (if possible) that all network switches have had the firmware updated. |
Other topics
- Adding a building block (same architecture or LE<->BE)
- Restoring a management server
- Part upgrades or replacements
- VLAN reconfiguration on the 1Gb switch
Sample installation and upgrade flow
New installations go through manufacturing CSC. The system is fully installed with ESS 5.3.0 and tested, malfunctioning parts are replaced, and the required RHEL pieces are shipped in /home/deploy.
Installation
To install an ESS 5.3.0 system at the customer site, it is recommended that you use the Fusion mode available with gssutils. For more information, see Elastic Storage Server 5.2 or later: Fusion Mode and gssutils - ESS Installation and Deployment Toolkit.
- SSR checkout complete
- LBS arrival on site
- Plug-n-Play mode demonstrated
- Decisions made on block size, host names, IP addresses (/etc/hosts generated)
- Check high speed switch settings or firmware
- Firmware updated on ESS nodes
- Fusion mode used to bring system to cluster creation
- Network bonds created
- Cluster created
- Recovery groups, NSDs, file system created
- Stress test performed
- Final checks performed
- GUI setup (w/SNMP alerts if desired)
- Call home setup
- Nodes attached to RHN and security updates applied
Upgrade
To upgrade to an ESS 5.3.0 system at the customer site, it is recommended that you use gssutils. For more information, see gssutils - ESS Installation and Deployment Toolkit.
- SSR checkout complete
- Check high speed switch settings or firmware
- Ensure that there are no hardware issues
- Ensure client / protocol node compatibility
- Ensure no heavy IO operations are being performed
- Upgrade ESS (rolling upgrade or with cluster down)
- Always ensure you have quorum (if rolling upgrade)
- Always carefully balance the recovery groups and scale management functions as you upgrade each node (if rolling upgrade)
- Final checks performed
- Determine if any mmperfmon changes are required
- Ensure GUI SNMP alerts and call home are still working
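The quorum reminder in the rolling upgrade steps can be illustrated with a small check. The Python sketch below assumes plain node quorum, where more than half of the defined quorum nodes must stay active. Clusters that use tiebreaker disks follow different rules, which this sketch does not cover. The node names and function names are hypothetical.

```python
def has_quorum(quorum_nodes, nodes_down):
    """True when more than half of the quorum nodes remain active."""
    defined = set(quorum_nodes)
    active = defined - set(nodes_down)
    return len(active) > len(defined) / 2


def safe_to_take_down(quorum_nodes, already_down, candidate):
    """True when taking 'candidate' down as well still leaves node quorum."""
    return has_quorum(quorum_nodes, set(already_down) | {candidate})
```

With three quorum nodes, one can be taken down at a time. A second simultaneous outage would lose quorum, so a rolling upgrade should wait for each node to rejoin before moving on to the next.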