ESS known issues

For information about ESS 5.3.7.x known issues, see Known issues in ESS 5.3.7.x Quick Deployment Guide.

The following entries describe the known issues in IBM Elastic Storage® System (ESS), the affected products, and how to resolve these issues.

Enablement of IPv6 for RoCE by using the service network adds a problematic entry to /etc/sysconfig/network-scripts/ifcfg-bond-bond0.

Product
  • ESS 3200
  • ESS 3500
  • ESS 5147-102
  • ESS 5000
  1. Create a bond by using the essgennetworks utility.
  2. Remove (or comment out) the problematic entry in the /etc/sysconfig/network-scripts/ifcfg-bond-bond0 file of the node (see the sketch after the example below).
  3. Run nmcli con reload.
  4. Run nmcli con down bond-bond0; nmcli con up bond-bond0.
  5. Run all other essgennetworks commands to enable RoCE according to the procedure.
    Note: With this workaround, the final RoCE enablement command works as designed.
    Example:
    essgennetworks -N essio51 --suffix=-ce --interface enP48p1s0f0,enP48p1s0f1 --bond bond0 --enableRoCE --mtu 9000
    [INFO] Starting network generation...
    [INFO] nodelist: essio51
    [INFO] suffix used for network hostname:-ce
    [INFO] Interface(s) available on node essio51-ce
    [INFO] Considered interface(s) of node essio51-ce are ['enP48p1s0f0', 'enP48p1s0f1', 'bond0'] with RDMA Port ['mlx5_2', 'mlx5_3', 'mlx5_bond_0'] for this operation
    [INFO] Supported Mellanox RoCE card found at node essio51
    [INFO] Supported version of Mellanox OFED found at node essio51-ce
    [INFO] Bond validation passed and found bonds bond0 has been created using same physical network adapter at node essio51-ce
    [INFO] Bond MTU validation passed and found bonds MTU set to 9000 at node essio51-ce
    [INFO] Interface bond0 have the IPv4 Address assigned at node essio51-ce
    [INFO] Interface bond0 have the IPv6 Address assigned at node essio51-ce
    [INFO] Interface MTU also set to 9000 at node essio51-ce
    [INFO] Interface enP48p1s0f0 have the IPv4 Address assigned at node essio51-ce
    [INFO] Interface enP48p1s0f1 have the IPv4 Address assigned at node essio51-ce
    [INFO] Interface enP48p1s0f0 have the IPv6 Address assigned at node essio51-ce
    [INFO] Interface enP48p1s0f1 have the IPv6 Address assigned at node essio51-ce
    [INFO] Enabling RDMA for Ports ['mlx5_bond_0', 'mlx5_2', 'mlx5_3']
    [INFO] Enabled RDMA i.e. RoCE using bond bond0
    [INFO] Enabled RDMA i.e. RoCE using interfaces enP48p1s0f0,enP48p1s0f1
    [INFO] Please recycle the GPFS daemon on those nodes where RoCE has been enabled.
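A minimal sketch of steps 2 through 4, run on each affected node, assuming the problematic entry is the IPV6_ADDR_GEN_MODE parameter identified in the ESS 5000 entry later in this table:
  # Comment out the problematic entry (assumed to be IPV6_ADDR_GEN_MODE),
  # then re-read the connection file and recycle the bond.
  sed -i 's/^IPV6_ADDR_GEN_MODE/#IPV6_ADDR_GEN_MODE/' /etc/sysconfig/network-scripts/ifcfg-bond-bond0
  nmcli con reload
  nmcli con down bond-bond0; nmcli con up bond-bond0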
During the GUI setup in a dual-EMS environment, the backup EMS is shown twice during location specification.
Product
  • ESS 3200
  • ESS 3500
  • ESS 5147-102
  • ESS 5000
  • During the GUI wizard setup or GUI component editing, you are prompted twice to specify the rack location of the backup EMS. Do the following steps to resolve this issue:
    1. During the first prompt for the rack location of the management servers, leave the location blank for the backup EMS.
    2. Click Next to go to the panel for specifying the location of the other nodes, and choose the rack location of the backup EMS in this panel.
  • During the GUI wizard setup or GUI component editing, when actions are run through the GUI after users specify the rack locations, the GUI action might fail because the backup EMS was added twice to the component database. Do the following steps to resolve this issue:
    1. Ignore the error and click Close to close the window.
    2. Click Finish to continue.

The mmhealth command fails to report a failed (missing) cable between the ESS 3500 HBA and the 5147-102 IOM.
Product
  • ESS 3500
  • ESS 5147-102
If a SAS cable is pulled between the HBA adapter and the IOM of the enclosure, the mmhealth node show command does not flag any error. Do the following steps to resolve this issue (see the sketch after this list):
  1. Monitor the mmfs.log.latest file.
  2. If suspicious errors indicate that disk paths or the IOM are missing, run mmgetpdisktopology and pass the output to topsummary to find out which paths are missing.
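A minimal sketch of step 2, run on the affected I/O node (the output file name is arbitrary):
  mmgetpdisktopology > /tmp/ioNode1.top   # capture the current disk topology
  topsummary /tmp/ioNode1.top             # summarize it; missing paths and IOMs stand out here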
The mmhealth callhome command repeatedly reports the callhome_ptfupdates_failed event on a daisy-chain cluster (a cluster with more than four 5147-102 enclosures attached).
Product
  • ESS 3500
  • ESS 5147-102
After ESS is updated to 6.1.6.1, the mmhealth node show command on the EMS shows the callhome_ptfupdates_failed event.

Run the mmsysmoncontrol restart command. The error is cleared temporarily; however, it appears again a couple of hours after mmsysmonitor is restarted. This error does not affect the call home function.

During the dual-EMS GUI setup, the backup EMS is no longer shown in the Hardware panel.
Product
  • ESS 3200
  • ESS 3500
  • ESS 5147-102
  • ESS 5000
  1. Run the mmlscomp command from the EMS to find the component ID of the backup EMS (see the sketch after this list).
  2. Run the mmdelcomp <component_id> command, where component_id is the component ID obtained in step 1.
  3. Return to the GUI Hardware panel and click Edit Rack Components, at the top left of the server list on the right side.
  4. Choose Yes, discover new servers and enclosures first. This takes many minutes.
  5. Click Next and follow the screen prompts to complete the Hardware Components setup.
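A minimal sketch of steps 1 and 2, run from the EMS (the component ID 5 is a hypothetical value; use the ID that mmlscomp reports for the backup EMS):
  mmlscomp --type server   # list server components with their component IDs
  mmdelcomp 5              # delete the duplicate backup EMS component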

The admin mode central command is deprecated in ESS 6.1.5.1.
Product
  • ESS 3200
  • ESS 3500
  • ESS 5147-102
  • ESS 5000
  • ESS Legacy
The admin mode central command is part of a suite of optional security enhancements that can be turned on (along with firewall, sudo, and SELinux). The current design of admin mode central is being reworked, and thus it is temporarily deprecated.
Call home deployment fails if the gpfsadmin user name is created or exists as part of SUDO enablement before the call home deployment.
Product
  • ESS 5000
  • If SUDO is already configured before esscallhomeconf is deployed, the workaround is to:
    1. Disable SUDO (on the relevant nodes where it is enabled).
    2. Deploy the call home configuration (in the default root mode).
  • If SUDO is not enabled yet, the recommendation is to:
    1. Deploy call home first. (This can be done through the GUI setup or manually by using esscallhomeconf. For more information, see the ESS documentation.)
    2. Enable SUDO on the cluster nodes.
For RoCE enablement, the bond interface creation might add a problematic entry to /etc/sysconfig/network-scripts/ifcfg-bond-bond0. The identified parameter is IPV6_ADDR_GEN_MODE.
Product
  • ESS 5000
  1. Create a bond by using the essgennetworks utility.
  2. Comment out the problematic IPV6_ADDR_GEN_MODE entry in the /etc/sysconfig/network-scripts/ifcfg-bond-bond0 file of the specific node.
  3. Run nmcli con reload.
  4. Run nmcli con down bond-bond0; nmcli con up bond-bond0.
  5. Run all other essgennetworks commands to enable RoCE according to the procedure.
    Note: With this workaround, the final RoCE enablement command works as designed.
    Example: The command and output are the same as in the first RoCE entry in this table.
The mmvdisk sed enroll command might proceed when it is issued after user vdisk sets are created, instead of being blocked.
Product
  • ESS 3500
  • Issue the mmvdisk sed enroll command after creating a recovery group and before creating user vdisk sets (see the ordering sketch after this list).
  • If you issued the mmvdisk sed enroll command after creating user vdisk sets, contact IBM Support.
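An ordering sketch under stated assumptions (rg1, nc1, and vs1 are hypothetical names, and the mmvdisk sed enroll option shown is an assumption; check the mmvdisk command reference for your release):
  mmvdisk recoverygroup create --recovery-group rg1 --node-class nc1
  mmvdisk sed enroll --recovery-group rg1   # assumed flag; enroll BEFORE any user vdisk sets
  mmvdisk vdiskset define --vdisk-set vs1 --recovery-group rg1 --code 8+2p --block-size 4m --set-size 80%
  mmvdisk vdiskset create --vdisk-set vs1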

The ESS system reboots unexpectedly because mpt3sas messages fill the logs.

The following error appears:
System crashed with 'swiotlb buffer is full' then 'scsi_dma_map failed' errors
Product
  • ESS 3500
To resolve this error, see the Red Hat known issues.

The .44 mpt3sas driver will be available in ESS 6.1.6.1.


Customers might encounter false-positive intermittent fan module failures in /var/log/messages. It is also possible that a call home is generated for each fan module.

Errors typically seen:
mmsysmon[7819]: [W] Event raised: Fan fan_module1_id4 has a fault.
mmsysmon[7819]: [W] Event raised: Fan fan_module1_id4 state is FAILED.
Product
  • ESS 3500

If this is seen, contact IBM Support to verify the false-positive condition (a quick check is sketched below).
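A quick way to gather the evidence before opening the case, using the log location and event text from this entry:
  grep 'fan_module' /var/log/messages   # list the intermittent fan events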

Running an essrun ONLINE update might fail in the mmchfirmware -N localhost --type drive section.
Product
  • ESS 5000
Manually issue the mmchfirmware command after the deployment (see the example below).
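For example, on the node where the section failed (the invocation is the one named in this issue):
  mmchfirmware -N localhost --type drive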

In ESS 5000, if Y-type cables (used with HDR switches) are used to run the high-speed network, the mmhealth node show -N ioNode1-ib,ioNode2-ib NETWORK command might show ib_rdma_port_width_low (mlx5_0/1, mlx5_1/1, mlx5_4/1).

Product
  • ESS 5000
  1. Check for existing anomalies in the HDR Y-cables.
  2. Contact IBM Support for help to update /usr/lpp/mmfs/lib/mmsysmon/NetworkService.py and /usr/lpp/mmfs/lib/mmsysmon/network.json with the appropriate code.
  3. After patching, restart mmsysmon to apply the changes. Example: systemctl restart mmsysmon.
  4. Issue the mmhealth command to verify whether the condition is alleviated (see the example after this list).
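A verification sketch for step 4, reusing the command from the issue description (ioNode1-ib and ioNode2-ib stand in for your I/O node high-speed host names):
  mmhealth node show -N ioNode1-ib,ioNode2-ib NETWORK   # ib_rdma_port_width_low should no longer be shown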

When the EMS is updated from a previous release to ESS 6.1.5.1, setting up SELinux on the EMS by issuing the essrun selinux enable command in a container fails. The following error appears:
Failed to resolve typeattributeset statement at /var/lib/selinux/targeted/tmp/modules/400/pcpupstream/cil:42
The issue may be related to a bug in Red Hat kernel 8.6.
Product
  • ESS legacy
  • ESS 3000
  • ESS 5000
  • ESS 3200
  • ESS 3500
  • ESS 3500 (4u102)
  1. Reboot the EMS.
  2. Restart the container.
  3. Ensure that selinux-policy is up to date by issuing the yum update selinux-policy command.
  4. Reinstall pcp-selinux by issuing the yum reinstall pcp-selinux command (a consolidated sketch follows this list).
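A consolidated sketch of steps 3 and 4, run on the EMS after the reboot and container restart; afterward, retry essrun selinux enable from within the container:
  yum update selinux-policy    # make sure the SELinux policy package is current
  yum reinstall pcp-selinux    # reinstall the pcp SELinux module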
When you create additional file systems in a tiered storage environment, you might encounter a MIGRATION callback error:
mmaddcallback: Callback identifier "MIGRATION" already exists or was specified multiple times.
If such a callback exists, file system creation fails.
Product
  • ESS 3000
  • ESS 3200
  • ESS 5000
  • ESS 3500
  • ESS 3500 (4u102)

Delete the MIGRATION callback and create the file system again (see the sketch below).
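A minimal sketch, using the callback identifier from the error message above:
  mmlscallback MIGRATION    # confirm that the callback exists
  mmdelcallback MIGRATION   # delete it, then create the file system again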
