ESS known issues

For information about ESS 5.3.7.x known issues, see Known issues in the ESS 5.3.7.x Quick Deployment Guide.

The following table describes the known issues in IBM Elastic Storage® System (ESS) and how to resolve these issues.
Issue Resolution or action

The POWER9 firmware that is included in the container is incorrect.

Product
  • ESS 5000
  • POWER9 EMS
  • POWER9 Protocol
To upgrade the POWER9 firmware, complete the following steps:
  1. Download the firmware from the Box folder.
    Note: You must have IBM credentials and access to this folder.
  2. Complete the procedure in the ESS Deployment Guide.

Unable to fetch the customer detail from ESA Agent. When you configure ESA, the following message might appear:

[ERROR] Unable to fetch the customer detail from ESA Agent.
Product
  • ESS Legacy
  • ESS 3000
  • ESS 5000
  • ESS 3200
  1. In the tools/bin/esscallhomeconf script, change cmd = "/opt/ibm/esa/bin/activator -d" to cmd = "'/opt/ibm/esa/bin/activator -d'".
  2. Save the file and try again.
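The edit in step 1 can also be applied with a single sed command. The sketch below is hedged: it runs against a scratch copy of the affected line rather than the real script, and /tmp/esscallhomeconf_demo is a placeholder path; on a real system, back up tools/bin/esscallhomeconf and point sed at it instead.

```shell
# Demonstrate the quoting fix on a scratch copy of the affected line.
# On a real EMS, replace /tmp/esscallhomeconf_demo with the path to
# tools/bin/esscallhomeconf and keep a backup of the original script.
printf 'cmd = "/opt/ibm/esa/bin/activator -d"\n' > /tmp/esscallhomeconf_demo
sed -i "s|cmd = \"/opt/ibm/esa/bin/activator -d\"|cmd = \"'/opt/ibm/esa/bin/activator -d'\"|" /tmp/esscallhomeconf_demo
cat /tmp/esscallhomeconf_demo
```

Using | as the sed delimiter avoids having to escape the slashes in the path.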
After initial deployment, the EMS might be missing from the GUI Hardware panel.
Product
  • ESS Legacy
  • ESS 3000
  • ESS 3200
  • ESS 5000
  • Log in to the EMS node as root, and then run the mmlscomp command to find the Component ID for the EMS node:
    Example:
    [root@ems9 bin]# mmlscomp --type server 
    Server Components 
    Comp ID Part Number Serial Number Name Node Number 
    ------- ---------------- ------------- ------------------------- ----------- 
    25 ESS3200-5141-FN1 78E400FA ESS3200-5141-FN1-78E400FA 1 
    26 ESS3200-5141-FN1 78E400FB ESS3200-5141-FN1-78E400FB 2 
    29 5105-22E 78ABC6A 5105-22E-78ABC6A 6 
    ^ ----- 29 is the component ID for the EMS in the example 
  • From the EMS command line, run the mmdelcomp command to delete the EMS component:
    Example:
    [root@ems9 bin]# mmdelcomp 29
    INFO: Deleting component 29
    mmcomp: Propagating the cluster configuration data to all 
    affected nodes. This is an asynchronous process.
  • Return to the Hardware panel in the GUI and click the Edit Component tab. The Edit Rack Components wizard appears. Select the following option:
    • Yes, discover new servers and enclosure first. This step can take many minutes.

    Continue to click Next without changing any parameters until you reach the Rack Locations section of the Edit Rack Components page. Make sure that you respecify the location of the EMS node on the page, and then click Next.

    Go to the final page and click Finish. ESS applies the change.
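If you prefer not to pick the Comp ID out of the table by eye, it can be filtered from the mmlscomp output with awk. This is only a sketch: it assumes the EMS reports machine type 5105-22E, as in the example output above, and the table rows are simulated here with printf.

```shell
# Extract the EMS Comp ID (first column) from mmlscomp-style output by
# matching the EMS machine type (5105-22E) in the second column. The rows
# are simulated with printf; on a real EMS you would pipe
# `mmlscomp --type server` into the same awk filter.
ems_id=$(printf '%s\n' \
  '25 ESS3200-5141-FN1 78E400FA ESS3200-5141-FN1-78E400FA 1' \
  '26 ESS3200-5141-FN1 78E400FB ESS3200-5141-FN1-78E400FB 2' \
  '29 5105-22E 78ABC6A 5105-22E-78ABC6A 6' |
  awk '$2 == "5105-22E" {print $1}')
echo "EMS component ID: $ems_id"
```

The same filter also applies to the SERVER2U case below if you match on SERVER2U instead of 5105-22E.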

After initial deployment, the EMS might show SERVER2U instead of 5105-22E as the MTM.
Product
  • ESS Legacy
  • ESS 3000
  • ESS 3200
  • ESS 5000
  • Navigate to the Hardware Panel of the GUI and select the image of the EMS to verify that the MTM shows as SERVER2U.
  • If this condition is seen, log in to the EMS command line as root and run the following command to determine the Component ID of the EMS node:
    Example:
    [root@ems ~]# mmlscomp --type server
    Server Components
    
    Comp ID  Part Number       Serial Number  Name                       Node Number
    -------  ----------------  -------------  -------------------------  -----------
          1  ESS3200-5141-FN1  78E400KA       ESS3200-5141-FN1-78E400KA       3
          2  ESS3200-5141-FN1  78E400HA       ESS3200-5141-FN1-78E400HA       1
          3  ESS3200-5141-FN1  78E400HB       ESS3200-5141-FN1-78E400HB       2
          4  ESS3200-5141-FN1  78E400KB       ESS3200-5141-FN1-78E400KB       4
         12  SERVER2U          78A5A5A        SERVER2U-78A5A5A        5
           ^ ----- 12 is the component ID for the EMS in the example 
    
  • Delete the EMS server by using the Comp ID, Serial Number, or Name (this example uses the Comp ID):
    Example:
    [root@ems ~]# mmdelcomp 12
    INFO: Deleting component 12
    mmcomp: Propagating the cluster configuration data to all 
    affected nodes. This is an asynchronous process. 
  • Return to the Hardware panel in the GUI and click the Edit Component tab. The Edit Rack Components wizard appears. Select the following option:
    • Yes, discover new servers and enclosure first. This step can take many minutes.

    Continue to click Next without changing any parameters until you reach the Rack Locations section of the Edit Rack Components page. Make sure that you respecify the location of the EMS node on the page, and then click Next.

    Go to the final page and click Finish. ESS applies the change.

The Ansible tool essrun cannot add more than one building block at a time in a cluster.
Product
  • ESS Legacy
  • ESS 3000
  • ESS 3200
  • ESS 5000
If it is necessary to add more than one building block in a cluster, the following two options are available:
  • Use the essrun command and add each building block individually.
  • Use the mmvdisk command to add the building blocks.
During upgrade, if the container had an unintended loss of connection with the target canister(s), there might be a timeout of up to 2 hours in the Ansible® update task.
Product
  • ESS 3000
Wait for the timeout and retry the essrun update task.
When running essrun commands, you might see messages such as these:
Thursday 16 April 2020 20:52:44 +0000
(0:00:00.572) 0:13:19.792 ********
Thursday 16 April 2020 20:52:45 +0000
(0:00:00.575) 0:13:20.367 ********
Thursday 16 April 2020 20:52:46 +0000
(0:00:00.577) 0:13:20.944 ********
Product
  • ESS Legacy
  • ESS 3000
  • ESS 3200
  • ESS 5000
This is a restriction in the Ansible timestamp module. It shows timestamps even for the “skipped” tasks. If you want to remove timestamps from the output, change the ansible.cfg file inside the container as follows:
  1. Open the file in an editor: vim /etc/ansible/ansible.cfg
  2. Remove ,profile_tasks on line 7.
  3. Save and quit (in vim, press Esc and then type :wq).
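Step 2 can also be sketched as a sed one-liner. This is hedged: the exact contents of line 7 in /etc/ansible/ansible.cfg vary by release, so the callback_whitelist line and the /tmp scratch file below are assumed examples, not the real file.

```shell
# Remove the profile_tasks entry from an assumed callback line. The real
# file is /etc/ansible/ansible.cfg inside the container; the option line
# shown here is illustrative only.
printf 'callback_whitelist = timer,profile_tasks\n' > /tmp/ansible_cfg_demo
sed -i 's/,profile_tasks//' /tmp/ansible_cfg_demo
cat /tmp/ansible_cfg_demo
```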
After a reboot of an ESS 5000 node, systemd might be loaded incorrectly.
Users might see the following error when trying to start GPFS:
Failed to activate service 'org.freedesktop.systemd1': timed out
Product
  • ESS 5000
Power off the system and then power it on again.
  1. Run the following command from the container:
    rpower <node name> off
  2. Wait for at least 30 seconds and run the following command to verify that the system is off:
    rpower <node name> status
  3. Restart the system with the following command:
    rpower <node name> on
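The three steps above can be combined into one small script that polls until the node reports off before powering it back on. This is only a sketch: rpower exists only inside the ESS container, so a stub stands in for it here, and essio1 is a placeholder node name.

```shell
# Power-cycle sketch: power off, poll until the node reports off, then
# power back on. The rpower function below is a stub for illustration;
# remove it and run inside the container against a real node.
rpower() { echo "$1: off"; }   # stub, NOT the real container rpower command

node=essio1
rpower "$node" off > /dev/null
# Poll until the node reports off (on a real system, allow at least 30 seconds).
until rpower "$node" status | grep -q off; do sleep 5; done
rpower "$node" on > /dev/null
msg="power cycle complete for $node"
echo "$msg"
```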
In the ESS 5000 SLx series, if a hard drive is pulled out long enough for it to finish draining, the drive might not be recovered when you re-insert it.
Product
  • ESS 5000
Run the following command from the EMS or an I/O node to revive the drive:
mmvdisk pdisk change --rg RGName --pdisk PdiskName --revive

Where RGName is the recovery group that the drive belongs to and PdiskName is the drive's pdisk name.

After the deployment is complete, if the firmware on the enclosure, drive, or HBA adapter does not match the expected level and you run essinstallcheck, the following mmvdisk settings related error message is displayed:
[ERROR] mmvdisk settings do NOT match best practices. 
Run mmvdisk server configure --verify --node-class  ess5k_ppc64le_mmvdisk to debug.  
Product
  • ESS 3000
  • ESS 5000

The error about mmvdisk settings can be ignored. The resolution is to update the mismatched firmware levels on the enclosure, drive, or HBA adapter to the expected levels.

To confirm the settings, run the mmvdisk configuration check command:
mmvdisk server configure --verify --node-class <nodeclass>

To list the mmvdisk node classes, run:
mmvdisk nc list
Note: essinstallcheck detects inconsistencies from mmvdisk best practices for all node classes in the cluster and stops immediately if an issue is found.
When running essinstallcheck, you might see an error message similar to the following:
System Firmware could not be obtained which will lead to a false-positive PASS message when the script completes.
Product
  • ESS 5000

Run vpdupdate on each I/O node.

Rerun essinstallcheck, which should then query the firmware level correctly.
During command-less disk replacement, there is a limit on how many disks can be replaced at one time.
Product
  • ESS 3000
  • ESS 5000
Replace no more than two disks at a time. If command-less disk replacement is enabled and more than two disks are replaceable, replace the first two disks, and then use the disk replacement commands to replace the third and subsequent disks.
Issue reported with command-less disk replacement warning LEDs.
Product
  • ESS 5000
The replaceable disk has its amber LED on but not blinking. Disk replacement should still succeed.
After an ESS 3000 node is upgraded, the pmsensors service must be started manually.
Product
  • ESS 3000
  • ESS 3200
After the ESS 3000 upgrade is complete, the pmsensors service does not start automatically. You must start the service manually for performance monitoring to be restored. On each ESS 3000 canister, run the following command:
systemctl start pmsensors
To check the status of the service, run the following command:
systemctl status --no-pager pmsensors
ESS commands such as essstoragequickcheck and essinstallcheck must be run with -N localhost. If a hostname such as -N ess3k1a is used, an error occurs.
Product
  • ESS Legacy
  • ESS 3000
  • ESS 3200
  • ESS 5000
There is currently an issue with running the ESS deployment commands by using the hostname of a node. The workaround is to run checks locally on each node by using localhost. For example, instead of using essstoragequickcheck -N ess3k1a, use the following command:
essstoragequickcheck -N localhost
The canister_failed event does not surface amber LED on the canister or the enclosure LED front panel.
Product
  • ESS 3200
Root cause: The failed canister is not the master canister, and the other canister is not up and running.

Action required: No

During essrun config load, the following error message might appear:
cat: /tmp/bmcPassUser_ems: No such file or directory
Product
  • ESS Legacy
  • ESS 3000
  • ESS 3200
  • ESS 5000
When you run the essrun config load command, always put the EMS (POWER9) system first in the list, as shown in the following example:
essrun -N ems,ess3200a,ess3200b config load
When you run the essrun config check command, the following warning might appear:
# essrun -N essio1 config check
[WARNING]: log file at /var/log/ess/6.1.2.1/essansible.json is not writeable and we cannot create it, aborting
Product
  • ESS Legacy
  • ESS 3000
  • ESS 3200
  • ESS 5000
No action required. The log file and the folder are created when the warning appears. It is safe to disregard this warning.

Call home setup by using the ESS GUI is not working.

Product
  • ESS Legacy
  • ESS 3000
  • ESS 3200
  • ESS 5000

The ESS GUI recently added support for configuring call home. However, some issues were found during the call home setup. Do not use the GUI for the call home setup. Set up call home by using the command-line interface.


Migration from ESS Legacy releases (5.3.7.x) to the container version (ESS 6.1.x.x) might revert mmvdisk values to their default settings.

Product
  • ESS Legacy

For more information about this issue, see IBM Support.

When the essrun -N node1,node2 config load command is run, the top and bottom entries are not created in the /vpd/Inventory file.

Product
  • ESS Legacy
  • ESS 3000
  • ESS 3200
  • ESS 5000

Update the main.yml file:

[CONTAINER]# sed -i "s/clusterHostname.stdout/daemonHostname.stdout/g" /opt/ibm/ess/deploy/ansible/roles/configureenv/tasks/main.yml
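Before editing the real playbook, the substitution can be previewed on a scratch line. The variable names come from the sed command above; the YAML line and /tmp path below are stand-ins for the real main.yml content.

```shell
# Preview the clusterHostname.stdout -> daemonHostname.stdout substitution
# on a scratch file before applying it to the real main.yml.
printf 'hostname: "{{ clusterHostname.stdout }}"\n' > /tmp/mainyml_demo
sed -i 's/clusterHostname.stdout/daemonHostname.stdout/g' /tmp/mainyml_demo
cat /tmp/mainyml_demo
```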
When the build is extracted by using the --dir option, an error occurs.
Product
  • ESS Legacy
  • ESS 3000
  • ESS 3200
  • ESS 5000
Use only the --start-container option.