IBM Support

Setting or Resetting drive Ready RLM Bit

Troubleshooting


Problem

Some ESS customers may find inconsistencies in the default Ready LED state for drives in an Elastic Storage Server cluster. While this problem has no impact on ESS operations, it may lead to incorrect conclusions about the health of one or more drives in a drive enclosure.

Symptom

A Ready light may be solid on for one drive while the lights on all other drives are off or blinking intermittently.

Cause

Under certain circumstances, it is normal for some Ready lights to default to a state other than off for some ESS drive parts.

Environment

This issue only applies to Elastic Storage Server disk enclosures.

Diagnosing The Problem

IBM does not recommend using the drive LEDs for servicing the Elastic Storage Server. Instead, use the Disk Hospital functions as outlined in the Knowledge Center:
https://www.ibm.com/support/knowledgecenter/en/SSYSP8_5.1.0/com.ibm.spectrum.scale.raid.v4r23.adm.doc/bl1adv_introdiskhospital.htm

Resolving The Problem

The command mmgetpdisktopology lists disk devices, but it also lists enclosures, adapters, and other SES devices. To isolate the disks, filter its output with a command such as egrep or sed so that only the lines carrying disk enclosure slot locations remain. An example (assuming bash shell) is below:
mmgetpdisktopology | sed -e '/:NVRAM:/d' -e '/:eui./d' -e '/LSI/d' -e '/name:/d' -e '/host:/d'
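The sed command shown here deletes every line that matches one of five patterns (:NVRAM:, :eui., LSI, name:, and host:). What remains are the disk entries, both for enclosures that have an internal drawer device (such as 1818-80E) and for those that do not (such as a 5147-024). The output may look something like the following sample: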
sdrx,sdsu:0:/dev/sdrx,/dev/sdsu:C0A832565C61EC94|e3d1s22|BB01R|C0A832335C61EC69|DA1||7200:naa.5000C500A77A0663:U78CB.001.WZS0C5W-P1-C2-L180-L0,U78CB.001.WZS0C5W-P1-C10-L180-L0:[4.0.180.0],[2.0.180.0]:/dev/sg529,/dev/sg552:IBM-ESXS:ST14000NM0288 E:ECH4:ZHZ1BJTF0000C92057ER:01LU841:13902809137152:naa.500062B200F130A0,naa.500062B200F16160:50050cc11a432e7f.50050cc11a824eff:SHG1000179Y0MRT:L/R:1-22:5000c500a77a0661.5000c500a77a0662:sg183,sg168:
sdrd,sdrw:0:/dev/sdrd,/dev/sdrw:C0A832565C61EC95|e3d1s12|BB01R|C0A832335C61EC69|DA1||7200:naa.5000C500A77A06DB:U78CB.001.WZS0C5W-P1-C2-L170-L0,U78CB.001.WZS0C5W-P1-C10-L170-L0:[4.0.170.0],[2.0.170.0]:/dev/sg509,/dev/sg528:IBM-ESXS:ST14000NM0288 E:ECH4:ZHZ1BK390000C9201NHR:01LU841:13902809137152:naa.500062B200F130A0,naa.500062B200F16160:50050cc11a432e3f.50050cc11a824ebf:SHG1000179Y0MRT:L/R:1-12:5000c500a77a06d9.5000c500a77a06da:sg183,sg168:
Only two lines of output are shown in this example. The enclosure, drawer, and slot location appears between the first pair of vertical bars in each line (|e3d1sNN|). Further to the right on each line are the hardware device names that the commands later in this section operate on:
:/dev/sg529,/dev/sg552:
:/dev/sg509,/dev/sg528:
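As a rough illustration of reading these two pieces out of a line, the sketch below pulls the slot location and the pair of /dev/sg paths from one sample line. The helper name slot_and_paths is hypothetical, and the pattern is inferred only from the sample output above, not from a documented format, so treat it as an assumption that may need adjusting on other systems.

```shell
#!/usr/bin/env bash
# Sample line trimmed from the output shown above (trailing fields omitted).
sample='sdrx,sdsu:0:/dev/sdrx,/dev/sdsu:C0A832565C61EC94|e3d1s22|BB01R|C0A832335C61EC69|DA1||7200:naa.5000C500A77A0663:U78CB.001.WZS0C5W-P1-C2-L180-L0,U78CB.001.WZS0C5W-P1-C10-L180-L0:[4.0.180.0],[2.0.180.0]:/dev/sg529,/dev/sg552:IBM-ESXS:ST14000NM0288 E:ECH4:'

# slot_and_paths (hypothetical helper): reads topology lines on stdin and
# prints "<location> <path1>,<path2>" for each line that contains both a
# |location| field and a colon-delimited pair of /dev/sg paths.
slot_and_paths() {
  sed -nE 's#^[^|]*[|]([^|]*)[|].*:(/dev/sg[0-9]+,/dev/sg[0-9]+):.*$#\1 \2#p'
}

# In practice the input would come from mmgetpdisktopology rather than $sample.
printf '%s\n' "$sample" | slot_and_paths
```

For the sample line this prints `e3d1s22 /dev/sg529,/dev/sg552`. Lines that lack either part are skipped silently.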
To collect a single path for each disk in a shell variable, something like the following could be used (further refining may be necessary depending on the scope desired):
encdisks=$(mmgetpdisktopology | sed -e '/:NVRAM:/d' -e '/:eui./d' -e '/LSI/d' -e '/name:/d' -e '/host:/d' | sed -e 's|.*/dev/sg|/dev/sg|' -e 's|:.*||g')
The two sample output lines represent two separate disk drives. The two /dev entries on each line are separate paths to the same drive, so only one of the two paths needs to be used for each drive (such as /dev/sg529 for the first drive, and /dev/sg509 for the second drive). No action is needed on the other path, and it does not matter which of the two paths is used to perform the action.
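To see which of the two paths the path-extraction sed expressions actually keep, the sketch below runs them against the two sample lines instead of live mmgetpdisktopology output. The helper name one_path_per_drive is hypothetical, and the sample lines are trimmed after the IBM-ESXS field.

```shell
#!/usr/bin/env bash
# Two lines trimmed from the sample output above (trailing fields omitted).
sample_lines='sdrx,sdsu:0:/dev/sdrx,/dev/sdsu:C0A832565C61EC94|e3d1s22|BB01R|C0A832335C61EC69|DA1||7200:naa.5000C500A77A0663:U78CB.001.WZS0C5W-P1-C2-L180-L0,U78CB.001.WZS0C5W-P1-C10-L180-L0:[4.0.180.0],[2.0.180.0]:/dev/sg529,/dev/sg552:IBM-ESXS:
sdrd,sdrw:0:/dev/sdrd,/dev/sdrw:C0A832565C61EC95|e3d1s12|BB01R|C0A832335C61EC69|DA1||7200:naa.5000C500A77A06DB:U78CB.001.WZS0C5W-P1-C2-L170-L0,U78CB.001.WZS0C5W-P1-C10-L170-L0:[4.0.170.0],[2.0.170.0]:/dev/sg509,/dev/sg528:IBM-ESXS:'

# one_path_per_drive (hypothetical helper): the same two sed expressions used
# to build encdisks. The first drops everything before the LAST /dev/sg on the
# line (".*" is greedy); the second drops everything from the next colon on.
one_path_per_drive() {
  sed -e 's|.*/dev/sg|/dev/sg|' -e 's|:.*||'
}

printf '%s\n' "$sample_lines" | one_path_per_drive
```

On these two lines the result is `/dev/sg552` and `/dev/sg528`, that is, the second path of each pair. Since either path reaches the same drive, this is equivalent to using /dev/sg529 and /dev/sg509.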
A bash for loop could be used to turn on or off the ready light for all selected drives:
for disk in $encdisks
do
  sg_wr_mode <parameters from below> $disk &
done
To turn the Ready (RLM) light off, use the following command, replacing /dev/sgNNN with the device path (for example, /dev/sg529). The command must be repeated for other drives as needed:
sg_wr_mode -p 0x19 -c 0,0,10 -m 0,0,10 --save /dev/sgNNN
Conversely, to turn the Ready (RLM) light on, use the following command (again, replacing /dev/sgNNN with the device path):
sg_wr_mode -p 0x19 -c 0,0,00 -m 0,0,10 --save /dev/sgNNN
If using a loop, substitute the appropriate loop variable ($disk) for /dev/sgNNN.  Visual inspection of the drive ready light can confirm the results.  Remember that disk activity may cause the drive Ready light to blink.
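One cautious way to combine the loop with the two commands is to print each command first and review it before running anything. The sketch below does only that. The helper name ready_led_cmd and the hard-coded device list are assumptions for illustration; the sg_wr_mode arguments are copied unchanged from the commands above.

```shell
#!/usr/bin/env bash
# ready_led_cmd (hypothetical helper): prints, but does not run, the
# sg_wr_mode command from this document for one device.
#   $1 = "on" or "off"    $2 = device path such as /dev/sg529
ready_led_cmd() {
  local state=$1 dev=$2 contents
  case $state in
    off) contents='0,0,10' ;;   # contents this document gives for light off
    on)  contents='0,0,00' ;;   # contents this document gives for light on
    *)   echo "usage: ready_led_cmd on|off /dev/sgNNN" >&2; return 1 ;;
  esac
  printf 'sg_wr_mode -p 0x19 -c %s -m 0,0,10 --save %s\n' "$contents" "$dev"
}

# Example device list taken from the sample output; on a real system this
# would be the list collected from mmgetpdisktopology.
encdisks='/dev/sg529 /dev/sg509'

for disk in $encdisks
do
  ready_led_cmd off "$disk"
done
```

The loop prints one sg_wr_mode line per device. After checking the list, the printed lines can be run by hand or piped to a shell.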


Document Location

Worldwide

Document Information

Modified date:
08 July 2019

UID

ibm10872638