IBM Support

LPM Validation failure: HSCLA356 HSCLA29A with OS RC 69 and "Start initiator failed. errno=79"

Troubleshooting


Problem

Live Partition Mobility (LPM) validation fails against one or more target VIOS partitions.

This applies to VIOS 3.1.

Symptom


Message
HSCLA356 The RMC command issued to partition <VIOS_partition> failed. This means that destination VIOS partition <VIOS_partition> cannot host the virtual adapter 9 on the migrating partition.
HSCLA29A The RMC command issued to partition <VIOS_partition> failed.
The partition command is:
migmgr -f find_devices -t vscsi -C 0x3 -a ACTIVE_LPM -d 1
The RMC return code is:
0
The OS command return code is:
69
The OS standard out is:
Running method '/usr/lib/methods/mig_vscsi'
69

VIOS_DETAILED_ERROR
Executed find_devices on VIOS '<VIOS_partition>' (hostname: <VIOS_hostname>)
Client Target WWPNs: 500xxxxxxxxx0162 500xxxxxxxxx0142 500xxxxxxxxx0152
domain_id for fscsi0 is: 75
Start initiator failed. errno=79
scsi_sciolst error info: adap_flags: 0x1, failure_type: 1,
fail_reason_code: 5, fail_reason_exp: 13, einval_arg: 0
Start Initiator failed for wwpn c05xxxxxxxxx006f on /dev/fscsi0
Unable to get WWPN list for
domain_id for fscsi2 is: 58
Start initiator failed. errno=79
scsi_sciolst error info: adap_flags: 0x1, failure_type: 1,
fail_reason_code: 5, fail_reason_exp: 13, einval_arg: 0
Start Initiator failed for wwpn c05xxxxxxxxx006f on /dev/fscsi2
Unable to get WWPN list for
rc = 69 MIG_LACK_RESOURCE
End Detailed Message.
The OS standard err is:

The search was performed for the following device description:
...snip...
<activeWWPN>0xc05xxxxxxxxx006e</activeWWPN>
<inActiveWWPN>0xc05xxxxxxxxx006f</inActiveWWPN>
...
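
When several validation failures need to be compared, the key fields of the message above (the final return code, the errno values, and the WWPN/adapter pairs that failed to log in) can be pulled out with a short script. The following is a minimal sketch, assuming the detailed error text has already been saved as a string. The function name `summarize_lpm_failure` and the dictionary keys are hypothetical and are not part of any IBM tool.

```python
import re


def summarize_lpm_failure(message: str) -> dict:
    """Extract key fields from HSCLA29A detailed error text (hypothetical helper)."""
    # Final migration return code, for example "rc = 69 MIG_LACK_RESOURCE".
    rc = re.search(r"rc = (\d+) (\w+)", message)
    # Distinct errno values reported by "Start initiator failed. errno=NN".
    errnos = sorted(
        {int(n) for n in re.findall(r"Start initiator failed\. errno=(\d+)", message)}
    )
    # (WWPN, adapter device) pairs that could not log in to the fabric.
    failed_logins = re.findall(
        r"Start Initiator failed for wwpn (\S+) on (/dev/\S+)", message
    )
    return {
        "rc": int(rc.group(1)) if rc else None,
        "rc_name": rc.group(2) if rc else None,
        "errnos": errnos,
        "failed_logins": failed_logins,
    }


# Sample text copied from the detailed error in this document.
sample = """\
domain_id for fscsi0 is: 75
Start initiator failed. errno=79
Start Initiator failed for wwpn c05xxxxxxxxx006f on /dev/fscsi0
domain_id for fscsi2 is: 58
Start initiator failed. errno=79
Start Initiator failed for wwpn c05xxxxxxxxx006f on /dev/fscsi2
rc = 69 MIG_LACK_RESOURCE
"""

summary = summarize_lpm_failure(sample)
print(summary)
```

In this sample the same inactive WWPN fails on both fscsi0 and fscsi2, which points toward the fabric rather than a single adapter.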

Cause

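A SAN switch defect or misconfiguration causes the fabric to reject the fabric login that the target VIOS attempts on behalf of the client WWPN.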

Environment

PowerVM VIOS 3.1 managed by HMC

Diagnosing The Problem

A known issue affects Cisco SAN switches. Determine whether the LPM environment includes Cisco switches and, if so, which version is in use.

Resolving The Problem

Probable Cause #1

If the LPM environment includes a Cisco SAN switch, a known Cisco switch defect might apply. The defect is fixed in the following versions:
5.2(6a)S6
5.2(7.11)S0
6.1(2.27)
Contact Cisco Support for more details.

Customers who encountered the Cisco defect reported the following errors logged on the target VIOS during the LPM operation:

Sep 20 10:20:13 fcs3  T FCA_ERR14  (1 of 2) ELS to FFFFFE failed with LS_RJT, (0x00; non-ELS), Logical busy, Invalid N_Port/F_Port Name
Sep 20 10:20:13 fcs3  T FCA_ERR14  (2 of 2) ELS failed with LS_RJT

These errors indicate that the initiator WWNs attempt an FDISC (fabric login) and that the switch fabric login server at address 0xFFFFFE rejects it with LS_RJT, reason code "Logical busy" and reason explanation "Invalid N_Port/F_Port Name". In this case, engage the switch vendor to determine why the switch returns this LS_RJT when the host attempts an FDISC / fabric login.
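
To check a larger log capture for the same signature, the entries can be matched with a regular expression. The following is a minimal sketch, assuming the log lines are already available as text in exactly the format shown above. The pattern and the function name `find_ls_rjt` are hypothetical, and the field layout is inferred only from the two sample lines in this document.

```python
import re

# Pattern inferred from the two sample lines in this document (an assumption):
# "<Mon DD HH:MM:SS> <fcsN> <flag> FCA_ERR14 (<n> of <m>) <detail containing LS_RJT>"
LS_RJT_PATTERN = re.compile(
    r"^(?P<timestamp>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+"
    r"(?P<adapter>fcs\d+)\s+\S+\s+FCA_ERR14\s+"
    r"\((?P<part>\d+) of (?P<total>\d+)\)\s+"
    r"(?P<detail>.*LS_RJT.*)$"
)


def find_ls_rjt(log_text: str) -> list:
    """Return one dict per FCA_ERR14 line that mentions LS_RJT (hypothetical helper)."""
    hits = []
    for line in log_text.splitlines():
        match = LS_RJT_PATTERN.match(line.strip())
        if match:
            hits.append(match.groupdict())
    return hits


# Sample lines copied from this document.
sample_log = """\
Sep 20 10:20:13 fcs3  T FCA_ERR14  (1 of 2) ELS to FFFFFE failed with LS_RJT, (0x00; non-ELS), Logical busy, Invalid N_Port/F_Port Name
Sep 20 10:20:13 fcs3  T FCA_ERR14  (2 of 2) ELS failed with LS_RJT
"""

hits = find_ls_rjt(sample_log)
# Adapters whose login was rejected by the fabric login server at FFFFFE.
rejected_adapters = sorted({h["adapter"] for h in hits if "FFFFFE" in h["detail"]})
print(rejected_adapters)
```

Matches that name FFFFFE and "Invalid N_Port/F_Port Name" correspond to the rejection described above. Other LS_RJT entries might have unrelated causes.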

Probable Cause #2

Configuration issue on the SAN switch. If Probable Cause #1 does not apply and the target VIOS(s) log the FCA_ERR14 errors described in Probable Cause #1, contact your local switch support representative for further investigation.

Probable Cause #3

SAN switch misconfiguration.
A switch on the SAN might be configured to use features that extend the Fibre Channel standard in ways that are not compatible with Live Partition Mobility. Disabling such a feature solves some problems related to failed Fibre Channel login operations. For more details, see "FLOGI quiesce timeout" interaction with LPM, or contact your switch support representative.


Document Information

Modified date:
29 September 2023

UID

ibm10735777