A fix is available
APAR status
Closed as program error.
Error description
"hcr_cmd timeout" in error log, possibly followed by system crash, with a call chain similar to the following:

003CBE28 rmmap_add_io+000168 (000000000001C007, 0000000000000080, 00000003C0400080, 0000000000000001, 0000002500000025)
003E8410 io_map_init+0002F0 (??, ??, ??)
003E85B0 io_map_init_global@AF45_40+000110 (??, ??, ??)
F1000000A0292ED4 mxib_db_alloc+0002F4 (F100068800B30000, 0000000040000003, F00000002FF424C0)
F1000000A0293484 mxib_db_create+000284 (F100068800B30000, F100068800AB2200, 0000000040000003, F100068800AA9000)
F1000000A0298718 mxib_ctl+000538 (8000001A00000000, 0000000500000005, F100068800AA9000, 0000000040000003, 0000000000000000, 0000000000000000)
00014D70 .hkey_legacy_gate+00004C ()
004DF8AC rdevioctl+0000CC (??, ??, ??, ??, ??, ??)
006805C0 spec_ioctl+000080 (??, ??, ??, ??, ??, ??)
004EBA10 vnop_ioctl+000050 (??, ??, ??, ??, ??, ??)
00556F3C vno_ioctl+00009C (??, ??, ??, ??, ??)
005D478C fp_ioctl+00006C (??, ??, ??, ??)
00014F50 .kernel_add_gate_cstack+000030 ()
F1000000A02A9224 mxibHcaOpened+0001C4 (F00000002FF42B38)
049FF270 IbHcaOpen+000950 (??, ??, ??, ??, ??)
049F9750 IcmOpenQp1Stage1+000BB0 (??, ??)

If the system does not crash and a f/w upgrade is attempted at this time, f/w corruption may occur.
Local fix
Problem summary
"hcr_cmd timeout" in error log, possibly followed by system crash, with a call chain similar to the following:

003CBE28 rmmap_add_io+000168 (000000000001C007, 0000000000000080, 00000003C0400080, 0000000000000001, 0000002500000025)
003E8410 io_map_init+0002F0 (??, ??, ??)
003E85B0 io_map_init_global@AF45_40+000110 (??, ??, ??)
F1000000A0292ED4 mxib_db_alloc+0002F4 (F100068800B30000, 0000000040000003, F00000002FF424C0)
F1000000A0293484 mxib_db_create+000284 (F100068800B30000, F100068800AB2200, 0000000040000003, F100068800AA9000)
F1000000A0298718 mxib_ctl+000538 (8000001A00000000, 0000000500000005, F100068800AA9000, 0000000040000003, 0000000000000000, 0000000000000000)
00014D70 .hkey_legacy_gate+00004C ()
004DF8AC rdevioctl+0000CC (??, ??, ??, ??, ??, ??)
006805C0 spec_ioctl+000080 (??, ??, ??, ??, ??, ??)
004EBA10 vnop_ioctl+000050 (??, ??, ??, ??, ??, ??)
00556F3C vno_ioctl+00009C (??, ??, ??, ??, ??)
005D478C fp_ioctl+00006C (??, ??, ??, ??)
00014F50 .kernel_add_gate_cstack+000030 ()
F1000000A02A9224 mxibHcaOpened+0001C4 (F00000002FF42B38)
049FF270 IbHcaOpen+000950 (??, ??, ??, ??, ??)
049F9750 IcmOpenQp1Stage1+000BB0 (??, ??)

If the system does not crash and a f/w upgrade is attempted at this time, f/w corruption may occur.
Problem conclusion
Give HCA commands a sufficient timeout. Ensure that start_adapter always returns the correct error code.
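The two changes named in the conclusion can be illustrated with a minimal sketch. This is not the driver's code: hca_cmd_poll, start_adapter_sketch, the completion callback, the poll budget and the fake_hca helper are all hypothetical names chosen for illustration. Only the idea comes from the APAR text: wait long enough for the command to complete, and have the start routine report a failure code whenever it does not.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical callback: returns nonzero once the adapter reports that
 * the outstanding command has completed. */
typedef int (*cmd_done_fn)(void *ctx);

/* Poll for completion up to max_polls times.
 * Returns 0 on completion, ETIMEDOUT if the budget is exhausted. */
static int hca_cmd_poll(cmd_done_fn done, void *ctx, unsigned max_polls)
{
    for (unsigned i = 0; i < max_polls; i++) {
        if (done(ctx))
            return 0;
        /* A real driver would delay or sleep between polls here. */
    }
    return ETIMEDOUT;
}

/* Hypothetical start routine: every failure path returns a nonzero errno
 * value, so the caller never continues with a half-started adapter. */
static int start_adapter_sketch(cmd_done_fn done, void *ctx, unsigned max_polls)
{
    if (done == NULL)
        return EINVAL;

    int rc = hca_cmd_poll(done, ctx, max_polls);
    if (rc != 0)
        return rc;  /* propagate the timeout instead of reporting success */

    return 0;
}

/* Stand-in for the hardware (assumption): completes after a fixed
 * number of polls. */
struct fake_hca {
    unsigned polls_needed;
    unsigned polls_seen;
};

static int fake_done(void *ctx)
{
    struct fake_hca *h = ctx;
    return ++h->polls_seen >= h->polls_needed;
}
```

With a fake_hca that needs 5 polls, a budget of 3 makes start_adapter_sketch return ETIMEDOUT, while a budget of 10 makes it return 0. The point of the second function is that the timeout is surfaced to the caller rather than masked.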
Temporary fix
Comments
APAR Information
APAR number
IZ73800
Reported component name
AIX 5.3
Reported component ID
5765G0300
Reported release
530
Status
CLOSED PER
PE
NoPE
HIPER
NoHIPER
Submitted date
2010-03-30
Closed date
2010-03-30
Last modified date
2013-04-17
APAR is sysrouted FROM one or more of the following:
APAR is sysrouted TO one or more of the following:
Fix information
Fixed component name
AIX 5.3
Fixed component ID
5765G0300
Applicable component levels
R530 PSY U830242 UP10/05/18 I 1000
PTF to Fileset Mapping
U830242 devices.pciex.b3154a63.rte 5.3.9.7
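The mapping above gives the fileset level that carries the fix. As a rough sketch of how an installed level could be compared against it, the following helper parses two V.R.M.F strings and compares them field by field. The function names are hypothetical, and the installed level is assumed to have been obtained separately (for example from the output of lslpp -l devices.pciex.b3154a63.rte). That a higher level also contains the fix is an assumption, not something this APAR states.

```c
#include <stdio.h>

/* Parse "V.R.M.F" into four integers.
 * Returns 0 on success, -1 if the string is not exactly four dotted fields. */
static int parse_vrmf(const char *s, int out[4])
{
    char extra;
    /* The trailing %c only matches if unexpected characters follow. */
    if (sscanf(s, "%d.%d.%d.%d%c",
               &out[0], &out[1], &out[2], &out[3], &extra) != 4)
        return -1;
    return 0;
}

/* Returns 1 if installed >= required, 0 if it is lower,
 * and -1 if either string is malformed. */
static int level_at_least(const char *installed, const char *required)
{
    int a[4], b[4];

    if (parse_vrmf(installed, a) != 0 || parse_vrmf(required, b) != 0)
        return -1;

    for (int i = 0; i < 4; i++) {
        if (a[i] != b[i])
            return a[i] > b[i];
    }
    return 1;  /* all four fields are equal */
}
```

For example, level_at_least("5.3.9.6", "5.3.9.7") yields 0, while level_at_least("5.3.10.0", "5.3.9.7") yields 1 because the fields are compared numerically rather than as text.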
Document Information
Modified date:
17 April 2013