IBM Support

NMI and SERR - LSI Logic MegaRAID 8480e SAS Controller

Troubleshooting


Problem

The server is attached to two EXP3000 storage enclosures (one on each channel) and the logical drives span channels. The system reboots approximately every three weeks with an SERR in the system log and a command timeout from the MegaRAID adapter in the log files.

Resolving The Problem

Source

RETAIN tip: H193699

Symptom

The server is attached to two EXP3000 storage enclosures (one on each channel) and the logical drives span channels.

The system reboots approximately every three weeks with an SERR in the system log and a command timeout from the MegaRAID adapter in the log files.

Affected configurations

This tip is not machine specific.

The system is configured with one or more of the following IBM Options:

  • LSI Logic MegaRAID 8480 SAS Controller, Option part number 39R8850

The 1.03.20-0400 firmware for the 8480e MegaRAID is affected.

This tip is not Operating System specific.

Solution

Apply the package version 5.1.1-0075, firmware version 1.03.30-0459.
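As a rough aid for checking a controller against the firmware levels named in this tip, the comparison can be sketched in Python. This is a minimal sketch, not an IBM tool: the helper names parse_fw and needs_update are hypothetical, and the rule that any level older than 1.03.30-0459 should be updated is an assumption (the tip itself names only 1.03.20-0400 as affected).

```python
import re

AFFECTED = "1.03.20-0400"  # firmware level named as affected in this tip
FIXED = "1.03.30-0459"     # firmware level named in the Solution


def parse_fw(version):
    """Split a string such as '1.03.20-0400' into a tuple of integers."""
    return tuple(int(part) for part in re.split(r"[.-]", version.strip()))


def needs_update(installed):
    """Return True if the installed level sorts below the fixed level.

    Assumption: every level older than FIXED is treated as needing the
    update, although the tip only names AFFECTED explicitly.
    """
    return parse_fw(installed) < parse_fw(FIXED)


if __name__ == "__main__":
    for level in (AFFECTED, FIXED):
        print(level, "needs update:", needs_update(level))
```

The installed level has to be read from the adapter by whatever management utility is in use; that step is outside this sketch.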

The file is available from the IBM System x Support web site at the following URL:

Additional information

XSCALE-based MegaRAID controllers show high Deferred Procedure Call (DPC) latency, with occasional peaks on the order of 190 ms.

This issue is not seen on 1078-based controllers.

The issue appears as a blue screen caused by a parity error or SERR reported by the host. Some servers use an ICH7 root complex whose PCI-e completion timeout value is programmed lower than 50 ms. The NMI is caused by a command completion timeout followed by an unexpected completion. There is no way to program the PCI-e completion timeout value in pre-PCI-e 2.0 silicon.

The device driver performs a read of the Messaging Unit (MU). With an XSCALE IOP, a PCIE transaction to the MU has to travel across the PCIX bus to reach the ATU and then the MU. If another device on the PCIX bus has won arbitration, the PCIE transaction has to wait until it is granted the bus.

To alleviate this, the bus arbitration policy of the IOP was changed to favor the MU. The default arbitration policy gives fair and equal priority to all contending agents.
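The effect of the policy change can be illustrated with a toy model in Python. This is only a sketch under stated assumptions: the function grant_order, the policy names "fair" and "mu_biased", and the requester names are hypothetical, and the fair policy is simplified to granting pending requests in arrival order. It does not represent the actual IOP arbiter or firmware.

```python
def grant_order(requesters, policy):
    """Return the order in which pending requesters are granted the bus.

    Toy model (assumption): "fair" grants in arrival order, while
    "mu_biased" grants requests bound for the MU first and leaves the
    remaining requesters in their original order.
    """
    if policy == "fair":
        return list(requesters)
    if policy == "mu_biased":
        # sorted() is stable, so non-MU requesters keep their order.
        return sorted(requesters, key=lambda name: name != "MU")
    raise ValueError("unknown policy: " + policy)


if __name__ == "__main__":
    pending = ["dev_a", "dev_b", "dev_c", "MU"]  # hypothetical requesters
    for policy in ("fair", "mu_biased"):
        order = grant_order(pending, policy)
        print(policy, order, "MU waits for", order.index("MU"), "grants")
```

In this model the MU read waits behind every earlier requester under the fair policy and behind none under the biased policy, which is the reason the change shortens the time the host's read stays outstanding.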

Document Location

Worldwide

Operating System

System x Hardware Options: All operating systems listed

Document Information

Modified date:
02 November 2020

UID

ibm1MIGR-5076976