IBM Support

On POWER9 systems with firmware FW950, removing a PCI adapter from a system that is running might fail

Flashes (Alerts)


Abstract

On POWER9™ systems with firmware FW950, the dynamic LPAR (DLPAR) operation of removing a PCI adapter might fail. PCI adapters that are affected by this issue request a legacy interrupt in their PCI configuration space, even though the legacy interrupt might never be used because MSI-X interrupts are preferred. If your server is running firmware FW950, you might observe this bug.

Content

Linux Releases Affected

Red Hat® Enterprise Linux® 8.1
Red Hat Enterprise Linux 8.2
Red Hat Enterprise Linux 8.3
SUSE Linux Enterprise Server 12, Service Pack 5
SUSE Linux Enterprise Server 15, Service Pack 1
SUSE Linux Enterprise Server 15, Service Pack 2

IBM Systems Affected
All IBM® POWER9 systems.

I/O Devices Affected

To check whether a PCI adapter is affected, run the following command:
$ lspci -v -s <PCI device address>
An output that is similar to the following example is displayed:
...
        Flags: bus master, fast devsel, latency 0, IRQ XX, NUMA node x
...
If an “IRQ XX” string is displayed in the output, a legacy interrupt is enabled on the device and you might observe the issue during the DLPAR operation.
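
The check above can be repeated for every PCI device in the partition. The following is a minimal sketch, not an IBM-provided tool: the function names has_legacy_irq and list_legacy_irq_devices are hypothetical, and the pattern assumes that the IRQ value appears on the "Flags:" line as in the sample output.

```shell
#!/bin/sh
# Succeeds if `lspci -v` output read from stdin shows an "IRQ <value>"
# entry on a "Flags:" line (assumed format, based on the sample output).
has_legacy_irq() {
    grep -Eq 'Flags:.*IRQ [^,[:space:]]+'
}

# Prints the address of every PCI device whose Flags line reports an IRQ.
# `lspci -D` lists devices with their full domain:bus:device.function address.
list_legacy_irq_devices() {
    for addr in $(lspci -D | awk '{print $1}'); do
        if lspci -v -s "$addr" | has_legacy_irq; then
            echo "$addr"
        fi
    done
}
```

After the functions are loaded into a shell, running list_legacy_irq_devices prints one address per line for each adapter that might be affected.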

Description

If this bug is observed, an error message similar to the following example is displayed:
HSCL2929 The dynamic removal of I/O resources failed: The I/O slot dynamic partitioning operation failed.
The IDs of the failed I/O slots and the reasons for failure are displayed as in the following example:
<date> <time> caDlparCommand:execv to drmgr
Validating PHB DLPAR capability...yes.
Isolation failed for XXXXXXXX with -9001
Valid outstanding translations exist.
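
If you have saved output like the preceding example to a file, the failed slot IDs and return codes can be pulled out with a short filter. This is only a sketch: parse_isolation_failures is a hypothetical helper name, and the pattern assumes that the message has exactly the "Isolation failed for <ID> with <code>" form shown above.

```shell
#!/bin/sh
# Reads drmgr output on stdin and prints "<slot ID> <return code>" for
# each "Isolation failed for <ID> with <code>" line (assumed format).
parse_isolation_failures() {
    sed -n 's/^Isolation failed for \([^ ]*\) with \(-\{0,1\}[0-9][0-9]*\)$/\1 \2/p'
}
```

For example, piping the saved output through parse_isolation_failures prints one line per failed slot, and lines that do not match the pattern are ignored.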

Workaround

To work around the issue, use one of the following methods:
  1. Pass the xive=off argument on the Linux kernel command line from the boot loader.
  2. Set the processor mode of the LPAR to "POWER9_Base" instead of "POWER9" by using the graphical user interface (GUI) of the Hardware Management Console (HMC).
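
For the first method, the following sketch shows one way to confirm that the argument is active after a reboot. The function name has_xive_off is hypothetical. The commented commands for making the argument persistent are the usual distribution mechanisms (grubby on Red Hat Enterprise Linux, /etc/default/grub with grub2-mkconfig on SUSE Linux Enterprise Server); they are not taken from this flash, so check them against your distribution's documentation before you use them.

```shell
#!/bin/sh
# Succeeds if the kernel command line passed as $1 contains xive=off
# as a separate, space-delimited argument.
has_xive_off() {
    case " $1 " in
        *" xive=off "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Check the running kernel:
#   has_xive_off "$(cat /proc/cmdline)" && echo "xive=off is active"
#
# Typical ways to add the argument persistently (assumptions, run as root):
#   RHEL: grubby --update-kernel=ALL --args="xive=off"
#   SLES: append xive=off to GRUB_CMDLINE_LINUX_DEFAULT in /etc/default/grub,
#         then run: grub2-mkconfig -o /boot/grub2/grub.cfg
```

A reboot is needed before the new kernel command line takes effect.
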

Fix Outlook

IBM is working with Red Hat and SUSE to release a fix for this issue. The fix is expected as part of a future RHEL or SLES release. If you need a hot fix before the next release, open a support ticket with Red Hat or SUSE.

Document Information

Modified date:
18 January 2021

UID

ibm16381266