Flashes (Alerts)
Abstract
On IBM POWER9 systems that are enabled with Virtual Persistent Memory, Enhanced Error Handling (EEH) errors might be observed.
Content
Linux® Releases Affected
SUSE Linux® Enterprise Server 15, Service Pack 1
IBM Systems Affected
All IBM POWER9 systems
I/O Devices Affected
All dedicated I/O adapters that are assigned to a logical partition with virtual persistent memory enabled might be impacted by this issue. Virtualized I/O, such as virtual Ethernet, VNIC, virtual Fibre Channel, virtual SCSI, and SR-IOV, is not affected by this issue.
Symptoms
When you perform an I/O operation between a network or storage device and virtual persistent memory, EEH errors might be observed in the kernel log. The following example shows an EEH error:
[ 1988.400852] EEH: Frozen PHB#15-PE#800000 detected
[ 1988.400865] EEH: PE location: N/A, PHB location: N/A
[ 1988.400870] EEH: Frozen PHB#15-PE#800000 detected
[ 1988.400874] EEH: Call Trace:
[ 1988.400881] EEH: [c000000000047968] eeh_dev_check_failure.part.2+0x2b8/0x530
[ 1988.400890] EEH: [c00000000004849c] eeh_check_failure+0xfc/0x140
[ 1988.400900] EEH: [d00000000458a6f0] ipr_eh_abort+0x4d8/0x920 [ipr]
[ 1988.400906] EEH: [c000000000948290] scmd_eh_abort_handler+0x100/0x360
[ 1988.400912] EEH: [c000000000183804] process_one_work+0x304/0x5d0
[ 1988.400917] EEH: [c00000000018433c] worker_thread+0xcc/0x7a0
[ 1988.400923] EEH: [c00000000018e3dc] kthread+0x1ac/0x1c0
[ 1988.400929] EEH: [c00000000000b75c] ret_from_kernel_thread+0x5c/0x80
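To check a saved kernel log for this symptom, the "Frozen PHB#...-PE#... detected" lines shown in the example above can be matched with a regular expression. The following is a minimal sketch, not an IBM or SUSE tool: the function name find_frozen_pes is hypothetical, and the pattern assumes that the message format is exactly as it appears in the example, with hexadecimal PHB and PE numbers.

```python
import re

# Matches lines such as "[ 1988.400852] EEH: Frozen PHB#15-PE#800000 detected".
# Assumption: PHB and PE numbers are printed as hexadecimal digits.
FROZEN_PE = re.compile(r"EEH: Frozen PHB#([0-9A-Fa-f]+)-PE#([0-9A-Fa-f]+) detected")


def find_frozen_pes(log_text):
    """Return the distinct (PHB, PE) pairs reported as frozen, in first-seen order."""
    seen = []
    for line in log_text.splitlines():
        match = FROZEN_PE.search(line)
        # The same freeze is often logged more than once, so skip repeats.
        if match and match.groups() not in seen:
            seen.append(match.groups())
    return seen
```

For instance, the text captured from the dmesg command can be passed to find_frozen_pes. An empty result means that no line in that text matched the pattern. It does not prove that the adapter is healthy.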
Workaround
To work around this issue, boot the system with the disable_ddw=1 kernel parameter by completing the following steps:
- Edit the /etc/default/grub file and append disable_ddw=1 to the value of the GRUB_CMDLINE_LINUX_DEFAULT entry.
- Run the grub2-mkconfig -o /boot/grub2/grub.cfg command to update the bootloader configuration.
- Reboot the system so that the kernel starts with the new parameter.
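The edit to /etc/default/grub can be sketched as a small text transformation. This is an illustration under stated assumptions and not a supported tool: the function names add_disable_ddw and ddw_disabled are hypothetical, and the sketch assumes that GRUB_CMDLINE_LINUX_DEFAULT is a single line with a double-quoted value. It operates on strings only, so reading and writing the file, and running grub2-mkconfig, are left to the administrator.

```python
import re

PARAM = "disable_ddw=1"

# Assumption: the entry sits on one line and its value is double-quoted.
ENTRY = re.compile(r'^(GRUB_CMDLINE_LINUX_DEFAULT=")([^"]*)(")', re.MULTILINE)


def add_disable_ddw(grub_text):
    """Return grub_text with disable_ddw=1 appended to GRUB_CMDLINE_LINUX_DEFAULT.

    The text is returned unchanged if no matching entry is found.
    Existing options are kept in order and separated by single spaces.
    """
    def append(match):
        options = match.group(2).split()
        if PARAM not in options:  # do not add the parameter twice
            options.append(PARAM)
        return match.group(1) + " ".join(options) + match.group(3)

    return ENTRY.sub(append, grub_text, count=1)


def ddw_disabled(cmdline):
    """Report whether a kernel command line, such as the contents of
    /proc/cmdline, carries the disable_ddw=1 parameter."""
    return PARAM in cmdline.split()
```

After the system has been booted with the updated configuration, passing the contents of /proc/cmdline to ddw_disabled gives a quick confirmation that the parameter is active.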
Fix Outlook
IBM is working with SUSE to release a fix for this issue. The fix is expected to be delivered as part of a future SLES maintenance release. If you need a hot fix before that maintenance release is available, open a support ticket with SUSE.
See SUSE bug number 1167867.
Product Synonym
SUSE Linux® Enterprise Server 15, Service Pack 1
Document Information
Modified date:
07 December 2021
UID
ibm16194589