IBM Support

VMware PSOD (Purple Screen of Death) with backtrace 'Heartbeat NMI' with Emulex Adapter - IBM BladeCenter HS23

Troubleshooting


Problem

VMware ESXi crashes with a kernel panic stop error on a purple screen reporting a saved backtrace of 'Heartbeat NMI' (where NMI stands for Non-Maskable Interrupt). VMware 'vmkernel.log' or 'vmkernel-log.1' will show messages similar to these:

-WARNING: Heartbeat: 645: PCPU 22 didn't have a heartbeat for 21 seconds; *may* be locked up.

-ALERT: NMI: 579: NMI IPI recvd. We Halt.

-Panic: 909: Saved backtrace: pcpu 22 Heartbeat NMI

Resolving The Problem

Source

RETAIN tip: H21462

Symptom

VMware ESXi crashes with a kernel panic stop error on a purple screen reporting a saved backtrace of 'Heartbeat NMI' (where NMI stands for Non-Maskable Interrupt).

The VMware log file vmkernel.log or vmkernel-log.1 contains messages similar to these:

-WARNING: Heartbeat: 645: PCPU 22 didn't have a heartbeat for 21 seconds; *may* be locked up.

-ALERT: NMI: 579: NMI IPI recvd. We Halt.

-Panic: 909: Saved backtrace: pcpu 22 Heartbeat NMI
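When a log is long, it can help to search for the three message types mechanically. The following is a minimal sketch, not an IBM or VMware tool: the function names are hypothetical, and the patterns are derived only from the sample messages above. It assumes the numeric fields (message IDs, PCPU number, seconds) differ from system to system.

```python
import re

# Patterns taken from the three sample messages in this tip. The numeric
# fields (message IDs, PCPU number, seconds) are assumed to vary between
# systems, so they are matched loosely rather than literally.
SIGNATURES = {
    "heartbeat_warning": re.compile(
        r"Heartbeat: \d+: PCPU (\d+) didn't have a heartbeat for (\d+) seconds"
    ),
    "nmi_alert": re.compile(r"NMI: \d+: NMI IPI recvd"),
    "panic_backtrace": re.compile(r"Saved backtrace: pcpu (\d+) Heartbeat NMI"),
}


def find_heartbeat_nmi(log_text):
    """Return {signature name: [matching lines]} for the given log text."""
    hits = {name: [] for name in SIGNATURES}
    for line in log_text.splitlines():
        for name, pattern in SIGNATURES.items():
            if pattern.search(line):
                hits[name].append(line.strip())
    return hits


def matches_tip(hits):
    """True when all three message types from the tip are present."""
    return all(hits[name] for name in SIGNATURES)
```

To use it, read a copy of vmkernel.log into a string, pass it to `find_heartbeat_nmi`, and check `matches_tip` on the result. A match only indicates that the log resembles the symptom described here; it does not confirm the root cause.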

VMware purple screen with error messages

Affected configurations

The system can be any of the following IBM servers:

The system is configured with one or more of the following IBM options:

This tip is not software specific.

The system has the symptom described above.

Solution

Update the Emulex adapter to firmware 10.2.261.36 and driver 10.2.x.x, both released with the '14a' code package.
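Once the installed firmware level is known (for example, from the adapter's driver information on the host), it can be compared against the level named above. This is a small sketch with hypothetical function names. It assumes that versions are plain dotted integers and that any level numerically at or above 10.2.261.36 is acceptable, which this tip does not state explicitly.

```python
# Firmware level named in this tip.
TARGET_FIRMWARE = (10, 2, 261, 36)


def parse_version(text):
    """Turn a dotted version string such as '10.2.261.36' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def firmware_at_or_above_target(installed, target=TARGET_FIRMWARE):
    """Compare versions field by field; Python tuples compare left to right.

    Assumption: a numerically higher level also carries the fix.
    """
    return parse_version(installed) >= target
```

For instance, `firmware_at_or_above_target("10.2.261.36")` is true, while an older level such as "4.6.281.21" (an illustrative value only) is reported as below the target.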

The firmware and driver files are available by selecting the appropriate Product Group, type of System, Product name, Product machine type, and Operating system on IBM Support's Fix Central web page, at the following URL:

http://www.ibm.com/support/fixcentral/

Additional information

The stop error occurs because the operating system (OS) kernel does not receive a reply to an IOCTL (a device-specific input/output control system call) within the allotted time.

The reply is delayed because the Emulex adapter's processor (BE3 ARM) is tied up managing burst traffic or a broadcast storm and cannot process the IOCTL quickly enough to satisfy the OS.

As a result, the OS panics and halts at the stop error.
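The timing relationship described above can be illustrated with a toy model. This is not VMkernel or Emulex firmware code. It assumes, purely for illustration, that the adapter's processor works through its queued frames before it answers the IOCTL, and every number used with it is hypothetical.

```python
def reply_time(pending_frames, per_frame_cost, ioctl_cost):
    """Seconds until the IOCTL reply, if queued frames are serviced first."""
    return pending_frames * per_frame_cost + ioctl_cost


def watchdog_fires(pending_frames, per_frame_cost, ioctl_cost, deadline):
    """True when the reply would arrive after the caller's deadline."""
    return reply_time(pending_frames, per_frame_cost, ioctl_cost) > deadline
```

With a hypothetical deadline of 10 seconds and 1 ms per frame, a backlog of 100 frames leaves the reply well inside the deadline, whereas a storm of 50,000 queued frames pushes it past the deadline. That is the situation in which the OS gives up waiting and halts.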

Document Location

Worldwide

Operating System

BladeCenter:VMware ESX Server

System x Hardware Options:VMware ESX Server

BladeCenter:VMware ESX Server 3

BladeCenter:VMware ESX Server 4

BladeCenter:VMware vSphere 5.0

BladeCenter:VMware vSphere 5.0 x64

PureFlex System and Flex System:VMware vSphere 5.0

PureFlex System and Flex System:VMware vSphere 5.0 x64

PureFlex System and Flex System:VMware ESX Server 4

PureFlex System and Flex System:VMware ESXi 4

System x Hardware Options:VMware ESX Server 3

System x Hardware Options:VMware ESX Server 4

System x Hardware Options:VMware ESXi 4

System x Hardware Options:VMware vSphere 5.0

System x Hardware Options:VMware vSphere 5.0 x64

PureFlex System and Flex System:VMware vSphere 5.5

Product

BladeCenter HS23

PureFlex System and Flex System->x222 Compute Node->7916

PureFlex System and Flex System->x240 Compute Node->8737

PureFlex System and Flex System->x440 Compute Node->7917

System x Hardware Options->Ethernet->10 Gb->00Y3264

System x Hardware Options->Ethernet->10 Gb->00Y3266

System x Hardware Options->Ethernet->10 Gb->49Y7951

System x Hardware Options->Ethernet->10 Gb->81Y3120

System x Hardware Options->Ethernet->10 Gb->90Y3550

System x Hardware Options->Ethernet->10 Gb->90Y3566

System x Hardware Options->Ethernet->10 Gb->90Y9332

Document Information

Modified date:
30 January 2019

UID

ibm1MIGR-5093255