Identifying a service action by using system event logs

Use the Intelligent Platform Management Interface (IPMI) program to examine system event logs (SELs) to identify a service action.

Procedure

  1. Use the ipmitool command to examine SELs.
    • To list SELs by using an in-band network, use the following command:

      ipmitool sel elist

    • To list SELs remotely over the LAN, use the following command:
      ipmitool -I lanplus -U <username> -P <password> -H <BMC IP address or BMC hostname> sel elist
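Because the later steps scan the same listing several times, it can help to capture the output once. The following is a minimal sketch, not part of the documented procedure: the function name save_sel and its argument order are hypothetical, and it only wraps the two commands shown in step 1.

```shell
# Hypothetical helper: write the SEL listing to a file so that later
# steps scan one consistent snapshot.
#   save_sel <output file>                          (in-band)
#   save_sel <output file> <user> <password> <host> (over the LAN)
save_sel() {
    out_file=$1
    shift
    if [ "$#" -eq 0 ]; then
        # In-band listing, as in the first bullet of step 1.
        ipmitool sel elist > "$out_file"
    else
        # Remote listing over the LAN, as in the second bullet of step 1.
        ipmitool -I lanplus -U "$1" -P "$2" -H "$3" sel elist > "$out_file"
    fi
}

# Example use (not run here):
#   save_sel sel.txt
#   save_sel sel.txt <username> <password> <BMC IP address or BMC hostname>
```

Passing the password as an argument mirrors the command in step 1; on a shared system, consider whether the password should appear in the shell history.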
  2. Scan the SELs for an event with the value OEM record de. Did you find a SEL with the value OEM record de?
    If Then
    Yes: Continue with the next step.
    No: Go to step 4.
  3. The OEM record de specific log information is indicated by the rightmost digits of the SEL with the value OEM record de. Use Table 1 to determine the service action to perform.
    Table 1. OEM record de specific log information and service action

    OEM record de specific log information  Service action
    00xxxxxxxxxx                            Go to Getting fixes and update the system firmware to the most recent level of firmware that is available. If this SEL event continues to be logged, go to Collecting diagnostic data. Then, go to Contacting IBM service and support.
    01xxxxxxxxxx                            Go to EPUB_PRC_FIND_DECONFIGURE_PART isolation procedure.
    04xxxxxxxxxx                            Go to EPUB_PRC_SP_CODE isolation procedure.
    05xxxxxxxxxx                            Go to EPUB_PRC_PHYP_CODE isolation procedure.
    08xxxxxxxxxx                            Go to EPUB_PRC_ALL_PROCS isolation procedure.
    09xxxxxxxxxx                            Go to EPUB_PRC_ALL_MEMCRDS isolation procedure.
    0Axxxxxxxxxx                            Go to Getting fixes and update the system firmware to the most recent level of firmware that is available. If this SEL event continues to be logged, go to Collecting diagnostic data. Then, go to Contacting IBM service and support.
    10xxxxxxxxxx                            Go to EPUB_PRC_LVL_SUPPORT isolation procedure.
    11xxxxxxxxxx                            Go to Getting fixes and update the system firmware to the most recent level of firmware that is available. If this SEL event continues to be logged, go to Collecting diagnostic data. Then, go to Contacting IBM service and support.
    16xxxxxxxxxx                            Go to Getting fixes and update the system firmware to the most recent level of firmware that is available. If this SEL event continues to be logged, go to Collecting diagnostic data. Then, go to Contacting IBM service and support.
    1Cxxxxxxxxxx                            Go to Getting fixes and update the system firmware to the most recent level of firmware that is available. If this SEL event continues to be logged, go to Collecting diagnostic data. Then, go to Contacting IBM service and support.
    22xxxxxxxxxx                            Go to EPUB_PRC_MEMORY_PLUGGING_ERROR isolation procedure.
    2Dxxxxxxxxxx                            Go to EPUB_PRC_FSI_PATH isolation procedure.
    30xxxxxxxxxx                            Go to EPUB_PRC_PROC_AB_BUS isolation procedure.
    31xxxxxxxxxx                            Go to EPUB_PRC_PROC_XYZ_BUS isolation procedure.
    34xxxxxxxxxx                            Go to Getting fixes and update the system firmware to the most recent level of firmware that is available. If this SEL event continues to be logged, go to Collecting diagnostic data. Then, go to Contacting IBM service and support.
    37xxxxxxxxxx                            Go to EPUB_PRC_EIBUS_ERROR isolation procedure.
    3Fxxxxxxxxxx                            Go to EPUB_PRC_POWER_ERROR isolation procedure.
    4Dxxxxxxxxxx                            Go to Getting fixes and update the system firmware to the most recent level of firmware that is available. If this SEL event continues to be logged, go to Collecting diagnostic data. Then, go to Contacting IBM service and support.
    4Fxxxxxxxxxx                            Go to EPUB_PRC_MEMORY_UE isolation procedure.
    55xxxxxxxxxx                            Go to EPUB_PRC_HB_CODE isolation procedure.
    56xxxxxxxxxx                            Go to EPUB_PRC_TOD_CLOCK_ERR isolation procedure.
    5Cxxxxxxxxxx                            Go to EPUB_PRC_COOLING_SYSTEM_ERR isolation procedure.
    5Dxxxxxxxxxx                            Go to Getting fixes and update the system firmware to the most recent level of firmware that is available. If this SEL event continues to be logged, go to Collecting diagnostic data. Then, go to Contacting IBM service and support.
    5Exxxxxxxxxx                            Go to Getting fixes and update the system firmware to the most recent level of firmware that is available. If this SEL event continues to be logged, go to Collecting diagnostic data. Then, go to Contacting IBM service and support.
    This ends the procedure.
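The Table 1 lookup in step 3 can be sketched as a small script. This is an illustration under assumptions, not a supported tool: it assumes that each SEL line is pipe-separated and that the specific log information is the last field, and the function names oem_specific_info and oem_de_action are hypothetical. Only the first two hexadecimal digits select the row; the remaining digits (shown as x in Table 1) are ignored.

```shell
# Print the last pipe-separated field of one SEL line, with blanks removed.
# Assumption: that field holds the specific log information.
oem_specific_info() {
    printf '%s\n' "$1" | awk -F'|' '{ gsub(/[ \t]/, "", $NF); print $NF }'
}

# Map the first two hexadecimal digits to the Table 1 service action.
oem_de_action() {
    code=$(printf '%s' "$1" | cut -c1-2 | tr 'a-f' 'A-F')
    case $code in
        00|0A|11|16|1C|34|4D|5D|5E)
            echo "Update system firmware (Getting fixes); if the event recurs, collect diagnostic data and contact IBM service and support" ;;
        01) echo "EPUB_PRC_FIND_DECONFIGURE_PART isolation procedure" ;;
        04) echo "EPUB_PRC_SP_CODE isolation procedure" ;;
        05) echo "EPUB_PRC_PHYP_CODE isolation procedure" ;;
        08) echo "EPUB_PRC_ALL_PROCS isolation procedure" ;;
        09) echo "EPUB_PRC_ALL_MEMCRDS isolation procedure" ;;
        10) echo "EPUB_PRC_LVL_SUPPORT isolation procedure" ;;
        22) echo "EPUB_PRC_MEMORY_PLUGGING_ERROR isolation procedure" ;;
        2D) echo "EPUB_PRC_FSI_PATH isolation procedure" ;;
        30) echo "EPUB_PRC_PROC_AB_BUS isolation procedure" ;;
        31) echo "EPUB_PRC_PROC_XYZ_BUS isolation procedure" ;;
        37) echo "EPUB_PRC_EIBUS_ERROR isolation procedure" ;;
        3F) echo "EPUB_PRC_POWER_ERROR isolation procedure" ;;
        4F) echo "EPUB_PRC_MEMORY_UE isolation procedure" ;;
        55) echo "EPUB_PRC_HB_CODE isolation procedure" ;;
        56) echo "EPUB_PRC_TOD_CLOCK_ERR isolation procedure" ;;
        5C) echo "EPUB_PRC_COOLING_SYSTEM_ERR isolation procedure" ;;
        *)  echo "No Table 1 entry for this value" ;;
    esac
}

# Example use with a made-up SEL line (not real log output):
#   info=$(oem_specific_info '  1a | OEM record de | 000000 | 040000000000')
#   oem_de_action "$info"
```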
  4. Scan the SELs for an event with the value OEM record df. Did you find a SEL with the value OEM record df?
    If Then
    Yes: Continue with the next step.
    No: Go to step 10.
  5. One or more events might be logged around the same time as the event with the value OEM record df. These events require a service action if they meet the following criteria:
    • A service action keyword is present. For a list of service action keywords, see Identifying service action keywords in system event logs.
    • Asserted is in the description.
    • OEM record is not in the description.
    • The event has a time stamp close to the time stamp of the event with the value OEM record df.
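The first three criteria in step 5 are plain text matches, so they can be pre-filtered before you compare time stamps by eye. The sketch below assumes a saved SEL listing and a keyword file with one service action keyword per line; both files and the function name service_action_candidates are hypothetical, and the keywords themselves come from Identifying service action keywords in system event logs. The time stamp comparison against the OEM record df event is not automated here.

```shell
# Print SEL lines that contain "Asserted", do not contain "OEM record",
# and contain at least one keyword from the keyword file.
# The match on "Asserted" is case-sensitive, so lines that only say
# "Deasserted" are not selected.
service_action_candidates() {
    sel_file=$1
    keyword_file=$2
    grep 'Asserted' "$sel_file" \
        | grep -v 'OEM record' \
        | grep -F -f "$keyword_file"
}

# Example use (file names are placeholders):
#   service_action_candidates sel.txt keywords.txt
```

The lines that this prints are only candidates; an event still needs a time stamp close to that of the OEM record df event to qualify.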
  6. Did you find any SEL events that require a service action as defined in step 5?
    If Then
    Yes: Continue with the next step.
    No: Go to Collecting diagnostic data. Then, go to Contacting IBM service and support.
  7. Did you find only one SEL event that requires a service action as defined in step 5?
    If Then
    Yes: Continue with the next step.
    No: Go to step 9.
  8. Record the SEL record ID for the event you identified in step 5. The SEL record ID is indicated by the leftmost digits of the SEL. Use the ipmitool command to display the SEL details.
    • To display SEL details by using an in-band network, use the following command:

      ipmitool sel get <SEL record ID>

      Note: The SEL record ID must be entered in hexadecimal format. For example: 0x1a.
    • To display SEL details remotely over the LAN, use the following command:

      ipmitool -I lanplus -U <username> -P <password> -H <BMC IP address or BMC hostname> sel get <SEL record ID>

      Note: The SEL record ID must be entered in hexadecimal format. For example: 0x1a.
    The sensor ID field contains sensor information in the format sensor name (sensor ID). Record the sensor name, sensor ID, and event description. Then, use the following information to determine the service action to perform:

    This ends the procedure.
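Turning the leftmost digits of a SEL line into the form that sel get expects can be sketched as follows. The sketch assumes that the leftmost field of the listing is already hexadecimal, as the 0x1a example in the notes suggests, so it only trims the field and adds the 0x prefix; verify this against your own listing. The function name sel_record_id is hypothetical.

```shell
# Print the record ID of one SEL line in the 0x form used by "sel get".
# Assumption: the leftmost pipe-separated field is a hexadecimal record ID.
sel_record_id() {
    id=$(printf '%s\n' "$1" | awk -F'|' '{ gsub(/[ \t]/, "", $1); print $1 }')
    printf '0x%s\n' "$id"
}

# Example use with a made-up SEL line (not real log output):
#   ipmitool sel get "$(sel_record_id '  1a | 01/01/2020 | 10:00:00 | ...')"
```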

  9. You identified more than one event in step 5. The service actions for all of the events that were identified in step 5 must be performed to successfully complete the repair. Record the SEL record IDs for the events that you identified in step 5. The SEL record ID is indicated by the leftmost digits of the SEL. Use the ipmitool command to display SEL details for each SEL record ID that you recorded.
    • To display SEL details by using an in-band network, use the following command:

      ipmitool sel get <SEL record ID>

      Note: The SEL record ID must be entered in hexadecimal format. For example: 0x1a.
    • To display SEL details remotely over the LAN, use the following command:

      ipmitool -I lanplus -U <username> -P <password> -H <BMC IP address or BMC hostname> sel get <SEL record ID>

      Note: The SEL record ID must be entered in hexadecimal format. For example: 0x1a.
    The sensor ID field contains sensor information in the format sensor name (sensor ID). Record the sensor name, sensor ID, and event description. Then, use this information to determine the service action to perform:

    This ends the procedure.
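When several events qualify, the sel get command in step 9 has to be repeated for each recorded ID. A minimal sketch of that loop for the in-band case follows; the function name show_sel_details is hypothetical, and for remote use the same -I lanplus, -U, -P, and -H options from the command above would be added to the ipmitool call.

```shell
# Display SEL details for every record ID given as an argument.
# Each ID must already be in hexadecimal format, for example 0x1a.
show_sel_details() {
    for id in "$@"; do
        echo "=== SEL record $id ==="
        ipmitool sel get "$id"
    done
}

# Example use (IDs are placeholders):
#   show_sel_details 0x1a 0x2b
```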

  10. Scan the SEL for an event with the value OEM record c0.
  11. Did you find an event with the value OEM record c0?
    If Then
    Yes: Continue with the next step.
    No: Go to step 13.
  12. The OEM record c0 specific log information is indicated by the rightmost digits of the SEL with the value OEM record c0. Use Table 2 to determine the service action to perform.
    Table 2. OEM record c0 specific log information, description, and service action

    OEM record c0 specific log information: 2aff6ffxxxxx
      Description: A session audit event occurred
      Service action: No service action is required.

    OEM record c0 specific log information: cdxx6fffffff
      Description: An automatic shutdown event occurred due to high system temperature
      Service action:
      • Search for SEL events that are related to high system temperature and resolve them.
      • Ensure that the room temperature meets the requirements that are specified for the system.
      • Ensure that there are no air flow obstructions at the front or at the rear of the system.

    OEM record c0 specific log information: ceff6fffffff
      Description: A machine check event occurred
      Service action: Search for serviceable SEL events and resolve them.

    OEM record c0 specific log information: cfff6fffffff
      Description: An unexpected problem occurred with the voltage regulator output
      Service action: If a machine check event is present with a time stamp close to the time stamp of this event, search for serviceable SEL events and resolve them. If a machine check event is not present with a time stamp close to the time stamp of this event, reboot the system to recover from the system hang. If the problem persists, replace the system backplane.
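Unlike Table 1, the Table 2 values have their variable digits (shown as x) in different positions, so a pattern match fits better than a prefix lookup. The sketch below treats each x as any single character and prints only the description; the function name oem_c0_description is hypothetical, and the service action must still be taken from Table 2.

```shell
# Print the Table 2 description that matches a 12-digit specific log
# information value. Each "?" stands for one x position in Table 2.
oem_c0_description() {
    info=$(printf '%s' "$1" | tr 'A-F' 'a-f')
    case $info in
        2aff6ff?????) echo "A session audit event occurred" ;;
        cd??6fffffff) echo "An automatic shutdown event occurred due to high system temperature" ;;
        ceff6fffffff) echo "A machine check event occurred" ;;
        cfff6fffffff) echo "An unexpected problem occurred with the voltage regulator output" ;;
        *)            echo "No Table 2 entry for this value" ;;
    esac
}

# Example use (the value is a placeholder):
#   oem_c0_description cd016fffffff
```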
  13. One or more SEL events might require a service action. These events require a service action if they meet the following criteria:
  14. Did you find one or more SEL events that require a service action as defined in step 13?
    If Then
    Yes: Continue with the next step.
    No: This ends the procedure.
  15. The service actions for all of the events that were identified in step 13 must be performed to successfully complete the repair. Record the SEL record IDs for the events that you identified in step 13. The SEL record ID is indicated by the leftmost digits of the SEL. Use the ipmitool command to display SEL details for each SEL record ID that you recorded.
    • To display SEL details by using an in-band network, use the following command:

      ipmitool sel get <SEL record ID>

      Note: The SEL record ID must be entered in hexadecimal format. For example: 0x1a.
    • To display SEL details remotely over the LAN, use the following command:

      ipmitool -I lanplus -U <username> -P <password> -H <BMC IP address or BMC hostname> sel get <SEL record ID>

      Note: The SEL record ID must be entered in hexadecimal format. For example: 0x1a.
    The sensor ID field contains sensor information in the format sensor name (sensor ID). Record the sensor name, sensor ID, and event description. Then, use this information to determine the service action to perform:

    This ends the procedure.