IBM Support

Time Drift on POWER8 and POWER9 Servers

White Papers


Abstract

Time of day drift can occur on POWER8 and POWER9 servers.

Content

Introduction
Some clients have noticed that the time on a Power server drifts by seconds per day when compared with other systems, a wall clock, or an NTP reference. They might observe the drift over a period of days, weeks, or months. Clients might then ask questions such as:

  • “Why is the server behaving differently?”
  • “Is there something wrong with my server?”
  • “Why must I use NTP when I never had to before?”
  • “Why has IBM not told me that they changed the TOD accuracy of the server?”

This drift does not indicate that hardware needs replacement. The immediate suggestion is that clients use Network Time Protocol (NTP) or Simple Network Time Protocol (SNTP), because the Power system Time of Day (TOD) can be expected to drift by seconds per day when NTP synchronization is not used.

Background
While a “Time of Day” (TOD) function has been a part of the Power server for many generations, it has never been part of the server base design requirements that the server be able to keep time accurately on its own. The server needs to be a reasonably good timekeeper, but it is not designed to be an accurate one. Accurate timekeeping relies on external synchronization sources such as Network Time Protocol (NTP).

That said, clients have observed in the past that the server was, in fact, a very good timekeeper. Some clients have even come to depend on the server keeping good time on its own.

Starting in POWER8, the timekeeping accuracy observed by clients under certain conditions is degraded. The TOD is derived from the same oscillators used to operate the system itself. Oscillators were selected by the system designers to support increasingly high-speed communication buses between components in the servers. At the same time, the lack of any requirement that these same oscillators be good timekeepers allowed the selection of components with characteristics that favored high-speed interfaces over timekeeping. The result is that customers, starting with POWER8 and continuing with POWER9, noticed that the server, on its own, does not keep time well.

The specified maximum drift of the clock that the processor uses for its reference has not changed since the POWER6 servers. When we examine the POWER6 and POWER7 servers, the components used seem to drift less than the specification allows. These servers used a single-output oscillator with an integrated crystal as the processor reference clock. The integration allowed the oscillator supplier to individually tune out the drift in each part. To satisfy requirements for higher speeds and more clock frequencies, POWER8 and POWER9 servers moved to a multiple-output clock generator with an outboard crystal. The outboard crystal removes the ability to tune out the drift. The change in implementation was driven by the performance requirements of the computer but resulted in a degradation in observed timekeeping, even while still meeting the specified maximum drift.

Solutions
The only sure method to eliminate TOD drift is to deploy NTP, which has been the IBM-recommended method to synchronize partition and system time and date for several generations of Power servers. Configuring and deploying NTP is outside the scope of this document but is described in the OS documentation in IBM Knowledge Center. The NTP References section has pointers to some of the documentation.

Reasonable Expectations and Best Practices
The current hardware design keeps TOD drift under +/- 3 seconds per day while the server is powered on and operational. That is, the server might gain or lose up to 3 seconds per day on average, independent of workload conditions. The number does depend on other factors, such as machine temperature and aging effects, so over time the server could drift more. It is possible to observe differences in drift between different machines of the same machine type and model, even with similar dates of manufacture.
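To put the figure in perspective, the following minimal Python sketch converts a drift rate in seconds per day to parts per million and projects the accumulated offset of an unsynchronized server. The function names are hypothetical, and the projection simply assumes the server drifts at a constant 3 seconds per day in one direction; temperature and aging can change the real rate.

```python
SECONDS_PER_DAY = 86_400


def seconds_per_day_to_ppm(rate_s_per_day: float) -> float:
    """Express a drift rate given in seconds per day as parts per million."""
    return rate_s_per_day / SECONDS_PER_DAY * 1_000_000


def accumulated_drift_seconds(days: float, rate_s_per_day: float = 3.0) -> float:
    """Offset built up after `days` without synchronization.

    Assumes a constant drift rate in one direction, which is a
    simplification made for this illustration.
    """
    return days * rate_s_per_day


if __name__ == "__main__":
    print(f"3 s/day is about {seconds_per_day_to_ppm(3.0):.1f} ppm")
    for days in (1, 7, 30, 365):
        drift = accumulated_drift_seconds(days)
        print(f"after {days:>3} days: up to {drift:.0f} s")
```

Under these assumptions, a server that is never corrected could be about a minute and a half off after a month, which is why manual adjustment is only practical when the accuracy requirement is loose.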

If such drift is not a problem for the customer and they are able to adjust the time manually, then no further action is required.

If the frequency of manual adjustment is a concern, or if more accuracy is required than described, the client should deploy NTP to ensure that the server automatically adjusts its time to maintain synchronization with official time.

Some clients might be required to comply with regulatory controls. Two examples are the FINRA and MiFID II requirements. These require highly accurate time synchronization, achieved with protocols such as NTP or PTP.

Still Think the Server Is Broken?
You might have a server that is drifting beyond the expectations described. If the server keeps time much worse than outlined, open a hardware support ticket.
Collect and upload the following data to the ticket:

  • Full iqyylog (HMC attached)
  • An FSP dump (from primary FSP on redundant FSP servers)
  • An accurate description of the drift observed, how it was measured, and what the server was doing during the period in which the measurements were taken.
    • The standard against which the server time was compared, and how (for example, phone or clock).
    • Whether the server was running the entire time or was powered off and, if it was powered off, whether AC power was removed.
    • If the server was powered off, measure the time at that point. Do the same for AC removal. Measure the time again once AC is restored or the server is powered on.
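When preparing the drift description for the ticket, it can help to reduce two observations to a single rate. The following Python sketch is one way to do that, not an IBM tool: the function names and the sample values are hypothetical, each offset is the server time minus the reference time in seconds, and the server is assumed to have been running for the whole interval.

```python
from datetime import datetime

SECONDS_PER_DAY = 86_400


def drift_rate_s_per_day(t_start: datetime, offset_start_s: float,
                         t_end: datetime, offset_end_s: float) -> float:
    """Average drift in seconds per day between two observations.

    Each offset is (server time - reference time) in seconds, taken
    against an NTP-synchronized reference at the given moment.
    """
    elapsed_days = (t_end - t_start).total_seconds() / SECONDS_PER_DAY
    if elapsed_days <= 0:
        raise ValueError("t_end must be later than t_start")
    return (offset_end_s - offset_start_s) / elapsed_days


def within_expectation(rate_s_per_day: float, limit_s_per_day: float = 3.0) -> bool:
    """True when the magnitude of the drift is at or below the limit."""
    return abs(rate_s_per_day) <= limit_s_per_day


if __name__ == "__main__":
    # Hypothetical readings taken two weeks apart.
    rate = drift_rate_s_per_day(datetime(2021, 1, 1, 9, 0), 0.5,
                                datetime(2021, 1, 15, 9, 0), 29.9)
    print(f"average drift: {rate:.2f} s/day")
    print("within +/- 3 s/day:", within_expectation(rate))
```

A longer interval between the two readings makes the result less sensitive to the error in each individual reading.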

NOTE: The observation starts with a comparison between the observed system and a reference system that is frequently updated by NTP. (If it is not frequently updated, it is not a reference system but another drifting system.) The same comparison must be made a substantial period later. The reference system does not have to be the same one, as long as it is also frequently updated by NTP. (A laptop with a 110 ppm** clock that is synchronized to NTP every 24 hours can be up to nearly 10 seconds off.) “Frequently” and “substantial” are relative to the required accuracy of the comparison.
** ppm = parts per million, an engineering notation used to describe small proportions of a substance or other quantity. In this context, it represents the closeness to the ideal time: a smaller ppm value is closer to the ideal time. In the example, 110 ppm represents a drift of +/- 110 seconds for every 1 million seconds that elapse.
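The laptop example can be checked with a short calculation. The Python sketch below uses hypothetical function names and assumes the clock drifts at its full ppm rating for the whole interval between synchronizations; it computes the worst-case offset and, in the other direction, how often a reference would need to be synchronized to stay within a chosen tolerance.

```python
def max_offset_seconds(ppm: float, interval_seconds: float) -> float:
    """Worst-case offset accumulated between two synchronizations."""
    return ppm * 1e-6 * interval_seconds


def max_sync_interval_seconds(ppm: float, tolerance_seconds: float) -> float:
    """Longest interval between synchronizations that keeps the
    worst-case offset within `tolerance_seconds`."""
    if ppm <= 0:
        raise ValueError("ppm must be positive")
    return tolerance_seconds / (ppm * 1e-6)


if __name__ == "__main__":
    offset = max_offset_seconds(110, 24 * 3600)
    print(f"110 ppm over 24 hours: up to {offset:.1f} s")
    interval = max_sync_interval_seconds(110, 1.0)
    print(f"to stay within 1 s: synchronize every {interval / 3600:.1f} hours or less")
```

The first figure matches the "nearly 10 seconds" in the note, and the second illustrates why "frequently" depends on the accuracy needed from the reference.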

NTP References
How to implement with IBM i
Here are some useful links for time management of IBM i and Power servers. Use NTP in IBM i to synchronize the IBM i LPAR TOD. Designate a partition as the Time Reference Partition (TRP) to synchronize the FSP and Hypervisor TOD with the TRP.

  • https://www.ibm.com/support/knowledgecenter/ssw_ibm_i_73/rzati/rzatikickoff.htm -- Time Management topic in IBM Knowledge Center for IBM i 7.3
  • https://www.ibm.com/support/knowledgecenter/ssw_ibm_i_73/rzakt/rzaktpdf.pdf?view=kc -- PDF for IBM i 7.3 NTP
  • https://www.ibm.com/support/knowledgecenter/TI0003N/p8hat/p8hat_hmcprepclocks.htm -- Synchronizing FSP and Hypervisor TOD with a Time Reference Partition

How to implement with AIX
Here are some useful links for time management of AIX and Power servers. Use NTP in AIX to synchronize the AIX LPAR TOD. Designate a partition as the Time Reference Partition (TRP) to synchronize the FSP and Hypervisor TOD with the TRP.

  • ntp.conf: https://www.ibm.com/support/knowledgecenter/ssw_aix_72/com.ibm.aix.files/aixfiles191.htm
  • xntpd: https://www.ibm.com/support/knowledgecenter/ssw_aix_72/com.ibm.aix.cmds6/xntpd.htm
  • Technote at https://www-01.ibm.com/support/docview.wss?uid=isg3T1000653 -- Configure a basic NTP setup between an NTP client and server. Although it is written for AIX 5.x, it is still applicable.


Document Information

Modified date:
07 December 2021

UID

ibm10967665