IBM Support

POWER CPU Memory Affinity 4 - Aggressive Intelligent Threads

How To


Summary

Power Systems gain their massive performance from a lot of technology, and this series details many of those technologies.

Objective


Originally written in 2012 for the DeveloperWorks AIXpert Blog for POWER7, then updated in 2019 for POWER8 and POWER9.

Steps

This article is a follow-on to a blog post by Chris Gibson that highlighted a question and concern from one of his customers in Australia. They were comparing POWER6 and POWER7 based computers and found that the utilisation numbers and graphs for the SMT logical processors looked different. I looked at some nmon data (what else!) and it did look different. I then ran a simple generated workload test, reproduced the graphs, and explain them here. Note, these are my personal observations rather than an official AIX developers insider statement.
I ran a workload:
  ncpu -p8 -z 25 -h1 -s 900
  • This runs 8 processes, each sleeping 25% of the time and pausing for 1 second after each 1 second of CPU time, and then stops after 900 seconds.
  • This gives us a bunch of programs running and starting and stopping fairly randomly. The 900 second limit also provides a safety net, so this artificial workload will stop even if I forget to kill the program!
I then collected nmon data with:
  nmon -f -s5 -c60
  • This command means: collect a snapshot every 5 seconds for 60 snapshots (5 minutes worth).
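As a quick check of the numbers in the two commands above, here is a minimal Python sketch of the arithmetic. The function name capture_window is made up for this illustration; the values come straight from the nmon and ncpu command lines.

```python
def capture_window(interval_s, snapshots):
    """Seconds covered by 'nmon -s <interval_s> -c <snapshots>'."""
    return interval_s * snapshots


workload_s = 900                   # from: ncpu ... -s 900
window_s = capture_window(5, 60)   # from: nmon -f -s5 -c60

print(f"nmon capture window: {window_s} seconds ({window_s // 60} minutes)")
print(f"workload keeps running for {workload_s - window_s} seconds after the capture ends")
```

So the nmon capture only needs to be started while the 900 second workload is running for every snapshot to contain workload activity.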
On POWER6 Power 570
  • Virtual machine called blue.
  • Running AIX 6.1 TL6 on latest firmware.
  • With Entitlement=0.4, uncapped, SMT=2 and virtual processors = 4.
The CPU_SUMM graph looks like this:
Figure: CPU_SUMM graph ("blue on POWER6")
On POWER7 Power 770
  • Virtual machine called diamond5.
  • Running AIX 7.1 TL1 on latest firmware.
  • With Entitlement=0.4, uncapped, SMT=4 and virtual processors = 4.
The CPU_SUMM graph looks like this:
Figure: CPU_SUMM graph ("purple POWER7")
Comments:
  • The POWER7 virtual machine has twice as many logical CPUs, as expected, because it is running SMT=4 instead of SMT=2.
  • The POWER6 graphs show a fairly even split of work between logical CPU CPU001 and CPU002 (these two combined make up the first POWER6 physical CPU-core) - this is because it is in SMT=2 mode and there is no real favourite between the SMT threads. One thread is as good as the other.
  • The POWER7 graph shows that for logical CPUs CPU001, CPU002, CPU003 and CPU004 (these four combined make up the first POWER7 physical CPU-core), the first logical CPU is very much favoured over the second, and the third and fourth logical CPUs are hardly used at all.
  • The POWER7 behaviour is Intelligent SMT Threading mode switching in action. It knows there are not enough processes running (low run queue) to use SMT=4, so it has switched to SMT=2 and moved the processes to the first two logical CPUs. Then it notices that, for a fair chunk of the time, there are not even enough processes running to need SMT=2, so it switches to SMT=1 and moves the processes to the first logical CPU. This means the single running process gets the internal resources of the whole physical CPU-core with no contention from other threads and so gets a speed boost.
  • Both POWER6 and POWER7 were using roughly 2.5 physical CPUs but it is clear with POWER7 that we could remove a physical CPU-core or even two physical CPU-cores as you can clearly see there are plenty of unused SMT logical CPUs to run work on.
  • Once more for the record for shared CPUs: It is impossible to average the logical CPU utilisation stats to work out how busy your physical CPU-cores are, because the logical CPUs are concurrently executing on the shared internal compute units of the physical CPU-core. You can't find the 2.5 in the graphs above.
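The logical CPU numbering and the mode switching described in the comments above can be illustrated with a small Python sketch. This is a toy model and not the AIX dispatcher: the function names logical_cpus_of_core and effective_smt_mode are made up for this illustration, the contiguous numbering of logical CPUs for cores after the first is an assumption, and the rule "use the smallest SMT mode that holds the runnable threads" is only a simplification of what the graphs show.

```python
def logical_cpus_of_core(core_index, smt):
    """nmon-style names of the logical CPUs on one core (core_index is 0-based).

    Assumption: logical CPUs are numbered contiguously, core after core.
    """
    first = core_index * smt + 1
    return [f"CPU{n:03d}" for n in range(first, first + smt)]


def effective_smt_mode(runnable_threads, max_smt):
    """Toy rule: smallest of SMT=1, 2, 4, ... that fits the runnable threads."""
    mode = 1
    while mode < max_smt and mode < runnable_threads:
        mode *= 2
    return mode


# First physical CPU-core of each test virtual machine.
print("POWER6 SMT=2:", logical_cpus_of_core(0, 2))
print("POWER7 SMT=4:", logical_cpus_of_core(0, 4))

# How the toy rule picks a mode on a POWER7 core (maximum SMT=4).
for runnable in (1, 2, 3, 4):
    print(f"{runnable} runnable thread(s) -> SMT={effective_smt_mode(runnable, 4)}")
```

With one runnable thread the toy rule picks SMT=1, which matches the heavy use of the first logical CPU in the POWER7 graph.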
The customer question was: Is there something wrong with POWER7?

The answer is: Nothing is wrong. Actually, there is something very right with POWER7!

Note: POWER8 and POWER9 based computers follow the POWER7 mode of operation, except that they have a newer, higher SMT=8 mode.
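To see what the higher SMT mode means for the number of logical CPUs you would find in the nmon graphs, here is a minimal Python sketch. It assumes a virtual machine with 4 virtual processors, like the two test virtual machines in this article, and the function name logical_cpu_count is made up for this illustration.

```python
VIRTUAL_PROCESSORS = 4  # assumption: same as the two test virtual machines


def logical_cpu_count(virtual_processors, smt):
    """Logical CPUs seen by the operating system: one per SMT thread per virtual processor."""
    if smt not in (1, 2, 4, 8):
        raise ValueError("SMT mode must be 1, 2, 4 or 8")
    return virtual_processors * smt


for smt in (1, 2, 4, 8):
    count = logical_cpu_count(VIRTUAL_PROCESSORS, smt)
    print(f"SMT={smt}: {count} logical CPUs")
```

At SMT=8 the same 4 virtual processors show up as 32 logical CPUs, so expect even more lightly used logical CPUs in the graphs when the run queue is low.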


 

Additional Information


Other places to find Nigel Griffiths IBM (retired)

Document Location

Worldwide


Document Information

Modified date:
13 June 2023

UID

ibm11126389