IBM Support

POWER CPU Memory Affinity 5 - Low Entitlement has a Bad Side Effect

How To


Summary

Power Systems gain their massive performance from lots of technology; this series details many of them.

Objective


Originally written in 2012 for the DeveloperWorks AIXpert Blog for POWER7 but updated in 2019 for POWER8 and POWER9.

Steps

With a shared processor virtual machine (I am calling this a "VM" but it used to be called an LPAR!) there are various suggestions for setting the Entitlement ("Desired processing units" in the LPAR profile on the HMC; I am calling this "E") and the Virtual Processor number (I am calling this "VP").

For Capped, the Entitlement is the maximum guaranteed CPU time that you can't go over, and you set the VP to the Entitlement rounded up to the next whole number (there is no point in having a higher VP, as AIX will fold the extra virtual processors away; high numbers of idle virtual CPUs are inefficient).

For Uncapped, I have seen many possibilities, like the following options. For illustration purposes, let us take a VM which averages 16 physical CPUs in busy periods, regularly peaks for a few minutes at 18 physical CPUs, and sits in a shared processor pool of, say, 48 physical CPUs. Four scenarios:
  1. Maximum flexibility - "Virtual Processors are free, right!"
    1. E as low as possible (as a minimum you need a tenth of a CPU per VP) then VP as high as possible 
    2. Example: VP = 48 and E=48/10 = 4.8
  2. It is production, so it must get what it needs
    1. E around the average CPU use of the VM (as monitored over a week or month), then VP = E * 2.
    2. Example: E=16 and VP=32
  3. It is vital production so it needs to peak and perform
    1. E to cover the regular peaks (as monitored) and then VP a good handful of extra CPUs.
    2. Example: E=18 and VP=36
  4. It is vital but limit the spread across the machine, so VP = Entitlement rounded up plus 1 or 2
    1. E to cover the peaks and the minimum extra VP
    2. Example: E=18 and VP=20
 All of these involve some compromise, but there are two hidden side effects that you should know about. The first is covered below; the Virtual Processor number side effect is covered in the next AIXpert blog (part 6).
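The four scenarios can be sketched as a quick calculation (a hypothetical Python sketch; the pool size, average and peak figures come from the example above, and the one-tenth-of-a-CPU-per-VP minimum is the Hypervisor rule mentioned in scenario 1):

```python
import math

POOL_CPUS = 48   # shared processor pool size (from the example)
AVG_CPUS = 16    # average physical CPU use in busy periods
PEAK_CPUS = 18   # regular peak physical CPU use
VP_PER_E = 10    # at most 10 VPs per 1.0 of Entitlement (0.1 CPU per VP minimum)

def scenario1():
    """Maximum flexibility: VP as high as possible, E at the minimum."""
    vp = POOL_CPUS
    return vp / VP_PER_E, vp          # E = 4.8, VP = 48

def scenario2():
    """Production: E around the average CPU use, VP = E * 2."""
    e = AVG_CPUS
    return e, e * 2                   # E = 16, VP = 32

def scenario3():
    """Vital production: E covers the regular peaks, VP adds headroom."""
    e = PEAK_CPUS
    return e, e * 2                   # E = 18, VP = 36 (the text's figure)

def scenario4():
    """Vital but limited spread: VP = E rounded up plus 2."""
    e = PEAK_CPUS
    return e, math.ceil(e) + 2        # E = 18, VP = 20
```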
Too Low an Entitlement
 If you go for suggestion 1 (above), the VM is guaranteed the E CPU time but then needs to compete in the pool for further CPU time. If the weight factor is the same for all VMs, then it will get its fair share. You may have decided that it is production (or more important than others in some way) and made the weight factor larger - a good move; this just makes its share of CPU time larger, but there is a scheduling side effect that can slow your VM down.

 Let me go through the diagrams below to show you the effect:

Below we have the 10 millisecond dispatch cycle which the Hypervisor uses to run virtual machines (VM) on the physical processor - we have just one processor here but a number of VMs to run on it. The Entitlement is guaranteed to each VM within these 10 millisecond windows.
Low E 1
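A VM's guaranteed slice of each window is simply proportional to its Entitlement (a minimal sketch; the 10 millisecond window comes from the text, while the example Entitlement values are made up):

```python
DISPATCH_WINDOW_MS = 10.0  # the Hypervisor's dispatch cycle

def guaranteed_ms(entitlement: float) -> float:
    """CPU time (in ms) a VM is guaranteed in each dispatch window.

    An Entitlement of 1.0 equals one whole physical processor, so the
    guaranteed share of each 10 ms window is E * 10 ms (spread across
    the VM's virtual processors by the Hypervisor).
    """
    return entitlement * DISPATCH_WINDOW_MS

# e.g. E = 0.5 guarantees 5 ms of CPU time in every 10 ms window
```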
Below, the other VMs run and either use up their Entitlement and get forced off the processor, or run out of work and voluntarily yield the processor for other work.
Low E 2
Below - we have 4 milliseconds "spare" to allocate.
Low E 3
Below - we have some virtual machines that don't want any CPU time (they are waiting for an external event = interrupt) but some are still runnable and would like more CPU time. The weight factor is used to allocate more time, and we let them run again.
Low E 4
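How the spare window time gets shared can be sketched as a proportional split (a hypothetical illustration; PowerVM's actual scheduling is more involved, but uncapped weights do compete proportionally among the runnable VMs):

```python
def share_spare_ms(spare_ms, runnable_weights):
    """Split spare dispatch-window time among runnable uncapped VMs.

    runnable_weights: dict of VM name -> uncapped weight.  VMs waiting
    on an external event (interrupt) want no CPU time, so they are
    simply absent from the dict and get nothing.
    """
    total = sum(runnable_weights.values())
    return {vm: spare_ms * w / total for vm, w in runnable_weights.items()}

# 4 ms spare (as in the diagram); our VM was given a higher weight
shares = share_spare_ms(4.0, {"ours": 192, "other1": 128, "other2": 128})
# "ours" gets the biggest share; the three shares add up to the 4 ms
```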
Below - but again some stop, and in this example only our VM is still runnable and wants more CPU cycles, so it is scheduled again and runs to the end of the 10 millisecond dispatch cycle.
Low E 5
Below - we have used all the CPU cycles available between the running VMs: every VM has had its Entitlement, and the uncapped VMs got extra CPU cycles based on the weight factor, so everyone is happy - now we start over in the next dispatch cycle.
Low E 6
Below - You may have noticed that our VM was scheduled 3 times in 10 milliseconds.
Low E 7
Below is the bit you may not know. The virtual machines are all running on the same physical CPU, which has a single Level 1 and Level 2 cache for the program code and data of the programs running in the VMs. They are highly secure, so one VM can't access the cache lines of another VM (these are the normal virtual memory protections), but as one VM runs it brings in its memory and knocks out cache lines of the other VMs. So when our VM restarts, it may have to reload cache lines that went "missing" while it waited for CPU time.

The below is a grossly exaggerated worst case scenario to make the point, but don't forget that, to keep the pictures simple, our VM got on the CPU only 3 times - in practice the number of times on and off the CPU could be much higher.
Low E 8
Below we highlight the cache warm-up periods:
Low E 9
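The cost of those warm-up periods can be put into rough numbers (the warm-up time per dispatch here is purely illustrative, not a measured figure):

```python
def warmup_fraction(times_on_cpu, warmup_ms_each, window_ms=10.0):
    """Fraction of a dispatch window lost to cache warm-up.

    Each time the VM gets back on the processor, it spends a while
    reloading L1/L2 cache lines that other VMs evicted while it was
    waiting.  The per-dispatch warm-up cost is an assumed figure.
    """
    return times_on_cpu * warmup_ms_each / window_ms

# on and off the CPU 3 times (as in the diagrams) at an assumed
# 0.5 ms warm-up each: 15% of the window gone before useful work
print(warmup_fraction(3, 0.5))
```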
Below is the alternative: we give the virtual machine a higher and more appropriate Entitlement - then it does not have to give up the CPU until it has used its entire allocated CPU time.
Low E 10
Conclusions:
  • Don't forget 10 milliseconds is a long time on a CPU - at 4 GHz (assuming one instruction per cycle) we get 4 billion instructions per second, so in 10 milliseconds we get 40,000,000 instructions completed.
  • We have simplified the illustration a lot but I hope the principle is clear - getting the Entitlement "about right" rather than the smallest possible makes for a more efficient use of the CPU time (reduced cache warming time).
  • This is also why the Hypervisor designers insist that the virtual machines have a minimum of a tenth of a CPU per Virtual Processor - this allows for a quick warm-up time followed by lots of hot cache time to get some work done.
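The instructions-per-window arithmetic in the first bullet can be checked directly (assuming, as the text does, one instruction per cycle at 4 GHz):

```python
CLOCK_HZ = 4_000_000_000       # 4 GHz
DISPATCH_WINDOW_S = 10 / 1000  # 10 milliseconds

# one instruction per cycle -> instructions completed per dispatch window
instructions_per_window = round(CLOCK_HZ * DISPATCH_WINDOW_S)
print(f"{instructions_per_window:,}")  # 40,000,000
```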


In the next AIXpert Blog, we look at the issues of Too Many Virtual Processors.
- - - The End - - -

Additional Information


Other places to find Nigel Griffiths IBM (retired)

Document Location

Worldwide


Document Information

Modified date:
13 June 2023

UID

ibm11126407