It seems that not many nmon users know that nmon for AIX detects many AIX events and either shows them on-screen or saves them to the nmon output file.
- That is the problem with being the nmon designer and developer for so many years - it is called expert blindness.
- It is totally obvious to me until a respected nmon user asks a question to which my initial reaction is "how can you not know the answer to that?"
It can detect and record these, among a number of other events:
- Live Partition Mobility (LPM), where you move a running Logical Partition (LPAR), also called a Virtual Machine (VM), between two POWER servers.
- Dynamic LPAR (DLPAR) CPU changes (add or remove) like the Entitlement (E) or Virtual CPU count (VP).
- Dynamic LPAR (DLPAR) Memory changes (add or remove).
Probably more details than you need to know:
How does a program do this?
These events are known as AIX Dynamic Resource changes (DR for short but not to be confused with Disaster Recovery).
nmon has to set up a signal handler function and then make a system call to let the kernel know it wants to get a software signal when the events happen. It then gets an asynchronous interrupt a millisecond or two after the event occurs.
nmon makes a note of the event details, returns from the signal handler immediately, and handles the data at the next screen update or write to the output file.
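The "note it in the handler, process it later" pattern described above can be sketched in a few lines of Python. This is only an illustration of the idea: nmon itself is written in C, the AIX-specific registration call is not shown, and SIGUSR1 is an assumed stand-in for the real AIX DR signal. The function names are hypothetical.

```python
import signal

# ASSUMPTION: SIGUSR1 stands in for the AIX Dynamic Resource signal.
DR_SIGNAL = signal.SIGUSR1

# Notes left by the handler for the main loop to pick up later.
pending_events = []


def on_dr_signal(signum, frame):
    """Do the minimum inside the handler: note the event and return."""
    pending_events.append(signum)


def install_handler():
    """Register the handler so this process is told when the event happens."""
    signal.signal(DR_SIGNAL, on_dr_signal)


def drain_events():
    """Called at the next screen update or file write to handle the notes."""
    events = list(pending_events)
    pending_events.clear()
    return events
```

Keeping the handler this small means the main loop is never interrupted for long, and all formatting and output happens at a predictable point.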
These are recorded in the nmon for AIX output file as the BBBR rows shown below. Changes go through three phases:
- Check: in this phase, the running application processes are warned that there is a possible change being made. The process can ignore this (the default) or elect to cancel the change and stop it from happening.
- Pre: in this phase, processes are warned that the change is about to happen, so get ready now if the process wants to take some action. When resources are being removed, for example, it might try to help by releasing memory, reducing the CPU time it is taking or using fewer threads.
- Post: in this phase, processes are told that the resource action is complete and to take action if needed. When resources have been added, for example, it might take extra memory for caching data, try to increase its CPU time or spin off more threads.
Below I take a number of actions and watch them appear in the nmon for AIX output file.
I extracted the lines we are covering in this blog with the following command for my particular nmon output file:
# grep BBBR blue_170622_1120.nmon
There are a lot of columns and the header line stretches out far too wide to be useful, so below I list them vertically:
BBBR, = the section of the nmon file
000, = used to get the records in the right order
when,
add,
remove,
cpu,
mem,
check,
pre,
doit,
post,
posterror,
force,
bindproc,
softpset,
hardpset,
plock,
pshm,
ent_cap,
var_wgt,
splpar_capable,
splpar_shared,
splpar_capped,
cap_constrained,
migrate,
hibernate,
partition,
wpar,
checkpoint,
restart,
logical_cpu,
bind_cpu,
memory_change,
capacity,
delta_cap,
old_serialno,
current_serialno,
lpar_number,
lpar_name
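As a sketch of how these column names can be put to use, the Python below pairs them with the fields of one BBBR row. The names `section` and `seq` for the first two columns are labels chosen here for illustration (the header line shows them as BBBR and 000), and the function names are hypothetical. The sample row is copied from the LPM output in this blog. The code assumes, as the sample rows suggest, that a flag column holds its own name when set and "-" when not.

```python
# Column names in file order; "section" and "seq" are illustrative labels.
COLUMNS = [
    "section", "seq", "when", "add", "remove", "cpu", "mem",
    "check", "pre", "doit", "post", "posterror", "force",
    "bindproc", "softpset", "hardpset", "plock", "pshm",
    "ent_cap", "var_wgt", "splpar_capable", "splpar_shared",
    "splpar_capped", "cap_constrained",
    "migrate", "hibernate", "partition", "wpar", "checkpoint", "restart",
    "logical_cpu", "bind_cpu", "memory_change", "capacity", "delta_cap",
    "old_serialno", "current_serialno", "lpar_number", "lpar_name",
]

# The yes/no columns run from "add" to "restart".
FLAG_COLUMNS = COLUMNS[3:30]

SAMPLE_ROW = (
    "BBBR,004,11:25:13,-,-,-,-,-,-,-,post,-,-,-,-,-,-,-,-,-,"
    "splpar_capable,splpar_shared,-,-,migrate,-,partition,-,-,-,"
    "0,0,0,0,0,F9D494,F90EC7,17,w3-blue"
)


def parse_bbbr(line):
    """Split one BBBR data row into a dict keyed by column name."""
    fields = line.strip().split(",")
    if len(fields) != len(COLUMNS):
        raise ValueError(f"expected {len(COLUMNS)} fields, got {len(fields)}")
    return dict(zip(COLUMNS, fields))


def active_flags(record):
    """Return the flag columns that are set (value equals the column name)."""
    return [name for name in FLAG_COLUMNS if record[name] == name]


record = parse_bbbr(SAMPLE_ROW)
print(record["when"], active_flags(record))
print(record["old_serialno"], "->", record["current_serialno"])
```

Reading a row this way makes it easier to spot the phase, the action and the serial number change than counting commas by eye.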
Below is the LPM from a POWER8 server with serial number F9D494 to a POWER8 server with serial number F90EC7.
The important bits are the action columns, migrate and partition, and the serial number, which changes in the post row.
BBBR,000,when,add,remove,cpu,mem,check,pre,doit,post,posterror,force,bindproc,softpset,hardpset,plock,pshm,ent_cap,var_wgt,splpar_capable, . . . . . . . .
BBBR,001,11:23:09,-,-,-,-,check,-,-,-,-,-,-,-,-,-,-,-,-,splpar_capable,splpar_shared,-,-,migrate,-,partition,-,-,-,0,0,0,0,0,F9D494,F9D494,17,w3-blue
BBBR,002,11:23:41,-,-,-,-,check,-,-,-,-,-,-,-,-,-,-,-,-,splpar_capable,splpar_shared,-,-,migrate,-,partition,-,-,-,0,0,0,0,0,F9D494,F9D494,17,w3-blue
BBBR,003,11:24:37,-,-,-,-,-,pre,-,-,-,-,-,-,-,-,-,-,-,splpar_capable,splpar_shared,-,-,migrate,-,partition,-,-,-,0,0,0,0,0,F9D494,F9D494,17,w3-blue
BBBR,004,11:25:13,-,-,-,-,-,-,-,post,-,-,-,-,-,-,-,-,-,splpar_capable,splpar_shared,-,-,migrate,-,partition,-,-,-,0,0,0,0,0,F9D494,F90EC7,17,w3-blue
BBBR,005,11:25:14,-,-,-,-,-,pre,-,-,-,-,-,-,-,-,-,-,-,splpar_capable,splpar_shared,-,-,-,-,-,-,-,-,0,0,0,0,0,F90EC7,F90EC7,17,w3-blue
BBBR,006,11:25:14,-,-,-,-,-,-,-,post,-,-,-,-,-,-,-,-,-,splpar_capable,splpar_shared,-,-,-,-,-,-,-,-,0,0,0,0,0,F90EC7,F90EC7,17,w3-blue
BBBR,007,11:25:14,-,-,-,-,-,pre,-,-,-,-,-,-,-,-,-,-,-,splpar_capable,splpar_shared,-,-,-,-,-,-,-,-,0,0,0,0,0,F90EC7,F90EC7,17,w3-blue
BBBR,008,11:25:14,-,-,-,-,-,-,-,post,-,-,-,-,-,-,-,-,-,splpar_capable,splpar_shared,-,-,-,-,-,-,-,-,0,0,0,0,0,F90EC7,F90EC7,17,w3-blue
End to end, the LPM took just 2 minutes.
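The 2 minutes comes from the "when" column of the first check row (11:23:09) and the last post row (11:25:14). A minimal Python sketch of that arithmetic follows; the function name is hypothetical and it assumes both timestamps fall on the same day.

```python
from datetime import datetime


def elapsed_seconds(start, end):
    """Seconds between two HH:MM:SS timestamps on the same day."""
    fmt = "%H:%M:%S"
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return int(delta.total_seconds())


# First check row to last post row of the LPM above.
print(elapsed_seconds("11:23:09", "11:25:14"))  # 125 seconds
```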
Below I migrate the VM back to the original POWER8 server:
BBBR,009,11:31:43,-,-,-,-,check,-,-,-,-,-,-,-,-,-,-,-,-,splpar_capable,splpar_shared,-,-,migrate,-,partition,-,-,-,0,0,0,0,0,F90EC7,F90EC7,17,w3-blue
BBBR,010,11:31:57,-,-,-,-,check,-,-,-,-,-,-,-,-,-,-,-,-,splpar_capable,splpar_shared,-,-,migrate,-,partition,-,-,-,0,0,0,0,0,F90EC7,F90EC7,17,w3-blue
BBBR,011,11:32:31,-,-,-,-,-,pre,-,-,-,-,-,-,-,-,-,-,-,splpar_capable,splpar_shared,-,-,migrate,-,partition,-,-,-,0,0,0,0,0,F90EC7,F90EC7,17,w3-blue
BBBR,012,11:33:10,-,-,-,-,-,-,-,post,-,-,-,-,-,-,-,-,-,splpar_capable,splpar_shared,-,-,migrate,-,partition,-,-,-,0,0,0,0,0,F90EC7,F9D494,17,w3-blue
BBBR,013,11:33:10,-,-,-,-,-,pre,-,-,-,-,-,-,-,-,-,-,-,splpar_capable,splpar_shared,-,-,-,-,-,-,-,-,0,0,0,0,0,F9D494,F9D494,17,w3-blue
BBBR,014,11:33:10,-,-,-,-,-,-,-,post,-,-,-,-,-,-,-,-,-,splpar_capable,splpar_shared,-,-,-,-,-,-,-,-,0,0,0,0,0,F9D494,F9D494,17,w3-blue
BBBR,015,11:33:10,-,-,-,-,-,pre,-,-,-,-,-,-,-,-,-,-,-,splpar_capable,splpar_shared,-,-,-,-,-,-,-,-,0,0,0,0,0,F9D494,F9D494,17,w3-blue
BBBR,016,11:33:10,-,-,-,-,-,-,-,post,-,-,-,-,-,-,-,-,-,splpar_capable,splpar_shared,-,-,-,-,-,-,-,-,0,0,0,0,0,F9D494,F9D494,17,w3-blue
Below I perform a DLPAR change from 4 Virtual CPUs (VP) to 3 Virtual CPUs (VP) - this results in this SMT=4 virtual machine with 16 CPU core threads being reduced to 12.
It does this by removing one CPU core thread at a time rather than in one large jump.
Note the action columns (remove and cpu) and the reducing value in the change columns (logical_cpu and bind_cpu), going down to 15 ... ending at 12.
BBBR,017,11:47:15,-,remove,cpu,-,check,-,-,-,-,-,-,-,-,-,-,-,-,splpar_capable,splpar_shared,-,-,-,-,-,-,-,-,15,15,0,0,0,F9D494,F9D494,17,w3-blue
BBBR,018,11:47:15,-,remove,cpu,-,-,pre,-,-,-,-,-,-,-,-,-,-,-,splpar_capable,splpar_shared,-,-,-,-,-,-,-,-,15,15,0,0,0,F9D494,F9D494,17,w3-blue
BBBR,019,11:47:15,-,remove,cpu,-,-,-,-,post,-,-,-,-,-,-,-,-,-,splpar_capable,splpar_shared,-,-,-,-,-,-,-,-,15,15,0,0,0,F9D494,F9D494,17,w3-blue
BBBR,020,11:47:15,-,remove,cpu,-,check,-,-,-,-,-,-,-,-,-,-,-,-,splpar_capable,splpar_shared,-,-,-,-,-,-,-,-,14,14,0,0,0,F9D494,F9D494,17,w3-blue
BBBR,021,11:47:15,-,remove,cpu,-,-,pre,-,-,-,-,-,-,-,-,-,-,-,splpar_capable,splpar_shared,-,-,-,-,-,-,-,-,14,14,0,0,0,F9D494,F9D494,17,w3-blue
BBBR,022,11:47:15,-,remove,cpu,-,-,-,-,post,-,-,-,-,-,-,-,-,-,splpar_capable,splpar_shared,-,-,-,-,-,-,-,-,14,14,0,0,0,F9D494,F9D494,17,w3-blue
BBBR,023,11:47:15,-,remove,cpu,-,check,-,-,-,-,-,-,-,-,-,-,-,-,splpar_capable,splpar_shared,-,-,-,-,-,-,-,-,13,13,0,0,0,F9D494,F9D494,17,w3-blue
BBBR,024,11:47:15,-,remove,cpu,-,-,pre,-,-,-,-,-,-,-,-,-,-,-,splpar_capable,splpar_shared,-,-,-,-,-,-,-,-,13,13,0,0,0,F9D494,F9D494,17,w3-blue
BBBR,025,11:47:15,-,remove,cpu,-,-,-,-,post,-,-,-,-,-,-,-,-,-,splpar_capable,splpar_shared,-,-,-,-,-,-,-,-,13,13,0,0,0,F9D494,F9D494,17,w3-blue
BBBR,026,11:47:15,-,remove,cpu,-,check,-,-,-,-,-,-,-,-,-,-,-,-,splpar_capable,splpar_shared,-,-,-,-,-,-,-,-,12,12,0,0,0,F9D494,F9D494,17,w3-blue
BBBR,027,11:47:15,-,remove,cpu,-,-,pre,-,-,-,-,-,-,-,-,-,-,-,splpar_capable,splpar_shared,-,-,-,-,-,-,-,-,12,12,0,0,0,F9D494,F9D494,17,w3-blue
BBBR,028,11:47:15,-,remove,cpu,-,-,-,-,post,-,-,-,-,-,-,-,-,-,splpar_capable,splpar_shared,-,-,-,-,-,-,-,-,12,12,0,0,0,F9D494,F9D494,17,w3-blue
Below I perform a DLPAR change from Entitlement 2.0 processing units (CPU cores in human speak) to 3.0 processing units. Adding resources can be done in one jump as it will not cause performance issues when new resources arrive.
Note: the AIX kernel does not do floating point maths (like most UNIX kernels, for performance reasons in User to Kernel mode changes), so an Entitlement of 2.00 is represented as 200.
So note the action
- add
- ent_cap (entitled_capacity)
columns and the change columns
- capacity = 200 is the old Entitlement
- delta_cap = 100 is the added Entitlement (100 = 1.0)
BBBR,029,11:48:10,add,-,-,-,check,-,-,-,-,-,-,-,-,-,-,ent_cap,-,splpar_capable,splpar_shared,-,-,-,-,-,-,-,-,0,0,0,200,100,F9D494,F9D494,17,w3-blue
BBBR,030,11:48:10,add,-,-,-,-,pre,-,-,-,-,-,-,-,-,-,ent_cap,-,splpar_capable,splpar_shared,-,-,-,-,-,-,-,-,0,0,0,200,100,F9D494,F9D494,17,w3-blue
BBBR,031,11:48:10,add,-,-,-,-,-,-,post,-,-,-,-,-,-,-,ent_cap,-,splpar_capable,splpar_shared,-,-,-,-,-,-,-,-,0,0,0,200,100,F9D494,F9D494,17,w3-blue
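To turn the integer values in the capacity and delta_cap columns back into processing units, divide by 100. A minimal Python sketch, using the values from the rows above (the function name is hypothetical, and only the "add" case shown in this blog is covered):

```python
def to_processing_units(hundredths):
    """Convert the kernel's integer hundredths into processing units."""
    return hundredths / 100


capacity, delta_cap = 200, 100  # values from the BBBR rows above

old_entitlement = to_processing_units(capacity)
new_entitlement = to_processing_units(capacity + delta_cap)
print(old_entitlement, "->", new_entitlement)  # 2.0 -> 3.0
```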
Later I added 2 GB of memory to the LPAR.
So note the action columns (add and mem) and the change column:
- memory_change = 2048 is the 2 GB of memory (2048 MB)
BBBR,001,17:31:19,add,-,-,mem,check,-,-,-,-,-,-,-,-,-,-,-,-,splpar_capable,splpar_shared,-,-,-,-,-,-,-,-,0,0,2048,0,0,F9D494,F9D494,17,w3-blue
BBBR,002,17:31:19,add,-,-,mem,-,pre,-,-,-,-,-,-,-,-,-,-,-,splpar_capable,splpar_shared,-,-,-,-,-,-,-,-,0,0,2048,0,0,F9D494,F9D494,17,w3-blue
BBBR,003,17:31:20,add,-,-,mem,-,-,-,post,-,-,-,-,-,-,-,-,-,splpar_capable,splpar_shared,-,-,-,-,-,-,-,-,0,0,2048,0,0,F9D494,F9D494,17,w3-blue
You may have noticed there are many more possible Dynamic events than are shown here, but I think you get the idea from these four favourite DLPAR changes.
I hope this helps, cheers Nigel Griffiths.