IBM Support

nmon for AIX detects LPM & DLPAR Changes

How To


Summary

The nmon performance monitor detects and reports Live Partition Mobility and Dynamic LPAR (VM) changes, such as adding or removing CPU, memory and adapters.


Steps

It seems that not many nmon users know that nmon for AIX detects many AIX events and either shows them on-screen or saves them to the nmon file.

  • That is the problem with being the nmon designer and developer for so many years - it is called expert blindness.
  • It is totally obvious to me, until a respected nmon user asks a question and my initial reaction is "how can you not know the answer to that?"

Among a number of other events, nmon can detect and record these:

  1. Live Partition Mobility (LPM), where you move a running Logical Partition (LPAR), also called a Virtual Machine (VM), between two POWER servers.
  2. Dynamic LPAR (DLPAR) CPU changes (add or remove), like the Entitlement (E) or Virtual CPU count (VP).
  3. Dynamic LPAR (DLPAR) memory changes (add or remove).

Probably more details than you need to know:

How does a program do this?
These events are known as AIX Dynamic Resource changes (DR for short, not to be confused with Disaster Recovery).
nmon has to set up a signal handler function and then make a system call to let the kernel know it wants to get a software signal when the events happen. It then gets an asynchronous interrupt a millisecond or two after the event occurs.
nmon makes a note of the event details, returns from the signal handler immediately, and handles the data at the next update to the screen or write to the output file.
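The "note it in the handler, deal with it later" pattern described above can be sketched in a few lines of Python. This is only an illustration of the idea and not nmon's code: nmon is not written in Python, the signal `SIGUSR1` is a stand-in for the AIX reconfiguration signal, and the names `pending_events`, `on_reconfig` and `drain_events` are made up for this sketch. It assumes a POSIX system.

```python
import os
import signal

# Assumption: SIGUSR1 stands in for the AIX dynamic reconfiguration signal.
# The AIX-specific registration system call is not shown here.
pending_events = []  # filled by the handler, emptied by the main loop


def on_reconfig(signum, frame):
    """Do the minimum inside the handler: note the event and return."""
    pending_events.append(signum)


def drain_events():
    """Called at the next screen update or file write to process noted events."""
    handled = list(pending_events)
    pending_events.clear()
    return handled


signal.signal(signal.SIGUSR1, on_reconfig)

# Simulate the kernel telling this process about one event.
os.kill(os.getpid(), signal.SIGUSR1)

for signum in drain_events():
    print("dynamic resource event noted, signal number", signum)
```

Keeping the handler this small means the monitoring loop is never held up by an event; the detail work happens at the next regular update.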


These are recorded in the nmon for AIX output file as BBBR rows, shown below. Changes go through three phases:

  1. Check: in this phase, the running application processes are warned that a change might be made. The process can ignore this (the default) or elect to cancel the change and stop it from happening.
  2. Pre: in this phase, processes are warned that the change is about to happen, so a process that wants to take some action should get ready now. For example, when resources are being removed, it might try to help by releasing memory, reducing the CPU time it is taking or using fewer threads.
  3. Post: in this phase, processes are told that the resource action is complete and to take action if needed. For example, when resources have been added, it might take extra memory for caching data, try to increase its CPU time or spin off more threads.

Below I take a number of actions and watch them appear in the nmon for AIX output file.

I extracted the lines we are covering in this blog with the following command for my particular nmon output file:

# grep BBBR blue_170622_1120.nmon

There are a lot of columns and the header line stretches out far too wide to be useful, so below I list them vertically:

    BBBR, = the section of the nmon file
    000, = a sequence number, used to get the records in the right order
    when,
    add,
    remove,
    cpu,
    mem,
    check,
    pre,
    doit,
    post,
    posterror,
    force,
    bindproc,
    softpset,
    hardpset,
    plock,
    pshm,
    ent_cap,
    var_wgt,
    splpar_capable,
    splpar_shared,
    splpar_capped,
    cap_constrained,
    migrate,
    hibernate,
    partition,
    wpar,
    checkpoint,
    restart,
    logical_cpu,
    bind_cpu,
    memory_change,
    capacity,
    delta_cap,
    old_serialno,
    current_serialno,
    lpar_number,
    lpar_name
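With the column names listed above, a BBBR row can be turned into a name-to-value mapping, which is much easier to read than a wide line. The following is a minimal sketch, not part of nmon: the names `section` and `seq` for the first two columns (shown as `BBBR` and `000` in the header) and the helper names `parse_bbbr` and `active_flags` are made up for this example.

```python
# Column names in the order listed above. "section" and "seq" are made-up
# names for the first two columns (the header shows them as BBBR and 000).
BBBR_COLUMNS = (
    "section,seq,when,add,remove,cpu,mem,check,pre,doit,post,posterror,force,"
    "bindproc,softpset,hardpset,plock,pshm,ent_cap,var_wgt,splpar_capable,"
    "splpar_shared,splpar_capped,cap_constrained,migrate,hibernate,partition,"
    "wpar,checkpoint,restart,logical_cpu,bind_cpu,memory_change,capacity,"
    "delta_cap,old_serialno,current_serialno,lpar_number,lpar_name"
).split(",")

# The columns from "add" to "restart" hold either "-" or their own name.
FLAG_COLUMNS = BBBR_COLUMNS[3:30]


def parse_bbbr(line):
    """Split one BBBR data row into a dict keyed by column name."""
    fields = line.strip().split(",")
    if len(fields) != len(BBBR_COLUMNS):
        raise ValueError(f"expected {len(BBBR_COLUMNS)} fields, got {len(fields)}")
    return dict(zip(BBBR_COLUMNS, fields))


def active_flags(row):
    """Return the flag columns that are set (anything other than '-')."""
    return [name for name in FLAG_COLUMNS if row[name] != "-"]


# A check-phase row from a Live Partition Mobility run.
sample = (
    "BBBR,001,11:23:09,-,-,-,-,check,-,-,-,-,-,-,-,-,-,-,-,-,splpar_capable,"
    "splpar_shared,-,-,migrate,-,partition,-,-,-,0,0,0,0,0,F9D494,F9D494,17,w3-blue"
)
row = parse_bbbr(sample)
print(row["when"], active_flags(row))
print(row["old_serialno"], row["current_serialno"], row["lpar_name"])
```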

 

Below is the LPM from a POWER8 server with serial number F9D494 to a POWER8 server with serial number F90EC7 - I have added colour to highlight the important bits.

The action is shown in the migrate and partition columns, and the serial number changes from F9D494 to F90EC7 in the post rows.

BBBR,000,when,add,remove,cpu,mem,check,pre,doit,post,posterror,force,bindproc,softpset,hardpset,plock,pshm,ent_cap,var_wgt,splpar_capable, . . . . . . . .
BBBR,001,11:23:09,-,-,-,-,check,-,-,-,-,-,-,-,-,-,-,-,-,splpar_capable,splpar_shared,-,-,migrate,-,partition,-,-,-,0,0,0,0,0,F9D494,F9D494,17,w3-blue
BBBR,001,11:23:09,-,-,-,-,check,-,-,-,-,-,-,-,-,-,-,-,-,splpar_capable,splpar_shared,-,-,migrate,-,partition,-,-,-,0,0,0,0,0,F9D494,F9D494,17,w3-blue
BBBR,002,11:23:41,-,-,-,-,check,-,-,-,-,-,-,-,-,-,-,-,-,splpar_capable,splpar_shared,-,-,migrate,-,partition,-,-,-,0,0,0,0,0,F9D494,F9D494,17,w3-blue
BBBR,002,11:23:41,-,-,-,-,check,-,-,-,-,-,-,-,-,-,-,-,-,splpar_capable,splpar_shared,-,-,migrate,-,partition,-,-,-,0,0,0,0,0,F9D494,F9D494,17,w3-blue
BBBR,003,11:24:37,-,-,-,-,-,pre,-,-,-,-,-,-,-,-,-,-,-,splpar_capable,splpar_shared,-,-,migrate,-,partition,-,-,-,0,0,0,0,0,F9D494,F9D494,17,w3-blue
BBBR,003,11:24:37,-,-,-,-,-,pre,-,-,-,-,-,-,-,-,-,-,-,splpar_capable,splpar_shared,-,-,migrate,-,partition,-,-,-,0,0,0,0,0,F9D494,F9D494,17,w3-blue
BBBR,004,11:25:13,-,-,-,-,-,-,-,post,-,-,-,-,-,-,-,-,-,splpar_capable,splpar_shared,-,-,migrate,-,partition,-,-,-,0,0,0,0,0,F9D494,F90EC7,17,w3-blue
BBBR,004,11:25:13,-,-,-,-,-,-,-,post,-,-,-,-,-,-,-,-,-,splpar_capable,splpar_shared,-,-,migrate,-,partition,-,-,-,0,0,0,0,0,F9D494,F90EC7,17,w3-blue
BBBR,005,11:25:14,-,-,-,-,-,pre,-,-,-,-,-,-,-,-,-,-,-,splpar_capable,splpar_shared,-,-,-,-,-,-,-,-,0,0,0,0,0,F90EC7,F90EC7,17,w3-blue
BBBR,005,11:25:14,-,-,-,-,-,pre,-,-,-,-,-,-,-,-,-,-,-,splpar_capable,splpar_shared,-,-,-,-,-,-,-,-,0,0,0,0,0,F90EC7,F90EC7,17,w3-blue
BBBR,006,11:25:14,-,-,-,-,-,-,-,post,-,-,-,-,-,-,-,-,-,splpar_capable,splpar_shared,-,-,-,-,-,-,-,-,0,0,0,0,0,F90EC7,F90EC7,17,w3-blue
BBBR,006,11:25:14,-,-,-,-,-,-,-,post,-,-,-,-,-,-,-,-,-,splpar_capable,splpar_shared,-,-,-,-,-,-,-,-,0,0,0,0,0,F90EC7,F90EC7,17,w3-blue
BBBR,007,11:25:14,-,-,-,-,-,pre,-,-,-,-,-,-,-,-,-,-,-,splpar_capable,splpar_shared,-,-,-,-,-,-,-,-,0,0,0,0,0,F90EC7,F90EC7,17,w3-blue
BBBR,007,11:25:14,-,-,-,-,-,pre,-,-,-,-,-,-,-,-,-,-,-,splpar_capable,splpar_shared,-,-,-,-,-,-,-,-,0,0,0,0,0,F90EC7,F90EC7,17,w3-blue
BBBR,008,11:25:14,-,-,-,-,-,-,-,post,-,-,-,-,-,-,-,-,-,splpar_capable,splpar_shared,-,-,-,-,-,-,-,-,0,0,0,0,0,F90EC7,F90EC7,17,w3-blue
BBBR,008,11:25:14,-,-,-,-,-,-,-,post,-,-,-,-,-,-,-,-,-,splpar_capable,splpar_shared,-,-,-,-,-,-,-,-,0,0,0,0,0,F90EC7,F90EC7,17,w3-blue

End to end, the LPM took just 2 minutes.

Below I migrate the VM back to the original POWER8 server:
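The elapsed time can be worked out from the rows themselves: take the time of the first migrate row (the check phase) and the time of the migrate row that reports post. The sketch below does this for three rows copied from the listing above. The function name `migration_seconds` and the zero-based field positions are conveniences of this example, and it assumes the migration does not run across midnight because the rows carry a time but no date.

```python
from datetime import datetime

# Zero-based positions of the fields used here within a BBBR row.
WHEN, POST, MIGRATE = 2, 10, 24

rows = [
    "BBBR,001,11:23:09,-,-,-,-,check,-,-,-,-,-,-,-,-,-,-,-,-,splpar_capable,"
    "splpar_shared,-,-,migrate,-,partition,-,-,-,0,0,0,0,0,F9D494,F9D494,17,w3-blue",
    "BBBR,003,11:24:37,-,-,-,-,-,pre,-,-,-,-,-,-,-,-,-,-,-,splpar_capable,"
    "splpar_shared,-,-,migrate,-,partition,-,-,-,0,0,0,0,0,F9D494,F9D494,17,w3-blue",
    "BBBR,004,11:25:13,-,-,-,-,-,-,-,post,-,-,-,-,-,-,-,-,-,splpar_capable,"
    "splpar_shared,-,-,migrate,-,partition,-,-,-,0,0,0,0,0,F9D494,F90EC7,17,w3-blue",
]


def migration_seconds(lines):
    """Seconds from the first migrate row to the migrate row in the post phase."""
    migrate_rows = [f for f in (line.split(",") for line in lines)
                    if f[MIGRATE] == "migrate"]
    start = datetime.strptime(migrate_rows[0][WHEN], "%H:%M:%S")
    end = next(datetime.strptime(f[WHEN], "%H:%M:%S")
               for f in migrate_rows if f[POST] == "post")
    return int((end - start).total_seconds())


print(migration_seconds(rows))  # 124 seconds, i.e. 2 minutes 4 seconds
```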

BBBR,009,11:31:43,-,-,-,-,check,-,-,-,-,-,-,-,-,-,-,-,-,splpar_capable,splpar_shared,-,-,migrate,-,partition,-,-,-,0,0,0,0,0,F90EC7,F90EC7,17,w3-blue
BBBR,009,11:31:43,-,-,-,-,check,-,-,-,-,-,-,-,-,-,-,-,-,splpar_capable,splpar_shared,-,-,migrate,-,partition,-,-,-,0,0,0,0,0,F90EC7,F90EC7,17,w3-blue
BBBR,010,11:31:57,-,-,-,-,check,-,-,-,-,-,-,-,-,-,-,-,-,splpar_capable,splpar_shared,-,-,migrate,-,partition,-,-,-,0,0,0,0,0,F90EC7,F90EC7,17,w3-blue
BBBR,010,11:31:57,-,-,-,-,check,-,-,-,-,-,-,-,-,-,-,-,-,splpar_capable,splpar_shared,-,-,migrate,-,partition,-,-,-,0,0,0,0,0,F90EC7,F90EC7,17,w3-blue
BBBR,011,11:32:31,-,-,-,-,-,pre,-,-,-,-,-,-,-,-,-,-,-,splpar_capable,splpar_shared,-,-,migrate,-,partition,-,-,-,0,0,0,0,0,F90EC7,F90EC7,17,w3-blue
BBBR,011,11:32:31,-,-,-,-,-,pre,-,-,-,-,-,-,-,-,-,-,-,splpar_capable,splpar_shared,-,-,migrate,-,partition,-,-,-,0,0,0,0,0,F90EC7,F90EC7,17,w3-blue
BBBR,012,11:33:10,-,-,-,-,-,-,-,post,-,-,-,-,-,-,-,-,-,splpar_capable,splpar_shared,-,-,migrate,-,partition,-,-,-,0,0,0,0,0,F90EC7,F9D494,17,w3-blue
BBBR,012,11:33:10,-,-,-,-,-,-,-,post,-,-,-,-,-,-,-,-,-,splpar_capable,splpar_shared,-,-,migrate,-,partition,-,-,-,0,0,0,0,0,F90EC7,F9D494,17,w3-blue
BBBR,013,11:33:10,-,-,-,-,-,pre,-,-,-,-,-,-,-,-,-,-,-,splpar_capable,splpar_shared,-,-,-,-,-,-,-,-,0,0,0,0,0,F9D494,F9D494,17,w3-blue
BBBR,013,11:33:10,-,-,-,-,-,pre,-,-,-,-,-,-,-,-,-,-,-,splpar_capable,splpar_shared,-,-,-,-,-,-,-,-,0,0,0,0,0,F9D494,F9D494,17,w3-blue
BBBR,014,11:33:10,-,-,-,-,-,-,-,post,-,-,-,-,-,-,-,-,-,splpar_capable,splpar_shared,-,-,-,-,-,-,-,-,0,0,0,0,0,F9D494,F9D494,17,w3-blue
BBBR,014,11:33:10,-,-,-,-,-,-,-,post,-,-,-,-,-,-,-,-,-,splpar_capable,splpar_shared,-,-,-,-,-,-,-,-,0,0,0,0,0,F9D494,F9D494,17,w3-blue
BBBR,015,11:33:10,-,-,-,-,-,pre,-,-,-,-,-,-,-,-,-,-,-,splpar_capable,splpar_shared,-,-,-,-,-,-,-,-,0,0,0,0,0,F9D494,F9D494,17,w3-blue
BBBR,015,11:33:10,-,-,-,-,-,pre,-,-,-,-,-,-,-,-,-,-,-,splpar_capable,splpar_shared,-,-,-,-,-,-,-,-,0,0,0,0,0,F9D494,F9D494,17,w3-blue
BBBR,016,11:33:10,-,-,-,-,-,-,-,post,-,-,-,-,-,-,-,-,-,splpar_capable,splpar_shared,-,-,-,-,-,-,-,-,0,0,0,0,0,F9D494,F9D494,17,w3-blue
BBBR,016,11:33:10,-,-,-,-,-,-,-,post,-,-,-,-,-,-,-,-,-,splpar_capable,splpar_shared,-,-,-,-,-,-,-,-,0,0,0,0,0,F9D494,F9D494,17,w3-blue

Below I perform a DLPAR change from 4 Virtual CPUs (VP) to 3 Virtual CPUs (VP). This results in this SMT=4 virtual machine with 16 CPU core threads being reduced to 12.

This is done as a series of single CPU core thread removals rather than one large jump.

Note the action columns

  • remove
  • cpu

and the falling values in the change columns

  • logical_cpu
  • bind_cpu

which go down to 15, then 14 and 13, ending at 12.

BBBR,017,11:47:15,-,remove,cpu,-,check,-,-,-,-,-,-,-,-,-,-,-,-,splpar_capable,splpar_shared,-,-,-,-,-,-,-,-,15,15,0,0,0,F9D494,F9D494,17,w3-blue
BBBR,017,11:47:15,-,remove,cpu,-,check,-,-,-,-,-,-,-,-,-,-,-,-,splpar_capable,splpar_shared,-,-,-,-,-,-,-,-,15,15,0,0,0,F9D494,F9D494,17,w3-blue
BBBR,018,11:47:15,-,remove,cpu,-,-,pre,-,-,-,-,-,-,-,-,-,-,-,splpar_capable,splpar_shared,-,-,-,-,-,-,-,-,15,15,0,0,0,F9D494,F9D494,17,w3-blue
BBBR,018,11:47:15,-,remove,cpu,-,-,pre,-,-,-,-,-,-,-,-,-,-,-,splpar_capable,splpar_shared,-,-,-,-,-,-,-,-,15,15,0,0,0,F9D494,F9D494,17,w3-blue
BBBR,019,11:47:15,-,remove,cpu,-,-,-,-,post,-,-,-,-,-,-,-,-,-,splpar_capable,splpar_shared,-,-,-,-,-,-,-,-,15,15,0,0,0,F9D494,F9D494,17,w3-blue
BBBR,019,11:47:15,-,remove,cpu,-,-,-,-,post,-,-,-,-,-,-,-,-,-,splpar_capable,splpar_shared,-,-,-,-,-,-,-,-,15,15,0,0,0,F9D494,F9D494,17,w3-blue
BBBR,020,11:47:15,-,remove,cpu,-,check,-,-,-,-,-,-,-,-,-,-,-,-,splpar_capable,splpar_shared,-,-,-,-,-,-,-,-,14,14,0,0,0,F9D494,F9D494,17,w3-blue
BBBR,020,11:47:15,-,remove,cpu,-,check,-,-,-,-,-,-,-,-,-,-,-,-,splpar_capable,splpar_shared,-,-,-,-,-,-,-,-,14,14,0,0,0,F9D494,F9D494,17,w3-blue
BBBR,021,11:47:15,-,remove,cpu,-,-,pre,-,-,-,-,-,-,-,-,-,-,-,splpar_capable,splpar_shared,-,-,-,-,-,-,-,-,14,14,0,0,0,F9D494,F9D494,17,w3-blue
BBBR,021,11:47:15,-,remove,cpu,-,-,pre,-,-,-,-,-,-,-,-,-,-,-,splpar_capable,splpar_shared,-,-,-,-,-,-,-,-,14,14,0,0,0,F9D494,F9D494,17,w3-blue
BBBR,022,11:47:15,-,remove,cpu,-,-,-,-,post,-,-,-,-,-,-,-,-,-,splpar_capable,splpar_shared,-,-,-,-,-,-,-,-,14,14,0,0,0,F9D494,F9D494,17,w3-blue
BBBR,022,11:47:15,-,remove,cpu,-,-,-,-,post,-,-,-,-,-,-,-,-,-,splpar_capable,splpar_shared,-,-,-,-,-,-,-,-,14,14,0,0,0,F9D494,F9D494,17,w3-blue
BBBR,023,11:47:15,-,remove,cpu,-,check,-,-,-,-,-,-,-,-,-,-,-,-,splpar_capable,splpar_shared,-,-,-,-,-,-,-,-,13,13,0,0,0,F9D494,F9D494,17,w3-blue
BBBR,023,11:47:15,-,remove,cpu,-,check,-,-,-,-,-,-,-,-,-,-,-,-,splpar_capable,splpar_shared,-,-,-,-,-,-,-,-,13,13,0,0,0,F9D494,F9D494,17,w3-blue
BBBR,024,11:47:15,-,remove,cpu,-,-,pre,-,-,-,-,-,-,-,-,-,-,-,splpar_capable,splpar_shared,-,-,-,-,-,-,-,-,13,13,0,0,0,F9D494,F9D494,17,w3-blue
BBBR,024,11:47:15,-,remove,cpu,-,-,pre,-,-,-,-,-,-,-,-,-,-,-,splpar_capable,splpar_shared,-,-,-,-,-,-,-,-,13,13,0,0,0,F9D494,F9D494,17,w3-blue
BBBR,025,11:47:15,-,remove,cpu,-,-,-,-,post,-,-,-,-,-,-,-,-,-,splpar_capable,splpar_shared,-,-,-,-,-,-,-,-,13,13,0,0,0,F9D494,F9D494,17,w3-blue
BBBR,025,11:47:15,-,remove,cpu,-,-,-,-,post,-,-,-,-,-,-,-,-,-,splpar_capable,splpar_shared,-,-,-,-,-,-,-,-,13,13,0,0,0,F9D494,F9D494,17,w3-blue
BBBR,026,11:47:15,-,remove,cpu,-,check,-,-,-,-,-,-,-,-,-,-,-,-,splpar_capable,splpar_shared,-,-,-,-,-,-,-,-,12,12,0,0,0,F9D494,F9D494,17,w3-blue
BBBR,026,11:47:15,-,remove,cpu,-,check,-,-,-,-,-,-,-,-,-,-,-,-,splpar_capable,splpar_shared,-,-,-,-,-,-,-,-,12,12,0,0,0,F9D494,F9D494,17,w3-blue
BBBR,027,11:47:15,-,remove,cpu,-,-,pre,-,-,-,-,-,-,-,-,-,-,-,splpar_capable,splpar_shared,-,-,-,-,-,-,-,-,12,12,0,0,0,F9D494,F9D494,17,w3-blue
BBBR,027,11:47:15,-,remove,cpu,-,-,pre,-,-,-,-,-,-,-,-,-,-,-,splpar_capable,splpar_shared,-,-,-,-,-,-,-,-,12,12,0,0,0,F9D494,F9D494,17,w3-blue
BBBR,028,11:47:15,-,remove,cpu,-,-,-,-,post,-,-,-,-,-,-,-,-,-,splpar_capable,splpar_shared,-,-,-,-,-,-,-,-,12,12,0,0,0,F9D494,F9D494,17,w3-blue
BBBR,028,11:47:15,-,remove,cpu,-,-,-,-,post,-,-,-,-,-,-,-,-,-,splpar_capable,splpar_shared,-,-,-,-,-,-,-,-,12,12,0,0,0,F9D494,F9D494,17,w3-blue
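To see the step-by-step reduction without reading every row, the post-phase rows can be filtered and the logical_cpu value pulled out of each. This is a small sketch using rows copied from the listing above; the function name `logical_cpu_steps` and the zero-based field positions are choices made for this example. Each record appears twice in the listing, so the sketch keeps only the first row for each sequence number.

```python
# Zero-based positions of the fields used here within a BBBR row.
SEQ, REMOVE, CPU, POST, LOGICAL_CPU = 1, 4, 5, 10, 30

rows = [
    "BBBR,019,11:47:15,-,remove,cpu,-,-,-,-,post,-,-,-,-,-,-,-,-,-,splpar_capable,"
    "splpar_shared,-,-,-,-,-,-,-,-,15,15,0,0,0,F9D494,F9D494,17,w3-blue",
    "BBBR,019,11:47:15,-,remove,cpu,-,-,-,-,post,-,-,-,-,-,-,-,-,-,splpar_capable,"
    "splpar_shared,-,-,-,-,-,-,-,-,15,15,0,0,0,F9D494,F9D494,17,w3-blue",
    "BBBR,022,11:47:15,-,remove,cpu,-,-,-,-,post,-,-,-,-,-,-,-,-,-,splpar_capable,"
    "splpar_shared,-,-,-,-,-,-,-,-,14,14,0,0,0,F9D494,F9D494,17,w3-blue",
    "BBBR,025,11:47:15,-,remove,cpu,-,-,-,-,post,-,-,-,-,-,-,-,-,-,splpar_capable,"
    "splpar_shared,-,-,-,-,-,-,-,-,13,13,0,0,0,F9D494,F9D494,17,w3-blue",
    "BBBR,028,11:47:15,-,remove,cpu,-,-,-,-,post,-,-,-,-,-,-,-,-,-,splpar_capable,"
    "splpar_shared,-,-,-,-,-,-,-,-,12,12,0,0,0,F9D494,F9D494,17,w3-blue",
]


def logical_cpu_steps(lines):
    """logical_cpu after each completed CPU removal, one entry per sequence number."""
    seen = set()
    steps = []
    for fields in (line.split(",") for line in lines):
        is_cpu_remove = fields[REMOVE] == "remove" and fields[CPU] == "cpu"
        if is_cpu_remove and fields[POST] == "post" and fields[SEQ] not in seen:
            seen.add(fields[SEQ])
            steps.append(int(fields[LOGICAL_CPU]))
    return steps


print(logical_cpu_steps(rows))  # [15, 14, 13, 12]
```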

Below I perform a DLPAR change from an Entitlement of 2.0 processing units (CPU cores in human speak) to 3.0 processing units. Adding resources can be done in one jump, as it will not cause a performance issue when new resources arrive.

Note: the AIX kernel does not do floating point maths (like most UNIX kernels, for performance reasons in user to kernel mode changes), so an Entitlement of 2.00 is represented as 200.

So note the action columns

  • add
  • ent_cap (entitled_capacity)

and the change columns

  • capacity = 200 is the old Entitlement
  • delta_cap = 100 is the added Entitlement (100 = 1.0)

BBBR,029,11:48:10,add,-,-,-,check,-,-,-,-,-,-,-,-,-,-,ent_cap,-,splpar_capable,splpar_shared,-,-,-,-,-,-,-,-,0,0,0,200,100,F9D494,F9D494,17,w3-blue
BBBR,029,11:48:10,add,-,-,-,check,-,-,-,-,-,-,-,-,-,-,ent_cap,-,splpar_capable,splpar_shared,-,-,-,-,-,-,-,-,0,0,0,200,100,F9D494,F9D494,17,w3-blue
BBBR,030,11:48:10,add,-,-,-,-,pre,-,-,-,-,-,-,-,-,-,ent_cap,-,splpar_capable,splpar_shared,-,-,-,-,-,-,-,-,0,0,0,200,100,F9D494,F9D494,17,w3-blue
BBBR,030,11:48:10,add,-,-,-,-,pre,-,-,-,-,-,-,-,-,-,ent_cap,-,splpar_capable,splpar_shared,-,-,-,-,-,-,-,-,0,0,0,200,100,F9D494,F9D494,17,w3-blue
BBBR,031,11:48:10,add,-,-,-,-,-,-,post,-,-,-,-,-,-,-,ent_cap,-,splpar_capable,splpar_shared,-,-,-,-,-,-,-,-,0,0,0,200,100,F9D494,F9D494,17,w3-blue
BBBR,031,11:48:10,add,-,-,-,-,-,-,post,-,-,-,-,-,-,-,ent_cap,-,splpar_capable,splpar_shared,-,-,-,-,-,-,-,-,0,0,0,200,100,F9D494,F9D494,17,w3-blue
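The hundredths representation is easy to convert back for display. The sketch below reads capacity and delta_cap from the post row above and prints the old and new Entitlement, using integer arithmetic only, in the same spirit as the kernel. The helper names and the zero-based field positions are made up for this example, and it only covers the add case shown here.

```python
# Zero-based positions of the fields used here within a BBBR row.
ADD, ENT_CAP, CAPACITY, DELTA_CAP = 3, 18, 33, 34


def hundredths_to_text(value):
    """Format an integer number of hundredths, for example 200 becomes '2.00'."""
    return f"{value // 100}.{value % 100:02d}"


def entitlement_after_add(line):
    """Return (old, new) Entitlement as text for an 'add ent_cap' BBBR row."""
    fields = line.split(",")
    if fields[ADD] != "add" or fields[ENT_CAP] != "ent_cap":
        raise ValueError("not an Entitlement add row")
    old = int(fields[CAPACITY])
    delta = int(fields[DELTA_CAP])
    return hundredths_to_text(old), hundredths_to_text(old + delta)


post_row = (
    "BBBR,031,11:48:10,add,-,-,-,-,-,-,post,-,-,-,-,-,-,-,ent_cap,-,splpar_capable,"
    "splpar_shared,-,-,-,-,-,-,-,-,0,0,0,200,100,F9D494,F9D494,17,w3-blue"
)
old, new = entitlement_after_add(post_row)
print(old, "->", new)  # 2.00 -> 3.00
```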

 

Later I added 2 GB of memory to the LPAR.

So note the action columns

  • add
  • mem = memory

and the change column

  • memory_change = 2048 is the 2 GB of memory (2048 MB)

BBBR,001,17:31:19,add,-,-,mem,check,-,-,-,-,-,-,-,-,-,-,-,-,splpar_capable,splpar_shared,-,-,-,-,-,-,-,-,0,0,2048,0,0,F9D494,F9D494,17,w3-blue
BBBR,001,17:31:19,add,-,-,mem,check,-,-,-,-,-,-,-,-,-,-,-,-,splpar_capable,splpar_shared,-,-,-,-,-,-,-,-,0,0,2048,0,0,F9D494,F9D494,17,w3-blue
BBBR,002,17:31:19,add,-,-,mem,-,pre,-,-,-,-,-,-,-,-,-,-,-,splpar_capable,splpar_shared,-,-,-,-,-,-,-,-,0,0,2048,0,0,F9D494,F9D494,17,w3-blue
BBBR,002,17:31:19,add,-,-,mem,-,pre,-,-,-,-,-,-,-,-,-,-,-,splpar_capable,splpar_shared,-,-,-,-,-,-,-,-,0,0,2048,0,0,F9D494,F9D494,17,w3-blue
BBBR,003,17:31:20,add,-,-,mem,-,-,-,post,-,-,-,-,-,-,-,-,-,splpar_capable,splpar_shared,-,-,-,-,-,-,-,-,0,0,2048,0,0,F9D494,F9D494,17,w3-blue
BBBR,003,17:31:20,add,-,-,mem,-,-,-,post,-,-,-,-,-,-,-,-,-,splpar_capable,splpar_shared,-,-,-,-,-,-,-,-,0,0,2048,0,0,F9D494,F9D494,17,w3-blue


You may have noticed there are many more possible Dynamic events than I have shown here, but I think you get the idea from these four favourite changes.

 I hope this helps, cheers Nigel Griffiths.

Additional Information


Other places to find content from Nigel Griffiths IBM (retired)

Document Location

Worldwide

Document Information

Modified date:
14 June 2023

UID

ibm11115625