IBM Support

How can I understand performance issues from nmon data?

How To


Summary

Learning UNIX and Linux server performance tuning is not a simple topic: it takes time and experience.

Objective

Inject some reality into those entering the world of performance tuning who expect it to be easy and to take a couple of hours.

Environment

An AIX or Linux system to experiment on, so you can learn the tools, the stats and how a computer actually works.

Steps

I get this question about twice a month so I thought I would answer it here and refer to this Article.

This is a non-trivial question as it is complicated.

If you are a "just give me the basics on the back of this envelope" sort of person then do one of the three below:

  1. If you want the real easy answer hire me for 10,000 €$£ (Euro Dollar Pounds) a day plus expenses and I will do it for you and teach you at the same time using "1 to 1" skills transfer.
  2. Or phone your local IBM Representative and get our local Business Partner (BP) or IBM Lab Services involved.
  3. If you have a critical production server/LPAR/VM with serious performance issues and a paid-up support contract, raise a Problem Management Report (PMR) - now called a Support Case.  Then prepare a snap and perfPMR (for AIX) ready to upload (when asked) and then confess to Support what you just changed and messed up :-)  Oh, and if you are not at the supported level of HMC, System Firmware, AIX or Linux OS, then prepare to work the weekend upgrading.

First, check if you have something broken: don't tune the engine if a wheel has fallen off!  On AIX check the error log (errpt) and on Linux use dmesg to see if the OS is reporting problems.
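
As a minimal sketch of that first check, the Python below picks the log command for the platform and returns its output lines. It assumes `errpt` (the AIX command that reports the error log) or `dmesg` is on the PATH and that you are allowed to run it; the function names `health_check_command` and `os_reported_problems` are invented for this illustration.

```python
import platform
import subprocess


def health_check_command(system_name):
    """Return the command that lists OS-reported problems for a platform."""
    if system_name == "AIX":
        return ["errpt"]   # summary report of the AIX error log
    if system_name == "Linux":
        return ["dmesg"]   # kernel ring buffer messages
    raise ValueError(f"no health check known for {system_name!r}")


def os_reported_problems():
    """Run the platform's command and return its output as a list of lines."""
    command = health_check_command(platform.system())
    # check=False: a non-zero exit (for example, no permission) is not fatal here
    result = subprocess.run(command, capture_output=True, text=True, check=False)
    return result.stdout.splitlines()
```

Calling `os_reported_problems()` on the server and reading what comes back is the point: anything about failing disks, adapters or memory comes before any tuning.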

If you want to learn, then you first have to understand how UNIX computers actually work: CPU, CPU cores, CPU core threads, logical + physical + virtual CPUs, memory, memory caches, virtual memory, paging space, disks, adapters, device drivers, networks, C code at a detail level (best to have 5 years C coding experience), kernel knowledge etc.

After all, you can't start tuning a car engine unless you know how all the car parts work together.

A Take a course on AIX Performance Tuning and another on Linux Performance Tuning, plus read 2 or 3 large books on each, including one on UNIX/Linux Kernel internals.

B Read all the manual pages for all the performance tools - nmon/njmon reports, in a clear way, the performance stats that you can get from these tools: ps, vmstat, sar, iostat, lparstat, mpstat, lvmstat, filemon, svmon, plus on AIX the "o" commands (aso, ioo, lvmo, nfso, no, raso, schedo, vmo, ohno - one of these is a joke!).

C Run nmon -h and actually read every line - three times - you need to understand and remember it all.

D Read the older IBM Redbooks on Tuning, Benchmarking, Databases and Performance and the new ones on the internal components of POWER8/POWER9 servers.

E Watch all my YouTube videos - look for my YouTube video channel (roughly 190 videos).

F Read all my AIXpert Blogs for the last 4 years - actually all of them. 

G Get to a few of the IBM Technical University conferences and take the performance tuning sessions from worldwide experts.  

H Then ...  work in a Benchmarking Centre for half a dozen years or actually watch many large busy servers running for many weeks:

  • Can you explain ALL the numbers?
  • Which are "out of whack"?
  • Which are outside of the Best Practice settings?
  • Ask yourself: What is holding back the server?

All this should take you about 5 to 10 years, if you focus on it.

If I was starting now?

I would look at implementing njmon (note the "j") for JSON stats collection and add the stats to:

  • InfluxDB + Grafana,
  • Splunk,
  • ELK (the Elastic Stack: Elasticsearch, Logstash, Kibana),
  • or other similar live stats databases and browser based graphing engines. 
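
To give a feel for what "adding JSON stats to InfluxDB" involves, here is a minimal sketch that turns one JSON-style snapshot into InfluxDB line protocol (`measurement,tag=value field=value timestamp`). The snapshot layout, host name and function names are assumptions made up for this example; they are not the real njmon schema, and njmon's own injector tooling does this job for you.

```python
def escape_measurement(text):
    """Measurement names need commas and spaces escaped."""
    return str(text).replace(",", "\\,").replace(" ", "\\ ")


def escape_key(text):
    """Tag keys, tag values and field keys need commas, equals and spaces escaped."""
    return str(text).replace(",", "\\,").replace("=", "\\=").replace(" ", "\\ ")


def format_value(value):
    """Render a field value: integers get an 'i' suffix, strings are quoted."""
    if isinstance(value, bool):      # test bool first: bool is a subclass of int
        return "true" if value else "false"
    if isinstance(value, int):
        return f"{value}i"
    if isinstance(value, float):
        return repr(value)
    escaped = str(value).replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def snapshot_to_lines(snapshot):
    """Emit one line per stats section, tagged with the host name."""
    host = escape_key(snapshot["host"])
    timestamp_ns = snapshot["timestamp_ns"]
    lines = []
    for section, stats in snapshot.items():
        if not isinstance(stats, dict) or not stats:
            continue                 # skip "host", "timestamp_ns" and empty sections
        fields = ",".join(
            f"{escape_key(name)}={format_value(value)}" for name, value in stats.items()
        )
        lines.append(f"{escape_measurement(section)},host={host} {fields} {timestamp_ns}")
    return lines


# Hypothetical snapshot, invented for illustration only.
sample = {
    "host": "lpar42",
    "timestamp_ns": 1686744000000000000,
    "cpu_total": {"user": 12.5, "sys": 3.0, "idle": 84.5},
    "memory": {"free_mb": 2048},
}

for line in snapshot_to_lines(sample):
    print(line)
```

Each printed line is one point that InfluxDB can ingest and Grafana can then graph; the host tag is what lets you compare many LPARs/VMs on one dashboard.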

If you have lots of nmon files from lots of LPARs/VMs you have a data management issue and need to avoid Excel.  So, investigate using nmonchart (really fast at making the graphs and 100% automated)  or even nmon2JSON and the tools above for data management and graphing. Plus you can quickly merge in other JSON data sources like my own "nextract" HMC data.
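
For a flavour of the data management side, below is a minimal sketch that reads nmon capture lines into JSON records. It assumes the usual nmon file layout: a header line per section naming the columns, `ZZZZ` lines carrying the time for each snapshot id (T0001, T0002, ...), and data lines keyed by that id. It is not the nmon2JSON tool, it ignores sections that do not follow this layout (such as TOP), and the sample data is invented.

```python
import csv
import json


def to_number(text):
    """Return the value as a float, or None if it is blank or not numeric."""
    try:
        return float(text)
    except ValueError:
        return None


def parse_nmon(lines):
    """Turn nmon capture lines into a list of per-snapshot records."""
    headers = {}      # section name -> list of column names
    timestamps = {}   # snapshot id, e.g. "T0001" -> "time date"
    records = []
    for row in csv.reader(lines):
        if len(row) < 3:
            continue                          # blank or malformed line
        section, key, values = row[0], row[1], row[2:]
        if section in ("AAA", "BBBP"):
            continue                          # configuration details, not stats
        if section == "ZZZZ":
            timestamps[key] = " ".join(values)
        elif key in timestamps and section in headers:
            stats = {}
            for name, value in zip(headers[section], values):
                number = to_number(value)
                if number is not None:
                    stats[name] = number
            records.append({"section": section, "snapshot": key,
                            "time": timestamps[key], "stats": stats})
        elif key not in timestamps and section not in headers:
            headers[section] = values         # first sighting: the column names
    return records


# Invented sample in the assumed layout.
sample_lines = [
    "AAA,host,lpar42",
    "CPU_ALL,CPU Total lpar42,User%,Sys%,Wait%,Idle%",
    "ZZZZ,T0001,12:00:01,14-JUN-2023",
    "CPU_ALL,T0001,12.5,3.0,0.5,84.0",
    "ZZZZ,T0002,12:00:11,14-JUN-2023",
    "CPU_ALL,T0002,20.0,5.0,1.0,74.0",
]

print(json.dumps(parse_nmon(sample_lines), indent=2))
```

Once the stats are records like these, merging files from many LPARs/VMs is a matter of adding a host field and loading them into one of the tools above, rather than opening spreadsheets one at a time.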

There are other tools that can accept nmon data, but njmon captures far more stats and can be live streamed to the above tools for exploring the data live.

If you can't find links to the above - then you just failed the Performance Tuning IQ test :-)

- You will have to develop your sense of humour too!

Cheers, Nigel

ps: Below is an example of real-time graphics from njmon -> InfluxDB -> Grafana to give you an idea of what you may be missing!

[Image: njmon stats displayed in Grafana]

    Additional Information


    Other places to find content from Nigel Griffiths IBM (retired)

    Document Location

    Worldwide

    Document Information

    Modified date:
    14 June 2023

    UID

    ibm11114101