Troubleshooting
Problem
How to change /etc/nagios/nrpe.cfg on the cluster
Resolving The Problem
How to customize the defined thresholds for Nagios checks in /etc/nagios/nrpe.cfg
When Nagios complains about the process count on essentially idle nodes because of the high number of per-core kernel threads,
you may need to raise the thresholds defined for the Nagios checks in /etc/nagios/nrpe.cfg.
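Before choosing new thresholds, it can help to see how many processes an idle node actually reports. The following is a minimal sketch, not the plugin's own logic: count_states is a hypothetical helper, and it approximates the "-s RSZT" filter by looking only at the first character of the ps state column.

```shell
# Hypothetical helper: count lines on stdin whose process state starts with R, S, Z or T.
# This only approximates what check_procs -s RSZT counts.
count_states() {
    grep -c '^[RSZT]'
}

# Example use on a node (prints a single number):
#   ps -e -o stat= | count_states
```

Comparing the number on an idle node with the current -w and -c values gives a rough idea of how much headroom the new thresholds need.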
Note: compute-centos-5.4-x86_64 is used as the example node group
1) Edit the template /etc/nagios/nrpe.tpl on the installer. For example, change the check_total_procs thresholds as follows:
command[check_total_procs]=/usr/lib64/nagios/plugins/check_procs -w 150 -c 200 -s RSZT ==>
command[check_total_procs]=/usr/lib64/nagios/plugins/check_procs -w 200 -c 300 -s RSZT
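The same edit can be scripted instead of made by hand. This is a sketch under assumptions: set_procs_thresholds is a hypothetical helper, not part of the product, and it assumes the template line uses the "-w N -c N" form shown above. It keeps a backup copy with a .bak suffix next to the template.

```shell
# Hypothetical helper: rewrite the -w/-c thresholds of check_total_procs in a template file.
# Usage: set_procs_thresholds FILE WARN CRIT
set_procs_thresholds() {
    tpl="$1"; warn="$2"; crit="$3"
    # Keep a backup, then rewrite only the check_total_procs line from the backup.
    cp -p "$tpl" "$tpl.bak" || return 1
    sed "/^command\[check_total_procs]=/ s/-w [0-9][0-9]* -c [0-9][0-9]*/-w $warn -c $crit/" \
        "$tpl.bak" > "$tpl"
}

# Example use on the installer:
#   set_procs_thresholds /etc/nagios/nrpe.tpl 200 300
```

Other command[...] lines in the template are left unchanged, because the substitution is restricted to the line that starts with command[check_total_procs]=.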
2) Run "addhost -u" on the installer to regenerate /etc/cfm/compute-centos-5.4-x86_64/etc/nagios/nrpe.cfg
from the template.
This also synchronizes the file to /etc/nagios/nrpe.cfg on the nodes in the node group
and triggers a restart of the Nagios daemons on those nodes.
3) Run the following command to restart NRPE on all the monitored nodes:
pdsh -a service nrpe restart
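After the steps above, it may be worth confirming that the new thresholds actually reached the generated file. The following is a minimal sketch: has_procs_thresholds is a hypothetical helper, and the pdsh line in the comments assumes grep is available on the nodes.

```shell
# Hypothetical helper: succeed if FILE defines check_total_procs with the given thresholds.
# Usage: has_procs_thresholds FILE WARN CRIT
has_procs_thresholds() {
    file="$1"; warn="$2"; crit="$3"
    grep -q "^command\[check_total_procs]=.* -w $warn -c $crit " "$file"
}

# Example use on the installer:
#   has_procs_thresholds /etc/cfm/compute-centos-5.4-x86_64/etc/nagios/nrpe.cfg 200 300 \
#       && echo "thresholds updated"
# To inspect the copy on every node:
#   pdsh -a 'grep check_total_procs /etc/nagios/nrpe.cfg'
```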
Document Information
Modified date:
16 September 2018
UID
isg3T1015783