IBM Support

Why does the parameter JOB_IDLE configured in lsb.queues sometimes not take effect?

Troubleshooting


Problem

Why does the parameter JOB_IDLE configured in lsb.queues sometimes not take effect?

Symptom

JOB_IDLE configured in lsb.queues specifies a threshold for idle job exception handling.
The value should be a number between 0.0 and 1.0 representing the job idle factor, that is, CPU time divided by run time.
If the job idle factor is less than the specified threshold, LSF invokes LSF_SERVERDIR/eadmin, which sends an email for the job idle exception.
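
As a rough illustration of the comparison described above, the following Python sketch computes an idle factor and compares it with a JOB_IDLE threshold. The function names are hypothetical and this is not LSF code; it only assumes the definition given here (CPU time divided by run time).

```python
def idle_factor(cpu_time_s: float, run_time_s: float) -> float:
    """Return CPU time divided by run time (0.0 if the job has not run yet)."""
    if run_time_s <= 0:
        return 0.0
    return cpu_time_s / run_time_s


def below_job_idle(cpu_time_s: float, run_time_s: float, job_idle: float) -> bool:
    """True when the idle factor is less than the JOB_IDLE threshold."""
    return idle_factor(cpu_time_s, run_time_s) < job_idle
```

For instance, a job that has run for 1000 seconds with no CPU time has an idle factor of 0, which is below a threshold of 0.6.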

The interval between eadmin invocations is controlled by the EADMIN_TRIGGER_DURATION parameter in lsb.params.

Sometimes the administrator does not receive a job idle exception email from LSF even after setting JOB_IDLE for a specific queue (for example, the queue normal) in lsb.queues.

For example:
1. Set JOB_IDLE=0.6 for the queue normal in lsb.queues.

2. Set EADMIN_TRIGGER_DURATION=2 (minutes) in lsb.params.

3. Submit a job to the queue normal that runs for 1000 seconds and uses no CPU time, so its job idle factor is 0, well below the threshold of 0.6.

The administrator still does not receive a job idle exception email from LSF.

Cause

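The DETECT_IDLE_JOB_AFTER parameter in lsb.params controls how long a job must run before mbatchd reports it as idle. With the default value, the job in this example finishes before that minimum run time is reached.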

Diagnosing The Problem

Syntax
DETECT_IDLE_JOB_AFTER=time_minutes

Description
The minimum job run time before mbatchd reports that the job is idle.

Default
20 (mbatchd checks if the job is idle after 20 minutes of run time)

Resolving The Problem

The default value of the DETECT_IDLE_JOB_AFTER parameter is 20 minutes. When a job's run time is less than 20 minutes (the 1000-second job in the example runs for about 16.7 minutes), the job finishes before mbatchd has a chance to report it as idle, so the administrator does not receive a job idle exception email from LSF.

In the scenario above, if you set DETECT_IDLE_JOB_AFTER=1 in lsb.params, mbatchd can report the job as idle after 1 minute of run time, and the administrator receives the job idle exception email from LSF.
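
The interaction between the two parameters can be sketched as a simplified model in Python. This is an illustration only, not how mbatchd is implemented: the function name is hypothetical, and the model ignores the timing of EADMIN_TRIGGER_DURATION and assumes the job's total run time and CPU time are known.

```python
def idle_exception_expected(cpu_time_s: float,
                            run_time_s: float,
                            job_idle: float,
                            detect_idle_job_after_min: int = 20) -> bool:
    """Simplified model: is a job idle exception expected for this job?

    The job must run at least DETECT_IDLE_JOB_AFTER minutes before it is
    checked, and its idle factor (CPU time / run time) must be below JOB_IDLE.
    """
    if run_time_s <= 0:
        return False
    # Jobs shorter than the minimum run time are never checked for idleness.
    if run_time_s < detect_idle_job_after_min * 60:
        return False
    return (cpu_time_s / run_time_s) < job_idle


# Scenario from this document: 1000-second job, no CPU time, JOB_IDLE=0.6.
with_default = idle_exception_expected(0, 1000, 0.6)       # default of 20 minutes
with_one_min = idle_exception_expected(0, 1000, 0.6, 1)    # DETECT_IDLE_JOB_AFTER=1
```

Under this model, `with_default` is False because 1000 seconds is shorter than 20 minutes (1200 seconds), while `with_one_min` is True.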

Product: Platform LSF
Platform: Linux
Version: 9.1.0; 9.1.1; 9.1.2; 9.1.3; Version Independent
Edition: Advanced; Enterprise; Standard

Document Information

Modified date:
17 June 2018

UID

isg3T1026333