Remote queue workload job-forwarding scheduler
Enhanced scheduler decisions can be customized to consider characteristics of remote queues before forwarding a job. Remote queue attributes such as queue priority, number of preemptable jobs, and queue workload are sent to the submission scheduler. The decisions made by the scheduler, based on this information, depend on the setting of MC_PLUGIN_SCHEDULE_ENHANCE in lsb.params.
Queue workload and configuration are considered in conjunction with remote resource availability (MC_PLUGIN_REMOTE_RESOURCE=Y is automatically set in lsf.conf).
When MC_PLUGIN_SCHEDULE_ENHANCE is defined as a valid value, the submission scheduler supports the same remote resources as MC_PLUGIN_REMOTE_RESOURCE: -R "type==type_name" and -R "same[type]".
Remote queue counter collection
The submission cluster receives up-to-date information about each queue in remote clusters. This information is considered during job forwarding decisions.
Queue information is collected by the submission cluster when MC_PLUGIN_SCHEDULE_ENHANCE (on the submission cluster) is set to a valid value. Information is sent by each execution cluster when MC_PLUGIN_UPDATE_INTERVAL (on the execution cluster) is defined, and the submission cluster is collecting queue information.
Some jobs may be forwarded between counter update intervals. The submission scheduler increments locally stored counter information as jobs are forwarded, and reconciles incoming counter updates to account for all jobs.
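The bookkeeping described above can be sketched as follows. This is a minimal illustration, not the LSF implementation: the class QueueView, its methods, and the assumption that each counter update identifies which forwarded jobs it already reflects are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class QueueView:
    """Submission-side view of one remote queue (hypothetical structure)."""
    pending_slots: int = 0
    # Slots of jobs forwarded since the last counter update, keyed by job ID.
    in_flight: Dict[str, int] = field(default_factory=dict)

    def forward(self, job_id: str, slots: int) -> None:
        # Count the job locally right away so the next decision sees it.
        self.in_flight[job_id] = slots

    def apply_update(self, reported_pending: int, known_job_ids: Set[str]) -> None:
        # Assumption: the update tells us which forwarded jobs it already
        # includes. Those are dropped from the local tally; the rest stay.
        self.in_flight = {j: s for j, s in self.in_flight.items()
                          if j not in known_job_ids}
        self.pending_slots = reported_pending

    @property
    def effective_pending(self) -> int:
        # Reported workload plus forwarded jobs the report has not seen yet.
        return self.pending_slots + sum(self.in_flight.values())
```

With this structure, a job forwarded just before an update arrives is never counted twice and never lost, which is the property the reconciliation step needs.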
The following counter information is collected for each queue:
Queue ID
Queue priority
Total slots: The total number of slots (on all hosts) to which jobs from this queue can be dispatched. This includes slots on hosts with the status ok and on hosts with the status closed due to running jobs.
Available slots: The free slots, that is, slots (out of the total slots) that do not currently have a job running.
Running slots: The number of slots currently running jobs from the queue.
Pending slots: The number of slots required by jobs pending on the queue.
Preemptable available slots: The number of slots the queue can access through preemption.
Preemptable slots
Preemptable queue counters (1...n):
Preemptable queue ID
Preemptable queue priority
Preemptable available slots
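As a rough picture of the counter record listed above, the following sketch models it as two small data classes. The class and field names are hypothetical and chosen only to mirror the list; the actual wire format used between clusters is not described here. The sample values are illustrative.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PreemptableQueueCounter:
    """Counters for one queue that can be preempted (names are hypothetical)."""
    queue_id: str
    priority: int
    preemptable_available_slots: int


@dataclass
class QueueCounters:
    """Counters held for one remote queue (names are hypothetical)."""
    queue_id: str
    priority: int
    total_slots: int
    available_slots: int
    running_slots: int
    pending_slots: int
    preemptable_available_slots: int = 0
    preemptable_slots: int = 0
    preemptable_queues: List[PreemptableQueueCounter] = field(default_factory=list)


# Illustrative record for a high priority queue that can preempt one queue.
q = QueueCounters(
    queue_id="HighPriority", priority=60, total_slots=100,
    available_slots=30, running_slots=20, pending_slots=20,
    preemptable_available_slots=50, preemptable_slots=50,
    preemptable_queues=[PreemptableQueueCounter("LowPriority", 20, 50)],
)
```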
After a MultiCluster connection is established, counters take the time set in MC_PLUGIN_UPDATE_INTERVAL to update. Scheduling decisions made before this first interval has passed do not accurately account for remote queue workload.
The parameter MC_PLUGIN_SCHEDULE_ENHANCE was introduced in LSF Version 7 Update 6. All clusters within a MultiCluster configuration must be running a version of LSF containing this parameter to enable the enhanced scheduler.
Remote queue selection
The information considered by the job-forwarding scheduler when accounting for workload and remote resources depends on the setting of MC_PLUGIN_SCHEDULE_ENHANCE in lsb.params. Valid settings for this parameter are:
RESOURCE_ONLY
Jobs are forwarded to the remote queue with the requested resources and the largest (available slots)-(pending slots).
COUNT_PREEMPTABLE
Jobs are forwarded as with RESOURCE_ONLY, but if no appropriate queues have free slots, the best queue is selected based on the largest (preemptable available slots)-(pending slots).
COUNT_PREEMPTABLE with HIGH_QUEUE_PRIORITY
Jobs are forwarded as with COUNT_PREEMPTABLE, but jobs are forwarded to the highest priority remote queue.
COUNT_PREEMPTABLE with PREEMPTABLE_QUEUE_PRIORITY
Jobs are forwarded as with COUNT_PREEMPTABLE, but queue selection is based on which queues can preempt lowest priority queue jobs.
COUNT_PREEMPTABLE with PENDING_WHEN_NOSLOTS
Jobs are forwarded as with COUNT_PREEMPTABLE, but if no queues have free slots even after preemption, submitted jobs pend.
COUNT_PREEMPTABLE with HIGH_QUEUE_PRIORITY and PREEMPTABLE_QUEUE_PRIORITY
If no appropriate queues have free slots, the best queue is selected based on:
queues that can preempt lowest priority queue jobs
the number of preemptable jobs
the pending job workload
COUNT_PREEMPTABLE with HIGH_QUEUE_PRIORITY and PENDING_WHEN_NOSLOTS
If no appropriate queues have free slots, queues with free slots after jobs are preempted are considered.
If no queues have free slots even after preemption, submitted jobs pend.
COUNT_PREEMPTABLE with PREEMPTABLE_QUEUE_PRIORITY and PENDING_WHEN_NOSLOTS
If no appropriate queues have free slots, queues are considered based on the most free slots after preempting lowest priority queue jobs and preemptable jobs.
If no queues have free slots even after preemption, submitted jobs pend.
COUNT_PREEMPTABLE with HIGH_QUEUE_PRIORITY and PREEMPTABLE_QUEUE_PRIORITY and PENDING_WHEN_NOSLOTS
If no appropriate queues have free slots, queues are considered based on the most free slots after preempting lowest priority queue jobs and preemptable jobs.
If no queues have free slots even after preemption, submitted jobs pend.
DYN_CLUSTER_WEIGHTING
Sets a policy to select the best receiving queue for forwarded jobs. LSF considers queue preference, the queue with the least actual available slots, and the pending ratio in selecting the receiving queue.
In the queue filtering phase, LSF performs an additional check against the IMPT_SLOTBKLG limit in lsb.queues. If a receive queue reaches its IMPT_SLOTBKLG limit, that queue is removed from the candidate queue list.
In the candidate queues ordering phase, LSF orders the candidate receive queues based on whether some queues can meet the job's slot requirements. If some queues can meet the job's slot requirements, the queue that has the highest preference is selected; if multiple queues have the same preference, the queue that has the least number of available job slots is selected as the receive queue. If no queues can meet the job's slot requirements, the queue with the lowest pending ratio is selected; if multiple queues have the same pending ratio, the queue with the highest preference is selected as the receive queue.
Note: DYN_CLUSTER_WEIGHTING cannot be combined with any other option specified in MC_PLUGIN_SCHEDULE_ENHANCE.
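The two phases described for DYN_CLUSTER_WEIGHTING can be sketched as a filter followed by an ordering. This is an illustration under stated assumptions, not LSF code: the ReceiveQueue class and pick_receive_queue function are hypothetical, a queue is assumed to "reach" its slot backlog limit when its pending slots equal or exceed it, a queue is assumed to meet the slot requirement when its available slots cover the job, and the pending ratio is assumed to be pending slots divided by total slots.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ReceiveQueue:
    name: str
    preference: int          # higher value is preferred
    available_slots: int
    pending_slots: int
    total_slots: int         # assumed to be greater than zero
    slot_backlog_limit: int  # stands in for the slot backlog limit in lsb.queues


def pick_receive_queue(queues: List[ReceiveQueue], job_slots: int) -> Optional[ReceiveQueue]:
    # Filtering phase: drop queues that have reached their slot backlog limit.
    candidates = [q for q in queues if q.pending_slots < q.slot_backlog_limit]
    if not candidates:
        return None
    # Ordering phase, case 1: some queues can meet the job's slot requirement.
    fitting = [q for q in candidates if q.available_slots >= job_slots]
    if fitting:
        # Highest preference wins; ties go to the fewest available slots.
        return min(fitting, key=lambda q: (-q.preference, q.available_slots))
    # Case 2: no queue fits. Lowest pending ratio wins; ties go to the
    # highest preference.
    return min(candidates,
               key=lambda q: (q.pending_slots / q.total_slots, -q.preference))
```

Choosing the fitting queue with the fewest available slots packs jobs tightly and leaves larger blocks of free slots on other queues for larger jobs.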
The figure shown illustrates the scheduler decision-making process for valid settings of MC_PLUGIN_SCHEDULE_ENHANCE.
When the scheduler looks for maximum values, such as for (available slots)-(pending slots), these values can be negative so long as they are within the pending job limit for a receive-jobs queue set by IMPT_JOBBKLG in lsb.queues.
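The RESOURCE_ONLY and COUNT_PREEMPTABLE comparisons, including the case where the best value is negative, can be sketched as follows. The function select_queue and the dictionary keys are hypothetical, and the IMPT_JOBBKLG check is assumed to be a simple test that the queue's pending job count is still below the limit.

```python
from typing import Dict, Optional

Queue = Dict[str, int]  # plain dictionary of counters, for brevity


def select_queue(queues: Dict[str, Queue], count_preemptable: bool = False) -> Optional[str]:
    # Keep only queues with room left under their pending job limit
    # (assumed here to be a pending_jobs < limit check).
    eligible = {name: q for name, q in queues.items()
                if q["pending_jobs"] < q["job_backlog_limit"]}
    if not eligible:
        return None
    no_free_slots = all(q["available_slots"] <= 0 for q in eligible.values())
    # COUNT_PREEMPTABLE falls back to preemptable slots only when no
    # eligible queue has free slots; otherwise it behaves like RESOURCE_ONLY.
    if count_preemptable and no_free_slots:
        key = "preemptable_available_slots"
    else:
        key = "available_slots"
    # The score may be negative; the largest value still wins.
    return max(eligible,
               key=lambda name: eligible[name][key] - eligible[name]["pending_slots"])
```

For example, two full queues with scores of -20 and -5 under RESOURCE_ONLY still yield a winner (the queue at -5), provided both are under their pending job limit.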

Limitations
Advance reservation
When an advance reservation is active on a remote cluster, slots within the advance reservation are excluded from the number of available slots. Inactive advance reservations do not affect the number of available slots since the slots may still be available for backfill jobs.
Same boolean resource within hostgroups
Hosts in a hostgroup configured without the required same boolean resources can cause ineffectual job-forwarding decisions from the scheduler.
For example, a job may be forwarded to a queue accessing a hostgroup with many slots available, only some of which have the required boolean resource. If there are not enough slots to run the job, it returns to the submission cluster, which may continue forwarding the same job back to the same queue.
Same host type within hostgroups
A remote queue hostgroup satisfies host type requirements when any one of the available hosts is of the host type requested by a job. As with boolean resources, the submission cluster assumes all slots within a hostgroup are of the same host type. Other hostgroup configurations can result in unexpected job-forwarding decisions.
Configure remote resource and preemptable job scheduling
About this task
The submission cluster scheduler considers whether remote resources exist, and only forwards jobs to a queue with free slots or space in the MultiCluster pending job threshold (IMPT_JOBBKLG).
If no appropriate queues with free slots or space for new pending jobs are found, the best queue is selected based on the number of preemptable jobs and the pending job workload.
Procedure
Configure remote resource and free slot scheduling
About this task
The submission cluster scheduler considers whether remote resources exist, and only forwards jobs to a queue with free slots or space in the MultiCluster pending job threshold (IMPT_JOBBKLG). If no appropriate queues with free slots or space for new pending jobs are found, the best queue is selected based on which queues can preempt lower priority jobs.
If no queues have free slots even after preemption, jobs pend on the submission cluster.
Procedure
Configure remote resource, preemptable job, and queue priority free slot scheduling
About this task
All scheduler options are configured.
The submission cluster scheduler considers whether remote resources exist, and only forwards jobs to a queue with free slots or space in the MultiCluster pending job threshold (IMPT_JOBBKLG).
If no appropriate queues with free slots or space for new pending jobs are found, the best queue is selected based on the number of free slots after preempting low priority jobs and preemptable jobs.
If no queues have free slots even after preemption, jobs pend on the submission cluster.
Procedure
Examples
MultiCluster job forwarding is enabled from a send-queue on Cluster1 to the receive-queues HighPriority@Cluster2 and HighPriority@Cluster3. Both remote clusters have lower priority queues for running local jobs, and the high priority queues can preempt jobs from the lower priority queues. The scheduler on Cluster1 has the following information about the remote clusters:
Example 1: MC_PLUGIN_SCHEDULE_ENHANCE=COUNT_PREEMPTABLE:
Cluster2 (100 total slots)
queue=HighPriority, priority=60, running slots=20, pending slots=20
queue=LowPriority, priority=20, running slots=50, pending slots=0
Cluster3 (100 total slots)
queue=HighPriority, priority=70, running slots=30, pending slots=5
queue=LowPriority, priority=20, running slots=60, pending slots=0
Cluster2 has a total of 70 running slots out of 100 total slots, leaving 30 available slots, with 20 pending slots. The number of (available slots)-(pending slots) for Cluster2 is 10. Cluster3 has a total of 90 running slots out of 100 total slots, leaving 10 available slots, with 5 pending slots. The number of (available slots)-(pending slots) for Cluster3 is 5. Thus a job forwarded from Cluster1 is sent to HighPriority@Cluster2.
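The arithmetic in Example 1 can be checked with a few lines of Python. The dictionary layout and the score function are only an illustration; the figures come from the example.

```python
# Running slots are summed over both queues in each cluster.
clusters = {
    "Cluster2": {"total": 100, "running": 20 + 50, "pending": 20},
    "Cluster3": {"total": 100, "running": 30 + 60, "pending": 5},
}


def score(cluster):
    # (available slots) - (pending slots)
    available = cluster["total"] - cluster["running"]
    return available - cluster["pending"]


scores = {name: score(c) for name, c in clusters.items()}
best = max(scores, key=scores.get)
print(scores, best)  # {'Cluster2': 10, 'Cluster3': 5} Cluster2
```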
Example 2: MC_PLUGIN_SCHEDULE_ENHANCE=COUNT_PREEMPTABLE PREEMPTABLE_QUEUE_PRIORITY:
Cluster2 (100 total slots)
queue=HighPriority, priority=50, running slots=20, pending slots=20
queue=LowPriority, priority=30, running slots=80, pending slots=0
Cluster3 (100 total slots)
queue=HighPriority, priority=50, running slots=30, pending slots=15
queue=LowPriority, priority=20, running slots=70, pending slots=0
In both Cluster2 and Cluster3, running jobs occupy all 100 slots. LowPriority@Cluster2 has a queue priority of 30, while LowPriority@Cluster3 has a queue priority of 20. Thus a job forwarded from Cluster1 is sent to HighPriority@Cluster3, where slots can be preempted from the lowest priority queue.
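The decision in Example 2 can be reproduced with a short sketch. The dictionary layout and the key name low_priority (the priority of the queue that can be preempted) are hypothetical; the figures come from the example.

```python
remote = {
    "HighPriority@Cluster2": {"total": 100, "running": 20 + 80, "low_priority": 30},
    "HighPriority@Cluster3": {"total": 100, "running": 30 + 70, "low_priority": 20},
}

# Neither remote cluster has free slots, so preemption decides.
free = {name: q["total"] - q["running"] for name, q in remote.items()}

# PREEMPTABLE_QUEUE_PRIORITY: prefer the queue that preempts the
# lowest priority queue.
target = min(remote, key=lambda name: remote[name]["low_priority"])
print(free, target)
```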
Example 3: MC_PLUGIN_SCHEDULE_ENHANCE=COUNT_PREEMPTABLE HIGH_QUEUE_PRIORITY PREEMPTABLE_QUEUE_PRIORITY:
Cluster2 (100 total slots)
queue=HighPriority, priority=60, running slots=20, pending slots=20
queue=LowPriority, priority=20, running slots=50, pending slots=0
Cluster3 (100 total slots)
queue=HighPriority, priority=70, running slots=30, pending slots=5
queue=LowPriority, priority=20, running slots=60, pending slots=0
Cluster2 has a total of 70 running slots out of 100 total slots, leaving 30 available slots, with 20 pending slots. The number of (available slots)-(pending slots) for Cluster2 is 10. Cluster3 has a total of 90 running slots out of 100 total slots, leaving 10 available slots, with 5 pending slots. The number of (available slots)-(pending slots) for Cluster3 is 5.
Although (available slots)-(pending slots) is higher for Cluster2, Cluster3 contains a higher priority queue. Thus a job forwarded from Cluster1 is sent to HighPriority@Cluster3.
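Example 3 can be expressed as a comparison in which queue priority is examined before the slot difference. Treating the slot difference as a tie-breaker after priority is an assumption made for this sketch; the figures come from the example.

```python
queues = {
    "HighPriority@Cluster2": {"priority": 60, "available": 30, "pending": 20},
    "HighPriority@Cluster3": {"priority": 70, "available": 10, "pending": 5},
}


def rank(name):
    q = queues[name]
    # HIGH_QUEUE_PRIORITY: queue priority is compared first, then
    # (available slots) - (pending slots).
    return (q["priority"], q["available"] - q["pending"])


target = max(queues, key=rank)
print(target)  # HighPriority@Cluster3
```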
Example 4: MC_PLUGIN_SCHEDULE_ENHANCE=COUNT_PREEMPTABLE HIGH_QUEUE_PRIORITY PREEMPTABLE_QUEUE_PRIORITY:
Cluster2 (100 total slots)
queue=HighPriority, priority=60, running slots=20, pending slots=20
queue=LowPriority, priority=20, running slots=80, pending slots=0
Cluster3 (100 total slots)
queue=HighPriority, priority=60, running slots=30, pending slots=5
queue=LowPriority, priority=20, running slots=70, pending slots=0
In both Cluster2 and Cluster3, running jobs occupy all 100 slots. In this case (preemptable available slots)-(pending slots) is considered. For HighPriority@Cluster2 this number is (80-20)=60; for HighPriority@Cluster3 this number is (70-5)=65. Both queues have the same priority, thus a job forwarded from Cluster1 is sent to HighPriority@Cluster3.
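The figures in Example 4 can be verified the same way. Because both high priority queues have priority 60, the preemptable slot difference decides. The dictionary layout is hypothetical; the values come from the example.

```python
queues = {
    "HighPriority@Cluster2": {"priority": 60, "preemptable_available": 80, "pending": 20},
    "HighPriority@Cluster3": {"priority": 60, "preemptable_available": 70, "pending": 5},
}

# (preemptable available slots) - (pending slots) for each queue.
diff = {name: q["preemptable_available"] - q["pending"] for name, q in queues.items()}

# Equal priorities leave the slot difference as the deciding value.
target = max(queues, key=lambda name: (queues[name]["priority"], diff[name]))
print(diff, target)
```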