-R
Runs the job on a host that meets the specified resource requirements.
Categories
resource
Synopsis
bsub -R "res_req" [-R "res_req" ...]
Description
A resource requirement string describes the resources a job needs. LSF uses resource requirements to select hosts for job execution. Resource requirement strings can be simple (applying to the entire job), compound (applying to the specified number of slots), or alternative. A simple resource requirement string is divided into the following sections:
- A selection section (select). The selection section specifies the criteria for selecting execution hosts from the system.
- An ordering section (order). The ordering section indicates how the hosts that meet the selection criteria should be sorted.
- A resource usage section (rusage). The resource usage section specifies the expected resource consumption of the task.
- A job spanning section (span). The job spanning section indicates if a parallel job should span across multiple hosts.
- A same resource section (same). The same section indicates that all processes of a parallel job must run on the same type of host.
- A compute unit resource section (cu). The compute unit section specifies topological requirements for spreading a job over the cluster.
- A CPU and memory affinity resource section (affinity). The affinity section specifies CPU and memory binding requirements for tasks of a job.
select[selection_string] order[order_string] rusage[usage_string [, usage_string][|| usage_string] ...]
span[span_string] same[same_string] cu[cu_string] affinity[affinity_string]
The square brackets must be typed as shown for each section. A blank space must separate each resource requirement section.
bsub -R "rhel6 || rhel7" myjob
bsub -R "swp > 15 && hpux order[ut]" myjob
You can omit the select keyword (and its square brackets), but if you include a select section, it must be the first string in the resource requirement string. If you do not give a section name, the first resource requirement string is treated as a selection string (select[selection_string]).
For example, the following resource requirements are equivalent:
bsub -R "type==any order[ut] same[model] rusage[mem=1]" myjob
bsub -R "select[type==any] order[ut] same[model] rusage[mem=1]" myjob
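The section layout described above can be assembled mechanically. The following is a minimal shell sketch, not part of LSF: the helper name build_res_req is hypothetical, and it only prints the string rather than calling bsub. It emits a section only when a value is supplied, follows the section order of the syntax line, keeps select first, and separates sections with a single blank.

```shell
# Hypothetical helper (not an LSF command): assemble a simple resource
# requirement string from per-section values. Empty arguments are skipped.
# Usage: build_res_req SELECT ORDER RUSAGE SPAN SAME CU AFFINITY
build_res_req() {
    out=""
    for name in select order rusage span same cu affinity; do
        value=${1-}
        [ "$#" -gt 0 ] && shift
        [ -z "$value" ] && continue
        # Each section is name[value]; sections are separated by one blank.
        out="${out:+$out }${name}[${value}]"
    done
    printf '%s\n' "$out"
}

build_res_req "type==any" "ut" "mem=1" "" "model"
# prints: select[type==any] order[ut] rusage[mem=1] same[model]
```

The printed string can then be passed to bsub inside double quotation marks, for example bsub -R "$(build_res_req "type==any" "ut")" myjob.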
If you need to include a hyphen (-) or other non-alphabetic characters within the string, enclose the text in single quotation marks, for example, bsub -R "select[hname!='host06-x12']".
The selection string must conform to the strict resource requirement string syntax described in Administering IBM® Spectrum LSF. The strict resource requirement syntax only applies to the select section. It does not apply to the other resource requirement sections (order, rusage, same, span, or cu). LSF rejects resource requirement strings where an rusage section contains a non-consumable resource.
If RESRSV_LIMIT is set in lsb.queues, the merged application-level and job-level rusage consumable resource requirements must satisfy any limits set by RESRSV_LIMIT, or the job will be rejected.
Any resource for run queue length, such as r15s, r1m or r15m, specified in the resource requirements refers to the normalized run queue length.
By default, memory (mem) and swap (swp) limits in select[] and rusage[] sections are specified in KB. Use the LSF_UNIT_FOR_LIMITS parameter in the lsf.conf file to specify a larger unit for these limits (MB, GB, TB, PB, EB, or ZB).
- KB or K (kilobytes)
- MB or M (megabytes)
- GB or G (gigabytes)
- TB or T (terabytes)
- PB or P (petabytes)
- EB or E (exabytes)
- ZB or Z (zettabytes)
The specified unit is converted to the appropriate value specified by the LSF_UNIT_FOR_LIMITS parameter. The converted limit values round up to a positive integer. For resource requirements, you can specify the unit for mem, swp, and tmp in the select and rusage sections.
By default, the tmp resource is not supported by the LSF_UNIT_FOR_LIMITS parameter. Use the parameter LSF_ENABLE_TMP_UNIT=Y to enable the LSF_UNIT_FOR_LIMITS parameter to support limits on the tmp resource.
If the LSF_ENABLE_TMP_UNIT=Y and LSF_UNIT_FOR_LIMITS=GB parameters are set, the first of the following commands is converted to the second:
bsub -C 500MB -M 1G -S 1TB -F 1M -R "rusage[mem=512MB:swp=1GB:tmp=1TB]" sleep 100
bsub -C 1 -M 1 -S 1024 -F 1024 -R "rusage[mem=0.5:swp=1:tmp=1024]" sleep 100
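The round-up behavior for limit values can be illustrated with a small shell sketch. This is not LSF code: the function names unit_exp and convert_limit are hypothetical, the sketch assumes a factor of 1024 between adjacent units (consistent with 1TB becoming 1024 in the example above), and it handles integer inputs only. It does not model fractional rusage values such as mem=0.5.

```shell
# Hypothetical helpers (not LSF commands).
# unit_exp UNIT: power of 1024 relative to KB for a unit suffix.
unit_exp() {
    case "$1" in
        KB|K) echo 0 ;;
        MB|M) echo 1 ;;
        GB|G) echo 2 ;;
        TB|T) echo 3 ;;
        PB|P) echo 4 ;;
        EB|E) echo 5 ;;
        ZB|Z) echo 6 ;;
        *) return 1 ;;
    esac
}

# convert_limit VALUE_WITH_UNIT TARGET_UNIT
# Prints VALUE converted to TARGET_UNIT, rounded up to an integer.
# Very large values can overflow the shell's integer arithmetic.
convert_limit() {
    num=${1%%[A-Za-z]*}          # leading digits
    unit=${1#"$num"}             # trailing unit suffix
    case "$num" in ""|*[!0-9]*) return 1 ;; esac
    from=$(unit_exp "$unit") || return 1
    to=$(unit_exp "${2-}") || return 1
    kb=$num
    i=0
    while [ "$i" -lt "$from" ]; do kb=$((kb * 1024)); i=$((i + 1)); done
    div=1
    i=0
    while [ "$i" -lt "$to" ]; do div=$((div * 1024)); i=$((i + 1)); done
    echo $(( (kb + div - 1) / div ))   # ceiling division
}

convert_limit 500MB GB   # prints 1
convert_limit 1G GB      # prints 1
convert_limit 1TB GB     # prints 1024
```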
Multiple resource requirement strings
bsub -R "select[swp > 15]" -R "select[hpux] order[r15m]" \
-R "rusage[mem=100]" -R "order[ut]" -R "same[type]" \
-R "rusage[tmp=50:duration=60]" -R "same[model]" myjob
When application-level and queue-level cu sections are also defined, the job-level cu section takes precedence and overwrites the application-level requirement definition, which in turn takes precedence and overwrites the queue-level requirement definitions.
bsub -n 64 -R "cu[excl:type=enclosure:maxcus=4]" myjob
bsub -R "bigmem" myjob
bsub -R "defined(bigmem)" myjob
bsub -R "select[defined(verilog_lic)] rusage[verilog_lic=1]" myjob
bsub -R "rusage[mem=20, license=1:duration=2]" myjob
bsub -R "rusage[mem=20:swp=50:duration=1h, license=1:duration=2]" myjob
bsub -R "rusage[mem=20,swp=50:duration=1h, license=1:duration=2]" myjob
bsub -R "rusage[swp=50:duration=2h:decay=1, license=1:duration=2]" myjob
bsub -R "rusage[mem=20:duration=30:decay=1, lic=1:duration=30]" myjob
bsub -R "rusage[mem=(50 10):duration=(10):decay=(0)]" myjob
In the following example, you are running an application version 1.5 as a resource called app_lic_v15 and the same application version 2.0.1 as a resource called app_lic_v201. The license key for version 2.0.1 is backward compatible with version 1.5, but the license key for version 1.5 does not work with 2.0.1.
- If you can only run your job using one version of the application, submit the job without specifying an alternative resource. To submit a job that only uses app_lic_v201:
bsub -R "rusage[app_lic_v201=1]" myjob
- If you can run your job using either version of the application, try to reserve version 2.0.1 of the application. If it is not available, you can use version 1.5. To submit a job that tries app_lic_v201 before trying app_lic_v15:
bsub -R "rusage[app_lic_v201=1||app_lic_v15=1]" myjob
- If different versions of an application require different system resources, you can specify other resources in your rusage strings. To submit a job that uses 20 MB of memory for app_lic_v201 or 20 MB of memory and 50 MB of swap space for app_lic_v15:
bsub -R "rusage[mem=20:app_lic_v201=1||mem=20:swp=50:app_lic_v15=1]" myjob
bsub -R "rusage[bwidth=1:threshold=5]" myjob
For example, a job is submitted that consumes 1 unit of bandwidth (the resource bwidth), but the job should not be scheduled to run unless the bandwidth on the host is equal to or greater than 5. In this example, bwidth is a decreasing resource and the threshold value is interpreted as a floor. If the resource in question was increasing, then the threshold value would be interpreted as a ceiling.
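The floor and ceiling interpretation of the threshold can be written as a short check. This is only an illustrative sketch, not how LSF implements scheduling: the function name passes_threshold is hypothetical, values are treated as integers, and treating the ceiling as an inclusive bound is an assumption (the text above states inclusiveness only for the floor case).

```shell
# Hypothetical check (not LSF code): does a host's current resource value
# pass the rusage threshold?
# Usage: passes_threshold CURRENT THRESHOLD decreasing|increasing
passes_threshold() {
    case "${3-}" in
        decreasing) [ "$1" -ge "$2" ] ;;   # threshold acts as a floor
        increasing) [ "$1" -le "$2" ] ;;   # threshold acts as a ceiling (inclusive: assumption)
        *) return 2 ;;                     # unknown resource direction
    esac
}

# bwidth is a decreasing resource; the host currently reports 7, threshold is 5.
if passes_threshold 7 5 decreasing; then
    echo "host eligible"
fi
```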
An affinity resource requirement string specifies CPU and memory binding requirements for a resource allocation that is topology aware. An affinity[] resource requirement section controls the allocation and distribution of processor units within a host according to the hardware topology information that LSF collects.
Resource reservation method
Specify the resource reservation method in the resource usage string by using the /job, /host, or /task keyword after the numeric value. The resource reservation method specified in the resource string overrides the global setting that is specified in the ReservationUsage section of the lsb.resources file. You can only specify resource reservation methods for consumable resources. Specify the resource reservation methods as follows:
- value/job
Specifies per-job reservation of the specified resource. This is the equivalent of specifying PER_JOB for the METHOD parameter in the ReservationUsage section of the lsb.resources file.
- value/host
Specifies per-host reservation of the specified resource. This is the equivalent of specifying PER_HOST for the METHOD parameter in the ReservationUsage section of the lsb.resources file.
- value/task
Specifies per-task reservation of the specified resource. This is the equivalent of specifying PER_TASK for the METHOD parameter in the ReservationUsage section of the lsb.resources file.
- Basic syntax:
resource_name=value/method:duration=value:decay=value
For example,
rusage[mem=10/host:duration=10:decay=0]
- Multi-phase memory syntax:
resource_name=(value ...)/method:duration=(value ...):decay=value
For example,
rusage[mem=(50 20)/task:duration=(10 5):decay=0]
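The two syntax forms above differ only in whether the values are parenthesized lists, so one small helper can compose both. The following is a minimal sketch under that assumption: rusage_term is a hypothetical name, not an LSF command, and it only builds the text of a single rusage term.

```shell
# Hypothetical helper (not an LSF command): build one rusage term.
# Usage: rusage_term NAME VALUE [METHOD [DURATION [DECAY]]]
# METHOD is one of job, host, task, or empty to use the configured default.
rusage_term() {
    term="$1=$2"
    case "${3-}" in
        "") ;;
        job|host|task) term="${term}/$3" ;;
        *) echo "unknown reservation method: $3" >&2; return 1 ;;
    esac
    [ -n "${4-}" ] && term="${term}:duration=$4"
    [ -n "${5-}" ] && term="${term}:decay=$5"
    printf '%s\n' "$term"
}

echo "rusage[$(rusage_term mem 10 host 10 0)]"
# prints: rusage[mem=10/host:duration=10:decay=0]
echo "rusage[$(rusage_term mem '(50 20)' task '(10 5)' 0)]"
# prints: rusage[mem=(50 20)/task:duration=(10 5):decay=0]
```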
Compound resource requirements
In some cases different resource requirements may apply to different parts of a parallel job. The first execution host, for example, may require more memory or a faster processor for optimal job scheduling. Compound resource requirements allow you to specify different requirements for some slots within a job in the queue-level, application-level, or job-level resource requirement string.
Compound resource requirement strings can be set by the application-level or queue-level RES_REQ parameter, or used with the bsub -R option when a job is submitted. The bmod -R option accepts compound resource requirement strings for pending jobs but not running jobs.
Special rules take effect when compound resource requirements are merged with resource requirements defined at more than one level. If a compound resource requirement is used at any level (job, application, or queue) the compound multi-level resource requirement combinations apply.
Compound resource requirement strings are made up of one or more simple resource requirement strings as follows:
num1*{simple_string1} + num2*{simple_string2} + ...
where numx is the number of slots affected and simple_stringx is a simple resource requirement string.
The same resource requirement can be used within each component expression (simple resource requirement). For example, for static string resource res1 and res2, a resource requirement such as the following is permitted:
"4*{select[io] same[res1]} + 4*{select[compute] same[res1]}"
With this resource requirement, there are two simple subexpressions, R1 and R2. For each of these subexpressions, all slots must come from hosts with equal values of res1. However, R1 may occupy hosts of a different value than those occupied by R2.
"{4*{select[io]} + 4*{select[compute]}} same[res1]"
This syntax allows users to express that both subexpressions must reside on hosts that have a common value for res1.
In general, there may be more than two subexpressions in a compound resource requirement. The global same will apply to all of them.
"{4*{same[res1]} + 4*{same[res1]}} same[res2]"
In addition, a compound resource requirement expression with a global same may be part of a larger alternative resource requirement string.
- Submitting a job: bsub -R "res_req_string" <other_bsub_options> a.out
- Configuring application profile (lsb.applications file): RES_REQ = "res_req_string"
- Queue configuration (lsb.queues file): RES_REQ = "res_req_string"
Syntax:
- A single compound resource requirement:
"{compound_res_req} same[same_string]"
- A compound resource requirement within an alternative resource requirement:
"{{compound_res_req} same[same_string]} || {alt_res_req}"
- A compound resource requirement within an alternative resource requirement with delay:
"{alt_res_req} || {{compound_res_req} same[same_string]}@delay"
The delay option is a positive integer.
- Compound resource requirements cannot contain the || operator. Compound resource requirements cannot be defined (included) in any multiple -R options.
- Compound resource requirements cannot contain the compute unit (cu) keywords balance or excl, but work normally with other cu keywords (including pref, type, maxcus, and usablecuslots).
- Resizable jobs can have compound resource requirements, but only the portion of the job represented by the last term of the compound resource requirement is eligible for automatic resizing. When you use the bresize release command to release slots, you can release only slots represented by the last term of the compound resource requirement. To release slots in earlier terms, run the bresize release command repeatedly to release slots in subsequent last terms.
- Compound resource requirements cannot be specified in the definition of a guaranteed resource pool.
- Resource allocation for parallel jobs using compound resources is done for each compound resource term in the order listed instead of considering all possible combinations. A host rejected for not satisfying one resource requirement term will not be reconsidered for subsequent resource requirement terms.
- (final res_req number of slots) = (total number of job slots)-(num1+num2+ ...)
- num_slots=(num1+num2+num3+ ...)
For jobs with the minimum and maximum number of slots specified with the bsub -n min, max command, the number of slots in the compound resource requirement must be compatible with the minimum and maximum specified.
You can specify the number of slots or processors through the resource requirement specification. For example, you can specify a job that requests 10 slots or processors: 1 on a host that has more than 5000 MB of memory, and an additional 9 on hosts that have more than 1000 MB of memory:
bsub -R "1*{mem>5000} + 9*{mem>1000}" a.out
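The num*{simple_string} form and the slot total num1+num2+... can be generated together. The sketch below is illustrative only: compound_res_req is a hypothetical helper, not part of LSF, and it stores its results in the shell variables RES_REQ and NUM_SLOTS (names chosen for this example).

```shell
# Hypothetical helper (not an LSF command).
# Usage: compound_res_req NUM1 SIMPLE1 [NUM2 SIMPLE2 ...]
# Sets RES_REQ to "num1*{simple1} + num2*{simple2} + ..." and
# NUM_SLOTS to num1+num2+...
compound_res_req() {
    RES_REQ=""
    NUM_SLOTS=0
    while [ "$#" -ge 2 ]; do
        RES_REQ="${RES_REQ:+$RES_REQ + }$1*{$2}"
        NUM_SLOTS=$((NUM_SLOTS + $1))
        shift 2
    done
}

compound_res_req 1 'mem>5000' 9 'mem>1000'
echo "$RES_REQ"     # prints: 1*{mem>5000} + 9*{mem>1000}
echo "$NUM_SLOTS"   # prints: 10
```

The total in NUM_SLOTS is the value that must be compatible with any minimum and maximum given with bsub -n.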
- In the rusage[] section, use the ngpus_physical resource to request the number of physical GPUs, together with the gmodel option to specify the GPU model, the gmem option to specify the amount of reserved GPU memory, the glink option to request only GPUs with special connections (xGMI connections for AMD GPUs or NVLink connections for Nvidia GPUs), and the mig option to specify Nvidia Multi-Instance GPU (MIG) device requirements.
- In the span[] section, use the gtile keyword to specify the number of GPUs requested on each socket.
bsub -R "1*{span[gtile=!] rusage[ngpus_physical=2:gmem=1G]} + 4*{span[ptile=1] rusage[ngpus_physical=1:gmem=10G]}" ./app
Alternative resource requirements
In some circumstances more than one set of resource requirements may be acceptable for a job to be able to run. LSF provides the ability to specify alternative resource requirements.
An alternative resource requirement consists of two or more individual simple or compound resource requirements. Each separate resource requirement describes an alternative. When a job is submitted with alternative resource requirements, the alternative resource picked must satisfy the mandatory first execution host. If none of the alternatives can satisfy the mandatory first execution host, the job will PEND.
Alternative resource requirement strings can be specified at the application-level or queue-level RES_REQ parameter, or used with bsub -R when a job is submitted. bmod -R also accepts alternative resource requirement strings for pending jobs.
The rules for merging job, application, and queue alternative resource requirements are the same as for compound resource requirements.
- Multiple bsub -R commands
- Taskstarter jobs, including those with the tssub command
- Hosts from HPC integrations that use toplib, including cpuset and Blue Gene hosts.
- Compute unit (cu) sections specified with balance or excl keywords.
If a job with alternative resource requirements specified is re-queued, it will have all alternative resource requirements considered during scheduling. If a @D delay time is specified, it is interpreted as waiting, starting from the original submission time. For a restart job, @D delay time starts from the restart job submission time.
An alternative resource requirement consists of two or more individual resource requirements. Each separate resource requirement describes an alternative. If the resources cannot be found that satisfy the first resource requirement, then the next resource requirement is tried, and so on until the requirement is satisfied.
Alternative resource requirements are defined in terms of a compound resource requirement, or an atomic resource requirement:
bsub -R "{C1 | R1 } || {C2 | R2 }@D2 || ... || {Cn | Rn }@Dn"
- The OR operator (||) separates one alternative resource from the next.
- The C option is a compound resource requirement.
- The R option is a resource requirement which is the same as the current LSF resource requirement, except when there is:
- No rusage OR (||).
- No compute unit requirement cu[...]
- The D option is a positive integer:
- @D is optional. When it is specified, LSF does not evaluate the alternative resource requirement until D minutes after submission time. Requeued jobs still use the submission time instead of the requeue time. There is no D1 because the first alternative is always evaluated immediately.
- D2 <= D3 <= ... <= Dn
- Not specifying @D means that the alternative will be evaluated without delay if the previous alternative could not be used to obtain a job's allocation.
For example, you may have a sequential job, but you want alternative resource requirements (that is, if LSF fails to match your resource, try another one).
bsub -R "{ select[type==any] order[ut] same[model] rusage[mem=1] } ||
{ select[type==any] order[ls] same[ostype] rusage[mem=5] }" myjob
You can also add a delay before trying the second alternative:
bsub -R "{ select[type==any] order[ut] same[model] rusage[mem=1] } ||
{ select[type==any] order[ls] same[ostype] rusage[mem=5] }@4" myjob
You can also have more than 2 alternatives:
bsub -R "{select[type==any] order[ut] same[model] rusage[mem=1] } ||
{ select[type==any] order[ut] same[model] rusage[mem=1] } ||
{ select[type==any] order[ut] same[model] rusage[mem=1] }@3 ||
{ select[type==any] order[ut] same[model] rusage[mem=1] }@6" myjob
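The delay rules listed earlier (each D is a positive integer, the first alternative has no delay, and D2 <= D3 <= ... <= Dn) can be checked while the string is assembled. The following is a minimal sketch, not an LSF tool: alt_res_req is a hypothetical helper that prints the alternative resource requirement string or fails when a delay breaks those rules.

```shell
# Hypothetical helper (not an LSF command).
# Usage: alt_res_req FIRST_REQ [REQ DELAY]...
# Pass an empty DELAY ("") to omit @D for that alternative.
alt_res_req() {
    out="{$1}"                   # first alternative never carries a delay
    shift
    prev=0
    while [ "$#" -ge 2 ]; do
        req=$1
        delay=$2
        shift 2
        out="${out} || {${req}}"
        if [ -n "$delay" ]; then
            case "$delay" in
                *[!0-9]*) echo "delay is not an integer: $delay" >&2; return 1 ;;
            esac
            if [ "$delay" -lt 1 ] || [ "$delay" -lt "$prev" ]; then
                echo "delays must be positive and non-decreasing: $delay" >&2
                return 1
            fi
            prev=$delay
            out="${out}@${delay}"
        fi
    done
    printf '%s\n' "$out"
}

alt_res_req 'select[type==any] order[ut]' 'select[type==any] order[ls]' 4
# prints: {select[type==any] order[ut]} || {select[type==any] order[ls]}@4
```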
Some parallel jobs might need compound resource requirements. You can specify alternatives for parallel jobs the same way. That is, you can have several alternative sections, each with brace brackets ({ }) around them, separated by ||:
bsub -n 2 -R "{ 1*{ select[type==any] order[ut] same[model] rusage[mem=1] } +
1*{ select[type==any] order[ut] same[model] rusage[mem=1] } } ||
{ 1*{ select[type==any] order[ut] same[model] rusage[mem=1] } +
1*{ select[type==any] order[ut] same[model] rusage[mem=1] } }@6" myjob
Alternatively, the compound resource requirement section can have both slots requiring the same resource:
bsub -n 2 -R "{ 1*{ select[type==any] order[ut] same[model] rusage[mem=1]}
+1*{ select[type==any] order[ut] same[model] rusage[mem=1] } } ||
{ 2*{ select[type==any] order[ut] same[model] rusage[mem=1] } }@10" myjob
An alternative resource requirement can be used to indicate how many tasks the job requires. For example, a job may request 4 tasks on Solaris host types, or 8 tasks on Linux86 host types. If the -n option is provided at the job level, then the values specified must be consistent with the values implied by the resource requirement string:
bsub -R " {8*{type==LINUX86}} || {4*{type==SOLARIS}}" a.out
If they conflict, the job submission is rejected:
bsub -n 3 -R " {8*{type==LINUX86}} || {4*{type==SOLARIS}}" a.out
- In the rusage[] section, use the ngpus_physical resource to request the number of physical GPUs, together with the gmodel option to specify the GPU model, the gmem option to specify the amount of reserved GPU memory, the glink option to request only GPUs with special connections (xGMI connections for AMD GPUs or NVLink connections for Nvidia GPUs), and the mig option to specify Nvidia Multi-Instance GPU (MIG) device requirements.
- In the span[] section, use the gtile keyword to specify the number of GPUs requested on each socket.
bsub -R "{4*{span[ptile=1] rusage[ngpus_physical=1:gmodel=K80]}} || {1*{span[gtile=1] rusage[ngpus_physical=2:gmodel=P100]}}" ./app