bhosts
Displays hosts and their static and dynamic resources.
Synopsis
bhosts [-w | -l | -e | -o "[field_name | all][:[-][output_width]] ... [delimiter='character']" [-json]] [-a] [-attr] [-alloc] [-cname] [-x] [-X] [-R "res_req"] [host_name ... | host_group ... | compute_unit ...]
Description
By default, returns the following information about all hosts: Host name, host status, job state statistics, and job slot limits.
The bhosts command displays output for condensed host groups and compute units. These host groups and compute units are defined by CONDENSE in the HostGroup and ComputeUnit sections of the lsb.hosts file. Condensed host groups and compute units are displayed as a single entry with the name as defined by GROUP_NAME or NAME in the lsb.hosts file.
When EGO adds more resources to a running resizable job, the bhosts command displays the added resources. When EGO removes resources from a running resizable job, the bhosts command displays the updated resources.
The -l and -X options display noncondensed output.
The -s option displays information about the numeric shared resources and their associated hosts.
With LSF multicluster capability, displays the information about hosts available to the local cluster. Use the -e option to see information about exported hosts.
Options
- -a
- Shows information about all hosts, including hosts relinquished to a resource provider (such as EGO or OpenStack) through LSF resource connector. Default output includes only standard LSF hosts.
- -aff
- Displays host topology information for CPU and memory affinity scheduling.
- -alloc
- Shows counters for slots in RUN, SSUSP, USUSP, and RSV. The slot allocation is different depending on whether the job is an exclusive job or not.
- -attr
- Displays information on attributes that are attached to the host. These attributes were created with the battr create command, or automatically created according to attribute requests.
- -cname
- In LSF Advanced Edition, includes the cluster name for execution cluster hosts in output. The output that is displayed is sorted by cluster and then by host name.
Note: This command option is deprecated and might be removed in a future version of LSF.
- -e
- LSF multicluster capability only. Displays information about resources that were exported to another cluster.
- -gpu [-l]
- Displays GPU information on the host.
The -l option shows more detailed information about the GPUs.
- -json
-
Displays the customized output in JSON format.
When specified, bhosts -o displays the customized output in the JSON format.
This option applies only to customized output from the bhosts -o command. It has no effect when bhosts is run without the -o option and neither the LSB_BHOSTS_FORMAT environment variable nor the LSB_BHOSTS_FORMAT parameter is defined.
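As a minimal sketch of consuming this output from a script, the following Python helper builds the bhosts -o ... -json command line and decodes whatever JSON the command prints. The function names and the injectable runner argument are illustrative assumptions, not part of LSF, and the sketch deliberately assumes nothing about the structure of the JSON document.

```python
import json
import subprocess
from typing import Callable, List, Sequence


def build_json_command(fields: Sequence[str]) -> List[str]:
    """Return the argv for: bhosts -o "<fields>" -json."""
    if not fields:
        raise ValueError("at least one field name is required")
    # The -json option only affects customized (-o) output, so -o is always passed.
    return ["bhosts", "-o", " ".join(fields), "-json"]


def _run(argv: List[str]) -> str:
    # Default runner: execute the command and return its standard output.
    return subprocess.run(argv, check=True, capture_output=True, text=True).stdout


def bhosts_json(fields: Sequence[str],
                runner: Callable[[List[str]], str] = _run):
    """Run bhosts with customized JSON output and return the decoded document."""
    return json.loads(runner(build_json_command(fields)))
```

Passing a different runner (for example, one that reads a saved file) makes the helper usable on machines where bhosts is not installed.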
- -l
- Displays host information in a long multi-line format. In addition to the default fields, displays information about the CPU factor, the current load, and the load thresholds. Also displays the value of slots for each host. The slots value is the greatest number of unused slots on a host.
The bhosts -l option also displays information about the dispatch windows.
When PowerPolicy is enabled in the lsb.threshold file, the bhosts -l command also displays host power states. Final power states are on or suspend. Intermediate power states are restarting, resuming, and suspending. The final power state under administrator control is closed_Power. The final power state under policy control is ok_Power. If the host status becomes unknown (power operation due to failure), the power state is shown as a dash (-).
If you specified an administrator comment with the -C option of the host control commands (badmin hclose -C or badmin hopen -C), the -l option displays the comment text. If there are any lock IDs that are attached to a closed host, these lock IDs are displayed with any attached comments in a tabular format.
If enhanced energy accounting using Elasticsearch is enabled (with LSF_ENABLE_BEAT_SERVICE in lsf.conf), the output shows the Current Power usage in watts, and the total Energy Consumed in joules and kWh.
If attributes are attached to the host, the -l option shows detailed information on these attributes.
- -noheader
-
Removes the column headings from the output.
When specified, bhosts displays the values of the fields without displaying the names of the fields. This option is useful for script parsing, when column headings are not necessary.
This option applies to output for the bhosts command with no options, and to output for all bhosts options with output that uses column headings, including the following options: -a, -alloc, -cname, -e, -o, -R, -s, -w, -x, -X.
This option does not apply to output for bhosts options that do not use column headings, including the following options: -aff, -json, -l.
- -o
-
Sets the customized output format.
- Specify which bhosts fields (or aliases instead of the full field names), in which order, and with what width to display.
- Specify only the bhosts field name or alias to set its output to unlimited width and left justification.
- (Available starting in Fix Pack 14) Specify all to display all fields. Specify the colon (:) with an output width that applies to all fields.
- Specify the colon (:) without a width to set the output width to the recommended width for that field.
- Specify the colon (:) with a width to set the maximum number of characters to display for the field. When its value exceeds this width, bhosts truncates the ending characters.
- Specify a hyphen (-) to set right justification when bhosts displays the output for the specific field. If not specified, the default is to set left justification when bhosts displays the output for a field.
- Specify a second colon (:) with a unit to specify a unit prefix for the output for the following fields: mem, max_mem, avg_mem, memlimit, swap, swaplimit, corelimit, stacklimit, and hrusage (for hrusage, the unit prefix is for mem and swap resources only).
This unit is KB (or K) for kilobytes, MB (or M) for megabytes, GB (or G) for gigabytes, TB (or T) for terabytes, PB (or P) for petabytes, EB (or E) for exabytes, ZB (or Z) for zettabytes, or S to automatically adjust the value to a suitable unit prefix and remove the "bytes" suffix from the unit. The default is to automatically adjust the value to a suitable unit prefix, but keep the "bytes" suffix in the unit.
The display value keeps two decimals but rounds up the third decimal. For example, if the unit prefix is set to G, 10M displays as 0.01G.
The unit prefix specified here overrides the value of the LSB_UNIT_FOR_JOBS_DISPLAY environment variable, which in turn overrides the value of the LSB_UNIT_FOR_JOBS_DISPLAY parameter in the lsf.conf file.
- Use delimiter= to set the delimiting character to display between different headers and fields. This delimiter must be a single character. By default, the delimiter is a space.
Output customization applies only to the output for certain bhosts options:
- LSB_BHOSTS_FORMAT and bhosts -o both apply to output for the bhosts command with no options, and for bhosts options whose output filters information, including the following options: -a, -alloc, -cname, -R, -x, -X.
- LSB_BHOSTS_FORMAT and bhosts -o do not apply to output for bhosts options that use a modified format, including the following options: -aff, -e, -l, -s, -w.
The bhosts -o option overrides the LSB_BHOSTS_FORMAT environment variable, which overrides the LSB_BHOSTS_FORMAT setting in lsf.conf.
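The precedence described above can be sketched as a small lookup. This is an illustration only: the function name is hypothetical, and lsf_conf is assumed to be a mapping of parameters that has already been read from the lsf.conf file by some other means.

```python
from typing import Mapping, Optional


def effective_format(cli_format: Optional[str],
                     env: Mapping[str, str],
                     lsf_conf: Mapping[str, str]) -> Optional[str]:
    """Pick the output format using the documented precedence."""
    if cli_format is not None:
        return cli_format                      # bhosts -o wins
    if "LSB_BHOSTS_FORMAT" in env:
        return env["LSB_BHOSTS_FORMAT"]        # then the environment variable
    return lsf_conf.get("LSB_BHOSTS_FORMAT")   # finally the lsf.conf setting
```

For example, effective_format(None, os.environ, parsed_conf) would return the environment value when it is set, and None when no customization applies at all.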
The following are the field names used to specify the bhosts fields to display, with valid widths and any supported aliases (which you can use instead of the field names). Units of measurement for the fields are automatically chosen units of bytes (such as gigabytes, megabytes, and so on), depending on the field name.
Table 1. Output fields for bhosts

| Field name | Width | Alias |
| --- | --- | --- |
| host_name | 20 | hname |
| status | 15 | stat |
| cpuf | 10 | |
| jl_u | 8 | jlu |
| max | 8 | |
| njobs | 8 | |
| run | 8 | |
| ssusp | 8 | |
| ususp | 8 | |
| rsv | 8 | |
| dispatch_window | 50 | dispwin |
| ngpus | 8 | ng |
| ngpus_alloc | 8 | ngu |
| ngpus_excl_alloc | 8 | ngx |
| ngpus_shared_alloc | 8 | ngs |
| ngpus_shared_jexcl_alloc | 8 | ngsjx |
| ngpus_excl_avail | 8 | ngfx |
| ngpus_shared_avail | 8 | ngfs |
| attribute | 50 | attr |
| mig_alloc | 5 | |
| comments | 128 | |
| available_mem (Available starting in Fix Pack 14) | 15 | |
| reserved_mem (Available starting in Fix Pack 14) | 15 | |
| total_mem (Available starting in Fix Pack 14) | 15 | |

Note: If combined with the bhosts -json option, the comments field displays full details of host closure events such as event time, administrator ID, lock ID, and comments, as shown in the bhosts -l option.

Field names and aliases are not case-sensitive. Valid values for the output width are any positive integer 1 - 4096.
For example,
bhosts -o "host_name cpuf: jl_u:- max:-6 delimiter='^'"
This command displays the following fields:
- HOST_NAME with unlimited width and left justified.
- CPUF with a maximum width of ten characters (which is the recommended width) and left justified.
- JL_U with a maximum width of eight characters (which is the recommended width) and right justified.
- MAX with a maximum width of six characters and right justified.
- The ^ character is displayed between different headers and fields.
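The format-string rules above can be captured in a short helper. The following Python sketch is an illustration under stated assumptions: the Field class and build_format function are hypothetical names, and the helper only assembles the string that is passed to bhosts -o; it does not check field names against the table of valid fields.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class Field:
    name: str
    colon: bool = False          # True adds ":" (recommended width if no width is given)
    right: bool = False          # True adds "-" for right justification
    width: Optional[int] = None  # maximum width, 1 - 4096

    def render(self) -> str:
        if self.width is not None and not 1 <= self.width <= 4096:
            raise ValueError("width must be between 1 and 4096")
        if not (self.colon or self.right or self.width is not None):
            return self.name     # bare name: unlimited width, left justified
        text = self.name + ":"
        if self.right:
            text += "-"
        if self.width is not None:
            text += str(self.width)
        return text


def build_format(fields: Iterable[Field], delimiter: Optional[str] = None) -> str:
    """Assemble the argument for bhosts -o from field specifications."""
    parts = [f.render() for f in fields]
    if delimiter is not None:
        if len(delimiter) != 1:
            raise ValueError("delimiter must be a single character")
        parts.append(f"delimiter='{delimiter}'")
    return " ".join(parts)
```

Calling build_format([Field("host_name"), Field("cpuf", colon=True), Field("jl_u", right=True), Field("max", right=True, width=6)], "^") reproduces the format string used in the example above.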
- -w
- Displays host information in wide format. Fields are displayed without truncation.
For condensed host groups and compute units, the -w option displays the overall status and the number of hosts with the ok, unavail, unreach, and busy status in the following format:
host_group_status num_ok/num_unavail/num_unreach/num_busy
Where
- host_group_status is the overall status of the host group or compute unit. If a single host in the group or unit is ok, the overall status is also ok.
- num_ok, num_unavail, num_unreach, and num_busy are the number of hosts that are ok, unavail, unreach, and busy.
For example, if five hosts are ok, two unavail, one unreach, and three busy in a condensed host group hg1, the following status is displayed:
hg1 ok 5/2/1/3
If any hosts in the host group or compute unit are closed, the status for the host group is displayed as closed, with no status for the other states:
hg1 closed
The status of LSF resource connector hosts that are closed because of a resource provider reclaim request is closed_RC.
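A script that reads bhosts -w output for condensed groups needs to split the status text into its overall status and per-state counts. The following Python sketch shows one way to do that; the function name is hypothetical, and the input is assumed to be only the status portion of a line (for example, "ok 5/2/1/3" or "closed").

```python
from typing import Dict, Optional, Tuple

STATES = ("ok", "unavail", "unreach", "busy")


def parse_condensed_status(text: str) -> Tuple[str, Optional[Dict[str, int]]]:
    """Split 'ok 5/2/1/3' into the overall status and per-state host counts."""
    parts = text.split()
    if not parts:
        raise ValueError("empty status text")
    status = parts[0]
    if len(parts) == 1:
        return status, None      # for example 'closed': no counts are shown
    counts = [int(n) for n in parts[1].split("/")]
    if len(counts) != len(STATES):
        raise ValueError(f"expected four counts, got {parts[1]!r}")
    return status, dict(zip(STATES, counts))
```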
- -rc [-l]
- Displays the current status of hosts requested from and provisioned by LSF resource connector, as well as a brief history of each provisioned host.
Note: Requires LSF Fix Pack 4.
The -rc and -rconly options make use of the third-party mosquitto message queue application. LSF resource connector publishes additional provider host information that is displayed by these bhosts options. The mosquitto binary file is included as part of the LSF distribution.
To use the -rc option, LSF resource connector must be enabled with the LSB_RC_EXTERNAL_HOST_FLAG parameter in the lsf.conf file.
If you use the MQTT message broker that is distributed with LSF, you must configure the LSF_MQ_BROKER_HOSTS and MQTT_BROKER_HOST parameters in the lsf.conf file. The LSF_MQ_BROKER_HOSTS and MQTT_BROKER_HOST parameters must specify the same host name. The LSF_MQ_BROKER_HOSTS parameter enables LIM to start the mosquitto daemon.
If you use an existing MQTT message broker, you must configure the MQTT_BROKER_HOST parameter. You can optionally specify an MQTT broker port with the MQTT_BROKER_PORT parameter.
Use the ps command to check that the MQTT message broker daemon (mosquitto) is installed and running: ps -ef | grep mosquitto.
Configure the EBROKERD_HOST_CLEAN_DELAY to specify a delay, in minutes, after which the ebrokerd daemon removes information about relinquished or reclaimed hosts. This parameter allows the bhosts -rc and bhosts -rconly commands to get LSF resource connector provider host information for some time after they are unprovisioned.
The following additional columns are shown in the host list:
- RC_STATUS
- LSF resource connector status.
- Preprovision_Started
- Resource connector started the pre-provisioning script for the new host.
- Preprovision_Failed
- The pre-provisioning script returned an error.
- Allocated
- The host is ready to join the LSF cluster.
- Reclaim_Received
- A host reclaim request was received from the provider (for example, for an AWS spot instance).
- RelinquishReq_Sent
- LSF started to relinquish the host.
- Relinquished
- LSF finished relinquishing the host.
- Deallocated_Sent
- LSF sent a return request to the provider.
- Postprovision_Started
- LSF started the post-provisioning script after the host was returned.
- Done
- The host life cycle is complete.
- PROV_STATUS
- Provider status. This status depends on the provider. For example, AWS has pending, running, shutting down, terminated, and others. Check the documentation for the provider to understand the status that is displayed.
- UPDATED_AT
- Time stamp of the latest status change.
- INSTANCE_ID
- ID of the created machine instance. This provides a unique ID for each cloud instance of the LSF resource connector host.
For hosts provisioned by resource connector, these columns show appropriate status values and a time stamp. A dash (-) is displayed in these columns for other hosts in the cluster.
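As a rough sketch of how a script might tell provisioned hosts from other hosts, the following Python code maps one row of bhosts -rc output to column names and treats a dash as "no value". The column order is taken from the bhosts -rc header shown in this section; the function names are hypothetical, and the whitespace split is a simplifying assumption that breaks for multi-word provider statuses such as "shutting down".

```python
from typing import Dict, List, Optional

RC_COLUMNS: List[str] = [
    "HOST_NAME", "STATUS", "JL/U", "MAX", "NJOBS", "RUN", "SSUSP", "USUSP",
    "RSV", "RC_STATUS", "PROV_STATUS", "UPDATED_AT", "INSTANCE_ID",
]


def parse_rc_row(line: str) -> Dict[str, Optional[str]]:
    """Map one whitespace-separated row to column names; '-' becomes None."""
    values = line.split()
    # Rows may be shorter than the header; missing trailing columns become None.
    values += ["-"] * (len(RC_COLUMNS) - len(values))
    return {col: (None if val == "-" else val)
            for col, val in zip(RC_COLUMNS, values)}


def provisioned_by_rc(row: Dict[str, Optional[str]]) -> bool:
    """Hosts that resource connector provisioned carry an RC_STATUS value."""
    return row["RC_STATUS"] is not None
```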
For example,
bhosts -rc
HOST_NAME          STATUS JL/U MAX NJOBS RUN SSUSP USUSP RSV RC_STATUS PROV_STATUS UPDATED_AT             INSTANCE_ID
ec2-35-160-173-192 ok     -    1   0     0   0     0     0   Allocated running     2017-04-07T12:28:46CDT i-0244f608fe7b5e014
lsf1.aws.          closed -    1   0     0   0     0     0   -         -           -
The -l option shows more detailed information about provisioned hosts:
bhosts -rc -l
HOST  ec2-35-160-173-192.us-west-2.compute.amazonaws.com
STATUS CPUF  JL/U MAX NJOBS RUN SSUSP USUSP RSV RC_STATUS PROV_STATUS UPDATED_AT             INSTANCE_ID         DISPATCH_WINDOW
ok     60.00 -    1   0     0   0     0     0   Allocated running     2017-04-07T12:28:46CDT i-0244f608fe7b5e014 -
CURRENT LOAD USED FOR SCHEDULING:
          r15s r1m r15m ut pg  io ls it tmp   swp mem  slots
Total     1.0  0.0 0.0  1% 0.0 33 0  3  5504M 0M  385M 1
Reserved  0.0  0.0 0.0  0% 0.0 0  0  0  0M    0M  0M   -
- -rconly
- Shows the status of all hosts provisioned by LSF resource connector, no matter if they have joined the cluster or not.
Note: Requires LSF Fix Pack 4.
To use the -rconly option, LSF resource connector must be enabled with the LSB_RC_EXTERNAL_HOST_FLAG parameter in the lsf.conf file. If you use the MQTT message broker that is distributed with LSF, you must configure the LSF_MQ_BROKER_HOSTS and MQTT_BROKER_HOST parameters in the lsf.conf file. The LSF_MQ_BROKER_HOSTS and MQTT_BROKER_HOST parameters must specify the same host name. The LSF_MQ_BROKER_HOSTS parameter enables LIM to start the mosquitto daemon.
If you use an existing MQTT message broker, you must configure the MQTT_BROKER_HOST parameter. You can optionally specify an MQTT broker port with the MQTT_BROKER_PORT parameter.
Use the ps command to check that the MQTT message broker daemon (mosquitto) is installed and running: ps -ef | grep mosquitto.
- -x
- Displays hosts whose job exit rate is high and exceeds the threshold that is configured by the EXIT_RATE parameter in the lsb.hosts file for longer than the value specified by the JOB_EXIT_RATE_DURATION parameter that is configured in the lsb.params file. By default, these hosts are closed the next time LSF checks host exceptions and runs eadmin.
Use with the -l option to show detailed information about host exceptions.
If no hosts exceed the job exit rate, the bhosts -x command has the following output:
There is no exceptional host found
- -X
- Displays uncondensed output for host groups and compute units.
- -R "res_req"
- Displays only information about hosts that satisfy the resource requirement expression. Note: Do not specify resource requirements by using the rusage keyword to select hosts because the criteria are ignored by LSF.
LSF supports ordering of resource requirements on all load indices, including external load indices, either static or dynamic.
- -s |-sl [resource_name ...] [-loc]
- Displays information about the specified resources. The bhosts -s option shows only consumable resources. This option does not display information about GPU resources (that is, this option does not display gpu_<num>n resources). Use the -gpu option to view GPU information on the host.
Returns the resource (such as fpga), the total and reserved amounts of these resources (such as 3), and the resource locations (by hostname), if you use the -s option.
As of Fix Pack 14, specifying the -sl option returns the same resource information as the -s option, with the addition of the following information:
- Specific name for each resource (for example, if there are three types of the fpga resources, you can assign three names: card1, card2, and card3). The names describe the specific resource and are assigned to the job upon dispatch.
- Which of these names has been assigned to the resource (for example, card1).
Note that if the LOCATION parameter in the lsf.cluster.clustername file is set to all to indicate that the resource is shared by all hosts in the cluster, the LOCATION field in the bhosts -s command output also displays ALL. To display the individual names of all the hosts in the cluster in the bhosts -s command output, specify the -loc option together with the -s option.
When LSF License Scheduler is configured to work with LSF Advanced Edition submission and execution clusters, LSF Advanced Edition considers LSF License Scheduler cluster mode and project mode features to be shared features. When you run the bhosts -s command from a host in the submission cluster, it shows no TOTAL and RESERVED tokens available for the local hosts in the submission cluster, but shows the number of available tokens for TOTAL and the number of used tokens for RESERVED in the execution clusters.
- host_name ... | host_group ... | compute_unit ...
- Displays only information about the specified hosts. Do not use quotation marks to specify multiple hosts.
For host groups and compute units, the names of the member hosts are displayed instead of the name of the host group or compute unit. Do not use quotation marks to specify multiple host groups or compute units.
- cluster_name
- LSF multicluster capability only. Displays information about hosts in the specified cluster.
- -h
- Prints command usage to stderr and exits.
- -V
- Prints LSF release version to stderr and exits.
Output: Host-based default
Displays the following fields:
- HOST_NAME
- The name of the host. If a host has running batch jobs, but the host is removed from the configuration, the host name is displayed as lost_and_found.
For condensed host groups, the HOST_NAME value is the name of host group.
- STATUS
- With LSF multicluster capability, not shown for fully exported hosts.
The status of the host and the sbatchd daemon. Batch jobs can be dispatched only to hosts with an ok status. Host status has the following values:
- ok
- The host is available to accept batch jobs.
For condensed host groups, if a single host in the host group is ok, the overall status is also shown as ok.
If any host in the host group or compute unit is not ok, bhosts displays the first host status that it encounters as the overall status for the condensed host group. Use the bhosts -X command to see the status of individual hosts in the host group or compute unit.
- unavail
- The host is down, or LIM and the sbatchd daemon on the host are unreachable.
- unreach
- LIM on the host is running but the sbatchd daemon is unreachable.
- closed
- The host is not allowed to accept any remote batch jobs. The host can be closed for several reasons.
- closed_Cu_excl
- This host is a member of a compute unit that is running an exclusive compute unit job.
- JL/U
- With LSF multicluster capability, not shown for fully exported hosts.
The maximum number of job slots that the host can process on a per user basis. A dash (-) indicates no limit.
For condensed host groups or compute units, the JL/U value is the total number of job slots that all hosts in the group or unit can process on a per user basis.
The host does not allocate more than JL/U job slots for one user at the same time. These job slots are used by running jobs, as well as by suspended or pending jobs with reserved slots.
For preemptive scheduling, the accounting is different. These job slots are used by running jobs and by pending jobs with reserved slots.
- MAX
- The maximum number of job slots available. A dash (-) indicates no limit.
For condensed host groups and compute units, the MAX value is the total maximum number of job slots available in all hosts in the host group or compute unit.
These job slots are used by running jobs, as well as by suspended or pending jobs with reserved slots.
If preemptive scheduling is used, suspended jobs are not counted.
A host does not always have to allocate this many job slots if jobs are waiting. The host must also satisfy its configured load conditions to accept more jobs.
- NJOBS
- The number of tasks for all jobs that are dispatched to the host. The NJOBS value includes running, suspended, and chunk jobs.
For condensed host groups and compute units, the NJOBS value is the total number of tasks that are used by jobs that are dispatched to any host in the host group or compute unit.
If the -alloc option is used, total is the sum of the RUN, SSUSP, USUSP, and RSV counters.
- RUN
- The number of tasks for all running jobs on the host.
For condensed host groups and compute units, the RUN value is the total number of tasks for running jobs on any host in the host group or compute unit. If the -alloc option is used, total is the allocated slots for the jobs on the host.
- SSUSP
- The number of tasks for all system suspended jobs on the host.
For condensed host groups and compute units, the SSUSP value is the total number of tasks for all system suspended jobs on any host in the host group or compute unit. If the -alloc option is used, total is the allocated slots for the jobs on the host.
- USUSP
- The number of tasks for all user suspended jobs on the host. Jobs can be suspended by the user or by the LSF administrator.
For condensed host groups and compute units, the USUSP value is the total number of tasks for all user suspended jobs on any host in the host group or compute unit. If the -alloc option is used, total is the allocated slots for the jobs on the host.
- RSV
- The number of tasks for all pending jobs with reserved slots on the host.
For condensed host groups and compute units, the RSV value is the total number of tasks for all pending jobs with reserved slots on any host in the host group or compute unit. If the -alloc option is used, total is the allocated slots for the jobs on the host.
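To illustrate how these fields relate, the following Python sketch parses default-style output into one record per host and checks the relationship stated for the -alloc option, where NJOBS is the sum of the RUN, SSUSP, USUSP, and RSV counters. The sample text, host names, and function names are made up for illustration; the parser assumes whitespace-separated columns under a single header line.

```python
from typing import Dict, List

COUNTERS = ("RUN", "SSUSP", "USUSP", "RSV")

# Hypothetical sample output; host names and numbers are invented.
SAMPLE = """\
HOST_NAME STATUS JL/U MAX NJOBS RUN SSUSP USUSP RSV
hostA     ok     -    8   5     3   1     0     1
hostB     closed -    4   4     4   0     0     0
"""


def parse_default_output(text: str) -> List[Dict[str, str]]:
    """Turn a header line plus rows into one dictionary per host."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    header = lines[0].split()
    return [dict(zip(header, ln.split())) for ln in lines[1:]]


def njobs_matches_counters(row: Dict[str, str]) -> bool:
    """Check NJOBS == RUN + SSUSP + USUSP + RSV, as described for -alloc."""
    return int(row["NJOBS"]) == sum(int(row[c]) for c in COUNTERS)
```

For hostA in the sample, 3 + 1 + 0 + 1 equals the NJOBS value of 5, so the check passes.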
Output: Host-based -l option
- loadSched, loadStop
- The scheduling and suspending thresholds for the host. If a threshold is not defined, the threshold from the queue definition applies. If both the host and the queue define a threshold for a load index, the most restrictive threshold is used.
The migration threshold is the time that a job dispatched to this host can remain suspended by the system before LSF attempts to migrate the job to another host.
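The rule for combining host and queue thresholds can be sketched as follows. This is an interpretation, not LSF code: it assumes that "most restrictive" means the lower value for increasing load indices and the higher value for decreasing load indices, using the classification of indices given in the CURRENT LOAD description. The function name is hypothetical.

```python
from typing import Optional

# Indices that grow as the host gets busier (run queue, CPU use, paging, I/O, logins).
INCREASING = {"r15s", "r1m", "r15m", "ut", "pg", "io", "ls"}
# Indices that shrink as the host gets busier (idle time, tmp, swap, memory).
DECREASING = {"it", "tmp", "swp", "mem"}


def effective_threshold(index: str,
                        host: Optional[float],
                        queue: Optional[float]) -> Optional[float]:
    """Return the threshold that applies when host and queue may both define one."""
    if host is None:
        return queue                 # host undefined: the queue threshold applies
    if queue is None:
        return host
    if index in INCREASING:
        return min(host, queue)      # assumption: the lower limit is stricter
    if index in DECREASING:
        return max(host, queue)      # assumption: the higher required amount is stricter
    raise ValueError(f"unknown load index: {index}")
```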
- STATUS
- The long format that is shown by the -l option gives the possible reasons for a host to be closed. If a power policy is enabled in the lsb.threshold file, it shows the power state:
- closed_Adm
- The host is closed by the LSF administrator or root with the badmin hclose command. No job can be dispatched to the host, but jobs that are running on the host are not affected.
- closed_Busy
- The host is overloaded. At least one load index exceeds the configured threshold. Indices that exceed their threshold are identified by an asterisk (*). No job can be dispatched to the host, but jobs that are running on the host are not affected.
- closed_Cu_Excl
- This host is a member of a compute unit that is running an exclusive compute unit job (submitted with the bsub -R "cu[excl]" command).
- closed_EGO
- For EGO-enabled SLA scheduling, host is closed because it was not allocated by EGO to run LSF jobs. Hosts that are allocated from EGO display the status ok.
- closed_Excl
- The host is running an exclusive job (submitted with the bsub -x command).
- closed_Full
- The maximum number of job slots on the host was reached. No job can be dispatched to the host, but jobs that are running on the host are not affected.
- closed_LIM
- LIM on the host is unreachable, but the sbatchd daemon is running.
- closed_Lock
- The host is locked by the EGO administrator or root by using lsadmin limlock command. Running jobs on the host are suspended by EGO (SSUSP state). Use the lsadmin limunlock command to unlock LIM on the local host.
- closed_Wind
- The host is closed by a dispatch window that is defined in the lsb.hosts file. No job can be dispatched to the host, but jobs that are running on the host are not affected.
- closed_RC
- The LSF resource connector host is closed because of a resource provider reclaim request. Hosts are also marked as closed_RC before they are returned to a resource provider (such as EGO, OpenStack, Amazon Web Services) when maximum time-to-live (the LSB_RC_EXTERNAL_HOST_MAX_TTL parameter in the lsf.conf file) or host idle time (the LSB_RC_EXTERNAL_HOST_IDLE_TIME parameter in the lsf.conf file) was reached.
- on
- The host power state is on.
Note: Power state on does not mean that the host state is ok, which depends on whether the lim and sbatchd daemons can be connected by the management host.
- off
- The host is powered off by policy or manually.
- suspend
- The host is suspended by policy or manually with badmin hpower.
- restarting
- The host is being reset because the resume operation failed.
- resuming
- The host is being resumed from standby state, which is triggered by either a policy or the cluster administrator.
- suspending
- The host is being suspended, which is triggered by either a policy or the cluster administrator.
- closed_Power
- The host is put into power saving (suspend) state by the cluster administrator.
- ok_Power
- Host suspend was triggered by power policy.
- CPUF
- Displays the CPU normalization factor of the host (see lshosts(1)).
- DISPATCH_WINDOW
- Displays the dispatch windows for each host. Dispatch windows are the time windows during the week when batch jobs can be run on each host. Jobs that are already started are not affected by the dispatch windows. When the dispatch windows close, jobs are not suspended. Jobs already running continue to run, but no new jobs are started until the windows reopen. The default for the dispatch window is no restriction or always open (that is, twenty-four hours a day and seven days a week). For the dispatch window specification, see the description for the DISPATCH_WINDOWS keyword under the -l option in the bqueues command.
- CURRENT LOAD
-
Displays the total and reserved host load.
- Reserved
- You specify reserved resources by using the bsub -R option. These resources are reserved by jobs that are running on the host.
- Total
- The total load has different meanings, depending on whether the load index is increasing or decreasing.
For increasing load indices, such as run queue lengths, CPU usage, paging activity, logins, and disk I/O, the total load is the consumed plus the reserved amount. The total load is calculated as the sum of the current load and the reserved load. The current load is the load that is shown by the lsload command.
For decreasing load indices, such as available memory, idle time, available swap space, and available space in the tmp directory, the total load is the available amount. The total load is the difference between the current load and the reserved load. This difference is the available resource as shown by the lsload command.
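The arithmetic behind the Total row can be written out as a short worked example. The Python sketch below follows the description above: the sum of current and reserved load for increasing indices, and the difference for decreasing indices. The function name and the sample numbers are illustrative assumptions.

```python
# Indices named above as increasing: run queue lengths, CPU usage, paging, logins, disk I/O.
INCREASING = {"r15s", "r1m", "r15m", "ut", "pg", "ls", "io"}
# Indices named above as decreasing: memory, idle time, swap space, tmp space.
DECREASING = {"mem", "it", "swp", "tmp"}


def total_load(index: str, current: float, reserved: float) -> float:
    """Compute the Total value from the current (lsload) and reserved load."""
    if index in INCREASING:
        return current + reserved    # consumed plus reserved
    if index in DECREASING:
        return current - reserved    # what remains available
    raise ValueError(f"unknown load index: {index}")
```

For example, with a current r1m of 1.0 and 0.5 reserved, the total is 1.5; with 1024 MB of memory and 256 MB reserved, the total is 768 MB.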
- LOAD THRESHOLD
-
Displays the scheduling threshold (loadSched) and the suspending threshold (loadStop). Also displays the migration threshold if defined and the checkpoint support if the host supports checkpointing.
The format for the thresholds is the same as for batch job queues. For an explanation of the thresholds and load indices, see the description for the QUEUE SCHEDULING PARAMETERS keyword under the -l option of the bqueues command.
- THRESHOLD AND LOAD USED FOR EXCEPTIONS
-
Displays the configured threshold of EXIT_RATE for the host and its current load value for host exceptions.
- ADMIN ACTION COMMENT
-
If the EGO administrator specified an administrator comment with the -C option of the badmin host control commands hclose or hopen, the comment text is displayed.
- PE NETWORK INFORMATION
-
Displays network resource information for IBM Parallel Edition (PE) jobs that are submitted with the bsub -network option, or to a queue (defined in the lsb.queues file) or an application profile (defined in the lsb.applications file) with the NETWORK_REQ parameter defined.
The following example shows PE NETWORK INFORMATION:
bhosts -l
...
PE NETWORK INFORMATION
NetworkID   Status             rsv_windows/total_windows
1111111     ok                 4/64
2222222     closed_Dedicated   4/64
...
NetworkID is the physical network ID returned by PE.
One of the following network Status values is displayed:
- ok
- Normal status.
- closed_Full
- All network windows are reserved.
- closed_Dedicated
- A dedicated PE job is running on the network (the usage=dedicated option is specified in the network resource requirement string).
- unavail
- Network information is not available.
- CONFIGURED AFFINITY CPU LIST
-
The host is configured in the lsb.hosts file to accept jobs for CPU and memory affinity scheduling. If the AFFINITY parameter is configured as Y, the keyword all is displayed. If a CPU list is specified under the AFFINITY column, the configured CPU list for affinity scheduling is displayed.
Output: Resource-based -s option
The -s option displays the following resource information: the amounts that are used for scheduling, the amounts reserved, and the associated hosts for the resources. Only resources (shared or host-based) with numeric values are displayed.
- RESOURCE
- The name of the resource.
- TOTAL
- The total amount free of a resource that is used for scheduling.
- RESERVED
- The amount that is reserved by jobs. You specify the reserved resource by using the bsub -R option.
- LOCATION
- The hosts that are associated with the resource.
Output: Host-based -aff option
The -aff option displays host topology information for CPU and memory affinity scheduling. Only the topology nodes that contain CPUs in the list in the CPULIST parameter that is defined in the lsb.hosts file are displayed.
- AFFINITY
- If the host is configured in the lsb.hosts file to accept jobs for CPU and memory affinity scheduling, and the host supports affinity scheduling, AFFINITY: Enabled is displayed.
If the host is configured in the lsb.hosts file to accept jobs for CPU and memory affinity scheduling, but the host does not support affinity scheduling, AFFINITY: Disabled (not supported) is displayed. If LIM on the host is not available or sbatchd is unreachable, AFFINITY: UNKNOWN is displayed.
- Host[memory] host_name
- Maximum available memory on the host. If memory availability cannot be determined, a dash (-) is displayed for the host. If the -l option is specified with the -aff option, the host name is not displayed.
For hosts that do not support affinity scheduling, a dash (-) is displayed for host memory and no host topology is displayed.
- NUMA[numa_node: requested_mem / max_mem]
- Requested and total NUMA node memory. It is possible for requested memory for the NUMA node to be greater than the maximum available memory displayed.
A socket is a collection of cores with a direct pipe to memory. Each socket contains 1 or more cores. A socket is not necessarily a physical socket, but rather refers to the memory architecture of the machine.
A core is a single entity capable of performing computations.
A node contains sockets. A socket contains cores, and a core can contain threads if the core is enabled for multithreading.
If no NUMA nodes are present, then the NUMA layer in the output is not shown. Other relevant items such as host, socket, core, and thread are still shown.
If the host is not available, only the host name is displayed. A dash (-) is shown where available host memory would normally be displayed.
bhosts -l -aff hostA
HOST  hostA
STATUS    CPUF  JL/U   MAX NJOBS   RUN SSUSP USUSP   RSV  DISPATCH_WINDOW
ok       60.00     -     8     0     0     0     0     0  -
CURRENT LOAD USED FOR SCHEDULING:
              r15s   r1m  r15m    ut    pg    io    ls    it   tmp   swp   mem slots
Total          0.0   0.0   0.0   30%   0.0   193    25     0 8605M  5.8G 13.2G     8
Reserved       0.0   0.0   0.0    0%   0.0     0     0     0    0M    0M    0M     -
LOAD THRESHOLD USED FOR SCHEDULING:
              r15s   r1m  r15m    ut    pg    io    ls    it   tmp   swp   mem
loadSched        -     -     -     -     -     -     -     -     -     -     -
loadStop         -     -     -     -     -     -     -     -     -     -     -
CONFIGURED AFFINITY CPU LIST: all
AFFINITY: Enabled
Host[15.7G]
    NUMA[0: 100M / 15.7G]
        Socket0
            core0(0)
        Socket1
            core0(1)
        Socket2
            core0(2)
        Socket3
            core0(3)
        Socket4
            core0(4)
        Socket5
            core0(5)
        Socket6
            core0(6)
        Socket7
            core0(7)
...
Host[1.4G] hostB
    NUMA[0: 1.4G / 1.4G] (*0 *1)
...
In this example, a job that requests two cores, two sockets, or two CPUs runs, as does a job that requests two cores from the same NUMA node. However, a job that requests two cores from the same socket remains pending, because each socket on hostA contains only one core.
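The reasoning above can be checked mechanically. The following Python sketch is not part of LSF; it assumes the topology lines look exactly like the hostA example (Host, NUMA, SocketN, coreN(...)) and ignores indentation. The function names parse_topology and max_cores_per_socket are hypothetical helpers introduced only for illustration.

```python
import re

# Topology lines copied from the hostA example; indentation is ignored.
SAMPLE = """\
Host[15.7G]
NUMA[0: 100M / 15.7G]
Socket0
core0(0)
Socket1
core0(1)
Socket2
core0(2)
Socket3
core0(3)
Socket4
core0(4)
Socket5
core0(5)
Socket6
core0(6)
Socket7
core0(7)
"""

def parse_topology(text):
    """Return one dict per socket: its NUMA node ID, socket ID, and core IDs."""
    sockets = []
    numa_id = None  # stays None when the output has no NUMA layer
    for raw in text.splitlines():
        line = raw.strip()
        m = re.match(r"NUMA\[(\d+):", line)
        if m:
            numa_id = int(m.group(1))
            continue
        m = re.match(r"Socket(\d+)$", line)
        if m:
            sockets.append({"numa": numa_id, "socket": int(m.group(1)), "cores": []})
            continue
        # The value in parentheses after the core name is not interpreted here.
        m = re.match(r"core(\d+)\(", line)
        if m and sockets:
            sockets[-1]["cores"].append(int(m.group(1)))
    return sockets

def max_cores_per_socket(sockets):
    """Largest number of cores found on any single socket."""
    return max((len(s["cores"]) for s in sockets), default=0)

sockets = parse_topology(SAMPLE)
print(len(sockets), max_cores_per_socket(sockets))  # prints "8 1"
```

With eight sockets of one core each, a request for two cores on the same socket exceeds what any socket offers, which matches the pending job described above.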
Output: GPU-based -gpu option
The -gpu option displays information about the GPUs on the host.
- HOST_NAME
- The host name.
- GPU_ID
- The GPU IDs on the host. Each GPU is shown as a separate line.
- MODEL
- The full model name, which consists of the GPU brand name and the model type.
- MUSED
- The amount of GPU memory currently in use.
- MRSV
- The amount of GPU memory that is reserved by the job.
- NJOBS
- The total number of jobs that are using the GPUs.
- RUN
- The total number of running jobs that are using the GPUs.
- SUSP
- The total number of suspended jobs that are using the GPUs.
- RSV
- The total number of pending jobs that reserved the GPUs.
- VENDOR
- The GPU vendor type (that is, the GPU brand name).
- NGPUS
- The total number of GPUs on the host.
- SHARED_AVAIL
- The current total number of GPUs that are available for concurrent use by multiple jobs (that is, when the job is submitted with the -gpu mode=shared or -gpu j_exclusive=no options).
- EXCLUSIVE_AVAIL
- The current total number of GPUs that are available for exclusive use by a job (that is, when the job is submitted with the -gpu mode=exclusive_process or -gpu j_exclusive=yes options).
- STATIC ATTRIBUTES
- Static GPU information. The following fields are specific to this section:
- NVLINK/XGMI
- The connections with other GPUs on the same host.
The connection flag of each GPU is separated by a slash (/) with the next GPU, with a Y showing that there is a direct NVLink (for Nvidia) or xGMI (for AMD) connection with that GPU.
- MIG
- A flag to indicate whether the GPU supports Nvidia Multi-Instance GPU (MIG) functions.
- DYNAMIC ATTRIBUTES
- The latest GPU usage information as maintained by LSF.
- GPU JOB INFORMATION
- Information on jobs that are using the host's GPUs. The following fields are specific to this section:
- JEXCL
- Flag to indicate whether the GPU job requested that the allocated GPUs cannot be used by other jobs (that is, whether the job was submitted with -gpu j_exclusive=yes).
- RUNJOBIDS
- The IDs of the running GPU jobs on the GPU.
- SUSPJOBIDS
- The IDs of the suspended GPU jobs on the GPU.
- RSVJOBIDS
- The IDs of the pending GPU jobs that reserved the GPU.
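As an illustration of the NVLINK/XGMI field described above, the following Python sketch splits a slash-separated flag string and reports which positions carry a Y. The sample string, the use of N for "no direct connection", and the idea that position i corresponds to the i-th GPU on the host are all assumptions made for this example. Only the slash separator and the Y flag come from the field description. The function name linked_gpus is hypothetical.

```python
def linked_gpus(flags):
    """Return the positions whose flag is Y in a slash-separated flag string."""
    return [i for i, flag in enumerate(flags.split("/")) if flag.strip() == "Y"]

# Hypothetical flag string for one GPU on a four-GPU host.
print(linked_gpus("N/Y/Y/N"))  # prints "[1, 2]"
```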
Output: Resource connector -rconly option
The -rconly option displays information that is specific to the LSF resource connector.
- PUB_DNS_NAME and PUB_IP_ADDRESS
- Public DNS name and IP address of the host.
- PRIV_DNS_NAME and PRIV_IP_ADDRESS
- Private DNS name and IP address of the host.
- RC_STATUS
- LSF resource connector status.
- PROV_STATUS
- Resource provider status.
- TAG
- The RC_ACCOUNT value that is defined in the lsb.queues or lsb.applications files.
- UPDATED_AT
- Time stamp of the latest status change.
- INSTANCE_ID
- ID of the created machine instance. This ID uniquely identifies the host in LSF.
bhosts -rconly
PROVIDER : aws
TEMPLATE : aws-vm-1
PUB_DNS_NAME        PUB_IP_ADDRESS  PRIV_DNS_NAME       PRIV_IP_ADDRESS  RC_STATUS  PROV_STATUS  TAG      UPDATED_AT              INSTANCE_ID
ec2-52-43-171-109.  52.43.171.109   ip-192-168-0-85.us  192.168.0.85     Done       terminated   default  2017-05-31T14:30:47CDT  -
ec2-35-160-157-112  35.160.157.112  ip-192-168-0-69.us  192.168.0.69     Allocated  running      default  2017-05-31T14:32:00CDT  -
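For scripting, the -rconly sample above can be turned into records with a small parser. The following Python sketch is an assumption-laden illustration, not an LSF interface: it assumes that the column header starts with PUB_DNS_NAME, that no field value contains whitespace, and that metadata lines have the form "NAME : value", as in the sample. The function name parse_rconly is hypothetical. Note that the DNS names in the sample appear truncated, so they might not be usable as full host names.

```python
# Sample text copied from the bhosts -rconly example.
SAMPLE = """\
PROVIDER : aws
TEMPLATE : aws-vm-1
PUB_DNS_NAME PUB_IP_ADDRESS PRIV_DNS_NAME PRIV_IP_ADDRESS RC_STATUS PROV_STATUS TAG UPDATED_AT INSTANCE_ID
ec2-52-43-171-109. 52.43.171.109 ip-192-168-0-85.us 192.168.0.85 Done terminated default 2017-05-31T14:30:47CDT -
ec2-35-160-157-112 35.160.157.112 ip-192-168-0-69.us 192.168.0.69 Allocated running default 2017-05-31T14:32:00CDT -
"""

def parse_rconly(text):
    """Split the sample layout into (metadata dict, list of row dicts)."""
    meta, header, rows = {}, None, []
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if header is None:
            if fields[0] == "PUB_DNS_NAME":
                header = fields  # column names become the dict keys
            elif len(fields) >= 3 and fields[1] == ":":
                meta[fields[0]] = " ".join(fields[2:])
        elif len(fields) == len(header):
            # Rows with a different field count are skipped, not guessed at.
            rows.append(dict(zip(header, fields)))
    return meta, rows

meta, rows = parse_rconly(SAMPLE)
running = [r["PRIV_IP_ADDRESS"] for r in rows if r["PROV_STATUS"] == "running"]
print(meta["PROVIDER"], running)  # prints "aws ['192.168.0.69']"
```

Splitting on whitespace is the simplest choice for this sample; output that contains empty or multi-word fields would need a fixed-width or delimiter-based approach instead.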
Output: Attribute -attr option
The -attr option displays information on attributes that are attached to the host.
- HOSTS
- The names of the hosts to which this attribute is attached.
- ATTRIBUTE
- The name of the attribute.
- TTL
- The current time-to-live (TTL) value of the attribute.
- CREATOR
- The name of the user that created the attribute.
- DESCRIPTION
- User-specified information about the attribute.
Files
Reads the lsb.hosts file.
See also
lsb.hosts, bqueues, lshosts, badmin, lsadmin