Define Intel Xeon Phi resources
Enable LSF so applications can use Intel Xeon Phi co-processors (previously referred to as Intel Many Integrated Core Architecture, or MIC, co-processors) in a Linux environment. LSF supports parallel jobs that request Xeon Phi resources, so you can specify some co-processors on each node at run time, based on availability.
Specifically, LSF supports the following environments:
- Intel Xeon Phi co-processors for serial and parallel jobs. Use the blaunch command to launch parallel jobs.
- Intel Xeon Phi co-processor for LSF jobs in offload mode, both serial and parallel.
- CUDA 4.0 to CUDA 8.0 and later.
- Intel Xeon Phi co-processors support Linux x64.
LSF also supports the collection of metrics for Xeon Phi co-processors by using ELIMs and predefined LSF resources.
The elim.mic ELIM collects the following information:
- elim.mic detects the number of Intel Xeon Phi co-processors (nmics)
- For each co-processor, the optional elim detects the following resources:
- mic_ncores*: Number of cores.
- mic_temp*: Co-processor temperature.
- mic_freq*: Co-processor frequency.
- mic_freemem*: Co-processor free memory.
- mic_util*: Co-processor utilization.
- mic_power*: Co-processor total power.
* If a host has more than one co-processor, an index is appended to the resource name, starting at 0. For example, for mic_ncores, you might see mic_ncores0, mic_ncores1, mic_ncores2, and so on.
When you enable LSF support for Intel Xeon Phi resources, note the following considerations:
- Checkpoint and restart are not supported.
- Preemption is not supported.
- Resource duration and decay are not supported.
- ELIMs for CUDA 4.0 can work with CUDA 8.0 or later.
Configure and use Intel Xeon Phi resources
- Binary files for the base elim.mic file are located under $LSF_SERVERDIR. The elim.mic.ext script file is located under LSF_TOP/10.1.0/util/elim.mic.ext.
Make sure that the elim executable files are in the LSF_SERVERDIR directory.
For Intel Xeon Phi co-processor support, make sure that the following third-party software is installed correctly:
- Intel Xeon Phi co-processor (Knights Corner).
- Intel MPSS version 2.1.4982-15 or later.
- Runtime support library/tools from Intel Xeon Phi offload support.
- Configure the LSF cluster that contains the Intel Xeon Phi resources.
- Configure the lsf.shared file. For Intel Xeon Phi support, define the following resources in the Resource section. The first resource (nmics) is required. The others are optional:
Begin Resource
RESOURCENAME  TYPE     INTERVAL  INCREASING  CONSUMABLE  DESCRIPTION
nmics         Numeric  60        N           Y           (Number of MIC devices)
mic_temp0     Numeric  60        Y           N           (MIC device 0 CPU temp)
mic_temp1     Numeric  60        Y           N           (MIC device 1 CPU temp)
mic_freq0     Numeric  60        N           N           (MIC device 0 CPU freq)
mic_freq1     Numeric  60        N           N           (MIC device 1 CPU freq)
mic_power0    Numeric  60        Y           N           (MIC device 0 total power)
mic_power1    Numeric  60        Y           N           (MIC device 1 total power)
mic_freemem0  Numeric  60        N           N           (MIC device 0 free memory)
mic_freemem1  Numeric  60        N           N           (MIC device 1 free memory)
mic_util0     Numeric  60        Y           N           (MIC device 0 CPU utility)
mic_util1     Numeric  60        Y           N           (MIC device 1 CPU utility)
mic_ncores0   Numeric  60        N           N           (MIC device 0 number cores)
mic_ncores1   Numeric  60        N           N           (MIC device 1 number cores)
...
End Resource
Note: The mic_util resource is a numeric resource, so the lsload command does not display it as the internal resource.
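The optional per-device rows in the lsf.shared Resource section follow a regular pattern, so for hosts with more than two co-processors they can be generated rather than typed. The following is a minimal sketch under that assumption; the function name mic_resource_rows is hypothetical and is not part of LSF. The INCREASING and CONSUMABLE values are copied from the rows shown for devices 0 and 1.

```shell
#!/bin/sh
# Hypothetical helper (not part of LSF): print lsf.shared Resource rows
# for the optional per-device metrics of the first N co-processors.
# Each line of the here-document is: base name, INCREASING flag, description.
# All per-device metrics are non-consumable (CONSUMABLE = N), as in the
# rows documented for devices 0 and 1.
mic_resource_rows() {
    count=$1
    while read -r base increasing desc; do
        i=0
        while [ "$i" -lt "$count" ]; do
            printf '%-14s Numeric 60 %s N (MIC device %d %s)\n' \
                "${base}${i}" "$increasing" "$i" "$desc"
            i=$((i + 1))
        done
    done <<'EOF'
mic_temp Y CPU temp
mic_freq N CPU freq
mic_power Y total power
mic_freemem N free memory
mic_util Y CPU utility
mic_ncores N number cores
EOF
}

# Print rows for a host with three co-processors (indexes 0, 1, and 2).
mic_resource_rows 3
```

The output is intended to be pasted between the nmics row and End Resource. The required nmics row is not generated because it is defined once per cluster, not once per device.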
- Configure the lsf.cluster.cluster_name file.
For Intel Xeon Phi support, define the following lines in the ResourceMap section. The first resource (nmics) is provided by the elim.mic. The others are optional:
Begin ResourceMap
RESOURCENAME  LOCATION
...
nmics         [default]
mic_temp0     [default]
mic_temp1     [default]
mic_freq0     [default]
mic_freq1     [default]
mic_power0    [default]
mic_power1    [default]
mic_freemem0  [default]
mic_freemem1  [default]
mic_util0     [default]
mic_util1     [default]
mic_ncores0   [default]
mic_ncores1   [default]
...
End ResourceMap
- Configure the nmics resource in the lsb.resources file. You can set attributes in the ReservationUsage section with the following values:
Begin ReservationUsage
RESOURCE  METHOD    RESERVE
...
nmics     PER_TASK  N
...
End ReservationUsage
If this file has no configuration for Intel Xeon Phi resources, by default LSF considers all resources as PER_HOST.
- Use the lsload -l command to show Intel Xeon Phi resources, or use the lsload -I option to display only the resources that you name:
lsload -I nmics:ngpus:ngpus_shared:ngpus_excl_t:ngpus_excl_p
HOST_NAME  status  nmics  ngpus  ngpus_shared  ngpus_excl_t  ngpus_excl_p
hostA      ok      -      3.0    12.0          0.0           0.0
hostB      ok      1.0    -      -             -             -
hostC      ok      1.0    -      -             -             -
hostD      ok      1.0    -      -             -             -
hostE      ok      1.0    -      -             -             -
hostF      ok      -      3.0    12.0          0.0           0.0
hostG      ok      -      3.0    12.0          0.0           1.0
hostH      ok      -      3.0    12.0          1.0           0.0
hostI      ok      2.0    -      -             -             -
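To pick out the hosts that report co-processors from this kind of output, a small filter can help. The following is a sketch only: the function name hosts_with_mics is hypothetical, and it assumes the layout shown in the lsload output above (one header row, nmics in the third column, and a dash where the resource is not defined on a host).

```shell
#!/bin/sh
# Hypothetical helper (not part of LSF): read lsload -I output on stdin
# and print the names of hosts that report at least one co-processor.
# Assumes: first line is the header, column 3 is nmics, "-" means undefined.
hosts_with_mics() {
    awk 'NR > 1 && $3 != "-" && $3 + 0 > 0 { print $1 }'
}

# Sample rows copied from the lsload output above.
hosts_with_mics <<'EOF'
HOST_NAME status nmics ngpus ngpus_shared ngpus_excl_t ngpus_excl_p
hostA ok - 3.0 12.0 0.0 0.0
hostB ok 1.0 - - - -
hostI ok 2.0 - - - -
EOF
```

On a live cluster, the same function could read from lsload -I nmics directly, because nmics is still the third column when it is the only resource requested.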
- Use the bhosts -l command to see how the LSF scheduler allocated Intel Xeon Phi resources. These resources are treated as normal host-based resources:
bhosts -l hostA
HOST  hostA
STATUS  CPUF   JL/U  MAX  NJOBS  RUN  SSUSP  USUSP  RSV  DISPATCH_WINDOW
ok      60.00  -     12   2      2    0      0      0    -

CURRENT LOAD USED FOR SCHEDULING:
          r15s  r1m  r15m  ut  pg   io  ls  it  tmp  swp   mem    slots  nmics
Total     0.0   0.0  0.0   0%  0.0  3   4   0   28G  3.9G  22.5G  10     0.0
Reserved  0.0   0.0  0.0   0%  0.0  0   0   0   0M   0M    0M     -      -

          ngpus  ngpus_shared  ngpus_excl_t  ngpus_excl_p
Total     3.0    10.0          0.0           0.0
Reserved  0.0    2.0           0.0           0.0

LOAD THRESHOLD USED FOR SCHEDULING:
           r15s  r1m  r15m  ut  pg  io  ls  it  tmp  swp  mem
loadSched  -     -    -     -   -   -   -   -   -    -    -
loadStop   -     -    -     -   -   -   -   -   -    -    -

           nmics  ngpus  ngpus_shared  ngpus_excl_t  ngpus_excl_p
loadSched  -      -      -             -             -
loadStop   -      -      -             -             -
- Use the lshosts -l command to see the information for Intel Xeon Phi co-processors that are collected by the elim:
lshosts -l hostA
HOST_NAME:  hostA
type    model        cpuf  ncpus  ndisks  maxmem  maxswp  maxtmp  rexpri  server  nprocs  ncores  nthreads
X86_64  Intel_EM64T  60.0  12     1       23.9G   3.9G    40317M  0       Yes     2       6       1

RESOURCES: (mg)
RUN_WINDOWS: (always open)

LOAD_THRESHOLDS:
r15s  r1m  r15m  ut  pg  io  ls  it  tmp  swp  mem  nmics  ngpus  ngpus_shared  ngpus_excl_t  ngpus_excl_p
-     3.5  -     -   -   -   -   -   -    -    -    -      -      -             -             -
- Submit jobs. Use the select[] string in a resource requirement (-R option) to choose the hosts that have Intel Xeon Phi resources. Use the rusage[] string to tell LSF how many resources to use.
- Use Intel Xeon Phi resources in an LSF MPI job:
bsub -n 4 -R "rusage[nmics=2]" mpirun -lsf mic_app
- Request Intel Xeon Phi co-processors, where n is the number of co-processors:
bsub -R "rusage[nmics=n]"
- Consume one Intel Xeon Phi resource on the execution host:
bsub -R "rusage[nmics=1]" mic_app
- Run the job on one host and consume 2 Intel Xeon Phi resources on that host:
bsub -R "rusage[nmics=2]" mic_app
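The submission examples use only rusage[], although the submit step also mentions select[]. The following sketch shows one way the two strings might be combined in a single -R option. The function name mic_bsub_cmd is hypothetical, and select[nmics>0] is an assumed selection expression that does not appear in the examples above. The function only prints the command line so that it can be inspected before it is run.

```shell
#!/bin/sh
# Hypothetical helper (not part of LSF): build, but do not run, a bsub
# command line that selects hosts with at least one co-processor and
# reserves the requested number of nmics for the job.
# Usage: mic_bsub_cmd <number_of_mics> <command> [args...]
mic_bsub_cmd() {
    nmics=$1
    shift
    # select[nmics>0] is an assumed way to restrict the job to hosts
    # where the elim reports co-processors.
    printf 'bsub -R "select[nmics>0] rusage[nmics=%d]" %s\n' "$nmics" "$*"
}

# Print the command for a job that consumes one co-processor.
mic_bsub_cmd 1 mic_app
```

Printing the command first keeps the sketch safe to try on a machine without LSF. The printed line can then be pasted into a shell on a submission host.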