pam
Parallel Application Manager (job starter for MPI applications).
HP-UX vendor MPI syntax
bsub pam -mpi mpirun [mpirun_options] mpi_app [argument ...]
Generic PJL framework syntax
bsub pam [-t] [-v] [-n num_tasks] -g [num_args] pjl_wrapper [pjl_options] mpi_app [argument ...]
pam [-h]
pam [-V]
Description
The Parallel Application Manager (PAM) is fully integrated with LSF. PAM acts as the supervisor of a parallel LSF job.
MPI jobs started by the pam command can be submitted only through batch jobs; PAM cannot be used interactively to start parallel jobs. The sbatchd daemon starts PAM on the first execution host.
- Uses a vendor MPI library or an MPI Parallel Job Launcher (PJL), for example, mpirun or poe, to start a parallel job on a specified set of hosts in an LSF cluster.
- PAM contacts RES on each execution host that is allocated to the parallel job.
- PAM queries RES periodically to collect resource usage for each parallel task and passes control signals through RES to all process groups and individual running tasks, and cleans up tasks as needed.
- Passes job-level resource usage and process IDs (PIDs and PGIDs) to sbatchd for enforcement.
- Collects resource usage information and exit status upon termination.
Task startup for vendor MPI jobs
The pam command starts a vendor MPI job on a specified set of hosts in an LSF cluster. The pam command that starts an MPI job requires the underlying MPI system to be LSF-aware, using a vendor MPI implementation that supports LSF (for example, HP-UX vendor MPI).
PAM uses the vendor MPI library to create the child processes needed for the parallel tasks that make up your MPI application. It starts these tasks on the systems that are allocated by LSF. The allocation includes the number of execution hosts needed, and the number of child processes needed on each host.
Task startup for generic PJL jobs
- PAM starts the PJL, which in turn starts the TaskStarter (TS).
- TS starts the tasks on each execution host, reports the process ID to PAM, and waits for the task to finish.
- $MPIRUN_LSF_PRE_EXEC
- Runs before PAM is started.
- $MPIRUN_LSF_POST_EXEC
- Runs after PAM is started.
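The pre- and post-execution hooks are set as environment variables in the submission environment. A minimal sketch of a submission that uses them; the script paths, the mpirun wrapper, and mpi_app are placeholders for your own setup:

```shell
#!/bin/sh
# Hypothetical pre/post hook scripts -- substitute your own paths.
export MPIRUN_LSF_PRE_EXEC="/path/to/pre_exec.sh"    # runs before PAM is started
export MPIRUN_LSF_POST_EXEC="/path/to/post_exec.sh"  # runs after PAM is started

# Submit a generic PJL job (mpirun here stands in for any PJL wrapper).
bsub -n 4 pam -g 1 mpirun ./mpi_app
```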
Options for vendor MPI jobs
- -auto_place
- The -auto_place option on the pam command line tells the IRIX mpirun library to start the MPI application according to the resources allocated by LSF.
- -mpi
- On HP-UX, you can have LSF manage the allocation of hosts to achieve better resource usage by coordinating the start-up phase with the mpirun command. Precede the regular MPI mpirun command with the following command:
bsub pam -mpi
For HP-UX vendor MPI jobs, the -mpi option must be the first option of the pam command.
For example, the following mpirun command runs a single-host job:
mpirun -np 14 a.out
To have LSF select the host, include the mpirun command in the bsub job submission command:
bsub pam -mpi mpirun -np 14 a.out
- -n num_tasks
- The number of processors that are required to run the parallel application, typically the same
as the number of parallel tasks in the job. If the host is a multiprocessor, one host can start
several tasks.
You can use both the bsub -n and pam -n options in the same job submission. The number that is specified in the pam -n option must be less than or equal to the number specified by the bsub -n option. If the number of tasks that are specified with the pam -n option is greater than the number that is specified by the bsub -n option, the pam -n option is ignored.
For example, you can specify the following command:bsub -n 5 pam -n 2 -mpi -auto_place a.out
The job requests five processors, but PAM starts only two parallel tasks.
- mpi_app [argument ...]
- The name of the MPI application to be run on the listed hosts. This name must be the last argument on the command line.
- -h
- Prints command usage to stderr and exits.
- -V
- Prints LSF release version to stderr and exits.
Options for generic PJL jobs
- -t
- This option tells the pam command not to print the MPI job tasks summary report to the standard output. By default, the summary report prints the task ID, the host that it ran on, the command that was run, the exit status, and the termination time.
- -v
- Verbose mode. Displays the name of the execution host or hosts.
- -g [num_args] pjl_wrapper [pjl_options]
- The -g option is required to use the generic PJL framework. You must specify
all the other pam options before -g.
- num_args
- Specifies how many space-separated arguments in the command line are related to the PJL. The remainder of the command line is assumed to be the binary application that starts the parallel tasks, with its arguments.
- pjl_wrapper
- The name of the PJL.
- pjl_options
- Optional arguments to the PJL.
For example:
- A PJL named no_arg_pjl takes no options, so num_args=1. The syntax is:
pam [pam_options] -g 1 no_arg_pjl job [job_options]
- A PJL named 3_arg_pjl takes the options -a, -b, and group_name, so num_args=4. Use the following syntax:
pam [pam_options] -g 4 3_arg_pjl -a -b group_name job [job_options]
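As a concrete submission, assuming mpirun is used as the PJL wrapper with no options of its own (so the PJL portion of the command line is the single argument mpirun, and num_args=1):

```shell
# Hypothetical generic PJL submission: 8 tasks, no task summary report (-t).
# mpirun and ./mpi_app are illustrative names, not fixed by pam itself.
bsub -n 8 pam -t -g 1 mpirun ./mpi_app
```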
- -n num_tasks
- The number of processors that are required to run the MPI application, typically the number of
parallel tasks in the job. If the host is a multiprocessor, one host can start several tasks.
You can use both the bsub -n and pam -n options in the same job submission. The number that is specified in the pam -n option must be less than or equal to the number specified by the bsub -n option. If the number of tasks that are specified with the pam -n option is greater than the number specified by the bsub -n option, the pam -n option is ignored.
- mpi_app [argument ...]
- The name of the MPI application to be run on the listed hosts. This name must be the last argument on the command line.
- -h
- Prints command usage to stderr and exits.
- -V
- Prints LSF release version to stderr and exits.
Exit Status
The pam command exits with the exit status of the mpirun command or the PJL wrapper.
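Because pam propagates the exit status of mpirun or the PJL wrapper, a batch job script can branch on it. A sketch (the pam invocation is illustrative; the status-handling pattern is plain shell):

```shell
#!/bin/sh
# Run the parallel job; pam exits with the PJL wrapper's exit status.
pam -g 1 mpirun ./mpi_app
status=$?

# Propagate failure so the LSF job itself is recorded as failed.
if [ "$status" -ne 0 ]; then
    echo "parallel job failed with exit status $status" >&2
fi
exit "$status"
```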