About pre- and post-execution processing

The pre- and post-execution processing feature consists of two types:

  • Job-based pre- and post-execution processing, which is intended for sequential jobs and runs only on the first execution host.
  • Host-based pre- and post-execution processing, which is intended for parallel jobs and runs on all execution hosts.

You can use pre- and post-execution processing to run commands before a batch job starts or after it finishes. Typical uses of this feature include the following:
  • Reserving resources such as tape drives and other devices not directly configurable in LSF
  • Making job-starting decisions in addition to those directly supported by LSF
  • Creating and deleting scratch directories for a job
  • Customizing scheduling based on the exit code of a pre-execution command
  • Checking availability of software licenses
  • Assigning jobs to run on specific processors on SMP machines
  • Transferring data files needed for job execution
  • Modifying system configuration files before and after job execution
  • Using a post-execution command to clean up a state left by the pre-execution command or the job

Any executable command line can serve as a pre-execution or post-execution command. By default, the commands run under the same user account, environment, home directory, and working directory as the job.
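
As a minimal sketch of the scratch-directory use case listed above, the following shell functions could form a pre-execution and post-execution pair. It assumes that the job ID is available in the LSB_JOBID environment variable; SCRATCH_ROOT and the function names are hypothetical and not part of LSF.

```shell
#!/bin/sh
# Hypothetical helpers: SCRATCH_ROOT and the function names are illustrative.
SCRATCH_ROOT="${SCRATCH_ROOT:-/tmp}"

scratch_dir() {
    # One directory per job, keyed on the job ID (assumed to be in LSB_JOBID).
    printf '%s/lsf_scratch_%s\n' "$SCRATCH_ROOT" "${LSB_JOBID:-unknown}"
}

pre_exec() {
    # Create the scratch directory; a non-zero status reports failure.
    mkdir -p "$(scratch_dir)" || return 1
}

post_exec() {
    # Remove whatever the pre-execution step or the job left behind.
    rm -rf "$(scratch_dir)"
}
```

Because the commands run by default under the same account and environment as the job, the directory created by pre_exec is owned by the job's user and can be removed by post_exec without extra privileges.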

When JOB_INCLUDE_POSTPROC is defined in an application profile or in lsb.params, a job is considered to be in the RUN state while it is in the post-execution stage. Without this parameter, a regular job is already in the DONE state during that stage.

Job-based pre- and post-execution processing

Job-based pre-execution and post-execution commands can be defined at the queue, application, and job levels.
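
As an illustration of the queue level, a queue definition in lsb.queues might carry the two commands as in the following configuration fragment. This is a sketch only: the queue name and the script paths are hypothetical.

```
Begin Queue
QUEUE_NAME = normal
# Hypothetical script paths
PRE_EXEC   = /shared/lsf/scripts/pre.sh
POST_EXEC  = /shared/lsf/scripts/post.sh
End Queue
```

At the job level, the corresponding commands are given on the bsub command line with the -E (pre-execution) and -Ep (post-execution) options.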

The command path can contain up to 4094 characters for UNIX and Linux, or up to 255 characters for Windows, including the directory, file name, and expanded values for %J (job_ID) and %I (index_ID).
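
Because the limit applies to the path after %J and %I are expanded, it can be useful to check the expanded length rather than the configured string. The following is a rough sketch with hypothetical helper names (expand_path, path_fits); it assumes numeric job and index IDs and is not how LSF itself performs the check.

```shell
#!/bin/sh
# expand_path PATH JOB_ID INDEX_ID
# Substitute %J and %I the way the limit description implies.
# Assumes the IDs are plain numbers, so they are safe inside the sed script.
expand_path() {
    printf '%s\n' "$1" | sed -e "s/%J/$2/g" -e "s/%I/$3/g"
}

# path_fits PATH JOB_ID INDEX_ID LIMIT
# LIMIT is 4094 for UNIX and Linux, or 255 for Windows.
path_fits() {
    expanded=$(expand_path "$1" "$2" "$3")
    [ "${#expanded}" -le "$4" ]
}
```

For example, path_fits '/scratch/%J_%I/pre.sh' 1234 7 255 succeeds, because the expanded path is well under 255 characters.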

When the job is resizable, job grow requests are ignored. However, job shrink requests can be processed. In either case, LSF does not invoke the job resized notification command.

[Illustration: default behavior (feature not enabled) of job-based pre- and post-execution processing]

[Illustration: example of how job-based pre- and post-execution processing works at the queue or application level for setting the environment prior to job execution and for transferring resulting files after the job runs]

The table below provides the scope of job-based pre- and post-execution processing:

Applicability       Details
Operating system    • UNIX
                    • Windows
                    • A mix of UNIX and Windows hosts
Dependencies        • UNIX and Windows user accounts must be valid on all hosts in the cluster and must have the correct permissions to successfully run jobs.
                    • On a Windows Server 2003, x64 Edition platform, users must have read and execute privileges for cmd.exe.
Limitations         • Applies to batch jobs only (jobs submitted using the bsub command)

Host-based pre- and post-execution processing

Host-based pre- and post-execution processing differs from job-based pre- and post-execution processing in that it is intended for parallel jobs (although you can also use it for sequential jobs) and runs on all execution hosts rather than only the first execution host. Its purpose is to set up the execution hosts before any job-based pre-execution or other pre-processing that depends on host-based preparation, and to clean up the execution hosts after job-based post-execution and other post-processing.

This feature can be used in a number of ways. For example:

  • HPC sites can check system health in multiple ways before actually launching jobs, such as checking host or node status and verifying that key file systems are mounted, InfiniBand is working, and the required directories, files, environment, and user permissions are set.
  • Administrators can configure a site-specific policy that runs host-based pre- and post-execution processing to set up ssh access to compute nodes. By default, ssh is disabled. However, with host-based pre- and post-execution processing, ssh access to the nodes allocated for the job can be enabled for the duration of the job life cycle. This is required for debugging a parallel job on a non-first execution host and does not affect the overall cluster security policy.
  • Administrators can configure host-based pre- and post-execution processing to create and later remove temporary working directories on each host.
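
The health-check use case above can be sketched as a small shell function that a host-based pre-execution script might call on each execution host. The function name host_ready and the idea of passing required directories as arguments are assumptions made for illustration. How LSF reacts to a non-zero exit depends on the failure-handling configuration, which is not shown here.

```shell
#!/bin/sh
# host_ready DIR...
# Hypothetical check: succeed only if every required directory exists
# and is writable by the current user on this host.
host_ready() {
    for dir in "$@"; do
        if [ ! -d "$dir" ]; then
            echo "missing directory: $dir" >&2
            return 1
        fi
        if [ ! -w "$dir" ]; then
            echo "directory not writable: $dir" >&2
            return 1
        fi
    done
    return 0
}
```

A real check would typically add site-specific tests (interconnect status, mounted file systems) using whatever tools the site already trusts; those are omitted because they vary by platform.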

You can define host-based pre- and post-execution processing at the application level and at the queue level. Failure handling is also supported.

There are two ways to enable host-based pre- and post-execution processing for a job:

  • Configure HOST_PRE_EXEC and HOST_POST_EXEC in lsb.queues.
  • Configure HOST_PRE_EXEC and HOST_POST_EXEC in lsb.applications.
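
For the queue-level option, the configuration might look like the following fragment of lsb.queues. This is a sketch only: the queue name and script paths are hypothetical, and the scripts must be reachable on every execution host because they run on all of them.

```
Begin Queue
QUEUE_NAME     = parallel
# Hypothetical script paths on a file system shared by all execution hosts
HOST_PRE_EXEC  = /shared/lsf/scripts/host_pre.sh
HOST_POST_EXEC = /shared/lsf/scripts/host_post.sh
End Queue
```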

When configuring host-based pre- and post-execution processing, note the following:

  • Host-based pre- and post-execution processing is only supported on UNIX.
  • Host-based pre- and post-execution processing does not support returning some environment variables in its output and setting those environment variables for the job.
  • If a job is in the host-based pre-execution processing stage, sbatchd rejects any signals that are not termination signals and requests that the signal be sent again. If the job is in the host-based post-execution processing stage, job signals are rejected or ignored no matter how JOB_INCLUDE_POSTPROC is defined.
  • You cannot use the default value for JOB_PREPROC_TIMEOUT or JOB_POSTPROC_TIMEOUT for host-based pre- and post-execution processing. Configure a value based on how long it would take for host-based pre- and post-execution processing to run.
  • Checkpointing cannot be performed until host-based pre-execution processing is finished. During that time, sbatchd returns a retry error.
  • Starting with LSF release 9.1.2, host-based pre- and post-execution processing is not run on hosts that are added to a job's allocation by auto-resize.
  • Host-based pre- and post-execution processing treats a lease-in host the same as a local host.
  • If a job with host-based pre- or post-execution processing is dispatched to Windows hosts, the job fails and a pending reason is displayed.
  • Because host-based pre- and post-execution processing is not defined at the job level, MultiCluster forwarded jobs and XL jobs do not use the local queue and application host-based pre- and post-execution processing information. Instead, they follow the remote queue and application configuration.
  • The host-based pre- and post-execution processing feature is supported only by LSF 9.1.2 and later versions.
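
To illustrate the note above about timeouts, an application profile in lsb.applications might set explicit values next to the host-based commands, as in the following sketch. The profile name, script paths, and numeric values are placeholders; confirm the unit and allowed range of each timeout in the parameter reference before choosing real values.

```
Begin Application
NAME                 = parallel_app
# Hypothetical script paths
HOST_PRE_EXEC        = /shared/lsf/scripts/host_pre.sh
HOST_POST_EXEC       = /shared/lsf/scripts/host_post.sh
# Placeholder values; size them to how long the scripts take on all hosts
JOB_PREPROC_TIMEOUT  = 10
JOB_POSTPROC_TIMEOUT = 10
End Application
```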