-k
Makes a job checkpointable and specifies the checkpoint directory.
Categories
properties
Synopsis
bsub -k "checkpoint_dir [init=initial_checkpoint_period] [checkpoint_period] [method=method_name]"
Description
Specify a relative or absolute path name for the checkpoint directory. The quotes (") are required if you specify a checkpoint period, an initial checkpoint period, or a custom checkpoint and restart method name.
The job ID and job file name are concatenated to the checkpoint directory when creating a checkpoint file.
When a job is checkpointed, the checkpoint information is stored in checkpoint_dir/job_ID/file_name. Multiple jobs can checkpoint into the same directory. The system can create multiple files.
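The layout described above can be sketched as a small shell helper. This is only an illustration of how the three path components combine; the function name checkpoint_path and the sample directory, job ID, and file name are hypothetical, not part of LSF.

```shell
# Sketch: compose checkpoint_dir/job_ID/file_name as described above.
# checkpoint_path is a hypothetical helper, not an LSF command.
checkpoint_path() {
    # $1 = checkpoint_dir, $2 = job_ID, $3 = file_name
    # ${1%/} drops one trailing slash so the result has no "//".
    printf '%s/%s/%s\n' "${1%/}" "$2" "$3"
}

# Hypothetical values for illustration only.
checkpoint_path /scratch/ckpt 1234 job.chk
```

Because the job ID is part of the path, two jobs that share /scratch/ckpt would write under different subdirectories.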
The checkpoint directory is used for restarting the job (see brestart(1)). The checkpoint directory can be any valid path.
Optionally, specifies a checkpoint period in minutes. Specify a positive integer. The running job is checkpointed automatically once every checkpoint period. You can change the checkpoint period with bchkpnt. Because checkpointing is a heavyweight operation, choose a checkpoint period greater than 30 minutes.
Optionally, specifies an initial checkpoint period in minutes. Specify a positive integer. The first checkpoint does not happen until the initial period has elapsed. After the first checkpoint, the checkpoint frequency is controlled by the checkpoint period.
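As a minimal sketch of how the directory, initial period, and period fit into one quoted value, the helper below assembles the argument in the order given in the synopsis and rejects anything that is not a positive integer. The function names k_value and is_positive_int, the directory /scratch/ckpt, and the job command ./my_job are assumptions made for illustration. The final line prints the command rather than submitting it.

```shell
# Return success only for a positive integer (digits only, greater than 0).
is_positive_int() {
    case $1 in
        ''|*[!0-9]*) return 1 ;;
    esac
    [ "$1" -gt 0 ]
}

# Hypothetical helper, not part of LSF.
# Usage: k_value CHECKPOINT_DIR [INIT_MINUTES] [PERIOD_MINUTES]
# Pass an empty string to skip INIT_MINUTES.
k_value() {
    value=$1
    if [ -n "${2:-}" ]; then
        is_positive_int "$2" || { echo "init must be a positive integer" >&2; return 1; }
        value="$value init=$2"
    fi
    if [ -n "${3:-}" ]; then
        is_positive_int "$3" || { echo "period must be a positive integer" >&2; return 1; }
        value="$value $3"
    fi
    printf '%s\n' "$value"
}

# Dry run with hypothetical values: first checkpoint after 60 minutes,
# then one checkpoint every 240 minutes.
echo "bsub -k \"$(k_value /scratch/ckpt 60 240)\" ./my_job"
```

The value is wrapped in quotes in the printed command because it contains more than the directory name.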
Optionally, specifies a custom checkpoint and restart method to use with the job. The echkpnt.method_name and erestart.method_name programs must be in LSF_SERVERDIR or in the directory specified by LSB_ECHKPNT_METHOD_DIR (set as an environment variable or in lsf.conf).
If a custom checkpoint and restart method is already specified with LSB_ECHKPNT_METHOD (environment variable or in lsf.conf), the method you specify with bsub -k overrides this.
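The precedence rule above can be illustrated with a short sketch. It only models the rule as stated here and is not how LSF implements it; the function effective_method and the method names myapp and site_method are hypothetical.

```shell
# Sketch of the stated precedence: a method given with bsub -k wins over
# LSB_ECHKPNT_METHOD. Prints an empty line when neither is set (the case
# where LSF uses its own echkpnt is not modeled here).
effective_method() {
    # $1 = method_name from bsub -k "... method=method_name" (may be empty)
    if [ -n "${1:-}" ]; then
        printf '%s\n' "$1"
    else
        printf '%s\n' "${LSB_ECHKPNT_METHOD:-}"
    fi
}

LSB_ECHKPNT_METHOD=site_method   # hypothetical site-wide setting
effective_method myapp           # prints: myapp
effective_method ""              # prints: site_method
```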
Process checkpointing is not available on all host types, and may require linking programs with special libraries (see libckpt.a(3)). LSF invokes the echkpnt program (see echkpnt(8)) found in LSF_SERVERDIR to checkpoint the job. To override the default echkpnt for the job, define LSB_ECHKPNT_METHOD and LSB_ECHKPNT_METHOD_DIR, either as environment variables or in lsf.conf, so that they point to your own echkpnt. This allows you to use other checkpointing facilities, including application-level checkpointing.
The checkpoint method directory should be accessible by all users who need to run the custom echkpnt and erestart programs.
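Before submitting a job with a custom method, it can help to confirm that both programs are present and executable in the method directory. The following is a minimal sketch under stated assumptions: check_method_dir is a hypothetical helper, myapp is a hypothetical method name, and the check covers only the user who runs it, not every user who needs access.

```shell
# Hypothetical helper, not part of LSF.
# Usage: check_method_dir DIRECTORY METHOD_NAME
# Succeeds only if echkpnt.METHOD_NAME and erestart.METHOD_NAME both exist
# in DIRECTORY and are executable by the current user.
check_method_dir() {
    dir=$1
    method=$2
    for prog in "echkpnt.$method" "erestart.$method"; do
        if [ ! -x "$dir/$prog" ]; then
            echo "missing or not executable: $dir/$prog" >&2
            return 1
        fi
    done
    return 0
}

# Example with a hypothetical method name; the directory is passed
# explicitly and falls back to a placeholder if the variable is unset.
if check_method_dir "${LSB_ECHKPNT_METHOD_DIR:-/path/to/methods}" myapp; then
    echo "custom method programs found"
else
    echo "custom method programs not usable"
fi
```

The directory is an explicit argument so the same check can be pointed at LSF_SERVERDIR or at the directory named by LSB_ECHKPNT_METHOD_DIR.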
Only running members of a chunk job can be checkpointed.