Using IBM® Spectrum LSF with Andrew File System (AFS)
Learn how LSF integrates with Andrew File System (AFS) so you can configure LSF to suit your needs.
TGT forwarding in LSF
The purpose of TGT (ticket granting ticket) forwarding in LSF is to forward user TGT files from the job submission host to the job execution host.
About this task
Job processes can use this TGT file to assume the identity of the submission user, as illustrated in the following figure:
The TGT file is carried along with job submission from the bsub command to the mbatchd daemon, then to the execution host. Before the user job process is started, the TGT file is set up and the KRB5CCNAME environment variable in the user process is set to point to this file.
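A job process can find the forwarded TGT file by reading the KRB5CCNAME environment variable. The following is a minimal sketch, not part of LSF: the function names are hypothetical, and the list of cache type prefixes is an assumption based on the types that MIT Kerberos documents. A value with no recognized prefix is treated as a plain file path.

```python
import os

# Cache type prefixes assumed for this sketch (as documented by MIT Kerberos).
KNOWN_CCACHE_TYPES = ("FILE", "DIR", "KEYRING", "MEMORY", "KCM", "API")


def parse_ccname(value):
    """Split a KRB5CCNAME value into (cache_type, residual)."""
    prefix, sep, rest = value.partition(":")
    if sep and prefix in KNOWN_CCACHE_TYPES:
        return prefix, rest
    # No recognized type prefix: treat the whole value as a file path.
    return "FILE", value


def forwarded_tgt_file(env=None):
    """Return the path of the file-based credentials cache, or None."""
    env = os.environ if env is None else env
    value = env.get("KRB5CCNAME")
    if not value:
        return None
    cache_type, residual = parse_ccname(value)
    return residual if cache_type == "FILE" else None


if __name__ == "__main__":
    print(forwarded_tgt_file())
```

Inside a job, forwarded_tgt_file() returns the path that the job's Kerberos libraries use, or None if the variable is not set or does not name a file.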
In the following example, user data is stored in NFSv4 with Kerberos protection. A job needs the submission user’s TGT file to access job data. Site policy also dictates that each TGT has a lifetime of 8 hours and a renewal limit of 40 hours. That is, the TGT can be used for a full work day before it needs to be renewed, and it can be renewed for a whole work week.
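The arithmetic behind this policy can be sketched as follows. This is an idealized illustration: the function name is hypothetical, hours are counted as elapsed time from when the TGT is first issued, and each renewal is assumed to happen exactly when the previous lifetime ends. In practice a ticket must be renewed before it expires.

```python
def validity_windows(lifetime_h=8, renew_limit_h=40):
    """Return (start, end) hour pairs covered by the TGT and its renewals."""
    if lifetime_h <= 0:
        raise ValueError("lifetime_h must be positive")
    windows = []
    start = 0
    while start < renew_limit_h:
        # A renewed ticket cannot outlive the renewal limit.
        end = min(start + lifetime_h, renew_limit_h)
        windows.append((start, end))
        start = end
    return windows


if __name__ == "__main__":
    windows = validity_windows()
    print(windows)           # [(0, 8), (8, 16), (16, 24), (24, 32), (32, 40)]
    print(len(windows) - 1)  # 4 renewals after the initial ticket
```

Under these assumptions, the example policy allows the initial ticket plus four renewals before the user must authenticate again.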
Procedure
Results
After the TGT is set up on the execution host, your program can read and write to the NFSv4 volume the same as regular directories. Kerberos logic is handled by underlying system calls, so your job does not need to do anything.
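Although the job needs no Kerberos logic of its own, a job script might still confirm that a usable ticket is present before it starts long-running I/O. The following is a minimal sketch, assuming the MIT Kerberos klist command is on the PATH of the execution host; klist -s produces no output and exits with status 0 only if the credentials cache can be read and has not expired. The function name and the injectable run parameter are hypothetical.

```python
import subprocess


def has_valid_ticket(run=subprocess.run):
    """Return True if `klist -s` reports a readable, unexpired cache."""
    try:
        result = run(["klist", "-s"], check=False)
    except FileNotFoundError:
        # klist is not installed or not on PATH.
        return False
    return result.returncode == 0


if __name__ == "__main__":
    if has_valid_ticket():
        print("Kerberos ticket is valid")
    else:
        print("No valid Kerberos ticket found")
```

The run parameter exists only so that the check can be exercised without a Kerberos installation; a job script would call has_valid_ticket() with no arguments.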
LSF AFS integration
The LSF integration with AFS is effectively an application of LSF TGT forwarding, with one additional step: LSF uses the forwarded TGT file to obtain an AFS token.
About this task
The configuration of the LSF AFS integration covers the following cases:
- The job accesses user data in an AFS volume.
  - The job must have a valid TGT file.
  - The job must use this TGT file to obtain an AFS token.

  This ensures that the job can access user data files in an AFS volume as if they were normal files.
- JOB_SPOOL_DIR is defined in an AFS volume. In this case, the child sbatchd daemon and the job RES need to access the AFS volume to create the job file, job output, error cache, and other files.
  - The child sbatchd daemon and the job RES must have a valid TGT file.
  - The child sbatchd daemon and the job RES must use this TGT file to obtain an AFS token.

  This ensures that the child sbatchd daemon and the job RES can access the JOB_SPOOL_DIR directory as if it were a normal directory.
LSF creates a separate PAG (process authentication group) for user jobs, the child sbatchd, and job RES to maximize the security of user tokens. This operation is depicted in the following figure:
In the following example, user data is stored in an AFS volume. This is different from NFSv4 because an AFS token is needed to access the AFS volume, and that token can be obtained only with a valid TGT file. The job still needs the submission user’s TGT file to be forwarded to the execution host, but LSF must also obtain an AFS token for the job based on this TGT file.
Site policy dictates that each TGT has a lifetime of 8 hours with a renewal limit of 40 hours. That is, the TGT can be used without renewal for a full work day, and it can be renewed for a whole work week. AFS has the additional requirement that after the TGT file is renewed, the AFS token derived from it must be renewed as well.
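The ordering requirement (renew the TGT first, then refresh the AFS token from it) can be sketched as follows. This illustrates the sequence only and is not how LSF implements it. It assumes the MIT Kerberos kinit command and the OpenAFS aklog command are installed; kinit -R requests renewal of the existing TGT, and aklog obtains an AFS token from the current Kerberos credentials. The function name and the run parameter are hypothetical.

```python
import subprocess


def renew_tgt_and_token(run=subprocess.run):
    """Renew the TGT, then refresh the AFS token derived from it.

    Returns None on success, or the name of the command that failed.
    """
    steps = (
        ["kinit", "-R"],  # renew the existing TGT
        ["aklog"],        # obtain a fresh AFS token from the renewed TGT
    )
    for cmd in steps:
        if run(cmd, check=False).returncode != 0:
            # Stop at the first failure: a token cannot be refreshed
            # from a TGT that was not renewed.
            return cmd[0]
    return None


if __name__ == "__main__":
    failed = renew_tgt_and_token()
    print("renewed" if failed is None else f"{failed} failed")
```

Stopping at the first failure keeps the two credentials consistent: aklog is never run against a TGT that could not be renewed.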