IT34900: DISTINCT CONCURRENT BACKUPS STOPPING RESPECTIVELY WITH CTGGA2403 'JOB IS TERMINATED' AND 'MOUNT.NFS: ACCESS DENIED'


APAR status

  • Closed as program error.

Error description

  • In rare cases, starting an IBM Spectrum Protect Plus backup
    that needs the same vSnap volume as an already active backup
    might cause both backup processes to fail, as shown in the
    following job log output examples.
    
    The first backup job's log will display:
    
    SUMMARY,<date> 02:23:15,CTGGA2399,Starting job for policy
            BACKUP with job name <SLA_Name> (ID:<SLA_ID>). id ->
            <JobID_1>. IBM Spectrum Protect Plus version
            10.1.6-2045.
    ...
     DETAIL,<date> 02:26:28,CTGGA0576,Initializing backup of vm:
            <VM_1> (host: <ESXiHost_1>) data center:
            <DatacenterName>
     DETAIL,<date> 02:26:31,CTGGA0671,Backing up VM (<VM_1>) from
            remote proxy (IP: <vADP_IP>  Host name: <vADPHostName>)
       INFO,<date> 02:27:27,CTGGA2110,Will do incremental backup
            for VM <VM_1>
    ...
       INFO,<date> 02:31:30,CTGGA0590,VM: <VM_1> has transferred
            xxx GB ( yy%). Throughput since last update - zzz MB/s
       INFO,<date> 02:31:51,CTGGA2250,Proxy <vADPHostName>
            terminated request for VM <VM_1>
      ERROR,<date> 02:32:03,CTGGA2403,Backup of vm <VM_1> failed
            target storage volume name <vSnapVolumeName>.
            Error: The job is terminated.
    
    The second job, started a few minutes after the first, will
    display:
    
    SUMMARY,<date> 02:29:07,CTGGA2399,Starting job for policy
            BACKUP with job name <SLA_Name> (ID:<SLA_ID>). id ->
            <JobID_2>. IBM Spectrum Protect Plus version
            10.1.6-2045.
    ...
     DETAIL,<date> 02:31:51,CTGGA0576,Initializing backup of vm:
            <VM_2> (host: <ESXiHost_2>) data center:
            <DatacenterName>
     DETAIL,<date> 02:31:52,CTGGA0671,Backing up VM (<VM_2>) from
            remote proxy (IP: <vADP_IP>  Host name: <vADPHostName>)
       INFO,<date> 02:32:59,CTGGA2110,Will do incremental backup
            for VM <VM_2>
    ...
      ERROR,<date> 02:34:49,CTGGA2403,Backup of vm <VM_2> failed
            target storage volume name <vSnapVolumeName>.
            Error: Command finished with error: exit status 32
            desc (mount.nfs: access denied by server while mounting
            <vSnapHostIP>:/vsnap/vpool<x>/fs<yy>) (Mounting failed)
    
    Note that both jobs use the same target storage volume
    <vSnapVolumeName>.
    
    Over the same period, the following sequence is seen in the
    vSnap log:
    
    NFS share 1 is created and updated by JobID_1:
    -----------------------------------------------
    [<timestamp>] INFO pid-<xxxx> vsnap.common.util  AUDIT:
                       serveradmin: Created share id 1 named
                       /vsnap/vpool1/<vSnapVolumeID> for volume
                       id <vSnapVolumeID> named <vSnapVolumeName>
    [<timestamp>] INFO pid-<xxxx> vsnap.common.util  AUDIT:
                       serveradmin: Updated share id 1 named
                       /vsnap/vpool1/<vSnapVolumeID> for volume
                       id <vSnapVolumeID> named <vSnapVolumeName>
    
    NFS share 1 is deleted by JobID_2, which believes it is a
    stale leftover share from a previous backup.
    This triggers JobID_1's failure processing:
    --------------------------------------------------------------
    [<timestamp>] INFO pid-<yyyy> vsnap.common.util  AUDIT:
                       serveradmin: Deleted share id 1 named
                       /vsnap/vpool1/<vSnapVolumeID> for volume
                       id <vSnapVolumeID> named <vSnapVolumeName>
    
    A new NFS share 2 is created for the same volume by JobID_2:
    -------------------------------------------------------------
    [<timestamp>] INFO pid-<zzzz> vsnap.common.util  AUDIT:
                       serveradmin: Created share id 2 named
                       /vsnap/vpool1/<vSnapVolumeID> for volume
                       id <vSnapVolumeID> named <vSnapVolumeName>
    [<timestamp>] INFO pid-<yyyy> vsnap.common.util  AUDIT:
                       serveradmin: Updated share id 2 named
                       /vsnap/vpool1/<vSnapVolumeID> for volume
                       id <vSnapVolumeID> named <vSnapVolumeName>
    
    NFS share 2 is deleted by JobID_1's cleanup process, causing
    JobID_2 to fail:
    ------------------------------------------------------------
    [<timestamp>] INFO pid-<xxxx> vsnap.common.util  AUDIT:
                       serveradmin: Deleted share id 2 named
                       /vsnap/vpool1/<vSnapVolumeID> for volume
                       id <vSnapVolumeID> named <vSnapVolumeName>
    
    The probability of hitting this situation is higher when the
    concurrent jobs back up different guests whose files are
    located on the same VMware datastore or Hyper-V storage
    location.
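
    The race can be illustrated with the following minimal Python
    sketch. All names (run_backup, the in-memory shares table) are
    illustrative assumptions, not actual vSnap code: without
    cross-session tracking, each job treats a pre-existing share
    for its target volume as a stale leftover and deletes it, and
    each job's cleanup deletes whatever share exists for the
    volume, so both concurrent jobs fail.

    import threading
    import time

    shares = {}             # volume_id -> share_id (as on the vSnap server)
    lock = threading.Lock()
    next_share_id = [1]
    failures = []

    def run_backup(job, volume_id, work_seconds):
        with lock:
            if volume_id in shares:
                # BUG: a pre-existing share for the volume is assumed to
                # be a stale leftover and is deleted.
                print(f"{job}: deleted 'stale' share {shares.pop(volume_id)}")
            share_id = next_share_id[0]
            next_share_id[0] += 1
            shares[volume_id] = share_id
            print(f"{job}: created share {share_id}")
        time.sleep(work_seconds)        # data transfer over the NFS share
        with lock:
            current = shares.get(volume_id)
            if current != share_id:
                failures.append(job)    # our share vanished mid-transfer
                print(f"{job}: share {share_id} is gone -> job terminated")
            if current is not None:
                # Cleanup (run even on failure) deletes whatever share
                # exists, which is fatal for a job still mounting it.
                shares.pop(volume_id)
                print(f"{job}: cleanup deleted share {current}")

    t1 = threading.Thread(target=run_backup, args=("JobID_1", "vol-1", 0.2))
    t2 = threading.Thread(target=run_backup, args=("JobID_2", "vol-1", 0.4))
    t1.start(); time.sleep(0.05); t2.start()
    t1.join(); t2.join()
    print("failed jobs:", failures)     # -> ['JobID_1', 'JobID_2']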
    
    
    IBM Spectrum Protect Plus Versions Affected:
    IBM Spectrum Protect Plus 10.1.5 and higher
    
    Initial Impact: Medium
    
    Additional Keywords: SPP, SPPLUS, TS004318249
    

Local fix

  • Avoid starting separate jobs for single VMs within a short
    timespan; wait for each backup job to complete before starting
    another (a sketch of this serialized approach follows below).

    OR

    Run a single backup job that includes all needed guests.
    Within a single backup job, the multiple-guest processing
    avoids the above situation.
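
    As an illustration of the first workaround, the driver below
    serializes per-VM jobs. start_backup and job_is_running are
    hypothetical placeholders for however jobs are started and
    polled in a given environment; they are not product functions.

    import time

    def start_backup(vm_name):
        """Hypothetical placeholder: start a backup job, return its ID."""
        print(f"starting backup of {vm_name}")
        return f"job-{vm_name}"

    def job_is_running(job_id):
        """Hypothetical placeholder: poll the job status."""
        return False    # stubbed; a real check would query the server

    def run_sequentially(vm_names, poll_seconds=30):
        # Never let two jobs hold the same vSnap volume and NFS share
        # at the same time: wait for each job before starting the next.
        for vm in vm_names:
            job_id = start_backup(vm)
            while job_is_running(job_id):
                time.sleep(poll_seconds)
            print(f"{job_id} finished")

    run_sequentially(["VM_1", "VM_2"])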
    

Problem summary

  • ****************************************************************
    * USERS AFFECTED:                                              *
    * IBM Spectrum Protect Plus level 10.1.5, 10.1.6 and 10.1.7.   *
    ****************************************************************
    * PROBLEM DESCRIPTION:                                         *
    * See ERROR DESCRIPTION                                        *
    ****************************************************************
    * RECOMMENDATION:                                              *
    * Apply the fixing level when available. This problem is       *
    * currently projected to be fixed in IBM Spectrum Protect Plus *
    * level 10.1.7 ifix2 and 10.1.8. Note that this is subject to  *
    * change at the discretion of IBM.                             *
    ****************************************************************
    

Problem conclusion

  • Distinct concurrent backups no longer stop, because tracking
    of vSnap NFS share usage across job sessions has been added
    (see the sketch below).
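
    A minimal sketch of the kind of cross-session tracking the fix
    describes (class and method names are assumptions, not actual
    vSnap code): each job registers its use of a volume's share,
    and the share is deleted only when the last registered job
    releases it, so one job's cleanup can no longer remove a share
    another job is still mounting.

    import threading

    class ShareTracker:
        def __init__(self):
            self._lock = threading.Lock()
            self._users = {}        # volume_id -> set of job IDs

        def acquire(self, volume_id, job_id):
            """Register a user; return True if the share must be created."""
            with self._lock:
                users = self._users.setdefault(volume_id, set())
                create = not users  # only the first user creates the share
                users.add(job_id)
                return create

        def release(self, volume_id, job_id):
            """Deregister a user; return True if the share can be deleted."""
            with self._lock:
                users = self._users.get(volume_id, set())
                users.discard(job_id)
                if users:
                    return False    # share still in use by another job
                self._users.pop(volume_id, None)
                return True         # last user gone: safe to delete share

    tracker = ShareTracker()
    print(tracker.acquire("vol-1", "JobID_1"))   # True  -> create share
    print(tracker.acquire("vol-1", "JobID_2"))   # False -> reuse share
    print(tracker.release("vol-1", "JobID_1"))   # False -> keep share
    print(tracker.release("vol-1", "JobID_2"))   # True  -> delete share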
    

Temporary fix

Comments

APAR Information

  • APAR number

    IT34900

  • Reported component name

    SP PLUS

  • Reported component ID

    5737SPLUS

  • Reported release

    A16

  • Status

    CLOSED PER

  • PE

    NoPE

  • HIPER

    NoHIPER

  • Special Attention

    NoSpecatt / Xsystem

  • Submitted date

    2020-11-12

  • Closed date

    2021-02-09

  • Last modified date

    2021-02-09

  • APAR is sysrouted FROM one or more of the following:

  • APAR is sysrouted TO one or more of the following:

Fix information

  • Fixed component name

    SP PLUS

  • Fixed component ID

    5737SPLUS

Applicable component levels

[{"Business Unit":{"code":"BU058","label":"IBM Infrastructure w\/TPS"},"Product":{"code":"SSNQFQ","label":"IBM Spectrum Protect Plus"},"Platform":[{"code":"PF025","label":"Platform Independent"}],"Version":"A16","Line of Business":{"code":"LOB26","label":"Storage"}}]
