IT34900: DISTINCT CONCURRENT BACKUPS STOPPING RESPECTIVELY WITH CTGGA2403 'JOB IS TERMINATED' AND 'MOUNT.NFS: ACCESS DENIED'


APAR status

  • Closed as program error.

Error description

  • In rare cases, starting an IBM Spectrum Protect Plus backup
    that needs the same vSnap volume as an already active backup
    might cause both backup processes to fail, as shown in the
    following job log output examples.
    
    The first backup job's log will display:
    
    SUMMARY,<date> 02:23:15,CTGGA2399,Starting job for policy
            BACKUP with job name <SLA_Name> (ID:<SLA_ID>). id ->
            <JobID_1>. IBM Spectrum Protect Plus version
            10.1.6-2045.
    ...
     DETAIL,<date> 02:26:28,CTGGA0576,Initializing backup of vm:
            <VM_1> (host: <ESXiHost_1>) data center:
            <DatacenterName>
     DETAIL,<date> 02:26:31,CTGGA0671,Backing up VM (<VM_1>) from
            remote proxy (IP: <vADP_IP>  Host name: <vADPHostName>)
       INFO,<date> 02:27:27,CTGGA2110,Will do incremental backup
            for VM <VM_1>
    ...
       INFO,<date> 02:31:30,CTGGA0590,VM: <VM_1> has transferred
            xxx GB ( yy%). Throughput since last update - zzz MB/s
       INFO,<date> 02:31:51,CTGGA2250,Proxy <vADPHostName>
            terminated request for VM <VM_1>
      ERROR,<date> 02:32:03,CTGGA2403,Backup of vm <VM_1> failed
            target storage volume name <vSnapVolumeName>.
            Error: The job is terminated.
    
    The second job, started a few minutes after the first, will
    display:
    
    SUMMARY,<date> 02:29:07,CTGGA2399,Starting job for policy
            BACKUP with job name <SLA_Name> (ID:<SLA_ID>). id ->
            <JobID_2>. IBM Spectrum Protect Plus version
            10.1.6-2045.
    ...
     DETAIL,<date> 02:31:51,CTGGA0576,Initializing backup of vm:
            <VM_2> (host: <ESXiHost_2>) data center:
            <DatacenterName>
     DETAIL,<date> 02:31:52,CTGGA0671,Backing up VM (<VM_2>) from
            remote proxy (IP: <vADP_IP>  Host name: <vADPHostName>)
       INFO,<date> 02:32:59,CTGGA2110,Will do incremental backup
            for VM <VM_2>
    ...
      ERROR,<date> 02:34:49,CTGGA2403,Backup of vm <VM_2> failed
            target storage volume name <vSnapVolumeName>.
            Error: Command finished with error: exit status 32
            desc (mount.nfs: access denied by server while mounting
            <vSnapHostIP>:/vsnap/vpool<x>/fs<yy>) (Mounting failed)
    
    Note that both jobs use the same target storage volume
    <vSnapVolumeName>.
    
    Over the same period, the following sequence is seen in the
    vSnap log:
    
    NFS share 1 is created and updated by JobID_1:
    -----------------------------------------------
    [<timestamp>] INFO pid-<xxxx> vsnap.common.util  AUDIT:
                       serveradmin: Created share id 1 named
                       /vsnap/vpool1/<vSnapVolumeID> for volume
                       id <vSnapVolumeID> named <vSnapVolumeName>
    [<timestamp>] INFO pid-<xxxx> vsnap.common.util  AUDIT:
                       serveradmin: Updated share id 1 named
                       /vsnap/vpool1/<vSnapVolumeID> for volume
                       id <vSnapVolumeID> named <vSnapVolumeName>
    
    NFS share 1 is deleted by JobID_2, which believes it is a
    stale leftover share from a previous backup.
    This triggers JobID_1's failure processing:
    --------------------------------------------------------------
    [<timestamp>] INFO pid-<yyyy> vsnap.common.util  AUDIT:
                       serveradmin: Deleted share id 1 named
                       /vsnap/vpool1/<vSnapVolumeID> for volume
                       id <vSnapVolumeID> named <vSnapVolumeName>
    
    A new NFS share 2 is created for the same volume by JobID_2:
    -------------------------------------------------------------
    [<timestamp>] INFO pid-<zzzz> vsnap.common.util  AUDIT:
                       serveradmin: Created share id 2 named
                       /vsnap/vpool1/<vSnapVolumeID> for volume
                       id <vSnapVolumeID> named <vSnapVolumeName>
    [<timestamp>] INFO pid-<yyyy> vsnap.common.util  AUDIT:
                       serveradmin: Updated share id 2 named
                       /vsnap/vpool1/<vSnapVolumeID> for volume
                       id <vSnapVolumeID> named <vSnapVolumeName>
    
    NFS share 2 is deleted by JobID_1's cleanup process, causing
    JobID_2 to fail:
    ------------------------------------------------------------
    [<timestamp>] INFO pid-<xxxx> vsnap.common.util  AUDIT:
                       serveradmin: Deleted share id 2 named
                       /vsnap/vpool1/<vSnapVolumeID> for volume
                       id <vSnapVolumeID> named <vSnapVolumeName>
    
    The probability of hitting this situation is higher when the
    concurrent jobs back up different guests whose files are
    located on the same VMware datastore or Hyper-V storage
    location.
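
    The race can be illustrated with the following minimal Python
    sketch. All names (run_backup, the in-memory shares table) are
    illustrative assumptions, not actual vSnap code: without
    cross-session tracking, each job treats a pre-existing share
    for its target volume as a stale leftover and deletes it, and
    each job's cleanup deletes whatever share exists for the
    volume, so both concurrent jobs fail.

    import threading
    import time

    shares = {}             # volume_id -> share_id (as on the vSnap server)
    lock = threading.Lock()
    next_share_id = [1]
    failures = []

    def run_backup(job, volume_id, work_seconds):
        with lock:
            if volume_id in shares:
                # BUG: a pre-existing share for the volume is assumed to
                # be a stale leftover and is deleted.
                print(f"{job}: deleted 'stale' share {shares.pop(volume_id)}")
            share_id = next_share_id[0]
            next_share_id[0] += 1
            shares[volume_id] = share_id
            print(f"{job}: created share {share_id}")
        time.sleep(work_seconds)        # data transfer over the NFS share
        with lock:
            current = shares.get(volume_id)
            if current != share_id:
                failures.append(job)    # our share vanished mid-transfer
                print(f"{job}: share {share_id} is gone -> job terminated")
            if current is not None:
                # Cleanup (run even on failure) deletes whatever share
                # exists, which is fatal for a job still mounting it.
                shares.pop(volume_id)
                print(f"{job}: cleanup deleted share {current}")

    t1 = threading.Thread(target=run_backup, args=("JobID_1", "vol-1", 0.2))
    t2 = threading.Thread(target=run_backup, args=("JobID_2", "vol-1", 0.4))
    t1.start(); time.sleep(0.05); t2.start()
    t1.join(); t2.join()
    print("failed jobs:", failures)     # -> ['JobID_1', 'JobID_2']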
    
    
    IBM Spectrum Protect Plus Versions Affected:
    IBM Spectrum Protect Plus 10.1.5 and higher
    
    Initial Impact: Medium
    
    Additional Keywords: SPP, SPPLUS, TS004318249
    

Local fix

  • Avoid starting separate jobs for single VMs within a short
    timespan; wait for each backup job to complete before starting
    another (a sketch of this serialized approach follows below).

    OR

    Run a single backup job that includes all needed guests.
    Within a single backup job, the multiple-guest processing
    avoids the above situation.
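
    As an illustration of the first workaround, the driver below
    serializes per-VM jobs. start_backup and job_is_running are
    hypothetical placeholders for however jobs are started and
    polled in a given environment; they are not product functions.

    import time

    def start_backup(vm_name):
        """Hypothetical placeholder: start a backup job, return its ID."""
        print(f"starting backup of {vm_name}")
        return f"job-{vm_name}"

    def job_is_running(job_id):
        """Hypothetical placeholder: poll the job status."""
        return False    # stubbed; a real check would query the server

    def run_sequentially(vm_names, poll_seconds=30):
        # Never let two jobs hold the same vSnap volume and NFS share
        # at the same time: wait for each job before starting the next.
        for vm in vm_names:
            job_id = start_backup(vm)
            while job_is_running(job_id):
                time.sleep(poll_seconds)
            print(f"{job_id} finished")

    run_sequentially(["VM_1", "VM_2"])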
    

Problem summary

  • ****************************************************************
    * USERS AFFECTED:                                              *
    * IBM Spectrum Protect Plus level 10.1.5, 10.1.6 and 10.1.7.   *
    ****************************************************************
    * PROBLEM DESCRIPTION:                                         *
    * See ERROR DESCRIPTION                                        *
    ****************************************************************
    * RECOMMENDATION:                                              *
    * Apply the fixing level when available. This problem is       *
    * currently projected to be fixed in IBM Spectrum Protect Plus *
    * level 10.1.7 ifix2 and 10.1.8. Note that this is subject to  *
    * change at the discretion of IBM.                             *
    ****************************************************************
    

Problem conclusion

  • Distinct concurrent backups no longer stop, because tracking
    of vSnap NFS share usage across job sessions has been added
    (see the sketch below).
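
    A minimal sketch of the kind of cross-session tracking the fix
    describes (class and method names are assumptions, not actual
    vSnap code): each job registers its use of a volume's share,
    and the share is deleted only when the last registered job
    releases it, so one job's cleanup can no longer remove a share
    another job is still mounting.

    import threading

    class ShareTracker:
        def __init__(self):
            self._lock = threading.Lock()
            self._users = {}        # volume_id -> set of job IDs

        def acquire(self, volume_id, job_id):
            """Register a user; return True if the share must be created."""
            with self._lock:
                users = self._users.setdefault(volume_id, set())
                create = not users  # only the first user creates the share
                users.add(job_id)
                return create

        def release(self, volume_id, job_id):
            """Deregister a user; return True if the share can be deleted."""
            with self._lock:
                users = self._users.get(volume_id, set())
                users.discard(job_id)
                if users:
                    return False    # share still in use by another job
                self._users.pop(volume_id, None)
                return True         # last user gone: safe to delete share

    tracker = ShareTracker()
    print(tracker.acquire("vol-1", "JobID_1"))   # True  -> create share
    print(tracker.acquire("vol-1", "JobID_2"))   # False -> reuse share
    print(tracker.release("vol-1", "JobID_1"))   # False -> keep share
    print(tracker.release("vol-1", "JobID_2"))   # True  -> delete share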
    

Temporary fix

Comments

APAR Information

  • APAR number

    IT34900

  • Reported component name

    SP PLUS

  • Reported component ID

    5737SPLUS

  • Reported release

    A16

  • Status

    CLOSED PER

  • PE

    NoPE

  • HIPER

    NoHIPER

  • Special Attention

    NoSpecatt / Xsystem

  • Submitted date

    2020-11-12

  • Closed date

    2021-02-09

  • Last modified date

    2021-02-09

  • APAR is sysrouted FROM one or more of the following:

  • APAR is sysrouted TO one or more of the following:

Fix information

  • Fixed component name

    SP PLUS

  • Fixed component ID

    5737SPLUS

Applicable component levels

[{"Business Unit":{"code":"BU058","label":"IBM Infrastructure w\/TPS"},"Product":{"code":"SSNQFQ","label":"IBM Spectrum Protect Plus"},"Platform":[{"code":"PF025","label":"Platform Independent"}],"Version":"A16","Line of Business":{"code":"LOB26","label":"Storage"}}]
