IBM Support

IT34674: GUEST BACKUPS UNSUCCESSFUL WITH MESSAGES "VOLUME NOT FOUND" OR "UNKNOWN"


APAR status

  • Closed as program error.

Error description

  • In rare cases, when an IBM Spectrum Protect Plus backup job runs
    for a long time, a vSnap volume might be deleted unexpectedly
    if the Maintenance job runs while the backup is still active.
    One of the actions done by the Maintenance job is to clean up
    expired snapshots according to the defined retention rules for
    the SLA.
    If the expiration process deletes the last remaining snapshot
    for a volume, then the Maintenance job will also delete that
    volume.
    If the deleted volume is one that the long-running backup job
    has selected as a target, one of the following two types of
    errors can be seen for the guests whose data was to be stored
    on that volume.
    
    1. The backup job successfully completed the data sending phase
       to the vSnap volume and is waiting to create a snapshot for
       that volume to commit the data.
       By design, this ZFS volume snapshot is scheduled at the end
       of the job, after the data sending phase has completed for
       all the guests selected for the backup.
       If the volume is deleted by the Maintenance job before the
       snapshot can be created, the following error is written to
       the job log:
       CTGGA0076,Unprotected VM:  <VMName>. Last error: Unknown
    
    2. The guest is actively sending data, or is still waiting in
       the queue to send its backup data, when the target vSnap
       volume is deleted by the Maintenance job.
       In that case, the following error is seen:
       CTGGA0076,Unprotected VM:  <VMName>. Last error: [Unable to
                 update access for volume Object not found on
                 Vsnap : 404 NOT FOUND]
    
    In the virgo log found in the Spectrum Protect Plus appliance
    log bundle covering the observed period, entries similar to the
    following example can be seen:
    
    Job assigning VM to Volume:
    
    [<timestamp>] INFO .. <BackupJobID> volumeInfo : <vSnapHostID>
                                        .volume.<VolumeID> group
                                        <groupID>
    .. Returning target volume for VM: <VMName>
    .. destVolume <VolumeName>
    .. Vsnap Call https://<vSnap FQDN>:8900/api/volume/<VolumeID>/
             path?path=<PathName>/<VMName>.vm-<VMMobID> method GET
    .. Generating key for vsnap storage folder on server
       <vSnap FQDN> volume null path null
    .. add destinationStorageVolumesInfo xxxxx
    ...
    [<timestamp>] INFO .. <BackupJobID> volumeInfo :
                  <vSnapHostID>.volume.<VolumeID> group <groupID>
    
    Maintenance job deleting the volume after the last remaining
    snapshot expired:
    
    [<timestamp>] INFO .. <MaintenanceJobID> Vsnap Call
                  https://<vSnap FQDN>:8900/api/snapshot/
                  <SnapshotID> method GET
    .. Expiring retention snapshot <SnapshotName> using from
       protectionInfo
    .. Expiring retention snapshot <SnapshotName> using catalog
       manager
    .. Expiring retention snapshot <SnapshotName> from storage
       controller <vSnap FQDN> for policy vmware_infra-daily
    .. Vsnap Call https://<vSnap FQDN>:8900/api/snapshot/
       <SnapshotID> method DELETE
    .. Checking if volume can be deleted for snapshot <SnapshotName>
    .. Catalog volume size returned for policy vmware_infra-daily
       size 1
    .. Checking volume <VolumeName>
    .. Storage Cache :::: Get Volume  <vSnapHostID>:<VolumeID>
    .. Vsnap Call https://<vSnap FQDN>:8900/api/volume/<VolumeID>/
       snapshot method GET
    .. Deleting volume <VolumeName> Id(<VolumeID>). No remaining
       snapshots found on volume for policy vmware_infra-daily
    .. Vsnap Call https://<vSnap FQDN>:8900/api/volume/<VolumeID>
       ?force=true method DELETE
    
    
    IBM Spectrum Protect Plus Versions Affected:
    IBM Spectrum Protect Plus 10.1.x
    
    Initial Impact: Medium
    
    Additional Keywords: SPP, SPPLUS, TS004357635, maintenance,
                         not found, partial
    

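The two job log signatures and the virgo log deletion entry described above can be searched for with a short script. The following is a minimal sketch and not an IBM tool: the function names classify_unprotected_vm and find_volume_deletions are hypothetical, and the regular expressions assume that each message appears on a single line with the wording quoted in this APAR. Real logs may wrap lines or use different wording.

```python
import re

# Patterns based on the message wording quoted in this APAR.
# Assumption: each message is on one line in the log being scanned.
UNPROTECTED = re.compile(
    r"CTGGA0076,Unprotected VM:\s*(?P<vm>.+?)\. Last error: (?P<err>.*)"
)
DELETING = re.compile(
    r"Deleting volume (?P<name>\S+) Id\((?P<id>[^)]+)\)\. "
    r"No remaining\s+snapshots found"
)


def classify_unprotected_vm(line):
    """Return (vm_name, case) for a CTGGA0076 line, or None if no match.

    case 1: data was sent, volume deleted before the snapshot ("Unknown")
    case 2: volume deleted while sending or queued ("404 NOT FOUND")
    case None: some other failure that this APAR does not describe
    """
    m = UNPROTECTED.search(line)
    if m is None:
        return None
    err = m.group("err")
    if "404 NOT FOUND" in err:
        case = 2
    elif err.strip() == "Unknown":
        case = 1
    else:
        case = None
    return m.group("vm"), case


def find_volume_deletions(lines):
    """Yield (volume_name, volume_id) for each volume deletion entry."""
    for line in lines:
        m = DELETING.search(line)
        if m:
            yield m.group("name"), m.group("id")
```

A volume ID returned by find_volume_deletions can then be compared with the volume ID in the "volumeInfo" entries of the backup job to confirm that the deleted volume was the backup target.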
Local fix

  • Ensure that the defined retention period is long enough to
    prevent the last snapshot from being deleted before a new
    backup version is created.
    OR
    Prevent the Maintenance job from starting during a
    long-running backup job.
    If necessary, pause the Maintenance job and release it after
    the backups are completed.
    
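Both workarounds reduce to simple time comparisons, which can be sketched as follows. This is a simplified model for planning purposes only: the function names are hypothetical, retention is treated as a plain time span, and the actual expiration logic in IBM Spectrum Protect Plus may differ.

```python
from datetime import datetime, timedelta


def last_snapshot_survives(last_snapshot_time, retention, next_backup_end):
    """Return True if the last snapshot is still within retention when the
    next backup is expected to finish and create its own snapshot."""
    expires_at = last_snapshot_time + retention
    return expires_at > next_backup_end


def windows_overlap(backup_start, backup_end, maintenance_start, maintenance_end):
    """Return True if the Maintenance job window intersects the backup window."""
    return backup_start < maintenance_end and maintenance_start < backup_end
```

If last_snapshot_survives returns False for the expected schedule, the retention period is too short for the first workaround. If windows_overlap returns True, the Maintenance job is a candidate for being paused or rescheduled.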

Problem summary

  • ****************************************************************
    * USERS AFFECTED:                                              *
    * IBM Spectrum Protect Plus level 10.1.6 and 10.1.7            *
    ****************************************************************
    * PROBLEM DESCRIPTION:                                         *
    * See ERROR DESCRIPTION                                        *
    ****************************************************************
    * RECOMMENDATION:                                              *
    * Apply fixing level when available. This problem is currently *
    * projected to be fixed in IBM Spectrum Protect Plus level     *
    * 10.1.7 ifix2 and 10.1.8. Note that this is subject to change *
    * at the discretion of IBM                                     *
    ****************************************************************
    

Problem conclusion

  • The issue is fixed by ensuring that a volume that is in use by
    a backup is not removed by maintenance, even if there are no
    snapshots for that volume.
    
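The corrected behavior can be illustrated with a small sketch. This is not the product's implementation: the function name volumes_to_delete and its data structures are hypothetical and only model the rule stated above, that a volume without snapshots is deleted only if no running backup is using it.

```python
def volumes_to_delete(snapshots_by_volume, volumes_in_use):
    """Return the volumes that maintenance may delete after expiration.

    snapshots_by_volume: dict mapping volume ID -> list of remaining snapshots
    volumes_in_use: set of volume IDs targeted by a running backup job
    """
    return [
        vol
        for vol, snaps in snapshots_by_volume.items()
        # Delete only if no snapshots remain AND no active backup uses it.
        if not snaps and vol not in volumes_in_use
    ]
```

Before the fix, the behavior described in this APAR corresponds to the same check without the volumes_in_use condition.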

Temporary fix

Comments

APAR Information

  • APAR number

    IT34674

  • Reported component name

    SP PLUS

  • Reported component ID

    5737SPLUS

  • Reported release

    A16

  • Status

    CLOSED PER

  • PE

    NoPE

  • HIPER

    NoHIPER

  • Special Attention

    NoSpecatt / Xsystem

  • Submitted date

    2020-10-26

  • Closed date

    2021-02-11

  • Last modified date

    2021-02-11

  • APAR is sysrouted FROM one or more of the following:

  • APAR is sysrouted TO one or more of the following:

Fix information

  • Fixed component name

    SP PLUS

  • Fixed component ID

    5737SPLUS

Applicable component levels


Document Information

Modified date:
31 January 2024