IBM Support

IT35569: "CTGGA1986 UNABLE TO RESOLVE DATABASE STORAGE. REASON COULD NOT CONNECT TO STORAGE <VSNAPHOST>" INTERMITTENT MESSAGE

APAR status

  • Closed as program error.

Error description

  • Intermittently, jobs can stop because the IBM Spectrum Protect
    Plus vSnap server version 10.1.7 has insufficient available
    memory.
    When the workload is too high, or the server is sized too small
    for the jobs applied to it, some jobs might stop with a message
    indicating that the vSnap server is unreachable.
    The problem can also surface, although less frequently, when
    the host has sufficient resources.
    The following example messages might be seen (in this case for
    a replication job for a database application):
    
    SUMMARY,<timestamp>,CTGGA2398,Starting job for policy
            <PolicyName> (ID:<PolicyID>). id -> <JobID>. IBM
            Spectrum Protect Plus version 10.1.7-3043.
    ...
      ERROR,<timestamp>,CTGGA1986,Unable to resolve database
            storage. Reason Could not connect to storage
            <vSnapHost>. Be sure storage is reachable.
      ERROR,<timestamp>,CTGGA1847,Unable to determine backup policy
            name from recovery points. Cannot proceed with job
      ERROR,<timestamp>,CTGGA1953,Error during copy. Reason:
            DB_REPLICATION_EXCEPTION_OCCURRED
    
    In the virgo log, '/opt/virgo/serviceability/logs/log.log', the
    following corresponding messages are seen for some HTTPS
    requests to the vSnap host:
    
    [<timestamp>] INFO  .. Vsnap Call https://<vSnapHost>:8900/
                           api/system method GET
    [<timestamp>] INFO  .. VSnap Call GET https://<vSnapHost>:8900/
                           api/system time Taken 1338 ms
    [<timestamp>] INFO  .. reason : org.springframework.web.client.
                           HttpClientErrorException: 401
                           UNAUTHORIZED
    [<timestamp>] INFO  .. Status: :: 401
    [<timestamp>] ERROR .. Unable to resolve database storage.
                           Reason Could not connect to storage
                           <vSnapHost>. Be sure storage is
                           reachable.
    
    In the vSnap log, the root cause (insufficient memory) is
    seen:
    
    [<timestamp>] ERROR pid-xxxx vsnap.api  Traceback (most recent
                        call last):
      File "/src/workspace/vsnap/api/core/common.py", line 51,
           in decorated
      File "/src/workspace/vsnap/common/util.py", line 228,
           in check_api_priv
      File "/src/workspace/vsnap/linux/system.py", line 406,
           in run_shell_command
      File "/usr/lib64/python3.6/subprocess.py", line 729,
           in __init__
        restore_signals, start_new_session)
      File "/usr/lib64/python3.6/subprocess.py", line 1295,
           in _execute_child
        restore_signals, start_new_session, preexec_fn)
    OSError: [Errno 12] Cannot allocate memory
    
    As the workload fluctuates and sufficient memory becomes
    available again, the API on the vSnap host can again fulfill
    HTTPS requests and jobs can complete.
    IBM Spectrum Protect Plus messaging should report the actual
    root cause of the failure.
    
    IBM Spectrum Protect Plus Versions Affected:
    IBM Spectrum Protect Plus 10.1.7
    
    | MDVREGR 10.1.6 5737SPLUS |
    
    Initial Impact: Medium
    
    Additional Keywords: SPP, SPPLUS, TS004708004, memory, sizing
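
The correlation described in the error description, a 401 symptom in the virgo log and an Errno 12 root cause in the vSnap log, can be checked with a small script. The following is a minimal Python sketch, not an IBM-provided tool. The two search patterns are taken from the example messages in this APAR. The names `count_matches` and `scan_file` are hypothetical, and because the location of the vSnap log is not stated here, the caller must supply the path.

```python
import re
from typing import Iterable, Pattern

# Patterns copied from the example messages in this APAR.
# Root cause, as seen in the vSnap log:
OOM_PATTERN = re.compile(r"OSError: \[Errno 12\] Cannot allocate memory")
# Symptom, as seen in the virgo log on the IBM Spectrum Protect Plus server:
UNAUTHORIZED_PATTERN = re.compile(r"HttpClientErrorException:\s*401")


def count_matches(lines: Iterable[str], pattern: Pattern[str]) -> int:
    """Return the number of lines that contain the given pattern."""
    return sum(1 for line in lines if pattern.search(line))


def scan_file(path: str, pattern: Pattern[str]) -> int:
    """Count matching lines in a log file (hypothetical helper).

    Undecodable bytes are replaced so that a damaged log line
    does not stop the scan.
    """
    with open(path, "r", errors="replace") as handle:
        return count_matches(handle, pattern)
```

For example, `scan_file("/opt/virgo/serviceability/logs/log.log", UNAUTHORIZED_PATTERN)` counts the 401 responses in the virgo log. A nonzero count for `OOM_PATTERN` in the vSnap log over the same time window suggests that the 401 responses are caused by this problem rather than by incorrect credentials.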
    

Local fix

  • 1. Ensure that the vSnap server is sized according to the best
       practices listed in the Blueprints:
       https://www.ibm.com/support/pages/ibm-spectrum-protect-plus-blueprints
    2. To recover:
       Run the following command on the vSnap host as the
       'serveradmin' user:
           sudo systemctl restart vsnap-api
       This command might take a long time to complete.
       OR
       Reboot the vSnap host.
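
The recovery step can be paired with a check of available memory on the vSnap host. The following is a minimal Python sketch under stated assumptions: the 5% threshold and all function names are hypothetical and are not part of the product. Only the restart command itself comes from this APAR. The check reads the standard Linux `MemTotal` and `MemAvailable` fields from `/proc/meminfo`.

```python
import subprocess
from typing import Dict

# Hypothetical threshold: treat less than 5% available memory as low.
LOW_MEMORY_FRACTION = 0.05


def parse_meminfo(text: str) -> Dict[str, int]:
    """Parse /proc/meminfo content into a {field: value} dictionary.

    Most values are reported by the kernel in kB.
    """
    values: Dict[str, int] = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue
        fields = rest.split()
        if fields and fields[0].isdigit():
            values[key.strip()] = int(fields[0])
    return values


def available_fraction(meminfo: Dict[str, int]) -> float:
    """Return MemAvailable as a fraction of MemTotal."""
    return meminfo["MemAvailable"] / meminfo["MemTotal"]


def is_memory_low(meminfo: Dict[str, int],
                  threshold: float = LOW_MEMORY_FRACTION) -> bool:
    """Return True when the available fraction is below the threshold."""
    return available_fraction(meminfo) < threshold


def restart_vsnap_api() -> None:
    """Run the restart command from this APAR.

    No timeout is set because the command might take a long time.
    """
    subprocess.run(["sudo", "systemctl", "restart", "vsnap-api"], check=True)
```

A caller might read `/proc/meminfo`, pass the text to `parse_meminfo`, and call `restart_vsnap_api` only when `is_memory_low` returns True. When memory is already exhausted, starting the restart command can itself fail with the same Errno 12, in which case rebooting the vSnap host remains the fallback.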
    

Problem summary

  • ****************************************************************
    * USERS AFFECTED:                                              *
    * IBM Spectrum Protect Plus level 10.1.7.                      *
    ****************************************************************
    * PROBLEM DESCRIPTION:                                         *
    * See Error Description.                                       *
    ****************************************************************
    * RECOMMENDATION:                                              *
    * Apply the fixing level when available. This problem was      *
    * fixed in IBM Spectrum Protect Plus levels 10.1.7 ifix2 and   *
    * 10.1.8. Note that this is subject to change at the           *
    * discretion of IBM.                                           *
    ****************************************************************
    

Problem conclusion

  • A memory leak in the vSnap API process used for handling PAM
    authentication caused RAM usage to increase over time.
    Eventually, new memory could no longer be allocated, which
    resulted in authentication failures when the IBM Spectrum
    Protect Plus server made API requests to the vSnap server. The
    problem has been resolved by correcting the memory leak.
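
On a level without the fix, a leak of this kind shows up as steady growth in the resident memory of the affected process. The following is a minimal Python sketch of one way to estimate that growth, not the method IBM used to diagnose the problem. The function names are hypothetical, and the process ID of the vSnap API process is not stated in this APAR, so the operator must determine it. The sketch reads the standard Linux `VmRSS` field from `/proc/<pid>/status` and fits a least-squares line through the samples.

```python
from typing import List, Optional, Tuple


def parse_vmrss_kb(status_text: str) -> Optional[int]:
    """Extract VmRSS (in kB) from the content of /proc/<pid>/status.

    Returns None when the field is absent, for example for kernel threads.
    """
    for line in status_text.splitlines():
        if line.startswith("VmRSS:"):
            return int(line.split()[1])
    return None


def growth_kb_per_hour(samples: List[Tuple[float, float]]) -> float:
    """Estimate memory growth from (seconds, rss_kb) samples.

    Uses the least-squares slope, so a single noisy sample has
    limited influence on the result.
    """
    if len(samples) < 2:
        raise ValueError("at least two samples are required")
    n = len(samples)
    mean_t = sum(t for t, _ in samples) / n
    mean_r = sum(r for _, r in samples) / n
    var_t = sum((t - mean_t) ** 2 for t, _ in samples)
    if var_t == 0:
        raise ValueError("sample timestamps must differ")
    cov = sum((t - mean_t) * (r - mean_r) for t, r in samples)
    # The slope is in kB per second; convert it to kB per hour.
    return cov / var_t * 3600.0
```

A persistently positive result over many hours, independent of the job workload, is consistent with a leak. A value that rises and falls with the workload is more likely a sizing issue.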
    

Temporary fix

Comments

APAR Information

  • APAR number

    IT35569

  • Reported component name

    SP PLUS

  • Reported component ID

    5737SPLUS

  • Reported release

    A17

  • Status

    CLOSED PER

  • PE

    NoPE

  • HIPER

    NoHIPER

  • Special Attention

    NoSpecatt / Xsystem

  • Submitted date

    2021-01-18

  • Closed date

    2021-02-12

  • Last modified date

    2021-02-12

  • APAR is sysrouted FROM one or more of the following:

  • APAR is sysrouted TO one or more of the following:

Fix information

  • Fixed component name

    SP PLUS

  • Fixed component ID

    5737SPLUS

Applicable component levels


Document Information

Modified date:
31 January 2024