IBM Storage Fusion Backup & Restore (Legacy) issues

List of known Backup & Restore (Legacy) issues and limitations in IBM Storage Fusion.

  • Whenever the "IBM Spectrum Protect Plus license expired" error occurs, do the following steps to fix the license issue:
    1. Log in to IBM Spectrum Protect Plus by using your spp-connection secret values. For the login procedure, see Logging into IBM Spectrum Protect Plus.
      Note: The default credentials are admin/password.
    2. If you get a license expired error, retrieve the license file /spp/server/SPP.lic from the isf-bkprstr operator pod by using the oc command.
      See the following sample oc command:
      oc cp isf-bkprstr-operator-controller-manager-<podname>:/spp/server/SPP.lic SPP.lic
      Replace <podname> so that the argument matches the name of your running operator pod. For example, the full pod name might be:
      isf-bkprstr-operator-controller-manager-599dc5b756-vcjd6
      Note: An spp-connection secret must exist after you log in to IBM Spectrum Protect Plus for the first time with the default credentials. For more information about creating the spp-connection secret, see the What to do next section of Backup & Restore (Legacy).
    3. Copy the license and upload it from the user interface. For more details, see Uploading the product key.
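    The pod lookup and copy in step 2 can be sketched as a small shell function. This is a minimal sketch, not the documented procedure: the function name copy_spp_license is hypothetical, and the namespace is taken as an argument because this page does not name the namespace in which the operator pod runs.

    ```shell
    # Sketch: find the operator pod by its name prefix and copy the license file
    # from it to the current directory.
    # Assumption: the caller supplies the namespace; this page does not state it.
    copy_spp_license() {
      ns="$1"
      # "oc get pods -o name" prints entries such as "pod/<name>"; strip the
      # prefix and keep the first pod that matches the operator name.
      pod="$(oc get pods -n "$ns" -o name \
        | sed 's|^pod/||' \
        | grep '^isf-bkprstr-operator-controller-manager-' \
        | head -n 1)"
      if [ -z "$pod" ]; then
        echo "no isf-bkprstr operator pod found in namespace $ns" >&2
        return 1
      fi
      oc cp -n "$ns" "$pod:/spp/server/SPP.lic" SPP.lic
    }
    ```

    Call it as copy_spp_license <namespace>, and then upload the resulting SPP.lic file as described in step 3.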
  • If you restore an application to a new namespace while the original application is still running, some of the restored pods might not start.
    Cause
    A resource conflict with the original application might exist, for example, a conflicting IP address or port.
    Resolution
    To resolve the issue, reconfigure the restored application so that the pods can come up without any conflicts.
  • If the IBM Spectrum Protect Plus agent (baas) upgrade fails, run the following oc command to delete the Kafka resource named baas from the baas namespace on OpenShift® Container Platform:
    oc delete Kafka baas -n baas
  • Application becomes unresponsive when you create multiple locations
    This issue occurs whenever the container reaches its CPU and memory limits. Increase the CPU and memory limits, and then check whether the isf-ui-dep pod still crashes. To change the limits, update the UI operator code. For example, cpu: 500m and memory: 500Mi.
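    To check whether the isf-ui-dep pod still crashes after the limits change, its status and restart count can be filtered out of the pod listing. The following is a minimal sketch; the helper name ui_pod_restarts is hypothetical, and the namespace to query is an assumption that you must supply.

    ```shell
    # Sketch: read "oc get pods --no-headers" output on stdin and print
    # "<pod> <status> <restarts>" for every pod whose name starts with isf-ui-dep.
    # Columns in that output are NAME, READY, STATUS, RESTARTS, AGE.
    ui_pod_restarts() {
      awk '$1 ~ /^isf-ui-dep/ { print $1, $3, $4 }'
    }
    ```

    Example usage: oc get pods -n <namespace> --no-headers | ui_pod_restarts. A restart count that keeps growing suggests that the limits are still too low.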
  • If a Backup & Restore (Legacy) job fails to start and goes into aborted status, restart the IBM Spectrum Protect Plus virgo pod from the OpenShift Container Platform web console:
    1. Go to the OpenShift Container Platform web management console.
    2. Go to Workloads > Pods.
    3. Select the ibm-spectrum-protect-plus-ns project.
    4. Search for the sppvirgo pod.
    5. From the Actions menu, click Delete pod so that the pod is re-created.
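    The same restart can also be done from the command line. The following is a minimal sketch of the console steps above, assuming that the pod name contains sppvirgo and that the pod is re-created after deletion, as the steps imply; the function name restart_sppvirgo is hypothetical.

    ```shell
    # Sketch: delete the sppvirgo pod in the ibm-spectrum-protect-plus-ns project
    # so that it gets re-created.
    restart_sppvirgo() {
      ns="ibm-spectrum-protect-plus-ns"
      # "oc get pods -o name" prints entries such as "pod/<name>".
      pod="$(oc get pods -n "$ns" -o name | grep sppvirgo | head -n 1)"
      if [ -z "$pod" ]; then
        echo "no sppvirgo pod found in namespace $ns" >&2
        return 1
      fi
      oc delete -n "$ns" "$pod"
    }
    ```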
  • If IBM Storage Fusion is configured in an HTTP proxy environment, then defining an Object Storage Backup Storage Location that requires a proxy fails.
    Cause
    IBM Spectrum Protect Plus does not support HTTP proxies.
    Resolution
    As a workaround, define a backup storage location in a transparent proxy mode.
  • If the retention period for a backup expires and the backup is not deleted from the object storage in the subsequent maintenance cycle, delete it manually.
  • If you delete a backup policy that is associated with an application, it gets unassigned from the application but does not get deleted. To delete such a policy, first remove the assignment of the backup policy from the application and then delete the policy.
  • Backup & Restore (Legacy) backup jobs fail to retrieve the output files from baas-rest-spp-agent.baas.svc
    Cause
    Operations start to fail in the inventory phase when the memory usage of the baas-spp-agent pod goes above 2450 MiB, which is close to the default pod limit of 2500 MiB.
    Workaround
    Increase the amount of memory available for the baas-spp-agent pod from 2500 MiB to 5000 MiB by adding the sppagent section to the IBMSPPC object:
    1. Use this command to obtain the correct value of the sppagent.image digest:
      oc describe deployment.apps/baas-spp-agent -n baas | grep Image
    2. Edit IBMSPPC:
      oc edit IBMSPPC -n baas
      Sample YAML:
      
      sppagent:
          image:
            digest: sha256:3c32e1534118abe8f2b0ed7e058a81568d03c7cc5a3e07ddb6031c9de9c5bd3c
            name: baas-spp-agent
            pull_policy: Always
          replica_count: 1
          resources:
            limits:
              cpu: "3"
              ephemeral_storage: 20Gi
              memory: 5000Mi
            requests:
              cpu: "2"
              ephemeral_storage: 10Gi
              memory: 1250Mi
          rest_server_service:
            name: baas-rest-spp-agent
            port: 443
            port_name: rest-server
            target_port: 12345
          snapshot_restore_job_time_limit: 24
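
    The digest value for the sample YAML can be pulled out of the command output in step 1 with a small filter. This is a minimal sketch, assuming that the Image line references the image by digest in the form <image>@sha256:<64 hex characters>; the helper name extract_digest is hypothetical.

    ```shell
    # Sketch: read "oc describe" output on stdin and print the first
    # sha256 image digest that appears after an "@" sign.
    extract_digest() {
      sed -n 's/.*@\(sha256:[0-9a-f]\{64\}\).*/\1/p' | head -n 1
    }
    ```

    Example usage: oc describe deployment.apps/baas-spp-agent -n baas | grep Image | extract_digest. Paste the printed value into the digest field of the sppagent section.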
      
  • A "Failed restore snapshot" error occurs with applications using IBM Spectrum Scale storage PVCs.
    Cause
    The "disk quota exceeded" error occurs whenever you restore from an object storage location and the application uses IBM Spectrum Scale PVCs that are smaller than 5 GB.
    Workaround
    Increase the IBM Spectrum Scale PVC size to a minimum of 5 GB, and then do a backup and restore operation.
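    To find the PVCs that fall under this threshold, the requested sizes can be compared before you run the backup. The following is a minimal sketch with hypothetical helper names (to_mib, small_pvcs). It assumes that sizes are expressed with the Mi, Gi, or Ti suffix and it treats the 5 GB threshold as 5Gi (5120 MiB), which is an assumption and not a documented value.

    ```shell
    # Sketch: convert a storage quantity with a binary suffix (Mi, Gi, Ti) to MiB.
    # Quantities in any other form are not handled and print nothing.
    to_mib() {
      echo "$1" | awk '
        /^[0-9]+Mi$/ { sub(/Mi$/, ""); print $0; next }
        /^[0-9]+Gi$/ { sub(/Gi$/, ""); print $0 * 1024; next }
        /^[0-9]+Ti$/ { sub(/Ti$/, ""); print $0 * 1024 * 1024; next }'
    }

    # Read "NAME SIZE" lines on stdin and print the PVCs that request
    # less than 5120 MiB.
    small_pvcs() {
      while read -r name size; do
        mib="$(to_mib "$size")"
        if [ -n "$mib" ] && [ "$mib" -lt 5120 ]; then
          echo "$name $size"
        fi
      done
    }
    ```

    Example usage: oc get pvc -n <namespace> --no-headers -o custom-columns=NAME:.metadata.name,SIZE:.spec.resources.requests.storage | small_pvcs. The namespace is a placeholder for the namespace of your application.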

Known issues

  • Whenever the Virgo pod restarts, which usually takes a long time, restart all other pods.
  • Backup policies with custom frequency only run on the earliest scheduled date. This behavior applies to both new policies and those defined in previous versions before the upgrade.
  • When you restore a backup to the same namespace, existing resources are not deleted or overwritten.
  • In the Storage tab of the application details page, the values of Used and Capacity might not display the correct values.
  • It is not possible to assign more than one policy to an application. However, if you upgraded from a previous version and had multiple policies that are assigned to the same application before the upgrade, then you might still be affected by this issue.
  • When you change the retention period for backups, the new value applies only to future backups. The expiration value for existing backups remains as it was set at the time of the backup operation.
  • The restore CR remains on OpenShift even after the retention period of the backups expires.
  • When you back up an application in a namespace that has elevated security context constraints (SCC) privileges, the restored namespace does not have the same privileges, which leaves the restored pods in CrashLoopBackOff status.
    Workaround:
    • Restart the application pod.
  • Sometimes backups do not run as defined in the backup policies, especially with hourly policies. For example, if you set a policy to run every two hours and it does not run every two hours, gaps appear in the backup history. A possible reason is that after a pod crashed and restarted, the scheduled jobs did not account for the time zone, which caused gaps in the run intervals.
    The following are the observed symptoms:
    • Policies with custom every X hour at minutes YY schedules: the first scheduled run of this policy runs at minutes YY after X hours with a time zone offset from UTC instead of at minutes YY after X hours.
    • Monthly and yearly policies run more frequently.

    Workaround:

    • You can start backups manually until the next scheduled time.