APAR status
Closed as program error.
Error description
The issue occurs intermittently when a running copy backup operation is canceled in IBM Spectrum Protect Plus. An SLA with many associated backups (more than 10) is more likely to hit the problem. The datamover pod then remains stuck in a Pending state indefinitely because it can neither start nor exit.

Running kubectl get pod on the datamover continues to show Pending, as in the following output:

    NAME                READY   STATUS    RESTARTS   AGE
    pvc-backup-112342   0/1     Pending   0          5s

An event shows that scheduling has failed:

    Warning  FailedScheduling  <unknown>  default-scheduler  persistentvolumeclaim "testpvc" not found

The situation is confirmed to have occurred when the backup operation has been canceled and the job is no longer running, yet kubectl get pods shows that the deployment and pods still exist on the Kubernetes system.
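The symptom described above can be checked for with a small filter. The following is a minimal sketch, not part of the product: the function name list_pending_datamovers is hypothetical, and it assumes the default column layout of kubectl get pods --all-namespaces --no-headers (namespace, name, ready, status, ...) and that datamover pod names contain resource-backup or pvc-backup, as the deployment names in the workaround suggest.

```shell
#!/bin/sh
# Hypothetical helper: read `kubectl get pods --all-namespaces --no-headers`
# output on stdin and print namespace/name for each datamover pod that is
# still Pending. Assumes column 1 = namespace, column 2 = pod name,
# column 4 = status.
list_pending_datamovers() {
    awk '$4 == "Pending" && $2 ~ /resource-backup|pvc-backup/ { print $1 "/" $2 }'
}

# Usage:
#   kubectl get pods --all-namespaces --no-headers | list_pending_datamovers
```

A pod that keeps appearing in this list after the backup job has been canceled and is no longer running matches the condition described in this APAR.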
Local fix
Wait for running backup jobs to complete or finish canceling. Obtain a list of the stuck Pending deployments with the following kubectl command:

    kubectl get deployment --all-namespaces | grep 'resource-backup\|pvc-backup'

Manually delete these deployments with the following command:

    kubectl delete deployment -n <namespace> <deployment name>
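When many deployments are stuck, the two commands above can be combined. The following is a minimal sketch under stated assumptions: the function name print_delete_commands is hypothetical, and it assumes the default column layout of kubectl get deployment --all-namespaces --no-headers (namespace first, deployment name second). Unlike the grep above, it matches the pattern against the deployment name only, and it prints the delete commands for review instead of running them.

```shell
#!/bin/sh
# Hypothetical helper: read `kubectl get deployment --all-namespaces --no-headers`
# output on stdin and print one delete command per data mover deployment.
# Assumes column 1 is the namespace and column 2 is the deployment name.
print_delete_commands() {
    awk '$2 ~ /resource-backup|pvc-backup/ {
        printf "kubectl delete deployment -n %s %s\n", $1, $2
    }'
}

# Usage (review the printed commands before running any of them):
#   kubectl get deployment --all-namespaces --no-headers | print_delete_commands
```

Printing rather than executing keeps the deletion a deliberate manual step, which matters because deployments for backup jobs that are still running would match the same name pattern.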
Problem summary
****************************************************************
* USERS AFFECTED:                                              *
* IBM Spectrum Protect Plus level 10.1.7                       *
****************************************************************
* PROBLEM DESCRIPTION:                                         *
* See Error description.                                       *
****************************************************************
* RECOMMENDATION:                                              *
* Apply fixing level when available. This problem is currently *
* projected to be fixed in IBM Spectrum Protect Plus level     *
* 10.1.8. Note that this is subject to change at the           *
* discretion of IBM.                                           *
****************************************************************
Problem conclusion
The problem has been fixed so that the scheduler correctly checks whether a request from the Agent was completed or canceled. With the fix, the data mover is no longer created and left in a Pending state on the Kubernetes cluster.
Temporary fix
Wait for running backup jobs to complete or finish canceling. Obtain a list of the stuck Pending deployments with the following kubectl command:

    kubectl get deployment --all-namespaces | grep 'resource-backup\|pvc-backup'

Manually delete these deployments with the following command:

    kubectl delete deployment -n <namespace> <deployment name>
Comments
APAR Information
APAR number
IT36193
Reported component name
SP PLUS
Reported component ID
5737SPLUS
Reported release
A17
Status
CLOSED PER
PE
NoPE
HIPER
NoHIPER
Special Attention
NoSpecatt / Xsystem
Submitted date
2021-03-11
Closed date
2021-03-18
Last modified date
2021-03-18
APAR is sysrouted FROM one or more of the following:
APAR is sysrouted TO one or more of the following:
Fix information
Fixed component name
SP PLUS
Fixed component ID
5737SPLUS
Applicable component levels
[{"Business Unit":{"code":"BU058","label":"IBM Infrastructure w\/TPS"},"Product":{"code":"SSNQFQ","label":"IBM Spectrum Protect Plus"},"Platform":[{"code":"PF025","label":"Platform Independent"}],"Version":"A17","Line of Business":{"code":"LOB26","label":"Storage"}}]
Document Information
Modified date:
31 January 2024