APAR status
Closed as program error.
Error description
Offload to IBM Spectrum Protect or to object storage fails because the vsnap_targetcli command can hang.

The initial problem is a timeout of the following command, which can be found in the vsnap.log:

INFO pid-4341 vsnap.linux.system Executing command: vsnap_targetcli /loopback/naa.50014050f9c3e206/luns delete lun0

Subsequent offloads fail with entries similar, but not limited, to the following in the joblog:

ERROR,id,time,2,CTGGA0309,Copy failed for snapshot (ID: 469) from source [server: <ip> volume: <volume name> snapshot: <snapshot name>] to target [server: <ip> volume: <offload session volume>]. Error: Exception: Failed to create gateway device: The file lock '/tmp/vsnap_target_lock' could not be acquired.

or

ERROR,id,time,2,CTGGA0309,Copy failed for snapshot (ID: 1101) from source [server: <ip> volume: <volume name> snapshot: <snapshot name>] to target [server: <ip> volume: <offload session volume>]. Error: Timeout: The file lock '/tmp/vsnap_target_lock' could not be acquired.

The important part of these log entries is the message "The file lock '/tmp/vsnap_target_lock' could not be acquired."

| MDVPARTL 10.1.5.0-TIV_5737SPLUS IT32252|

IBM Spectrum Protect Plus Versions Affected: IBM Spectrum Protect Plus 10.1.5
Initial Impact: medium
Additional Keywords: SPP, SPPlus, TS003484210
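To check whether a joblog shows this symptom, the lock message quoted above can be searched for directly. The following is a minimal sketch, not an IBM-provided tool: the function name find_lock_failures is hypothetical, and the joblog line layout is assumed to match the two sample entries in this APAR.

```python
import re

# Message text quoted in this APAR; everything else about the joblog
# format is an assumption based on the two sample entries.
LOCK_SIGNATURE = "The file lock '/tmp/vsnap_target_lock' could not be acquired"
SNAPSHOT_ID = re.compile(r"snapshot \(ID: (\d+)\)")


def find_lock_failures(lines):
    """Return (line_number, snapshot_id) for each line with the lock message.

    snapshot_id is None when the line has no "snapshot (ID: n)" part.
    """
    hits = []
    for number, line in enumerate(lines, start=1):
        if LOCK_SIGNATURE in line:
            match = SNAPSHOT_ID.search(line)
            hits.append((number, match.group(1) if match else None))
    return hits
```

The function accepts any iterable of strings, so an open file handle for an exported joblog can be passed in directly. A non-empty result suggests that an earlier vsnap_targetcli call is still holding the lock.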
Local fix
n/a
Problem summary
****************************************************************
* USERS AFFECTED:                                              *
* IBM Spectrum Protect Plus level 10.1.5 patch1                *
****************************************************************
* PROBLEM DESCRIPTION:                                         *
* See Error Description                                        *
****************************************************************
* RECOMMENDATION:                                              *
* Apply fixing level when available. This problem is currently *
* projected to be fixed in IBM Spectrum Protect Plus level     *
* 10.1.5.2218 and 10.1.6. Note that this is subject to change  *
* at the discretion of IBM.                                    *
****************************************************************
Problem conclusion
As part of the 10.1.5 patch1 release, the Linux kernel bundled with vSnap was updated to version 4.19.101. That upstream kernel version contained a bug that could cause the Linux IO (LIO) subsystem to hang when loopback devices are deleted. vSnap uses loopback devices for copy operations to object storage. When the loopback device is removed during cleanup of a copy operation, the LIO hang could be triggered, which then caused all subsequent copy operations to fail. A previous attempt at fixing this problem was made in APAR IT32252, but that fix was incomplete and the problem could still occur. The problem has now been resolved by making additional fixes in the Linux kernel to avoid the hang. The updated kernel version 4.19.119-3c, which contains these fixes, has been incorporated into SPP/vSnap.
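As a rough illustration of the version information above, the sketch below compares a kernel release string (for example, the output of uname -r on the vSnap server) against the two versions named in this APAR. The function names are hypothetical. Treating every release from 4.19.101 up to, but not including, 4.19.119 as potentially affected is an assumption; the check also ignores build suffixes such as "-3c", so it cannot confirm that the IBM fixes are present.

```python
import re

AFFECTED_FROM = (4, 19, 101)  # kernel bundled with 10.1.5 patch1, per this APAR
FIXED_IN = (4, 19, 119)       # numeric part of the fixed build 4.19.119-3c


def parse_release(release):
    """Extract the leading major.minor.patch numbers from a release string."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    return tuple(int(part) for part in match.groups())


def may_be_affected(release):
    """Assumption: releases in [4.19.101, 4.19.119) may carry the LIO hang."""
    return AFFECTED_FROM <= parse_release(release) < FIXED_IN
```

A True result only indicates that the kernel falls in the assumed range; the product level (10.1.5.2218 or 10.1.6 and later) remains the authoritative indicator of whether the fix is installed.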
Temporary fix
Comments
APAR Information
APAR number
IT32466
Reported component name
SP PLUS
Reported component ID
5737SPLUS
Reported release
A15
Status
CLOSED PER
PE
NoPE
HIPER
NoHIPER
Special Attention
NoSpecatt / Xsystem
Submitted date
2020-04-16
Closed date
2020-06-08
Last modified date
2020-06-08
APAR is sysrouted FROM one or more of the following:
APAR is sysrouted TO one or more of the following:
Fix information
Fixed component name
SP PLUS
Fixed component ID
5737SPLUS
Applicable component levels
Document Information
Modified date:
31 January 2024