IBM Support

IT31889: IF REPLICATION IS SLOW OR TAKES TOO LONG THEN THE OPERATION FAILS WITH ERROR MESSAGES.

 

APAR status

  • Closed as program error.

Error description

  • When an IBM Spectrum Protect Plus replication runs between two
    remote sites and either takes longer than 10 days or runs over an
    extremely slow network, it can fail with one of the following
    errors, which appear in the job log and/or repl.log on the primary
    vSnap:
    No such file or directory:'/tmp/repld_send_<id>_error'
    Receive operation failed: packet_write_wait: Connection to port22:Broken pipe.
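
    To check whether a failed replication matches this APAR, the log can be
    searched for the two error signatures quoted above. The following is a
    minimal sketch, not an IBM-supplied tool: the function name
    scan_repl_log is hypothetical, and the log path must be supplied by the
    caller because this APAR does not state where repl.log is stored.

```shell
#!/bin/sh
# Hypothetical helper: report which IT31889 error signatures appear in a log.
# Usage: scan_repl_log /path/to/repl.log   (path is an assumption of the caller)
scan_repl_log() {
    log_file="$1"
    [ -r "$log_file" ] || { echo "cannot read: $log_file" >&2; return 2; }
    found=1
    # Signature 1: the temporary error file under /tmp no longer exists.
    if grep -Eq "No such file or directory: ?'/tmp/repld_send_[^']*_error'" "$log_file"; then
        echo "tmp-file-removed"
        found=0
    fi
    # Signature 2: the SSH connection was dropped during the transfer.
    if grep -q "packet_write_wait: .*Broken pipe" "$log_file"; then
        echo "ssh-broken-pipe"
        found=0
    fi
    # Returns 0 if at least one signature matched, 1 if none matched.
    return "$found"
}
```

    The function prints one label per matching signature, so a log that
    contains both errors produces both labels.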
    

Local fix

  • If the replication runs for too long, some important files in
    /tmp get cleaned up. To avoid this, run the following command on
    the vSnap:
    $ echo 'x /tmp/repld_*' | sudo tee -a /usr/lib/tmpfiles.d/tmp.conf
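
    In tmpfiles.d syntax, a line of type x excludes the matching paths from
    age-based cleanup. Because the command above appends with tee -a,
    running it more than once adds duplicate lines. The following sketch
    applies the same rule only when it is missing. The function name
    add_repld_exclusion and the SUDO variable are hypothetical additions
    for illustration; the rule text and the default file path are the ones
    given in this APAR.

```shell
#!/bin/sh
# Hypothetical helper: append the exclusion rule only if it is not yet present.
# SUDO defaults to "sudo"; set SUDO= (empty) when already running as root.
add_repld_exclusion() {
    conf_file="${1:-/usr/lib/tmpfiles.d/tmp.conf}"
    rule='x /tmp/repld_*'
    # -F: match the rule literally, -x: whole line only, -q: no output.
    if grep -Fxq "$rule" "$conf_file" 2>/dev/null; then
        echo "already present"
    else
        echo "$rule" | ${SUDO-sudo} tee -a "$conf_file" >/dev/null && echo "added"
    fi
}
```

    Calling add_repld_exclusion with no argument targets the file named in
    the local fix; a different path can be passed as the first argument to
    try the function against a copy first.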
    

Problem summary

  • ****************************************************************
    * USERS AFFECTED:                                              *
    * IBM Spectrum Protect Plus level 10.1.5.                      *
    ****************************************************************
    * PROBLEM DESCRIPTION:                                         *
    * See Error Description                                        *
    ****************************************************************
    * RECOMMENDATION:                                              *
    * Apply fixing level when available. This problem is currently *
    * projected to be fixed in IBM Spectrum Protect Plus levels    *
    * 10.1.5 patch1 and 10.1.6. Note that this is subject to       *
    * change at the discretion of IBM.                             *
    ****************************************************************
    

Problem conclusion

  • The replication process on vSnap makes use of some temporary
    files under /tmp. If a replication session runs very slowly and
    lasts for several days, the temporary files may remain
    unmodified for several days. This can cause the operating system
    to remove the temporary files which eventually results in a
    failure of the vSnap replication session. This problem has been
    resolved by modifying the system configuration to exclude the
    replication-specific temporary files from being cleaned up
    automatically.
    
    Additionally, when a replication session runs very slowly,
    the SSH connection between the vSnap servers may experience
    timeouts and disconnections if no data is transferred for a few
    minutes. This also results in a failure of the replication
    session. This problem has been resolved by ensuring that when
    vSnap creates an SSH connection during replication, it uses
    larger default values for the parameters 'ServerAliveInterval'
    and 'ServerAliveCountMax'.
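
    The two SSH client parameters work together: ServerAliveInterval is the
    number of seconds of silence from the server after which the client
    sends a keepalive request, and ServerAliveCountMax is the number of
    consecutive unanswered requests after which the client disconnects.
    The following sketch shows how such options can be passed to ssh and
    how long an unresponsive peer is tolerated. The values 60 and 30 and
    the function names are assumptions for illustration only; this APAR
    does not state the values that vSnap uses.

```shell
#!/bin/sh
# Hypothetical values; the APAR does not publish the values chosen by vSnap.
ALIVE_INTERVAL="${ALIVE_INTERVAL:-60}"
ALIVE_COUNT_MAX="${ALIVE_COUNT_MAX:-30}"

# Print the ssh command-line options that enable client-side keepalives.
keepalive_opts() {
    printf '%s\n' "-o ServerAliveInterval=${ALIVE_INTERVAL} -o ServerAliveCountMax=${ALIVE_COUNT_MAX}"
}

# Approximate number of seconds an unresponsive server is tolerated
# before the ssh client gives up and closes the connection.
keepalive_tolerance_seconds() {
    echo $((ALIVE_INTERVAL * ALIVE_COUNT_MAX))
}

# Example invocation (not executed here; the host name is a placeholder):
#   ssh $(keepalive_opts) user@remote-vsnap true
```

    With the assumed values, the client probes an idle connection once per
    minute and disconnects only after about 30 minutes without any reply.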
    

Temporary fix

Comments

APAR Information

  • APAR number

    IT31889

  • Reported component name

    SP PLUS

  • Reported component ID

    5737SPLUS

  • Reported release

    A15

  • Status

    CLOSED PER

  • PE

    NoPE

  • HIPER

    NoHIPER

  • Special Attention

    NoSpecatt / Xsystem

  • Submitted date

    2020-02-18

  • Closed date

    2020-02-20

  • Last modified date

    2020-02-20

  • APAR is sysrouted FROM one or more of the following:

  • APAR is sysrouted TO one or more of the following:

Fix information

  • Fixed component name

    SP PLUS

  • Fixed component ID

    5737SPLUS

Applicable component levels


Document Information

Modified date:
31 January 2024