IBM Support

Slow ontape backup or archive of an Informix database server instance

Troubleshooting


Problem

You run an ontape backup but it runs too slowly.

Cause

  • TAPEBLK is set too low in the ONCONFIG to take full advantage of the device's I/O throughput potential.
  • The actual I/O speed is lower than the expected I/O speed of the devices used.
  • Not enough system resources, such as CPU or memory, are available on the computer. The system might be busy with other work.

Diagnosing The Problem

  • Run a timed test to determine the approximate speed of I/O from a chunk to the backup device. Run the dd command with a block size (bs) equal to the TAPEBLK value in the ONCONFIG file. TAPEBLK is specified in KB, so TAPEBLK 128 corresponds to bs=128k. Try to use a chunk that is at least 2 GB.

    Note:
    This test might not work on a running instance, because an OS lock on the chunk can prevent the dd command from reading it. If you cannot run the dd command on an actual chunk, use or create a file that resides in the same location as the chunks (the same directory, same disk, same NFS mount, same SAN, and so on). This is critical for the test to yield accurate timings.

    Here is the command (if timex is not available on your platform, for example on Linux, use time instead):

    timex dd if=/full/path/to/informix/chunk of=/full/path/to/backup_directory/timetest.out bs=128k


    Here is an actual example:

    timex dd if=/informix/chunks/rootdbs of=/informix/backups/timetest.out bs=256k
    586+1 records in
    586+1 records out

    real 7.76
    user 0.00
    sys 0.48

    This dd test copied about 146.5 MB (586 full records of 256 KB each) in 7.76 seconds, so the speed for this backup would be about 18.9 MB/second or 66.4 GB/hour. Expect ontape I/O to run at approximately this speed, less some allowance for ontape overhead.
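
The arithmetic behind the figures in the dd example can be reproduced with a small helper. This is only a sketch: the function name dd_throughput is made up for illustration, it ignores the trailing partial record ("+1"), and it treats 1 MB as 1024 KB and 1 GB as 1024 MB.

```shell
#!/bin/sh
# Estimate throughput from a dd run.
# Usage: dd_throughput FULL_RECORDS BLOCK_SIZE_KB REAL_SECONDS
# (dd_throughput is a hypothetical helper name, not an Informix or OS tool.)
dd_throughput() {
    awk -v recs="$1" -v bs_kb="$2" -v secs="$3" 'BEGIN {
        mb   = recs * bs_kb / 1024     # data copied in MB (partial record ignored)
        mb_s = mb / secs               # MB per second
        gb_h = mb_s * 3600 / 1024      # GB per hour
        printf "%.1f MB/second, %.1f GB/hour\n", mb_s, gb_h
    }'
}

# Values taken from the example above: 586 records, bs=256k, real 7.76
dd_throughput 586 256 7.76
# prints: 18.9 MB/second, 66.4 GB/hour
```

Dividing the total size of the chunks to be backed up by the GB/hour figure gives a rough lower bound on how long the backup should take.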

  • Capture and review the ontape function stacks to find out exactly what ontape is doing. Soon after you start the ontape backup, these 3 threads should be listed in the onstat -g ath output:

    ontape
    arcbackup1
    arcbackup2

    The output will be similar to this:

    ...
    213  41dac918  40347be0  1   cond wait  netnorm  1cpu   ontape
    214  41980a28  40348c50  1   IO Wait             1cpu   arcbackup1
    215  41980c88  40344a90  2   sleeping secs: 1    1cpu   arcbackup2


    To get the stack trace from each thread, you need to run 3 separate commands. The first column of the onstat -g ath output is the thread id. Using the example output above, you would run these commands to continuously capture stack information for each of the 3 threads every 5 seconds (the default interval for the -r option):

    onstat -g stk 213 -r > ontape.out
    onstat -g stk 214 -r > arcbackup1.out
    onstat -g stk 215 -r > arcbackup2.out

    You can run these in the background or in 3 separate windows. Stop them when the backup is complete.
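
Because the thread ids change with every backup, the three commands can be generated from the onstat -g ath output instead of typed by hand. The following is a minimal sketch, assuming the thread id is the first column and the thread name is the last column, as in the sample output above. The function name stack_capture_commands is hypothetical, and the function only prints the commands so that you can review them before running them.

```shell
#!/bin/sh
# Read `onstat -g ath` output on stdin and print one stack-capture command
# per ontape-related thread (ontape, arcbackup1, arcbackup2, ...).
# Assumes: thread id = first column, thread name = last column.
stack_capture_commands() {
    awk '$1 ~ /^[0-9]+$/ && $NF ~ /^(ontape|arcbackup[0-9]+)$/ {
        printf "onstat -g stk %s -r > %s.out &\n", $1, $NF
    }'
}

# Demonstration on the sample lines from this document. On a live system,
# replace the printf with:  onstat -g ath | stack_capture_commands
printf '%s\n' \
    '213  41dac918  40347be0  1   cond wait  netnorm  1cpu   ontape' \
    '214  41980a28  40348c50  1   IO Wait             1cpu   arcbackup1' \
    '215  41980c88  40344a90  2   sleeping secs: 1    1cpu   arcbackup2' |
    stack_capture_commands
```

Each printed command ends with & so that it runs in the background when pasted into a shell. Stop the background jobs when the backup is complete.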

Resolving The Problem

  • Increase the value of the ONCONFIG configuration parameter TAPEBLK, which is specified in KB. If you use a tape device, use the device's suggested maximum block size. If you write the backup to disk, keep increasing TAPEBLK on subsequent backups until a further increase no longer shortens the backup (you could start at 128, then go to 256, 512, 1024, and so on).
  • Use a faster backup device or find ways to increase the I/O throughput of that device. Some examples:
    • use a faster tape drive or a tape that can handle higher I/O speeds
    • back up to local disk, then ftp the backup to a SAN
  • Collect the ontape thread stack traces, then analyze them or send them to technical support for analysis.
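
The TAPEBLK tuning loop in the first item above can be tracked with a simple record of each trial. The sketch below assumes you note the TAPEBLK value and the elapsed backup time after each backup. The function name pick_tapeblk, the 5 percent threshold, and the timings in the demonstration are all made up for illustration and are not Informix settings or measured results.

```shell
#!/bin/sh
# Read "TAPEBLK_KB ELAPSED_SECONDS" pairs, one backup per line, in the order
# they were tried. Print the last TAPEBLK value that still shortened the
# backup by at least min_gain_pct percent compared with the previous trial.
# min_gain_pct (default 5) is a hypothetical threshold chosen for this sketch.
pick_tapeblk() {
    awk -v min_gain_pct="${1:-5}" '
        NR == 1 { best_blk = $1; prev = $2; next }
        {
            gain = (prev - $2) / prev * 100    # percent improvement over previous trial
            if (gain < min_gain_pct) exit      # plateau reached, stop reading
            best_blk = $1; prev = $2
        }
        END { if (NR > 0) print best_blk }
    '
}

# Demonstration with invented timings (seconds) for four trial backups.
printf '%s\n' '128 3600' '256 2500' '512 2100' '1024 2080' | pick_tapeblk 5
# prints: 512
```

In this invented series, going from 512 to 1024 saves less than 1 percent, so 512 is reported as the point where larger values stop helping.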

[{"Product":{"code":"SSGU8G","label":"Informix Servers"},"Business Unit":{"code":"BU053","label":"Cloud & Data Platform"},"Component":"--","Platform":[{"code":"PF002","label":"AIX"},{"code":"PF010","label":"HP-UX"},{"code":"PF016","label":"Linux"},{"code":"PF027","label":"Solaris"},{"code":"PF033","label":"Windows"}],"Version":"11.1;11.5;11.7","Edition":"","Line of Business":{"code":"LOB10","label":"Data and AI"}}]

Document Information

Modified date:
16 June 2018

UID

swg21503185