File backup techniques

If you are backing up your system on a file-by-file basis, you can use several backup techniques.

Use the following information to determine which file backup technique best meets your needs.

Progressive incremental backup

Progressive incremental backup is the standard method of backup that is used by Tivoli® Storage Manager. Incremental backup processing backs up only those files that changed since the last full or incremental backup, unless the files are excluded from backup.

How it works
The following processes occur during an incremental backup:
  • The client queries the Tivoli Storage Manager server for active backup version metadata.
  • The server returns a list of active backup versions for the entire file system.
  • The client scans and compares the list with the local file system to determine which files are new or changed since the last backup.
  • The client backs up the new and changed files.
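The comparison step can be illustrated with a simplified Python sketch. It is not the client's actual implementation: the FileAttrs type, the select_for_backup function, and the sample paths are hypothetical, and the sketch assumes that a file counts as changed when its size or modification time differs from the active version.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FileAttrs:
    """Hypothetical subset of the metadata kept for an active backup version."""
    size: int
    mtime: float


def select_for_backup(active_versions, local_files):
    """Return the paths that are new or changed relative to the active versions."""
    to_back_up = []
    for path, attrs in local_files.items():
        # A path that is missing from the server list is new;
        # a path whose attributes differ is treated as changed.
        if active_versions.get(path) != attrs:
            to_back_up.append(path)
    return sorted(to_back_up)


# Hypothetical list returned by the server for the whole file system.
active = {
    "/home/a.txt": FileAttrs(10, 100.0),
    "/home/b.txt": FileAttrs(20, 100.0),
}
# Hypothetical result of scanning the local file system.
local = {
    "/home/a.txt": FileAttrs(10, 100.0),   # unchanged
    "/home/b.txt": FileAttrs(25, 180.0),   # changed
    "/home/c.txt": FileAttrs(5, 190.0),    # new
}

print(select_for_backup(active, local))  # ['/home/b.txt', '/home/c.txt']
```

In this sketch, the whole list of active versions is held in memory at once, which is why a very large number of active backup versions can exhaust client memory.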
When to use
Use incremental backup when the system is not constrained by memory, backup window duration, or other operational issues. Incremental backup is the default backup method.
Advantages
Incremental backup processing has the following advantages:
  • This method is the most comprehensive backup method for Tivoli Storage Manager.
  • No redundant backups are made. You back up only what is changed.
  • There is less network utilization because unchanged files do not have to be sent over the network.
  • This method is a form of single-instance storage because a file is not backed up again if it does not change. Incremental backups are more efficient and save space on the server storage pools.
  • Files are easier to restore because you do not have to restore the base backup version first and apply incremental or differential changes.
Disadvantages
Incremental backup processing has the following disadvantages:
  • The client system might run out of memory if the number of active backup versions is too large.
  • The time that it takes to scan file systems that contain millions of files can exceed the duration of the backup window.
Related information: For more information about progressive incremental backup, search for incremental backup in the documentation.

Journal-based backup

Journal-based backup is an alternative form of incremental backup that uses a change journal that is maintained by the Tivoli Storage Manager journal process. On Windows clients, the change journal is maintained by a journal service. On AIX® and Linux clients, the change journal is maintained by a journal daemon process.

How it works
The following processes occur during journal-based backup processing:
  • Journal-based backup processing uses real-time monitoring of a file system for changed files.
  • The names of the changed files are logged to the journal database.
  • During backup processing, the client queries the journal for the list of changed files, and then backs up the changed files.
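The following Python sketch illustrates the idea of a change journal under simplified assumptions. The ChangeJournal class and its method names are hypothetical and do not represent the journal service or daemon interface; the sketch shows only that the backup reads a prepared list of changed names instead of scanning the file system.

```python
class ChangeJournal:
    """Hypothetical in-memory stand-in for the journal database."""

    def __init__(self):
        self._changed = set()

    def record_change(self, path):
        # Called by the (hypothetical) file system monitor for each change.
        # A set keeps one entry per file, however often the file changes.
        self._changed.add(path)

    def drain(self):
        # Return the changed names for this backup and reset the journal.
        changed = sorted(self._changed)
        self._changed.clear()
        return changed


journal = ChangeJournal()
for path in ["/data/a.log", "/data/b.log", "/data/a.log"]:
    journal.record_change(path)

to_back_up = journal.drain()
print(to_back_up)  # ['/data/a.log', '/data/b.log']
```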
When to use
Use journal-based backup in the following situations:
  • The scheduled backups are not completed within the allotted time.
  • There are fewer than 1,000,000 files and a small number of changes between backups (fewer than 1,000,000).
  • There are fewer than 10,000,000 objects with a 10-15% velocity of change. The velocity of change is the rate at which files change over a short amount of time (such as 1 or 2 seconds).
Advantages
Journal-based backup can often greatly reduce the time that it takes to determine which files changed.
Disadvantages
Journal-based backup processing has the following limitations:
  • You must still run incremental backups periodically.
  • Journal-based backups are not suitable for file systems where large numbers of files can change over a short time interval, such as changing hundreds or thousands of files in 1 or 2 seconds.
  • This method is available only on Windows, AIX, and Linux clients.

Memory-efficient backup

The performance of incremental backups can degrade if the system is memory-constrained before the backup begins. Run incremental backup with the memoryefficientbackup yes option in the client options file. This setting causes the client to process only one directory at a time during incremental backups, which reduces memory consumption but increases backup time.

How it works
The following processes occur during an incremental backup with the memory-efficient setting:
  • The client queries the server for the metadata of active backup versions for the first directory to be backed up.
  • The server returns a list of active backup versions for the directory.
  • The client scans the list and compares it with the local file system, and backs up the new and changed files.
  • The client queries the server for the next directory and repeats the process for all directories.
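The per-directory processing can be sketched in Python as follows. This is an illustration under simplified assumptions, not the client's implementation: the backup_by_directory function and the dictionary layout (directory name mapped to file names and modification times) are hypothetical.

```python
def backup_by_directory(server_inventory, local_tree):
    """Compare one directory at a time.

    Both arguments map a directory to a {file name: modification time} dict.
    Returns the files to back up and the largest number of active-version
    entries that had to be held at any one time.
    """
    backed_up = []
    peak_entries = 0
    for directory in sorted(local_tree):
        # Query the active versions for this directory only.
        active = server_inventory.get(directory, {})
        peak_entries = max(peak_entries, len(active))
        for name, mtime in sorted(local_tree[directory].items()):
            if active.get(name) != mtime:
                backed_up.append(f"{directory}/{name}")
    return backed_up, peak_entries


# Hypothetical sample data.
server = {"/a": {"x": 1, "y": 1}, "/b": {"z": 1}}
local = {"/a": {"x": 1, "y": 2}, "/b": {"z": 1, "w": 3}}

files, peak = backup_by_directory(server, local)
print(files)  # ['/a/y', '/b/w']
print(peak)   # 2, although the server holds 3 active versions in total
```

The peak is bounded by the largest single directory, so a directory that contains most of the files gains little from this approach.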
When to use
Use memory-efficient backup when your system has a low amount of memory available for incremental backups.
Advantages
Memory-efficient backup is a comprehensive incremental backup with a smaller backup memory footprint.
Disadvantages
Memory-efficient backup processing has the following disadvantages:
  • The backup run time is increased.
  • This method does not work for a single directory that contains a large number of files.
  • If the system is not memory-constrained, running memory-efficient backup can degrade the backup performance.

Memory-efficient backup with disk caching

If your client system is memory-constrained and incremental backups still cannot complete successfully with the memoryefficientbackup yes setting, run incremental backups with the memoryefficientbackup diskcachemethod option. This setting causes the client to use less memory but requires more disk space on the client system.

How it works
This method is similar to incremental backup processing but the client temporarily stores active backup version metadata on disk instead of memory.
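As a rough illustration, the following Python sketch keeps the active backup inventory in a temporary on-disk database and looks up one file at a time. The use of SQLite, the table layout, and the changed_files_with_disk_cache function are assumptions made for the example; the source does not describe the format of the client's disk cache.

```python
import os
import sqlite3
import tempfile


def changed_files_with_disk_cache(active_versions, local_files, cache_dir):
    """Return new or changed paths, holding the inventory on disk.

    active_versions and local_files are lists of (path, mtime) tuples.
    """
    db_path = os.path.join(cache_dir, "active_inventory.db")
    conn = sqlite3.connect(db_path)
    try:
        conn.execute(
            "CREATE TABLE inventory (path TEXT PRIMARY KEY, mtime REAL)"
        )
        conn.executemany("INSERT INTO inventory VALUES (?, ?)", active_versions)
        conn.commit()

        changed = []
        for path, mtime in local_files:
            # Each lookup goes to the disk cache instead of an in-memory list.
            row = conn.execute(
                "SELECT mtime FROM inventory WHERE path = ?", (path,)
            ).fetchone()
            if row is None or row[0] != mtime:
                changed.append(path)
        return changed
    finally:
        conn.close()


# Hypothetical sample data; the cache is removed when the backup ends.
with tempfile.TemporaryDirectory() as cache_dir:
    result = changed_files_with_disk_cache(
        [("/home/a.txt", 100.0), ("/home/b.txt", 100.0)],
        [("/home/a.txt", 100.0), ("/home/b.txt", 180.0), ("/home/c.txt", 190.0)],
        cache_dir,
    )

print(result)  # ['/home/b.txt', '/home/c.txt']
```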
When to use
Use memory-efficient backup with disk caching in the following situations:
  • The client is running out of memory with incremental backups and memory-efficient backup is not sufficient.
  • Journal-based backup is not available on the operating system.
Advantages
Memory-efficient backup with disk caching is a comprehensive incremental backup operation with a smaller backup memory footprint.
Disadvantages
Memory-efficient backup processing with disk caching has the following disadvantages:
  • The backup processing time might be longer because the active backup inventory is on disk instead of in memory.
  • Gigabytes of free disk space are required to temporarily cache the active backup inventory.

Backup of virtual mount points

You can save processing time when you define a virtual mount point within a file system because it provides a direct path to the files that you want to back up.

How it works
The following processes occur during the backup of virtual mount points:
  • Instead of backing up an entire file system to a single file space on the server, you can logically partition a large file system into smaller file systems, and then define mount points for backup processing.
  • The file systems that are represented by the mount points can be managed as separate file spaces on the server.
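The partitioning can be pictured with a short Python sketch. The file_space_for function and the sample directory names are hypothetical. The sketch assumes that a file belongs to the longest virtual mount point that is a prefix of its path, and it reports a file as not covered when no mount point matches.

```python
def file_space_for(path, mount_points):
    """Return the virtual mount point that covers path, or None."""
    matches = [
        m for m in mount_points
        if path == m or path.startswith(m.rstrip("/") + "/")
    ]
    # Prefer the most specific (longest) mount point when several match.
    return max(matches, key=len) if matches else None


# Hypothetical virtual mount points inside one large file system.
mount_points = ["/data/projects", "/data/home"]

print(file_space_for("/data/projects/x/report.txt", mount_points))  # /data/projects
print(file_space_for("/data/home/user1/notes.txt", mount_points))   # /data/home
print(file_space_for("/data/newdir/file.txt", mount_points))        # None
```

The last call shows why the definitions need monitoring: a new directory that falls outside every defined mount point is not covered by any of them.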
When to use
Use virtual mount points to back up large, balanced, AIX, HP-UX, Linux, and Solaris file systems that can be efficiently divided into logical partitions.
Advantages
Backup processing of virtual mount points provides a balanced approach to the backup of large file systems by effectively dividing them into smaller file systems. It is more efficient than defining the file system with the domain option, and then specifying the exclude option to exclude the files you do not want to back up.
Disadvantages
Backup processing of virtual mount points has the following limitations:
  • This method of backup processing is not appropriate for a single directory that contains a large number of files.
  • Virtual mount points are static and cannot be changed.
  • This method requires monitoring to ensure that new directories are still backed up in one of the virtual mount points, along with other processing that is required to maintain the virtual mount point definitions.
  • Command-line restore operations require the use of braces ( { } ) to delimit the virtual mount point name in the file specification.
  • This method is only available for AIX, HP-UX, Linux, and Solaris operating systems.
Related concept: File space tuning

Incremental-by-date backup

This backup method backs up new and changed files that have a modification date later than the date of the last incremental backup that is stored at the server, unless the files are excluded from backup.

How it works
The following processes occur during an incremental-by-date backup:
  • The client queries the server for the most recent backup of the entire file system.
  • The server returns the time stamp of the most recent backup of the entire file system.
  • The client scans the local file system and backs up the new and changed files that have a modification date later than the time stamp of the most recent backup.
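The date comparison can be sketched in a few lines of Python. The select_by_date function and the sample values are hypothetical; the sketch assumes that the only input from the server is the time stamp of the most recent backup.

```python
def select_by_date(local_files, last_backup_time):
    """Return paths whose modification time is later than the last backup.

    local_files maps each path to its modification time.
    """
    return sorted(
        path for path, mtime in local_files.items()
        if mtime > last_backup_time
    )


# Hypothetical scan result and server time stamp.
local = {"/home/a.txt": 100.0, "/home/b.txt": 250.0, "/home/c.txt": 300.0}
last_backup = 200.0

print(select_by_date(local, last_backup))  # ['/home/b.txt', '/home/c.txt']
```

Because only modification times are compared, a file that was deleted, or changed in a way that leaves the date untouched, never appears in the result.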
When to use
Use incremental-by-date backup in the following situations:
  • The scheduled backups are not completed within the allotted time.
  • Files in the file system are added or changed, but not deleted.
  • You also run weekly (or periodic) full incremental backups.
Advantages
Incremental-by-date backup processing has the following benefits:
  • This method reduces the time that it takes to determine which files changed.
  • This method removes the processing time on the server that is used to query the database for changed files.
  • This method removes the network traffic that is used to communicate the query results.
Disadvantages
Incremental-by-date backup processing has the following disadvantages:
  • This method reduces the flexibility of the scope of the backup operation. You must back up the entire file system.
  • The files are not backed up if the changes do not affect the date (for example, attribute, mode, ACL, rename, copy, move, and security changes).
  • The deleted files are not expired on the server.
  • Policy rebinding does not take place.
  • The entire file system must be scanned.
  • This method cannot be used if the client and server clocks are set to different times or are not in the same time zone.

File list backup

You can control which files are backed up when you run a backup with the filelist option.

How it works
File list backup can be used in the following manner:
  • An application creates a list of files for backup and passes the list to the client.
  • The client runs a selective backup of the files that are specified in the list.
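The first step might look like the following Python sketch, in which an application writes the list for the client. The write_file_list function and the paths are hypothetical, and the one-path-per-line layout is an assumption; check the Filelist reference for the exact format that the client accepts.

```python
import os
import tempfile


def write_file_list(paths, list_path):
    """Write explicit file names, one per line, to list_path."""
    for path in paths:
        # Entries must name files explicitly, so reject wildcard characters.
        if "*" in path or "?" in path:
            raise ValueError(f"wildcard characters are not allowed: {path}")
    with open(list_path, "w", encoding="utf-8") as handle:
        for path in paths:
            handle.write(path + "\n")
    return list_path


# Hypothetical list of files that the application knows have changed.
with tempfile.TemporaryDirectory() as tmp:
    list_path = write_file_list(
        ["/home/a.txt", "/home/b.txt"],
        os.path.join(tmp, "filelist.txt"),
    )
    with open(list_path, encoding="utf-8") as handle:
        written = handle.read().splitlines()

print(written)  # ['/home/a.txt', '/home/b.txt']
```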
When to use
Use file list backup in the following situations:
  • The scheduled backups are not completed within the allotted time.
  • The list of changed files is known.
Advantages
Selective backup eliminates the query of the server database and the scan of the local file system.
Disadvantages
File list backup has the following disadvantages:
  • You must find a way to create the file list.
  • You must explicitly specify the files. You cannot use wildcard characters or directory recursion in the file list.
Related reference: Filelist

Multiple session backup

The backup-archive client can run concurrent sessions to back up and restore data to help improve performance. During incremental backup processing, the client can process multiple objects in parallel by opening more than one session with the Tivoli Storage Manager server.

How it works
Multiple sessions are used when you specify multiple file specifications on a backup, restore, archive, or retrieve command. For example, you can start a multiple session backup with the following command:
  • On the AIX, HP-UX, Linux, Mac OS X, or Solaris client:
    incr /Volumes/filespace_A /Volumes/filespace_B
  • On the Windows client:
    incr c: d:

The resourceutilization option is used to regulate the level of resources that the Tivoli Storage Manager server and client can use during processing. The default is to use a maximum of two sessions, one session to query the server and one session to send file data.
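The effect of a session limit can be pictured with a small Python sketch that processes several file specifications in parallel. The back_up_file_spec and run_sessions functions are hypothetical placeholders, and the max_sessions parameter only stands in for the idea of a limit; the sketch does not model how resourceutilization values map to session counts.

```python
from concurrent.futures import ThreadPoolExecutor


def back_up_file_spec(file_spec):
    # Placeholder for the work that one session does for one file specification.
    return f"backed up {file_spec}"


def run_sessions(file_specs, max_sessions):
    """Process file specifications concurrently, at most max_sessions at a time."""
    with ThreadPoolExecutor(max_workers=max_sessions) as pool:
        # map() returns the results in the order of the input file specifications.
        return list(pool.map(back_up_file_spec, file_specs))


results = run_sessions(
    ["/Volumes/filespace_A", "/Volumes/filespace_B"],
    max_sessions=2,
)
print(results)
```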

When to use
Use multiple backup sessions when you want to increase client performance, and you have sufficient client and server resources and processing capacity. For example, the server and client hardware must have sufficient memory, storage, and processor capacity to support multiple sessions. The network bandwidth must also be sufficient to handle the increased amount of data that flows across the network.
Advantages
Using more than one backup session can often lead to overall improvements in throughput.
Disadvantages
Running multiple backup sessions has the following disadvantages. Some workarounds are included.
  • During a multiple session backup operation, files from one file specification might be stored on multiple tapes on the server and interspersed with files from different file specifications. This arrangement can decrease restore performance.

    To avoid the performance degradation in restore operations, set the collocatebyfilespec option to yes. This setting eliminates the interspersing of files from different file specifications by limiting the client to one server session for each file specification. Therefore, if the data is stored to tape, the files for each file specification are stored together on one tape, unless another tape is required for more capacity.

  • The client might produce multiple accounting records.
  • The server might not start enough concurrent sessions. To avoid this situation, the maxsessions server parameter must be reviewed and possibly changed.
  • A query node command might not summarize the client activity.

Adaptive subfile backup

If you plan to back up your files over a network device with limited bandwidth, you can reduce network traffic by using adaptive subfile backup.

How it works
Adaptive subfile backup processing backs up the changed portion of a file on Windows clients.
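The general idea of sending only the changed portion of a file can be illustrated with a Python sketch. The fixed-size block comparison that is used here is a stand-in technique chosen for the example; the source does not describe the algorithm that the client uses, and the function names and the block size are hypothetical.

```python
BLOCK_SIZE = 4  # Unrealistically small so that the example is easy to follow.


def make_delta(base, new, block_size=BLOCK_SIZE):
    """Return the new length and the blocks of new that differ from base."""
    delta = {}
    for offset in range(0, len(new), block_size):
        block = new[offset:offset + block_size]
        if base[offset:offset + block_size] != block:
            delta[offset] = block
    return len(new), delta


def apply_delta(base, new_length, delta):
    """Rebuild the new version from the base file and the delta."""
    # Start from the base, truncated or zero-padded to the new length.
    result = bytearray(base[:new_length].ljust(new_length, b"\0"))
    for offset, block in delta.items():
        result[offset:offset + len(block)] = block
    return bytes(result)


base = b"AAAABBBBCCCC"
new = b"AAAAXXXXCCCCDD"

new_length, delta = make_delta(base, new)
print(delta)  # {4: b'XXXX', 12: b'DD'}
print(apply_delta(base, new_length, delta) == new)  # True
```

Only 6 of the 14 bytes are sent in this example, but the restore needs both the base file and the delta to rebuild the current version.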
When to use
Use adaptive subfile backup in the following situations:
  • The network is constrained.
  • The file sizes are small (less than 2 GB in size).
Advantages
Adaptive subfile backup processing has the following benefits:
  • Faster throughput.
  • Reduced storage pool consumption.
Disadvantages
Adaptive subfile backup processing has the following disadvantages:
  • This method uses a large amount of local cache space.
  • Some processing time is required during the backup.
  • The restore operations can take longer because both the base file and the delta files are restored.
  • The client can run out of disk space during the restore if disk space is constrained because of how files are reconstructed from the base files and the delta files.