Failover and failback operations

The failover operation is the process of switching production to a backup facility (normally your recovery site). A failback operation is the process of returning production to its original location after a disaster or a scheduled maintenance period.

There are times, both planned and unplanned, when it is necessary to suspend disk mirroring and to make use of the secondary storage unit in your configuration. As a manual process, this can be complex. However, failover and failback recovery operations are available to simplify this process and reduce the risk of error and the time it takes to switch sites and restart I/O operations.

Failover temporarily switches production to a backup facility (normally your recovery site) for a scheduled maintenance period or after a disaster at your production (or local) site. A failover operation is always followed by a failback operation, which returns production to its original location. These operations use remote mirror and copy functions to reduce the time that is required to synchronize volumes after switching sites during planned or unplanned outages.

The failover and failback operations allow change recording to be enabled on the target volumes without having to communicate between the target and source storage units. This method eliminates the need to perform a full volume copy from your recovery site to the production site, which can reduce the time that is required to resume operations at your production site.

In a typical remote mirror and copy environment, processing temporarily fails over to the storage unit at your recovery site if an outage occurs at the production site. The failover operation changes the state of the storage unit in your target configuration so that it is recognized as the source storage unit in the pair. Because the failover process puts the volumes into a suspended state, changes are tracked within a bitmap. Assuming that change recording is enabled, only changed data is sent to the production site to synchronize the volumes, which reduces the time that is required to complete the failback operation.
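
The effect of change recording can be illustrated with a minimal Python sketch. This is a toy model and not the storage unit's actual implementation: the class name SuspendedVolume, the function resynchronize, and the track-level granularity are assumptions made for illustration only. The sketch shows why a bitmap lets the failback operation send only the changed data instead of performing a full volume copy.

```python
class SuspendedVolume:
    """Toy model (hypothetical) of a volume that records changes while suspended."""

    def __init__(self, tracks):
        # Copy the initial content so the two sites hold independent data.
        self.tracks = list(tracks)
        # One flag per track; True means the track changed since suspension.
        self.changed = [False] * len(self.tracks)

    def write(self, index, data):
        """Apply a host write and mark the track in the bitmap."""
        self.tracks[index] = data
        self.changed[index] = True


def resynchronize(source, target_tracks):
    """Copy only the flagged tracks to the target; return how many were sent."""
    sent = 0
    for index, is_changed in enumerate(source.changed):
        if is_changed:
            target_tracks[index] = source.tracks[index]
            source.changed[index] = False  # Track is in sync again.
            sent += 1
    return sent


# Both sites start with identical content.
production = ["a", "b", "c", "d"]
recovery = SuspendedVolume(production)

# After failover, hosts write to the recovery site.
recovery.write(1, "B")
recovery.write(3, "D")

# Failback sends only the two changed tracks, not all four.
tracks_sent = resynchronize(recovery, production)
```

In this model, the cost of resynchronization depends on the number of tracks written while the sites were switched, not on the size of the volume.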

When it is safe to return to your production site, assuming that no physical damage has occurred to the storage unit at that location, you can delete the existing paths and create new ones from your production site to your recovery site. Then, you can create a failback recovery request to restore the storage unit as the production storage unit in the relationship.

The following considerations are for failover and failback operations: