Recovering the CDC Replication Engine for Oracle databases after a database failover operation
CDC Replication requires a specific procedure to recover from an Oracle database failover operation.
About this task
In a disaster recovery configuration, users often set up their environments with a primary database and a physical standby database that are connected through Oracle Data Guard services. CDC Replication is normally configured to replicate data from the primary database.
These databases have two mutually exclusive roles: primary and standby. The roles can be interchanged, which is known as a role transition. A role transition is either planned or the result of a database failure:
- Planned switchover
  This scenario is a planned operation in which the primary database and the standby database change roles. A switchover guarantees no data loss. This planned operation occurs without having to re-instantiate either of the databases.
- Failover
  This scenario happens when the primary database fails, becomes unreachable and cannot be recovered in a timely manner. Failover might or might not result in data loss, depending on the data protection mode in use at the time of the failover. This type of transition requires a re-instantiation of the newly activated database.
A planned switchover consists of a series of steps that users follow to switch over their databases. The CDC Replication Engine for Oracle databases should be taken into consideration as part of that plan.
The recovery mode of the CDC Replication Engine for Oracle databases provides a solution for some unplanned failover cases. This mode does not support automatic failover; the failover procedure is still manual. Implementing the manual failover procedure results in a new configuration in which the former standby database (B) is the new primary database.
In the new configuration, CDC Replication replicates from the new primary database (B). The procedure described here does not cover how to move the CDC Replication Engine for Oracle databases from one machine to another. It assumes that the software is ready to start replication from the new primary database (B).
The failover includes a RESETLOGS operation on the new primary database (B). This operation does the following:
- Archives the current online redo logs if they are accessible
- Erases the contents of the online redo logs
- Resets the log sequence number to 1
- Creates the online redo log files if they do not currently exist
- Updates all current data files and online redo logs and all subsequent archived redo logs with a new RESETLOGS SCN and timestamp
The RESETLOGS operation creates a new incarnation of the database. If the CDC Replication Engine for Oracle databases is started after the RESETLOGS operation has run, replication fails because, in normal operation, CDC Replication can read logs only from the current incarnation of the database.
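The effect of a new incarnation on a log reader can be illustrated with a small toy model. The following Python sketch is not how Oracle or CDC Replication is implemented. The names ToyDatabase, switch_log, resetlogs, and can_resume are hypothetical and exist only for this illustration. It assumes that a reader's position can be represented as an (incarnation, sequence) pair.

```python
from dataclasses import dataclass, field


@dataclass
class ToyDatabase:
    """Hypothetical stand-in for a database's redo log bookkeeping."""
    incarnation: int = 1
    next_sequence: int = 1
    # Set of (incarnation, sequence) pairs that have been archived.
    archived: set = field(default_factory=set)

    def switch_log(self):
        """Archive the current log and advance the log sequence number."""
        self.archived.add((self.incarnation, self.next_sequence))
        self.next_sequence += 1

    def resetlogs(self):
        """Archive the current log, start a new incarnation, reset the sequence to 1."""
        self.archived.add((self.incarnation, self.next_sequence))
        self.incarnation += 1
        self.next_sequence = 1


def can_resume(db, bookmark):
    """A reader limited to the current incarnation can resume only if its
    bookmark (incarnation, sequence) belongs to that incarnation."""
    bookmark_incarnation, _ = bookmark
    return bookmark_incarnation == db.incarnation


db = ToyDatabase()
for _ in range(3):
    db.switch_log()

bookmark = (db.incarnation, db.next_sequence)  # reader position before failover
resumable_before = can_resume(db, bookmark)

db.resetlogs()
resumable_after = can_resume(db, bookmark)

print(resumable_before, resumable_after, db.incarnation, db.next_sequence)
```

In this model, the reader's bookmark still points at the previous incarnation after the reset, so a reader restricted to the current incarnation has no valid starting point.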
To enable CDC Replication to resume replication, CDC Replication must be run in recovery mode. In recovery mode, CDC Replication reads logs from the previous incarnation of the database. The CDC Replication Engine for Oracle databases uses the dmfailoverrecovery command-line utility to implement this recovery mode.
The dmfailoverrecovery command enables the CDC Replication Engine for Oracle databases to read logs from the previous incarnation of the database until all required logs are processed. If the recovery step finishes successfully, the CDC Replication Engine for Oracle databases resumes normal replication of the new database incarnation.
The dmfailoverrecovery command starts replication for all configured subscriptions and mirrors data until all logs from the previous incarnation are processed and the last available SCN on the previous incarnation is reached.
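The order of processing described above can be sketched as a toy model: drain the remaining log records of the previous incarnation up to its last available SCN, then continue with the new incarnation. This Python sketch is illustrative only and does not reflect the internals or the options of the dmfailoverrecovery command. LogRecord, replay_with_recovery, and the sample records are hypothetical.

```python
from typing import NamedTuple


class LogRecord(NamedTuple):
    incarnation: int
    scn: int
    payload: str


def replay_with_recovery(records, bookmark_scn, previous_incarnation):
    """Return the payloads to apply after the bookmark, plus the last SCN
    that was reached on the previous incarnation.

    Records of the previous incarnation are processed first, followed by
    records of any newer incarnation.
    """
    old = sorted(
        (r for r in records
         if r.incarnation == previous_incarnation and r.scn > bookmark_scn),
        key=lambda r: r.scn,
    )
    new = sorted(
        (r for r in records if r.incarnation > previous_incarnation),
        key=lambda r: (r.incarnation, r.scn),
    )
    # If nothing remains on the previous incarnation, the bookmark is the end.
    last_old_scn = old[-1].scn if old else bookmark_scn
    applied = [r.payload for r in old] + [r.payload for r in new]
    return applied, last_old_scn


# Hypothetical sample data: incarnation 1 precedes the failover, 2 follows it.
records = [
    LogRecord(2, 130, "delete A"),
    LogRecord(1, 100, "insert A"),
    LogRecord(1, 120, "insert B"),
    LogRecord(1, 110, "update A"),
]

applied, last_old_scn = replay_with_recovery(
    records, bookmark_scn=100, previous_incarnation=1
)
print(applied, last_old_scn)
```

The bookmark at SCN 100 marks work that was already replicated, so only later records of the previous incarnation are applied before the model moves on to the new incarnation.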
To continue replication after a database failover, perform the following procedure after executing the manual failover procedure and configuring the CDC Replication Engine for Oracle databases with the new primary database (B):