TAKEOVER HADR command
The TAKEOVER HADR command instructs an HADR standby database to take over as the new HADR primary database for the HADR pair. This is a cluster-wide command in a Db2® pureScale® environment, so you can issue it on any member on the standby, including non-replay members.
Authorization
One of the following authorities:
- SYSADM
- SYSCTRL
- SYSMAINT
Required connection
Instance. The command establishes a database connection if one does not exist, and closes the database connection when the command completes.
Command syntax
Command parameters
- DATABASE database-alias
- Identifies the current HADR standby database that should take over as the HADR primary database.
- USER user-name
- Identifies the user name under which the takeover operation is to be started.
- USING password
- The password used to authenticate user-name.
- BY FORCE
- Specifies that the database is not to wait for confirmation that the original HADR primary
database has been shut down. Unless you are using SUPERASYNC
synchronization mode, this option is required if the HADR pair is not in peer state.
- PEER WINDOW ONLY
- When this option is specified, no committed transactions are lost if the command
succeeds and the primary database is brought down before the end of the peer window period (this
requires the database configuration parameter hadr_peer_window to be set to a non-zero value). If
the primary database is not brought down before the peer window expires, a split brain
results. If the TAKEOVER BY FORCE PEER WINDOW ONLY command is executed when
the HADR pair is not in peer or disconnected peer state (that is, the peer window has expired), an
error is returned.
You cannot use the PEER WINDOW ONLY option when the synchronization mode is set to ASYNC or SUPERASYNC.
Note: The takeover operation with the PEER WINDOW ONLY option can behave incorrectly if the primary database clock and the standby database clock are not synchronized to within 5 seconds of each other. That is, the operation may succeed when it should fail, or fail when it should succeed. You should use a time synchronization service (for example, NTP) to keep the clocks synchronized to the same source.
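The parameter rules above can be sketched as a small helper that assembles the command text. This is a minimal illustration, not part of Db2: the function name `build_takeover_command` and its `sync_mode` argument are hypothetical, and the `TAKEOVER HADR ON DATABASE ...` keyword order is an assumption based on the parameter list, because the syntax diagram is not reproduced in this section.

```python
from typing import Optional

# Synchronization modes for which PEER WINDOW ONLY cannot be used (per the text above).
PEER_WINDOW_UNSUPPORTED_MODES = {"ASYNC", "SUPERASYNC"}


def build_takeover_command(
    database_alias: str,
    user: Optional[str] = None,
    password: Optional[str] = None,
    by_force: bool = False,
    peer_window_only: bool = False,
    sync_mode: Optional[str] = None,
) -> str:
    """Assemble a TAKEOVER HADR command string (hypothetical helper)."""
    # PEER WINDOW ONLY is a sub-option of BY FORCE.
    if peer_window_only and not by_force:
        raise ValueError("PEER WINDOW ONLY is only valid together with BY FORCE")
    # The option is rejected for ASYNC and SUPERASYNC synchronization modes.
    if (
        peer_window_only
        and sync_mode is not None
        and sync_mode.upper() in PEER_WINDOW_UNSUPPORTED_MODES
    ):
        raise ValueError("PEER WINDOW ONLY cannot be used with " + sync_mode.upper())
    # USING authenticates the name given with USER, so it needs a user name.
    if password is not None and user is None:
        raise ValueError("USING password requires USER user-name")

    parts = ["TAKEOVER HADR ON DATABASE", database_alias]
    if user is not None:
        parts += ["USER", user]
        if password is not None:
            parts += ["USING", password]
    if by_force:
        parts.append("BY FORCE")
        if peer_window_only:
            parts.append("PEER WINDOW ONLY")
    return " ".join(parts)
```

For example, `build_takeover_command("SAMPLE", by_force=True, peer_window_only=True)` returns `TAKEOVER HADR ON DATABASE SAMPLE BY FORCE PEER WINDOW ONLY`, where `SAMPLE` is a placeholder alias. The helper only checks the option combinations described in this section; the HADR state and clock synchronization still have to be verified separately.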
Usage notes
Table 1 and Table 2 show the behavior of the TAKEOVER HADR command when issued on an active standby database for each possible state and option combination. An error message is returned if this command is issued on an inactive standby database.
Table 1. Without the BY FORCE option

Standby state | Takeover behavior |
---|---|
Disconnected peer | Takeover fails and an error message is returned. |
Local catchup | Takeover fails and an error message is returned. |
Peer | The primary database and standby database switch roles. If no failure is encountered during takeover, there is no data loss. However, if failures are encountered during takeover, data loss might occur and the roles of the primary and standby might or might not have been changed. The following is a guideline for handling failures during a takeover in which the primary and standby switch roles: |
Remote catchup | Non-forced takeover is allowed in remote catchup state only if one of the following is true: |
Remote catchup pending | Takeover fails and an error message is returned. |

Table 2. With the BY FORCE option

Standby state | Takeover behavior |
---|---|
Disconnected peer (without the PEER WINDOW ONLY option) | The standby database becomes the primary database, but there is no assurance of data consistency. Note: A "no transaction loss" takeover is also possible using the TAKEOVER BY FORCE command without the PEER WINDOW ONLY option (that is, unconditional failover), as long as the necessary conditions hold. Such a failover can be executed even long after the expiration of the peer window that was in effect when the primary failed. |
Disconnected peer (with the PEER WINDOW ONLY option) | The standby database becomes the primary database, and there is a greater assurance of data consistency than if you did not specify the PEER WINDOW ONLY option. There are situations in which data loss can still happen: |
Local catchup | In most cases, takeover fails and an error message is returned. The exception is when primary reintegration is in progress; during the reintegration, a forced takeover is allowed on a standby in local catchup state. |
Peer | The standby database becomes the primary database, but there is no assurance of data consistency. Even with SYNC and NEARSYNC mode, the primary can fall out of peer state and commit more transactions, with the standby still in peer state and not aware of the primary's state change (the primary and standby may not notice network connection breakage at the same time). |
Remote catchup | The standby database becomes the primary database, but there is a risk of data loss. |
Remote catchup pending | The standby database becomes the primary database, but there is a risk of data loss. If log retrieval is in progress (retrieval only happens in remote catchup pending state), retrieval is stopped as part of the takeover process. |
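The two tables can be read as a lookup from standby state and options to an expected outcome. The following sketch encodes them that way for use in a runbook or pre-flight script. It is an illustration only: the names `Outcome` and `expected_outcome` are hypothetical, the state strings are assumed to match the table labels, and the conditional cases are reported as conditional because their detailed conditions are not listed in this section.

```python
from enum import Enum


class Outcome(Enum):
    FAILS = "takeover fails and an error is returned"
    ROLE_SWITCH = "primary and standby switch roles"
    CONDITIONAL = "allowed only if additional conditions hold"
    FAILS_UNLESS_REINTEGRATING = "fails unless primary reintegration is in progress"
    NO_ASSURANCE = "standby becomes primary, no assurance of data consistency"
    GREATER_ASSURANCE = "standby becomes primary, greater assurance of data consistency"
    RISK_OF_DATA_LOSS = "standby becomes primary, risk of data loss"


# Table 1: takeover without BY FORCE.
NON_FORCED = {
    "disconnected peer": Outcome.FAILS,
    "local catchup": Outcome.FAILS,
    "peer": Outcome.ROLE_SWITCH,
    "remote catchup": Outcome.CONDITIONAL,
    "remote catchup pending": Outcome.FAILS,
}

# Table 2: takeover with BY FORCE (without PEER WINDOW ONLY).
FORCED = {
    "disconnected peer": Outcome.NO_ASSURANCE,
    "local catchup": Outcome.FAILS_UNLESS_REINTEGRATING,
    "peer": Outcome.NO_ASSURANCE,
    "remote catchup": Outcome.RISK_OF_DATA_LOSS,
    "remote catchup pending": Outcome.RISK_OF_DATA_LOSS,
}


def expected_outcome(state: str, by_force: bool = False,
                     peer_window_only: bool = False) -> Outcome:
    """Look up the documented behavior for an active standby (hypothetical helper)."""
    key = state.strip().lower()
    if key not in NON_FORCED:
        raise ValueError("unknown standby state: " + state)
    if not by_force:
        if peer_window_only:
            raise ValueError("PEER WINDOW ONLY requires BY FORCE")
        return NON_FORCED[key]
    if peer_window_only:
        # An error is returned outside peer or disconnected peer state.
        if key == "disconnected peer":
            return Outcome.GREATER_ASSURANCE
        if key != "peer":
            return Outcome.FAILS
    return FORCED[key]
```

For instance, `expected_outcome("Peer")` gives `Outcome.ROLE_SWITCH`, while `expected_outcome("Remote catchup", by_force=True, peer_window_only=True)` gives `Outcome.FAILS`. The lookup applies only to an active standby; an inactive standby returns an error regardless of state.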
When you issue the TAKEOVER HADR command, one of the following error codes might be returned: SQL1767N, SQL1769N, or SQL1770N with a reason code of 98. The reason code indicates that there is no installed license for HADR on the server where the command was issued. To correct the problem, install a valid HADR license using the db2licm command, or install a version of the server that contains a valid HADR license as part of its distribution.
- If you do not have an immediate need to connect to the standby database, wait for the UPGRADE DATABASE command to complete on the primary database and for the standby database to replay all upgrade log records that were sent from the primary database, and then reissue the command.
- If you need to connect to this standby database immediately, issue the STOP HADR command to turn the HADR role to STANDARD.
- Forced takeover stops log shipping or log retrieval on the standby. Log replay continues to the end of received or retrieved logs.
- During a forced takeover, if the standby is connected to the old primary, it sends a
poison pill, or a disabling message, to the old primary. This is done on a best effort
basis; due to network, hardware, or software problems, the old primary might not receive the
disabling message or correctly process it. Once the disabling message is received, the old primary
should persist the poison pill to disk and shut itself down. As long as the pill is in effect, the
old primary cannot be restarted. The pill is cleared only when one of the following commands is
issued on the old primary:
- START HADR with the AS STANDBY option (that is, the old primary is reintegrated as a new standby)
- START HADR with the AS PRIMARY and BY FORCE options (that is, the old primary is explicitly restarted as the primary, for example because the new primary failed to serve as the primary and the user switches back to the old primary, or because the user wants a clone of the database)
- STOP HADR (that is, the database is no longer an HADR database)
- DROP DATABASE
- RESTORE DATABASE
- Takeover and reads on standby
- If you have reads on standby enabled, any user application currently connected to the standby is disconnected to allow the takeover to proceed. Depending on the number of readers that are active on the standby, the takeover operation can take slightly longer to complete than it would if there were no readers on the standby. New connections are not allowed during the role switch. Any attempt to connect to the HADR standby during the role switch on takeover receives an error (SQL1776N).
- Takeover and log spooling
- If you are using a high value for hadr_spool_limit, consider that a large gap between the log position of the primary and log replay on the standby might lead to a longer takeover time, because the standby cannot assume the role of the new primary until the replay of the spooled logs finishes.
- Takeover and delayed replay
- If you have configured hadr_replay_delay to a non-zero value, you cannot issue the command on that standby (SQL1770N).
- Takeover in a Db2 pureScale environment
- The following considerations apply to Db2 pureScale environments:
- All log streams must pass the check to allow a takeover command to proceed. However, the streams do not need to be in the same state.
- When a primary database changes role into a standby database, a member that has a direct connection to the old standby is chosen as the replay member, with preference given to the preferred replay member (the preferred member is not chosen if it has no direct connection to the standby). Non-replay members are deactivated.
- When a standby database changes role into a primary database, only the old replay member stays active; other members on the new primary are not activated.
- Non-forced takeover is not allowed if any member on the primary is in member crash recovery (MCR) pending or in progress state.
- Non-forced takeover is not allowed if the primary database is in group crash recovery because the streams cannot be in the required state.
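Several of the usage notes above describe conditions that block a takeover before it starts. The sketch below collects them into one pre-flight check. It is a hedged illustration: `StandbySnapshot`, its fields, and `takeover_blockers` are hypothetical names, and the values are assumed to be gathered by the operator (for example from monitoring output) rather than queried by this code.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class StandbySnapshot:
    """Operator-supplied facts about the HADR pair (hypothetical structure)."""
    hadr_replay_delay: int = 0            # configured value on this standby
    members_in_mcr: int = 0               # primary members in MCR pending or in progress
    group_crash_recovery: bool = False    # primary is in group crash recovery
    log_streams_pass_check: List[bool] = field(default_factory=lambda: [True])


def takeover_blockers(snap: StandbySnapshot, by_force: bool = False) -> List[str]:
    """Return the documented reasons a takeover would be refused (empty if none)."""
    blockers = []
    # Delayed replay: the command cannot be issued on this standby at all.
    if snap.hadr_replay_delay != 0:
        blockers.append("hadr_replay_delay is non-zero (SQL1770N)")
    # pureScale: every log stream must pass the check, though states may differ.
    if not all(snap.log_streams_pass_check):
        blockers.append("at least one log stream does not pass the takeover check")
    # The crash recovery restrictions apply to non-forced takeover only.
    if not by_force:
        if snap.members_in_mcr > 0:
            blockers.append("a primary member is in MCR pending or in progress state")
        if snap.group_crash_recovery:
            blockers.append("the primary database is in group crash recovery")
    return blockers
```

An empty list means none of the restrictions covered here applies; it does not guarantee that the takeover succeeds, because the standby state rules in Table 1 and Table 2 still apply.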