Setting the hadr_timeout and hadr_peer_window database configuration parameters
You can configure the hadr_timeout and hadr_peer_window database configuration parameters for optimal response to a connection failure.
- hadr_timeout database configuration parameter
- If an HADR database does not receive any communication from its partner database for longer than the length of time that is specified by the hadr_timeout database configuration parameter, then the database concludes that the connection with the partner database is lost. If the database is in peer state when the connection is lost, then it moves into disconnected peer state if the hadr_peer_window database configuration parameter is greater than zero, or into remote catchup pending state if hadr_peer_window is not greater than zero. The state change applies to both primary and standby databases.
- hadr_peer_window database configuration parameter
- The hadr_peer_window configuration parameter
does not replace the hadr_timeout configuration
parameter. The hadr_timeout configuration parameter
determines how long an HADR database waits before it considers
its connection with the partner database to have failed. The hadr_peer_window configuration
parameter determines whether the database goes into disconnected peer
state after the connection is lost, and how long the database remains
in that state. HADR breaks the connection as soon as a network error
is detected during send, receive, or poll on the TCP socket. HADR
polls the socket every 100 milliseconds. This frequency allows it
to respond quickly to network errors detected by the OS. Only in the
worst case does HADR wait until the timeout to break a bad connection. In
this case, a database application that is running at the time of failure
can be blocked for a period equal to the sum of the hadr_timeout and hadr_peer_window database
configuration parameters.
Note: The HADR peer window is not supported in a Db2® pureScale® environment. Attempts to update it to a nonzero value fail with a warning, and the START HADR command fails if hadr_peer_window is not set to 0.
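As a rough illustration of how the two parameters interact, the following Python sketch restates the rules described above: the state that a peer-state database enters when the connection is lost, and the worst-case time that an application can be blocked. The function names and the peer window value of 300 seconds are hypothetical and chosen only for illustration; they are not part of any Db2 interface.

```python
def state_after_connection_loss(hadr_peer_window: int) -> str:
    """Return the state entered by a database that was in peer state
    when the connection to its partner was lost."""
    if hadr_peer_window > 0:
        return "disconnected peer"
    return "remote catchup pending"


def worst_case_block_seconds(hadr_timeout: int, hadr_peer_window: int) -> int:
    """Worst case: HADR waits the full hadr_timeout before breaking the
    connection, and the application then waits out the whole peer window."""
    if hadr_timeout < 0 or hadr_peer_window < 0:
        raise ValueError("parameter values must not be negative")
    return hadr_timeout + hadr_peer_window


if __name__ == "__main__":
    # 120 is the default hadr_timeout; 300 is an assumed peer window.
    print(state_after_connection_loss(0))
    print(state_after_connection_loss(300))
    print(worst_case_block_seconds(120, 300))
```

In most failures the network error is detected within one polling interval, so the time actually spent waiting on hadr_timeout is usually far shorter than this upper bound.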
- Setting the hadr_timeout and hadr_peer_window database configuration parameters
- It is desirable to keep the waiting time that a database application
experiences to a minimum. Setting the hadr_timeout and hadr_peer_window configuration
parameters to small values would reduce the time that a database application
must wait if an HADR standby database loses its connection with the
primary database. However, you should also consider the following
details when you are choosing values to assign to the hadr_timeout and hadr_peer_window configuration
parameters:
- Set the hadr_timeout database configuration parameter to a value that is long enough to avoid false alarms on the HADR connection that are caused by short, temporary network interruptions. For example, the default value of hadr_timeout is 120 seconds, which is a reasonable value on many networks.
- Set the hadr_peer_window database configuration
parameter to a value that is long enough to allow the system to perform
automated failure responses. If the HA system, for example a cluster
manager, detects a primary database failure before disconnected peer
state ends, a failover to the standby database takes place. Data is
not lost in the failover because all data from the old primary is replicated
to the new primary. If the peer window is too short, the HA system might
not have enough time to detect the failure and respond.
Note: The principal standby uses the primary's setting for hadr_peer_window (the effective peer window). The setting for hadr_peer_window on any auxiliary standby is meaningless because that type of standby always runs in SUPERASYNC mode.
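The sizing guidance above can be sketched in Python as a simple check before the values are applied. This is a minimal sketch under stated assumptions: the helper names, the database name SAMPLE, and the detection and response times are all hypothetical, and the second function only builds the text of a Db2 command line processor UPDATE DATABASE CONFIGURATION command without running it.

```python
def peer_window_covers_failover(hadr_peer_window: int,
                                detection_seconds: int,
                                response_seconds: int) -> bool:
    """True if the peer window outlasts the time the HA system is assumed
    to need to detect the failure and complete its automated response."""
    return hadr_peer_window >= detection_seconds + response_seconds


def build_update_command(db_name: str, hadr_timeout: int,
                         hadr_peer_window: int) -> str:
    """Build the command text only; nothing is executed here."""
    return (
        f"db2 update db cfg for {db_name} "
        f"using HADR_TIMEOUT {hadr_timeout} HADR_PEER_WINDOW {hadr_peer_window}"
    )


if __name__ == "__main__":
    # Assumed values: 60 s to detect the failure, 30 s to respond.
    timeout, window = 120, 300
    if peer_window_covers_failover(window, detection_seconds=60,
                                   response_seconds=30):
        print(build_update_command("SAMPLE", timeout, window))
    else:
        print("Peer window is shorter than the assumed failover time.")
```

The detection and response times depend on the cluster manager and its configuration, so measure them in your own environment instead of relying on the placeholder values.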