You can configure the hadr_timeout and hadr_peer_window
database configuration parameters to optimize the response to a connection
failure.
- hadr_timeout database configuration parameter
- If an HADR database does not receive any communication
from its partner database for longer than the length of time specified
by the hadr_timeout database configuration parameter,
then the database concludes that the connection with the partner database
is lost. If the database is in peer state when the connection is lost,
then it moves into disconnected peer state if the hadr_peer_window database
configuration parameter is greater than zero, or into remote catchup
pending state if hadr_peer_window is not greater
than zero. The state change applies to both primary and standby databases.
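The state change described above can be sketched as a small decision
function. This is only an illustration of the rule as stated here, not
HADR code; the function name and the state strings are hypothetical.

```python
def state_after_connection_loss(hadr_peer_window: int) -> str:
    """Return the state that a database in peer state moves into
    once it concludes that the connection with its partner is lost.

    hadr_peer_window is the configured value in seconds.
    The function name and return strings are illustrative only.
    """
    if hadr_peer_window > 0:
        # A non-zero peer window keeps the database in disconnected peer state.
        return "disconnected peer"
    # With no peer window, the database goes to remote catchup pending state.
    return "remote catchup pending"


if __name__ == "__main__":
    for window in (0, 120):
        print(window, "->", state_after_connection_loss(window))
```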
- hadr_peer_window database configuration parameter
- The hadr_peer_window configuration parameter
does not replace the hadr_timeout configuration
parameter. The hadr_timeout configuration parameter
determines how long an HADR database waits before considering its
connection with the partner database as failed. The hadr_peer_window configuration
parameter determines whether the database goes into disconnected peer
state after the connection is lost, and how long the database should
remain in that state. HADR breaks the connection as soon as a
network error is detected during send, receive, or poll on the TCP
socket. HADR polls the socket every 100 milliseconds, which allows
it to respond quickly to network errors detected by the operating system. Only in
the worst case does HADR wait until the timeout expires to break a bad connection.
In this case, a database application that is running at the time of the
failure can be blocked for a period equal to the sum of the hadr_timeout and hadr_peer_window database
configuration parameters.
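As a worked example of the worst case, the following sketch adds the two
values. The hadr_timeout value of 120 seconds is the default; the
hadr_peer_window value of 300 seconds is an assumed figure chosen only for
illustration, and the function name is hypothetical.

```python
def worst_case_block_seconds(hadr_timeout: int, hadr_peer_window: int) -> int:
    """Upper bound, in seconds, on how long an application can be blocked
    when HADR has to wait for the full timeout before breaking the connection.
    """
    return hadr_timeout + hadr_peer_window


if __name__ == "__main__":
    hadr_timeout = 120       # default value, in seconds
    hadr_peer_window = 300   # assumed value for illustration, in seconds
    print(worst_case_block_seconds(hadr_timeout, hadr_peer_window))  # 420
```

In the common case, where the operating system reports the network error
promptly, the connection is broken well before hadr_timeout expires, so
the time spent waiting is closer to hadr_peer_window alone.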
- Setting the hadr_timeout and hadr_peer_window database
configuration parameters
- It is desirable to keep the waiting time that a database application
experiences to a minimum. Setting the hadr_timeout and hadr_peer_window configuration
parameters to small values reduces the time that a database application
must wait if an HADR standby database loses its connection with the
primary database. However, consider two other factors
when choosing values for the hadr_timeout and hadr_peer_window configuration
parameters:
- The hadr_timeout database configuration parameter
should be set to a value that is long enough to avoid false alarms
on the HADR connection caused by short, temporary network interruptions.
For example, the default value of hadr_timeout is
120 seconds, which is a reasonable value on many networks.
- The hadr_peer_window database configuration
parameter should be set to a value that is long enough to allow the
system to perform automated failure responses. If the HA system, for
example a cluster manager, detects the primary database failure before
disconnected peer state ends, a failover to the standby database
takes place. No data is lost in the failover because all data from the old
primary has been replicated to the new primary. If hadr_peer_window is
too short, the HA system might not have enough time to detect the failure
and respond.
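
One way to sanity-check a candidate hadr_peer_window value is to compare
it with the time that the HA system needs to detect a failure and to
respond. The sketch below assumes that these two times are known and that
they simply add up; both inputs and the function name are hypothetical, and
real cluster managers might need extra margin.

```python
def peer_window_is_sufficient(
    hadr_peer_window: int,
    detect_seconds: int,
    respond_seconds: int,
) -> bool:
    """Return True if the peer window is at least as long as the time the
    HA system needs to detect the primary failure and start the failover.

    detect_seconds and respond_seconds are assumed measurements for the
    HA system in use; they are not HADR configuration parameters.
    """
    return hadr_peer_window >= detect_seconds + respond_seconds


if __name__ == "__main__":
    # Assumed figures: 30 s to detect the failure, 45 s to respond.
    print(peer_window_is_sufficient(120, 30, 45))
    print(peer_window_is_sufficient(60, 30, 45))
```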