11.3. Managing Multiple-Standby Servers
This section describes how streaming replication works with multiple standby servers.
11.3.1. Replication Priority and Synchronous States
The primary server classifies each standby server based on its role and reliability to manage multiple standbys efficiently in a synchronous replication environment. This classification depends on two internal attributes: sync_priority and sync_state.
The primary server assigns these attributes to all managed standby servers and treats each server according to these values. The primary performs this assignment even if it manages only one standby server.
11.3.1.1. sync_priority
The sync_priority attribute indicates the priority of a standby server in synchronous mode.
A lower value represents a higher priority. The special value 0 means the standby server is in asynchronous mode.
The primary server assigns priorities based on the order listed in the synchronous_standby_names parameter.
For example, in the following configuration, the priorities of standby1 and standby2 are 1 and 2, respectively.
synchronous_standby_names = 'standby1, standby2'Standby servers not listed in this parameter operate in asynchronous mode with a priority of 0.
11.3.1.2. sync_state
The sync_state attribute indicates the current state of the standby server. It can be one of the following:
- sync: The standby server is in synchronous mode and is the highest-priority standby server currently working.
- potential: The standby server is in synchronous mode and is a lower-priority standby server. If the current sync standby fails, this server is promoted to the sync state.
- async: The standby server is in asynchronous mode. It never enters sync or potential states.
- quorum: The standby servers operate in quorum mode. See Section 11.3.2.1 for details.
Issuing the following query displays the priority and state of the standby servers:
testdb=# SELECT application_name AS host,
sync_priority, sync_state FROM pg_stat_replication;
host | sync_priority | sync_state
----------+---------------+------------
standby1 | 1 | sync
standby2 | 2 | potential
(2 rows)11.3.2. How the Primary Manages Multiple-standbys
The primary server waits for ACK responses from the synchronous standby server alone. It confirms only the synchronous standby’s writing and flushing of WAL data. Streaming replication, therefore, ensures that only the synchronous standby remains in a consistent and synchronous state with the primary.
Figure 11.4 shows a case where the ACK response from the potential standby arrives earlier than that from the primary (sync) standby.
Figure 11.4. Managing multiple standby servers.
- The primary’s backend process continues to wait for an ACK response from the synchronous standby server, even after receiving an ACK from the potential standby.
- After receiving the ACK from the synchronous standby, the backend process releases the latch and completes the transaction.
In the opposite case (the sync standby’s ACK returns before the potential’s), the primary server immediately completes the commit without verifying if the potential standby has written or flushed WAL data.
11.3.2.1. Quorum-Based Synchronous Replication
PostgreSQL 9.6 introduced quorum-based synchronous replication. This feature allows transactions to be considered committed once a subset (a quorum) of synchronous standby servers acknowledges them.
Quorum-based synchronous replication has two modes: ANY and FIRST.
ANY Mode
The format of synchronous_standby_names for ANY mode is:
ANY num_sync ( standby_name [, ...] )In ANY mode, the primary server completes the commit of the current transaction once any ’num_sync’ standby servers in the list return ACK responses.
For example, the following setting allows the commit as soon as any two standby servers return responses:
synchronous_standby_names = 'ANY 2 (standby1, standby2, standby3)'Figure 11.5 illustrates the behavior of the ANY Mode setting:
Figure 11.5. Behavior of the ANY Mode Setting.
The state of sync_priority and sync_state for this setting is:
testdb=# SELECT application_name AS host,
sync_priority, sync_state FROM pg_stat_replication;
host | sync_priority | sync_state
----------+---------------+------------
standby1 | 1 | quorum
standby2 | 1 | quorum
standby3 | 1 | quorum
(3 rows)FIRST Mode
The format of synchronous_standby_names for FIRST mode is:
[FIRST] num_sync ( standby_name [, ...] )In FIRST mode, the primary server completes the commit after the first ’num_sync’ standby servers in the list return ACK responses.
For example, the following setting causes the primary to wait until standby1 and standby2 return ACKs, even if standby3 responds earlier:
synchronous_standby_names = 'FIRST 2 (standby1, standby2, standby3)'Figure 11.6 illustrates the behavior of the FIRST Mode setting:
Figure 11.6. Behavior of the FIRST Mode Setting.
The following is the state of sync_priority and sync_state for the above setting:
SELECT application_name AS host,
sync_priority, sync_state FROM pg_stat_replication;
host | sync_priority | sync_state
----------+---------------+------------
standby3 | 3 | potential
standby2 | 2 | sync
standby1 | 1 | sync
(3 rows)11.3.3. Behavior When a Failure Occurs
When either a potential or an asynchronous standby server fails, the primary server terminates the walsender process connected to the failed standby and continues all processing. In other words, a failure of either standby type does not affect transaction processing on the primary server.
When a synchronous standby server fails, the primary server terminates the walsender process connected to the failed standby and replaces the synchronous standby with the highest-priority potential standby. See Figure 11.7.
Figure 11.7. Replacing of synchronous standby server.
Unlike a failure of a potential or asynchronous standby, a failure of a synchronous standby causes query processing on the primary server to pause until the replacement process is complete. (Therefore, failure detection is a critical function for increasing the availability of the replication system. Failure detection is described in the next section.)
In any case, if one or more standby servers run in synchronous mode, the primary server maintains exactly one synchronous standby server at all times. This synchronous standby server remains in a consistent and synchronous state with the primary.