12.2. Starting Logical Replication

Alpha Version: Work in progress.

Before setting up logical replication, we assume the following:

  • The Publisher has the tables that will be replicated to the Subscriber.
  • The Subscriber has already created tables with the same structure as those on the Publisher, but without any data.

In this section, we use the following tables:

CREATE TABLE tbl_1 (id int PRIMARY KEY, name text, data int);
CREATE TABLE tbl_2 (id int NOT NULL UNIQUE, name text, data int);
CREATE INDEX tbl_2_idx on tbl_2 (id, name);
CREATE TABLE tbl_3 (id int, name text, data int);
ALTER TABLE tbl_3 REPLICA IDENTITY FULL;

To set up logical replication, two commands are issued:

12.2.1. Creating Publication

The CREATE PUBLICATION command creates the publication data in the following system catalogs, and updates the pg_class if necessary:

testdb=# -- Publisher
testdb=# CREATE PUBLICATION mypub FOR TABLE tbl_1, tbl_2, tbl_3;
CREATE PUBLICATION
testdb=# \x
Expanded display is on.
testdb=# SELECT * FROM pg_publication;
-[ RECORD 1 ]+------
oid          | 16460
pubname      | mypub
pubowner     | 10
puballtables | f
pubinsert    | t
pubupdate    | t
pubdelete    | t
pubtruncate  | t
pubviaroot   | f
pubgencols   | n

testdb=# \x
Expanded display is off.
testdb=# SELECT * FROM pg_publication_rel;
  oid  | prpubid | prrelid | prqual | prattrs
-------+---------+---------+--------+---------
 16461 |   16460 |   16438 |        |
 16462 |   16460 |   16446 |        |
 16463 |   16460 |   16455 |        |
(3 rows)

The system view pg_publication_tables shows information about the mapping between publications and the tables they belong to.

testdb=# -- Publisher
testdb=# SELECT * FROM pg_publication_tables;
 pubname | schemaname | tablename |    attnames    | rowfilter
---------+------------+-----------+----------------+-----------
 mypub   | public     | tbl_1     | {id,name,data} |
 mypub   | public     | tbl_2     | {id,name,data} |
 mypub   | public     | tbl_3     | {id,name,data} |
(3 rows)

12.2.2. Creating Subscription

A subscription is created by issuing the CREATE SUBSCRIPTION command on the Subscriber.

testdb=# -- subscriber
testdb=# CREATE SUBSCRIPTION mysub
            CONNECTION 'host=192.168.3.10 port=5432 dbname=testdb'
            PUBLICATION mypub
            WITH (enabled = true, binary = false);

This process can be divided into three phases:

  • Phase 1: Create the subscription, and (by default) request the creation of the corresponding replication slot on the Publisher.
  • Phase 2: Connect an apply worker to a walsender on the Publisher.
  • Phase 3: Synchronize the tables with the corresponding tables on the Publisher.
Figure 12.9. Logical Replication Initializing Sequence.

In the following, we explore these phases.

Phase 1:

This phase involves two primary tasks:

  • Creating the subscription data in the system catalogs pg_subscription and pg_subscription_rel.
  • (By Default) Requesting the creation of the replication slot on the publisher to manage this logical replication.

To deal with these tasks, the subscriber performs the following sequence:

  1. The postgres process that issues ‘CREATE SUBSCRIPTION’ establishes a connection with a walsender.
  2. The postgres process interacts with the walsender to perform the following:
    • Check if the specified publication exists.
    • Retrieve the target table’s schemas and OIDs to ensure structural compatibility.
    • (By default) request the creation of the replication slot to manage this logical replication. Note that if a replication slot already created is specified in the CREATE SUBSCRIPTION command, this request is omitted.
  3. The postgres process inserts the subscription data into the system catalogs ‘pg_subscription’ and ‘pg_subscription_rel’.
  4. The Publisher terminates the walsender process.
testdb=# -- Subscriber
testdb=# SELECT * FROM pg_subscription;
-[ RECORD 1 ]-------+---------------------------------------
oid                 | 16436
subdbid             | 16388
subskiplsn          | 0/0
subname             | mysub
subowner            | 10
subenabled          | t
subbinary           | f
substream           | p
subtwophasestate    | d
subdisableonerr     | f
subpasswordrequired | t
subrunasowner       | f
subfailover         | f
subconninfo         | host=192.168.3.10 port=5432 dbname=testdb
subslotname         | mysub
subsynccommit       | off
subpublications     | {mypub}
suborigin           | any

testdb=# \x
Expanded display is off.
testdb=# SELECT s.srsubid, s.srrelid, c.relname, s.srsublsn
testdb-#       FROM pg_subscription_rel AS s, pg_class AS c WHERE c.oid = s.srrelid;
 srsubid | srrelid | relname | srsublsn
---------+---------+---------+-----------
   16436 |   16414 | tbl_1   | 0/1BF35E0
   16436 |   16422 | tbl_2   | 0/1BF35E0
   16436 |   16431 | tbl_3   | 0/1BF5320
(3 rows)

The replication slot on the publisher is created as follows:

testdb=# -- Publisher
testdb=# SELECT * FROM pg_replication_slots;
-[ RECORD 1 ]-------+----------
slot_name           | mysub
plugin              | pgoutput
slot_type           | logical
datoid              | 16384
database            | testdb
temporary           | f
active              | t
active_pid          | 2051
xmin                |
catalog_xmin        | 823
restart_lsn         | 0/1BF5320
confirmed_flush_lsn | 0/1BF5358
wal_status          | reserved
safe_wal_size       |
two_phase           | f
two_phase_at        |
inactive_since      |
conflicting         | f
invalidation_reason |
failover            | f
synced              | f
Phase 2:

This phase initiates the background processes required for logical replication after the subscription has been created.

  1. The Logical Replication Launcher starts an apply worker.
  2. The apply worker establishes a connection to a walsender on the publisher and initializes its replication origin (creating an entry in the system catalog pg_replication_origin if it does not already exist).

These two processes — the walsender on the publisher and the apply worker on the subscriber — continue running to stream and apply changes.

Phase 3:

This phase synchronizes the tables with the corresponding tables on the publisher.

To deal with this task, the logical replication launcher launches apply workers to retrieve the rows of the target tables.

During this process, the launcher will launch as many workers as possible to maximize efficiency. The workers launched here to synchronize the table are called table sync workers.

  1. The Launcher launches the table sync workers.
  2. The table sync workers establish connections with walsenders.
  3. The walsenders send table data to the table sync workers, and the workers insert the data into the target tables.
  4. After synchronizing the tables, the table sync workers and walsenders are terminated.

The walsender and table sync worker internally use the COPY ... TO STDOUT and COPY ... FROM STDIN protocols to stream and insert table data efficiently. Refer to tablesync.c for more details.

pg_createsubscriber

In version 17, the pg_createsubscriber utility is supported.