12.3. ReorderBuffer Structure

Alpha Version: Work in progress.

A ReorderBuffer area is allocated for each walsender process. Its memory use is capped by the logical_decoding_work_mem configuration parameter (default: 64MB); once the cap is exceeded, some of the decoded changes are written to local disk.
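
The effect of such a memory cap can be sketched as follows. This is a toy model under stated assumptions, not PostgreSQL's implementation: ToyMemTxn and both functions are hypothetical names, "spilling" is simulated by zeroing a counter, and the policy shown (evict the largest transaction until the total fits) is a simplification.

```c
#include <stddef.h>

/* Hypothetical per-transaction accounting; not PostgreSQL's real struct. */
typedef struct ToyMemTxn
{
	size_t		size;		/* bytes of changes currently held in memory */
	int			spilled;	/* how many times this transaction was evicted */
} ToyMemTxn;

/* Sum the in-memory size of all tracked transactions. */
static size_t
toy_total_size(const ToyMemTxn *txns, size_t ntxns)
{
	size_t		total = 0;

	for (size_t i = 0; i < ntxns; i++)
		total += txns[i].size;
	return total;
}

/*
 * While the total exceeds limit_bytes, evict the largest transaction.
 * Returns the number of evictions performed.
 */
static size_t
toy_enforce_limit(ToyMemTxn *txns, size_t ntxns, size_t limit_bytes)
{
	size_t		evictions = 0;

	while (toy_total_size(txns, ntxns) > limit_bytes)
	{
		size_t		largest = 0;

		for (size_t i = 1; i < ntxns; i++)
			if (txns[i].size > txns[largest].size)
				largest = i;

		txns[largest].size = 0;	/* pretend its changes went to disk */
		txns[largest].spilled++;
		evictions++;
	}
	return evictions;
}
```

With three transactions holding 10, 50, and 30 bytes and a limit of 64, a single eviction of the 50-byte transaction brings the total back under the limit.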

The ReorderBuffer is built from three structures, ReorderBuffer, ReorderBufferTXN, and ReorderBufferChange, each described in its own subsection below:

Figure 12.11. ReorderBuffer Structure.

The central element of the ReorderBuffer structure is the by_txn hash table, which uses the transaction ID (txid) as its key. Each entry in this hash table is a ReorderBufferTXN structure, which holds the metadata and the decoded WAL data of one transaction.

Individual data modifications (such as INSERT, UPDATE, and DELETE) are represented by ReorderBufferChange structures. These are appended in LSN order to the changes doubly linked list within the corresponding ReorderBufferTXN.
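
The relationship between by_txn and the per-transaction changes list can be sketched in a few lines of C. This is a toy model rather than PostgreSQL's code: the Toy* types, the fixed-size bucket array standing in for the hash table, and both function names are assumptions made for illustration, and memory cleanup is omitted.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Simplified stand-ins for PostgreSQL's types; not the real definitions. */
typedef uint32_t ToyXid;
typedef uint64_t ToyLsn;

typedef struct ToyChange
{
	ToyLsn		lsn;
	struct ToyChange *prev;
	struct ToyChange *next;
} ToyChange;

typedef struct ToyTxn
{
	ToyXid		xid;
	ToyChange  *head;			/* oldest change */
	ToyChange  *tail;			/* newest change */
	size_t		nchanges;
	struct ToyTxn *next_in_bucket;	/* hash collision chain */
} ToyTxn;

#define TOY_NBUCKETS 64

typedef struct ToyReorderBuffer
{
	ToyTxn	   *by_txn[TOY_NBUCKETS];	/* toy replacement for the hash table */
} ToyReorderBuffer;

/* Find the entry for xid, creating it on first use. Returns NULL on OOM. */
static ToyTxn *
toy_txn_by_xid(ToyReorderBuffer *rb, ToyXid xid)
{
	ToyTxn	  **bucket = &rb->by_txn[xid % TOY_NBUCKETS];
	ToyTxn	   *txn;

	for (txn = *bucket; txn != NULL; txn = txn->next_in_bucket)
		if (txn->xid == xid)
			return txn;

	txn = calloc(1, sizeof(ToyTxn));
	if (txn == NULL)
		return NULL;
	txn->xid = xid;
	txn->next_in_bucket = *bucket;
	*bucket = txn;
	return txn;
}

/* Append one change to the tail of its transaction's list. */
static int
toy_queue_change(ToyReorderBuffer *rb, ToyXid xid, ToyLsn lsn)
{
	ToyTxn	   *txn = toy_txn_by_xid(rb, xid);
	ToyChange  *change;

	if (txn == NULL)
		return -1;
	change = calloc(1, sizeof(ToyChange));
	if (change == NULL)
		return -1;

	/* Records arrive in increasing LSN order, so appending keeps the list sorted. */
	assert(txn->tail == NULL || txn->tail->lsn <= lsn);

	change->lsn = lsn;
	change->prev = txn->tail;
	if (txn->tail != NULL)
		txn->tail->next = change;
	else
		txn->head = change;
	txn->tail = change;
	txn->nchanges++;
	return 0;
}
```

Queuing changes for txid 100 at LSN 10, txid 101 at LSN 20, and txid 100 again at LSN 30 leaves two changes under transaction 100 and one under transaction 101, so interleaved records end up grouped per transaction while each list stays in LSN order.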

12.3.1. ReorderBuffer

This structure maintains the primary context for logical decoding.

| Item | Type | Description |
|------|------|-------------|
| by_txn | HTAB * | A hash table mapping txids to ReorderBufferTXN entries. It functions as an index for rapidly retrieving active transactions. |

12.3.2. ReorderBufferTXN

This structure manages the state of an individual transaction and its associated changes.

| Item | Type | Description |
|------|------|-------------|
| first_lsn | XLogRecPtr | The LSN of the first change record belonging to this transaction. It is used to identify the starting point of the transaction. |
| final_lsn | XLogRecPtr | The LSN of the commit (or abort) record for this transaction. |
| origin_id | RepOriginId | The ID of the replication origin where this transaction was initially created. |
| origin_lsn | XLogRecPtr | The LSN of the commit record on the publisher where this transaction originated. |
| base_snapshot | Snapshot | The historic snapshot used for decoding the transaction. It ensures correct visibility during catalog scans by identifying which data was visible at the start of the transaction. |
| changes | dlist_head | A doubly linked list of ReorderBufferChange structures, storing individual data change records in LSN order. See the following subsection. |

Note: While base_snapshot is essential for determining visibility immediately after slot creation or during catalog changes, it is omitted from subsequent discussions to focus on the steady-state data flow.
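
As a rough illustration of how first_lsn and final_lsn bracket a transaction in the WAL stream, consider the sketch below. ToyTxnSpan, the helper functions, and the use of 0 as the "not yet set" LSN are assumptions made for this example; the real structure carries many more fields.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t ToyLsn;

/* Assumption for this sketch: 0 means "no LSN recorded yet". */
#define TOY_INVALID_LSN ((ToyLsn) 0)

typedef struct ToyTxnSpan
{
	ToyLsn		first_lsn;		/* first record seen for the transaction */
	ToyLsn		final_lsn;		/* commit or abort record */
} ToyTxnSpan;

static void
toy_span_init(ToyTxnSpan *txn)
{
	txn->first_lsn = TOY_INVALID_LSN;
	txn->final_lsn = TOY_INVALID_LSN;
}

/* Called for every record of the transaction; only the first one sticks. */
static void
toy_span_note_change(ToyTxnSpan *txn, ToyLsn lsn)
{
	if (txn->first_lsn == TOY_INVALID_LSN)
		txn->first_lsn = lsn;
}

/* Called once, when the commit or abort record is decoded. */
static void
toy_span_finish(ToyTxnSpan *txn, ToyLsn end_record_lsn)
{
	txn->final_lsn = end_record_lsn;
}

static bool
toy_span_is_finished(const ToyTxnSpan *txn)
{
	return txn->final_lsn != TOY_INVALID_LSN;
}

/* True if lsn lies between the first record and the commit/abort record. */
static bool
toy_span_contains(const ToyTxnSpan *txn, ToyLsn lsn)
{
	return toy_span_is_finished(txn) &&
		txn->first_lsn != TOY_INVALID_LSN &&
		txn->first_lsn <= lsn && lsn <= txn->final_lsn;
}
```

In this model a transaction whose records appear at LSNs 0x100 and 0x180 and whose commit record is at 0x200 ends up with first_lsn = 0x100 and final_lsn = 0x200.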

12.3.3. ReorderBufferChange

This structure represents an individual data modification. Here, only the main items are listed.

| Item | Type | Description |
|------|------|-------------|
| lsn | XLogRecPtr | The LSN of the WAL record that generated this specific change. |
| action | ReorderBufferChangeType | The type of change operation (e.g., INSERT, UPDATE, DELETE, or TRUNCATE). |
| data | union | A union containing action-specific data, such as the tp (tuple) or truncate structures. |
| data.tp.rlocator | RelFileLocator | Identifies the physical relation (table) affected by the change. This is a triplet consisting of spcOid (Tablespace), dbOid (Database), and relNumber (RelFilenode number). |
| data.tp.oldtuple | HeapTuple | The “before” version of the tuple. This is populated for UPDATE or DELETE operations if required by the Replica Identity configuration (see Section 12.4.3). |
| data.tp.newtuple | HeapTuple | The “after” version of the tuple, containing the new data for INSERT or UPDATE operations. |

The corresponding definitions from reorderbuffer.h:

```c
/*
 * Types of the change passed to a 'change' callback.
 *
 * For efficiency and simplicity reasons we want to keep Snapshots, CommandIds
 * and ComboCids in the same list with the user visible INSERT/UPDATE/DELETE
 * changes. Users of the decoding facilities will never see changes with
 * *_INTERNAL_* actions.
 *
 * The INTERNAL_SPEC_INSERT and INTERNAL_SPEC_CONFIRM, and INTERNAL_SPEC_ABORT
 * changes concern "speculative insertions", their confirmation, and abort
 * respectively.  They're used by INSERT .. ON CONFLICT .. UPDATE.  Users of
 * logical decoding don't have to care about these.
 */
typedef enum ReorderBufferChangeType
{
	REORDER_BUFFER_CHANGE_INSERT,
	REORDER_BUFFER_CHANGE_UPDATE,
	REORDER_BUFFER_CHANGE_DELETE,
	REORDER_BUFFER_CHANGE_MESSAGE,
	REORDER_BUFFER_CHANGE_INVALIDATION,
	REORDER_BUFFER_CHANGE_INTERNAL_SNAPSHOT,
	REORDER_BUFFER_CHANGE_INTERNAL_COMMAND_ID,
	REORDER_BUFFER_CHANGE_INTERNAL_TUPLECID,
	REORDER_BUFFER_CHANGE_INTERNAL_SPEC_INSERT,
	REORDER_BUFFER_CHANGE_INTERNAL_SPEC_CONFIRM,
	REORDER_BUFFER_CHANGE_INTERNAL_SPEC_ABORT,
	REORDER_BUFFER_CHANGE_TRUNCATE,
} ReorderBufferChangeType;

/* forward declaration */
struct ReorderBufferTXN;

/*
 * a single 'change', can be an insert (with one tuple), an update (old, new),
 * or a delete (old).
 *
 * The same struct is also used internally for other purposes but that should
 * never be visible outside reorderbuffer.c.
 */
typedef struct ReorderBufferChange
{
	XLogRecPtr	lsn;

	/* The type of change. */
	ReorderBufferChangeType action;

	/* Transaction this change belongs to. */
	struct ReorderBufferTXN *txn;

	RepOriginId origin_id;

	/*
	 * Context data for the change. Which part of the union is valid depends
	 * on action.
	 */
	union
	{
		/* Old, new tuples when action == *_INSERT|UPDATE|DELETE */
		struct
		{
			/* relation that has been changed */
			RelFileLocator rlocator;

			/* no previously reassembled toast chunks are necessary anymore */
			bool		clear_toast_afterwards;

			/* valid for DELETE || UPDATE */
			HeapTuple	oldtuple;
			/* valid for INSERT || UPDATE */
			HeapTuple	newtuple;
		}			tp;

		/*
		 * Truncate data for REORDER_BUFFER_CHANGE_TRUNCATE representing one
		 * set of relations to be truncated.
		 */
		struct
		{
			Size		nrelids;
			bool		cascade;
			bool		restart_seqs;
			Oid		   *relids;
		}			truncate;

		/* Message with arbitrary data. */
		struct
		{
			char	   *prefix;
			Size		message_size;
			char	   *message;
		}			msg;

		/* New snapshot, set when action == *_INTERNAL_SNAPSHOT */
		Snapshot	snapshot;

		/*
		 * New command id for existing snapshot in a catalog changing tx. Set
		 * when action == *_INTERNAL_COMMAND_ID.
		 */
		CommandId	command_id;

		/*
		 * New cid mapping for catalog changing transaction, set when action
		 * == *_INTERNAL_TUPLECID.
		 */
		struct
		{
			RelFileLocator locator;
			ItemPointerData tid;
			CommandId	cmin;
			CommandId	cmax;
			CommandId	combocid;
		}			tuplecid;

		/* Invalidation. */
		struct
		{
			uint32		ninvalidations; /* Number of messages */
			SharedInvalidationMessage *invalidations;	/* invalidation message */
		}			inval;
	}			data;

	/*
	 * While in use this is how a change is linked into a transactions,
	 * otherwise it's the preallocated list.
	 */
	dlist_node	node;
} ReorderBufferChange;
```
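
Because data is a union, a consumer has to check action before touching any member. The following sketch shows that pattern on a cut-down stand-in; ToyChangeRec, ToyTuple, ToyStats, and toy_account are hypothetical names, and only the tp and truncate members are modelled.

```c
#include <stddef.h>

/* Cut-down stand-ins for illustration; not the PostgreSQL definitions. */
typedef enum ToyAction
{
	TOY_INSERT,
	TOY_UPDATE,
	TOY_DELETE,
	TOY_TRUNCATE
} ToyAction;

typedef struct ToyTuple
{
	int			id;
} ToyTuple;

typedef struct ToyChangeRec
{
	ToyAction	action;
	union
	{
		struct
		{
			const ToyTuple *oldtuple;	/* UPDATE or DELETE; may be NULL */
			const ToyTuple *newtuple;	/* INSERT or UPDATE */
		}			tp;
		struct
		{
			size_t		nrelids;
		}			truncate;
	}			data;
} ToyChangeRec;

typedef struct ToyStats
{
	int			new_images;		/* "after" tuples seen */
	int			old_images;		/* "before" tuples seen */
	size_t		truncated_rels;	/* relations named in TRUNCATE changes */
} ToyStats;

/* Read only the union member that matches the change's action. */
static void
toy_account(const ToyChangeRec *change, ToyStats *stats)
{
	switch (change->action)
	{
		case TOY_INSERT:
			/* Only data.tp.newtuple is meaningful here. */
			if (change->data.tp.newtuple != NULL)
				stats->new_images++;
			break;
		case TOY_UPDATE:
			/* The old image is optional; the new image is expected. */
			if (change->data.tp.oldtuple != NULL)
				stats->old_images++;
			if (change->data.tp.newtuple != NULL)
				stats->new_images++;
			break;
		case TOY_DELETE:
			if (change->data.tp.oldtuple != NULL)
				stats->old_images++;
			break;
		case TOY_TRUNCATE:
			/* data.tp must not be read for this action. */
			stats->truncated_rels += change->data.truncate.nrelids;
			break;
	}
}
```

The NULL checks on oldtuple reflect the table above: the "before" image is present only when the Replica Identity configuration requires it.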