5. Concurrency Control

Concurrency Control is a mechanism that maintains atomicity and isolation, which are two properties of the ACID, when multiple transactions run concurrently in a database.

There are three broad concurrency control techniques: Multi-version Concurrency Control (MVCC), Strict Two-Phase Locking (S2PL), and Optimistic Concurrency Control (OCC). Each technique has many variations.

In MVCC, each write operation creates a new version of a data item while retaining the old version. When a transaction reads a data item, the system selects one of the versions to ensure isolation of the individual transaction. The main advantage of MVCC is that ‘readers don’t block writers, and writers don’t block readers’, in contrast, for example, an S2PL-based system must block readers when a writer writes an item because the writer acquires an exclusive lock for the item. PostgreSQL and some RDBMSs use a variation of MVCC called Snapshot Isolation (SI).

To implement SI, some RDBMSs, such as Oracle, use rollback segments. When writing a new data item, the old version of the item is written to the rollback segment, and subsequently the new item is overwritten to the data area. PostgreSQL uses a simpler method. A new data item is inserted directly into the relevant table page. When reading items, PostgreSQL selects the appropriate version of an item in response to an individual transaction by applying visibility check rules.

SI does not allow the three anomalies defined in the ANSI SQL-92 standard: Dirty Reads, Non-Repeatable Reads, and Phantom Reads. However, SI cannot achieve true serializability because it allows serialization anomalies, such as Write Skew and Read-only Transaction Skew. Note that the ANSI SQL-92 standard based on the classical serializability definition is not equivalent to the definition in modern theory.

To deal with this issue, Serializable Snapshot Isolation (SSI) has been added as of version 9.1. SSI can detect the serialization anomalies and can resolve the conflicts caused by such anomalies. Thus, PostgreSQL versions 9.1 or later provide a true SERIALIZABLE isolation level. (In addition, SQL Server also uses SSI; Oracle still uses only SI.)

This chapter comprises the following four parts:

  • Part 1: Sections 5.1. — 5.3.
    This part provides basic information required for understanding the subsequent parts.
    Sections 5.1 and 5.2 describe transaction ids and tuple structure, respectively. Section 5.3 exhibits how tuples are inserted, deleted, and updated.

  • Part 2: Sections 5.4. — 5.6.
    This part illustrates the key features required for implementing the concurrency control mechanism.
    Sections 5.4, 5.5, and 5.6 describe the commit log (clog), which holds all transaction states, transaction snapshots, and the visibility check rules, respectively.

  • Part 3: Sections 5.7. — 5.9.
    This part describes the concurrency control in PostgreSQL using specific examples.
    Section 5.7 describes the visibility check. This section also shows how the three anomalies defined in the ANSI SQL standard are prevented. Section 5.8 describes preventing Lost Updates, and Section 5.9 briefly describes SSI.

  • Part 4: Section 5.10.
    This part describes several maintenance process required to permanently running the concurrency control mechanism. The maintenance processes are performed by vacuum processing, which is described in Chapter 6.

This chapter focuses on the topics that are unique to PostgreSQL, although there are many concurrency control-related topics. Note that descriptions of deadlock prevention and lock modes are omitted. (For more information, refer to the official documentation.)

Transaction Isolation Level in PostgreSQL

PostgreSQL-implemented transaction isolation levels are described in the following table:

Isolation Level Dirty Reads Non-repeatable Read Phantom Read Serialization Anomaly
READ COMMITTED Not possible Possible Possible Possible
REPEATABLE READ*1 Not possible Not possible Not possible in PG; See Section 5.7.2.
(Possible in ANSI SQL)
Possible
SERIALIZABLE Not possible Not possible Not possible Not possible

*1 : In versions 9.0 and earlier, this level had been used as ‘SERIALIZABLE’ because it does not allow the three anomalies defined in the ANSI SQL-92 standard. However, with the implementation of SSI in version 9.1, this level has changed to ‘REPEATABLE READ’ and a true SERIALIZABLE level was introduced.

Note

PostgreSQL uses SSI for DML (Data Manipulation Language, e.g, SELECT, UPDATE, INSERT, DELETE), and 2PL for DDL (Data Definition Language, e.g., CREATE TABLE, etc).