7.1. Heap Only Tuple (HOT)
PostgreSQL version 8.3 implemented HOT to effectively use both index and table pages when an updated row is stored in the same page as the old row. HOT also reduces the need for VACUUM processing.
Since the README.HOT file in the source code directory describes HOT in detail, this chapter only provides a brief introduction.
Section 7.1.1 describes how PostgreSQL updates a row without HOT to clarify the issues that HOT resolves. Section 7.1.2 then describes how HOT works.
7.1.1. Update a Row Without HOT
Assume the table ’tbl’ has two columns: ‘id’ (primary key) and ‘data’.
testdb=# \d tbl
Table "public.tbl"
Column | Type | Collation | Nullable | Default
--------+---------+-----------+----------+---------
id | integer | | not null |
data | text | | |
Indexes:
"tbl_pkey" PRIMARY KEY, btree (id)Table ’tbl’ contains 1,000 tuples. The last tuple (id = 1000) is stored in the 5th page of the table. The corresponding index tuple (key = 1000) points to this tuple with TID ‘(5, 1)’. See Figure 7.1(a).
Figure 7.1. Update a row without HOT
Consider how the last tuple is updated without HOT.
testdb=# UPDATE tbl SET data = 'B' WHERE id = 1000;In this case, PostgreSQL inserts both the new table tuple and a new index tuple. See Figure 7.1(b).
The insertion of index tuples consumes index page space. Furthermore, both the insertion and vacuuming costs of index tuples are high. HOT reduces the impact of these issues.
7.1.2. How HOT Performs
When a row is updated with HOT, PostgreSQL does not insert a new index tuple if the updated row is stored in the same table page as the old row. Instead, it sets the HEAP_HOT_UPDATED bit in the t_informask2 field of the old tuple and the HEAP_ONLY_TUPLE bit in the new tuple. See Figures 7.2 and 7.3.
Figure 7.2. Update a row with HOT
For example, in this case, Tuple_1 is marked with HEAP_HOT_UPDATED, and Tuple_2 is marked with HEAP_ONLY_TUPLE.
In addition, PostgreSQL uses these bits regardless of whether the pruning and defragmentation processes (described below) are executed.
Figure 7.3. HEAP_HOT_UPDATED and HEAP_ONLY_TUPLE bits
The following describes how PostgreSQL accesses updated tuples via an index scan immediately after a HOT update. See Figure 7.4(a).
Figure 7.4. Pruning of the line pointers
- Find the index tuple that points to the target tuple.
- Access line pointer [1] pointed to by the index tuple.
- Read Tuple_1.
- Read Tuple_2 via the t_ctid of Tuple_1.
PostgreSQL reads both tuples and decides which is visible using the concurrency control mechanism described in Chapter 5.
However, a problem arises if dead tuples are removed from table pages. In Figure 7.4(a), if Tuple_1 is removed as a dead tuple, the index cannot access Tuple_2.
To resolve this, PostgreSQL redirects the line pointer of the old tuple to the line pointer of the new tuple at an appropriate time. This process is called pruning. Figure 7.4(b) depicts how PostgreSQL accesses tuples after pruning.
- Find the index tuple.
- Access line pointer [1] pointed to by the index tuple.
- Access line pointer [2] via the redirected line pointer.
- Read Tuple_2 pointed to by line pointer [2].
Pruning is executed, if possible, during SQL commands such as SELECT, UPDATE, INSERT, and DELETE. This documantation does not describe the exact execution timing because it is complicated; details are in the README.HOT file.
PostgreSQL also removes dead tuples at an appropriate time, similar to the pruning process. This process is called defragmentation. Figure 7.5 depicts defragmentation by HOT.
Figure 7.5. Defragmentation of the dead tuples
The cost of defragmentation is lower than normal VACUUM processing because it does not involve removing index tuples.
Thus, HOT reduces the page consumption of both indexes and tables. It also reduces the number of tuples that VACUUM must process. Consequently, HOT improves performance by reducing index tuple insertions and the necessity of VACUUM processing.
Cases where HOT is not available include:
- When the updated tuple is stored in a different page from the old tuple, PostgreSQL must insert a new index tuple. See Figure 7.6(a).
- When the key value of the index tuple is updated, PostgreSQL must insert a new index tuple. See Figure 7.6(b).