6.6. Full VACUUM

Although Concurrent VACUUM is essential for operation, it is not sufficient. For example, it cannot reduce the size of a table even if many dead tuples are removed.

Figure 6.8 shows an extreme example. Suppose that a table consists of three pages, and each page contains six tuples. The following DELETE command is executed to remove tuples, and the VACUUM command is executed to remove dead tuples:

Fig. 6.8. An example showing the disadvantages of (concurrent) VACUUM.
testdb=# DELETE FROM tbl WHERE id % 6 != 0;
testdb=# VACUUM tbl;

The dead tuples are removed; but the table size is not reduced. This is both a waste of disk space and has a negative impact on database performance. For instance, in the above example, when three tuples in the table are read, three pages must be loaded from disk.

To deal with this situation, PostgreSQL provides the Full VACUUM mode. Figure 6.9 shows an outline of this mode.

Fig. 6.9. Outline of Full VACUUM mode.
  • [1] Create new table file: Fig. 6.9(1)
    When the VACUUM FULL command is executed for a table, PostgreSQL first acquires the AccessExclusiveLock lock for the table and creates a new table file whose size is 8 KB. The AccessExclusiveLock lock prevents other users from accessing the table.

  • [2] Copy live tuples to the new table: Fig. 6.9(2)
    PostgreSQL copies only live tuples within the old table file to the new table.

  • [3] Remove the old file, rebuild indexes, and update the statistics, FSM, and VM: Fig. 6.9(3)
    After copying all live tuples, PostgreSQL removes the old file, rebuilds all associated table indexes, updates both the FSM and VM of this table, and updates associated statistics and system catalogs.

The pseudocode of the Full VACUUM is shown in below:

Pseudocode: Full VACUUM
(1)  FOR each table
(2)     Acquire AccessExclusiveLock lock for the table
(3)     Create a new table file
(4)     FOR each live tuple in the old table
(5)        Copy the live tuple to the new table file
(6)        Freeze the tuple IF necessary
        END FOR
(7)     Remove the old table file
(8)     Rebuild all indexes
(9)     Update FSM and VM
(10)    Update statistics
        Release AccessExclusiveLock lock
     END FOR
(11) Remove unnecessary clog files and pages if possible

Two points should be considered when using the VACUUM FULL command:

  1. Nobody can access(read/write) the table when Full VACUUM is processing.

  2. At most twice the disk space of the table is used temporarily; therefore, it is necessary to check the remaining disk capacity when a huge table is processed.

When should I do VACUUM FULL?

There is unfortunately no one-size-fits-all answer to the question of when to execute VACUUM FULL. However, the extension pg_freespacemap can give you some good suggestions.

The following query shows the average freespace ratio of the table you want to know.

testdb=# CREATE EXTENSION pg_freespacemap;
CREATE EXTENSION

testdb=# SELECT count(*) as "number of pages",
       pg_size_pretty(cast(avg(avail) as bigint)) as "Av. freespace size",
       round(100 * avg(avail)/8192 ,2) as "Av. freespace ratio"
       FROM pg_freespace('accounts');
 number of pages | Av. freespace size | Av. freespace ratio 
-----------------+--------------------+---------------------
            1640 | 99 bytes           |                1.21
(1 row)

As the result above, You can find that there are few free spaces (1.21 % free space).

If you delete almost tuples and execute VACUUM command, you can find that almost pages are empty (86.97 % free space), but the number of pages remains the same. In other words, the table file has not been compacted.

testdb=# DELETE FROM accounts WHERE aid %10 != 0 OR aid < 100;
DELETE 90009

testdb=# VACUUM accounts;
VACUUM

testdb=# SELECT count(*) as "number of pages",
       pg_size_pretty(cast(avg(avail) as bigint)) as "Av. freespace size",
       round(100 * avg(avail)/8192 ,2) as "Av. freespace ratio"
       FROM pg_freespace('accounts');
 number of pages | Av. freespace size | Av. freespace ratio 
-----------------+--------------------+---------------------
            1640 | 7124 bytes         |               86.97
(1 row)

The following query inspects the freespace ratio of each page of the specified table.

testdb=# SELECT *, round(100 * avail/8192 ,2) as "freespace ratio"
                FROM pg_freespace('accounts');
 blkno | avail | freespace ratio 
-------+-------+-----------------
     0 |  7904 |           96.00
     1 |  7520 |           91.00
     2 |  7136 |           87.00
     3 |  7136 |           87.00
     4 |  7136 |           87.00
     5 |  7136 |           87.00
....

If you run VACUUM FULL in this situation, you will see that the table file has been compacted.

testdb=# VACUUM FULL accounts;
VACUUM
testdb=# SELECT count(*) as "number of blocks",
       pg_size_pretty(cast(avg(avail) as bigint)) as "Av. freespace size",
       round(100 * avg(avail)/8192 ,2) as "Av. freespace ratio"
       FROM pg_freespace('accounts');
 number of pages | Av. freespace size | Av. freespace ratio 
-----------------+--------------------+---------------------
             164 | 0 bytes            |                0.00
(1 row)