Transaction logs are an essential part of databases because they ensure that no data is lost even when a system failure, such as a power failure or a server crash, occurs. A transaction log is a history of all changes and actions in a database system. It contains sufficient information about each transaction that has already been executed, so the database server can recover the database cluster by replaying the changes and actions recorded in the transaction log after a server crash.
In the field of computer science, WAL is an acronym for Write-Ahead Logging, a protocol (or rule) that requires changes and actions to be written to a transaction log before they are applied to the data files. In PostgreSQL, however, WAL is also an acronym for Write Ahead Log. There, the term WAL is used interchangeably with transaction log, and it also refers to the implemented mechanism for writing actions to the transaction log (WAL). Although this can be confusing, this document adopts the PostgreSQL definition.
The WAL mechanism was first implemented in version 7.1 to mitigate the impact of server crashes. It also made possible the implementation of Point-in-Time Recovery (PITR) and Streaming Replication (SR), which are described in Chapter 10 and Chapter 11, respectively.
Although understanding the WAL mechanism is essential for system integration and administration with PostgreSQL, the mechanism is too complex to summarize briefly. Therefore, this chapter provides a complete explanation of WAL in PostgreSQL.
Let's take a look at the overview of the WAL mechanism. To clarify the issue that WAL solves, the first subsection shows what happens when a crash occurs if PostgreSQL does not implement WAL. The second subsection introduces some key concepts and shows an overview of the main subjects in this chapter: the writing of WAL data and the database recovery process. The final subsection completes the overview of WAL by adding one more key concept.
To simplify the description in this section, the table TABLE_A, which consists of just one page, is used.
As described in Chapter 8, every DBMS implements a shared buffer pool to provide efficient access to the relation's pages.
Assume that we insert some data tuples into TABLE_A on a version of PostgreSQL that does not implement the WAL feature. This situation is illustrated in Fig. 9.1.
Fig. 9.1. Insertion operations without WAL.

Therefore, a database without WAL is vulnerable to system failures.
Before WAL was introduced (versions 7.0 or earlier), PostgreSQL wrote changed pages to disk synchronously, issuing a sync system call whenever a page was changed in memory, in order to ensure durability. This made modification commands such as INSERT and UPDATE perform very poorly.
To deal with the system failures mentioned above without compromising performance, PostgreSQL supports WAL. In this subsection, some keywords and key concepts are described, followed by the writing of WAL data and the recovery of the database.
PostgreSQL writes all modifications as history data into persistent storage to prepare for failures. In PostgreSQL, this history data is known as XLOG record(s) or WAL data.
XLOG records are written into the in-memory WAL buffer by change operations such as insertion, deletion, or commit action. They are immediately written into a WAL segment file on the storage when a transaction commits or aborts. (To be precise, the writing of XLOG records may occur in other cases. The details will be described in Section 9.5.) The LSN (Log Sequence Number) of an XLOG record represents the location where its record is written on the transaction log. The LSN of a record is used as the unique id of the XLOG record.
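As a concrete illustration (an addition to the original text; the function names below are those of version 10 or later, the 9.6-or-earlier equivalents being pg_current_xlog_location and pg_xlogfile_name), the current write LSN and the WAL segment file that contains it can be inspected from a session:

testdb=# SELECT pg_current_wal_lsn(), pg_walfile_name(pg_current_wal_lsn());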
When considering how a database system recovers, one question that may arise is: from what point does PostgreSQL start to recover? The answer is the REDO point, that is, the location of the XLOG record written at the moment the latest checkpoint was started. (Checkpoints in PostgreSQL are described in Section 9.7.) In fact, the database recovery process is closely linked to the checkpoint process; the two are inseparable.
The WAL and checkpoint process were implemented at the same time in version 7.1.
With the major keywords and concepts introduced, the following describes tuple insertion with WAL. See Fig. 9.2 and the following description. (Also refer to this slide.)
Fig. 9.2. Insertion operations with WAL.

'TABLE_A's LSN' shows the value of pd_lsn within the page header of TABLE_A; 'page's LSN' is used in the same manner.
The following instructions show how to recover our database cluster back to the state immediately before the crash. There is no need to do anything special, since PostgreSQL automatically enters recovery mode when it is restarted. See Fig. 9.3 (and this slide). PostgreSQL sequentially reads and replays the XLOG records within the appropriate WAL segment files, starting from the REDO point.
Fig. 9.3. Database recovery using WAL.

PostgreSQL can recover itself in this way by replaying the XLOG records written in the WAL segment files in chronological order. Thus, PostgreSQL's XLOG records are a REDO log.
PostgreSQL does not support UNDO log.
Although writing XLOG records has a certain cost, it is small compared to writing entire modified pages, and the benefit gained, namely tolerance of system failures, outweighs that cost.
Suppose that TABLE_A's page data on the storage is corrupted because the operating system failed while the background writer process was writing the dirty page. As XLOG records cannot be replayed on a corrupted page, an additional feature is needed.
PostgreSQL supports a feature called full-page writes to deal with such failures. If it is enabled, PostgreSQL writes a pair of the header data and the entire page as an XLOG record during the first change of each page after every checkpoint. (This is the default setting.) In PostgreSQL, such an XLOG record containing the entire page is called a backup block (or full-page image).
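The current setting can be confirmed with SHOW; a trivial example (full_page_writes is on by default):

testdb=# SHOW full_page_writes;
 full_page_writes 
------------------
 on
(1 row)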
Let's describe the insertion of tuples again, but with full-page writes enabled. See Fig. 9.4 and the following description.
Fig. 9.4. Full page writes.

Restart the PostgreSQL server to repair the broken cluster. See Fig. 9.5 and the following description.
Fig. 9.5. Database recovery with backup block.

In this way, PostgreSQL can recover the database even if some data write errors occur due to a process or operating system crash.
As mentioned above, WAL can prevent data loss due to process or operating system crashes. However, if a file system or media failure occurs, the data will be lost. To deal with such failures, PostgreSQL provides online backup and replication features.
If online backups are taken regularly, the database can be restored from the most recent backup, even if a media failure occurs. However, it is important to note that the changes made after taking the last backup cannot be restored.
The synchronous replication feature can store all changes to another storage or host in real time. This means that if a media failure occurs on the primary server, the data can be restored from the secondary server.
Logically, PostgreSQL writes XLOG records into a transaction log that is a virtual file with an 8-byte address space, i.e. 16 exabytes in length.
An 8-byte address space gives the transaction log an effectively unlimited capacity, but it is, of course, impossible to handle a single file of 16 exabytes. Therefore, the transaction log in PostgreSQL is divided into files of 16 megabytes (by default), each of which is known as a WAL segment. See Fig. 9.6.
In versions 11 or later, the size of a WAL segment file can be configured using the --wal-segsize option when the PostgreSQL cluster is created with the initdb command.
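As a quick check from SQL (a small example; in versions 11 or later, wal_segment_size is reported as a read-only preset parameter):

testdb=# SHOW wal_segment_size;   -- 16MB unless a different size was given to initdb --wal-segsize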
The WAL segment file name is a 24-digit hexadecimal number, and the naming rule is as follows:
\begin{align} \verb|WAL segment file name| = \verb|timelineId| + (\verb|uint32|) \frac{\verb|LSN|-1}{16\verb|M|*256} + (\verb|uint32|)\left(\frac{\verb|LSN|-1}{16\verb|M|}\right) \% 256 \end{align}

PostgreSQL's WAL contains the concept of timelineId (a 4-byte unsigned integer), which is used for Point-in-Time Recovery (PITR), described in Chapter 10. However, the timelineId is fixed at 0x00000001 in this chapter because this concept is not required in the following descriptions.
The first WAL segment file is 000000010000000000000001. When the first one has been filled up with XLOG records, the second one, 000000010000000000000002, is provided. Files are used in ascending order in succession. After 0000000100000000000000FF has been filled up, the next one, 000000010000000100000000, is provided. In this way, whenever the last two digits carry over, the middle 8-digit number increases by one.
Similarly, after 0000000100000001000000FF has been filled up, 000000010000000200000000 will be provided, and so on.
Using the built-in function pg_xlogfile_name (versions 9.6 or earlier) or pg_walfile_name (versions 10 or later), we can find the name of the WAL segment file that contains the specified LSN. An example is shown below:
testdb=# SELECT pg_xlogfile_name('1/00002D3E');  -- In versions 10 or later, SELECT pg_walfile_name('1/00002D3E');
     pg_xlogfile_name     
--------------------------
 000000010000000100000000
(1 row)
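To relate this result to the naming formula above, the arithmetic can be reproduced by hand; a hedged sketch in SQL (4294978878 is the decimal value of LSN '1/00002D3E', i.e. 0x100002D3E, and 16777216 is 16M):

SELECT lpad(to_hex((4294978878 - 1) / (16777216::bigint * 256)), 8, '0') AS middle_8_digits,
       lpad(to_hex(((4294978878 - 1) / 16777216) % 256), 8, '0')         AS last_8_digits;
-- middle_8_digits is 00000001 and last_8_digits is 00000000, so with timelineId 00000001
-- the file name is 000000010000000100000000, matching the output above.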
A WAL segment is a 16 MB file by default, and it is internally divided into pages of 8192 bytes (8 KB). The first page has a header defined by the XLogLongPageHeaderData structure, while the headers of all other pages contain the page information defined by the XLogPageHeaderData structure. Following the page header, XLOG records are written into each page, one after another, from the beginning of the page. See Fig. 9.7.
Fig. 9.7. Internal layout of a WAL segment file.

The XLogLongPageHeaderData structure and the XLogPageHeaderData structure are defined in src/include/access/xlog_internal.h. The explanation of both structures is omitted because they are not required in the following descriptions.
An XLOG record comprises a general header portion and an associated data portion. The first subsection describes the header structure. The remaining two subsections explain the structure of the data portion in versions 9.4 and earlier, and in versions 9.5 and later, respectively. (The data format changed in version 9.5.)
All XLOG records have a general header portion defined by the XLogRecord structure. The structure for versions 9.4 and earlier is shown below; it was changed in version 9.5.
typedef struct XLogRecord
{
    uint32          xl_tot_len;   /* total len of entire record */
    TransactionId   xl_xid;       /* xact id */
    uint32          xl_len;       /* total len of rmgr data. This variable was removed in ver. 9.5. */
    uint8           xl_info;      /* flag bits, see below */
    RmgrId          xl_rmid;      /* resource manager for this record */
    /* 2 bytes of padding here, initialize to zero */
    XLogRecPtr      xl_prev;      /* ptr to previous record in log */
    pg_crc32        xl_crc;       /* CRC for this record */
} XLogRecord;
In versions 9.5 or later, one variable (xl_len) has been removed from the XLogRecord structure to refine the XLOG record format, which reduced its size by a few bytes.
Apart from two variables, xl_rmid and xl_info, the remaining variables are self-explanatory and do not need further description.
Both xl_rmid and xl_info are variables related to resource managers, which are collections of operations associated with the WAL feature, such as writing and replaying of XLOG records. The number of resource managers tends to increase with each PostgreSQL version. Version 10 contains the following:
Operation | Resource manager |
---|---|
Heap tuple operations | RM_HEAP, RM_HEAP2 |
Index operations | RM_BTREE, RM_HASH, RM_GIN, RM_GIST, RM_SPGIST, RM_BRIN |
Sequence operations | RM_SEQ |
Transaction operations | RM_XACT, RM_MULTIXACT, RM_CLOG, RM_XLOG, RM_COMMIT_TS |
Tablespace operations | RM_SMGR, RM_DBASE, RM_TBLSPC, RM_RELMAP |
Replication and hot standby operations | RM_STANDBY, RM_REPLORIGIN, RM_GENERIC_ID, RM_LOGICALMSG_ID |
Here are some representative examples of how resource managers work:
The XLogRecord structure in versions 9.4 or earlier is defined in src/include/access/xlog.h, while that of versions 9.5 or later is defined in src/include/access/xlogrecord.h.
The functions heap_xlog_insert and heap_xlog_update are defined in src/backend/access/heap/heapam.c, while the function xact_redo_commit is defined in src/backend/access/transam/xact.c.
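As a side note (an addition to the original description), in version 15 or later the pg_walinspect extension makes it possible to see from SQL which resource manager produced each XLOG record. The following is a rough sketch assuming a superuser session, psql, and the example table tbl used later in this chapter; column names are those documented for version 15:

testdb=# CREATE EXTENSION pg_walinspect;
testdb=# SELECT pg_current_wal_lsn() AS start_lsn \gset
testdb=# INSERT INTO tbl VALUES ('A');
testdb=# SELECT resource_manager, record_type, record_length
testdb-#   FROM pg_get_wal_records_info(:'start_lsn', pg_current_wal_lsn());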
The data portion of an XLOG record can be classified into either a backup block (which contains the entire page) or a non-backup block (which contains different data depending on the operation).
Fig. 9.8. Examples of XLOG records (versions 9.4 or earlier).

The internal layouts of XLOG records are described below, using some specific examples.
A backup block is shown in Fig. 9.8(a). It is composed of two data structures and one data object:
The BkpBlock structure contains the variables that identify the page in the database cluster (i.e., the relfilenode and fork number of the relation that contains the page, and the page's block number), as well as the starting position and length of the page's free space.
In non-backup blocks, the layout of the data portion differs depending on the operation. Here, the XLOG record for an INSERT statement is explained as a representative example. See Fig. 9.8(b). In this case, the XLOG record for the INSERT statement is composed of two data structures and one data object:
The xl_heap_insert structure contains the variables that identify the inserted tuple in the database cluster (i.e., the relfilenode of the table that contains this tuple, and the tuple's tid), as well as a visibility flag of this tuple.
The reason a few bytes are removed from the inserted tuple is described in the source-code comment of the xl_heap_header structure:
We don't store the whole fixed part (HeapTupleHeaderData) of an inserted or updated tuple in WAL; we can save a few bytes by reconstructing the fields that are available elsewhere in the WAL record, or perhaps just plain needn't be reconstructed.
One more example will be shown here. See Fig. 9.8(c). The XLOG record for a checkpoint record is quite simple; it is composed of two data structures:
The xl_heap_header structure is defined in src/include/access/htup.h while the CheckPoint structure is defined in src/include/catalog/pg_control.h.
In versions 9.4 or earlier, there was no common format for XLOG records, so each resource manager had to define its own format. This made it increasingly difficult to maintain the source code and implement new features related to WAL. To address this issue, a common structured format that is independent of resource managers was introduced in version 9.5.
The data portion of an XLOG record can be divided into two parts: header and data. See Fig. 9.9.
Fig. 9.9. Common XLOG record format.

The header part contains zero or more XLogRecordBlockHeaders and zero or one XLogRecordDataHeaderShort (or XLogRecordDataHeaderLong). It must contain at least one of these. When the record stores a full-page image (i.e., a backup block), the XLogRecordBlockHeader includes the XLogRecordBlockImageHeader, and also includes the XLogRecordBlockCompressHeader if its block is compressed.
The data part is composed of zero or more block data and zero or one main data, which correspond to the XLogRecordBlockHeader(s) and to the XLogRecordDataHeader, respectively.
In versions 9.5 or later, full-page images within XLOG records can be compressed using the LZ compression method by setting the parameter wal_compression = on. In that case, the XLogRecordBlockCompressHeader structure will be added.
This feature has two advantages and one disadvantage. The advantages are reducing the I/O cost of writing records and suppressing the consumption of WAL segment files. The disadvantage is the additional CPU cost of compression.
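A hedged example of turning the feature on at run time (superuser required; in version 14 or later the values lz4 or zstd can also be given if the server was built with the corresponding libraries):

ALTER SYSTEM SET wal_compression = on;   -- written to postgresql.auto.conf
SELECT pg_reload_conf();                 -- wal_compression is changeable with a reload
SHOW wal_compression;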
Some specific examples are shown below, as in the previous subsection.
The backup block created by an INSERT statement is shown in Fig. 9.10(a). It is composed of four data structures and one data object:
The XLogRecordBlockHeader structure contains the variables to identify the block in the database cluster (the relfilenode, the fork number, and the block number). The XLogRecordBlockImageHeader structure contains the length of this block and the offset number. (These two header structures together can store the same data as the BkpBlock structure used until version 9.4.)
The XLogRecordDataHeaderShort structure stores the length of the xl_heap_insert structure, which is the main data of the record. (See below.)
The main data of an XLOG record that contains a full-page image is not used except in some special cases, such as logical decoding and speculative insertions. It is ignored when the record is replayed, making it redundant data. This may be improved in the future.
In addition, the main data of backup block records depends on the statements that create them. For example, an UPDATE statement appends xl_heap_lock or xl_heap_update.
Next, I will describe the non-backup block record created by the INSERT statement (see Fig. 9.10(b)). It is composed of four data structures and one data object:
The XLogRecordBlockHeader structure contains three values (the relfilenode, the fork number, and the block number) to specify the block that the tuple was inserted into, and the length of the data portion of the inserted tuple. The XLogRecordDataHeaderShort structure contains the length of the new xl_heap_insert structure, which is the main data of this record.
The new xl_heap_insert structure contains only two values: the offset number of this tuple within the block, and a visibility flag. It became very simple because the XLogRecordBlockHeader structure stores most of the data that was contained in the old xl_heap_insert structure.
As the final example, a checkpoint record is shown in the Fig. 9.10(c). It is composed of three data structures:
The structure xl_heap_header is defined in src/include/access/htup.h and the CheckPoint structure is defined in src/include/catalog/pg_control.h.
Although the new format is a little complicated for us, it is well-designed for the parsers of the resource managers. Additionally, the size of many types of XLOG records is usually smaller than the previous ones. The sizes of the main structures are shown in Figures 9.8 and 9.10, so you can calculate the sizes of those records and compare them. (The size of the new checkpoint is greater than the previous one, but it contains more variables.)
Having finished the warm-up exercises, we are now ready to understand the writing of XLOG records. I will explain it as precisely as possible in this section.
First, issue the following statement to explore the PostgreSQL internals:
testdb=# INSERT INTO tbl VALUES ('A');
By issuing the above statement, the internal function exec_simple_query() is invoked. The pseudocode of exec_simple_query() is shown below:
exec_simple_query() @postgres.c

(1) ExtendCLOG() @clog.c                    /* Write the state of this transaction
                                             * "IN_PROGRESS" to the CLOG. */

(2) heap_insert() @heapam.c                 /* Insert a tuple, create an XLOG record,
                                             * and invoke the function XLogInsert. */

(3)   XLogInsert() @xloginsert.c            /* (9.4 or earlier, xlog.c)
                                             * Write the XLOG record of the inserted tuple
                                             * to the WAL buffer, and update the page's pd_lsn. */

(4) finish_xact_command() @postgres.c       /* Invoke commit action. */
      XLogInsert() @xloginsert.c            /* (9.4 or earlier, xlog.c)
                                             * Write an XLOG record of this commit action
                                             * to the WAL buffer. */

(5)   XLogWrite() @xlog.c                   /* Write and flush all XLOG records in
                                             * the WAL buffer to a WAL segment. */

(6) TransactionIdCommitTree() @transam.c    /* Change the state of this transaction
                                             * from "IN_PROGRESS" to "COMMITTED" in the CLOG. */
In the following paragraphs, each line of the pseudocode will be explained to help you understand the writing of XLOG records. See also Figs. 9.11 and 9.12.
In the above example, the commit action caused the writing of XLOG records to the WAL segment, but such writing may be caused by any of the following:
If any of the above occurs, all WAL records on the WAL buffer are written into a WAL segment file regardless of whether their transactions have been committed or not.
It is taken for granted that DML (Data Manipulation Language) operations write XLOG records, but so do non-DML operations. As described above, a commit action writes an XLOG record that contains the id of the committed transaction. Another example is a checkpoint action, which writes an XLOG record that contains general information about the checkpoint. Furthermore, a SELECT statement creates XLOG records in special cases, although it does not usually create them. For example, if HOT (Heap Only Tuple) pruning deletes unnecessary tuples and defragments the necessary tuples in a page during a SELECT statement, the XLOG records of the modified pages are written to the WAL buffer.
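To see this from a session, the amount of WAL produced by a single statement can be measured roughly as follows. This is a sketch assuming version 10 or later and psql (the \gset meta-command stores the first LSN in a variable); tbl is the example table used above.

testdb=# SELECT pg_current_wal_insert_lsn() AS before_lsn \gset
testdb=# INSERT INTO tbl VALUES ('A');
testdb=# SELECT pg_wal_lsn_diff(pg_current_wal_insert_lsn(), :'before_lsn') AS wal_bytes;
-- wal_bytes covers the heap-insert XLOG record and the commit record written at commit time.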
PostgreSQL versions 15 and earlier do not support direct I/O, although it has been discussed. Refer to this discussion on the pgsql mailing list and this article.
In version 16, the debug-io-direct option has been added. This option is for developers to improve the use of direct I/O in PostgreSQL. If development goes well, direct I/O will be officially supported in the near future.
The WAL writer is a background process that periodically checks the WAL buffer and writes any unwritten XLOG records to the WAL segments. This process helps to avoid bursts of XLOG record writing. Without it, the writing of XLOG records could become a bottleneck when a large amount of data is committed at once.
The WAL writer is enabled by default and cannot be disabled. The check interval is set by the configuration parameter wal_writer_delay, which defaults to 200 milliseconds.
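Both settings can be inspected as follows (a minimal sketch; wal_writer_flush_after was added in version 9.6):

SELECT name, setting, unit
  FROM pg_settings
 WHERE name IN ('wal_writer_delay', 'wal_writer_flush_after');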
In PostgreSQL, the checkpointer (background) process performs checkpoints. It starts when one of the following occurs:
In versions 9.1 or earlier, as mentioned in Section 8.6, the background writer process did both checkpointing and dirty-page writing.
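A checkpoint can also be requested manually, and the cumulative statistics view shows how many checkpoints were triggered by timeout versus by request. This is a sketch; the two columns below exist in pg_stat_bgwriter up to version 16, and CHECKPOINT requires superuser privileges (or, in version 15 or later, the pg_checkpoint role).

CHECKPOINT;
SELECT checkpoints_timed, checkpoints_req
  FROM pg_stat_bgwriter;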
In the following subsections, the outline of checkpointing and the pg_control file, which holds the metadata of the current checkpoint, are described.
The checkpoint process has two aspects: preparing for database recovery, and cleaning dirty pages in the shared buffer pool. In this subsection, we will focus on the former aspect and describe its internal processing. See Fig. 9.13 for an overview.
Fig. 9.13. Internal processing of PostgreSQL Checkpoint.

To summarize the description above from the perspective of database recovery, checkpointing creates a checkpoint record that contains the REDO point, and stores the checkpoint location and other information in the pg_control file. This allows PostgreSQL to recover itself by replaying WAL data from the REDO point, which it obtains from the checkpoint record whose location is recorded in the pg_control file.
As the pg_control file contains the fundamental information of the checkpoint, it is essential for database recovery. If it is corrupted or unreadable, the recovery process cannot start because it cannot obtain a starting point.
Although the pg_control file stores more than 40 items, the three items that will be needed in the next section are shown below:
The pg_control file is stored in the global subdirectory under the base directory. Its contents can be shown using the pg_controldata utility.
postgres> pg_controldata /usr/local/pgsql/data
pg_control version number:            1300
Catalog version number:               202306141
Database system identifier:           7250496631638317596
Database cluster state:               in production
pg_control last modified:
Latest checkpoint location:           0/16AF0090
Latest checkpoint's REDO location:    0/16AF0090
Latest checkpoint's REDO WAL file:    000000010000000000000016

... snip ...
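In versions 9.6 or later, roughly the same checkpoint information can also be read from a live session through the pg_control_checkpoint() function; a small example (only a few of the available columns are selected):

SELECT checkpoint_lsn, redo_lsn, redo_wal_file, timeline_id
  FROM pg_control_checkpoint();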
PostgreSQL 11 or later only stores the WAL segments that contain the latest checkpoint or newer. Older segment files, which contain the prior checkpoint, are no longer kept, in order to reduce the disk space used for WAL segment files under the pg_wal subdirectory. See this thread for details.
PostgreSQL implements redo log-based recovery. If the database server crashes, PostgreSQL can restore the database cluster by sequentially replaying the XLOG records in the WAL segment files from the REDO point.
We have already talked about database recovery several times up to this section. Here, I will describe two things about recovery that have not been explained yet.
The first thing is how PostgreSQL starts the recovery process. When PostgreSQL starts up, it first reads the pg_control file. The following are the details of the recovery process from that point. See Fig. 9.14 and the following description.
Fig. 9.14. Details of the recovery process.

The second point is about the comparison of LSNs: why the non-backup block's LSN and the corresponding page's pd_lsn should be compared. Unlike the previous examples, this will be explained using a specific example that emphasizes the need for this comparison. See Figs. 9.15 and 9.16. (Note that the WAL buffer is omitted to simplify the description.)
Fig. 9.15. Insertion operations while the background writer is working.

Unlike the examples in the overview, TABLE_A's page has already been written to the storage once in this scenario.
Shut down in immediate mode, and then start the database.
Fig. 9.16. Database recovery.

As you can see from this example, if the replay order of non-backup blocks is incorrect or if non-backup blocks are replayed more than once, the database cluster will no longer be consistent. In short, the redo (replay) operation of a non-backup block is not idempotent. Therefore, to preserve the correct replay order, a non-backup block record should be replayed only if its LSN is greater than the corresponding page's pd_lsn.
On the other hand, as the redo operation of a backup block is idempotent, backup blocks can be replayed any number of times regardless of their LSN.
PostgreSQL writes XLOG records to one of the WAL segment files stored in the pg_wal subdirectory (in versions 9.6 or earlier, the pg_xlog subdirectory). PostgreSQL switches to a new WAL segment file when the current one has been filled up. The number of WAL files varies depending on several configuration parameters, as well as server activity. In addition, the management policy for WAL segment files was improved in version 9.5.
The following subsections describe how WAL segment files are switched and managed.
WAL segment switches occur when one of the following events happens:
When a WAL segment file is switched, it is usually recycled (renamed and reused) for future use. However, it may be removed later if it is not needed.
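For reference, a switch can also be forced by hand, which is sometimes convenient when testing archiving (pg_switch_wal() in version 10 or later; pg_switch_xlog() in 9.6 or earlier):

testdb=# SELECT pg_switch_wal();   -- forces a switch and returns the end location of the just-completed segment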
Whenever a checkpoint starts, PostgreSQL estimates and prepares the number of WAL segment files that will be needed for the next checkpoint cycle. This estimate is based on the number of WAL segment files that were consumed in previous checkpoint cycles.
The number of WAL segment files is counted from the segment that contains the prior REDO point, and the value must be between the min_wal_size parameter (which defaults to 80 MB, i.e. 5 files) and the max_wal_size parameter (which defaults to 1 GB, i.e. 64 files).
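The current limits can be checked with SHOW; a minimal example (the values in the comments are the defaults):

testdb=# SHOW min_wal_size;   -- 80MB by default
testdb=# SHOW max_wal_size;   -- 1GB by default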
When a checkpoint starts, PostgreSQL keeps or recycles the necessary WAL segment files and removes any unnecessary files.
A specific example is shown in Fig. 9.17. Assuming that there are six WAL segment files before a checkpoint starts, WAL_3 contains the prior REDO point (in versions 10 or earlier; in versions 11 or later, the REDO point), and PostgreSQL estimates that five files will be needed. In this case, WAL_1 will be renamed as WAL_7 for recycling and WAL_2 will be removed.
The files older than the one that contains the prior REDO point can be removed, because, as is clear from the recovery mechanism described in Section 9.8, they would never be used.
If more WAL segment files are required due to a spike in WAL activity, new WAL segment files will be created while the total size of the WAL segment files is less than the max_wal_size parameter. For example, in Fig. 9.18, if WAL_7 has been filled up, WAL_8 will be newly created.
Fig. 9.18. Creating WAL segment file.

The number of WAL segment files adapts to the server activity. If the amount of WAL data writing has been constantly increasing, the estimated number of WAL segment files, as well as the total size of the WAL segment files, will gradually increase. In the opposite case (i.e., the amount of WAL data writing has decreased), these will decrease.
If the total size of the WAL segment files exceeds the max_wal_size parameter, a checkpoint will be started. Figure 9.19 illustrates this situation. By checkpointing, a new REDO point will be created and the prior REDO point will be discarded; then, unnecessary old WAL segment files will be recycled. In this way, PostgreSQL will always keep just the WAL segment files that are needed for database recovery.
Fig. 9.19. Checkpointing and recycling WAL segment files.

The wal_keep_size parameter (or wal_keep_segments in versions 12 or earlier) and the replication slot feature also affect the number of WAL segment files.
Continuous archiving is a feature that copies WAL segment files to an archival area at the time when a WAL segment switch occurs. It is performed by the archiver (background) process. The copied file is called an archive log. This feature is typically used for hot physical backup and PITR (Point-in-Time Recovery), which are described in Chapter 10.
The path to the archival area is set by the archive_command configuration parameter. For example, the following parameter would copy WAL segment files to the directory '/home/postgres/archives/' every time a segment switch occurs:
archive_command = 'cp %p /home/postgres/archives/%f'
where the %p placeholder is replaced by the path of the WAL segment file to be copied, and the %f placeholder by its file name (i.e., the name of the resulting archive log).
Fig. 9.20. Continuous archiving.

The archive_command parameter can be set to any Unix command or tool. This means that you can use the scp command or any other file backup tool to transfer the archive logs to another host, instead of using a simple copy command.
In versions 14 or earlier, continuous archiving could only use shell commands. In version 15, PostgreSQL introduced a loadable library feature that allows you to achieve continuous archiving using a library. For more information, see the archive_library and basic_archive documentation.
PostgreSQL does not automatically clean up created archive logs. Therefore, you must properly manage the logs when using this feature. If you do nothing, the number of archive logs will continue to grow.
The pg_archivecleanup utility is one of the useful tools for managing archive log files.
You can also use the find command to delete archive logs. For example, the following command would delete all archive logs that were created more than three days ago:
$ find /home/postgres/archives -mtime +3 -exec rm -f {} \;