1.Logic Structure
Speaking about its logical structure, we can describe WAL as a stream of log entries of variable length. Each entry contains some data about a particular operation
preceded by a standard header.
1 Among other things, the header provides the following
information:
• transaction ID related to the entry
• the resource manager that interprets the entry
• the checksum to detect data corruption
• entry length
• a reference to the previous WAL entry
WAL is usually read in the forward direction, but some utilities like pg_rewind may scan it backwards.
2.Pyhsical Structure
2.1 WAL Buffer(wal_buffers)
--wal_buffers默认为1/32 Buffer Cache Size
WAL fles take up special buffers in the server’s shared memory. The size of the wal_buffer parameter, this size chosen automatically as 1/32 of the total buffer cache size.
2.2 WAL Log大小(类比oracle online redolog)
On disk, the WAL is stored in the $PG_DATA/pg_wal directory as separate ffles, or seg 16MBments. Their size is shown by the read-only wal_segment_size parameter.
For high-load systems, it makes sense to increase the segment size since it may reduce the overhead, but this setting can be modiffed only during cluster initialization
(initdb --wal-segsize).
postgres=# select pg_current_wal_lsn(), pg_current_wal_insert_lsn(),pg_walfile_name(pg_current_wal_lsn()),pg_walfile_name_offset(pg_current_wal_lsn());
pg_current_wal_lsn | pg_current_wal_insert_lsn | pg_walfile_name | pg_walfile_name_offset
--------------------+---------------------------+--------------------------+------------------------------------
5/87CA310 | 5/87CA310 | 000000010000000500000008 | (000000010000000500000008,8168208)
(1 row)
postgres=# SELECT file_name, upper(to_hex(file_offset)) file_offset FROM pg_walfile_name_offset('5/87C9FA8');
file_name | file_offset
--------------------------+-------------
000000010000000500000008 | 7C9FA8
(1 row)
Timeline(高8位):timeline used for recovery from a backup
lsn:中间8为为wal logsequence number(16^8=4294967296,16MB maxsize=4294967296*16/1024=67108864GB)
末8为日志segment offset
--查看TimeLine
[postgres@postgre12 ~]$ pg_controldata | grep TimeLine
Latest checkpoint's TimeLineID: 1
Latest checkpoint's PrevTimeLineID: 1
-----
数据库恢复到过去时间点时,会导致类似科幻小说里的时间旅行和平行宇宙的复杂情况,因而需要标识这些“平行宇宙”。每当数据库恢复完成时,都会创建一个新的时间线(类似于因时间旅行而产生的新的平行宇宙)来标识在该恢复后产生的WAL日志,为了区分因数据库恢复造成的不同时间段的WAL日志而产生的时间记录称为时间线(TimeLine),WAL日志文件名 000000010000000000000001 中前八位 00000001 表示的便是数据库的时间线。
数据库初始化后,默认时间线是1,随着数据库系统的运行,新时间线会在以下两种情况下产生:
PITR恢复后执行了promote操作( select sys_wal_replay_resume() )
备节点升为主节点
每次创建一个新的时间线,都会创建一个“时间线历史”文件,文件名后缀为 .history ,它里面的内容是由原时间线history文件的内容再追加一条当前时间线切换记录。假设数据库恢复启动后,切换到新的时间线ID为4,那么文件名就是 00000004.history ,该文件记录了当前时间线是在什么时间从哪个时间线由于什么原因分出来的,该文件可能含有多行记录,比如:
$ cat 00000004.history
1 0/140000C8 no recovery target specified
2 0/19000060 no recovery target specified
3 0/1F000090 no recovery target specified
wal_segment_size为只读参数(如果需要更改主要重新编译执行initdb --wal-segsize)
postgres=# select name,setting from pg_settings where name like '%wal%';
name | setting
------------------------------+-----------
max_wal_senders | 10
max_wal_size | 1024
min_wal_size | 80
wal_block_size | 8192
wal_buffers | 512
wal_compression | off
wal_consistency_checking |
wal_init_zero | on
wal_keep_segments | 0
wal_level | replica
wal_log_hints | off
wal_receiver_status_interval | 10
wal_receiver_timeout | 60000
wal_recycle | on
wal_retrieve_retry_interval | 5000
wal_segment_size | 16777216
wal_sender_timeout | 60000
wal_sync_method | fdatasync
wal_writer_delay | 200
wal_writer_flush_after | 128
PostgreSQL WAL结构
于 2024-06-16 23:53:42 首次发布