PostgreSQL DBA(19) - REDO point浅析

What's REDO point?

REDO point is the point PostgreSQL start to recover from;that is, the location to write the XLOG record at the moment when the latest checkpoint is started.

REDO point是PostgreSQL启动恢复的起始点,也就是说REDO point是最后一次checkpoint启动时写XLOG Record的位置.

一、REDO point的存储

checkpointer进程启动时,从pg_control文件中获取Redo point并存储在内存中,在执行checkpoint时更新Redo point为当前即将写XLOG Record的位置,checkpoint执行成功后把Redo point更新到pg_control文件中.
类似的,数据库启动需要执行恢复时,从pg_control文件中获取REDO point进行恢复.
pg_control文件位于$PGDATA/global目录中,可通过命令pg_controldata查看文件中的内容.

[xdb@localhost pg111db]$ find ./ -name pg_control
./global/pg_control
[xdb@localhost pg111db]$ pg_controldata
pg_control version number:            1100
Catalog version number:               201809051
Database system identifier:           6624362124887945794
Database cluster state:               in production
pg_control last modified:             Thu 20 Dec 2018 03:34:05 PM CST
Latest checkpoint location:           1/48447DF0
Latest checkpoint's REDO location:    1/48447DF0
Latest checkpoint's REDO WAL file:    000000010000000100000048
Latest checkpoint's TimeLineID:       1
Latest checkpoint's PrevTimeLineID:   1
Latest checkpoint's full_page_writes: on
Latest checkpoint's NextXID:          0:1979
Latest checkpoint's NextOID:          25238
Latest checkpoint's NextMultiXactId:  1
Latest checkpoint's NextMultiOffset:  0
Latest checkpoint's oldestXID:        561
Latest checkpoint's oldestXID's DB:   16402
Latest checkpoint's oldestActiveXID:  0
Latest checkpoint's oldestMultiXid:   1
Latest checkpoint's oldestMulti's DB: 16402
Latest checkpoint's oldestCommitTsXid:0
Latest checkpoint's newestCommitTsXid:0
Time of latest checkpoint:            Thu 20 Dec 2018 03:34:05 PM CST
Fake LSN counter for unlogged rels:   0/1
Minimum recovery ending location:     0/0
Min recovery ending loc's timeline:   0
Backup start location:                0/0
Backup end location:                  0/0
End-of-backup record required:        no
wal_level setting:                    minimal
wal_log_hints setting:                off
max_connections setting:              100
max_worker_processes setting:         8
max_prepared_xacts setting:           0
max_locks_per_xact setting:           64
track_commit_timestamp setting:       off
Maximum data alignment:               8
Database block size:                  8192
Blocks per segment of large relation: 131072
WAL block size:                       8192
Bytes per WAL segment:                16777216
Maximum length of identifiers:        64
Maximum columns in an index:          32
Maximum size of a TOAST chunk:        1996
Size of a large-object chunk:         2048
Date/time type storage:               64-bit integers
Float4 argument passing:              by value
Float8 argument passing:              by value
Data page checksum version:           0
Mock authentication nonce:            90bf37566703859a557b7f20688eb944b6335b5c3d36f5530941ebf1dfa777c1
[xdb@localhost pg111db]$ 

其中

Latest checkpoint's REDO location:    1/48447DF0

记录的信息即为REDO point.

二、REDO point的变更

1.查看当前的REDO point

[xdb@localhost pg111db]$ pg_controldata|grep 'REDO location'
Latest checkpoint's REDO location:    1/48448150

REDO point为1/48448150.

2.执行DML操作
插入3条记录

testdb=# insert into cp values(7);
INSERT 0 1
testdb=# insert into cp values(8);
INSERT 0 1
testdb=# insert into cp values(9);
INSERT 0 1

查看1/48448150后的XLOG Record

[xdb@localhost pg_wal]$ pg_waldump -p ./ -s 1/48448150
rmgr: XLOG        len (rec/tot):    106/   106, tx:          0, lsn: 1/48448150, prev 1/484480E0, desc: CHECKPOINT_ONLINE redo 1/48448150; tli 1; prev tli 1; fpw true; xid 0:1982; oid 25238; multi 1; offset 0; oldest xid 561 in DB 16402; oldest multi 1 in DB 16402; oldest/newest commit timestamp xid: 0/0; oldest running xid 0; online
rmgr: Heap        len (rec/tot):     54/   474, tx:       1982, lsn: 1/484481C0, prev 1/48448150, desc: INSERT off 11, blkref #0: rel 1663/16402/17046 blk 0 FPW
rmgr: Transaction len (rec/tot):     34/    34, tx:       1982, lsn: 1/484483A0, prev 1/484481C0, desc: COMMIT 2018-12-20 16:10:17.471639 CST
rmgr: Heap        len (rec/tot):     59/    59, tx:       1983, lsn: 1/484483C8, prev 1/484483A0, desc: INSERT off 12, blkref #0: rel 1663/16402/17046 blk 0
rmgr: Transaction len (rec/tot):     34/    34, tx:       1983, lsn: 1/48448408, prev 1/484483C8, desc: COMMIT 2018-12-20 16:10:20.170594 CST
rmgr: Heap        len (rec/tot):     59/    59, tx:       1984, lsn: 1/48448430, prev 1/48448408, desc: INSERT off 13, blkref #0: rel 1663/16402/17046 blk 0
rmgr: Transaction len (rec/tot):     34/    34, tx:       1984, lsn: 1/48448470, prev 1/48448430, desc: COMMIT 2018-12-20 16:10:22.268365 CST
pg_waldump: FATAL:  error in WAL record at 1/48448470: invalid record length at 1/48448498: wanted 24, got 0

最后一个XLOG Record记录的位置为1/48448470,加上记录大小34(十六进制为0x22),位置为1/48448492,按理论上来说,如果现在执行checkpoint,该位置为REDO point.

3.执行checkpoint

testdb=# checkpoint;
CHECKPOINT

查看pg_control文件内容

[xdb@localhost pg111db]$ pg_controldata|grep 'REDO location'
Latest checkpoint's REDO location:    1/48448498

再次查看XLOG Record记录

[xdb@localhost pg_wal]$ pg_waldump -p ./ -s 1/48448150
rmgr: XLOG        len (rec/tot):    106/   106, tx:          0, lsn: 1/48448150, prev 1/484480E0, desc: CHECKPOINT_ONLINE redo 1/48448150; tli 1; prev tli 1; fpw true; xid 0:1982; oid 25238; multi 1; offset 0; oldest xid 561 in DB 16402; oldest multi 1 in DB 16402; oldest/newest commit timestamp xid: 0/0; oldest running xid 0; online
rmgr: Heap        len (rec/tot):     54/   474, tx:       1982, lsn: 1/484481C0, prev 1/48448150, desc: INSERT off 11, blkref #0: rel 1663/16402/17046 blk 0 FPW
rmgr: Transaction len (rec/tot):     34/    34, tx:       1982, lsn: 1/484483A0, prev 1/484481C0, desc: COMMIT 2018-12-20 16:10:17.471639 CST
rmgr: Heap        len (rec/tot):     59/    59, tx:       1983, lsn: 1/484483C8, prev 1/484483A0, desc: INSERT off 12, blkref #0: rel 1663/16402/17046 blk 0
rmgr: Transaction len (rec/tot):     34/    34, tx:       1983, lsn: 1/48448408, prev 1/484483C8, desc: COMMIT 2018-12-20 16:10:20.170594 CST
rmgr: Heap        len (rec/tot):     59/    59, tx:       1984, lsn: 1/48448430, prev 1/48448408, desc: INSERT off 13, blkref #0: rel 1663/16402/17046 blk 0
rmgr: Transaction len (rec/tot):     34/    34, tx:       1984, lsn: 1/48448470, prev 1/48448430, desc: COMMIT 2018-12-20 16:10:22.268365 CST
rmgr: XLOG        len (rec/tot):    106/   106, tx:          0, lsn: 1/48448498, prev 1/48448470, desc: CHECKPOINT_ONLINE redo 1/48448498; tli 1; prev tli 1; fpw true; xid 0:1985; oid 25238; multi 1; offset 0; oldest xid 561 in DB 16402; oldest multi 1 in DB 16402; oldest/newest commit timestamp xid: 0/0; oldest running xid 0; online
pg_waldump: FATAL:  error in WAL record at 1/48448498: invalid record length at 1/48448508: wanted 24, got 0
[xdb@localhost pg_wal]$ 

可以看到1/48448498是最后一条checkpoint记录的起始写入位置,REDO point为1/48448498,比预想的要多出6个字节(这6个字节都是0x00,用于补齐?).

4.dump WAL segment file
再次使用hexdump工具查看WAL segment file文件.

1/48448470开始的"COMMIT"记录

[xdb@localhost pg_wal]$ echo "obase=10;ibase=16;448470"|bc
4490352 --> 文件位置偏移
[xdb@localhost pg_wal]$ hexdump -C 000000010000000100000048 -s 4490352 -n 34 
00448470  22 00 00 00 c0 07 00 00  30 84 44 48 01 00 00 00  |".......0.DH....|
00448480  00 01 00 00 83 5e 7a d4  ff 08 cd 2f 4a 6e 6e 20  |.....^z..../Jnn |
00448490  02 00                                             |..|
00448492

XLOG Record的头部首先是XLogRecord结构体,第一个域是uint32的record的大小,即0x00000022,十进制为34(大小无异).

后续的6个字节
均为0x00

[xdb@localhost pg_wal]$ hexdump -C 000000010000000100000048 -s 4490386 -n 6 
00448492  00 00 00 00 00 00                                 |......|
00448498

XLOG Record for checkpoint

[xdb@localhost pg_wal]$ hexdump -C 000000010000000100000048 -s 4490392 -n 106 
00448498  6a 00 00 00 00 00 00 00  70 84 44 48 01 00 00 00  |j.......p.DH....|
004484a8  10 00 00 00 ac d2 8b 95  ff 50 98 84 44 48 01 00  |.........P..DH..|
004484b8  00 00 01 00 00 00 01 00  00 00 01 00 00 00 00 00  |................|
004484c8  00 00 c1 07 00 00 96 62  00 00 01 00 00 00 00 00  |.......b........|
004484d8  00 00 31 02 00 00 12 40  00 00 01 00 00 00 12 40  |..1....@.......@|
004484e8  00 00 ba 4e 1b 5c 00 00  00 00 00 00 00 00 00 00  |...N.\..........|
004484f8  00 00 00 00 00 00 00 00  00 00                    |..........|
00448502

大小为0x0000006A,即106B(头部XLogRecord 24B + XLogRecordDataHeaderShort 2B + Checkpoint 80B ).
从FF 50为XLogRecordDataHeaderShort结构体的内容,0xFF为标志位,0x50为Data的大小(即80B).
checkpoint记录的内容详见Checkpoint结构体,该结构体第一个域字段为8个字节的LSN-->0x00000001 48448498,
即REDO point:1/48448498.

三、参考资料

PostgreSQL 源码解读(109)- WAL#5(相关数据结构)
PostgreSQL DBA(15) - WAL文件结构
PostgreSQL DBA(16) - WAL segment file内部结构
PostgreSQL DBA(17) - XLOG Record data内部结构
PostgreSQL DBA(18) - pg_waldump工具简介

来自 “ ITPUB博客 ” ,链接:http://blog.itpub.net/6906/viewspace-2374774/,如需转载,请注明出处,否则将追究法律责任。

转载于:http://blog.itpub.net/6906/viewspace-2374774/

  • 0
    点赞
  • 0
    收藏
    觉得还不错? 一键收藏
  • 0
    评论

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值