PostgreSQL Replication Slots

排骨~年糕

Last modified 2024-05-10 16:40:31

Tags: postgresql, database

First published 2024-05-10 14:58:04

Article link: https://blog.csdn.net/m0_71902491/article/details/138665631

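1. Introduction to replication slots

When streaming replication is running normally, the primary does not retain extra WAL for standbys that fall behind. If a standby disconnects for any reason, its lag keeps growing, and once WAL segments it has not yet received are recycled on the primary, the standby can no longer catch up and reports an error like this:

ERROR: requested WAL segment 000000010000000000000008 has already been removed

A replication slot (in this article, a physical replication slot) provides a way to ensure that the primary does not remove WAL that has not yet been sent to the standby, even while the standby is offline.

Using the standby state recorded in the replication slot, PostgreSQL guarantees that WAL the standby has not yet received is not removed from the primary's WAL directory. The slot's state is also persisted to disk, so it is not lost or invalidated when the standby disconnects or the primary restarts.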
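
As a rough mental model of the retention rule described above, the sketch below treats WAL as numbered segments and computes which ones a checkpoint may remove. It is a deliberate simplification, not PostgreSQL's actual algorithm: the function name `removable_segments` and its parameters are hypothetical, and settings such as wal_keep_size and segment recycling are ignored.

```python
def removable_segments(segments, redo_segno, slot_restart_segnos=()):
    """Return the segment numbers a checkpoint could remove in this toy model.

    segments            -- segment numbers currently on disk
    redo_segno          -- segment holding the latest checkpoint's redo point
    slot_restart_segnos -- segment of each slot's restart_lsn (empty = no slots)
    """
    # Everything from the oldest still-needed segment onward must be kept.
    keep_from = min([redo_segno, *slot_restart_segnos])
    return [s for s in sorted(segments) if s < keep_from]


on_disk = range(2, 8)  # hypothetical: segments 2..7 exist on the primary

# Without a slot, only the checkpoint's redo point protects WAL.
print(removable_segments(on_disk, redo_segno=6))
# With a slot whose standby stopped at segment 3, segment 3 onward is retained.
print(removable_segments(on_disk, redo_segno=6, slot_restart_segnos=[3]))
```

In this model a stalled slot pins every later segment, which is also why an abandoned slot can make the WAL directory grow without bound.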

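2. Creating a replication slot

Create the replication slot on the primary

select * from pg_create_physical_replication_slot('slot1'); 
 slot_name | lsn 
-----------+----- 
 slot1     | 
(1 row) 
select * from pg_replication_slots; 
-[ RECORD 1 ]-------+--------- 
slot_name           | slot1 # name of the replication slot 
plugin              | # output plugin used by the slot; empty for a physical slot 
slot_type           | physical # slot type: physical or logical 
datoid              | # OID of the database the slot belongs to; empty for a physical slot 
database            | # name of the database the slot belongs to; empty for a physical slot 
temporary           | f # whether the slot is temporary; a temporary slot is not saved to disk and is dropped automatically on error or when the session ends 
active              | f # t if the slot is currently in use 
active_pid          | # process ID of the session using the slot 
xmin                | # oldest transaction that this slot needs the database to retain; VACUUM cannot remove tuples deleted by any later transaction 
catalog_xmin        | # oldest transaction affecting the system catalogs that this slot needs the database to retain; VACUUM cannot remove catalog tuples deleted by any later transaction 
restart_lsn         | # LSN of the oldest WAL that the consumer of this slot may still need; checkpoints will not remove WAL files from this LSN onward 
confirmed_flush_lsn | # LSN up to which the logical slot's consumer has confirmed receiving data; data older than this is no longer available. NULL for physical slots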

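Configure the parameter on the standby, then reload the standby's configuration (PostgreSQL 13 and later apply primary_slot_name on reload; earlier versions need a restart)

-- Edit postgresql.conf so that the standby uses the replication slot on the primary 
vi postgresql.conf 
primary_slot_name = 'slot1' 

-- Reload the configuration file 
pg_ctl reload 

-- If everything is working, the slot on the primary should now show as active. 
select * from pg_replication_slots; 
-[ RECORD 1 ]-------+---------- 
slot_name           | slot1 
plugin              | 
slot_type           | physical 
datoid              | 
database            | 
temporary           | f 
active              | t 
active_pid          | 16652 
xmin                | 
catalog_xmin        | 
restart_lsn         | 0/3003B98 
confirmed_flush_lsn |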

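Find the WAL file that contains restart_lsn

restart_lsn marks the position from which checkpoints on the primary will not remove WAL; everything from there onward is kept for the standby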

postgres=# select pg_walfile_name(restart_lsn) from pg_replication_slots; 
 pg_walfile_name | 000000010000000000000013

Simulation: without a replication slot

postgres=# select pg_drop_replication_slot('slot1'); 
 pg_drop_replication_slot 
-------------------------- 
 
(1 row) 
pg_ctl reload 

postgres=# select * from pg_replication_slots; 
 slot_name | plugin | slot_type | datoid | database | temporary | active | active_pid | xmin | catalog_xmin | restart_lsn | confirmed_flush_lsn 
-----------+--------+-----------+--------+----------+-----------+--------+------------+------+--------------+-------------+--------------------- 
(0 rows) 
 
-- Stop the standby: 
pg_ctl stop 

-- Check replication status on the primary 
postgres=# select client_addr,sync_state from pg_stat_replication;
 client_addr | sync_state 
-------------+------------ 
(0 rows) 

 
-- On the primary, repeatedly create tables and switch WAL segments 
postgres=# create table t2(id int); 
CREATE TABLE 
postgres=# select pg_switch_wal(); 
 pg_switch_wal 
--------------- 
 0/3016168 
(1 row)

postgres=# create table t3(id int);
CREATE TABLE 
postgres=# select pg_switch_wal(); 
 pg_switch_wal 
--------------- 
 0/4003A50 
(1 row)

postgres=# create table t4(id int); 
CREATE TABLE 
postgres=# select pg_switch_wal(); 
 pg_switch_wal 
--------------- 
 0/50011A0 
(1 row) 
postgres=# checkpoint;
CHECKPOINT

Start the standby again; streaming replication is now broken

-- Start the standby 
pg_ctl start 

-- Check replication status on the primary 
postgres=# select client_addr,sync_state from pg_stat_replication ; 
 client_addr | sync_state 
-------------+------------ 
(0 rows) 

-- Check the standby's log 
2023-05-07 10:20:59.162 CST [2079] STATEMENT: START_REPLICATION 3/61000000 TIMELINE 3 
2023-05-07 10:21:04.167 CST [2080] ERROR: requested WAL segment 000000030000000300000061 has already been removed
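
The segment name in this error follows directly from the LSN and timeline in the START_REPLICATION line. The sketch below reproduces that mapping in Python to illustrate the WAL file naming scheme. The function names are my own, and it assumes the default 16 MB WAL segment size.

```python
WAL_SEGMENT_SIZE = 16 * 1024 * 1024  # assumption: default wal_segment_size (16 MB)


def parse_lsn(lsn: str) -> int:
    """Convert an LSN such as '3/61000000' into a 64-bit byte position."""
    high, low = lsn.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def walfile_name(timeline: int, lsn: str, segment_size: int = WAL_SEGMENT_SIZE) -> str:
    """Name of the WAL segment file that contains the byte at the given LSN."""
    segno = parse_lsn(lsn) // segment_size
    segments_per_xlogid = 0x100000000 // segment_size
    return "%08X%08X%08X" % (
        timeline,
        segno // segments_per_xlogid,
        segno % segments_per_xlogid,
    )


# Timeline and LSN taken from the standby log above.
print(walfile_name(3, "3/61000000"))  # 000000030000000300000061
```

Depending on the PostgreSQL version, pg_walfile_name may report the preceding segment for an LSN that sits exactly on a segment boundary, so treat this as an illustration of the naming scheme rather than a drop-in replacement for the server function.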

Restore the streaming replication environment, this time with a replication slot

postgres=# select client_addr,sync_state from pg_stat_replication ; 
 client_addr | sync_state 
-----------------+------------ 
 192.168.1.103 | async 
(1 row) 

postgres=# select * from pg_replication_slots ; 
 slot_name | plugin | slot_type | datoid | database | temporary | active | active_pid | xmin | catalog_xmin | restart_lsn | confirmed_flush_lsn 
-----------+--------+-----------+--------+----------+-----------+--------+------------+------+--------------+-------------+--------------------- 
 slot1     |        | physical  |        |          | f         | t      |       4496 |  486 |              | 0/3000060   | 
(1 row) 

 
-- Same procedure as before: stop the standby 
pg_ctl stop 

-- Check replication status on the primary 
postgres=# select client_addr,sync_state from pg_stat_replication ; 
 client_addr | sync_state 
-------------+------------ 
(0 rows)

-- Then, on the primary, repeatedly create tables and switch WAL segments 
postgres=# create table t6(id int); 
CREATE TABLE 
postgres=# select pg_switch_wal(); 
 pg_switch_wal 
--------------- 
 0/3014FF0 
(1 row) 
 
postgres=# create table t7(id int); 
CREATE TABLE 
postgres=# select pg_switch_wal(); 
 pg_switch_wal 
--------------- 
 0/40011A0 
(1 row) 
postgres=# create table t8(id int); 
CREATE TABLE 
postgres=# select pg_switch_wal(); 
 pg_switch_wal 
--------------- 
 0/5003A50 
(1 row) 
postgres=# create table t9(id int); 
CREATE TABLE 
postgres=# select pg_switch_wal(); 
 pg_switch_wal 
--------------- 
 0/5003A50 
(1 row) 
postgres=# create table t10(id int); 
CREATE TABLE 
postgres=# select pg_switch_wal(); 
 pg_switch_wal 
--------------- 
 0/5003A50 
(1 row) 
postgres=# checkpoint ; 
CHECKPOINT 
-- Start the standby 
pg_ctl start 
-- Check replication status on the primary 
postgres=# select client_addr,sync_state from pg_stat_replication ; 
 client_addr | sync_state 
-----------------+------------ 
 192.168.1.103 | async 
(1 row)

Check on the standby that these tables exist; they do, so streaming replication is working normally

postgres=# \d 
 List of relations 
 Schema | Name | Type | Owner 
--------+------+-------+---------- 
 public | t1 | table | postgres 
 public | t10 | table | postgres 
 public | t2 | table | postgres 
 public | t3 | table | postgres 
 public | t4 | table | postgres 
 public | t5 | table | postgres 
 public | t6 | table | postgres 
 public | t7 | table | postgres 
 public | t8 | table | postgres 
 public | t9 | table | postgres 
(10 rows)

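Summary

A replication slot prevents WAL that the standby still needs from being removed on the primary. The primary uses the feedback returned by the standby to determine which WAL is no longer needed, and only then cleans it up.

Note

If the primary receives no reply from the standby, the slot's restart_lsn stays where it is, so the primary keeps retaining WAL locally and the WAL disk may fill up. Monitor WAL disk usage continuously, and consider setting wal_sender_timeout (default 60s) to a smaller value so that a disconnected standby is detected early.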
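
One way to put a number on that risk is to measure how far a slot's restart_lsn trails the primary's current WAL position, which is the same subtraction pg_wal_lsn_diff performs. The sketch below is illustrative only: the function names and the 1 GB threshold are assumptions, and the sample LSNs are simply values that appear in the outputs earlier in this article.

```python
def parse_lsn(lsn: str) -> int:
    """Convert an LSN such as '0/3000060' into a 64-bit byte position."""
    high, low = lsn.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def retained_wal_bytes(current_lsn: str, restart_lsn: str) -> int:
    """Bytes of WAL between a slot's restart_lsn and the current WAL position."""
    return parse_lsn(current_lsn) - parse_lsn(restart_lsn)


def slot_needs_attention(current_lsn: str, restart_lsn: str,
                         threshold_bytes: int = 1024 ** 3) -> bool:
    """True when the slot holds back more WAL than the (assumed) threshold."""
    return retained_wal_bytes(current_lsn, restart_lsn) > threshold_bytes


# Sample values: a pg_switch_wal() result and a restart_lsn shown earlier.
lag = retained_wal_bytes("0/5003A50", "0/3000060")
print(f"slot retains about {lag / 1024 ** 2:.1f} MB of WAL")
print(slot_needs_attention("0/5003A50", "0/3000060"))
```

In practice the two LSNs would come from the primary, for example from pg_current_wal_lsn() and the restart_lsn column of pg_replication_slots, and the threshold should be chosen relative to the free space on the WAL volume.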
