PostgreSQL - Streaming Replication - pg_rewind

|ChuckChen|

Article link: https://blog.csdn.net/chuckchen1222/article/details/86522793

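pg_rewind is a tool for synchronizing a PostgreSQL cluster with another copy of the same cluster after the clusters' timelines have diverged. A typical scenario is bringing an old primary server back online after a failover, as a standby that follows the new primary.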

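The result is equivalent to replacing the target data directory with the source one. Only changed blocks from relation files are copied; all other files, including configuration files, are copied in full. The advantage of pg_rewind over taking a new base backup, or over tools like rsync, is that pg_rewind does not need to read the unchanged blocks in the cluster. This makes it much faster when the database is large and only a small fraction of blocks differ between the clusters.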

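pg_rewind examines the timeline histories of the source and target clusters to determine the point where they diverged, and expects to find WAL in the target cluster's pg_wal directory reaching all the way back to the point of divergence. The point of divergence can be found on the target timeline, the source timeline, or their common ancestor. In the typical failover scenario, where the target cluster was shut down soon after the divergence, this is not a problem. If the target cluster ran for a long time after the divergence, however, the old WAL files might no longer be present. In that case, they can be copied manually from the WAL archive into the pg_wal directory, or fetched on startup by configuring recovery.conf. The use of pg_rewind is not limited to failover: for example, a standby server can be promoted, run some write transactions, and then be rewound to become a standby again.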
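When deciding which WAL files to copy back from the archive, it helps to know which segment file holds a given LSN. The helper below is a minimal sketch and not part of pg_rewind: the name wal_segment_name is made up for illustration, and it assumes the default 16 MB WAL segment size (in PG 11 this can be changed with initdb --wal-segsize). On a running server, the built-in SQL function pg_walfile_name() reports the file name for an LSN on the server's current timeline.

```shell
# Sketch: derive the WAL segment file name that contains a given LSN.
# $1 = timeline ID (decimal), $2 = LSN in "HI/LO" hex form, e.g. 10/DE000140
# Assumption: default WAL segment size of 16 MB.
wal_segment_name() {
    seg_size=$((16 * 1024 * 1024))
    # File name = timeline, high 32 bits of the LSN, segment number within
    # those 4 GB, each printed as 8 upper-case hex digits.
    printf '%08X%08X%08X\n' "$1" "$(( 0x${2%%/*} ))" "$(( 0x${2##*/} / seg_size ))"
}
```

For example, `wal_segment_name 1 10/DE000140` prints 0000000100000010000000DE.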

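When the target server is started for the first time after running pg_rewind, it goes into recovery mode and replays all WAL generated in the source server after the point of divergence. If some of that WAL was no longer available in the source server when pg_rewind was run, and therefore could not be copied by the pg_rewind session, it must be made available when the target server is started. This can be done by creating a recovery.conf file in the target data directory with a suitable restore_command.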
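As a rough illustration of such a recovery.conf, the sketch below writes one with a restore_command that copies segments out of a WAL archive directory. The function name write_recovery_conf and the archive path are hypothetical; the settings themselves are the PG 11 ones (recovery.conf was removed in PostgreSQL 12, so this applies only to PG 11 and earlier). It also assumes the connection string contains no single quotes.

```shell
# Sketch: write a minimal recovery.conf for PostgreSQL 11 or earlier.
# $1 = target data directory
# $2 = WAL archive directory (hypothetical location, adjust to your setup)
# $3 = primary_conninfo string (assumed to contain no single quotes)
write_recovery_conf() {
    cat > "$1/recovery.conf" <<EOF
standby_mode = 'on'
primary_conninfo = '$3'
restore_command = 'cp $2/%f %p'
recovery_target_timeline = 'latest'
EOF
}
```

A call might look like `write_recovery_conf "$PGDATA" /mnt/server/archivedir 'host=192.168.3.52 port=5432 user=dbsr'`, where the archive path is only a placeholder.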

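pg_rewind requires that the target server either has the wal_log_hints option enabled in postgresql.conf, or that data checksums were enabled when the cluster was initialized with initdb. Neither of these is on by default in PG 11. full_page_writes must also be set to on, which it is by default.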
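The rule above can be checked before a failover ever happens. The following is a small sketch; rewind_prereqs_ok is a made-up helper name, and it only encodes the condition stated in the paragraph, given the three setting values as arguments.

```shell
# Sketch: succeed (exit status 0) if the settings allow pg_rewind.
# $1 = full_page_writes, $2 = wal_log_hints, $3 = data_checksums ("on"/"off")
rewind_prereqs_ok() {
    # full_page_writes is mandatory
    [ "$1" = "on" ] || return 1
    # plus at least one of wal_log_hints or data checksums
    [ "$2" = "on" ] || [ "$3" = "on" ]
}
```

The values can be read from a running server with, for example, `psql -Atc 'SHOW wal_log_hints'` (and likewise for full_page_writes and data_checksums), then passed to the helper.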

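If pg_rewind fails while processing, the target's data folder is likely not in a state that can be recovered. In that case, taking a fresh backup is recommended.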

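pg_rewind fails immediately if it finds files it cannot write to directly. This can happen, for example, when the source and target servers use the same file mapping for read-only SSL keys and certificates. If such files are present on the target server, it is recommended to remove them before running pg_rewind. After the rewind, some of those files may have been copied from the source, in which case it may be necessary to remove the copied data and restore the set of links that was in use before the rewind.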

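The basic idea is to copy all file-system-level changes from the source cluster to the target cluster:

Scan the WAL of the target cluster, starting from the last checkpoint before the point where the source cluster's timeline history forked off from the target cluster. For each WAL record, record every data block that was touched. This yields a list of all data blocks that were changed in the target cluster after the source cluster forked off.

Copy all those changed blocks from the source cluster to the target cluster, using either direct file system access (--source-pgdata) or SQL (--source-server).

Copy all other files, such as pg_xact and configuration files, from the source cluster to the target cluster (everything except the relation files). As with base backups, the contents of the directories pg_dynshmem/, pg_notify/, pg_replslot/, pg_serial/, pg_snapshots/, pg_stat_tmp/ and pg_subtrans/ are omitted from the data copied from the source cluster. Any file or directory whose name begins with pgsql_tmp is omitted, as are backup_label, tablespace_map, pg_internal.init, postmaster.opts and postmaster.pid.

Apply the WAL from the source cluster, starting from the checkpoint created at failover. (Strictly speaking, pg_rewind does not apply the WAL itself; it only creates a backup label file that makes PostgreSQL start by replaying all WAL from that checkpoint forward.)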
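The exclusion rules in the third step can be pictured as a simple path filter. This is only an illustration of the list above, written as a shell function with a made-up name (rewind_skips); it is not how pg_rewind is implemented, and paths are assumed to be relative to the data directory.

```shell
# Sketch: succeed (exit status 0) if a path relative to the data directory
# falls under the exclusions listed above, i.e. is not copied from the source.
rewind_skips() {
    case "$1" in
        # contents of these directories are omitted
        pg_dynshmem/*|pg_notify/*|pg_replslot/*|pg_serial/*|pg_snapshots/*|pg_stat_tmp/*|pg_subtrans/*) return 0 ;;
        # any file or directory whose name begins with pgsql_tmp
        pgsql_tmp*|*/pgsql_tmp*) return 0 ;;
        # individual files that are omitted
        backup_label|tablespace_map|postmaster.opts|postmaster.pid) return 0 ;;
        pg_internal.init|*/pg_internal.init) return 0 ;;
    esac
    return 1
}
```

With this filter, `rewind_skips pg_notify/0000` succeeds, while `rewind_skips pg_xact/0000` fails because pg_xact is copied in full.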

Let's try it out!

Test environment: CentOS 7 + PG 11

==============================================
First, set up a standby for the experiment.
pg_basebackup -h 192.168.3.51 -U dbsr -F p -P -R -D /usr/local/pgsql/data 
cd /usr/local/pgsql/data 
vim recovery.conf
  recovery_target_timeline='latest'
chown -R postgres:postgres *
service postgresql-11 start
==============================================

pg_rewind prerequisites

1 full_page_writes=on # on by default

2 wal_log_hints=on # off by default

Alternatively, enable data checksums when initializing the cluster with initdb.

1. Check the state on the primary and the standby

Primary:

# pg_controldata | grep cluster
Database cluster state:               in production

Standby:

# pg_controldata | grep cluster
Database cluster state:               in archive recovery
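For scripting this check, the state line can be pulled out of the pg_controldata output. The helper name cluster_state below is made up for this sketch; it only assumes the "Database cluster state:" label shown in the output above.

```shell
# Sketch: read pg_controldata output on stdin and print only the cluster state.
cluster_state() {
    sed -n 's/^Database cluster state:[[:space:]]*//p'
}
```

Typical use would be `pg_controldata -D "$PGDATA" | cluster_state`, which prints "in production" on a primary and "in archive recovery" on a standby.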

2. Promote the standby to primary.

On the standby:

# su - postgres
$ pg_ctl promote -D $PGDATA
waiting for server to promote.... done
server promoted
(promote mode commands the standby server that is running in the specified data directory to end standby mode and begin read-write operations.)

Check the state of the promoted standby.

$ pg_controldata | grep cluster
Database cluster state:               in production

3. Create a test table on the new primary and insert some rows.

postgres=# create table test(id int4);
CREATE TABLE
postgres=# insert into test (id) select n from generate_series(1,10000) n;
INSERT 0 10000

4. Stop the old primary

$ pg_controldata | grep cluster
Database cluster state:               in production
$ pg_ctl stop -m fast -D $PGDATA
waiting for server to shut down.... done
server stopped

5. Use pg_rewind to turn the old primary into a standby

$ pg_rewind --target-pgdata $PGDATA --source-server='host=192.168.3.52 port=5432 user=postgres dbname=postgres' -P
connected to server
servers diverged at WAL location 10/DE000140 on timeline 1
rewinding from last common checkpoint at 10/DE000098 on timeline 1
reading source file list
reading target file list
reading WAL in target
need to copy 55 MB (total source directory size is 96 MB)
57187/57187 kB (100%) copied
creating backup label and updating control file
syncing target data directory
Done!
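When wrapping pg_rewind in a script, the divergence point reported in the output above can be extracted for logging. The helper name divergence_point is made up for this sketch, and it relies only on the "servers diverged at WAL location ... on timeline ..." line shown above; the exact wording may differ in other PostgreSQL versions.

```shell
# Sketch: read pg_rewind output on stdin and print "<LSN> <timeline>"
# taken from the "servers diverged at ..." line.
divergence_point() {
    sed -n 's/^servers diverged at WAL location \([0-9A-F]*\/[0-9A-F]*\) on timeline \([0-9]*\)$/\1 \2/p'
}
```

Since the progress messages may go to stdout or stderr depending on the version, a cautious invocation redirects both, for example `pg_rewind ... 2>&1 | divergence_point`. pg_rewind also has a --dry-run (-n) option that does everything except modify the target directory, which is useful for a first look.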

6. Adjust the recovery.conf file

$ mv recovery.done recovery.conf

Edit its contents:

$ vim recovery.conf 
standby_mode = 'on'
primary_conninfo = 'user=dbsr passfile=''/root/.pgpass'' host=192.168.3.52 port=5432 sslmode=prefer sslcompression=0 krbsrvname=postgres target_session_attrs=any'
recovery_target_timeline='latest'

7. Start the new standby and check its state

# service postgresql-11 start
# pg_controldata | grep cluster
Database cluster state:               in archive recovery

8. Verify the data

Query on the new primary:

postgres=# select * from pg_stat_replication ;
 pid  | usesysid | usename | application_name | client_addr  | client_hostname | client_port |        backend_start         | backend_xmin |   state   |  sent_lsn   |  write_lsn  |  flush_lsn  | replay_lsn  |    write_lag    |    flush_lag    |   replay_lag    | sync_priority | sync_state
------+----------+---------+------------------+--------------+-----------------+-------------+------------------------------+--------------+-----------+-------------+-------------+-------------+-------------+-----------------+-----------------+-----------------+---------------+------------
 4037 | 17253049 | dbsr    | walreceiver      | 192.168.3.51 | db1             |       43012 | 2019-01-16 22:28:40.24195-05 |              | streaming | 10/DE0BE3E0 | 10/DE0BE3E0 | 10/DE0BE3E0 | 10/DE0BE3E0 | 00:00:00.001219 | 00:00:00.001989 | 00:00:00.002334 |             0 | async
(1 row)
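To turn the LSN columns above into a byte count, the two halves of each LSN can be combined into a 64-bit number and subtracted. The helpers lsn_to_bytes and lsn_diff below are made-up names for this sketch; on the server side, the built-in pg_wal_lsn_diff() function does the same calculation in SQL.

```shell
# Sketch: convert an LSN in "HI/LO" hex form to a byte position.
lsn_to_bytes() {
    echo $(( (0x${1%%/*} << 32) + 0x${1##*/} ))
}

# Sketch: byte distance between two LSNs ($1 minus $2).
lsn_diff() {
    echo $(( $(lsn_to_bytes "$1") - $(lsn_to_bytes "$2") ))
}
```

With the values from this run, `lsn_diff 10/DE0BE3E0 10/DE0BE3E0` (sent_lsn versus replay_lsn) prints 0, meaning the standby has replayed everything that was sent. `lsn_diff 10/DE0BE3E0 10/DE000140` prints 778912, the amount of WAL in bytes between the divergence point reported by pg_rewind and the current position.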

Query on the new standby:

postgres=# select count(*) from test ;
 count
-------
 10000
(1 row)
