Deploying PostgreSQL 13 primary/standby streaming replication, switchover, and pg_rewind

CR7-NO.1

Last modified: 2023-06-30 10:49:54

First published: 2023-06-30 10:28:47

Article link: https://blog.csdn.net/qq_39169917/article/details/131471236

Environment:
OS: CentOS 7
DB: PostgreSQL 13.8
Primary: 192.168.1.134
Standby: 192.168.1.135

Reference: https://www.cnblogs.com/hxlasky/p/16810443.html

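################ Primary/standby deployment ######################

1. Create the streaming replication user on the primary
postgres=# CREATE ROLE replica login replication encrypted password 'replica';

2. Edit pg_hba.conf on the primary to allow the standby's IP to connect as the replication user

Switch to the root user: su root

vi /opt/pg13/data/pg_hba.conf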

# replication privilege.
local replication all trust
host replication all 127.0.0.1/32 trust
host replication all ::1/128 trust
# Place the following entry in the IPv4 section
host replication replica 192.168.1.0/24 md5 ## added; here the whole subnet is allowed

Or allow only one specific IP:

# replication privilege.
local replication all trust
host replication all 127.0.0.1/32 trust
host replication all ::1/128 trust
host replication replica 192.168.1.135/32 md5 ## a single, specific IP

A reload is required after the change; otherwise the standby's connection is rejected.
[postgres@host134 ~]$ pg_ctl -D /opt/pg13/data reload
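
The pg_hba.conf edit above can also be scripted. The following is only a minimal sketch, not the method used in this article: add_hba_entry is a hypothetical helper that appends a replication line only when an identical line is not already present, so it is safe to run more than once. The demo works on a scratch file; on a real primary you would point it at /opt/pg13/data/pg_hba.conf and then run the reload shown above.

```shell
# Hypothetical helper: append a "host replication ..." line to a pg_hba.conf
# file unless exactly the same line is already present.
add_hba_entry() {
    hba_file=$1   # path to pg_hba.conf
    hba_user=$2   # replication role, e.g. replica
    hba_cidr=$3   # client address, e.g. 192.168.1.135/32
    hba_auth=$4   # authentication method, e.g. md5
    hba_line="host replication $hba_user $hba_cidr $hba_auth"
    # -x matches whole lines only, -F treats the pattern as a fixed string.
    if ! grep -qxF "$hba_line" "$hba_file"; then
        printf '%s\n' "$hba_line" >> "$hba_file"
    fi
}

# Demo on a scratch file (assumption: not the live pg_hba.conf).
demo_hba=$(mktemp)
add_hba_entry "$demo_hba" replica 192.168.1.135/32 md5
add_hba_entry "$demo_hba" replica 192.168.1.135/32 md5   # second call is a no-op
cat "$demo_hba"   # prints the entry once
```

The whole-line comparison only detects an exact duplicate; an existing entry written with different spacing would not be recognized.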

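3. Stop the standby
su - postgres
pg_ctl -D /opt/pg13/data -l /opt/pg13/log/postgres.log stop

4. Prepare the data directory on the standby
After installing PostgreSQL on the standby, do not initialize it.
If a data directory already exists (from initdb or from an earlier installation), move it out of the way and create an empty directory with the same name:
su - postgres
[postgres@host135 ~]$ cd /opt/pg13
[postgres@host135 pg13]$ mv data bakdata
[postgres@host135 pg13]$ mkdir data

Create the archive directory, matching the primary:

[postgres@host135 pg13]$ mkdir -p /opt/pg13/archivelog

Make sure ownership and permissions are correct; if they are not, fix them as root:
[root@host135 ~]# chown -R postgres:postgres /opt/pg13
[root@host135 ~]# chmod 0700 /opt/pg13/data

5. Take a base backup of the primary from the standby
[postgres@host135 pg13]$ pg_basebackup -h 192.168.1.134 -p 5432 -U replica --password -X stream -Fp --progress -D /opt/pg13/data -R
Note the -R option: it makes pg_basebackup create standby.signal and write primary_conninfo into postgresql.auto.conf.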

[postgres@host135 pg13]$ pg_basebackup -h 192.168.1.134 -p 5432 -U replica --password -X stream -Fp --progress -D /opt/pg13/data -R
Password:
pg_basebackup: error: FATAL: no pg_hba.conf entry for replication connection from host "192.168.1.135", user "replica", SSL off

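Cause 1:

pg_hba.conf was changed on the primary but not reloaded. Reload it:
pg_ctl -D /opt/pg13/data reload

Cause 2:

If the connection still fails, check whether a firewall is blocking it. On a test VM you can flush the iptables rules:

sudo iptables -F

Or open port 5432 on the primary:

firewall-cmd --permanent --zone=public --add-port=5432/tcp
firewall-cmd --state
firewall-cmd --reload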

[postgres@host135 pg13]$ pg_basebackup -h 192.168.1.134 -p 5432 -U replica --password -X stream -Fp --progress -D /opt/pg13/data -R
Password:
32247/32247 kB (100%), 1/1 tablespace

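pg_basebackup also copies the primary's postgresql.conf and pg_hba.conf to the standby, so at this point the two files are identical on both servers.

If the primary runs in archive mode, the standby needs the same archive directory.

6. The standby can now be started with pg_ctl start
Because the backup was taken with -R, a standby.signal file has been generated on the standby, and the connection information for the primary has been written under $PGDATA: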

[postgres@host135 data]$ ls -1
backup_label
backup_manifest
base
current_logfiles
global
log
pg_commit_ts
pg_dynshmem
pg_hba.conf
pg_ident.conf
pg_logical
pg_multixact
pg_notify
pg_replslot
pg_serial
pg_snapshots
pg_stat
pg_stat_tmp
pg_subtrans
pg_tblspc
pg_twophase
PG_VERSION
pg_wal
pg_xact
postgresql.auto.conf
postgresql.conf
standby.signal

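pg_hba.conf and postgresql.conf under $PGDATA are copies of the primary's files, so the configuration is identical on both servers at this point. Change settings such as the port number if you need to.

pg_basebackup also wrote the primary's connection information for streaming replication into postgresql.auto.conf: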
[postgres@host135 data]$ more postgresql.auto.conf
# Do not edit this file manually!
# It will be overwritten by the ALTER SYSTEM command.
primary_conninfo = 'user=replica password=replica channel_binding=disable host=192.168.1.134 port=5432 sslmode=disable sslcompression=0 ssl_min_protocol_version=TLSv1.2 gssencmode=disable krbsrvname=postgres target_session_attrs=any'

If the backup was taken without -R, create standby.signal on the standby by hand, then edit postgresql.conf (not postgresql.auto.conf) and set primary_conninfo there.
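
Note that the primary_conninfo shown above stores the replication password in clear text. When you write primary_conninfo by hand, one alternative is to leave the password out and let libpq read it from a password file (~/.pgpass of the postgres OS user on the standby). The sketch below is an illustration under stated assumptions, not part of the original procedure: write_pgpass_entry is a hypothetical helper, it overwrites the target file, and it assumes the password contains no ":" or "\" characters, which would need escaping.

```shell
# Hypothetical helper: write a one-line password file in libpq's format
# hostname:port:database:username:password
write_pgpass_entry() {
    pp_file=$1; pp_host=$2; pp_port=$3; pp_user=$4; pp_pass=$5
    # For streaming replication connections the database field is "replication".
    printf '%s:%s:replication:%s:%s\n' "$pp_host" "$pp_port" "$pp_user" "$pp_pass" > "$pp_file"
    # libpq ignores the file unless group and others have no access to it.
    chmod 0600 "$pp_file"
}

# Demo writes to a scratch file; the real location would be ~/.pgpass.
demo_pgpass=$(mktemp)
write_pgpass_entry "$demo_pgpass" 192.168.1.134 5432 replica replica
cat "$demo_pgpass"   # prints 192.168.1.134:5432:replication:replica:replica
```

With such a file in place, primary_conninfo could be written as 'user=replica host=192.168.1.134 port=5432' without the password keyword.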

7. Start the standby
pg_ctl -D /opt/pg13/data -l /opt/pg13/log/postgres.log start

Error:
2022-10-19 10:16:25 CST [32043]: [1-1] user=,db=,app=,client=LOG: redirecting log output to logging collector process
2022-10-19 10:16:25 CST [32043]: [2-1] user=,db=,app=,client=HINT: Future log output will appear in directory "/opt/pg13/log".
2022-10-19 10:57:31 CST [3551]: [1-1] user=,db=,app=,client=FATAL: data directory "/opt/pg13/data" has invalid permissions
2022-10-19 10:57:31 CST [3551]: [2-1] user=,db=,app=,client=DETAIL: Permissions should be u=rwx (0700) or u=rwx,g=rx (0750).

Fix:
Correct the ownership and permissions as root
[root@host135 ~]# chown -R postgres:postgres /opt/pg13
[root@host135 ~]# chmod 0700 /opt/pg13/data
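
The DETAIL line in the log names the two accepted modes, 0700 and 0750. The check can be done before starting the server; the following is a rough sketch in which check_datadir_mode is a hypothetical helper and GNU stat (as shipped with CentOS 7) is assumed. It looks at the mode only, not at the owner.

```shell
# Hypothetical helper: succeed only if the directory mode is 700 or 750,
# the two values named in the server's DETAIL message.
# Assumption: GNU stat, which supports -c '%a' (octal mode).
check_datadir_mode() {
    cd_mode=$(stat -c '%a' "$1") || return 2
    case $cd_mode in
        700|750) return 0 ;;
        *) echo "bad mode $cd_mode on $1" >&2; return 1 ;;
    esac
}

# Demo on a scratch directory instead of /opt/pg13/data.
demo_dir=$(mktemp -d)
chmod 0700 "$demo_dir"
check_datadir_mode "$demo_dir" && echo "mode ok"
chmod 0755 "$demo_dir"
check_datadir_mode "$demo_dir" || echo "mode rejected"
```

On the standby the call would be check_datadir_mode /opt/pg13/data, run before pg_ctl start.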

8. Check replication status on the primary

Connect to the database: psql -h localhost -U postgres -p 5432

postgres=# select * from pg_stat_replication;
pid | usesysid | usename | application_name | client_addr | client_hostname | client_port | backend_start | backend_xmin | state | sent_lsn | write_lsn | flush_lsn | replay_lsn | write_lag | flush_lag | replay_lag | sync_priority | sync_state | reply_time
------+----------+---------+------------------+----------------+-----------------+-------------+-------------------------------+--------------+-----------+-----------+-----------+-----------+------------+-----------+-----------+------------+---------------+------------+-------------------------------
2197 | 16403 | replica | walreceiver | 192.168.88.130 | | 34058 | 2023-06-09 19:23:29.105932+08 | | streaming | 0/7000060 | 0/7000060 | 0/7000060 | 0/7000060 | | | | 0 | async | 2023-06-09 19:24:59.403341+08
(1 row)
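
The four LSN columns (sent_lsn, write_lsn, flush_lsn, replay_lsn) show how far the standby has got; when they are equal, as here, the standby has replayed everything that was sent. An LSN is written as two hexadecimal halves, high 32 bits and low 32 bits, so the distance between two LSNs in bytes is simple arithmetic. A small sketch follows; the function names lsn_to_bytes and lsn_diff are made up for this example.

```shell
# Convert an LSN such as 0/7000060 (two hexadecimal halves) to a byte position.
lsn_to_bytes() {
    lsn_hi=${1%%/*}   # part before the slash: high 32 bits
    lsn_lo=${1##*/}   # part after the slash: low 32 bits
    echo $(( (0x$lsn_hi << 32) + 0x$lsn_lo ))
}

# Number of bytes by which the second LSN trails the first.
lsn_diff() {
    echo $(( $(lsn_to_bytes "$1") - $(lsn_to_bytes "$2") ))
}

lsn_diff 0/7000060 0/7000060    # sent_lsn vs replay_lsn from the row above; prints 0
lsn_diff 1/10 0/FFFFFFF0        # crossing the 4 GB boundary; prints 32
```

Inside the database the same calculation is available as pg_wal_lsn_diff(sent_lsn, replay_lsn).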

9. Check the processes
Standby processes

[postgres@host135 data]$ ps -ef|grep postgres
postgres 3815 1 0 10:59 ? 00:00:00 /opt/pg13/bin/postgres -D /opt/pg13/data
postgres 3816 3815 0 10:59 ? 00:00:00 postgres: logger
postgres 3817 3815 0 10:59 ? 00:00:00 postgres: startup recovering 00000001000000000000001B
postgres 3818 3815 0 10:59 ? 00:00:00 postgres: checkpointer
postgres 3819 3815 0 10:59 ? 00:00:00 postgres: background writer
postgres 3820 3815 0 10:59 ? 00:00:00 postgres: stats collector
postgres 3821 3815 0 10:59 ? 00:00:00 postgres: walreceiver streaming 0/1B000148
postgres 3864 26618 0 11:00 pts/1 00:00:00 ps -ef
postgres 3865 26618 0 11:00 pts/1 00:00:00 grep --color=auto postgres
root 26617 25114 0 09:26 pts/1 00:00:00 su - postgres
postgres 26618 26617 0 09:26 pts/1 00:00:00 -bash

Primary processes

[postgres@host134 data]$ ps -ef|grep postgres
postgres 11073 1 0 Oct18 ? 00:00:00 /opt/pg13/bin/postgres -D /opt/pg13/data
postgres 11074 11073 0 Oct18 ? 00:00:00 postgres: logger
postgres 11077 11073 0 Oct18 ? 00:00:00 postgres: checkpointer
postgres 11078 11073 0 Oct18 ? 00:00:00 postgres: background writer
postgres 11079 11073 0 Oct18 ? 00:00:00 postgres: walwriter
postgres 11080 11073 0 Oct18 ? 00:00:00 postgres: autovacuum launcher
postgres 11081 11073 0 Oct18 ? 00:00:00 postgres: archiver last was 00000001000000000000001A.00000028.backup
postgres 11082 11073 0 Oct18 ? 00:00:01 postgres: stats collector
postgres 11083 11073 0 Oct18 ? 00:00:00 postgres: logical replication launcher
postgres 11294 11073 0 Oct18 ? 00:00:00 postgres: postgres postgres 192.168.1.134(40882) idle
postgres 21407 11073 0 10:59 ? 00:00:00 postgres: walsender replica 192.168.1.135(50736) streaming 0/1B000148

Primary
[postgres@host134 20221021]$ pg_controldata /opt/pg13/data/| grep 'Database cluster state'
Database cluster state: in production

Standby
[postgres@host135 bin]$ pg_controldata /opt/pg13/data/| grep 'Database cluster state'
Database cluster state: in archive recovery
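
The two states above are enough to tell a running primary from a running standby in a script. A minimal sketch, assuming English pg_controldata output; cluster_role is a hypothetical helper that reads that output on standard input. Any other state, for example "shut down" on a stopped server, is reported as unknown.

```shell
# Hypothetical helper: read pg_controldata output on stdin and print
# "primary", "standby", or "unknown (<state>)".
cluster_role() {
    cr_state=$(sed -n 's/^Database cluster state:[[:space:]]*//p')
    case $cr_state in
        "in production")        echo primary ;;
        "in archive recovery")  echo standby ;;
        *)                      echo "unknown ($cr_state)" ;;
    esac
}

# Real use (requires the server binaries):
#   pg_controldata /opt/pg13/data | cluster_role
# Demo with the two states shown above:
echo "Database cluster state:               in production" | cluster_role         # prints primary
echo "Database cluster state:               in archive recovery" | cluster_role   # prints standby
```

On a running server, select pg_is_in_recovery(); answers the same question from SQL.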

10. Data verification

Log in to the standby

[postgres@host135 data]$ psql -h 192.168.1.135 -U postgres
Password for user postgres:
psql (13.8)
Type "help" for help.

postgres=# \c db_test;
You are now connected to database "db_test" as user "postgres".
db_test=# select * from tb_test;
 id | name  |         createtime         |         modifytime
----+-------+----------------------------+----------------------------
  1 | name1 | 2022-10-18 11:32:33.649901 | 2022-10-18 11:32:33.649901
  2 | name2 | 2022-10-18 11:32:33.665863 | 2022-10-18 11:32:33.665863
  3 | name3 | 2022-10-18 11:32:33.691182 | 2022-10-18 11:32:33.691182
  4 | name4 | 2022-10-18 11:32:33.771843 | 2022-10-18 11:32:33.771843
  5 | name5 | 2022-10-18 11:32:34.496502 | 2022-10-18 11:32:34.496502
(5 rows)

Write on the primary:

[postgres@host134 data]$ psql -h 192.168.1.134 -U postgres
Password for user postgres:
psql (13.8)
Type "help" for help.

postgres=# \c db_test;
You are now connected to database "db_test" as user "postgres".
db_test=# select * from tb_test;
 id | name  |         createtime         |         modifytime
----+-------+----------------------------+----------------------------
  1 | name1 | 2022-10-18 11:32:33.649901 | 2022-10-18 11:32:33.649901
  2 | name2 | 2022-10-18 11:32:33.665863 | 2022-10-18 11:32:33.665863
  3 | name3 | 2022-10-18 11:32:33.691182 | 2022-10-18 11:32:33.691182
  4 | name4 | 2022-10-18 11:32:33.771843 | 2022-10-18 11:32:33.771843
  5 | name5 | 2022-10-18 11:32:34.496502 | 2022-10-18 11:32:34.496502
(5 rows)

db_test=# insert into tb_test(name) values('name6');
INSERT 0 1

Query on the standby:

[postgres@host135 data]$ psql -h 192.168.1.135 -U postgres
Password for user postgres:
psql (13.8)
Type "help" for help.

postgres=# \c db_test;
You are now connected to database "db_test" as user "postgres".
db_test=# select * from tb_test;
 id | name  |         createtime         |         modifytime
----+-------+----------------------------+----------------------------
  1 | name1 | 2022-10-18 11:32:33.649901 | 2022-10-18 11:32:33.649901
  2 | name2 | 2022-10-18 11:32:33.665863 | 2022-10-18 11:32:33.665863
  3 | name3 | 2022-10-18 11:32:33.691182 | 2022-10-18 11:32:33.691182
  4 | name4 | 2022-10-18 11:32:33.771843 | 2022-10-18 11:32:33.771843
  5 | name5 | 2022-10-18 11:32:34.496502 | 2022-10-18 11:32:34.496502
  6 | name6 | 2022-10-19 11:04:56.543939 | 2022-10-19 11:04:56.543939
(6 rows)

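Try to write on the standby (rejected, because a standby is read-only)
db_test=# insert into tb_test(name) values('name7');
ERROR: cannot execute INSERT in a read-only transaction

Try to switch the WAL segment on the standby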
db_test=# select pg_switch_wal();
ERROR: recovery is in progress
HINT: WAL control functions cannot be executed during recovery.

##################### Primary/standby switchover ####################

1. Stop the primary to simulate a failure
Run on 192.168.1.134
## Check the status
[postgres@host134 data]$ pg_ctl -D /opt/pg13/data status
pg_ctl: server is running (PID: 24009)
/opt/pg13/bin/postgres "-D" "/opt/pg13/data"

[postgres@host134 data]$ pg_controldata /opt/pg13/data/| grep 'Database cluster state'
Database cluster state: in production

## Stop the database
[postgres@host134 data]$ pg_ctl -D /opt/pg13/data -l /opt/pg13/log/postgres.log stop -m fast
waiting for server to shut down.... done
server stopped

2. Promote the standby to be the new primary and serve clients
Run on the standby, 192.168.1.135
[postgres@host135 data]$ pg_ctl promote -D /opt/pg13/data
waiting for server to promote.... done
server promoted

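Important 1: the command that turns the standby into the new primary is pg_ctl promote.
After promotion, the startup recovering and walreceiver streaming processes are gone from the process list, and a postgres: walwriter process has appeared.

Important 2: $PGDATA/standby.signal is removed automatically, which marks this server as a primary rather than a standby.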

3. Remove the primary_conninfo entry on the new primary
Run on 192.168.1.135

Clear the old replication settings, that is, the primary_conninfo entry in postgresql.auto.conf.

[postgres@host135 data]$ psql -h 192.168.1.135 -U postgres -p 5432
Password for user postgres:
psql (13.8)
Type "help" for help.

postgres=# show primary_conninfo;
                      primary_conninfo
------------------------------------------------------------
 user=replica password=replica host=192.168.1.135 port=5432
(1 row)

postgres=# alter system set primary_conninfo='';
ALTER SYSTEM

Or:

alter system set primary_conninfo=default; ## removes the entry from postgresql.auto.conf; if postgresql.conf defines the parameter, the value from that file takes effect instead

Reload the configuration:

[postgres@host135 data]$ pg_ctl -D /opt/pg13/data reload

[postgres@host135 data]$ psql -h 192.168.1.135 -U postgres -p 5432

postgres=# show primary_conninfo;
 primary_conninfo
------------------

(1 row)

4. Write data on the new primary
Run on 192.168.1.135

[postgres@host135 data]$ psql -h 192.168.1.135 -U hxl -d db_test -p 5432

insert into tb_test(name) values('name9');
insert into tb_test(name) values('name10');
insert into tb_test(name) values('name11');
insert into tb_test(name) values('name12');
insert into tb_test(name) values('name13');
insert into tb_test(name) values('name14');
insert into tb_test(name) values('name15');
insert into tb_test(name) values('name16');
insert into tb_test(name) values('name17');
insert into tb_test(name) values('name18');
insert into tb_test(name) values('name19');
insert into tb_test(name) values('name20');

db_test=> select * from tb_test;
 id |  name  |         createtime         |         modifytime
----+--------+----------------------------+----------------------------
  1 | name1  | 2022-10-18 11:32:33.649901 | 2022-10-18 11:32:33.649901
  2 | name2  | 2022-10-18 11:32:33.665863 | 2022-10-18 11:32:33.665863
  3 | name3  | 2022-10-18 11:32:33.691182 | 2022-10-18 11:32:33.691182
  4 | name4  | 2022-10-18 11:32:33.771843 | 2022-10-18 11:32:33.771843
  5 | name5  | 2022-10-18 11:32:34.496502 | 2022-10-18 11:32:34.496502
  6 | name6  | 2022-10-19 11:04:56.543939 | 2022-10-19 11:04:56.543939
  7 | name7  | 2022-10-19 11:25:52.236651 | 2022-10-19 11:25:52.236651
  8 | name8  | 2022-10-20 09:21:51.977815 | 2022-10-20 09:21:51.977815
 41 | name9  | 2022-10-20 14:22:26.326255 | 2022-10-20 14:22:26.326255
 42 | name10 | 2022-10-20 14:22:26.34316  | 2022-10-20 14:22:26.34316
 43 | name11 | 2022-10-20 14:22:26.359988 | 2022-10-20 14:22:26.359988
 44 | name12 | 2022-10-20 14:22:26.433694 | 2022-10-20 14:22:26.433694
 45 | name13 | 2022-10-20 14:22:26.451945 | 2022-10-20 14:22:26.451945
 46 | name14 | 2022-10-20 14:22:26.469966 | 2022-10-20 14:22:26.469966
 47 | name15 | 2022-10-20 14:22:26.482091 | 2022-10-20 14:22:26.482091
 48 | name16 | 2022-10-20 14:22:26.498319 | 2022-10-20 14:22:26.498319
 49 | name17 | 2022-10-20 14:22:26.524554 | 2022-10-20 14:22:26.524554
 50 | name18 | 2022-10-20 14:22:26.555449 | 2022-10-20 14:22:26.555449
 51 | name19 | 2022-10-20 14:22:26.591774 | 2022-10-20 14:22:26.591774
 52 | name20 | 2022-10-20 14:22:27.587955 | 2022-10-20 14:22:27.587955

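5. Edit pg_hba.conf on the new primary
Run on 192.168.1.135
Edit $PGDATA/pg_hba.conf on the new primary (the old standby, 192.168.1.135) and add an entry that lets the new standby (the old primary, 192.168.1.134) connect as the replica user.

vi /opt/pg13/data/pg_hba.conf

host replication all 192.168.1.134/32 md5

If access was opened for the whole subnet earlier, as below, no change is needed:
host replication replica 192.168.1.0/24 md5

A change to pg_hba.conf does not require a restart; a reload is enough.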
[postgres@host135 data]$ pg_ctl -D /opt/pg13/data reload
server signaled

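6. Create $PGDATA/standby.signal on the old primary
Run on 192.168.1.134
[postgres@host134 data]$ cd /opt/pg13/data
[postgres@host134 data]$ touch standby.signal

[postgres@host134 data]$ pwd
/opt/pg13/data
[postgres@host134 data]$ ll standby.signal
-rw-rw-r-- 1 postgres postgres 0 Oct 20 14:27 standby.signal

Note: this step is essential. Without this file, the old primary comes up as a new, independent primary as soon as it is restarted and is no longer part of the primary/standby pair.

7. Add the replication entry to $PGDATA/postgresql.conf on the old primary
Run on 192.168.1.134
[postgres@host134 data]$ vi postgresql.conf
Add the following setting:
primary_conninfo='user=replica password=replica host=192.168.1.135 port=5432'

Use your own password and the address of your new primary. For example, with a different password and primary address the line would read:
primary_conninfo='user=replica password=1q2!Q@ host=192.168.88.130 port=5432'

8. Start the old primary as the new standby

Run on 192.168.1.134

[postgres@host134 data]$ pg_ctl -D /opt/pg13/data -l /opt/pg13/log/postgres.log start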

[postgres@host134 data]$ ps -ef|grep postgres
postgres 6975 1 2 15:34 ? 00:00:00 /opt/pg13/bin/postgres -D /opt/pg13/data
postgres 6976 6975 0 15:34 ? 00:00:00 postgres: logger
postgres 6977 6975 0 15:34 ? 00:00:00 postgres: startup recovering 000000010000000000000007
postgres 6979 6975 0 15:34 ? 00:00:00 postgres: checkpointer
postgres 6980 6975 0 15:34 ? 00:00:00 postgres: background writer
postgres 6981 6975 0 15:34 ? 00:00:00 postgres: stats collector
postgres 6982 6975 0 15:34 ? 00:00:00 postgres: walreceiver idle

The process shows walreceiver idle, which means the old primary has not managed to rejoin the cluster as a standby. Check the error log:

[postgres@host134 log]$ pwd
/opt/pg13/log

[postgres@host134 log]$ tail -2f postgresql-2022-10-21.log
2022-10-21 15:36:39 CST [6982]: [25-1] user=,db=,app=,client=LOG: primary server contains no more WAL on requested timeline 1
2022-10-21 15:36:39 CST [6977]: [28-1] user=,db=,app=,client=LOG: new timeline 2 forked off current database system timeline 1 before current recovery point 0/70000A0

The new primary switched to timeline 2 at a point before the old primary's last WAL position, so the old primary has diverged and has to be rewound.

Fix:

[postgres@host134 pg13]$ pg_ctl -D /opt/pg13/data -l /opt/pg13/log/postgres.log stop -m fast
waiting for server to shut down.... done
server stopped

[postgres@host134 pg13]$ pg_rewind -D /opt/pg13/data --source-server='host=192.168.1.135 port=5432 user=postgres dbname=postgres password=postgres'
pg_rewind: servers diverged at WAL location 0/7000000 on timeline 1
pg_rewind: error: could not open file "/opt/pg13/data/pg_wal/000000010000000000000006": No such file or directory
pg_rewind: fatal: could not find previous WAL record at 0/6000410

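pg_rewind reports that WAL file 000000010000000000000006 is missing. Copy the missing file from the archive directory into pg_wal; if pg_rewind then reports another missing WAL file, copy that one as well.

[postgres@host134 20221021]$ pwd
/opt/pg13/archivelog/20221021

[postgres@host134 20221021]$ cp 000000010000000000000006 /opt/pg13/data/pg_wal/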
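
The name of the missing file follows from the LSN in the error message: a WAL file name is the timeline, the high half of the LSN, and the segment number within it, each printed as 8 hexadecimal digits. The sketch below shows the arithmetic; wal_file_name is a name made up for this example, and the default 16 MB wal_segment_size is assumed (the article does not state the segment size).

```shell
# Name of the WAL segment that contains a given LSN on a given timeline.
# Assumption: default 16 MB wal_segment_size, so the segment number inside
# each 4 GB range is the top 8 bits of the low half of the LSN.
wal_file_name() {
    wf_tli=$1          # timeline ID, decimal
    wf_hi=${2%%/*}     # high half of the LSN, hexadecimal
    wf_lo=${2##*/}     # low half of the LSN, hexadecimal
    printf '%08X%08X%08X\n' "$wf_tli" "$((0x$wf_hi))" "$(( 0x$wf_lo >> 24 ))"
}

wal_file_name 1 0/6000410     # LSN from the pg_rewind error; prints 000000010000000000000006
wal_file_name 1 0/1B000148    # walreceiver position seen earlier; prints 00000001000000000000001B
```

On a running primary, select pg_walfile_name('0/6000410'); performs the same conversion for the server's current timeline.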

[postgres@host134 20221021]$ pg_rewind -D /opt/pg13/data --source-server='host=192.168.1.135 port=5432 user=postgres dbname=postgres password=postgres'
pg_rewind: servers diverged at WAL location 0/7000000 on timeline 1
pg_rewind: rewinding from last common checkpoint at 0/5000060 on timeline 1
pg_rewind: Done!
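
pg_rewind only works when the target cluster was running with wal_log_hints = on or with data checksums enabled. The article does not show that setting, so presumably it was already in place here. Both values are recorded in the control file and can be checked before a switchover. A rough sketch, assuming English pg_controldata output; rewind_prereq_ok is a hypothetical helper and the sample lines in the demo are made-up values, not output from the article's servers.

```shell
# Hypothetical helper: read pg_controldata output on stdin and succeed if
# wal_log_hints is on or the data page checksum version is non-zero.
rewind_prereq_ok() {
    rp_out=$(cat)
    rp_hints=$(printf '%s\n' "$rp_out" | sed -n 's/^wal_log_hints setting:[[:space:]]*//p')
    rp_csum=$(printf '%s\n' "$rp_out" | sed -n 's/^Data page checksum version:[[:space:]]*//p')
    [ "$rp_hints" = "on" ] || [ "${rp_csum:-0}" != "0" ]
}

# Real use, on the stopped old primary:
#   pg_controldata /opt/pg13/data | rewind_prereq_ok && echo "pg_rewind can be used"
# Demo with sample lines (assumed values):
printf 'wal_log_hints setting:                on\nData page checksum version:           0\n' \
    | rewind_prereq_ok && echo "ok: wal_log_hints is on"
printf 'wal_log_hints setting:                off\nData page checksum version:           0\n' \
    | rewind_prereq_ok || echo "not ok: neither wal_log_hints nor checksums"
```

If the check fails, pg_rewind refuses to run and the old primary has to be rebuilt with pg_basebackup as in the deployment section.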

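pg_rewind also copies the new primary's postgresql.auto.conf and postgresql.conf to this server, so primary_conninfo in postgresql.conf has to be set again; adjust other parameters as needed. Check postgresql.auto.conf as well, because a value set there overrides postgresql.conf.

9. Edit $PGDATA/postgresql.conf on the old primary
Run on 192.168.1.134

This is needed after pg_rewind. If pg_rewind was not run, the entry added in step 7 above is still in place and this step can be skipped.
[postgres@host134 data]$ vi postgresql.conf
Add the following setting:
primary_conninfo='user=replica password=replica host=192.168.1.135 port=5432'

10. Re-create standby.signal
standby.signal is gone after pg_rewind, so create it again: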
[postgres@host134 data]$ cd /opt/pg13/data
[postgres@host134 data]$ touch standby.signal
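
Steps 9 and 10 can be combined in a small script. This is only a sketch under stated assumptions, not the procedure used in the article: configure_standby is a hypothetical helper, the connection string must not contain single quotes, and the password is left out of the demo string, so it would have to be added as in step 9 or supplied through a password file. The helper replaces an existing primary_conninfo line instead of adding a second one, and it does not touch postgresql.auto.conf.

```shell
# Hypothetical helper: set primary_conninfo in postgresql.conf exactly once
# and create standby.signal in the given data directory.
configure_standby() {
    cs_dir=$1        # data directory
    cs_conninfo=$2   # libpq connection string for the new primary
    cs_conf="$cs_dir/postgresql.conf"
    [ -f "$cs_conf" ] || return 1
    # Keep every line except an earlier, uncommented primary_conninfo setting.
    grep -v '^[[:space:]]*primary_conninfo[[:space:]]*=' "$cs_conf" > "$cs_conf.new" || true
    printf "primary_conninfo = '%s'\n" "$cs_conninfo" >> "$cs_conf.new"
    # Copy the content back so the original file keeps its owner and mode.
    cat "$cs_conf.new" > "$cs_conf" && rm -f "$cs_conf.new"
    touch "$cs_dir/standby.signal"
}

# Demo in a scratch directory instead of /opt/pg13/data.
demo_data=$(mktemp -d)
printf "port = 5432\nprimary_conninfo = 'host=old'\n" > "$demo_data/postgresql.conf"
configure_standby "$demo_data" "user=replica host=192.168.1.135 port=5432"
cat "$demo_data/postgresql.conf"
ls "$demo_data"
```

Run against the real data directory, this would be followed by the pg_ctl start in the next step.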

11. Start the new standby
[postgres@host134 data]$ pg_ctl -D /opt/pg13/data -l /opt/pg13/log/postgres.log start

12. Data verification
New standby
psql -h 192.168.1.134 -U hxl -d db_test -p 5432

New primary
psql -h 192.168.1.135 -U hxl -d db_test -p 5432
