【环境配置】
1.关闭各节点selinux,vi /etc/selinux/config:
SELINUX=disabled
保存退出,即可永久生效。
2.关闭各节点防护墙。
systemctl stop firewalld
systemctl disable firewalld
3.配置各节点hostname,vi /etc/hosts:
192.168.90.91 node1
192.168.90.92 node2
重启服务器,生效。
4.流复制集群环境搭建平台信息:
centos7
各节点安装postgresql14
master:192.168.90.91
standby:192.168.90.92
【搭建步骤】
1.主节点初始化数据库。
[postgres@node1 bin]$ /opt/pg14/bin/initdb -D /opt/pg14/data
2.各节点创建归档目录。
mkdir -p /opt/pg14/archivedir
3.修改主节点配置文件:
//vi postgresql.conf
listen_addresses = '*'
wal_level = replica
max_wal_senders = 3(节点数+1)
archive_mode = on
archive_command = 'cp %p /opt/pg14/archivedir/%f'
max_wal_size = 10GB
min_wal_size = 2GB
//vi pg_hba.conf
host all all 192.168.90.92/32 trust
host replication all 192.168.90.92/32 trust
4.启动主节点数据库。
[postgres@node1 bin]$ /opt/pg14/bin/pg_ctl start -D /opt/pg14/data
5.备节点执行备份,并修改data目录权限。
[postgres@node2 bin]$ /opt/pg14/bin/pg_basebackup -D /opt/pg14/data -Fp -Pv -Xf -R -h 192.168.90.91 -p5432 -Upostgres
pg_basebackup: initiating base backup, waiting for checkpoint to complete
pg_basebackup: checkpoint completed
pg_basebackup: write-ahead log start point: 0/2000028 on timeline 1
59010/59010 kB (100%), 1/1 tablespace
pg_basebackup: write-ahead log end point: 0/2000138
pg_basebackup: syncing data to disk ...
pg_basebackup: renaming backup_manifest.tmp to backup_manifest
pg_basebackup: base backup completed
注:-R参数与以往版本不同,不再生成recovery.conf文件,而是生成standby.signal标志文件,且在postgresql.auto.conf文件中添加primary_conninfo
[postgres@node2 bin]$ chmod 0700 /opt/pg14/data
6.启动备节点数据库。
[postgres@node2 bin]$ /opt/pg14/bin/pg_ctl start -D /opt/pg14/data
waiting for server to start....2022-08-01 14:37:11.065 CST [118421] LOG: starting PostgreSQL 14.0 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-44), 64-bit
2022-08-01 14:37:11.066 CST [118421] LOG: listening on IPv4 address "0.0.0.0", port 5432
2022-08-01 14:37:11.066 CST [118421] LOG: listening on IPv6 address "::", port 5432
2022-08-01 14:37:11.067 CST [118421] LOG: listening on Unix socket "/tmp/.s.PGSQL.5432"
2022-08-01 14:37:11.071 CST [118422] LOG: database system was interrupted; last known up at 2022-08-01 14:33:49 CST
2022-08-01 14:37:11.084 CST [118422] LOG: entering standby mode
2022-08-01 14:37:11.086 CST [118422] LOG: redo starts at 0/2000028
2022-08-01 14:37:11.086 CST [118422] LOG: consistent recovery state reached at 0/2000100
2022-08-01 14:37:11.086 CST [118421] LOG: database system is ready to accept read-only connections
2022-08-01 14:37:11.095 CST [118426] LOG: started streaming WAL from primary at 0/3000000 on timeline 1
done
server started
7.主节点查看流复制状态。
[postgres@node1 bin]$ /opt/pg14/bin/psql -dpostgres -Upostgres
psql (14.0)
Type "help" for help.
postgres=# \x
Expanded display is on.
postgres=# select * from pg_stat_replication;
-[ RECORD 1 ]----+------------------------------
pid | 118427
usesysid | 10
usename | postgres
application_name | walreceiver
client_addr | 192.168.90.92
client_hostname |
client_port | 47924
backend_start | 2022-08-01 14:37:11.091569+08
backend_xmin |
state | streaming
sent_lsn | 0/3000060
write_lsn | 0/3000060
flush_lsn | 0/3000060
replay_lsn | 0/3000060
write_lag |
flush_lag |
replay_lag |
sync_priority | 0
sync_state | async
reply_time | 2022-08-01 14:37:51.177942+08
postgres=# \q
[postgres@node1 bin]$ ps -ef|grep postgres
postgres 118158 1 0 14:33 ? 00:00:00 /opt/pg14/bin/postgres -D /opt/pg14/data
postgres 118160 118158 0 14:33 ? 00:00:00 postgres: checkpointer
postgres 118161 118158 0 14:33 ? 00:00:00 postgres: background writer
postgres 118162 118158 0 14:33 ? 00:00:00 postgres: walwriter
postgres 118163 118158 0 14:33 ? 00:00:00 postgres: autovacuum launcher
postgres 118164 118158 0 14:33 ? 00:00:00 postgres: archiver last was 000000010000000000000002.00000028.backup
postgres 118165 118158 0 14:33 ? 00:00:00 postgres: stats collector
postgres 118166 118158 0 14:33 ? 00:00:00 postgres: logical replication launcher
postgres 118427 118158 0 14:37 ? 00:00:00 postgres: walsender postgres 192.168.90.92(47924) streaming 0/3000060
【主备切换】
1.停掉主节点数据库。
[postgres@node1 bin]$ /opt/pg14/bin/pg_ctl stop -D /opt/pg14/data
waiting for server to shut down....2022-08-01 14:40:05.322 CST [118158] LOG: received fast shutdown request
2022-08-01 14:40:05.323 CST [118158] LOG: aborting any active transactions
2022-08-01 14:40:05.323 CST [118158] LOG: background worker "logical replication launcher" (PID 118166) exited with exit code 1
2022-08-01 14:40:05.324 CST [118160] LOG: shutting down
2022-08-01 14:40:05.395 CST [118426] LOG: replication terminated by primary server
2022-08-01 14:40:05.395 CST [118426] DETAIL: End of WAL reached on timeline 1 at 0/40000A0.
2022-08-01 14:40:05.395 CST [118426] FATAL: could not send end-of-streaming message to primary: server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
no COPY in progress
2022-08-01 14:40:05.395 CST [118422] LOG: invalid record length at 0/40000A0: wanted 24, got 0
2022-08-01 14:40:05.399 CST [118612] FATAL: could not connect to the primary server: connection to server at "192.168.90.91", port 5432 failed: server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
2022-08-01 14:40:05.400 CST [118158] LOG: database system is shut down
done
server stopped
此时备节点报错:
[postgres@node1 bin]$ 2022-08-01 14:40:10.403 CST [118617] FATAL: could not connect to the primary server: connection to server at "192.168.90.91", port 5432 failed: 拒绝连接
Is the server running on that host and accepting TCP/IP connections?
2.提升新主,在要提升为新主的备节点执行。
[postgres@node2 bin]$ /opt/pg14/bin/pg_ctl promote -D /opt/pg14/data
waiting for server to promote....2022-08-01 14:41:07.687 CST [118422] LOG: received promote request
2022-08-01 14:41:07.687 CST [118422] LOG: redo done at 0/4000028 system usage: CPU: user: 0.00 s, system: 0.00 s, elapsed: 236.60 s
2022-08-01 14:41:07.688 CST [118422] LOG: selected new timeline ID: 2
2022-08-01 14:41:07.713 CST [118422] LOG: archive recovery complete
2022-08-01 14:41:07.716 CST [118421] LOG: database system is ready to accept connections
done
server promoted
3.修改新主配置文件,vi pg_hba.conf:
host all all 192.168.90.91/32 trust
host replication all 192.168.90.91/32 trust
重载生效,pg_ctl reload。
4.修改旧主配置文件,vi postgresql.conf:
wal_log_hints = on
重启生效,然后关闭旧主节点数据库。
5.旧主节点追随新主节点。
[postgres@node1 bin]$ /opt/pg14/bin/pg_rewind --target-pgdata /opt/pg14/data --source-server='host=192.168.90.92 port=5432 user=postgres dbname=postgres' -P
pg_rewind: servers diverged at WAL location 0/40000A0 on timeline 1
pg_rewind: rewinding from last common checkpoint at 0/4000028 on timeline 1
pg_rewind: reading source file list
pg_rewind: reading target file list
pg_rewind: reading WAL in target
pg_rewind: need to copy 67 MB (total source directory size is 89 MB)
69485/69485 kB (100%) copied
pg_rewind: creating backup label and updating control file
pg_rewind: syncing target data directory
pg_rewind: Done!
6.旧主节点创建标志文件。
[postgres@node1 bin]$ touch standby.signal
7.修改旧主节点的primary_conninfo参数信息,vi postgresql.auto.conf:
将primary_conninfo中的host修改为新主端ip即192.168.90.92
8.启动旧主节点数据库。
[postgres@node1 bin]$ /opt/pg14/bin/pg_ctl start -D /opt/pg14/data
9.查看流复制状态。
[postgres@node2 bin]$ /opt/pg14/bin/psql -dpostgres -Upostgres
psql (14.0)
Type "help" for help.
postgres=# \x
Expanded display is on.
postgres=# select * from pg_stat_replication;
-[ RECORD 1 ]----+------------------------------
pid | 119151
usesysid | 10
usename | postgres
application_name | walreceiver
client_addr | 192.168.90.91
client_hostname |
client_port | 56426
backend_start | 2022-08-01 14:47:37.206275+08
backend_xmin |
state | streaming
sent_lsn | 0/40001B8
write_lsn | 0/40001B8
flush_lsn | 0/40001B8
replay_lsn | 0/40001B8
write_lag |
flush_lag |
replay_lag |
sync_priority | 0
sync_state | async
reply_time | 2022-08-01 14:49:07.358234+08
postgres=# \q
[postgres@node2 bin]$ ps -ef|grep postgres
postgres 118421 1 0 14:37 ? 00:00:00 /opt/pg14/bin/postgres -D /opt/pg14/data
postgres 118423 118421 0 14:37 ? 00:00:00 postgres: checkpointer
postgres 118424 118421 0 14:37 ? 00:00:00 postgres: background writer
postgres 118425 118421 0 14:37 ? 00:00:00 postgres: stats collector
postgres 118685 118421 0 14:41 ? 00:00:00 postgres: walwriter
postgres 118686 118421 0 14:41 ? 00:00:00 postgres: autovacuum launcher
postgres 118687 118421 0 14:41 ? 00:00:00 postgres: archiver last was 000000010000000000000004.partial
postgres 118688 118421 0 14:41 ? 00:00:00 postgres: logical replication launcher
postgres 119151 118421 0 14:47 ? 00:00:00 postgres: walsender postgres 192.168.90.91(56426) streaming 0/40001B8