一、配置两个主机root的互信
1、编辑hosts文件,添加node1和node2信息
192.168.18.11 node1
192.168.18.12 node2
2、node1 操作
ssh-keygen -t rsa
ssh-copy-id -i node2
3、node2操作
ssh-key-gen -t rsa
cd ~./ssh
cat *.pub >>authorized_keys
scp authorized_keys root@node1:/root/.ssh
4、验证
ssh node1 date
ssh node2 date
二、repmgr部署
- 将节点1的pg软件复制到节点2,创建账号和目录
groupadd pg14
useradd -g pg14 pg14
echo 123| passwd --stdin pg14
mkdir -p /home/pg14/soft
mkdir -p /home/pg14/data
mkdir -p /home/pg14/arch
scp -r /home/pg14/soft node2:/home/pg14
- 上传repmgr安装包、解压、安装
yum check-update
yum groupinstall "Development Tools" -y
yum install yum-utils openjade docbook-dtds docbook-style-dsssl docbook-style-xsl -y
yum-builddep postgresql96
节点1:
tar -zxf repmgr-5.3.3.tar.gz
./configure
make && make install
ssh node2 "cd /soft;tar -zxf repmgr-5.3.3.tar.gz"
ssh node2 "cd /soft/repmgr-5.3.3;./configure;make && make install"
- 创建复制账号
postgres=# create user repmgr with superuser password '1qaz@WSX';
CREATE ROLE
postgres=# create database repmgrdb with owner=repmgr;
CREATE DATABASE
postgres=#
- 节点1 参数配置
postgresql.auto.conf
max_wal_senders=10
max_replication_slots=10
wal_level='logical'
hot_standby=on
archive_mode=on
wal_keep_size = '128'
archive_command='cp %p /home/pg14/arch/%f'
- 修改hba文件
pg_hba_conf IPV4一栏和replication一栏分别添加下面两行
[pg14@node1 data]$ egrep repmgr $PGDATA/pg_hba.conf
host repmgrdb repmgr 192.168.18.0/24 trust
host replication repmgr 192.168.18.0/24 trust
- 节点1创建repmgr扩展
cp /soft/repmgr-5.3.3/repmgr.so /home/pg14/soft/lib/postgresql/
cp /soft/repmgr-5.3.3/repmgr.control /home/pg14/soft/share/postgresql/extension
cp /soft/repmgr-5.3.3/*.sql /home/pg14/soft/share/postgresql/extension
postgresql.auto.conf
shared_preload_libraries = 'repmgr,passwordcheck'
pg_ctl restart -D $PGDATA -l /tmp/logfile
psql -U repmgr -d repmgrdb
repmgrdb=# create extension repmgr;
CREATE EXTENSION
- repmgr配置文件
mkdir -p /home/pg14/conf
cd /home/pg14/conf
node1:
vi repmgr.conf
[pg14@node1 conf]$ vi repmgr.conf
#[base]
node_id=1
node_name=node1
conninfo ='host=node1 dbname=repmgrdb user=repmgr connect_timeout=2 password=1qaz@WSX'
data_directory='/home/pg14/data'
#[optional]
#[log]
log_level=info
log_facility=stderr
log_file='/home/pg14/conf/repmgr.log'
log_status_interval=300
node2
#[base]
node_id=2
node_name=node2
conninfo ='host=node2 dbname=repmgrdb user=repmgr connect_timeout=2 password=1qaz@WSX'
data_directory='/home/pg14/data'
#[optional]
#[log]
log_level=info
log_facility=stderr
log_file='/home/pg14/conf/repmgr.log'
log_status_interval=300
-
主节点注册
[pg14@node1 repmgr-5.3.3]$ repmgr -f /home/pg14/conf/repmgr.conf primary register INFO: connecting to primary database... INFO: "repmgr" extension is already installed NOTICE: primary node record (ID: 1) registered [pg14@node1 repmgr-5.3.3]$ repmgr -f /home/pg14/conf/repmgr.conf cluster show ID | Name | Role | Status | Upstream | Location | Priority | Timeline | Connection string ----+-------+---------+-----------+----------+----------+----------+----------+---------------------------------------------------------------------------- 1 | node1 | primary | * running | | default | 100 | 1 | host=node1 dbname=repmgrdb user=repmgr connect_timeout=2 password=1qaz@WSX
-
备库克隆
克隆测试准备
[pg14@node2 repmgr-5.3.3]$ repmgr -h node1 -U repmgr -d repmgrdb -f /home/pg14/conf/repmgr.conf standby clone --dry-run
NOTICE: destination directory "/home/pg14/data" provided
INFO: connecting to source node
DETAIL: connection string is: host=node1 user=repmgr dbname=repmgrdb
DETAIL: current installation size is 42 MB
INFO: "repmgr" extension is installed in database "repmgrdb"
INFO: replication slot usage not requested; no replication slot will be set up for this standby
INFO: parameter "max_wal_senders" set to 10
NOTICE: checking for available walsenders on the source node (2 required)
INFO: sufficient walsenders available on the source node
DETAIL: 2 required, 10 available
NOTICE: checking replication connections can be made to the source server (2 required)
INFO: required number of replication connections could be made to the source server
DETAIL: 2 replication connections required
NOTICE: standby will attach to upstream node 1
HINT: consider using the -c/--fast-checkpoint option
INFO: would execute:
pg_basebackup -l "repmgr base backup" -D /home/pg14/data -h node1 -p 5666 -U repmgr -X stream
INFO: all prerequisites for "standby clone" are met
开始克隆
[pg14@node2 repmgr-5.3.3]$ repmgr -h node1 -U repmgr -d repmgrdb -f /home/pg14/conf/repmgr.conf standby clone
NOTICE: destination directory "/home/pg14/data" provided
INFO: connecting to source node
DETAIL: connection string is: host=node1 user=repmgr dbname=repmgrdb
DETAIL: current installation size is 42 MB
INFO: replication slot usage not requested; no replication slot will be set up for this standby
NOTICE: checking for available walsenders on the source node (2 required)
NOTICE: checking replication connections can be made to the source server (2 required)
INFO: checking and correcting permissions on existing directory "/home/pg14/data"
NOTICE: starting backup (using pg_basebackup)...
HINT: this may take some time; consider using the -c/--fast-checkpoint option
INFO: executing:
pg_basebackup -l "repmgr base backup" -D /home/pg14/data -h node1 -p 5666 -U repmgr -X stream
NOTICE: standby clone (using pg_basebackup) complete
NOTICE: you can now start your PostgreSQL server
HINT: for example: pg_ctl -D /home/pg14/data start
HINT: after starting the server, you need to register this standby with "repmgr standby register"
- 启动备库,注册备库
pg_ctl start -D $PGDATA -l /tmp/logfile
[pg14@node2 ~]$ repmgr -f ~/conf/repmgr.conf standby register
[pg14@node1 repmgr-5.3.3]$ repmgr -f ~/conf/repmgr.conf service status
ID | Name | Role | Status | Upstream | repmgrd | PID | Paused? | Upstream last seen
----+-------+---------+-----------+----------+-------------+-----+---------+--------------------
1 | node1 | primary | * running | | not running | n/a | n/a | n/a
2 | node2 | standby | running | node1 | not running | n/a | n/a | n/a
- 验证主备状态
[pg14@node1 repmgr-5.3.3]$ repmgr -f ~/conf/repmgr.conf cluster show
ID | Name | Role | Status | Upstream | Location | Priority | Timeline | Connection string
----+-------+---------+-----------+----------+----------+----------+----------+----------------------------------------------------------------------------
1 | node1 | primary | * running | | default | 100 | 1 | host=node1 dbname=repmgrdb user=repmgr connect_timeout=2 password=1qaz@WSX
2 | node2 | standby | running | node1 | default | 100 | 1 | host=node2 dbname=repmgrdb user=repmgr connect_timeout=2 password=1qaz@WSX
-
配置操作系统服务启动
yum install -y postgresql-server.x86_64 vi /usr/lib/systemd/system/postgresql.service [Unit] Description=PostgreSQL database server After=network.target [Service] Type=forking User=pg14 Group=pg14 Environment=PGPORT=5666 Environment=PGDATA=/home/pg14/data Environment=PGLOG=/tmp/logfile OOMScoreAdjust=-1000 ExecStart=/home/pg14/soft/bin/pg_ctl start -D ${PGDATA} -l ${PGLOG} ExecStop=/home/pg14/soft/bin/pg_ctl stop -D ${PGDATA} -l ${PGLOG} ExecRestart=/home/pg14/soft/bin/pg_ctl restart -D ${PGDATA} -l ${PGLOG} ExecReload=/home/pg14/soft/bin/pg_ctl reload -D ${PGDATA} -l ${PGLOG} TimeoutSec=300 [Install] WantedBy=multi-user.target
13.修改sudo权限
vi /etc/sudoers
# Refuse to run if unable to disable echo on the tty.
#
#Defaults !visiblepw
Defaults !requiretty
## Allow root to run any commands anywhere
root ALL=(ALL) ALL
pg14 ALL=(ALL) NOPASSWD:/bin/systemctl stop postgresql,\
/bin/systemctl start postgresql,\
/bin/systemctl restart postgresql,\
/bin/systemctl reload postgresql
14.修改repmg配置文件
vi /home/pg14/conf/repmgr.conf
#[service]
service_start_command='sudo systemctl start postgresql'
service_start_command='sudo systemctl stop postgresql'
service_start_command='sudo systemctl restart postgresql'
service_start_command='sudo systemctl relaod postgresql'
-
手动切换
节点2: [pg14@node2 data]$ repmgr -f /home/pg14/conf/repmgr.conf standby switchover NOTICE: executing switchover on node "node2" (ID: 2) NOTICE: attempting to pause repmgrd on 2 nodes NOTICE: local node "node2" (ID: 2) will be promoted to primary; current primary "node1" (ID: 1) will be demoted to standby NOTICE: stopping current primary node "node1" (ID: 1) NOTICE: issuing CHECKPOINT on node "node1" (ID: 1) DETAIL: executing server command "sudo systemctl stop postgresql" INFO: checking for primary shutdown; 1 of 60 attempts ("shutdown_check_timeout") NOTICE: current primary has been cleanly shut down at location 0/15000028 NOTICE: promoting standby to primary DETAIL: promoting server "node2" (ID: 2) using pg_promote() NOTICE: waiting up to 60 seconds (parameter "promote_check_timeout") for promotion to complete NOTICE: STANDBY PROMOTE successful DETAIL: server "node2" (ID: 2) was successfully promoted to primary NOTICE: node "node2" (ID: 2) promoted to primary, node "node1" (ID: 1) demoted to standby NOTICE: switchover was successful DETAIL: node "node2" is now primary and node "node1" is attached as standby NOTICE: STANDBY SWITCHOVER has completed successfully [pg14@node2 data]$ repmgr -f /home/pg14/conf/repmgr.conf cluster show ID | Name | Role | Status | Upstream | Location | Priority | Timeline | Connection string ----+-------+---------+-----------+----------+----------+----------+----------+---------------------------------------------------------------------------- 1 | node1 | standby | running | node2 | default | 100 | 5 | host=node1 dbname=repmgrdb user=repmgr connect_timeout=2 password=1qaz@WSX 2 | node2 | primary | * running | | default | 100 | 6 | host=node2 dbname=repmgrdb user=repmgr connect_timeout=2 password=1qaz@WSX [pg14@node2 data]$
-
配置故障自动切换
repmgr配置文件添加
#[auto-failover]
failover=automatic
promote_command='/home/pg14/soft/bin/repmgr standby promote -f /home/pg14/conf/repmgr.conf --log-to-file'
follow_command='/home/pg14/soft/bin/repmgr standby follow -f /home/pg14/conf/repmgr.conf --log-to-file upstream-node-id=%n'
#[monitor]
monitoring_history=yes
monitor_interval_secs=1
reconnect_attempts=6
reconnect_interval=1
retry_promote_interval_secs=3
remgrd_standby_startup_timeout=1
sibling_nodes_disconnect_timeout=1
- 两节点启动repmgrd
[pg14@node1 conf]$ repmgrd -f /home/pg14/conf/repmgr.conf -v -d -p ~/conf/repmgrd.pid
[2022-12-22 05:16:43] [NOTICE] using provided configuration file "/home/pg14/conf/repmgr.conf"
[2022-12-22 05:16:43] [NOTICE] redirecting logging output to "/home/pg14/conf/repmgr.log"