PostgreSQL 数据库高可用软件对比之Repmgr & CLup

测试环境

项目说明
操作系统RockyLinux8.8
PostgreSQL12.16
Repmgryum源安装repmgr_12
CLup开源社区版
PostgreSQL系统用户pg12
PGDATA目录/data/pgdata12

主机列表

主机名IP
PG0110.16.18.160
PG0210.16.18.161
PG0310.16.18.162
ubuntu-KVM10.16.18.16

先放上两者官方文档地址

Repmgr: https://www.repmgr.org/docs/current/index.html

CLup: CLup5.x产品手册:CLup简介

2. Repmgr 安装部署

repmgr实际上是安装到PostgreSQL软件中的,更像是PostgreSQL的一个插件,但是在bin下多了些可执行文件。下文中我们将使用yum源安装,其会自动将PostgreSQL数据库软件也安装上。如果需要将其放到自己编译的软件目录中,需要迁移相关的文件,方法在下文中会有说明。本文中没有部署witness节点和配置repmgrd服务,所以当前还没有高可用的功能。主要是步骤太过繁琐了,后面找时间再补充上来。

2.1 安装repmgr软件

先安装PostgreSQL的yum源

dnf install -y https://download.postgresql.org/pub/repos/yum/reporpms/EL-8-x86_64/pgdg-redhat-repo-latest.noarch.rpm
​
dnf -qy module disable postgresql

安装repmgr,此时会新安装PostgreSQL,即使目录已存在(即使配置了PG_CONFIG环境变量也不行)

dnf install repmgr_12

迁移到自己的PostgreSQL软件中

cp /usr/pgsql-12/bin/repmgr* /usr/csupg-12.16/bin/.
cp /usr/pgsql-12/lib/repmgr* /usr/csupg-12.16/lib/.
cp /usr/pgsql-12/share/extension/repmgr* /usr/csupg-12.16/share/extension/xxx 

拷贝extension时注意看下自己的软件目录下是否还有子目录postgresql

剩下的两台机器可以用scp发过去。

2.2 初始化主库

10.16.18.160上初始化一个PostgreSQL数据库,初始化主要步骤如下,这里不再详写了:

  1. 创建用户和用户组pg12

  2. 添加环境变量

  3. 初始化数据库

  4. 修改数据库参数

  5. 启动数据库

本文中实际上是用CLup创建出来的,哈哈偷个懒,Web页面上点点更方便,如果是新手建议自己手工搭建。

2.3 配置主库

下面的操作在10.16.18.160上执行。

修改repmgr配置文件的属主

chown -R pg12:pg12 /etc/repmgr/12/repmgr.conf

登录数据库用户pg12

su - pg12

登录数据库,在主库中创建一个单独的database对象repmgr和超级用户repmgr:

create user repmgr with superuser password 'repmgr12';
create database repmgr owner repmgr;

添加.pgpass文件

10.16.18.160:5412:repmgr:repmgr:repmgr12
10.16.18.160:5412:replication:repmgr:repmgr12
10.16.18.161:5412:repmgr:repmgr:repmgr12
10.16.18.161:5412:replication:repmgr:repmgr12
10.16.18.162:5412:repmgr:repmgr:repmgr12
10.16.18.162:5412:replication:repmgr:repmgr12

修改文件权限

chmod 600 .pgpass

修改repmgr配置文件/etc/repmgr/12/repmgr.conf

node_id=1
node_name='10.16.18.160'
conninfo='host=10.16.18.160 user=repmgr dbname=repmgr connect_timeout=10'
data_directory='/data/pgdata12'
pg_bindir='/usr/csupg-12.16/bin/'

注册主库:

[pg12@PG01 ~]$ repmgr primary register
INFO: connecting to primary database...
NOTICE: attempting to install extension "repmgr"
NOTICE: "repmgr" extension successfully installed

查看集群状态:

[pg12@PG01 ~]$ repmgr cluster show
 ID | Name | Role    | Status    | Upstream | Location | Priority | Timeline | Connection string                                             
----+------+---------+-----------+----------+----------+----------+----------+--------------------
 1  | PG01 | primary | * running |          | default  | 100      | 1        | host=10.16.18.160 user=repmgr dbname=repmgr connect_timeout=10

主库需要开启wal_log_hints

wal_log_hints = on

重启下数据库

pg_ctl restart

2.4 配置备库

以下操作需要在两台备库(10.16.18.161, 10.16.18.162)上都要执行。

创建用户和组

groupadd -g 712 pg12
useradd -g 712 -u 712 -m pg12

配置环境变量(这里是参照CLup创建数据库时添加的环境变量)

export PATH=/usr/csupg-12.16/bin:$PATH
export LD_LIBRARY_PATH=/usr/csupg-12.16/lib:$LD_LIBRARY_PATH
export PGDATA=/data/pgdata12
export PGHOST='/tmp'
export PGPORT=5412
export LANG=en_US.UTF8

创建目录

mkdir -p /etc/repmgr/12

10.16.18.160上的配置文件拷贝过来

scp 10.16.18.160:/etc/repmgr/12/repmgr.conf /etc/repmgr/12/.

修改repmgr配置文件的属主

chown -R pg12:pg12 /etc/repmgr/12/repmgr.conf

登录数据库的系统用户

su - pg12

添加.pgpass文件

10.16.18.160:5412:repmgr:repmgr:repmgr12
10.16.18.160:5412:replication:repmgr:repmgr12
10.16.18.161:5412:repmgr:repmgr:repmgr12
10.16.18.161:5412:replication:repmgr:repmgr12
10.16.18.162:5412:repmgr:repmgr:repmgr12
10.16.18.162:5412:replication:repmgr:repmgr12

修改文件权限

chmod 600 .pgpass

修改10.16.18.161机器上的repmgr配置文件

node_id=2
node_name='10.16.18.161'
conninfo='host=10.16.18.161 user=repmgr dbname=repmgr connect_timeout=10'
data_directory='/data/pgdata12'
pg_bindir='/usr/csupg-12.16/bin/'

修改10.16.18.162机器上的repmgr配置文件

node_id=3
node_name='10.16.18.162'
conninfo='host=10.16.18.162 user=repmgr dbname=repmgr connect_timeout=10'
data_directory='/data/pgdata12'
pg_bindir='/usr/csupg-12.16/bin/'

2.5 搭建备库(clone)

以下操作需要在两台备库(10.16.18.161, 10.16.18.162)上都要执行。

测试clone命令,不真实执行,主要是检查下环境是否正常

[pg12@PG02 data]$ repmgr -h 10.16.18.160 -U repmgr -d repmgr standby clone --dry-run
NOTICE: destination directory "/data/pgdata12" provided
INFO: connecting to source node
DETAIL: connection string is: host=10.16.18.160 user=repmgr dbname=repmgr
DETAIL: current installation size is 32 MB
INFO: "repmgr" extension is installed in database "repmgr"
INFO: replication slot usage not requested;  no replication slot will be set up for this standby
INFO: parameter "max_wal_senders" set to 64
NOTICE: checking for available walsenders on the source node (2 required)
INFO: sufficient walsenders available on the source node
DETAIL: 2 required, 64 available
NOTICE: checking replication connections can be made to the source server (2 required)
INFO: required number of replication connections could be made to the source server
DETAIL: 2 replication connections required
NOTICE: standby will attach to upstream node 1
HINT: consider using the -c/--fast-checkpoint option
INFO: would execute:
  /usr/pgsql-12/bin/pg_basebackup -l "repmgr base backup"  -D /data/pgdata12 -h 10.16.18.160 -p 5412 -U repmgr -X stream 
INFO: all prerequisites for "standby clone" are met

出现上面的结果说明可以执行,否则的话按照提示先修复相关问题后再执行

repmgr -h 10.16.18.160 -U repmgr -d repmgr standby clone
​
[pg12@PG03 ~]$  repmgr -h 10.16.18.160 -U repmgr -d repmgr standby clone
NOTICE: destination directory "/data/pgdata12" provided
INFO: connecting to source node
DETAIL: connection string is: host=10.16.18.160 user=repmgr dbname=repmgr
DETAIL: current installation size is 32 MB
INFO: replication slot usage not requested;  no replication slot will be set up for this standby
NOTICE: checking for available walsenders on the source node (2 required)
NOTICE: checking replication connections can be made to the source server (2 required)
INFO: creating directory "/data/pgdata12"...
NOTICE: starting backup (using pg_basebackup)...
HINT: this may take some time; consider using the -c/--fast-checkpoint option
INFO: executing:
  /usr/csupg-12.16/bin/pg_basebackup -l "repmgr base backup"  -D /data/pgdata12 -h 10.16.18.160 -p 5412 -U repmgr -X stream 
NOTICE: standby clone (using pg_basebackup) complete
NOTICE: you can now start your PostgreSQL server
HINT: for example: pg_ctl -D /data/pgdata12 start
HINT: after starting the server, you need to register this standby with "repmgr standby register"

搭建好后启动备库:

pg_ctl start

注册备库

repmgr standby register --upstream-node-id 1
​
[pg12@PG03 ~]$ repmgr standby register --upstream-node-id 1
INFO: connecting to local node "PG03" (ID: 3)
INFO: connecting to primary database
INFO: standby registration complete
NOTICE: standby node "PG03" (ID: 3) successfully registered

查看集群状态

[pg12@PG03 ~]$ repmgr cluster show
 ID | Name | Role    | Status    | Upstream | Location | Priority | Timeline | Connection string                                             
----+------+---------+-----------+----------+----------+----------+----------+----------------------------------------------------------------
 1  | PG01 | primary | * running |          | default  | 100      | 1        | host=10.16.18.160 user=repmgr dbname=repmgr connect_timeout=10
 2  | PG02 | standby |   running | PG01     | default  | 100      | 1        | host=10.16.18.161 user=repmgr dbname=repmgr connect_timeout=10
 3  | PG03 | standby |   running | PG01     | default  | 100      | 1        | host=10.16.18.162 user=repmgr dbname=repmgr connect_timeout=10

3. CLup 安装部署

3.1 安装程序

先安装Server端(clup),这里是安装到ubuntu-KVM一台虚拟机上的,使用一键安装命令,SSH连接上Ubuntu执行

wget -qO /tmp/clup.sh --no-check-certificate https://get.csudata.com/csuinst/clup.sh && bash /tmp/clup.sh openclup install

如果是有多个网卡,需要选择一个IP作为服务的IP。然后安装Agent端时需要输入选择的这个IP。

然后再安装Agent端,在三台数据库主机(10.16.18.160, 10.16.18.161, 10.16.18.162)上执行

wget -qO /tmp/clup.sh --no-check-certificate https://get.csudata.com/csuinst/clup.sh && bash /tmp/clup.sh openclup-agent install

中间需要输入CLup Server所在主机的IP。

安装完成后登录Web页面查看(http://10.16.18.16:8090 用户名:admin,密码:openclup):

3.2 创建集群

创建步骤可以参考我的另一篇文章:新发现:一款非常好用的PostgreSQL数据库管理工具-CSDN博客

主要步骤如下:

  1. 首次需要添加一个VIP池

  2. 创建集群

添加VIP池

创建集群的步骤参考文章:新发现:一款非常好用的PostgreSQL数据库管理工具-CSDN博客

这里不在赘述,就是Web界面上填写一些主要信息,点击几下就可以将集群搭建起来了。

前面的文章中,是在10.16.18.16010.16.18.161上创建了一主一备的集群,然后这里再搭建一个备库

4. 一些使用场景的对比

4.1 Repmgr中的数据库切换

假设要把10.16.18.161机器切换成主库。需要先配置用户免密,如何配置用户免密网上文章有很多,这里不再赘述。

repmgr的操作,到10.16.18.161机器上执行下面的命令:

[pg12@PG02 ~]$ repmgr standby switchover
NOTICE: executing switchover on node "10.16.18.161" (ID: 2)
WARNING: 1 sibling nodes found, but option "--siblings-follow" not specified
DETAIL: these nodes will remain attached to the current primary:
  10.16.18.162 (node ID: 3)
NOTICE: attempting to pause repmgrd on 3 nodes
NOTICE: local node "10.16.18.161" (ID: 2) will be promoted to primary; current primary "10.16.18.160" (ID: 1) will be demoted to standby
NOTICE: stopping current primary node "10.16.18.160" (ID: 1)
NOTICE: issuing CHECKPOINT on node "10.16.18.160" (ID: 1) 
DETAIL: executing server command "/usr/csupg-12.16/bin/pg_ctl  -D '/data/pgdata12' -W -m fast stop"
INFO: checking for primary shutdown; 1 of 60 attempts ("shutdown_check_timeout")
INFO: checking for primary shutdown; 2 of 60 attempts ("shutdown_check_timeout")
NOTICE: current primary has been cleanly shut down at location 0/1B000028
NOTICE: promoting standby to primary
DETAIL: promoting server "10.16.18.161" (ID: 2) using pg_promote()
NOTICE: waiting up to 60 seconds (parameter "promote_check_timeout") for promotion to complete
NOTICE: STANDBY PROMOTE successful
DETAIL: server "10.16.18.161" (ID: 2) was successfully promoted to primary
WARNING: node "10.16.18.160" attached in state "startup"
INFO: waiting for node "10.16.18.160" (ID: 1) to connect to new primary; 1 of max 60 attempts (parameter "node_rejoin_timeout")
DETAIL: node "10.16.18.161" (ID: 1) is currently attached to its upstream node in state "startup"
NOTICE: node "10.16.18.161" (ID: 2) promoted to primary, node "10.16.18.160" (ID: 1) demoted to standby
NOTICE: switchover was successful
DETAIL: node "10.16.18.161" is now primary and node "10.16.18.160" is attached as standby
NOTICE: STANDBY SWITCHOVER has completed successfully

查看下当前集群的状态

[pg12@PG02 ~]$ repmgr cluster show
 ID | Name         | Role    | Status    | Upstream     | Location | Priority | Timeline | Connection string                                             
----+--------------+---------+-----------+--------------+----------+----------+----------+----------------------------------------------------------------
 1  | 10.16.18.160 | standby |   running | 10.16.18.161 | default  | 100      | 1        | host=10.16.18.160 user=repmgr dbname=repmgr connect_timeout=10
 2  | 10.16.18.161 | primary | * running |              | default  | 100      | 2        | host=10.16.18.161 user=repmgr dbname=repmgr connect_timeout=10
 3  | 10.16.18.162 | standby |   running | 10.16.18.160 | default  | 100      | 2        | host=10.16.18.162 user=repmgr dbname=repmgr connect_timeout=10

4.2 CLup中的数据库切换

先进入到集群的详情页

然后找到想要切换到的备库,点击切换为主库

点击提交后,这里会有一个是否保留级联关系的确认框,当前我们的备库都在主库上,所以保留和不保留级联关系,结果应该都是一样的。点击确定

切换过程会有日志打印,看到Success说明应该是成功了

返回到集群详情查看

4.2 高可用故障切换

repmgr需要配置witness节点和开启repmgrd服务才能完成故障切换,这里没有配置witness节点,暂时就先不演示了,后面再补。

CLup的故障切换,需要先将集群上线

然后我们将当前主库10.16.18.161主机停掉看下CLup是否会将执行故障切换。先是探测到Agent异常

然后等一会发现会有一个后台任务

点击显示详情里面就是切换的过程了

再看下集群详情,发现主库已经切换到10.16.18.160上去了(这里的时间忽略掉,中午休息去了哈哈)

5. 总结

主要功能对比:

功能RepmgrCLup
搭建备库
切换主库
集群状态监控
高可用故障切换
其他运维管理(启停库、查看和修改数据库参数、查看数据库日志等)

安装部署对比:

  • Repmgr的安装部署相对复杂,安装配置过程相比于CLup要多很多。然后使用clone搭建备库和后续的高可用都需要先配置免密,高可用还需要配置witness节点和repmgrd服务。

  • CLup的安装相对简单些,使用其一键安装的命令,默认也不需要修改什么配置,安装好POstgreSQL软件后就可以搭建集群了。然后得益于其有Web界面,操作可视化显得更加方便。

还有一点就是repmgr是命令行终端显示,CLup是Web页面可视化,相对而言CLup可能要更好用些,并且一次部署,可以管理很多套集群。不过大家可以都试试,学习嘛,就得多折腾,然后对比下哪个好用用哪个。对了本文中的CLup是开源社区版本,其还有一个商业版本,如果是企业使用的话,最好是用企业版本,毕竟有专人做技术支持更加放心些。

评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值