pgsql集群维护笔记
集群说明
软件说明
组件说明
三级目录
集群启停
启动
启动 etcd,三个 etcd 节点均需执行
systemctl start etcd
systemctl status etcd
Patroni 所在节点均需执行
systemctl start patroni
systemctl status patroni
关闭
如需要关闭集群,需按照以下顺序关闭
- 停止连接数据库的应用
- 停止 HAProxy
- 停止 Patroni
- 停止 ETCD
所有 patroni 节点均需执行
systemctl stop patroni
ps -ef | grep patroni
ps -ef | grep postgre
关闭 etcd,三个 etcd 节点均需执行
systemctl stop etcd
systemctl status etcd
重启
重启整个集群
patronictl -c /etc/patroni/patroni.yml restart {CLUSTER_NAME}
重启单个节点
patronictl -c /etc/patroni/patroni.yml restart {CLUSTER_NAME} {MEMBER_NAMES}
Patroni参数修改
更改完后在全局生效,依据该参数是动态或是静态参数来选择是否需要重启数据库
必须在 dcs 中更改的 数据库参数
[root@gs-server-9014 join_domain_linux]# patronictl -c /etc/patroni/patroni.yml
edit-config
---
+++
@@ -22,7 +22,7 @@
log_filename: postgresql-%H.log
log_line_prefix: '%t [%p-%l] %r %q%u@%d '
log_lock_waits: 'on'
- log_min_duration_statement: 1s
+ log_min_duration_statement: 2s
log_rotation_age: 1d
log_rotation_size: 10MB
log_truncate_on_rotation: 'on'
Apply these changes? [y/N]:
重新加载配置
patronictl -c /etc/patroni/patroni.yml reload {CLUSTER_NAME}
Patronictl 集群维护命令
查看集群状态
shell> patronictl -c /etc/patroni/patroni.yml list
+ Cluster: test (6924183876920460759) ------+----+-----------+
| Member | Host | Role | State | TL | Lag in MB |
+----------+---------------+---------+---------+----+-----------+
| test0 | 10.201.40.155 | Leader | running | 7 | |
| test1 | 10.201.40.225 | Replica | running | 7 | 0 |
+----------+---------------+---------+---------+----+-----------+
发送sql语句
shell> patronictl -c postgresmq.yml query test0 --command 'select count(*) from
test'
获取主节点dsn信息
shell> patronictl -c /etc/patroni/patroni.yml dsn
host=10.201.40.155 port=5432
重启集群
shell> patronictl -c /etc/patroni/patroni.yml restart {CLUSTER_NAME}
执行主备切换
shell> patronictl -c /etc/patroni/patroni.yml switchover {CLUSTER_NAME}
Master [ad_test0]:
Candidate ['ad_test1'] []:
When should the switchover take place (e.g. 2021-04-07T12:31 ) [now]:
Current cluster topology
+ Cluster: ad_test (6924183876920460759) ------+----+-----------+
| Member | Host | Role | State | TL | Lag in MB |
+----------+---------------+---------+---------+----+-----------+
| ad_test0 | 10.201.40.155 | Leader | running | 7 | |
| ad_test1 | 10.201.40.225 | Replica | running | 7 | 0 |
+----------+---------------+---------+---------+----+-----------+
Are you sure you want to switchover cluster ad_test, demoting current master
ad_test0? [y/N]:
手动failover一个节点
shell> patronictl -c /etc/patroni/patroni.yml failover {CLUSTER_NAME}
在DCS中删除集群信息
shell> patronictl -c /etc/patroni/patroni.yml remove {CLUSTER_NAME}
重新初始化接节点
shell> patronictl -c /etc/patroni/patroni.yml reinit {CLUSTER_NAME}
{MEMBER_NAMES}
PostgreSQL 数据库插件的加载
- 方法一
:每个节点都对插件编译后再在主节点create extensipn - 方法二
:给集群每一个节点发送指令
shell> patronictl -c postgresmq.yml query batman5 --command 'create
extension pg_prewarm'
日志级别调整
Patroni 该功能在只有在 patroni 1.3.4 及以后的版本才加入
log:
dir: /var/log/patroni
level: WARNING
Python 中的日志等级
- CRITICAL
- ERROR
- WARNING
- INFO
- DEBUG
- NOTSET
修改完毕后重启集群生效,重启 patroni 集群,在任意 patroni 节点执行
shell> patronictl -c /etc/patroni/patroni.yml restart {CLUSTER_NAME}
故障
故障检查流程
检查项
Patroni 日志
shell> ls /var/log/patroni/ -l
总用量 115476
-rw-r--r-- 1 postgres postgres 16206282 4月 7 15:00 patroni.log
-rw-r--r-- 1 postgres postgres 24999976 3月 25 06:11 patroni.log.1
-rw-r--r-- 1 postgres postgres 24999999 3月 10 12:08 patroni.log.2
-rw-r--r-- 1 postgres postgres 24999953 2月 24 04:57 patroni.log.3
-rw-r--r-- 1 postgres postgres 24998901 2月 20 18:25 patroni.log.4
Patroni 集群状态
shell> patronictl -c /etc/patroni/patroni.yml list
DCS 集群状态
shell> etcdctl --endpoints "http://192.168.10.85:2379" cluster-health
数据库日志
shell> ls -l /var/log/pg_log/
总用量 118624
-rw-r--r-- 1 root root 2153 3月 22 10:36 a
-rw------- 1 postgres postgres 17606 4月 7 11:31 postgresql-00.csv
-rw------- 1 postgres postgres 0 3月 20 00:00 postgresql-00.log
-rw------- 1 postgres postgres 10486211 4月 2 02:31 postgresql-01.csv
-rw------- 1 postgres postgres 10486260 4月 2 03:52 postgresql-02.csv