PostgreSQL+Keepalived集群配置手册

1.环境配置

ip

虚拟ip

版本

master

192.168.20.71

192.168.20.99

postgresql v14.2

keepalived-1.3.5-19.el7.x86_64

slave

192.168.20.75

postgresql v14.2

keepalived-1.3.5-19.el7.x86_64

2.安装步骤

安装postgresql

yum install -y postgresql14-server

以下命令在使用解压缩方式安装时执行,使用yum安装时不需要执行。

命令补充,如果使用解压缩方式安装,则需要创建postgres用户

adduser postgres

并且安装postgresql过程中需要手动安装如下依赖工具

yum install zlib-devel

yum install readline-devel

yum install bison

yum install flex

启动方法命令:

su - postgres
pg_ctl start
pg_ctl restart
pg_ctl stop

安装完以后,安装路径如下:

/var/lib/pgsql/14

进入postgres用户

su - postgres

 通过vi ~/.bash_profile查看安装路径和PDATA的配置

[ -f /etc/profile ] && source /etc/profile
PATH=/usr/pgsql-14/bin:$PATH
export PATH
PGDATA=/var/lib/pgsql/14/data
export PGDATA

# If you want to customize your settings,
# Use the file below. This is not overridden
# by the RPMS.

[ -f /var/lib/pgsql/.pgsql_profile ] && source /var/lib/pgsql/.pgsql_profile

可以看到postgresql安装路径在/usr/pgsql-14/bin下,PGDATA在路径/var/lib/pgsql/14/data下。

创建数据库Keepalived,并且创建表探测表sr_delay,后续Keepalived探测,刷新sr_delay表的last_alive字段为当前探测时间。这张表用来判断主备延迟情况,数据库故障切换时会用到这张表。

数据库配置:

使用psql命令进入postgres数据库后台

-bash-4.2$ psql
psql (14.2)
Type "help" for help.

创建测试表test

postgres=# create table test(id int);

创建keepalived用户和数据库,用于监测主备机器数据库启动情况

postgres=# create user keepalived password 'sw12345' CONNECTION LIMIT 4 ;
CREATE ROLE
postgres=# create database keepalived owner keepalived;
CREATE DATABASE
postgres=# \c keepalived keepalived
You are now connected to database "keepalived" as user "keepalived".
keepalived=> create table sr_delay(id int4, last_alive timestamp(0) without time zone);
CREATE TABLE

表sr_delay只允许写入一条记录,并且不允许删除此表数据,通过触发器实现。创建触发器函数,如下所示:

CREATE FUNCTION cannt_delete ()
RETURNS trigger
LANGUAGE plpgsql AS $$
BEGIN
RAISE EXCEPTION 'Table sr_delay can not delete !';
END;
$$;

创建触发器:

CREATE TRIGGER trigger_sr_delay_del
BEFORE DELETE ON sr_delay
FOR EACH ROW EXECUTE PROCEDURE cannt_delete() ;
CREATE TRIGGER trigger_sr_delay_tru BEFORE TRUNCATE ON sr_delay FOR STATEMENT EXECUTE PROCEDURE cannt_delete() ;

插入数据:

INSERT INTO sr_delay VALUES(1,now()) ;

退出pgsql后台,退出postgres用户

\q
exit

创建归档文件夹,并赋权

mkdir /var/lib/pgsql/archivedir
chown postgres /var/lib/pgsql/archivedir

进入postgres用户,打开postgres配置文件,如下参数需要修改:

su - postgres
vi /var/lib/pgsql/14/data/postgresql.conf

修改参数如下:

listen_address = '*'
wal_level = replica
hot_standby = on
wal_log_hints = on
max_wal_senders= 10  
wal_keep_segments=1024
archive_mode=on
archive_command='test ! -f /var/lib/pgsql/archivedir/%f && cp %p /var/lib/pgsql/archivedir/%f'

参数简要说明:

listen_address:  按需设置,本次测试配置为所有主机均可以访问,生产环境可以按需配置网段或IP等

wal_level:  设置流复制模式至少设置为replica

archive_mode:  本次启用归档

archive_command:WAL日志归档命令,生产环境可以将归档拷贝到对应目录或其他机器上,本次测试配置为归档到本机的另一个目录下

max_wal_senders:最大WAL发送进程数,此数量需大于等于从库个数且比max_connections小。

wal_keep_segments:pg_wal目录下保留WAL日志的个数,每个WAL文件默认16M,为保障从库能在应用归档落后时依旧能追上主库,此值建议设置较大一点。

hot_standby:此参数控制在恢复归档期间是否支持只读操作,设置为ON后从库为只读模式。

后续Keepalived会每隔指定时间探测PostgreSQL数据库存活, 并且以Keepalived用户登录Keepalived数据库刷新这张表, 配置主备库pg_hba.conf,增加如下内容:

vi /var/lib/pgsql/14/data/pg_hba.conf

host    keepalived    keepalived        192.168.20.71/32      md5
host    keepalived    keepalived        192.168.20.75/32      md5

为了使服务器之间可以通信,还需要添加如下配置

host    replication     all             192.168.20.99/32        trust
host    replication     all             192.168.20.75/24        md5
host    replication     all             192.168.20.71/24        md5
host    replication     all             192.168.20.1/24         md5
host    all             all             0.0.0.0/0               md5

随后对配置文件进行重新加载使配置生效。

pg_ctl reload

创建复制账号

为了数据安全及便于权限控制,创建一个复制专用的数据库账号

postgres=# create user repl  REPLICATION  LOGIN ENCRYPTED  PASSWORD 'sw12345';
CREATE ROLE

修改配置文件pg_hba.conf

添加复制账号的权限,因可能会主从切换,因此 主从机器的IP均添加。也可以设置网段,以便于后期添加从库。

# replication privilege.

host     replication     repl         192.168.20.71/24          md5
host     replication     repl         192.168.20.75/24          md5

备份数据,此步骤只需要在从机上操作,用于复制主机数据

从机上在线备份主库数据,并将数据放在指定路径,此路径建议与主库路径一致。可以使用

pg_basebackup

命令在线热备份,具体命令如下:

pg_basebackup -h 192.168.20.71 -U repl -p 5432 -F p   -X s  -v -P -R -D /var/lib/pgsql/14/data/ -l postgres32

    pg_basebackup命令中的参数说明:

    -h 指定连接的数据库的主机名或IP地址,这里就是主库的ip

    -U 指定连接的用户名,此处是我们刚才创建的专门负责流复制的repl用户

    -F 指定生成备份的数据格式,支持p(plain原样输出)或者t(tar格式输出)

    -X 表示备份开始后,启动另一个流复制连接从主库接收WAL日志,有 f(fetch)和s (stream)两种方式,建议使用s方式

    -P 表示显示数据文件、表空间传输的近似百分比 允许在备份过程中实时的打印备份的进度

    -v 表示启用verbose模式,命令执行过程中会打印各阶段日志,建议启用

    -R 表示会在备份结束后自动生成recovery.conf文件,这样也就避免了手动创建

    -D 指定把备份写到哪个目录,这里尤其要注意一点就是做基础备份之前从库的数据目录(/data/postgresql/data)目录需要手动清空

     -l 表示指定个备份的标识,运行命令后可以看到进度提示

以上备份命令输出过程如下

[postgres@PG33 data]$ pg_basebackup -h 192.168.20.71 -U repl -p 5432 -F p   -X s  -v -P -R -D /var/lib/pgsql/14/data/ -l postgres32
Password:
pg_basebackup: initiating base backup, waiting for checkpoint to complete
pg_basebackup: checkpoint completed
pg_basebackup: write-ahead log start point: 0/2000028 on timeline 1
pg_basebackup: starting background WAL receiver
pg_basebackup: created temporary replication slot "pg_basebackup_17737"
56041/56041 kB (100%), 1/1 tablespace
pg_basebackup: write-ahead log end point: 0/20000F8
pg_basebackup: waiting for background process to finish streaming ...
pg_basebackup: base backup completed

从以上日志信息看出pg_basebackup命令首先对数据库做一次checkpoint,之后基于时间点做一个全库基准备份,全备过程中会拷贝$PGDATA数据文件和表空间文件到备库节点对应目录。

进入数据库,查看副本同步状态

select  pid ,usesysid,usename,client_addr,state,sync_state  from  pg_stat_replication;

查看是否是副本,如果结果返回f说明是主库,返回t说明是从库。

select pg_is_in_recovery();

安装keepalived

关闭防火墙和selinux

systemctl stop firewalld
vi /etc/selinux/config
SELINUX=disabled
getenforce

在71和75机器上分别进行如下操作,两台机器都要进行:

yum install gcc openssl openssldevel libnl libnldevel ipvsadm -y
yum install -y keepalived

编辑keepalived.conf配置文件

vi /etc/keepalived/keepalived.conf

内容如下:

说明:Keepalived主节点的priority参数配成100,备节点的priority参数改成90 ,主节点的state改为MASTER,备节点改为BACKUP,以及smtp_server 分别改为192.168.20.71和192.168.20.75,其余参数配置一样。

!Configuration File for keepalived

global_defs {
   notification_email {
     343547566@qq.com
   }
   notification_email_from Alexandre.Cassen@firewall.loc
   smtp_server 192.168.20.71
   smtp_connect_timeout 30
   router_id LVS_DEVEL
   vrrp_skip_check_adv_addr
   vrrp_strict
   vrrp_garp_interval 0
   vrrp_gna_interval 0
}


vrrp_script check_pg_alived {
  script "/etc/keepalived/scripts/pg_monitor.sh"
  interval 10
  fall 3
}


vrrp_instance VI_1 {
    state MASTER
    interface eth0
    virtual_router_id 51
    priority 100
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 1111
    }

    track_script {
      check_pg_alived
    }

    virtual_ipaddress {
        192.168.20.99
    }

    notify_master "/etc/keepalived/scripts/failover.sh"
    notify_fault  "/etc/keepalived/scripts/fault.sh"
}

创建scripts路径

mkdir /etc/keepalived/scripts

创建pg_monitor.sh,failover.sh,fault.sh,并赋权限

touch /etc/keepalived/scripts/pg_monitor.sh
touch /etc/keepalived/scripts/failover.sh
touch /etc/keepalived/scripts/fault.sh

chmod a+x /etc/keepalived/scripts/pg_monitor.sh
chmod a+x /etc/keepalived/scripts/failover.sh
chmod a+x /etc/keepalived/scripts/fault.sh

分别编写pg_monitor.sh,failover.sh和fault.sh脚本

vi /etc/keepalived/scripts/pg_monitor.sh
#!/bin/bash
pgport=5432
pguser=keepalived
pgdb=keepalived
pgpwd='sw12345'
LANG=en_US.utf8
PGHOME=/usr/pgsql-14/
LD_LIBRARY_PATH=$PGHOME/lib:/lib64:/usr/lib64:/usr/local/lib64:/lib:/usr/lib:/usr/local/lib
PATH=$PGHOME/bin:$PATH:.
MONITOR_LOG="/etc/keepalived/scripts/pg_monitor.log"
SQL1="UPDATE sr_delay SET last_alive= now();"
SQL2='SELECT 1;'
keeplognums=30000


#pg_port_status=`lsof i :$pgport | grep LISTEN | wc l`
#pg_port_status=`ps ef | grep LISTEN | wc l`
#if [ $pg_port_status lt 1 ];then
# echo e `date +"%F %T"` "Error: The postgreSQL is not running,please chec k the postgreSQL server status!" >> $LOGFILE
# exit 1
#fi


#此脚本不检查备库存活状态,如果是备库则退出
standby_flg=`psql -p $pgport -U postgres -At -c "SELECT pg_is_in_recovery();"`
if [ ${standby_flg} == 't' ];
then
echo e "`date +%F\ %T`: This is a standby database, exit!\n" > $MONITOR_LOG
exit 0
fi
export PGPASSWORD=$pgpwd
#主库更新sr_delay 表
echo $SQL1 | psql -At -p $pgport -U $pguser -d $pgdb >> $MONITOR_LOG
#判断主库是否可用
echo $SQL2 | psql -At -p $pgport -U $pguser -d $pgdb
if [ $? -eq 0 ] ;
then
echo -e "`date +%F\ %T`:Primary db is health." >> $MONITOR_LOG
exit 0
else
echo -e "`date +%F\ %T`:Attention: Primary db is not health!" >> $MONITOR_LOG
exit 1
fi



#日志保留 keeplognums 行
if [ ! -f ${MONITOR_LOG} ];
then touch ${MONITOR_LOG};
fi
lognums=`cat ${MONITOR_LOG} |wc -l`
catnum=$((${lognums} -${keeplognums}))
if [[ $lognums -gt ${keeplognums} ]] ; then sed -i "1,${catnum}d" ${MONITOR_LOG};
fi

编写failover.sh

vi /etc/keepalived/scripts/failover.sh

内容如下:

#!/bin/bash
export PGPORT=5432
export PGUSER=keepalived
export PG_OS_USER=postgres
export PGDBNAME=keepalived
export.UTF-8
export PGPATH=/usr/pgsql-14/bin/
export PATH=$PATH:$PGPATH
export PGMIP=127.0.0.1
LOGFILE='/etc/keepalived/scripts/failover.log'

# 主备数据库同步时延,单位为秒
sr_allowed_delay_time=100

SQL1='select pg_is_in_recovery from pg_is_in_recovery();'
SQL2="select count(1) as delay_time from sr_delay where now()<(last_alive + interval '${sr_allowed_delay_time} seconds');"
#SQL2="select count(1) as delay_time from sr_delay where now()<(last_alive + interval '100 seconds');"
sleep $sr_allowed_delay_time
db_role=`echo $SQL1 | psql -At -p $PGPORT -U $PGUSER -d $PGDBNAME -w`
db_sr_delaytime=`echo $SQL2 | psql -p $PGPORT -d $PGDBNAME -U $PGUSER -At -w`
SWITCH_COMMAND='pg_ctl promote -D /var/lib/pgsql/14/data/'

# 如果为备库,且延迟大于指定时间则切换为主库
if [ ${db_role} == 'f' ];
then
echo -e `date +"%F %T"` "Attention:The current postgreSQL DB is master database,cannot switched!" >> $LOGFILE
exit 0
fi

if [ $db_sr_delaytime -gt 0 ];
then
echo -e `date +"%F %T"` "Attention:The current master database is health,the standby DB cannot switched!" >> $LOGFILE
exit 0
fi

if [ !$db_sr_delaytime ];
then
echo -e `date +"%F %T"` "Attention:The current database is statndby,ready to switch master database!" >> $LOGFILE
su - $PG_OS_USER -c "$SWITCH_COMMAND"
elif [ $? -eq 0 ];
then
echo -e `date +"%F %T"` "success:The current standby database successed to switched the primary PG database !" >> $LOGFILE
exit 0
else
echo -e `date +"%F %T"` "Error: the standby database failed to switch the primary PG database !,pelease checked it!" >> $LOGFILE
exit 1
fi

编写fault.sh

vi /etc/keepalived/scripts/fault.sh

内容如下:

#!/bin/bash
GFILE=/etc/keepalived/scripts/pg_db_fault.log
PGPORT=5432
PGMIP=192.168.20.71
echo -e `date +"%F %T"` "Error:Because of the priamry DB happend some unknown problem,So turn off the PostgreSQL Database!" >> $LOGFILE
PGPID="`head -n1 /var/lib/pgsql/14/data/postmaster.pid`"
systemctl stop keepalived
kill -9 $PGPID
if [ $? -eq 0 ] ;
then
echo -e `date +"%F %T"` "Error:Because of the priamry DB happend some unknown problem,So turn off the PostgreSQL Database!" >> $LOGFILE
systemctl stop keepalived
exit 1
fi

Keepalived启动、停止和状态查看

systemctl  start keepalived
systemctl  stop keepalived
systemctl  status keepalived

3.测试

在主机上停掉数据库

[root@localhost ~]# su - postgres
[postgres@localhost ~]$ pg_ctl -D /var/lib/pgsql/14/data/ stop

到之前的从机上看看能否插入数据,如果显示insert 0 1 说明插入成功,说明主机切换成功。

[postgres@localhost ~]$ psql
psql (14.2)
Type "help" for help.
postgres=# insert into test(name) values (2);
INSERT 0 1

然后去老主机使用pg_rewind将现在主机的内容进行同步,具体命令如下:

pg_rewind -R --target-pgdata '/var/lib/pgsql/14/data/' --source-server 'host=192.168.20.75 port=5432 user=repl password=sw12345 dbname=postgres'

最后重新启动从库,并检查主备wal sender和receiver是否正常:

[postgres@localhost ~]$ pg_ctl -D /var/lib/pgsql/14/data/ start
[postgres@localhost ~]$ ps axu |grep walreceiver

由于postgresql12版本后的调整,需要在从库新增standby.signal文件

touch /var/lib/pgsql/14/data/standby.signal

再通过psql登录到数据库,查看主从情况:

select pg_is_in_recovery();

至此,整个PostgreSQL+Keepalived集群配置配置完成。

  • 1
    点赞
  • 7
    收藏
    觉得还不错? 一键收藏
  • 0
    评论

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值