Redis high availability with Keepalived + Redis master-slave replication
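I. Background
The software used on this project runs a standalone Redis instance. The customer asked why we are still on a single node and what happens when it fails, and requested a high-availability setup.
Redis offers three ways to do this: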
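① Master-slave replication
Pros: the data is highly available, because a copy is kept on the slave.
Cons: when the master fails, there is no automatic failover to the slave.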
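② Sentinel mode
Pros: a master failure does not interrupt the business, so it is truly highly available.
Cons: the application has to connect through the Sentinel IP + port with a Sentinel-aware client, which means the current business code would need some changes.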
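③ Cluster
Pros: a three-master, three-slave layout lets you scale out the number of shards.
Cons: ① Does the application need to be configured with the IP + port of the cluster nodes? If so, the code has to be changed and re-released before it can be used. ② Redis Cluster spreads data across multiple master nodes by hash slot and only supports database 0. Because the project is built from microservices, I am not sure whether keys would collide once everything shares database 0! In addition, some commands are not supported.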
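The main point is that Sentinel and Cluster also consume more resources.
So here we use two machines as master and slave, and achieve Redis high availability with Keepalived + Redis master-slave replication.
For more on master-slave replication, Sentinel, and Cluster, see: https://www.cnblogs.com/hanease/p/15916605.html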
II. Deploying Redis
Note: installation and deployment are identical on the Master and the Slave.
① Resource planning
172.27.3.62 redis Master
172.27.3.63 redis Slave
172.27.3.66 keepalived VIP
② Install Redis
Here we install it with yum.
[root@zabbix-proxy opt]# yum -y install redis
[root@zabbix-proxy opt]# redis-server --version
Redis server v=3.2.12 sha=00000000:0 malloc=jemalloc-3.6.0 bits=64 build=7897e7d0e13773f
③ Edit the configuration file
[root@zabbix-proxy opt]# cat /etc/redis.conf |grep -v '#'|grep -v '^$'
bind 0.0.0.0
protected-mode yes
port 6379
tcp-backlog 511
timeout 0
tcp-keepalive 300
daemonize no
supervised no
pidfile /var/run/redis_6379.pid
loglevel notice
logfile /var/log/redis/redis.log
databases 96
save 900 1
save 300 10
save 60 10000
stop-writes-on-bgsave-error yes
rdbcompression yes
rdbchecksum yes
dbfilename dump.rdb
dir /var/lib/redis
masterauth lskjdlfkjaslf
slave-serve-stale-data yes
slave-read-only yes
repl-diskless-sync no
repl-diskless-sync-delay 5
repl-disable-tcp-nodelay no
slave-priority 100
requirepass lskjdlfkjaslf
appendonly no
appendfilename "appendonly.aof"
appendfsync everysec
no-appendfsync-on-rewrite no
auto-aof-rewrite-percentage 100
auto-aof-rewrite-min-size 64mb
aof-load-truncated yes
lua-time-limit 5000
slowlog-log-slower-than 10000
slowlog-max-len 128
latency-monitor-threshold 0
notify-keyspace-events ""
hash-max-ziplist-entries 512
hash-max-ziplist-value 64
list-max-ziplist-size -2
list-compress-depth 0
set-max-intset-entries 512
zset-max-ziplist-entries 128
zset-max-ziplist-value 64
hll-sparse-max-bytes 3000
activerehashing yes
client-output-buffer-limit normal 0 0 0
client-output-buffer-limit slave 256mb 64mb 60
client-output-buffer-limit pubsub 32mb 8mb 60
hz 10
aof-rewrite-incremental-fsync yes
Key settings
bind 0.0.0.0
requirepass lskjdlfkjaslf #password required from clients
masterauth lskjdlfkjaslf #replication password; set it on both master and slave, and it must equal the peer's requirepass
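Because each node's masterauth has to equal the peer's requirepass, and both nodes here share one password, a quick consistency check can catch a mismatch before replication fails. The following is only a minimal sketch: the function name `check_auth_match` is hypothetical, and it assumes each directive appears once as an uncommented line.

```shell
#!/bin/sh
# Hypothetical helper: succeeds when requirepass and masterauth in a
# redis.conf-style file carry the same non-empty value.
check_auth_match() {
    conf="$1"
    # Take the second field of the first uncommented line for each directive.
    req=$(awk '$1 == "requirepass" { print $2; exit }' "$conf")
    mas=$(awk '$1 == "masterauth"  { print $2; exit }' "$conf")
    [ -n "$req" ] && [ "$req" = "$mas" ]
}

# Usage (path as used in this article):
#   check_auth_match /etc/redis.conf && echo "auth settings match"
```

Running it on both machines before starting replication is enough, since both nodes use the same file layout here.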
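④ Configure Redis replication
There are two ways to configure replication:
1. Edit the configuration file (persistent)
On the Redis slave, add the following line to the configuration file.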
slaveof 172.27.3.62 6379
2. Set it from the command line (lost after a restart).
Run in redis-cli:
slaveof 172.27.3.62 6379
We choose the second approach.
Once Redis is installed and configured as above, enable it at boot and start it:
systemctl enable redis && systemctl start redis
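After `slaveof` has been issued, the slave's `INFO replication` output should report `role:slave` and `master_link_status:up`. As a hedged sketch, the hypothetical helper `repl_field` below extracts a single field from that output so it can be checked in a script; it assumes the plain `key:value` layout that Redis prints.

```shell
#!/bin/sh
# Hypothetical helper: read `INFO replication` text on stdin and print the
# value of one field (for example role or master_link_status).
repl_field() {
    # Redis ends each line with \r\n, so strip the carriage return first.
    tr -d '\r' | awk -F: -v key="$1" '$1 == key { print $2; exit }'
}

# Usage, assuming the address and password from this article:
#   redis-cli -h 172.27.3.63 -a lskjdlfkjaslf info replication | repl_field role
#   redis-cli -h 172.27.3.63 -a lskjdlfkjaslf info replication | repl_field master_link_status
```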
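III. Installing and deploying keepalived
Note: keepalived is deployed in non-preemptive mode here. Remember to turn off the firewall and set SELinux to disabled!
① Install keepalived (on both master and slave)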
yum -y install keepalived
mkdir -p /app/log/keepalived #create the log directory for the keepalived scripts
touch /app/log/keepalived/status #create the log file for the keepalived scripts
② keepalived configuration on the Redis Master
[root@localhost opt]# cat /etc/keepalived/keepalived.conf
! Configuration File for keepalived
global_defs {
router_id redis01
}
vrrp_script chk_redis
{
script "/etc/keepalived/script/redis_check.sh" ## checks whether the redis process exists
interval 2
timeout 2
fall 3
}
vrrp_instance redis {
state BACKUP
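interface ens192 ## change to the actual NIC name
nopreempt
virtual_router_id 60
priority 100 ## priority: the higher number wins; the master node's value is higher than the backup node's
advert_int 1
authentication { # must be identical on all nodes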
auth_type PASS
auth_pass gkogi38GIOWE8398jd
}
virtual_ipaddress {
172.27.3.66 ## change to the virtual IP actually assigned
}
track_script {
chk_redis
}
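notify_master "/etc/keepalived/script/redis_master.sh 127.0.0.1 172.27.3.63 6379 Mcloud2021" ## change 172.27.3.63 6379 to the actual peer host IP and port; change the last argument to the Redis password (the requirepass value)
notify_backup "/etc/keepalived/script/redis_backup.sh 127.0.0.1 172.27.3.63 6379 Mcloud2021" ## change 172.27.3.63 6379 to the actual peer host IP and port; change the last argument to the Redis password (the requirepass value)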
}
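Once keepalived is running with a configuration like this, the node in the MASTER state holds the virtual IP on the configured interface. As a rough sketch, the hypothetical helper `has_vip` below reads `ip -4 addr show` output and reports whether the VIP is bound locally; the interface name and address are the ones used in this article.

```shell
#!/bin/sh
# Hypothetical helper: read `ip -4 addr show <interface>` output on stdin and
# succeed when the given virtual IP is bound to the interface.
has_vip() {
    # -F treats the dots literally; the trailing slash keeps 172.27.3.6
    # from matching 172.27.3.66.
    grep -qF "inet $1/"
}

# Usage:
#   ip -4 addr show ens192 | has_vip 172.27.3.66 && echo "this node holds the VIP"
```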
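③ Redis Master check script redis_check.sh (same on master and slave)
keepalived periodically checks whether the redis process exists. If it is down, the script tries to start the service again; if it still cannot be started, the script stops the keepalived service.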
[root@zabbix-proxy script]# pwd
/etc/keepalived/script
[root@zabbix-proxy script]# cat redis_check.sh
#!/bin/bash
# Checks whether redis is listening. If not, tries to start it once;
# if it is still down afterwards, stops keepalived so the VIP moves to the peer.
LOGFILE="/app/log/keepalived/status"
echo "[CHECK]" >> $LOGFILE
date >> $LOGFILE
d=`date --date today +%Y%m%d_%H:%M:%S`
n=`netstat -lntp | grep redis | wc -l`
if [ $n -eq 0 ]; then
    systemctl start redis
    echo "Redis is Down,but I start it! Please check it" >> $LOGFILE 2>&1
    n2=`netstat -lntp | grep redis | wc -l`
    if [ $n2 -eq 0 ]; then
        echo "$d redis down,keepalived will stop" >> $LOGFILE 2>&1
        systemctl stop keepalived
    fi
else
    # The script receives no arguments and never runs PING, so only the port check result is logged.
    echo "$d Success: redis is listening" >> $LOGFILE 2>&1
fi
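The check above only looks for a listening socket, which does not prove that Redis actually answers requests. One possible alternative is a PING-based check, sketched below. The function name `redis_ping_ok` and the overridable `REDIS_CLI` variable are assumptions made for illustration, not part of the original script.

```shell
#!/bin/sh
# REDIS_CLI can be overridden so the function can be exercised without a live server.
REDIS_CLI="${REDIS_CLI:-/usr/bin/redis-cli}"

# Hypothetical helper: succeed only when the server answers PING with PONG.
# Arguments: host, port, password.
redis_ping_ok() {
    reply=$($REDIS_CLI -h "$1" -p "$2" -a "$3" PING 2>/dev/null)
    [ "$reply" = "PONG" ]
}

# Usage inside a check script (values from this article):
#   redis_ping_ok 127.0.0.1 6379 lskjdlfkjaslf || exit 1
```

A non-zero exit from a vrrp_script counts as a failed check, so after `fall` consecutive failures keepalived would treat the instance as faulty and release the VIP.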
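④ redis_master.sh, the script run on the Redis Master when keepalived enters the master state (same on master and slave)
This script does very little: when the local keepalived becomes master, it runs SLAVEOF NO ONE so the local Redis stops replicating and acts as master!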
[root@zabbix-proxy script]# cat redis_master.sh
#!/bin/bash
REDISCLI="/usr/bin/redis-cli -h $1 -p $3 -a $4"
LOGFILE="/app/log/keepalived/status"
echo "[master]" >> $LOGFILE
date >> $LOGFILE
echo "Being master...." >> $LOGFILE
echo "Run SLAVEOF NO ONE cmd ..." >> $LOGFILE
${REDISCLI} SLAVEOF NO ONE >> $LOGFILE
⑤ redis_backup.sh, the script run on the Redis Master when keepalived enters the backup state (same on master and slave)
[root@zabbix-proxy script]# cat redis_backup.sh
#!/bin/bash
REDISCLI="/usr/bin/redis-cli -h $1 -p $3 -a $4"
LOGFILE="/app/log/keepalived/status"
echo "[backup]" >> $LOGFILE
date >> $LOGFILE
echo "Run SLAVEOF cmd ..." >> $LOGFILE
$REDISCLI SLAVEOF $2 $3 >> $LOGFILE 2>&1
## Wait for the data to sync by comparing the replication offsets reported by the peer ($2).
## The loop sleeps one second per pass and gives up after 30 passes (an arbitrary limit, adjust as needed)
## so it cannot spin forever if the offsets never match; the commented-out sleep below is a cruder alternative.
tries=0
slave=""
master="unknown"
while [ "$slave" != "$master" ] && [ $tries -lt 30 ];
do
slave=$(/usr/bin/redis-cli -h $2 -p $3 -a $4 info | grep slave0 | awk -F "=" '{print $5}' | awk -F "," '{print $1}')
master=$(/usr/bin/redis-cli -h $2 -p $3 -a $4 info | grep master_repl_offset | awk -F ":" '{print $2}' | awk -F "\r" '{print $1}' )
echo $slave
echo $master
tries=$((tries+1))
sleep 1
done
#sleep 15 #delay 15 s wait data sync exchange role
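The offset comparison in this script pulls two numbers out of the master's `INFO` output: the `offset=` value on the `slave0` line and `master_repl_offset`. As a hedged sketch, the two hypothetical helpers below do the same extraction from one captured copy of the output, so both values come from the same snapshot.

```shell
#!/bin/sh
# Hypothetical helpers: both read the master's `INFO replication` text on stdin.

# Print the offset acknowledged by the first slave (the slave0 line).
slave0_offset() {
    tr -d '\r' | sed -n 's/^slave0:.*offset=\([0-9][0-9]*\).*/\1/p'
}

# Print the master's own replication offset.
master_offset() {
    tr -d '\r' | sed -n 's/^master_repl_offset:\([0-9][0-9]*\)$/\1/p'
}

# Usage, assuming the peer address and password from this article:
#   info=$(redis-cli -h 172.27.3.63 -a lskjdlfkjaslf info replication)
#   [ "$(printf '%s\n' "$info" | slave0_offset)" = "$(printf '%s\n' "$info" | master_offset)" ] \
#       && echo "slave has caught up"
```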
⑥ keepalived configuration on the Redis Slave
[root@zabbix-proxy script]# cat /etc/keepalived/keepalived.conf
! Configuration File for keepalived
global_defs {
router_id redis02
}
vrrp_script chk_redis
{
script "/etc/keepalived/script/redis_check.sh" ## checks whether the redis process exists
interval 2
timeout 2
fall 3
}
vrrp_instance redis {
state BACKUP
interface ens192 ## change to the actual NIC name
virtual_router_id 60
nopreempt
priority 90
advert_int 1
authentication { # must be identical on all nodes
auth_type PASS
auth_pass gkogi38GIOWE8398jd
}
virtual_ipaddress {
172.27.3.66 ## change to the virtual IP actually assigned
}
track_script {
chk_redis
}
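notify_master "/etc/keepalived/script/redis_master.sh 127.0.0.1 172.27.3.62 6379 Mcloud2021" ## change 172.27.3.62 6379 to the actual peer host IP and port; change the last argument to the Redis password (the requirepass value)
notify_backup "/etc/keepalived/script/redis_backup.sh 127.0.0.1 172.27.3.62 6379 Mcloud2021" ## change 172.27.3.62 6379 to the actual peer host IP and port; change the last argument to the Redis password (the requirepass value)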
}
Note: the other three scripts are identical to those on the master server.
IV. Testing Redis failover
① Check the current Redis service status and roles.
1.1 3.62
[root@localhost opt]# /usr/bin/redis-cli -h 172.27.3.62 -a lskjdlfkjaslf
172.27.3.62:6379> INFO Replication
# Replication
role:slave
master_host:172.27.3.63
master_port:6379
master_link_status:up
master_last_io_seconds_ago:7
master_sync_in_progress:0
slave_repl_offset:8695
slave_priority:100
slave_read_only:1
connected_slaves:0
master_repl_offset:0
repl_backlog_active:0
repl_backlog_size:1048576
repl_backlog_first_byte_offset:0
repl_backlog_histlen:0
172.27.3.62:6379>
1.2 3.63
[root@zabbix-proxy ~]# /usr/bin/redis-cli -h 172.27.3.63 -a lskjdlfkjaslf
172.27.3.63:6379> info replication
# Replication
role:slave
master_host:172.27.3.62
master_port:6379
master_link_status:up
master_last_io_seconds_ago:10
master_sync_in_progress:0
slave_repl_offset:71
slave_priority:100
slave_read_only:1
connected_slaves:0
master_repl_offset:0
repl_backlog_active:0
repl_backlog_size:1048576
repl_backlog_first_byte_offset:0
repl_backlog_histlen:0
172.27.3.63:6379>
② Reboot 3.62 (because this server is currently the Master)
Check the Redis status on 3.63
172.27.3.63:6379> info replication
# Replication
role:master
connected_slaves:1
slave0:ip=172.27.3.62,port=6379,state=online,offset=1,lag=0
master_repl_offset:1
repl_backlog_active:1
repl_backlog_size:1048576
repl_backlog_first_byte_offset:2
repl_backlog_histlen:0
172.27.3.63:6379>
Then check the status on 3.62
172.27.3.62:6379> info replication
# Replication
role:slave
master_host:172.27.3.63
master_port:6379
master_link_status:up
master_last_io_seconds_ago:9
master_sync_in_progress:0
slave_repl_offset:43
slave_priority:100
slave_read_only:1
connected_slaves:0
master_repl_offset:0
repl_backlog_active:0
repl_backlog_size:1048576
repl_backlog_first_byte_offset:0
repl_backlog_histlen:0
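When the Master is rebooted, the Slave becomes the Master!
After the old Master comes back up, its keepalived state is BACKUP!
Data verification is left to the reader.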
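One way to approach the data check is to write a marker key through the VIP and read it back from each node. The sketch below is hedged: the function name `verify_replication`, the key name `ha_test_key`, and the overridable `REDIS_CLI` variable are all assumptions for illustration, and the one-second pause is a guess at how long asynchronous replication needs on an idle pair.

```shell
#!/bin/sh
# REDIS_CLI can be overridden so the function can be exercised without a live server.
REDIS_CLI="${REDIS_CLI:-/usr/bin/redis-cli}"

# Hypothetical helper: write a marker key through the VIP, then read it back
# from each node and succeed only if every node returns the same value.
# Arguments: password, vip, node...
verify_replication() {
    pass="$1"; vip="$2"; shift 2
    value="failover-test-$$"
    # ha_test_key is an invented key name; pick one that cannot clash with real data.
    $REDIS_CLI -h "$vip" -a "$pass" SET ha_test_key "$value" >/dev/null 2>&1 || return 1
    sleep 1   # replication is asynchronous; allow a moment for the write to propagate
    for node in "$@"; do
        got=$($REDIS_CLI -h "$node" -a "$pass" GET ha_test_key 2>/dev/null)
        [ "$got" = "$value" ] || return 1
    done
}

# Usage with the addresses and password from this article:
#   verify_replication lskjdlfkjaslf 172.27.3.66 172.27.3.62 172.27.3.63 && echo "data is consistent"
```

Running it once before and once after rebooting the current master gives a simple before-and-after comparison.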