1. Redis Sentinel
1.1 Introduction to Redis clustering
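A master-slave architecture cannot switch the master and slave roles automatically. When the master becomes unusable because the Redis service fails, the host loses power, a disk is damaged, or similar, master-slave replication alone cannot perform automatic failover (promoting a slave to the new master); the environment configuration has to be changed by hand to switch over to a slave Redis server. In addition, when a single Redis server can no longer meet the write load of the business, the parallel write capacity of the Redis service cannot be scaled horizontally.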
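Two core problems therefore need to be solved:
- Seamless switching of the master and slave roles, so that the business does not notice the switch and is not affected
- Dynamic horizontal scaling of Redis servers, so that multiple servers accept writes in parallel and higher concurrency is achieved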
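Ways to implement Redis clustering:
- Client-side sharding: the application decides which Redis server each key is sent to
- Proxy sharding: a proxy decides which Redis server each key is sent to; proxy programs include codis, twemproxy, and others
- Redis Cluster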
2 How Sentinel works
2.1 Sentinel architecture and failover
Sentinel architecture
Sentinel failover
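- 1. Several sentinels detect and confirm that the master has a problem
- 2. One sentinel is elected as leader
- 3. One slave is chosen as the new master
- 4. The remaining slaves are told to become slaves of the new master
- 5. Clients are notified of the master-slave change
- 6. When the old master recovers, it becomes a slave of the new master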
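The Sentinel process monitors the state of the Master server in a Redis cluster. When the Master fails, it switches the Master and Slave roles so that the system stays highly available. This feature was introduced in Redis 2.6+, and Sentinel mode became stable from Redis 2.8 onward, so production environments should run Redis 2.8 or later.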
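Sentinel is a distributed system: several Sentinel processes can run in one architecture. These processes use gossip protocols to receive information about whether the Master is down, and agreement protocols (voting) to decide whether to perform an automatic failover and which Slave to choose as the new Master.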
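Each Sentinel process periodically sends messages to the other Sentinels, the Master, and the Slaves to confirm that they are alive. If a peer does not reply within the configured time (this value is configurable), the Sentinel provisionally treats it as offline. This is the so-called "subjectively down" state (subjective: a judgment that each member makes on its own, which may or may not agree with the others), abbreviated SDOWN.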
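The counterpart of subjectively down is objectively down. When enough Sentinel processes in the Sentinel group (at least the configured quorum) have each judged the Master to be SDOWN, and have exchanged that view with one another through the SENTINEL is-master-down-by-addr command, the resulting conclusion that the Master is offline is called "objectively down" (objective: something that holds regardless of any single member's view), abbreviated ODOWN.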
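The step from SDOWN to ODOWN can be sketched as a simple count. This is only an illustration under stated assumptions: the function name `is_odown` and its arguments are hypothetical, and real Sentinels collect each other's views over the network rather than receiving them as a list of flags.

```shell
# Hypothetical helper: decide ODOWN from per-sentinel SDOWN flags.
# $1 = quorum, remaining args = one flag per sentinel
# (1 = that sentinel considers the master SDOWN, 0 = it does not)
is_odown() {
    quorum=$1
    shift
    votes=0
    for flag in "$@"; do
        if [ "$flag" -eq 1 ]; then
            votes=$((votes + 1))
        fi
    done
    # ODOWN is reached once at least `quorum` sentinels report SDOWN
    if [ "$votes" -ge "$quorum" ]; then
        echo "odown"
    else
        echo "not-odown"
    fi
}

is_odown 2 1 0 0   # one sentinel alone: not-odown
is_odown 2 1 1 0   # two of three agree: odown
```

With a quorum of 2, a single Sentinel that loses its own network path to the Master cannot trigger a failover by itself.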
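Through a voting algorithm, one of the remaining slave nodes is selected and promoted to Master, and the related configuration is rewritten automatically as part of the failover.
Sentinel solves the automatic switching of the master and slave roles, but it does not solve the performance bottleneck of a single Master. It is comparable to MHA in MySQL.
A Redis Sentinel deployment should have at least 3 Sentinel nodes, preferably an odd number.
At initialization the client connects to the set of Sentinel nodes rather than to a specific Redis node, but Sentinel is only a configuration center, not a proxy.
The Redis data nodes in a Sentinel deployment are ordinary Redis instances; read/write splitting has to be implemented by the client program.
Before Redis 3.0, production environments generally used Sentinel mode. Redis 3.0 introduced Redis Cluster, which can support larger production environments.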
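2.2 The three periodic tasks in Sentinel
- Every 10 seconds, each sentinel runs info against the master and the slaves
  - Discover slave nodes
  - Confirm the master-slave relationship
- Every 2 seconds, each sentinel exchanges information through a channel on the master node (pub/sub)
  - The exchange uses the __sentinel__:hello channel
  - Sentinels share their "view" of the nodes and their own information
- Every 1 second, each sentinel pings the other sentinels and the Redis nodes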
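As a rough illustration of how the three intervals interleave, the sketch below lists which tasks are due at a given second. The helper name `due_tasks` is hypothetical, and it assumes all three timers start together at second 0, which a real Sentinel does not guarantee.

```shell
# Hypothetical helper: list which of the three periodic tasks fire at second $1,
# assuming all timers started together at second 0.
due_tasks() {
    t=$1
    out="ping"                       # every 1 second
    if [ $((t % 2)) -eq 0 ]; then
        out="$out hello"             # every 2 seconds (pub/sub on __sentinel__:hello)
    fi
    if [ $((t % 10)) -eq 0 ]; then
        out="$out info"              # every 10 seconds
    fi
    echo "$out"
}

for t in 1 2 10; do
    echo "t=$t: $(due_tasks "$t")"
done
```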
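2.3 Implementing Sentinel
2.3.1 Preparation: setting up master-slave replication
Sentinel requires a working Redis master-slave replication environment. Here we build a Sentinel-based, highly available Redis architecture with one master and two slaves.
Note: masterauth in the master's configuration file must be set to the same value as on the slaves, because the master may itself become a slave after a failover.
Example: preparing the master-slave environment
Run on all master and slave nodes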
mkdir -p /apps/redis/{run,data,logs,conf}
# Redis configuration
cat >/apps/redis/conf/redis.conf <<EOF
bind 0.0.0.0
protected-mode no
daemonize no
loglevel notice
port 6379
tcp-backlog 2048
databases 16
dir /data
masterauth "123456"
requirepass "123456"
maxclients 10000
timeout 600
tcp-keepalive 600
slowlog-log-slower-than 1000
slowlog-max-len 128
maxmemory 1gb
maxmemory-policy allkeys-lru
dbfilename dump.rdb
save 900 1
save 300 10
save 60 10000
stop-writes-on-bgsave-error yes
rdbcompression yes
rdbchecksum yes
EOF
# Install docker-ce
yum -y install docker-ce docker-ce-cli containerd.io
systemctl enable --now docker
systemctl status docker
# Speed up Docker image pulls with a registry mirror
cat > /etc/docker/daemon.json <<EOF
{
"registry-mirrors":["https://0cde955d3600f3000fe5c004160e0320.mirror.swr.myhuaweicloud.com"]
}
EOF
# Restart Docker so that the registry mirror setting takes effect
systemctl restart docker
# Run the container (note: with --net host, Docker ignores the -p port mapping)
docker run \
--privileged \
-p 6379:6379 \
--name redis-node \
-v /apps/redis/data:/data \
-v /apps/redis/conf/redis.conf:/etc/redis/redis.conf \
--net host \
-d redis:5.0 \
redis-server /etc/redis/redis.conf
# Change the owner and group of the configuration file to redis
docker exec -it redis-node /bin/bash -c "chown redis:redis /etc/redis/redis.conf"
# Confirm that redis.conf and the data directory are both owned by redis
docker exec -it redis-node /bin/bash -c "ls -ltr /etc/redis/redis.conf"
docker exec -it redis-node /bin/bash -c "ls -ltr /data"
Run on all slave nodes
echo "replicaof 172.18.8.17 6379" >> /apps/redis/conf/redis.conf
docker restart redis-node
Master server status
[root@client overlay2]# docker exec -it redis-node /bin/bash
root@client:/data# redis-cli
127.0.0.1:6379> auth 123456
OK
127.0.0.1:6379> info replication
# Replication
role:master
connected_slaves:2
slave0:ip=172.18.8.27,port=6379,state=online,offset=28,lag=0
slave1:ip=172.18.8.37,port=6379,state=online,offset=28,lag=1
master_replid:e983cafad389ff5bab0b141023def02bd752db6f
master_replid2:0000000000000000000000000000000000000000
master_repl_offset:28
second_repl_offset:-1
repl_backlog_active:1
repl_backlog_size:1048576
repl_backlog_first_byte_offset:1
repl_backlog_histlen:28
127.0.0.1:6379>
Slave-1
[root@redis-slave-1 ~]# docker exec -it redis-node /bin/bash
root@redis-slave-1:/data# redis-cli
127.0.0.1:6379> auth 123456
OK
127.0.0.1:6379> info replication
# Replication
role:slave
master_host:172.18.8.17
master_port:6379
master_link_status:up
master_last_io_seconds_ago:9
master_sync_in_progress:0
slave_repl_offset:126
slave_priority:100
slave_read_only:1
connected_slaves:0
master_replid:e983cafad389ff5bab0b141023def02bd752db6f
master_replid2:0000000000000000000000000000000000000000
master_repl_offset:126
second_repl_offset:-1
repl_backlog_active:1
repl_backlog_size:1048576
repl_backlog_first_byte_offset:1
repl_backlog_histlen:126
127.0.0.1:6379>
Slave-2
[root@redis-slave-2 ~]# docker exec -it redis-node /bin/bash
root@redis-slave-2:/data# redis-cli
127.0.0.1:6379> auth 123456
OK
127.0.0.1:6379> info replication
# Replication
role:slave
master_host:172.18.8.17
master_port:6379
master_link_status:up
master_last_io_seconds_ago:8
master_sync_in_progress:0
slave_repl_offset:196
slave_priority:100
slave_read_only:1
connected_slaves:0
master_replid:e983cafad389ff5bab0b141023def02bd752db6f
master_replid2:0000000000000000000000000000000000000000
master_repl_offset:196
second_repl_offset:-1
repl_backlog_active:1
repl_backlog_size:1048576
repl_backlog_first_byte_offset:1
repl_backlog_histlen:196
127.0.0.1:6379>
Master-slave test
# Write data on the master
127.0.0.1:6379> set city beijing
OK
127.0.0.1:6379> get city
"beijing"
127.0.0.1:6379>
# Read the data on a slave
127.0.0.1:6379> get city
"beijing"
127.0.0.1:6379>
# Only the master accepts writes; a write on a slave fails with an error
127.0.0.1:6379> set port 80
(error) READONLY You can't write against a read only replica.
127.0.0.1:6379>
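2.3.2 Editing the Sentinel configuration file
Sentinel configuration
Sentinel is in fact a special Redis server: it supports some Redis commands but not many others. By default it listens on port 26379/tcp.
Sentinel does not have to run on the same hosts as the Redis servers, but it is usually deployed alongside them to save cost.
All Redis nodes use the same configuration file, shown in the example below
# For a source build, sentinel.conf is in the source directory; copy it to the installation directory,
# for example /apps/redis/etc/sentinel.conf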
cat /apps/redis/conf/redis-sentinel.conf
bind 0.0.0.0
port 26379
daemonize yes
pidfile "redis-sentinel.pid"
# Working directory
dir "/tmp"
sentinel monitor mymaster 172.18.8.17 6379 2
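# mymaster is the name of the cluster; this line gives the address and port of the master server in the mymaster cluster
# 2 is the quorum: the number of sentinels that must consider the master down before it is marked objectively down (ODOWN) and a failover is attempted
# It is normally set to more than half of all sentinel nodes (the total is usually an odd number >= 3, such as 3, 5, or 7); for example, with 3 sentinels, 3/2 = 1.5, rounded up to 2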
sentinel auth-pass mymaster 123456
# Password of the master in the mymaster cluster; note that this line must come after the sentinel monitor line above
sentinel down-after-milliseconds mymaster 30000
# (SDOWN) Time after which any node in the mymaster cluster is considered subjectively down, in milliseconds; 3000 is suggested
sentinel parallel-syncs mymaster 1
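# After a failover, the number of slaves that may sync from the new master at the same time; a smaller number makes the total sync time longer but reduces the load on the new master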
sentinel failover-timeout mymaster 180000
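# Timeout for pointing all slaves at the new master, in milliseconds
# Forbid changing the notification and reconfiguration scripts at runtime
sentinel deny-scripts-reconfig yes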
logfile /var/log/redis/sentinel.log
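The quorum arithmetic described in the comments above can be checked with a few lines of shell. This is a minimal sketch; the helper name `majority` is hypothetical and it only computes "more than half", which is the usual choice for the quorum rather than a value Redis enforces.

```shell
# Hypothetical helper: smallest integer strictly greater than half of N sentinels.
majority() {
    echo $(( $1 / 2 + 1 ))
}

for n in 3 5 7; do
    echo "sentinels=$n quorum=$(majority "$n")"
done
```

For the three-node setup in this article this gives 2, which matches the last argument of the sentinel monitor line.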
All three Sentinel servers use the following configuration
[root@client ~]# cat /apps/redis/conf/redis-sentinel.conf
bind 0.0.0.0
port 26379
daemonize no
pidfile "/var/run/redis-sentinel.pid"
logfile "/var/log/sentinel_26379.log"
dir "/tmp"
sentinel monitor mymaster 172.18.8.17 6379 2 # modify this line
sentinel auth-pass mymaster 123456 # add this line
sentinel down-after-milliseconds mymaster 30000 # modify this line
sentinel parallel-syncs mymaster 1
sentinel failover-timeout mymaster 180000
sentinel deny-scripts-reconfig yes
[root@client ~]#
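# The following content is generated automatically and does not need to be modified
sentinel myid bc719e80d29a4c3c1be4c33ec825cf7827aed92b
# This line is generated automatically and must be unique; changing it requires restarting the redis and sentinel services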
.....
# Generated by CONFIG REWRITE
protected-mode no
supervised systemd
sentinel leader-epoch mymaster 0
sentinel known-replica mymaster 172.18.8.27 6379
sentinel known-replica mymaster 172.18.8.37 6379
sentinel current-epoch 0
[root@redis-master ~]#scp /apps/redis/conf/redis-sentinel.conf redis-slave-1:/apps/redis/conf/
[root@redis-master ~]#scp /apps/redis/conf/redis-sentinel.conf redis-slave-2:/apps/redis/conf/
2.3.3 Starting Sentinel
Start Sentinel on all three servers
docker run \
--privileged \
--name redis-sentinel \
-p 26379:26379 \
-v /apps/redis/conf/redis-sentinel.conf:/etc/redis/redis-sentinel.conf \
--net host \
-d redis:5.0 \
redis-sentinel /etc/redis/redis-sentinel.conf
Make sure the myid of each Sentinel host is different
# Master
[root@client ~]# docker exec -it redis-sentinel /bin/bash -c "cat /etc/redis/redis-sentinel.conf"
bind 0.0.0.0
port 26379
daemonize no
pidfile "/var/run/redis-sentinel.pid"
logfile "/var/log/sentinel_26379.log"
dir "/tmp"
sentinel myid bc719e80d29a4c3c1be4c33ec825cf7827aed92b
sentinel deny-scripts-reconfig yes
sentinel monitor mymaster 172.18.8.17 6379 2
sentinel auth-pass mymaster 123456
sentinel config-epoch mymaster 0
sentinel leader-epoch mymaster 0
# Generated by CONFIG REWRITE
sentinel known-replica mymaster 172.18.8.27 6379
sentinel known-replica mymaster 172.18.8.37 6379
sentinel current-epoch 0
[root@client ~]#
# Slave-1
[root@redis-slave-1 ~]# docker exec -it redis-sentinel /bin/bash -c "cat /etc/redis/redis-sentinel.conf"
bind 0.0.0.0
port 26379
daemonize no
pidfile "/var/run/redis-sentinel.pid"
logfile "/var/log/sentinel_26379.log"
dir "/tmp"
sentinel myid fdd64a65231a156179e0b8518015ec65c0d0f771
sentinel deny-scripts-reconfig yes
sentinel monitor mymaster 172.18.8.17 6379 2
sentinel auth-pass mymaster 123456
sentinel config-epoch mymaster 0
sentinel leader-epoch mymaster 0
# Generated by CONFIG REWRITE
sentinel known-replica mymaster 172.18.8.37 6379
sentinel known-replica mymaster 172.18.8.27 6379
sentinel known-sentinel mymaster 172.18.8.37 26379 0587279615db5680b6606d4d18952fad96a3fbe7
sentinel known-sentinel mymaster 172.18.8.17 26379 bc719e80d29a4c3c1be4c33ec825cf7827aed92b
sentinel current-epoch 0
[root@redis-slave-1 ~]#
# Slave-2
[root@redis-slave-2 ~]# docker exec -it redis-sentinel /bin/bash -c "cat /etc/redis/redis-sentinel.conf"
bind 0.0.0.0
port 26379
daemonize no
pidfile "/var/run/redis-sentinel.pid"
logfile "/var/log/sentinel_26379.log"
dir "/tmp"
sentinel myid 0587279615db5680b6606d4d18952fad96a3fbe7
sentinel deny-scripts-reconfig yes
sentinel monitor mymaster 172.18.8.17 6379 2
sentinel auth-pass mymaster 123456
sentinel config-epoch mymaster 0
sentinel leader-epoch mymaster 0
# Generated by CONFIG REWRITE
sentinel known-replica mymaster 172.18.8.37 6379
sentinel known-replica mymaster 172.18.8.27 6379
sentinel known-sentinel mymaster 172.18.8.17 26379 bc719e80d29a4c3c1be4c33ec825cf7827aed92b
sentinel known-sentinel mymaster 172.18.8.27 26379 fdd64a65231a156179e0b8518015ec65c0d0f771
sentinel current-epoch 0
[root@redis-slave-2 ~]#
2.3.4 Checking the ports
[root@redis-slave-1 ~]# ss -tnl
State Recv-Q Send-Q Local Address:Port Peer Address:Port
LISTEN 0 128 *:111 *:*
LISTEN 0 5 192.168.1.1:53 *:*
LISTEN 0 128 *:22 *:*
LISTEN 0 100 127.0.0.1:25 *:*
LISTEN 0 128 *:26379 *:*
LISTEN 0 128 *:6379 *:*
LISTEN 0 128 [::]:111 [::]:*
LISTEN 0 128 [::]:22 [::]:*
LISTEN 0 100 [::1]:25 [::]:*
[root@redis-slave-1 ~]#
2.3.5 Viewing the Sentinel logs
Sentinel log on the master
[root@client ~]# docker exec -it redis-sentinel /bin/bash -c "tail -f /var/log/sentinel_26379.log"
1:X 06 Jul 2021 11:06:30.345 # Redis version=5.0.12, bits=64, commit=00000000, modified=0, pid=1, just started
1:X 06 Jul 2021 11:06:30.345 # Configuration loaded
1:X 06 Jul 2021 11:06:30.348 * Running mode=sentinel, port=26379.
1:X 06 Jul 2021 11:06:30.348 # WARNING: The TCP backlog setting of 511 cannot be enforced because /proc/sys/net/core/somaxconn is set to the lower value of 128.
1:X 06 Jul 2021 11:06:30.351 # Sentinel ID is bc719e80d29a4c3c1be4c33ec825cf7827aed92b
1:X 06 Jul 2021 11:06:30.351 # +monitor master mymaster 172.18.8.17 6379 quorum 2
1:X 06 Jul 2021 11:06:30.352 * +slave slave 172.18.8.37:6379 172.18.8.37 6379 @ mymaster 172.18.8.17 6379
1:X 06 Jul 2021 11:06:30.353 * +slave slave 172.18.8.27:6379 172.18.8.27 6379 @ mymaster 172.18.8.17 6379
1:X 06 Jul 2021 11:59:02.363 * +sentinel sentinel fdd64a65231a156179e0b8518015ec65c0d0f771 172.18.8.27 26379 @ mymaster 172.18.8.17 6379
1:X 06 Jul 2021 11:59:05.818 * +sentinel sentinel 0587279615db5680b6606d4d18952fad96a3fbe7 172.18.8.37 26379 @ mymaster 172.18.8.17 6379
Sentinel logs on the slaves
# Slave-1
[root@redis-slave-1 ~]# docker exec -it redis-sentinel /bin/bash -c "tail -f /var/log/sentinel_26379.log"
1:X 06 Jul 2021 11:59:00.331 # Redis version=5.0.12, bits=64, commit=00000000, modified=0, pid=1, just started
1:X 06 Jul 2021 11:59:00.331 # Configuration loaded
1:X 06 Jul 2021 11:59:00.335 * Running mode=sentinel, port=26379.
1:X 06 Jul 2021 11:59:00.335 # WARNING: The TCP backlog setting of 511 cannot be enforced because /proc/sys/net/core/somaxconn is set to the lower value of 128.
1:X 06 Jul 2021 11:59:00.337 # Sentinel ID is fdd64a65231a156179e0b8518015ec65c0d0f771
1:X 06 Jul 2021 11:59:00.337 # +monitor master mymaster 172.18.8.17 6379 quorum 2
1:X 06 Jul 2021 11:59:00.339 * +slave slave 172.18.8.37:6379 172.18.8.37 6379 @ mymaster 172.18.8.17 6379
1:X 06 Jul 2021 11:59:00.340 * +slave slave 172.18.8.27:6379 172.18.8.27 6379 @ mymaster 172.18.8.17 6379
1:X 06 Jul 2021 11:59:00.527 * +sentinel sentinel bc719e80d29a4c3c1be4c33ec825cf7827aed92b 172.18.8.17 26379 @ mymaster 172.18.8.17 6379
1:X 06 Jul 2021 11:59:05.820 * +sentinel sentinel 0587279615db5680b6606d4d18952fad96a3fbe7 172.18.8.37 26379 @ mymaster 172.18.8.17 6379
# Slave-2
[root@redis-slave-2 ~]# docker exec -it redis-sentinel /bin/bash -c "tail -f /var/log/sentinel_26379.log"
1:X 06 Jul 2021 11:59:03.795 # Redis version=5.0.12, bits=64, commit=00000000, modified=0, pid=1, just started
1:X 06 Jul 2021 11:59:03.795 # Configuration loaded
1:X 06 Jul 2021 11:59:03.799 * Running mode=sentinel, port=26379.
1:X 06 Jul 2021 11:59:03.799 # WARNING: The TCP backlog setting of 511 cannot be enforced because /proc/sys/net/core/somaxconn is set to the lower value of 128.
1:X 06 Jul 2021 11:59:03.801 # Sentinel ID is 0587279615db5680b6606d4d18952fad96a3fbe7
1:X 06 Jul 2021 11:59:03.801 # +monitor master mymaster 172.18.8.17 6379 quorum 2
1:X 06 Jul 2021 11:59:03.803 * +slave slave 172.18.8.37:6379 172.18.8.37 6379 @ mymaster 172.18.8.17 6379
1:X 06 Jul 2021 11:59:03.804 * +slave slave 172.18.8.27:6379 172.18.8.27 6379 @ mymaster 172.18.8.17 6379
1:X 06 Jul 2021 11:59:04.420 * +sentinel sentinel fdd64a65231a156179e0b8518015ec65c0d0f771 172.18.8.27 26379 @ mymaster 172.18.8.17 6379
1:X 06 Jul 2021 11:59:04.589 * +sentinel sentinel bc719e80d29a4c3c1be4c33ec825cf7827aed92b 172.18.8.17 26379 @ mymaster 172.18.8.17 6379
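2.3.6 Current Sentinel status
In the Sentinel status, pay particular attention to the last line: the master IP, the number of slaves, and the number of sentinels must all match the actual number of servers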
[root@redis-master ~]# docker exec -it redis-sentinel /bin/bash
root@client:/data# redis-cli -p 26379
127.0.0.1:26379> info sentinel
# Sentinel
sentinel_masters:1
sentinel_tilt:0
sentinel_running_scripts:0
sentinel_scripts_queue_length:0
sentinel_simulate_failure_flags:0
master0:name=mymaster,status=ok,address=172.18.8.17:6379,slaves=2,sentinels=3
127.0.0.1:26379>
# Two slaves and three sentinel servers; if the sentinels value does not match, check whether the myid values conflict
# Query without entering the container
[root@redis-master ~]# docker exec -it redis-sentinel /bin/bash -c "redis-cli -p 26379 info sentinel"
# Sentinel
sentinel_masters:1
sentinel_tilt:0
sentinel_running_scripts:0
sentinel_scripts_queue_length:0
sentinel_simulate_failure_flags:0
master0:name=mymaster,status=ok,address=172.18.8.17:6379,slaves=2,sentinels=3
[root@redis-master ~]#
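The check on the last line of info sentinel can be scripted. The sketch below pulls individual fields out of a master0 line; the helper name `master_field` is hypothetical, and the sample line is copied from the output above.

```shell
# Hypothetical helper: extract one field (e.g. slaves, sentinels, status) from a
# "master0:name=...,status=...,address=...,slaves=N,sentinels=M" line.
# $1 = the line, $2 = the field name
master_field() {
    line=$1
    key=$2
    # Drop the "master0:" prefix, split on commas, then print the value of key=value
    printf '%s\n' "${line#*:}" | tr ',' '\n' | sed -n "s/^${key}=//p"
}

sample='master0:name=mymaster,status=ok,address=172.18.8.17:6379,slaves=2,sentinels=3'
echo "slaves=$(master_field "$sample" slaves) sentinels=$(master_field "$sample" sentinels)"
```

In practice the line would come from `redis-cli -p 26379 info sentinel`; that output may end each line with a carriage return, which can be stripped with `tr -d '\r'` before parsing.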
2.3.7 Stopping the Redis Master node to test failover
Check the Redis Master information
[root@redis-master ~]# docker exec -it redis-node /bin/bash -c "redis-cli -a 123456 info replication"
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
# Replication
role:master
connected_slaves:2
slave0:ip=172.18.8.27,port=6379,state=online,offset=5737464,lag=0
slave1:ip=172.18.8.37,port=6379,state=online,offset=5737450,lag=0
master_replid:f355a23a3bc38fb44b5e87943716a2dbf17e0e5b
master_replid2:cf9ebd2f3a9928a4cafd4580094a1809361ea5f1
master_repl_offset:5737464
second_repl_offset:5670124
repl_backlog_active:1
repl_backlog_size:1048576
repl_backlog_first_byte_offset:4688889
repl_backlog_histlen:1048576
[root@redis-master ~]#
Stop the Redis Master node
[root@redis-master ~]# docker stop redis-node
redis-node
[root@redis-master ~]#
Check the Sentinel information on each node
[root@redis-slave-1 ~]# docker exec -it redis-sentinel /bin/bash -c "redis-cli -p 26379 info sentinel"
# Sentinel
sentinel_masters:1
sentinel_tilt:0
sentinel_running_scripts:0
sentinel_scripts_queue_length:0
sentinel_simulate_failure_flags:0
master0:name=mymaster,status=ok,address=172.18.8.27:6379,slaves=2,sentinels=3
[root@redis-slave-2 ~]# docker exec -it redis-sentinel /bin/bash -c "redis-cli -p 26379 info sentinel"
# Sentinel
sentinel_masters:1
sentinel_tilt:0
sentinel_running_scripts:0
sentinel_scripts_queue_length:0
sentinel_simulate_failure_flags:0
master0:name=mymaster,status=ok,address=172.18.8.27:6379,slaves=2,sentinels=3
[root@redis-slave-2 ~]#
Check the master-slave information
# Slave-1
[root@redis-slave-1 ~]# docker exec -it redis-node /bin/bash -c "redis-cli -a 123456 info replication"
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
# Replication
role:master
connected_slaves:1
slave0:ip=172.18.8.37,port=6379,state=online,offset=5793062,lag=1
master_replid:c30d447d0f70aaaef81242a4162bfc5582550ebb
master_replid2:f355a23a3bc38fb44b5e87943716a2dbf17e0e5b
master_repl_offset:5793062
second_repl_offset:5779667
repl_backlog_active:1
repl_backlog_size:1048576
repl_backlog_first_byte_offset:4744487
repl_backlog_histlen:1048576
[root@redis-slave-1 ~]#
# Slave-2
[root@redis-slave-2 ~]# docker exec -it redis-node /bin/bash -c "redis-cli -a 123456 info replication"
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
# Replication
role:slave
master_host:172.18.8.27
master_port:6379
master_link_status:up
master_last_io_seconds_ago:2
master_sync_in_progress:0
slave_repl_offset:5836100
slave_priority:100
slave_read_only:1
connected_slaves:0
master_replid:c30d447d0f70aaaef81242a4162bfc5582550ebb
master_replid2:f355a23a3bc38fb44b5e87943716a2dbf17e0e5b
master_repl_offset:5836100
second_repl_offset:5779667
repl_backlog_active:1
repl_backlog_size:1048576
repl_backlog_first_byte_offset:5670124
repl_backlog_histlen:165977
[root@redis-slave-2 ~]#
Sentinel log during the failover
[root@redis-master ~]# docker exec -it redis-sentinel /bin/bash -c "tail -f /var/log/sentinel_26379.log"
1:X 07 Jul 2021 09:17:54.455 # +sdown slave 172.18.8.37:6379 172.18.8.37 6379 @ mymaster 172.18.8.17 6379
1:X 07 Jul 2021 09:18:05.093 # -sdown slave 172.18.8.37:6379 172.18.8.37 6379 @ mymaster 172.18.8.17 6379
1:X 07 Jul 2021 09:26:52.101 # +sdown master mymaster 172.18.8.17 6379
1:X 07 Jul 2021 09:26:52.163 # +new-epoch 3
1:X 07 Jul 2021 09:26:52.164 # +vote-for-leader 1fa399c7ed5b655e744470d93a45c95f6e65b45f 3
1:X 07 Jul 2021 09:26:52.831 # +config-update-from sentinel 1fa399c7ed5b655e744470d93a45c95f6e65b45f 172.18.8.27 26379 @ mymaster 172.18.8.17 6379
1:X 07 Jul 2021 09:26:52.831 # +switch-master mymaster 172.18.8.17 6379 172.18.8.27 6379
1:X 07 Jul 2021 09:26:52.831 * +slave slave 172.18.8.37:6379 172.18.8.37 6379 @ mymaster 172.18.8.27 6379
1:X 07 Jul 2021 09:26:52.831 * +slave slave 172.18.8.17:6379 172.18.8.17 6379 @ mymaster 172.18.8.27 6379
1:X 07 Jul 2021 09:27:22.912 # +sdown slave 172.18.8.17:6379 172.18.8.17 6379 @ mymaster 172.18.8.27 6379
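The +switch-master line in the log above is the one that records the old and the new master address. As a small sketch, the hypothetical helper `switch_events` below filters a Sentinel log for those events; it assumes the field layout seen in the log lines above.

```shell
# Hypothetical helper: read Sentinel log lines on stdin and print
# "<name>: <old-ip>:<old-port> -> <new-ip>:<new-port>" for every +switch-master event.
switch_events() {
    awk '{
        for (i = 1; i <= NF; i++) {
            if ($i == "+switch-master") {
                # fields after the event name: <name> <old-ip> <old-port> <new-ip> <new-port>
                printf "%s: %s:%s -> %s:%s\n", $(i+1), $(i+2), $(i+3), $(i+4), $(i+5)
            }
        }
    }'
}

printf '%s\n' \
  '1:X 07 Jul 2021 09:26:52.163 # +new-epoch 3' \
  '1:X 07 Jul 2021 09:26:52.831 # +switch-master mymaster 172.18.8.17 6379 172.18.8.27 6379' \
  | switch_events
```

Run against the two sample lines, this prints `mymaster: 172.18.8.17:6379 -> 172.18.8.27:6379`.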
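2.3.8 The Redis configuration file is modified automatically after failover
After a failover, the master IP on the replicaof line in redis.conf is changed
On the machine that was promoted to master, the replicaof line is removed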
[root@redis-slave-2 ~]# grep replicaof /apps/redis/conf/redis.conf
replicaof 172.18.8.27 6379
[root@redis-slave-2 ~]#
When the original Master is started again and rejoins the Redis cluster, it becomes a Slave of the new Master and does not replace the current Master
[root@redis-master ~]# docker start redis-node
redis-node
[root@redis-master ~]# grep replicaof /apps/redis/conf/redis.conf
replicaof 172.18.8.27 6379 # Sentinel adds this line automatically and points it at the new master
# Check the status on the original master
[root@redis-master ~]# docker exec -it redis-node /bin/bash -c "redis-cli -a 123456 info replication"
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
# Replication
role:slave
master_host:172.18.8.27
master_port:6379
master_link_status:up
master_last_io_seconds_ago:1
master_sync_in_progress:0
slave_repl_offset:5931311
slave_priority:100
slave_read_only:1
connected_slaves:0
master_replid:c30d447d0f70aaaef81242a4162bfc5582550ebb
master_replid2:0000000000000000000000000000000000000000
master_repl_offset:5931311
second_repl_offset:-1
repl_backlog_active:1
repl_backlog_size:1048576
repl_backlog_first_byte_offset:5917354
repl_backlog_histlen:13958
[root@redis-master ~]#
The sentinel monitor IP in the Sentinel configuration file is changed as well
# Original Master machine
[root@redis-master ~]# grep monitor /apps/redis/conf/redis-sentinel.conf
sentinel monitor mymaster 172.18.8.27 6379 2 # this line is modified automatically
[root@redis-master ~]#
# Original Slave-1 machine
[root@redis-slave-1 ~]# grep monitor /apps/redis/conf/redis-sentinel.conf
sentinel monitor mymaster 172.18.8.27 6379 2 # this line is modified automatically
[root@redis-slave-1 ~]#
# Original Slave-2 machine
[root@redis-slave-2 ~]# grep monitor /apps/redis/conf/redis-sentinel.conf
sentinel monitor mymaster 172.18.8.27 6379 2 # this line is modified automatically
[root@redis-slave-2 ~]#
2.3.9 Current Redis status
New master status
[root@redis-slave-1 ~]# docker exec -it redis-node /bin/bash -c "redis-cli -a 123456 info replication"
Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
# Replication
role:master # promoted to master
connected_slaves:2
slave0:ip=172.18.8.37,port=6379,state=online,offset=7168881,lag=0
slave1:ip=172.18.8.17,port=6379,state=online,offset=7169155,lag=0
master_replid:c30d447d0f70aaaef81242a4162bfc5582550ebb
master_replid2:f355a23a3bc38fb44b5e87943716a2dbf17e0e5b
master_repl_offset:7169155
second_repl_offset:5779667
repl_backlog_active:1
repl_backlog_size:1048576
repl_backlog_first_byte_offset:6120580
repl_backlog_histlen:1048576
[root@redis-slave-1 ~]#
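2.3.10 Sentinel operations
Manually trigger a failover of the master node
sentinel failover <masterName>
Example: manual failover
[root@centos8 ~]#vim /etc/redis.conf
replica-priority 10 # Priority: Sentinel prefers the replica with the lowest value as the new master; the default is 100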
[root@centos8 ~]#redis-cli -p 26379
127.0.0.1:26379> sentinel failover mymaster
OK
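How replica-priority influences the choice can be sketched with a sort. This is a simplified illustration, not Sentinel's actual code: the helper name `pick_replica` and the three-column input format are assumptions, and the sketch ignores further checks such as disconnected replicas and the run ID tie-breaker.

```shell
# Hypothetical helper: choose a promotion candidate from stdin lines of the form
# "<addr> <replica-priority> <repl-offset>".
pick_replica() {
    # Skip priority 0 (such a replica is never promoted), then sort by priority
    # ascending and replication offset descending; print the first address.
    awk '$2 != 0' | sort -k2,2n -k3,3nr | head -n 1 | cut -d' ' -f1
}

printf '%s\n' \
  '172.18.8.27:6379 10 5793062' \
  '172.18.8.37:6379 100 5836100' \
  | pick_replica
```

Here 172.18.8.27:6379 is selected because its priority of 10 is lower than 100, even though the other replica has the larger offset.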
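2.3.11 How applications connect to Redis
Official Redis clients: https://redis.io/clients
2.3.11.1 How a client connects through Sentinel
- The client picks one available sentinel from the Sentinel set
- Through this sentinel, it obtains the master node's information by masterName
- The client sends the role command to the returned address to confirm that it really is the master
- The client subscribes to the relevant Sentinel channels to learn about changes of the master, and automatically connects to the new master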
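The first two steps above can be sketched in shell with the SENTINEL get-master-addr-by-name command. This is a minimal sketch under assumptions: the function name `discover_master` and the `REDIS_CLI` variable are hypothetical, the role check and the channel subscription are omitted, and a real application would normally rely on the Sentinel support of its client library instead.

```shell
# REDIS_CLI is an assumption so that the command can be swapped out; it defaults to redis-cli.
REDIS_CLI=${REDIS_CLI:-redis-cli}

# Ask each sentinel in turn for the current master address.
# $1 = master name, remaining args = sentinel addresses as host:port
discover_master() {
    name=$1
    shift
    for s in "$@"; do
        host=${s%:*}
        port=${s#*:}
        # SENTINEL get-master-addr-by-name replies with two lines: ip, then port
        reply=$("$REDIS_CLI" -h "$host" -p "$port" sentinel get-master-addr-by-name "$name" 2>/dev/null) || continue
        ip=$(printf '%s\n' "$reply" | sed -n 1p)
        mport=$(printf '%s\n' "$reply" | sed -n 2p)
        if [ -n "$ip" ] && [ -n "$mport" ]; then
            echo "$ip:$mport"
            return 0
        fi
    done
    # No sentinel answered
    return 1
}

# Example call for the three sentinels used in this article:
# discover_master mymaster 172.18.8.17:26379 172.18.8.27:26379 172.18.8.37:26379
```

Trying the sentinels one after another means the lookup still works when one Sentinel host is down.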
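3 Summary
- Sentinel provides notification, master election, and monitoring
- Sentinel only performs failover for the master; it also monitors the slaves (see the +sdown slave entries in the log above) but takes no failover action for them
- When the original Master recovers and rejoins the Redis cluster, it becomes a Slave of the new Master and does not replace the current Master
- For Redis built from source or run in Docker, change the owner and group of /etc/redis/redis.conf to redis; otherwise, after a master-slave switch the command line already shows the new master while the replicaof line in the configuration file (/etc/redis/redis.conf) has not changed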