Part 5: Installing Redis with 1 Master, 2 Replicas, and 3 Sentinels (High Availability)

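1. Prerequisite: finish the 1-master, 2-replica setup and take a snapshot (see the earlier post "Redis: 1 master, 2 replicas completed")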

2. Configure three sentinels (160, 163, 164)

Using the sentinel on the master node as the example, copy the sentinel configuration file into the installation directory

cp /usr/local/src/redis-6.0.16/sentinel.conf /usr/local/src/redis160/bin/
vi /usr/local/src/redis160/bin/sentinel.conf
:set nu
i

17 === Enable protected mode (keep it consistent with the Redis config); uncomment the line if it is commented out

protected-mode yes

21 === Port (change it to whatever you prefer)

port 26379

26 === Run in the background as a daemon

daemonize yes

36 === Log file location (optional); putting it under the redis160 folder is recommended

logfile "/usr/local/src/redis160/sentinel_26379.log"

84 === Master host and port monitored by the sentinel; 2 sentinels must agree (the quorum) before the master is marked down

sentinel monitor mymaster 192.168.109.160 6379 2

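86 === Password the sentinel uses to authenticate with the monitored Redis servers; uncomment the line if it is commented out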

sentinel auth-pass mymaster root

125 === Time in milliseconds before an unreachable node is considered down; the default is 30 s, I changed it to 10 s

sentinel down-after-milliseconds mymaster 10000

127 === Password that clients must supply to connect to this sentinel

requirepass root
Esc
:wq
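
The 2 at the end of the sentinel monitor line is the quorum: the number of sentinels that must agree the master is unreachable before it is marked objectively down. According to the Redis Sentinel documentation, actually carrying out a failover additionally needs authorization from a majority of all sentinels. A minimal sketch of that arithmetic (sentinel_majority is a hypothetical helper name, not a Redis command):

```shell
# Majority of sentinels needed to authorize a failover: floor(n / 2) + 1.
# sentinel_majority is a hypothetical helper used only for illustration.
sentinel_majority() {
  echo $(( $1 / 2 + 1 ))
}

sentinel_majority 3   # prints 2
```

With 3 sentinels the majority and the quorum are both 2, so one sentinel can be lost and failover still works.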

Create an empty log file (you can also open it in vi and save with :wq)

touch /usr/local/src/redis160/sentinel_26379.log

Open port 26379

firewall-cmd --query-port=26379/tcp
firewall-cmd --zone=public --add-port=26379/tcp --permanent
firewall-cmd --reload

Edit the sentinel startup script

vi sentinel-start.sh
i

Script contents

#!/bin/bash
cd /usr/local/src/redis160/bin
./redis-sentinel sentinel.conf
Esc
:wq
chmod +x sentinel-start.sh

Repeat the sentinel configuration on 163 and 164 in turn
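
Only the install directory differs between the three nodes; every sentinel still monitors the same master, 192.168.109.160. As a rough sketch of that, the settings changed above could be generated per node like this. render_sentinel_conf is a hypothetical helper, and the redis163 and redis164 paths are assumed to follow the same pattern as redis160:

```shell
# Print the sentinel settings changed in this post for one node.
# $1 is the node suffix used in the directory names (160, 163, 164).
# render_sentinel_conf is a hypothetical helper, not part of Redis.
render_sentinel_conf() {
  local node="$1"
  cat <<EOF
protected-mode yes
port 26379
daemonize yes
logfile "/usr/local/src/redis${node}/sentinel_26379.log"
sentinel monitor mymaster 192.168.109.160 6379 2
sentinel auth-pass mymaster root
sentinel down-after-milliseconds mymaster 10000
requirepass root
EOF
}
```

For example, render_sentinel_conf 163 prints the values to set on 163. The output contains only the directives edited in this post, so it is best used as a checklist against the stock sentinel.conf rather than as a replacement for it.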

3. Start the sentinels and test high availability

First make sure all 3 Redis servers are running, then start the 3 sentinel services one after another

./sentinel-start.sh
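
Once the sentinels are up, each one can be asked which master it is tracking. A minimal sketch, assuming the sentinel password root and port 26379 configured above. master_addr and the REDIS_CLI variable are hypothetical names introduced here; SENTINEL get-master-addr-by-name is a standard Sentinel command:

```shell
# Path to redis-cli; override REDIS_CLI if it lives elsewhere.
REDIS_CLI="${REDIS_CLI:-/usr/local/src/redis160/bin/redis-cli}"

# Ask the sentinel on host $1 for the current master and print host:port.
# master_addr is a hypothetical helper name.
master_addr() {
  "$REDIS_CLI" -h "$1" -p 26379 -a root --no-auth-warning \
    sentinel get-master-addr-by-name mymaster |
    tr -d '\r' | paste -sd: -
}
```

Calling master_addr 192.168.109.163 prints the master address as that sentinel sees it; all three sentinels should give the same answer.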

3.1 Open the Redis client on the master

/usr/local/src/redis160/bin/redis-cli

Enter the password, then crash the server on purpose with debug segfault

auth root
debug segfault
exit

List the Redis processes on this host; only the sentinel is left

ps -ef | grep redis

3.2 Wait a while, then open the Redis client on replica 163

/usr/local/src/redis163/bin/redis-cli

Enter the password and query the replication status

auth root
info replication

3.3 Open the Redis client on replica 164

/usr/local/src/redis164/bin/redis-cli

Enter the password and query the replication status

auth root
info replication

3.4 Start the Redis service on 160 again and open the client

/usr/local/src/redis160/bin/redis-cli

Enter the password and query the replication status

auth root
info replication
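
The fields to look at in the info replication output are role, master_host, and connected_slaves. A small sketch that condenses them into one line; replication_summary is a hypothetical helper, and it strips the carriage returns that redis-cli leaves at the end of each INFO line:

```shell
# Summarize `info replication` output read from stdin.
# replication_summary is a hypothetical helper name.
replication_summary() {
  tr -d '\r' | awk -F: '
    $1 == "role"             { role = $2 }
    $1 == "master_host"      { host = $2 }
    $1 == "connected_slaves" { n = $2 }
    END {
      if (role == "master")     printf "master with %s replica(s)\n", n
      else if (role == "slave") printf "replica of %s\n", host
      else                      print "unknown role"
    }'
}
```

It can be fed directly from the client, for example /usr/local/src/redis163/bin/redis-cli -a root info replication | replication_summary.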

4. Log analysis

4.1 After all 3 sentinels have started

Log on 160:

22:46:15.667 # oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
             Redis is starting
22:46:15.667 # Redis version=6.0.16, bits=64, commit=00000000, modified=0, pid=93243, just started
             Redis version and word size
22:46:15.667 # Configuration loaded
             Configuration loaded
22:46:15.668 * Increased maximum number of open files to 10032 (it was originally set to 1024).
             Raised the maximum number of open files to 10032 (it was 1024)
22:46:15.668 * Running mode=sentinel, port=26379.
             Running in sentinel mode on port 26379
22:46:15.668 # WARNING: The TCP backlog setting of 511 cannot be enforced because /proc/sys/net/core/somaxconn is set to the lower value of 128.
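             Warning: the TCP backlog (the socket listen queue) of 511 cannot be enforced because the kernel parameter somaxconn is set to the lower value of 128
             (❁´◡`❁) To get rid of this warning: vi /etc/sysctl.conf, add net.core.somaxconn=1024, save with :wq, then run sysctl -p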
22:46:15.669 # Sentinel ID is 89d9d89c66f30370b1d045ae21cfac9ee18e8f2e
             This sentinel's ID
22:46:15.669 # +monitor master mymaster 192.168.109.160 6379 quorum 2
             Monitoring the master; quorum is 2
22:46:15.669 * +slave slave 192.168.109.163:6379 192.168.109.163 6379 @ mymaster 192.168.109.160 6379
             +slave: replica 163 @ master 160
22:46:15.670 * +slave slave 192.168.109.164:6379 192.168.109.164 6379 @ mymaster 192.168.109.160 6379
             +slave: replica 164 @ master 160
22:53:33.296 * +sentinel sentinel c5fdae5bac20a8e836e52442b5d43b03418522b0 192.168.109.163 26379 @ mymaster 192.168.109.160 6379
             +sentinel: the sentinel on 163 (c5fdae5bac20a8e836e52442b5d43b03418522b0) monitors master 160
22:58:37.751 * +sentinel sentinel 09f5f5062918b2c609124f359825290a883d1abd 192.168.109.164 26379 @ mymaster 192.168.109.160 6379
             +sentinel: the sentinel on 164 (09f5f5062918b2c609124f359825290a883d1abd) monitors master 160

Log on 163:

22:53:31.296 # oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
22:53:31.296 # Redis version=6.0.16, bits=64, commit=00000000, modified=0, pid=104788, just started
22:53:31.296 # Configuration loaded
22:53:31.297 * Increased maximum number of open files to 10032 (it was originally set to 1024).
22:53:31.297 * Running mode=sentinel, port=26379.
22:53:31.297 # WARNING: The TCP backlog setting of 511 cannot be enforced because /proc/sys/net/core/somaxconn is set to the lower value of 128.
22:53:31.299 # Sentinel ID is c5fdae5bac20a8e836e52442b5d43b03418522b0
22:53:31.299 # +monitor master mymaster 192.168.109.160 6379 quorum 2
22:53:31.300 * +slave slave 192.168.109.163:6379 192.168.109.163 6379 @ mymaster 192.168.109.160 6379
22:53:31.301 * +slave slave 192.168.109.164:6379 192.168.109.164 6379 @ mymaster 192.168.109.160 6379
22:53:31.953 * +sentinel sentinel 89d9d89c66f30370b1d045ae21cfac9ee18e8f2e 192.168.109.160 26379 @ mymaster 192.168.109.160 6379
             +sentinel: the sentinel on 160 (89d9d89c66f30370b1d045ae21cfac9ee18e8f2e) monitors master 160
22:58:37.754 * +sentinel sentinel 09f5f5062918b2c609124f359825290a883d1abd 192.168.109.164 26379 @ mymaster 192.168.109.160 6379
             +sentinel: the sentinel on 164 (09f5f5062918b2c609124f359825290a883d1abd) monitors master 160

Log on 164:

22:58:35.682 # oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
22:58:35.682 # Redis version=6.0.16, bits=64, commit=00000000, modified=0, pid=113039, just started
22:58:35.682 # Configuration loaded
22:58:35.683 * Increased maximum number of open files to 10032 (it was originally set to 1024).
22:58:35.683 * Running mode=sentinel, port=26379.
22:58:35.683 # WARNING: The TCP backlog setting of 511 cannot be enforced because /proc/sys/net/core/somaxconn is set to the lower value of 128.
22:58:35.702 # Sentinel ID is 09f5f5062918b2c609124f359825290a883d1abd
22:58:35.702 # +monitor master mymaster 192.168.109.160 6379 quorum 2
22:58:35.706 * +slave slave 192.168.109.163:6379 192.168.109.163 6379 @ mymaster 192.168.109.160 6379
22:58:35.711 * +slave slave 192.168.109.164:6379 192.168.109.164 6379 @ mymaster 192.168.109.160 6379
22:58:36.850 * +sentinel sentinel c5fdae5bac20a8e836e52442b5d43b03418522b0 192.168.109.163 26379 @ mymaster 192.168.109.160 6379
             +sentinel: the sentinel on 163 (c5fdae5bac20a8e836e52442b5d43b03418522b0) monitors master 160
22:58:37.571 * +sentinel sentinel 89d9d89c66f30370b1d045ae21cfac9ee18e8f2e 192.168.109.160 26379 @ mymaster 192.168.109.160 6379
             +sentinel: the sentinel on 160 (89d9d89c66f30370b1d045ae21cfac9ee18e8f2e) monitors master 160

4.2 After master node 160 goes down

Log on 160:

23:02:42.470 # +sdown master mymaster 192.168.109.160 6379
             Subjectively down (sdown): this sentinel considers the 160 server down
164    =     at 23:02:42.603 marks the 160 server objectively down (odown)
23:02:42.606 # +new-epoch 1
             Move to a new epoch (configuration version number)
164    =     at 23:02:42.603 attempts a failover and votes for itself as leader
23:02:42.607 # +vote-for-leader 09f5f5062918b2c609124f359825290a883d1abd 1
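             Vote for the 164 sentinel as the failover leader
164    =     the 164 server is chosen as the new master, and the 164 sentinel tells the remaining replicas to follow it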
23:02:42.973 # +config-update-from sentinel 09f5f5062918b2c609124f359825290a883d1abd 192.168.109.164 26379 @ mymaster 192.168.109.160 6379
             Received a configuration update from the 164 sentinel
23:02:42.973 # +switch-master mymaster 192.168.109.160 6379 192.168.109.164 6379
23:02:42.973 * +slave slave 192.168.109.163:6379 192.168.109.163 6379 @ mymaster 192.168.109.164 6379
23:02:42.973 * +slave slave 192.168.109.160:6379 192.168.109.160 6379 @ mymaster 192.168.109.164 6379
             Switch the master to 164 and register 2 replicas, 163 and 160
23:02:53.017 # +sdown slave 192.168.109.160:6379 192.168.109.160 6379 @ mymaster 192.168.109.164 6379
             Detected that the 160 server is down

Log on 163 (see the notes on the 160 log):

23:02:42.571 # +sdown master mymaster 192.168.109.160 6379
23:02:42.609 # +new-epoch 1
23:02:42.610 # +vote-for-leader 09f5f5062918b2c609124f359825290a883d1abd 1
23:02:42.662 # +odown master mymaster 192.168.109.160 6379 #quorum 3/2
             Objectively down (odown): 3 sentinels now report 160 as down, which exceeds the quorum of 2
23:02:42.662 # Next failover delay: I will not start a failover before Wed Nov  3 23:08:43 2021
             Next failover delay: I will not start a failover before 23:08:43
23:02:42.978 # +config-update-from sentinel 09f5f5062918b2c609124f359825290a883d1abd 192.168.109.164 26379 @ mymaster 192.168.109.160 6379
23:02:42.978 # +switch-master mymaster 192.168.109.160 6379 192.168.109.164 6379
23:02:42.978 * +slave slave 192.168.109.163:6379 192.168.109.163 6379 @ mymaster 192.168.109.164 6379
23:02:42.978 * +slave slave 192.168.109.160:6379 192.168.109.160 6379 @ mymaster 192.168.109.164 6379
23:02:52.992 # +sdown slave 192.168.109.160:6379 192.168.109.160 6379 @ mymaster 192.168.109.164 6379
             Detected that the 160 server is down

I have not read the Redis source, so here is my guess about the printed message Next failover delay: I will not start a failover before xxx

160 = 23:02:42.470 # +sdown master mymaster 192.168.109.160 6379
164 = 23:02:42.527 # +sdown master mymaster 192.168.109.160 6379
163 = 23:02:42.571 # +sdown master mymaster 192.168.109.160 6379
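Any sentinel can mark the master objectively down, and voting runs in parallel with that, but only one sentinel should act on it. My guess is that Redis
makes the sentinel whose count first reaches 2 (the quorum value) the failover leader. The 163 sentinel marked the master down as well, saw that the count
was already 3, refused to become leader, and pushed back the earliest time it may start its own failover by 6 minutes, so that it does not step out of line
(judging by the timestamps, the count reached 2 when 164 marked the master subjectively down, so 164 became the leader).
For comparison, the Redis Sentinel documentation describes this delay as 2 x failover-timeout (180000 ms by default, hence 6 minutes): a sentinel that has
voted for another sentinel waits that long before trying to fail over the same master itself.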
Think: in a high-concurrency flash sale, the stock check should be number <= X, not number == X
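
As a quick check of the 6-minute figure, the sketch below adds twice the failover timeout to the time of the vote. It assumes the default failover-timeout of 180000 ms (this post never changes it) and GNU date; failover_not_before is a hypothetical helper name:

```shell
# Earliest time a sentinel that voted for another leader may start its own
# failover, assuming the delay is 2 * failover-timeout. Needs GNU date.
# Times are handled as UTC only to keep time zones out of the arithmetic.
failover_not_before() {
  local voted_at="$1" timeout_ms="$2"
  local start
  start=$(date -u -d "$voted_at" +%s)
  date -u -d "@$(( start + 2 * timeout_ms / 1000 ))" +%T
}

failover_not_before "2021-11-03 23:02:42" 180000   # prints 23:08:42
```

The 163 log shows 23:08:43, one second later; the vote was logged at 23:02:42.610, so the difference presumably comes from the milliseconds plus any small offset Sentinel adds.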

Log on 164:

23:02:42.527 # +sdown master mymaster 192.168.109.160 6379
23:02:42.603 # +odown master mymaster 192.168.109.160 6379 #quorum 2/2
             The sentinel marks the 160 server objectively down; the quorum is met
23:02:42.603 # +new-epoch 1
             Move to a new epoch (configuration version number)
23:02:42.603 # +try-failover master mymaster 192.168.109.160 6379
             The 164 sentinel attempts a failover of the 160 server; voting begins
23:02:42.604 # +vote-for-leader 09f5f5062918b2c609124f359825290a883d1abd 1
             The 164 sentinel votes for itself as the failover leader
23:02:42.606 # c5fdae5bac20a8e836e52442b5d43b03418522b0 voted for 09f5f5062918b2c609124f359825290a883d1abd 1
             The 163 sentinel voted for the 164 sentinel
23:02:42.607 # 89d9d89c66f30370b1d045ae21cfac9ee18e8f2e voted for 09f5f5062918b2c609124f359825290a883d1abd 1
             The 160 sentinel voted for the 164 sentinel
23:02:42.671 # +elected-leader master mymaster 192.168.109.160 6379
             The 164 sentinel has won the election and, as leader, starts choosing a new master
23:02:42.671 # +failover-state-select-slave master mymaster 192.168.109.160 6379
             Look through the replicas of the old master 160 for one to promote
23:02:42.724 # +selected-slave slave 192.168.109.164:6379 192.168.109.164 6379 @ mymaster 192.168.109.160 6379
             Select the 164 server as the new master
23:02:42.724 * +failover-state-send-slaveof-noone slave 192.168.109.164:6379 192.168.109.164 6379 @ mymaster 192.168.109.160 6379
             The sentinel sends the slaveof no one command to the 164 server
23:02:42.802 * +failover-state-wait-promotion slave 192.168.109.164:6379 192.168.109.164 6379 @ mymaster 192.168.109.160 6379
             Wait for the 164 server to report that it has taken the master role
23:02:42.917 # +promoted-slave slave 192.168.109.164:6379 192.168.109.164 6379 @ mymaster 192.168.109.160 6379
             The 164 server is confirmed as promoted to master
23:02:42.917 # +failover-state-reconf-slaves master mymaster 192.168.109.160 6379
             Start reconfiguring all remaining replicas
23:02:42.972 * +slave-reconf-sent slave 192.168.109.163:6379 192.168.109.163 6379 @ mymaster 192.168.109.160 6379
             Tell the 163 server to follow the new master
23:02:43.719 # -odown master mymaster 192.168.109.160 6379
             The 160 master leaves the objectively-down state (-odown); because this sentinel is the leader, it prints no Next failover delay as 163 did
23:02:43.934 * +slave-reconf-inprog slave 192.168.109.163:6379 192.168.109.163 6379 @ mymaster 192.168.109.160 6379
             The 163 server is applying the new configuration
23:02:43.934 * +slave-reconf-done slave 192.168.109.163:6379 192.168.109.163 6379 @ mymaster 192.168.109.160 6379
             The 163 server has finished reconfiguring
23:02:43.999 # +failover-end master mymaster 192.168.109.160 6379
             The 164 sentinel's failover of the 160 server is complete
23:02:43.999 # +switch-master mymaster 192.168.109.160 6379 192.168.109.164 6379
23:02:43.999 * +slave slave 192.168.109.163:6379 192.168.109.163 6379 @ mymaster 192.168.109.164 6379
23:02:43.999 * +slave slave 192.168.109.160:6379 192.168.109.160 6379 @ mymaster 192.168.109.164 6379
             Switch the master to 164 and register 2 replicas, 163 and 160
23:02:54.036 # +sdown slave 192.168.109.160:6379 192.168.109.160 6379 @ mymaster 192.168.109.164 6379
             Detected that the 160 server is down
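
To follow a failover without reading every line, the event names can be pulled out of a sentinel log. A minimal sketch, assuming each event line has the timestamp, then a # or * marker, then an event name starting with + or -, as in the excerpts above; failover_timeline is a hypothetical helper name:

```shell
# Print "timestamp event" for every sentinel event line read from stdin.
# Lines without a +event or -event after the "#" or "*" marker are skipped.
# failover_timeline is a hypothetical helper name.
failover_timeline() {
  awk '
    {
      for (i = 2; i < NF; i++) {
        # The event name is the field right after the level marker.
        if (($i == "#" || $i == "*") && $(i + 1) ~ /^[+-][a-z]/) {
          print $(i - 1), $(i + 1)
          break
        }
      }
    }'
}
```

For example, failover_timeline < /usr/local/src/redis160/sentinel_26379.log lists the events from the 160 log in order.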