[redis 源码走读] - sentinel 哨兵 - 脑裂处理方案

最新推荐文章于 2024-06-11 11:44:35 发布

smart哥

最新推荐文章于 2024-06-11 11:44:35 发布

阅读量728

点赞数 21

分类专栏： redis专题文章标签： redis sentinel

本文链接：https://blog.csdn.net/smart_an/article/details/139375802

版权

redis专题专栏收录该内容

53 篇文章 0 订阅

订阅专栏

作者简介：大家好，我是smart哥，前中兴通讯、美团架构师，现某互联网公司CTO

联系qq：184480602，加我进群，大家一起学习，一起进步，一起对抗互联网寒冬

学习必须往深处挖，挖的越深，基础越扎实！

阶段1、深入多线程

 阶段2、深入多线程设计模式

 阶段3、深入juc源码解析

阶段4、深入jdk其余源码解析

阶段5、深入jvm源码解析

码哥源码部分

码哥讲源码-原理源码篇【2024年最新大厂关于线程池使用的场景题】

码哥讲源码【炸雷啦！炸雷啦！黄光头他终于跑路啦！】

码哥讲源码-【jvm课程前置知识及c/c++调试环境搭建】

码哥讲源码-原理源码篇【揭秘join方法的唤醒本质上决定于jvm的底层析构函数】

码哥源码-原理源码篇【Doug Lea为什么要将成员变量赋值给局部变量后再操作？】

码哥讲源码【你水不是你的错,但是你胡说八道就是你不对了！】

码哥讲源码【谁再说Spring不支持多线程事务，你给我抽他！】

终结B站没人能讲清楚红黑树的历史，不服等你来踢馆！

打脸系列【020-3小时讲解MESI协议和volatile之间的关系，那些将x86下的验证结果当作最终结果的水货们请闭嘴】

哨兵模式的 redis 集群，假如原来只有一个主服务，经过故障转移后，产生多个主服务，这样脑裂现象出现了。通过合理部署配置 sentinel/master，可以降低脑裂问题出现的概率。

1. 原理

1.1. 概述

哨兵模式的 redis 集群有三种角色：sentinel/master/slave，它们通过 tcp 链接，相互建立联系。

sentinel 作为高可用集群管理者，它的功能主要是：检查故障，发现故障，故障转移。

1.2. 故障转移流程

在 redis 集群中，当 sentinel 检测到 master 出现故障，那么 sentinel 需要对集群进行故障转移。
当一个 sentinel 发现 master 下线，它会将下线的 master 确认为 主观下线 。
当“法定个数”（quorum）sentinel 已经发现该 master 节点下线，那么 sentinel 会将其确认为 客观下线 。
多个 sentinel 根据一定的逻辑，选举出一个 sentinel 作为 leader，由它去进行故障转移，将原来连接已客观下线 master 最优的一个 slave 提升为新 master 角色。旧 master 如果重新激活，它将被降级为 slave。

详细请参考：《[redis 源码走读] sentinel 哨兵 - 故障转移》

1.3. 脑裂场景

我们看看下面的部署：两个机器，分别部署了 redis 的三个角色。

如果将集群部署在两个机器上（如下图）。
sentinel 配置 quorum = 1，也就是一个 sentinel 发现故障，也可以选举自己为 leader，进行故障转移。

节点	描述
M	redis主服务master
R	redis副本replication/slave
S	redis哨兵sentinel
C	链接redis客户端

 
    +----+         +----+
    | M1 |---------| R1 |
    | S1 |         | S2 |
    +----+         +----+

因为某种原因，两个机器断开链接，S2 将同机器的 R1 提升角色为 master，这样集群里，出现了两个 master 同时工作 —— 脑裂出现了。不同的 client 链接到不同的 redis 进行读写，那么两台机器就出现了 redis 数据不一致的现象。

 
    +----+           +------+
    | M1 |----//-----| [M1] |
    | S1 |           |  S2  |
    +----+           +------+

2. 解决方案

通过合理部署配置 sentinel/master，降低脑裂出现概率。

2.1. sentienl 配置

合理部署 sentinel 的节点个数，以及配置 sentinel 选举的法定人数。

sentinel 节点个数最好 >= 3。
sentinel 节点个数最好是基数。
sentinel 的选举法定人数设置为 (n/2 + 1)。

配置

 
    # sentinel.conf
    # sentinel monitor <master-name> <ip> <redis-port> <quorum>

quorum

是法定人数。作用：多个 sentinel 进行相互选举，有超过法定人数的 sentinel 选举某个 sentinel 为 leader，那么他就成为 leader， leader 负责故障转移。这个法定人数，可以配置，一般是 sentinel 个数一半以上 (n/2 + 1) 比较合理。

如果 sentinel 个数总数为 3，那么最好 quorum == 2，这样最接近真实：少数服从多数，不会出现两个票数一样的 leader同时被选上，进行故障转移。

# sentinel.conf
sentinel monitor mymaster >127.0.0.1 6379 2

2.2. master 配置

如果 master 因为某些原因与一定数量的副本失去联系了，通过修改 master 配置项，可以禁止 client 向故障的 master 写数据。

2.2.1. 问题

按照上述的 sentinel 部署方案，下面三个机器，任何一个机器出现问题，只要两个 sentinel 能相互链接，故障转移是正常的。

 
           +----+
           | M1 |
           | S1 |
           +----+
              |
    +----+    |    +----+
    | R2 |----+----| R3 |
    | S2 |         | S3 |
    +----+         +----+
    
    Configuration: quorum = 2

假如 M1 机器与其它机器断开链接了，S2 和 S3 两个 sentinel 能相互链接，sentinel 能正常进行故障转移，sentinel leader 将 R2 提升为新的 master 角色 [M2]。但是客户端 C1 仍然能读写 M1，这样仍然会出现问题，所以我们不得不对 M1 进行限制。

 
             +----+
             | M1 |
             | S1 | <- C1 (writes will be lost)
             +----+
                |
                /
                /
    +------+    |    +----+
    | [M2] |----+----| R3 |
    | S2   |         | S3 |
    +------+         +----+

2.2.2. 解决方案

限制 M1 比较简单的方案，通过修改 redis 配置 redis.conf，检查 master 节点与其它副本的联系。当 master 发现它的副本下线或者通信超时的总数量小于阈值时，那么禁止 master 进行写数据。

但是这个方案也不是完美的，min-slaves-to-write 依赖于副本的链接个数，如果 slave 个数设置不合理，那么集群很难故障转移成功。

2.2.2.1. 配置

redis.conf

 
    # It is possible for a master to stop accepting writes if there are less than
    # N slaves connected, having a lag less or equal than M seconds.
    #
    # The N slaves need to be in "online" state.
    #
    # The lag in seconds, that must be <= the specified value, is calculated from
    # the last ping received from the slave, that is usually sent every second.
    #
    # This option does not GUARANTEE that N replicas will accept the write, but
    # will limit the window of exposure for lost writes in case not enough slaves
    # are available, to the specified number of seconds.
    #
    # For example to require at least 3 slaves with a lag <= 10 seconds use:
    #
    # min-slaves-to-write 3
    # min-slaves-max-lag 10
    #
    # Setting one or the other to 0 disables the feature.
    #
    # By default min-slaves-to-write is set to 0 (feature disabled) and
    # min-slaves-max-lag is set to 10.

 
    # master 须要有至少 x 个副本连接。
    min-slaves-to-write x
    # 数据复制和同步的延迟不能超过 x 秒。
    min-slaves-max-lag x

注意：高版本 redis 已经修改这个两个选项

# min-replicas-to-write x
# min-replicas-max-lag x

2.2.2.2. 源码实现流程

时钟定期检查副本链接健康情况。

 
    #define run_with_period(_ms_) if ((_ms_ <= 1000/server.hz) || !(server.cronloops%((_ms_)/(1000/server.hz))))
    
    int serverCron(struct aeEventLoop *eventLoop, long long id, void *clientData) {
      run_with_period(1000) replicationCron();
    }
    
    /* Replication cron function, called 1 time per second. */
    // 复制周期执行的函数，每秒调用1次。
    void replicationCron(void) {
        // 更新延迟至 lag 小于 min-slaves-max-lag 的从服务器数量
        refreshGoodSlavesCount();
    }
    
    /* This function counts the number of slaves with lag <= min-slaves-max-lag.
     * If the option is active, the server will prevent writes if there are not
     * enough connected slaves with the specified lag (or less). */
    // 更新延迟至 lag 小 于min-slaves-max-lag 的从服务器数量
    void refreshGoodSlavesCount(void) {
        listIter li;
        listNode *ln;
        int good = 0;
    
        // 没设置限制则返回。
        if (!server.repl_min_slaves_to_write ||
            !server.repl_min_slaves_max_lag) return;
    
        listRewind(server.slaves,&li);
        // 遍历所有的从节点 client。
        while((ln = listNext(&li))) {
            client *slave = ln->value;
            // 计算延迟值
            time_t lag = server.unixtime - slave->repl_ack_time;
    
            // 计数小于延迟限制的个数。
            if (slave->replstate == SLAVE_STATE_ONLINE &&
                lag <= server.repl_min_slaves_max_lag) good++;
        }
        server.repl_good_slaves_count = good;
    }

超出配置范围，master 禁止写命令。

 
    int processCommand(client *c) {
        ...
        /* Don't accept write commands if there are not enough good slaves and
         * user configured the min-slaves-to-write option. */
        if (server.masterhost == NULL &&
            server.repl_min_slaves_to_write &&
            server.repl_min_slaves_max_lag &&
            c->cmd->flags & CMD_WRITE &&
            server.repl_good_slaves_count < server.repl_min_slaves_to_write)
        {
            flagTransaction(c);
            addReply(c, shared.noreplicaserr);
            return C_OK;
        }
        ...
    }

3. 小结

redis 脑裂主要表现为：同一个 redis 集群，原来的 master，经过故障转移后，出现多个 master。
解决方案主要通过 sentinel 哨兵的配置和 redis 的配置去解决问题。
上述方案也是有不足的地方，例如 redis 配置限制可能会受到副本个数的影响，所以具体设置，要看具体的业务场景。主要是怎么通过比较小的代价去解决问题，或者降低出现问题的概率。
redis 虽然已经发布了 gossip 协议的无中心集群，sentinel 哨兵模式还是比较常用的，我们不建议直接使用 sentinel，可以考虑使用 codis。

smart哥

关注

21
点赞
踩
10

收藏

觉得还不错? 一键收藏
0
评论
[redis 源码走读] - sentinel 哨兵 - 脑裂处理方案

哨兵模式的 redis 集群有三种角色：sentinel/master/slave，它们通过 tcp 链接，相互建立联系。sentinel 作为高可用集群管理者，它的功能主要是：检查故障，发现故障，故障转移。
复制链接

扫一扫

专栏目录