Redis Cluster Analysis (31)

huserblog

Tags: redis, nosql, database

Article link: https://blog.csdn.net/qq_39210987/article/details/111823555

1. Leader election

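Part (30) analyzed how the Sentinels talk to each other during a leader election. This part continues with how the votes are counted and how the leader is confirmed.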

Part (30) referred to the following code:

[Image: code screenshot from part (30); the image is not available in this copy]

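Part (30) walked through sentinelAskMasterStateToOtherSentinels, the function used for the exchange between Sentinels. Here we move on to the sentinelFailoverStateMachine function that follows it in that code. Its body is: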

void sentinelFailoverStateMachine(sentinelRedisInstance *ri) {
    serverAssert(ri->flags & SRI_MASTER);

    if (!(ri->flags & SRI_FAILOVER_IN_PROGRESS)) return;

    switch(ri->failover_state) {
        case SENTINEL_FAILOVER_STATE_WAIT_START:
            sentinelFailoverWaitStart(ri);
            break;
        case SENTINEL_FAILOVER_STATE_SELECT_SLAVE:
            sentinelFailoverSelectSlave(ri);
            break;
        case SENTINEL_FAILOVER_STATE_SEND_SLAVEOF_NOONE:
            sentinelFailoverSendSlaveOfNoOne(ri);
            break;
        case SENTINEL_FAILOVER_STATE_WAIT_PROMOTION:
            sentinelFailoverWaitPromotion(ri);
            break;
        case SENTINEL_FAILOVER_STATE_RECONF_SLAVES:
            sentinelFailoverReconfNextSlave(ri);
            break;
    }
}

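This function is simple: it is a single switch statement on ri->failover_state, a field already mentioned in part (30). Before looking at how that field gets its value, here are the values it can take in this switch: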
SENTINEL_FAILOVER_STATE_WAIT_START
SENTINEL_FAILOVER_STATE_SELECT_SLAVE
SENTINEL_FAILOVER_STATE_SEND_SLAVEOF_NOONE
SENTINEL_FAILOVER_STATE_WAIT_PROMOTION
SENTINEL_FAILOVER_STATE_RECONF_SLAVES

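They all share the prefix SENTINEL_FAILOVER_STATE_. With the prefix removed they become WAIT_START, SELECT_SLAVE, SEND_SLAVEOF_NOONE, WAIT_PROMOTION and RECONF_SLAVES. Start with the first one, WAIT_START (line 7 of the listing above, counting from its first line), meaning "wait for the start". To see what it is waiting for, we first need to lay out how a failover proceeds.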

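A Sentinel failover starts from the subjective-down and objective-down checks, which earlier parts covered in detail. Once a Sentinel judges the master to be objectively down, the master has to be failed over. Sentinel, however, runs as a group of several machines, while the failover only needs to be carried out by one of them. So before the real failover begins, one Sentinel has to be chosen, and that choice is made through leader election. WAIT_START therefore means: wait for the leader election to finish, then begin the actual failover.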

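The second is SELECT_SLAVE (line 10), meaning "select a slave". The failover itself is conceptually simple: pick one of the slaves that are still alive, make it the new master, and let it serve clients. That is why the second step is SELECT_SLAVE.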

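The third is SEND_SLAVEOF_NOONE (line 13), meaning "send the slaveof no one command". The purpose of this step is to turn the selected slave into a master. The slaveof command was covered in the analysis of Redis master-slave replication, where we noted that the no one argument turns a slave into a master.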

The fourth is WAIT_PROMOTION (line 16), meaning "wait for the promotion to succeed".

The last is RECONF_SLAVES (line 19), meaning "reconfigure the slaves". Once there is a new master, the other slaves have to be set up as slaves of that new master.
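The five steps above form a linear progression, which can be sketched as a tiny state table. This is only an illustration: the enum names, FO_DONE, and both helper functions are made up for this sketch and are not part of sentinel.c, and the abort and timeout paths of the real state machine are not modeled.

```c
#include <assert.h>

/* Hypothetical mirror of the five states discussed in the text; the real
 * constants are the SENTINEL_FAILOVER_STATE_* values used in the switch. */
typedef enum {
    FO_WAIT_START,
    FO_SELECT_SLAVE,
    FO_SEND_SLAVEOF_NOONE,
    FO_WAIT_PROMOTION,
    FO_RECONF_SLAVES,
    FO_DONE /* illustrative terminal marker, not a state named in the text */
} failover_state;

/* Return the state that follows `s` when its step succeeds. */
failover_state failover_next_state(failover_state s) {
    int i = (int)s;
    assert(i >= 0 && i <= (int)FO_DONE);
    return i == (int)FO_DONE ? FO_DONE : (failover_state)(i + 1);
}

/* Short description of what each step does. */
const char *failover_state_name(failover_state s) {
    static const char *names[] = {
        "wait for leader election",
        "select a slave to promote",
        "send SLAVEOF NO ONE",
        "wait for the promotion",
        "reconfigure the remaining slaves",
        "done"
    };
    int i = (int)s;
    assert(i >= 0 && i <= (int)FO_DONE);
    return names[i];
}
```

Calling failover_next_state repeatedly from FO_WAIT_START visits the steps in the order the article describes them.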

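Now back to how ri->failover_state gets its value. As analyzed in part (30), sentinelStartFailoverIfNeeded calls sentinelStartFailover when the master is objectively down, and that function sets ri->failover_state to SENTINEL_FAILOVER_STATE_WAIT_START. So we begin with lines 7 and 8 of sentinelFailoverStateMachine, the case where failover_state is SENTINEL_FAILOVER_STATE_WAIT_START.
The handling there is also simple: it just calls sentinelFailoverWaitStart, whose body is: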

/* ---------------- Failover state machine implementation ------------------- */
void sentinelFailoverWaitStart(sentinelRedisInstance *ri) {
    char *leader;
    int isleader;

    /* Check if we are the leader for the failover epoch. */
    leader = sentinelGetLeader(ri, ri->failover_epoch);
    isleader = leader && strcasecmp(leader,sentinel.myid) == 0;
    sdsfree(leader);

    /* If I'm not the leader, and it is not a forced failover via
     * SENTINEL FAILOVER, then I can't continue with the failover. */
    if (!isleader && !(ri->flags & SRI_FORCE_FAILOVER)) {
        int election_timeout = SENTINEL_ELECTION_TIMEOUT;

        /* The election timeout is the MIN between SENTINEL_ELECTION_TIMEOUT
         * and the configured failover timeout. */
        if (election_timeout > ri->failover_timeout)
            election_timeout = ri->failover_timeout;
        /* Abort the failover if I'm not the leader after some time. */
        if (mstime() - ri->failover_start_time > election_timeout) {
            sentinelEvent(LL_WARNING,"-failover-abort-not-elected",ri,"%@");
            sentinelAbortFailover(ri);
        }
        return;
    }
    sentinelEvent(LL_WARNING,"+elected-leader",ri,"%@");
    if (sentinel.simfailure_flags & SENTINEL_SIMFAILURE_CRASH_AFTER_ELECTION)
        sentinelSimFailureCrash();
    ri->failover_state = SENTINEL_FAILOVER_STATE_SELECT_SLAVE;
    ri->failover_state_change_time = mstime();
    sentinelEvent(LL_WARNING,"+failover-state-select-slave",ri,"%@");
}

Start with line 7 of this listing, which calls sentinelGetLeader. That function tallies the votes. Its code is:

/* Scan all the Sentinels attached to this master to check if there
 * is a leader for the specified epoch.
 *
 * To be a leader for a given epoch, we should have the majority of
 * the Sentinels we know (ever seen since the last SENTINEL RESET) that
 * reported the same instance as leader for the same epoch. */
char *sentinelGetLeader(sentinelRedisInstance *master, uint64_t epoch) {
    dict *counters;
    dictIterator *di;
    dictEntry *de;
    unsigned int voters = 0, voters_quorum;
    char *myvote;
    char *winner = NULL;
    uint64_t leader_epoch;
    uint64_t max_votes = 0;

    serverAssert(master->flags & (SRI_O_DOWN|SRI_FAILOVER_IN_PROGRESS));
    counters = dictCreate(&leaderVotesDictType,NULL);

    voters = dictSize(master->sentinels)+1; /* All the other sentinels and me.*/

    /* Count other sentinels votes */
    di = dictGetIterator(master->sentinels);
    while((de = dictNext(di)) != NULL) {
        sentinelRedisInstance *ri = dictGetVal(de);
        if (ri->leader != NULL && ri->leader_epoch == sentinel.current_epoch)
            sentinelLeaderIncr(counters,ri->leader);
    }
    dictReleaseIterator(di);

    /* Check what's the winner. For the winner to win, it needs two conditions:
     * 1) Absolute majority between voters (50% + 1).
     * 2) And anyway at least master->quorum votes. */
    di = dictGetIterator(counters);
    while((de = dictNext(di)) != NULL) {
        uint64_t votes = dictGetUnsignedIntegerVal(de);

        if (votes > max_votes) {
            max_votes = votes;
            winner = dictGetKey(de);
        }
    }
    dictReleaseIterator(di);

    /* Count this Sentinel vote:
     * if this Sentinel did not voted yet, either vote for the most
     * common voted sentinel, or for itself if no vote exists at all. */
    if (winner)
        myvote = sentinelVoteLeader(master,epoch,winner,&leader_epoch);
    else
        myvote = sentinelVoteLeader(master,epoch,sentinel.myid,&leader_epoch);

    if (myvote && leader_epoch == epoch) {
        uint64_t votes = sentinelLeaderIncr(counters,myvote);

        if (votes > max_votes) {
            max_votes = votes;
            winner = myvote;
        }
    }

    voters_quorum = voters/2+1;
    if (winner && (max_votes < voters_quorum || max_votes < master->quorum))
        winner = NULL;

    winner = winner ? sdsnew(winner) : NULL;
    sdsfree(myvote);
    dictRelease(counters);
    return winner;
}

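This function counts the votes and determines the outcome of the election. Line 18 creates a dict named counters, whose keys are the runids of the candidate Sentinels and whose values are their vote counts. Line 20 sets voters, the total number of voters: every other known Sentinel plus the current one. Lines 22 to 29 count the votes each candidate received. The code is straightforward. Line 23 obtains an iterator over master->sentinels (this field came up when we analyzed how a Sentinel discovers other Sentinels; it holds the other Sentinels this one has discovered). Line 24 then walks through all of them with a while loop. For each one, line 26 checks that it has a recorded leader and that its leader_epoch equals sentinel.current_epoch; if so, line 27 calls sentinelLeaderIncr to count the vote.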

The body of sentinelLeaderIncr is:

/* Helper function for sentinelGetLeader, increment the counter
 * relative to the specified runid. */
int sentinelLeaderIncr(dict *counters, char *runid) {
    dictEntry *existing, *de;
    uint64_t oldval;

    de = dictAddRaw(counters,runid,&existing);
    if (existing) {
        oldval = dictGetUnsignedIntegerVal(existing);
        dictSetUnsignedIntegerVal(existing,oldval+1);
        return oldval+1;
    } else {
        serverAssert(de != NULL);
        dictSetUnsignedIntegerVal(de,1);
        return 1;
    }
}

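This function is simple: if the given runid already exists in counters, its count is incremented by one; otherwise a new entry is created with the value 1. In both cases the new count is returned.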

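That is all of the vote-counting code. To make the counting easier to follow, here is a brief recap of how votes are cast. Again the starting point is the subjective-down and objective-down checks. As soon as a Sentinel judges the master to be objectively down, it calls sentinelAskMasterStateToOtherSentinels, which we analyzed earlier: it sends a vote request to the other Sentinels and registers a callback named sentinelReceiveIsMasterDownReply to handle their replies. When another Sentinel receives the request, it gives its vote to the requester if it has not voted yet; if it has already voted, it replies with the runid of the Sentinel it voted for. When the requesting Sentinel receives a reply, it records the data in the instance (ri) that represents the replying Sentinel, and these instances are stored in master->sentinels.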

The counting code then only has to walk over all of those instances and add up the votes they cast, using sentinelLeaderIncr.

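Next come lines 34 to 43 of sentinelGetLeader. This code simply walks over the tally and picks out the runid with the most votes together with its vote count: winner holds the runid and max_votes the number of votes.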
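The counting loop and the winner scan can be imitated without the Redis dict type. The sketch below is a simplified stand-in, not the Redis implementation: vote_table, vote_incr and tally_votes are hypothetical names, a fixed-size array replaces the counters dict, and the epoch check is assumed to have been applied already (a NULL entry stands for a Sentinel with no usable vote).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_CANDIDATES 16

/* Hypothetical stand-in for the `counters` dict: runid -> vote count. */
typedef struct {
    const char *runid[MAX_CANDIDATES];
    unsigned votes[MAX_CANDIDATES];
    size_t len;
} vote_table;

/* Same contract as sentinelLeaderIncr: add one vote for `runid` and
 * return its new total. Returns 0 if the table is full. */
unsigned vote_incr(vote_table *t, const char *runid) {
    assert(t != NULL && runid != NULL);
    for (size_t i = 0; i < t->len; i++) {
        if (strcmp(t->runid[i], runid) == 0) return ++t->votes[i];
    }
    if (t->len == MAX_CANDIDATES) return 0;
    t->runid[t->len] = runid;
    t->votes[t->len] = 1;
    t->len++;
    return 1;
}

/* Tally the votes recorded for the other Sentinels, then scan the table
 * for the runid with the most votes. Returns NULL if nobody voted. */
const char *tally_votes(const char *const *recorded, size_t n,
                        vote_table *t, unsigned *max_votes) {
    const char *winner = NULL;
    assert(recorded != NULL && t != NULL && max_votes != NULL);
    t->len = 0;
    *max_votes = 0;
    for (size_t i = 0; i < n; i++) {
        if (recorded[i] != NULL) vote_incr(t, recorded[i]);
    }
    for (size_t i = 0; i < t->len; i++) {
        if (t->votes[i] > *max_votes) {
            *max_votes = t->votes[i];
            winner = t->runid[i];
        }
    }
    return winner;
}
```

With recorded votes {"aaa", "bbb", "aaa", NULL}, the front-runner is "aaa" with two votes. The strict greater-than comparison means that on a tie the first candidate scanned stays in front, as in the listing.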

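Then lines 48 to 60, which count the current Sentinel's own vote. The sentinelVoteLeader function called on lines 49 and 51 was analyzed earlier: it uses the epoch to decide whether a vote has already been cast, so it never votes twice in the same epoch. If the other Sentinels have already produced a front-runner, the current Sentinel votes for it; otherwise it votes for itself.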
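The "one vote per epoch" rule can be illustrated with a small record that remembers the last vote. This is a deliberately reduced sketch under stated assumptions: vote_record and cast_vote are hypothetical names, and the real sentinelVoteLeader also maintains the Sentinel's current epoch and persists its configuration, none of which is modeled here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define RUNID_LEN 40

/* Hypothetical record of the last vote this Sentinel cast for a master. */
typedef struct {
    char leader[RUNID_LEN + 1]; /* runid voted for; "" means no vote yet */
    uint64_t leader_epoch;      /* epoch in which that vote was cast */
} vote_record;

/* Vote for `candidate` only if no vote exists for `req_epoch` or a later
 * epoch. Always reports the vote currently on record, so a repeated
 * request in the same epoch gets the earlier choice back. */
const char *cast_vote(vote_record *rec, uint64_t req_epoch,
                      const char *candidate, uint64_t *voted_epoch) {
    assert(rec != NULL && candidate != NULL && voted_epoch != NULL);
    if (rec->leader_epoch < req_epoch) {
        size_t n = strlen(candidate);
        if (n > RUNID_LEN) n = RUNID_LEN;
        memcpy(rec->leader, candidate, n);
        rec->leader[n] = '\0';
        rec->leader_epoch = req_epoch;
    }
    *voted_epoch = rec->leader_epoch;
    return rec->leader[0] != '\0' ? rec->leader : NULL;
}
```

A caller can compare the returned epoch with the epoch it asked about, which is the same kind of check as `leader_epoch == epoch` in sentinelGetLeader: a vote only counts if it belongs to the epoch being tallied.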

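Finally, the if statement on line 63 checks two conditions: 1. the winner has an absolute majority of the voters (at least voters/2+1 votes); 2. the winner has at least quorum votes (quorum is the number of Sentinels that must agree before the master counts as objectively down, set when the Sentinels are configured). If either condition fails, the election does not produce a leader and winner is set to NULL.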
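The two conditions are easy to check by hand with a few numbers. The helper below restates the test from the listing as a standalone predicate; election_won is a hypothetical name introduced only for this example.

```c
#include <assert.h>
#include <stdbool.h>

/* True when `max_votes` is enough to win: an absolute majority of all
 * voters (voters/2 + 1, integer division) and at least `quorum` votes. */
bool election_won(unsigned max_votes, unsigned voters, unsigned quorum) {
    assert(voters > 0);
    unsigned voters_quorum = voters / 2 + 1;
    return max_votes >= voters_quorum && max_votes >= quorum;
}
```

With 5 Sentinels and quorum 2, the majority is 5/2+1 = 3, so 3 votes win and 2 do not, even though 2 meets the quorum. With 5 Sentinels and quorum 4, a candidate with 3 votes has the majority but still loses because it is below the quorum.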

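That completes the analysis of sentinelGetLeader, the function that tallies the votes. Now back to its caller, sentinelFailoverWaitStart. The tally is invoked on line 7. Once leader is obtained, line 8 compares it with this Sentinel's own runid to decide whether this Sentinel is the leader. If it is not the leader (and the failover was not forced with SENTINEL FAILOVER), lines 13 to 26 run: the function returns without advancing the state machine, and it aborts the failover once the election timeout has elapsed. If it is the leader, the code from line 27 onward runs and the failover continues. The key statement is on line 30, which sets ri->failover_state to SENTINEL_FAILOVER_STATE_SELECT_SLAVE.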
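The timeout handling in the non-leader branch can be sketched as two small helpers. The function names are hypothetical, and the 10000 ms value for the election timeout is an assumption about SENTINEL_ELECTION_TIMEOUT that should be checked against the sentinel.c of the Redis version in use.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed value in milliseconds; verify SENTINEL_ELECTION_TIMEOUT in the
 * sentinel.c of your Redis version before relying on it. */
#define ELECTION_TIMEOUT_MS 10000LL

/* The effective timeout is the smaller of the election timeout and the
 * configured failover timeout. */
long long effective_election_timeout(long long failover_timeout_ms) {
    assert(failover_timeout_ms >= 0);
    return failover_timeout_ms < ELECTION_TIMEOUT_MS ? failover_timeout_ms
                                                     : ELECTION_TIMEOUT_MS;
}

/* True when a Sentinel that has not been elected should abort the
 * failover: strictly more than the effective timeout has passed. */
bool should_abort_election(long long now_ms, long long failover_start_ms,
                           long long failover_timeout_ms) {
    return now_ms - failover_start_ms >
           effective_election_timeout(failover_timeout_ms);
}
```

Under that assumption, a failover timeout of 180000 ms leaves the effective election timeout at 10000 ms, while a failover timeout of 5000 ms shortens it to 5000 ms. Until the timeout expires, the non-leader branch simply returns and the same check runs again on the next pass through the state machine.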
