1: Manual failover
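Redis Cluster supports manual failover. Sending the "CLUSTER FAILOVER" command to a slave makes it start the failover procedure while its master is still online: the slave is promoted to the new master, and the old master is demoted to a slave.
To avoid losing data, the procedure after a slave receives "CLUSTER FAILOVER" is as follows: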
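a: On receiving the command, the slave sends a CLUSTERMSG_TYPE_MFSTART packet to its master;
b: On receiving that packet, the master puts all of its clients into a paused state, that is, for 10 seconds it stops processing commands sent by clients; in addition, the heartbeat packets it sends carry the CLUSTERMSG_FLAG0_PAUSED flag;
c: When the slave receives a heartbeat packet from the master carrying the CLUSTERMSG_FLAG0_PAUSED flag, it reads the master's current replication offset from that packet. The slave waits until its own replication offset reaches that value, and only then starts the failover procedure: it starts an election, counts the votes, wins the election, promotes itself to master, and updates the configuration;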
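Steps a to c can be sketched from the slave's point of view as a small state machine. This is a simplified model, not the Redis implementation: the type mf_state and the functions mf_begin, mf_on_master_heartbeat and mf_check are hypothetical names introduced only for illustration, and the deadline handling is an assumption based on the mf_end field that the command handler sets.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified model of the slave-side state during a
 * manual failover without options. Not the Redis data structures. */
typedef struct {
    int64_t mf_end;        /* Deadline in ms; 0 means no manual failover in progress. */
    int64_t master_offset; /* Offset announced by the paused master; -1 = unknown. */
    bool can_start;        /* True once the slave may start the election. */
} mf_state;

/* Step a: the slave accepted the command at now_ms and, in the real
 * flow, would send the MFSTART packet to its master at this point. */
void mf_begin(mf_state *st, int64_t now_ms, int64_t timeout_ms) {
    assert(timeout_ms > 0);
    st->mf_end = now_ms + timeout_ms;
    st->master_offset = -1;
    st->can_start = false;
}

/* Step c, first half: a heartbeat from the master arrived. Only a
 * heartbeat carrying the paused flag tells us the final offset, and
 * only the first such heartbeat is recorded. */
void mf_on_master_heartbeat(mf_state *st, bool paused_flag,
                            int64_t master_repl_offset) {
    if (st->mf_end == 0) return;          /* No manual failover in progress. */
    if (!paused_flag) return;             /* Master has not paused its clients yet. */
    if (st->master_offset != -1) return;  /* Offset already recorded. */
    st->master_offset = master_repl_offset;
}

/* Step c, second half: called periodically. Returns true once the
 * slave has caught up with the paused master and may start the
 * election. Assumption: the attempt is abandoned after the deadline. */
bool mf_check(mf_state *st, int64_t now_ms, int64_t slave_repl_offset) {
    if (st->mf_end == 0) return false;
    if (now_ms > st->mf_end) {            /* Timed out: reset the state. */
        st->mf_end = 0;
        st->master_offset = -1;
        st->can_start = false;
        return false;
    }
    if (st->can_start) return true;
    if (st->master_offset == -1) return false;  /* Still waiting for the offset. */
    if (slave_repl_offset >= st->master_offset) st->can_start = true;
    return st->can_start;
}
```

Because the master has paused its clients, its replication offset no longer advances, so the slave only has to process the data already in flight before mf_check returns true.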
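The "CLUSTER FAILOVER" command supports two options, FORCE and TAKEOVER, which change the procedure described above.
With the FORCE option, the slave does not interact with its master and the master does not pause its clients. Instead, the slave starts the failover procedure immediately: it starts an election, counts the votes, wins the election, promotes itself to master, and updates the configuration.
The TAKEOVER option is even more blunt: the slave does not start an election at all. It promotes itself to master directly, takes over the slots of its old master, bumps its own configEpoch, and then updates the configuration.
Consequently, with FORCE or TAKEOVER the master may already be offline, whereas a plain "CLUSTER FAILOVER" without any option requires the master to be online.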
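The differences between the three variants can be summarized in a small lookup. This is only an illustrative sketch: the names mf_mode, mf_plan, mf_plan_for and mf_accepts are hypothetical and do not appear in the Redis source.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names, introduced only to summarize the three variants. */
typedef enum { MF_DEFAULT, MF_FORCE, MF_TAKEOVER } mf_mode;

typedef struct {
    bool talks_to_master;        /* Sends MFSTART and waits for the master's offset. */
    bool runs_election;          /* Asks the other masters for votes. */
    bool requires_master_online; /* Command is rejected if the master is down. */
} mf_plan;

/* Returns what each variant of CLUSTER FAILOVER does, as described above. */
mf_plan mf_plan_for(mf_mode mode) {
    assert(mode == MF_DEFAULT || mode == MF_FORCE || mode == MF_TAKEOVER);
    mf_plan plan = { false, false, false };
    switch (mode) {
    case MF_DEFAULT:
        plan.talks_to_master = true;
        plan.runs_election = true;
        plan.requires_master_online = true;
        break;
    case MF_FORCE:
        plan.runs_election = true;   /* Election, but no coordination with the master. */
        break;
    case MF_TAKEOVER:
        break;                       /* No coordination and no election. */
    }
    return plan;
}

/* Whether the slave would accept the command given the master's state. */
bool mf_accepts(mf_mode mode, bool master_reachable) {
    return mf_plan_for(mode).requires_master_online ? master_reachable : true;
}
```

For example, mf_accepts(MF_DEFAULT, false) is false, while mf_accepts(MF_FORCE, false) and mf_accepts(MF_TAKEOVER, false) are both true.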
In the clusterCommand function, the part of the code that handles the "CLUSTER FAILOVER" command is as follows:
    else if (!strcasecmp(c->argv[1]->ptr,"failover") &&
             (c->argc == 2 || c->argc == 3))
    {
        /* CLUSTER FAILOVER [FORCE|TAKEOVER] */
        int force = 0, takeover = 0;

        if (c->argc == 3) {
            if (!strcasecmp(c->argv[2]->ptr,"force")) {
                force = 1;
            } else if (!strcasecmp(c->argv[2]->ptr,"takeover")) {
                takeover = 1;
                force = 1; /* Takeover also implies force. */
            } else {
                addReply(c,shared.syntaxerr);
                return;
            }
        }

        /* Check preconditions. */
        if (nodeIsMaster(myself)) {
            addReplyError(c,"You should send CLUSTER FAILOVER to a slave");
            return;
        } else if (myself->slaveof == NULL) {
            addReplyError(c,"I'm a slave but my master is unknown to me");
            return;
        } else if (!force &&
                   (nodeFailed(myself->slaveof) ||
                    myself->slaveof->link == NULL))
        {
            addReplyError(c,"Master is down or failed, "
                            "please use CLUSTER FAILOVER FORCE");
            return;
        }
        resetManualFailover();
        server.cluster->mf_end = mstime() + REDIS_CLUSTER_MF_TIMEOUT;

        if (takeover) {
            /* A takeover does not perform any initial check. It just
             * generates a new configuration epoch for this node without
             * consensus, claims the master's slots, and broadcast the new
             * configuration. */
            redisLog(REDIS_WARNING,"Taking over the master (user request).");
            clusterBumpConfigEpochWithoutConsensus();
            clusterFailoverReplaceYourMaster();
        } else if (force) {
            /* If this is a forced failover, we don't need to talk with our
             * master to agree about the offset. We just failover taking over
             * it without coordination. */
            redisLog(REDIS_WARNING,"Forced failover user request accepted.");
            server.cluster->mf_can_start = 1;
        } else {
            redisLog(REDIS_WARNING,"Manual failover user request accepted.");
            clusterSendMFStart(myself->slaveof);
        }
        addReply(c,