Raft paper translation (5.2): Leader election

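Situations that trigger an election:

1. Cluster startup: every node starts as a follower; after a randomized timeout, a node becomes a candidate and starts an election.

2. The heartbeat from the leader to a follower times out; the follower becomes a candidate and starts an election.

3. A membership change triggers an election.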

5.2 Leader election

        Raft uses a heartbeat mechanism to trigger leader election. When servers start up, they begin as followers. A server remains in follower state as long as it receives valid RPCs from a leader or candidate. Leaders send periodic heartbeats (AppendEntries RPCs that carry no log entries) to all followers in order to maintain their authority. If a follower receives no communication over a period of time called the election timeout, then it assumes there is no viable leader and begins an election to choose a new leader.

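        Translation: Raft uses a heartbeat mechanism to trigger leader election. When a server starts up, it begins as a follower. A server stays in the follower state as long as it keeps receiving valid RPCs from a leader or candidate. The leader periodically sends heartbeats (AppendEntries RPCs that carry no log entries) to all followers to maintain its authority. If a follower receives no communication for a full election timeout, it assumes there is no viable leader and starts an election to choose a new one.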
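The follower-side timeout check described above can be sketched roughly as follows. This is only an illustration: the class name `FollowerTimer`, its fields, and the fixed 0.2 s timeout are assumptions made for this example, not names from the paper.

```python
from dataclasses import dataclass


@dataclass
class FollowerTimer:
    """Tracks when a follower last heard from a leader or candidate (hypothetical helper)."""
    election_timeout: float  # seconds; a fixed value chosen only for this sketch
    last_heard: float = 0.0

    def on_valid_rpc(self, now: float) -> None:
        # Any valid RPC from a leader or candidate resets the timer.
        self.last_heard = now

    def should_start_election(self, now: float) -> bool:
        # No communication for a full election timeout: assume no viable leader.
        return now - self.last_heard >= self.election_timeout


timer = FollowerTimer(election_timeout=0.2)
timer.on_valid_rpc(now=1.0)
print(timer.should_start_election(now=1.1))  # a heartbeat arrived recently
print(timer.should_start_election(now=1.3))  # the timeout has elapsed
```

Passing `now` in explicitly, rather than reading a clock inside the class, keeps the sketch deterministic and easy to test.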


        To begin an election, a follower increments its current term and transitions to candidate state. It then votes for itself and issues RequestVote RPCs in parallel to each of the other servers in the cluster. A candidate continues in this state until one of three things happens: (a) it wins the election, (b) another server establishes itself as leader, or (c) a period of time goes by with no winner. These outcomes are discussed separately in the paragraphs below.

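        Translation: To begin an election, a follower increments its current term and transitions to the candidate state. It then votes for itself and sends RequestVote RPCs in parallel to every other server in the cluster. A candidate stays in this state until one of three things happens:

a. it wins the election;

b. another server establishes itself as leader;

c. a period of time passes with no winner.

These outcomes are discussed separately below.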
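The steps for starting an election can be sketched as below. This is a minimal sketch, not a full implementation: the `Server` and `RequestVote` classes and their field names are assumptions for illustration, the log-related RequestVote fields that Section 5.4 relies on are omitted, and the RPCs are returned as a list rather than actually sent in parallel.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class RequestVote:
    """Simplified RequestVote RPC; log fields are left out of this sketch."""
    term: int
    candidate_id: str


@dataclass
class Server:
    server_id: str
    peers: List[str]
    current_term: int = 0
    state: str = "follower"
    voted_for: Optional[str] = None

    def start_election(self) -> List[Tuple[str, RequestVote]]:
        # 1. Increment the current term.
        self.current_term += 1
        # 2. Transition to candidate state.
        self.state = "candidate"
        # 3. Vote for itself.
        self.voted_for = self.server_id
        # 4. Prepare one RequestVote RPC per other server in the cluster.
        request = RequestVote(self.current_term, self.server_id)
        return [(peer, request) for peer in self.peers]


server = Server("A", peers=["B", "C", "D", "E"])
for peer, rpc in server.start_election():
    print(peer, rpc)
```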


        A candidate wins an election if it receives votes from a majority of the servers in the full cluster for the same term. Each server will vote for at most one candidate in a given term, on a first-come-first-served basis (note: Section 5.4 adds an additional restriction on votes). The majority rule ensures that at most one candidate can win the election for a particular term (the Election Safety Property in Figure 3). Once a candidate wins an election, it becomes leader. It then sends heartbeat messages to all of the other servers to establish its authority and prevent new elections.

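        Translation: A candidate wins an election if it receives votes, for the same term, from a majority of the servers in the full cluster. Each server casts at most one vote per term, on a first-come-first-served basis (note: Section 5.4 adds an additional restriction on votes). The majority rule ensures that at most one candidate can win the election for a given term (the Election Safety Property in Figure 3). Once a candidate wins an election, it becomes leader. It then sends heartbeat messages to all other servers to establish its authority and prevent new elections.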
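The two rules above (at most one vote per server per term, and a strict majority to win) can be sketched like this. The names `Voter`, `handle_request_vote` and `wins_election` are hypothetical. The sketch also assumes the general rule from Section 5.1 that a server seeing a higher term adopts it, and it leaves out the extra voting restriction from Section 5.4.

```python
class Voter:
    """Grants at most one vote per term, first come, first served (sketch)."""

    def __init__(self):
        self.current_term = 0
        self.voted_for = None

    def handle_request_vote(self, term, candidate_id):
        if term < self.current_term:
            return False  # stale request from an older term
        if term > self.current_term:
            # Assumed from Section 5.1: adopt the newer term, which clears the old vote.
            self.current_term = term
            self.voted_for = None
        if self.voted_for in (None, candidate_id):
            self.voted_for = candidate_id
            return True
        return False  # already voted for someone else in this term


def wins_election(votes_received, cluster_size):
    # A strict majority of the full cluster is required.
    return votes_received > cluster_size // 2


voter = Voter()
print(voter.handle_request_vote(1, "A"))  # first request in term 1
print(voter.handle_request_vote(1, "B"))  # second candidate, same term
print(wins_election(3, 5))
```

Because any two majorities of the same cluster overlap in at least one server, and that server votes only once per term, two candidates cannot both collect a majority in the same term.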


        While waiting for votes, a candidate may receive an AppendEntries RPC from another server claiming to be leader. If the leader’s term (included in its RPC) is at least as large as the candidate’s current term, then the candidate recognizes the leader as legitimate and returns to follower state. If the term in the RPC is smaller than the candidate’s current term, then the candidate rejects the RPC and continues in candidate state.

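        Translation: While waiting for votes, a candidate may receive an AppendEntries RPC from another server claiming to be leader. If the term carried in that RPC is at least as large as (greater than or equal to) the candidate's current term, the candidate recognizes the leader as legitimate and returns to the follower state. If the term in the RPC is smaller than the candidate's current term, the candidate rejects the RPC and stays in the candidate state.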
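The term comparison a candidate performs on an incoming AppendEntries RPC can be sketched as a small pure function. The function name and the tuple it returns are assumptions for illustration; the key detail is that the comparison is `>=`, not `>`.

```python
def candidate_on_append_entries(current_term, rpc_term):
    """Return (new_state, new_term, accepted) for a candidate receiving AppendEntries."""
    if rpc_term >= current_term:
        # The sender's term is at least as large: recognize it as the
        # legitimate leader and step down to follower.
        return ("follower", rpc_term, True)
    # Stale leader from an older term: reject and keep campaigning.
    return ("candidate", current_term, False)


print(candidate_on_append_entries(current_term=3, rpc_term=3))
print(candidate_on_append_entries(current_term=3, rpc_term=2))
```

The equal-term case matters: another candidate may already have won the election for the same term, so a candidate must step down even when the terms match.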


        The third possible outcome is that a candidate neither wins nor loses the election: if many followers become candidates at the same time, votes could be split so that no candidate obtains a majority. When this happens, each candidate will time out and start a new election by incrementing its term and initiating another round of RequestVote RPCs. However, without extra measures split votes could repeat indefinitely.

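        Translation: The third possible outcome is that a candidate neither wins nor loses the election: if many followers become candidates at the same time, the votes may be split so that no candidate obtains a majority. When this happens, each candidate times out and starts a new election by incrementing its term and initiating another round of RequestVote RPCs. Without extra measures, however, split votes could repeat indefinitely.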


        Raft uses randomized election timeouts to ensure that split votes are rare and that they are resolved quickly. To prevent split votes in the first place, election timeouts are chosen randomly from a fixed interval (e.g., 150–300ms). This spreads out the servers so that in most cases only a single server will time out; it wins the election and sends heartbeats before any other servers time out. The same mechanism is used to handle split votes. Each candidate restarts its randomized election timeout at the start of an election, and it waits for that timeout to elapse before starting the next election; this reduces the likelihood of another split vote in the new election. Section 9.3 shows that this approach elects a leader rapidly.

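        Translation: Raft uses randomized election timeouts to ensure that split votes are rare and are resolved quickly. To prevent split votes in the first place, election timeouts are chosen randomly from a fixed interval (e.g., 150–300 ms). This spreads the servers out so that in most cases only a single server times out; it wins the election and sends heartbeats before any other server times out. The same mechanism is used to handle split votes: each candidate restarts its randomized election timeout at the start of an election and waits for that timeout to elapse before starting the next election, which reduces the likelihood of another split vote in the new election. Section 9.3 shows that this approach elects a leader rapidly.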
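A rough sketch of the randomized timeout idea follows. The 150–300 ms interval is the example from the paper; the function names, the seeded `random.Random` instance, and the five server names are assumptions for illustration only.

```python
import random


def random_election_timeout(rng, low_ms=150, high_ms=300):
    # Each server independently picks a timeout from the same fixed interval.
    return rng.uniform(low_ms, high_ms)


def first_to_time_out(timeouts):
    # The server with the smallest timeout becomes a candidate first.
    return min(timeouts, key=timeouts.get)


rng = random.Random(42)  # seeded only so that repeated runs of the sketch agree
timeouts = {name: random_election_timeout(rng) for name in "ABCDE"}
winner = first_to_time_out(timeouts)
print(timeouts)
print("first to time out:", winner)
```

Because the timeouts are spread across the interval, the first server to time out usually has enough of a head start to collect a majority and send heartbeats before any other server's timeout fires.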

        Elections are an example of how understandability guided our choice between design alternatives. Initially we planned to use a ranking system: each candidate was assigned a unique rank, which was used to select between competing candidates. If a candidate discovered another candidate with higher rank, it would return to follower state so that the higher ranking candidate could more easily win the next election. We found that this approach created subtle issues around availability (a lower-ranked server might need to time out and become a candidate again if a higher-ranked server fails, but if it does so too soon, it can reset progress towards electing a leader). We made adjustments to the algorithm several times, but after each adjustment new corner cases appeared. Eventually we concluded that the randomized retry approach is more obvious and understandable.


 Figure 6: Logs are composed of entries, which are numbered sequentially. Each entry contains the term in which it was created (the number in each box) and a command for the state machine. An entry is considered committed if it is safe for that entry to be applied to state machines.

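What happens on a split brain

Suppose a cluster of A (leader), B, C, D and E is split by a network partition into AB and CDE. What happens?

AB: writes never succeed. For example, if two more entries are written to A and replicated to B, they will not be committed, because A cannot reach a majority of the cluster.

CDE: this side elects a new leader in a new term, and writes succeed. Once the partition heals, the entries written on A and B that do not match the new leader's log are forcibly overwritten by the new leader.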
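The quorum arithmetic behind this partition example can be checked with a few lines. The helper name `has_quorum` is an assumption for illustration; the cluster layout is the one from the example above.

```python
def has_quorum(partition_size, cluster_size):
    # A partition can elect a leader or commit entries only with a strict
    # majority of the FULL cluster, not just of the servers it can reach.
    return partition_size > cluster_size // 2


cluster = ["A", "B", "C", "D", "E"]
minority = ["A", "B"]        # the old leader A is stranded here
majority = ["C", "D", "E"]

print("AB can commit:", has_quorum(len(minority), len(cluster)))
print("CDE can elect and commit:", has_quorum(len(majority), len(cluster)))
```

With five servers, a majority is three, so at most one side of any partition can make progress.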

Animated demo: can Raft still work correctly after a split brain? (bilibili)
