Cockroach Design Translation (Part 12): Raft - Consistency of Range Replicas

16  Raft - Consistency of Range Replicas

Each range is configured to consist of three or more replicas, as specified by their ZoneConfig. The replicas in a range maintain their own instance of a distributed consensus algorithm. We use the Raft consensus algorithm because it is simpler to reason about and includes a reference implementation covering important details. ePaxos has promising performance characteristics for WAN-distributed replicas, but it does not guarantee a consistent ordering between replicas.

Raft elects a relatively long-lived leader, which must be involved to propose commands. The leader heartbeats followers periodically and keeps their logs replicated. In the absence of heartbeats, followers become candidates after randomized election timeouts and proceed to hold new leader elections. Cockroach weights random timeouts such that the replicas with shorter round trip times to peers are more likely to hold elections first (not implemented yet). Only the Raft leader may propose commands; followers simply relay commands to the last known leader.
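To make the idea of RTT-weighted election timeouts concrete, here is a minimal sketch in Python. It is not Cockroach's implementation (the text above notes the feature is not implemented yet). The function names, the `base_ms` parameter, and the specific weighting formula are all assumptions chosen for illustration: each replica draws a timeout from `[base_ms, base_ms + window)`, and the window is narrower for replicas with shorter round trip times, so they tend to time out first.

```python
import random


def weighted_election_timeout(base_ms, rtt_ms, max_rtt_ms, rng):
    """Return a randomized election timeout in milliseconds.

    Hypothetical weighting (an assumption, not taken from Cockroach):
    the random window scales from 0.5 * base_ms for a replica with zero
    RTT up to 1.0 * base_ms for the replica with the largest RTT.
    """
    if base_ms <= 0 or max_rtt_ms <= 0:
        raise ValueError("base_ms and max_rtt_ms must be positive")
    # Clamp the ratio to [0, 1] so odd measurements cannot widen the window.
    rtt_ratio = min(max(rtt_ms / max_rtt_ms, 0.0), 1.0)
    window_ms = base_ms * (0.5 + 0.5 * rtt_ratio)
    return base_ms + rng.random() * window_ms


def first_candidate(rtts_ms, base_ms, rng):
    """Return the replica whose timeout fires first, i.e. the first candidate.

    rtts_ms maps a replica name to its average round trip time to peers.
    """
    max_rtt = max(rtts_ms.values())
    timeouts = {
        name: weighted_election_timeout(base_ms, rtt, max_rtt, rng)
        for name, rtt in rtts_ms.items()
    }
    return min(timeouts, key=timeouts.get)
```

Calling `first_candidate({"near": 5.0, "far": 100.0}, 150.0, random.Random(0))` repeatedly should pick `"near"` in a clear majority of trials, while still leaving `"far"` a chance to win. That remaining randomness is what keeps elections from colliding.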

Our Raft implementation was developed together with CoreOS, but adds an extra layer of optimization to account for the fact that a single Node may have millions of consensus groups (one for each Range). Areas of optimization are chiefly coalesced heartbeats (so that the number of nodes dictates the number of heartbeats, as opposed to the much larger number of ranges) and batch processing of requests. Future optimizations may include two-phase elections and quiescent ranges (i.e. stopping traffic completely for inactive ranges).
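The effect of coalesced heartbeats can be illustrated with a small Python sketch. It is a simplified model under stated assumptions, not Cockroach's actual code: the `RangeHeartbeat` type and the `coalesce` function are hypothetical names, and a real implementation would carry more per-range state. The point is only that per-range heartbeats sharing the same source and destination node collapse into a single message.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class RangeHeartbeat:
    """One logical heartbeat from a range's leader to one follower (hypothetical type)."""
    range_id: int
    from_node: str
    to_node: str


def coalesce(heartbeats):
    """Group per-range heartbeats into one batch per (from_node, to_node) pair.

    Returns a dict mapping (from_node, to_node) to the list of range IDs
    that the single coalesced message would cover.
    """
    batches = defaultdict(list)
    for hb in heartbeats:
        batches[(hb.from_node, hb.to_node)].append(hb.range_id)
    return dict(batches)
```

For example, if node `n1` leads 1,000 ranges that are each replicated to `n2` and `n3`, the uncoalesced scheme sends 2,000 heartbeats per interval, while `coalesce` yields 2 messages, one per destination node. The message count then tracks the number of nodes rather than the number of ranges.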
