Paxos

Paxos is a family of protocols for solving consensus in a network of unreliable processors. Consensus is the process of agreeing on one result (exactly one) among a group of participants. This problem becomes difficult when the participants or their communication medium may experience failures.[1]

Note: when communication between participants fails, no participant is guaranteed to obtain the latest message from any of its peers, and that is what makes agreement hard to guarantee.


Consensus protocols are the basis for the state machine approach to distributed computing, as suggested by Leslie Lamport[2] and surveyed by Fred Schneider.[3] The state machine approach is a technique for converting an algorithm into a fault-tolerant, distributed implementation. Ad-hoc techniques may leave important cases of failures unresolved. The principled approach proposed by Lamport et al. ensures all cases are handled safely.

The Paxos protocol was first published in 1989 and named after a fictional legislative consensus system used on the Paxos island in Greece.[4] It was later published as a journal article in 1998.[5]

The Paxos family of protocols includes a spectrum of trade-offs between the number of processors, the number of message delays before learning the agreed value, the activity level of individual participants, the number of messages sent, and the types of failures.


Although no deterministic fault-tolerant consensus protocol can guarantee progress in an asynchronous network (a result proved in a paper by Fischer, Lynch and Paterson[6]), Paxos guarantees safety (freedom from inconsistency), and the conditions that could prevent it from making progress are difficult to provoke.[5][7][8][9][10]

Paxos is normally used in situations requiring durability (for example, to replicate a file or a database), in which the amount of durable state could be large. The protocol attempts to make progress even during periods when some bounded number of replicas are unresponsive. A reconfiguration mechanism is also available, and can be used to drop a permanently failed replica or to add new replicas to the group.

Note: the "bounded number" here means a minority of the replicas. Minority and majority have strict mathematical definitions, which are discussed later.






---------------------------------------------------------------------------------

One. Assumptions

In order to simplify the presentation of Paxos, the following assumptions and definitions are made explicit. Techniques to broaden the applicability are known in the literature, and are not covered in this article; please see references for further reading.

1. Processors

    Processors operate at arbitrary speed.
    Processors may experience failures.
    Processors with stable storage may re-join the protocol after failures (following a crash-recovery failure model).
    Processors do not collude, lie, or otherwise attempt to subvert the protocol. (That is, Byzantine failures don't occur. See Byzantine Paxos for a solution that tolerates failures that arise from arbitrary/malicious behavior of the processes.)

    Note: a failed processor simply stops communicating. It never deviates from the protocol.
    
    
    
2. Network

    Processors can send messages to any other processor.
    Messages are sent asynchronously and may take arbitrarily long to deliver.
    Messages may be lost, reordered, or duplicated.
    Messages are delivered without corruption. (That is, Byzantine failures don't occur. See Byzantine Paxos for a solution which tolerates corrupted messages that arise from arbitrary/malicious behavior of the messaging channels.)
    
    Note: "without corruption" means that the content of a message is never altered in transit. Loss, duplication, and reordering are still possible.
    
3. Number of processors

In general, a consensus algorithm can make progress using 2F+1 processors despite the simultaneous failure of any F processors.[15] However, using reconfiguration, a protocol may be employed which survives any number of total failures as long as no more than F fail simultaneously.

Note: with 2F+1 processors, at most F of which fail at the same time, at least F+1 processors keep working, and F+1 is exactly the smallest majority of 2F+1.
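
The 2F+1 arithmetic can be checked with a few lines of Python. This is only a sketch; the helper names `majority` and `max_failures` are made up for this note and do not come from the Paxos papers.

```python
def majority(n: int) -> int:
    """Smallest number of processors that forms a majority of n."""
    return n // 2 + 1


def max_failures(n: int) -> int:
    """Largest F such that n >= 2F + 1."""
    return (n - 1) // 2


for n in (3, 4, 5, 7):
    f = max_failures(n)
    # After F simultaneous failures the survivors must still be a majority.
    assert n - f >= majority(n)
    print(f"n={n}: tolerates F={f}, majority={majority(n)}")
```

Note that going from 3 to 4 processors does not raise F, which is one reason odd cluster sizes are the usual choice.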


Two. Roles

Paxos describes the actions of the processes by their roles in the protocol: client, acceptor, proposer, learner, and leader. In typical implementations, a single processor may play one or more roles at the same time. This does not affect the correctness of the protocol; it is usual to coalesce roles to improve latency and/or reduce the number of messages in the protocol.





Client
    The Client issues a request to the distributed system and waits for a response, for instance a write request on a file in a distributed file server.

Acceptor (Voter)
    The Acceptors act as the fault-tolerant "memory" of the protocol. Acceptors are collected into groups called Quorums. Any message sent to an Acceptor must be sent to a Quorum of Acceptors. Any message received from an Acceptor is ignored unless a copy is received from each Acceptor in a Quorum.
    
    

Proposer
    A Proposer advocates a client request, attempting to convince the Acceptors to agree on it, and acts as a coordinator to move the protocol forward when conflicts occur.



Learner
    Learners act as the replication factor for the protocol. Once a Client request has been agreed on by the Acceptors, the Learner may take action (i.e., execute the request and send a response to the client). To improve availability of processing, additional Learners can be added.
    

Leader
    Paxos requires a distinguished Proposer (called the leader) to make progress. Many processes may believe they are leaders, but the protocol only guarantees progress if one of them is eventually chosen. If two processes believe they are leaders, they may stall the protocol by continuously proposing conflicting updates. However, the safety properties are still preserved in that case.
    
    
    
    
    
    
Quorums

Quorums express the safety properties of Paxos by ensuring at least some surviving processor retains knowledge of the results.

Quorums are defined as subsets of the set of Acceptors such that any two subsets (that is, any two Quorums) share at least one member. Typically, a Quorum is any majority of participating Acceptors. For example, given the set of Acceptors {A,B,C,D}, a majority Quorum would be any three Acceptors: {A,B,C}, {A,C,D}, {A,B,D}, {B,C,D}. More generally, arbitrary positive weights can be assigned to Acceptors and a Quorum defined as any subset of Acceptors whose combined weight is greater than half of the total weight of all Acceptors.
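
The intersection property is easy to check by brute force for small sets. The sketch below enumerates the majority Quorums of {A,B,C,D} and also tests the weighted definition. The function names and the example weights are assumptions made for illustration only.

```python
from itertools import combinations


def majority_quorums(acceptors):
    """All smallest-size majority quorums of the given acceptors."""
    size = len(acceptors) // 2 + 1
    return [set(q) for q in combinations(sorted(acceptors), size)]


def is_weighted_quorum(subset, weights):
    """True if the subset holds more than half of the total weight."""
    total = sum(weights.values())
    return 2 * sum(weights[a] for a in subset) > total


quorums = majority_quorums({"A", "B", "C", "D"})
# Any two quorums share at least one acceptor.
assert all(q1 & q2 for q1, q2 in combinations(quorums, 2))
print([sorted(q) for q in quorums])

# Hypothetical weights: A alone outweighs B and C together.
weights = {"A": 3, "B": 1, "C": 1}
print(is_weighted_quorum({"A"}, weights), is_weighted_quorum({"B", "C"}, weights))
```

Two subsets that each hold more than half of the total weight cannot be disjoint, which is why the weighted rule keeps the intersection property.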




Proposal Number & Agreed Value

Each attempt to define an agreed value v is performed with proposals which may or may not be accepted by Acceptors. Each proposal is uniquely numbered for a given Proposer. The value corresponding to a numbered proposal can be computed as part of running the Paxos protocol, but need not be.
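
The text only requires that a Proposer never reuses a number. A common convention, assumed here and not stated in the text, is to make numbers unique across all Proposers by pairing a round counter with a proposer id. `ProposalNumber` and `NumberGenerator` are hypothetical names for this sketch.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True, order=True)
class ProposalNumber:
    round_no: int     # compared first
    proposer_id: int  # breaks ties between different proposers


class NumberGenerator:
    def __init__(self, proposer_id: int):
        self.proposer_id = proposer_id
        self.last_round = 0

    def next(self, highest_seen: Optional[ProposalNumber] = None) -> ProposalNumber:
        """Return a number above our last one and above any number we have seen."""
        if highest_seen is not None:
            self.last_round = max(self.last_round, highest_seen.round_no)
        self.last_round += 1
        return ProposalNumber(self.last_round, self.proposer_id)


g1, g2 = NumberGenerator(1), NumberGenerator(2)
a = g1.next()                # first proposal of proposer 1
b = g2.next()                # same round, different proposer, so a != b
c = g1.next(highest_seen=b)  # proposer 1 jumps past the number it lost to
print(a < b < c)
```

Because the dataclass is ordered field by field, two proposers can never produce equal numbers, and a proposer that has seen a higher number can always get ahead of it.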

    

Safety and liveness properties

Paxos defines three properties. The first two are safety properties and always hold, regardless of the pattern of failures. The third is a liveness property and holds only if sufficient processors remain non-faulty:

Non-triviality
    Only proposed values can be learned.[8]
Consistency
    At most one value can be learned (i.e., two different learners cannot learn different values).[8][9]
Liveness(C;L)
    If value C has been proposed, then eventually learner L will learn some value (if sufficient processors remain non-faulty).[9]
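
The two safety properties can be stated as checks over what the learners ended up with. The sketch below assumes a simple model invented for this note: `proposed` is the set of values any Proposer put forward, and `learned` maps each learner name to the value it learned.

```python
def check_safety(proposed, learned):
    """Return (non_trivial, consistent) for one Paxos instance."""
    values = set(learned.values())
    non_trivial = values <= set(proposed)  # only proposed values were learned
    consistent = len(values) <= 1          # at most one distinct value was learned
    return non_trivial, consistent


print(check_safety({"x", "y"}, {"L1": "x", "L2": "x"}))
print(check_safety({"x", "y"}, {"L1": "x", "L2": "y"}))
```

Liveness cannot be checked this way, because it is a statement about what eventually happens and not about a single snapshot.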
    
    
    
    
    
    
====================================================================================================





Basic Paxos


This protocol is the most basic of the Paxos family. Each instance of the Basic Paxos protocol decides on a single output value. The protocol proceeds over several rounds. A successful round has two phases. A Proposer should not initiate Paxos if it cannot communicate with at least a Quorum of Acceptors, where a Quorum here means a majority of all Acceptors.


Phase 1a: Prepare

    A Proposer (the leader) creates a proposal identified with a number N. This number must be greater than any previous proposal number used by this Proposer. It then sends a Prepare message containing N to a Quorum of Acceptors. The proposal carries no value at this point; the value is set in Phase 2a.

Phase 1b: Promise

    If the proposal's number N is higher than any previous proposal number received from any Proposer by the Acceptor, then the Acceptor must return a promise to ignore all future proposals having a number less than N. If the Acceptor accepted a proposal at some point in the past, it must include the previous proposal number and previous value in its response to the Proposer.

    Otherwise, the Acceptor can ignore the received proposal. It does not have to answer in this case for Paxos to work. However, for the sake of optimization, sending a denial (Nack) response would tell the Proposer that it can stop its attempt to create consensus with proposal N.   
   
    Note: a Promise covers two cases. If the Acceptor has not accepted any proposal so far, the Promise carries no previous number or value. If it has, the Promise carries the number and value of the previously accepted proposal.

    Note: the Acceptor promises only when N is strictly greater than every proposal number it has already received. An equal number is not enough.
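
The Acceptor side of Phase 1 can be sketched as follows. This is a minimal in-memory model, not a real implementation: the `Promise` and `Nack` classes are hypothetical message types, proposal numbers are assumed to be integers starting at 1, and a real Acceptor would write its state to stable storage before replying.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class Promise:
    n: int
    accepted_n: Optional[int]  # None if nothing was accepted before
    accepted_value: Any


@dataclass
class Nack:
    n: int


class Acceptor:
    def __init__(self):
        self.promised_n = 0      # highest Prepare number seen; 0 means none yet
        self.accepted_n = None
        self.accepted_value = None

    def on_prepare(self, n: int):
        if n > self.promised_n:  # strictly greater; an equal number is refused
            self.promised_n = n
            return Promise(n, self.accepted_n, self.accepted_value)
        return Nack(n)           # optional optimization; staying silent is also valid


acceptor = Acceptor()
print(acceptor.on_prepare(5))  # first Prepare: a Promise with no previous value
print(acceptor.on_prepare(5))  # same number again: refused
print(acceptor.on_prepare(3))  # lower number: refused
```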
    
    
    

Phase 2a: Accept Request

    If a Proposer receives promises from a Quorum of Acceptors, it needs to set a value for its proposal. There are two cases:
    1> If any Acceptors had previously accepted a proposal, they will have sent its number and value to the Proposer, which must now set the value of its proposal to the value associated with the highest proposal number reported by the Acceptors.
    2> If none of the Acceptors had accepted a proposal up to this point, the Proposer may choose any value for its proposal.[17]

    The Proposer then sends an Accept Request message, carrying the chosen value for its proposal, to a Quorum of Acceptors.
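
The value rule of Phase 2a fits in one small function. In this sketch each promise is modeled as a pair `(accepted_n, accepted_value)`, with `(None, None)` standing for an Acceptor that has accepted nothing. That representation and the name `choose_value` are assumptions made for illustration.

```python
from typing import Any, List, Optional, Tuple


def choose_value(promises: List[Tuple[Optional[int], Any]], own_value: Any) -> Any:
    """Pick the value a Proposer must send in its Accept Request."""
    accepted = [(n, v) for n, v in promises if n is not None]
    if not accepted:
        # Case 2: nobody accepted anything, so the Proposer is free to choose.
        return own_value
    # Case 1: adopt the value attached to the highest reported proposal number.
    return max(accepted, key=lambda pair: pair[0])[1]


print(choose_value([(None, None), (None, None)], "mine"))
print(choose_value([(None, None), (3, "a"), (7, "b")], "mine"))
```

Adopting the value of the highest-numbered accepted proposal is what keeps a later round from overwriting a value that may already have been chosen.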
       

Phase 2b: Accepted

    If an Acceptor receives an Accept Request message for a proposal N, it must accept it if and only if it has not already promised to only consider proposals having an identifier greater than N. In this case, it should register the corresponding value v and send an Accepted message to the Proposer and every Learner. Else, it can ignore the Accept Request.

    Rounds fail when multiple Proposers send conflicting Prepare messages, or when the Proposer does not receive a Quorum of responses (Promise or Accepted). In these cases, another round must be started with a higher proposal number.

    Notice that when Acceptors accept a request, they also acknowledge the leadership of the Proposer. Hence, Paxos can be used to select a leader in a cluster of nodes.

    [Figure: message flow of the Basic Paxos protocol]

    Note that the values returned in the Promise message are null the first time a proposal is made, since no Acceptor has accepted a value before in this round.
    
    Note: the test in Phase 2b is weaker than the one in Phase 1b. An Accept Request numbered N is rejected only if the Acceptor has already promised a number greater than N, so a request carrying exactly the promised number is accepted.
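
The Acceptor side of Phase 2 can be sketched in the same in-memory style. The tuple returned by `on_accept` stands in for the Accepted message that would go to the Proposer and every Learner. Raising `promised_n` on acceptance is a common implementation choice assumed here, not something the text spells out.

```python
class Acceptor:
    def __init__(self, promised_n=0):
        self.promised_n = promised_n  # highest number promised so far
        self.accepted_n = None
        self.accepted_value = None

    def on_accept(self, n, value):
        """Handle an Accept Request; return the Accepted message or None."""
        if n >= self.promised_n:      # no promise to a number greater than n
            self.promised_n = n
            self.accepted_n = n
            self.accepted_value = value
            return ("accepted", n, value)
        return None                   # the request is ignored


acceptor = Acceptor(promised_n=5)
print(acceptor.on_accept(4, "x"))  # below the promise: ignored
print(acceptor.on_accept(5, "y"))  # equal to the promise: accepted
```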
    
    
    
    
    
Multi-Paxos

A typical deployment of Paxos requires a continuous stream of agreed values acting as commands to a distributed state machine. If each command is the result of a single instance of the Basic Paxos protocol, a significant amount of overhead would result.

If the leader is relatively stable, phase 1 becomes unnecessary. Thus, it is possible to skip phase 1 for future instances of the protocol with the same leader.

To achieve this, the instance number I is included along with each value. Multi-Paxos reduces the failure-free message delay (proposal to learning) from 4 delays to 2 delays.
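
The saving from 4 delays to 2 can be seen in a toy model. The sketch below is heavily simplified and rests on loud assumptions: all processes live in one Python program, the acceptors start with empty logs (so the leader never has to adopt an earlier value), and no other leader competes. The `Leader` and `Acceptor` classes are made up for this note.

```python
class Acceptor:
    def __init__(self):
        self.promised_n = 0
        self.log = {}  # instance number -> (proposal number, value)

    def on_prepare(self, n):
        if n > self.promised_n:
            self.promised_n = n
            return True
        return False

    def on_accept(self, n, instance, value):
        if n >= self.promised_n:
            self.promised_n = n
            self.log[instance] = (n, value)
            return True
        return False


class Leader:
    def __init__(self, n, acceptors):
        self.n, self.acceptors = n, acceptors
        self.quorum = len(acceptors) // 2 + 1
        self.prepared = False  # True once phase 1 has succeeded
        self.next_instance = 0
        self.delays = []       # message delays spent on each command

    def submit(self, value):
        delays = 0
        if not self.prepared:  # phase 1 runs only for the first command
            promises = sum(a.on_prepare(self.n) for a in self.acceptors)
            if promises < self.quorum:
                return None
            self.prepared = True
            delays += 2        # Prepare + Promise
        instance = self.next_instance
        acks = sum(a.on_accept(self.n, instance, value) for a in self.acceptors)
        delays += 2            # Accept Request + Accepted
        if acks < self.quorum:
            self.prepared = False  # leadership lost; phase 1 must run again
            return None
        self.next_instance += 1
        self.delays.append(delays)
        return instance


leader = Leader(1, [Acceptor() for _ in range(3)])
print([leader.submit(cmd) for cmd in ("a", "b", "c")], leader.delays)
```

Only the first command pays for phase 1. Every later command with the same leader costs just the Accept Request and Accepted exchange.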
    

