Raft Paper Reading Notes

rsy56640

Article link: https://blog.csdn.net/rsy56640/article/details/89116768

Reading notes on In Search of an Understandable Consensus Algorithm (Extended Version)

Raft replicates a log across servers so that each one runs the same deterministic state machine.

The consensus algorithm must keep the replicated log consistent across servers.

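Paxos is hard to understand and hard to implement.
Paxos treats each instance independently, so the individual decisions still have to be merged into a sequential log; the whole system should instead be designed around the log.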

Two things are enough to understand Raft:

- The properties maintained at all times (Figure 3)
- The execution flow (Figure 2)

Timing requirement: broadcastTime ≪ electionTimeout ≪ MTBF (the average time between failures of a single server)
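As a rough illustration of the timing requirement, the sketch below checks a configuration against the inequality. The function name `timing_ok`, the factor of 10 standing in for "≪", and the sample numbers are all assumptions made for this example; they do not come from the notes.

```python
def timing_ok(broadcast_ms, election_timeout_ms, mtbf_ms, factor=10):
    """Return True if broadcastTime << electionTimeout << MTBF.

    `factor` is an assumed stand-in for "much less than": each quantity
    must be at least `factor` times smaller than the next one.
    """
    return (broadcast_ms * factor <= election_timeout_ms
            and election_timeout_ms * factor <= mtbf_ms)


ONE_MONTH_MS = 30 * 24 * 3600 * 1000  # hypothetical MTBF of roughly one month

# A 5 ms broadcast round trip leaves plenty of room under a 200 ms timeout.
fast_network = timing_ok(5, 200, ONE_MONTH_MS)
# A 100 ms broadcast time is too close to a 200 ms timeout.
slow_network = timing_ok(100, 200, ONE_MONTH_MS)
```

If the first inequality fails, followers tend to time out before heartbeats arrive; if the second fails, servers crash too often for a leader to make steady progress.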

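- Election Safety: at most one leader is elected in any given term.
- Leader Append-Only: a leader never overwrites or deletes entries in its own log; it only appends.
- Log Matching: if two logs contain an entry with the same index and term, then the logs are identical in all entries up through that index.
- Leader Completeness: an entry committed in a given term is present in the logs of the leaders of all later terms, so a leader always holds every committed entry.
- State Machine Safety: if a server has applied the log entry at a given index, no other server will ever apply a different entry at that index. (This one feels redundant to me, since every entry is committed before it is applied.)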
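To make the Log Matching property concrete, here is a minimal sketch that checks it between two logs. The representation (a list of `(term, command)` tuples, with list position i holding log index i + 1) and the name `log_matching_holds` are assumptions for illustration only.

```python
def log_matching_holds(log_a, log_b):
    """Check the Log Matching property between two logs.

    Each log is a list of (term, command) tuples. Scanning from the end,
    the first position where both logs hold an entry with the same term
    is the highest matching index; the prefixes up to and including it
    must then be identical.
    """
    for i in range(min(len(log_a), len(log_b)) - 1, -1, -1):
        if log_a[i][0] == log_b[i][0]:
            return log_a[:i + 1] == log_b[:i + 1]
    return True  # no entry shares an index and term, nothing to check


a = [(1, "x"), (1, "y"), (2, "z")]
b = [(1, "x"), (1, "y"), (3, "w")]   # diverges only after index 2
c = [(1, "x"), (2, "q"), (2, "z")]   # same term at index 3, different prefix

ok = log_matching_holds(a, b)
broken = log_matching_holds(a, c)
```

Checking only the highest matching index is enough: if the prefixes agree there, they agree at every lower matching index as well.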

At any time, each server is in one of three states: follower, candidate, or leader.

Each term has at most one leader. A term usually consists of two phases: an election, followed by normal operation under the elected leader.

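Persistent state on every server
- currentTerm: the latest term the server has seen
- votedFor: the candidate this server voted for in currentTerm (null if none)
- log[]: the log entries
Volatile state on every server
- commitIndex: index of the highest log entry known to be committed
- lastApplied: index of the highest log entry applied to the state machine
Volatile state on the leader (reinitialized after each election)
- nextIndex[]: nextIndex[i] is the index of the next log entry to send to server[i]
- matchIndex[]: matchIndex[i] is the index of the highest log entry known to be replicated on server[i]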
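The state listed above can be sketched as plain data classes. The class names, the use of integer server ids, and the `after_election` helper are assumptions for illustration. The initial values (nextIndex set to the leader's last log index + 1, matchIndex set to 0) follow Figure 2 of the paper rather than these notes.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class LogEntry:
    term: int       # term in which the leader received the entry
    command: str    # opaque command for the state machine


@dataclass
class ServerState:
    # Persistent state (must reach stable storage before answering RPCs).
    current_term: int = 0
    voted_for: Optional[int] = None
    log: List[LogEntry] = field(default_factory=list)
    # Volatile state on every server.
    commit_index: int = 0
    last_applied: int = 0


@dataclass
class LeaderState:
    # Volatile state on the leader, keyed by follower id.
    next_index: Dict[int, int] = field(default_factory=dict)
    match_index: Dict[int, int] = field(default_factory=dict)

    @classmethod
    def after_election(cls, peer_ids, last_log_index):
        """Reinitialize leader state right after winning an election."""
        return cls(
            next_index={p: last_log_index + 1 for p in peer_ids},
            match_index={p: 0 for p in peer_ids},
        )


server = ServerState()
leader = LeaderState.after_election(peer_ids=[2, 3], last_log_index=5)
```

Keeping the leader-only fields in a separate class makes it harder to read stale nextIndex or matchIndex values after a server steps down.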

All servers
- If commitIndex > lastApplied, apply the next log entries to the state machine
- If an RPC request or response carries a term T > currentTerm, set currentTerm = T and convert to follower
Followers
- Respond to RPCs from candidates and leaders
- If the election timeout elapses without receiving an AppendEntries RPC from the current leader or granting a vote to a candidate, convert to candidate
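Candidates
- On converting to candidate, start an election
  - currentTerm++
  - Vote for itself
  - Reset the election timer
  - Send RequestVote RPCs to all other servers
- If it receives votes from a majority, become leader
- If it receives an AppendEntries RPC with term >= currentTerm, convert to follower
- If the election timeout elapses, start a new election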
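Leaders
- After winning the election, send empty AppendEntries RPCs (heartbeats) to every server, and repeat them while idle so that followers do not time out
- On a client request, append the entry to the local log; reply to the client once the entry has been applied to the state machine
- If the last log index >= nextIndex[i], send server[i] an AppendEntries RPC with the log entries starting at nextIndex[i]
  - On success, update nextIndex[i] and matchIndex[i]
  - On failure due to log inconsistency, decrement nextIndex[i] and retry
- If there exists an N such that (1) N > commitIndex, (2) a majority of matchIndex[i] >= N, and (3) log[N].term == currentTerm, set commitIndex = N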
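The last leader rule is easy to get wrong, so here is a minimal sketch of it. The function name, the flat `log_terms` list (the term of each entry, with position i holding log index i + 1), and the choice to pick the largest qualifying N are assumptions for illustration.

```python
def advance_commit_index(log_terms, match_index, commit_index, current_term):
    """Return the new commitIndex for a leader.

    log_terms    -- term of each entry in the leader's log (index 1 first)
    match_index  -- matchIndex values of the followers (leader excluded)
    commit_index -- the leader's current commitIndex
    current_term -- the leader's currentTerm
    """
    cluster_size = len(match_index) + 1          # followers plus the leader
    for n in range(len(log_terms), commit_index, -1):
        if log_terms[n - 1] != current_term:
            continue                             # condition (3) fails
        replicated = 1 + sum(1 for m in match_index if m >= n)
        if replicated * 2 > cluster_size:        # condition (2): strict majority
            return n
    return commit_index                          # no N qualifies


# Five servers, leader in term 2 holding four entries.
new_commit = advance_commit_index(
    log_terms=[1, 1, 2, 2], match_index=[4, 3, 1, 0],
    commit_index=1, current_term=2)

# Entries from an older term are never committed by counting replicas alone.
old_term_commit = advance_commit_index(
    log_terms=[1, 1], match_index=[2, 2],
    commit_index=0, current_term=2)
```

In the second call every server holds both entries, yet commitIndex stays at 0 because condition (3) fails; those entries become committed only indirectly, once an entry from the current term is committed after them.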

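RequestVote RPC (sent by candidates to gather votes)
Arguments
- term: the candidate's term
- candidateId
- lastLogIndex: index of the candidate's last log entry
- lastLogTerm: term of the candidate's last log entry
Results
- term: the receiver's currentTerm, so that a candidate that is behind can update itself
- voteGranted: boolean, true if the receiver granted its vote
Voting rules
- Reject if Candidate.term < currentTerm
- If self.votedFor is null or candidateId, and the candidate's log is at least as up-to-date as the receiver's own log (compare the term of the last entry first, then its index; this covers the whole log, not only committed entries), grant the vote. The vote is granted again when votedFor already equals candidateId because messages may arrive out of order or be retried.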
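The voting rules can be sketched as a single handler. The `Voter` class, the flat `log_terms` list, and the function name are assumptions for illustration. The step that clears votedFor when a higher term arrives comes from the "All servers" rule, and the term-then-index comparison is the paper's definition of "up-to-date".

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Voter:
    current_term: int = 0
    voted_for: Optional[int] = None
    log_terms: List[int] = field(default_factory=list)  # term per entry, index 1 first


def handle_request_vote(v, term, candidate_id, last_log_index, last_log_term):
    """Return (term, voteGranted) for a RequestVote RPC received by `v`."""
    if term < v.current_term:
        return v.current_term, False           # stale candidate
    if term > v.current_term:
        v.current_term = term                  # newer term: adopt it and
        v.voted_for = None                     # forget the old term's vote
    my_last_index = len(v.log_terms)
    my_last_term = v.log_terms[-1] if v.log_terms else 0
    # Compare the last term first, then the last index.
    up_to_date = (last_log_term, last_log_index) >= (my_last_term, my_last_index)
    if v.voted_for in (None, candidate_id) and up_to_date:
        v.voted_for = candidate_id
        return v.current_term, True
    return v.current_term, False


voter = Voter(current_term=2, log_terms=[1, 2])
stale = handle_request_vote(voter, 1, 7, 5, 5)      # candidate term too old
granted = handle_request_vote(voter, 3, 7, 2, 2)    # log equally up-to-date
rival = handle_request_vote(voter, 3, 8, 9, 9)      # already voted for 7
repeat = handle_request_vote(voter, 3, 7, 2, 2)     # duplicate request from 7
```

A real server would persist currentTerm and votedFor before replying; the sketch keeps everything in memory.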

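AppendEntries RPC (sent by the leader to replicate log entries; also used as heartbeat)
Arguments
- term: the leader's term
- leaderId
- prevLogIndex: index of the log entry immediately preceding entries[]
- prevLogTerm: term of the log entry immediately preceding entries[]
- entries[]: the log entries to store (empty for a heartbeat)
- leaderCommit: the leader's commitIndex
Results
- term: the receiver's currentTerm, so that a leader that is behind can update itself
- success: true if the follower held an entry matching prevLogIndex and prevLogTerm and accepted the entries
Receiver rules
- Return false if Leader.term < currentTerm
- Return false if the log has no entry at prevLogIndex whose term equals prevLogTerm
- Once prevLogIndex and prevLogTerm match, compare the entries that follow with entries[]: if an existing entry conflicts with a new one (same index, different term), delete it and everything after it, then append any new entries not already in the log
- If leaderCommit > self.commitIndex, set self.commitIndex = min(leaderCommit, index of the last new entry)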
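The receiver rules can be sketched as follows. The `Follower` class, the `(term, command)` tuple representation, and the function name are assumptions for illustration. The sketch truncates the log only at the first conflicting entry, as in Figure 2 of the paper, so that a delayed RPC carrying fewer entries cannot erase newer ones.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Follower:
    current_term: int = 0
    log: List[Tuple[int, str]] = field(default_factory=list)  # (term, command), index 1 first
    commit_index: int = 0


def handle_append_entries(f, term, prev_log_index, prev_log_term,
                          entries, leader_commit):
    """Return (term, success) for an AppendEntries RPC received by `f`."""
    if term < f.current_term:
        return f.current_term, False            # stale leader
    f.current_term = term
    if prev_log_index > 0:
        if (len(f.log) < prev_log_index
                or f.log[prev_log_index - 1][0] != prev_log_term):
            return f.current_term, False        # consistency check failed
    for offset, entry in enumerate(entries):
        pos = prev_log_index + offset           # 0-based position in f.log
        if pos < len(f.log):
            if f.log[pos][0] != entry[0]:       # conflict: drop it and the rest
                del f.log[pos:]
                f.log.append(entry)
        else:
            f.log.append(entry)                 # entry not yet in the log
    if leader_commit > f.commit_index:
        last_new_index = prev_log_index + len(entries)
        f.commit_index = min(leader_commit, last_new_index)
    return f.current_term, True


follower = Follower(current_term=1, log=[(1, "a"), (1, "b"), (1, "c")])
# The leader of term 2 overwrites the conflicting entries from index 2 on.
accepted = handle_append_entries(follower, 2, 1, 1, [(2, "x")], 2)
# A request whose prevLogIndex lies beyond the end of the log is rejected.
rejected = handle_append_entries(follower, 2, 5, 2, [], 2)
```

On a `False` reply caused by the consistency check, the leader would decrement nextIndex[i] and retry, as described in the leader rules.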