Raft Lab 2A from MIT 6.824 (2020)

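Preface

I'm working through MIT 6.824 (2020) and have just started the second lab, Raft 2A. This lab implements leader election and heartbeats without the log. I spent a lot of time understanding the paper, and then a lot of time debugging. Once you understand what Raft is trying to do, the hard part of this lab is avoiding deadlock. The following links helped me a lot and are worth a look: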

  1. MIT's lecture on Go and Raft
  2. A shell script for running the tests in parallel
## Run the 2A tests 100 times, 5 tests in parallel
sh test_many.sh 100 5 2A 

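I. Raft

Flow

I implemented this while following an animated walkthrough of Raft, which helps a lot with understanding
Animated walkthrough

The rough flow of my implementation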

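  1. Make: initialize a Raft instance and, before returning, start separate goroutines that run LeaderElection and monitor the current state
  2. LeaderElection: set a timer; on timeout, call kickoffElection
  3. kickoffElection: send the sendRequestVote RPC to every peer
    3.1. If fewer than a majority of votes are received, return
    3.2. If a majority of votes is received, convertToLeader and SendHeartbeat
  4. SendHeartbeat: send sendAppendEntries to every peer
    4.1. If reply.Success is false and reply.Term > rf.currentTerm: convertToFollower
    4.2. Otherwise just return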

struct

I basically filled in the properties straight from Figure 2 of the Raft paper

type Raft struct {
	mu        sync.Mutex          // Lock to protect shared access to this peer's state
	peers     []*labrpc.ClientEnd // RPC end points of all peers
	persister *Persister          // Object to hold this peer's persisted state
	me        int                 // this peer's index into peers[]
	dead      int32               // set by Kill()

	//Persistent state on all servers
	currentTerm int
	votedFor    int
	log         []int

	//additional property
	currentState ServerState
	lastReceived time.Time

	//Volatile state on all servers
	commitIndex int
	lastApplied int

	//Volatile state on leaders
	nextIndex  []int
	matchIndex []int
}
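
The struct uses a ServerState type that is not shown here. A minimal sketch of how it could be declared, assuming only the four state names the rest of the code refers to (Follower, Candidate, Leader, Dead). The numeric values and the String method are assumptions added for illustration, not part of the original code:

```go
package main

import "fmt"

// ServerState is the role a Raft peer currently plays.
// The concrete values are an assumption; only the names appear in the post.
type ServerState int

const (
	Follower ServerState = iota
	Candidate
	Leader
	Dead
)

// String makes states readable in debug logs (hypothetical helper).
func (s ServerState) String() string {
	switch s {
	case Follower:
		return "Follower"
	case Candidate:
		return "Candidate"
	case Leader:
		return "Leader"
	case Dead:
		return "Dead"
	}
	return fmt.Sprintf("ServerState(%d)", int(s))
}

func main() {
	fmt.Println(Follower, Candidate, Leader, Dead)
}
```

Starting the enum at Follower means a zero-valued Raft struct is already a follower, which matches what Make sets explicitly.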

AppendEntriesArgs and AppendEntriesReply can simply be modeled on RequestVoteArgs

//AppendEntries
type AppendEntriesArgs struct {
	Term         int
	LeaderId     int
	PrevLogIndex int
	PrevLogTerm  int
	Entries      []int
	LeaderCommit int
}

type AppendEntriesReply struct {
	Term    int
	Success bool
}

Make: the key points are go rf.LeaderElection() and go rf.StateMonitor()

func Make(peers []*labrpc.ClientEnd, me int,
		persister *Persister, applyCh chan ApplyMsg) *Raft {
	rf := &Raft{}
	rf.peers = peers
	rf.persister = persister
	rf.me = me

	// Your initialization code here (2A, 2B, 2C).
	rf.currentTerm = 1
	rf.votedFor = -1
	rf.log = []int{}
	rf.commitIndex = 0
	rf.lastApplied = 0

	// additional properties
	rf.currentState = Follower
	rf.lastReceived = time.Now()

	// initialize from state persisted before a crash;
	// do this before starting the goroutines so they cannot race with it
	rf.readPersist(persister.ReadRaftState())

	go rf.LeaderElection()
	go rf.StateMonitor()
	return rf
}

StateMonitor mainly checks whether the Raft instance has been killed

func (rf *Raft) StateMonitor() {
	for {
		rf.mu.Lock()
		if rf.killed() {
			rf.currentState = Dead
		}
		rf.mu.Unlock()
		time.Sleep(10 * time.Millisecond)
	}
}
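
StateMonitor relies on rf.killed(), which comes from the lab skeleton rather than from this post. A minimal sketch of that pair of methods, assuming the skeleton's approach of an atomic flag stored in the dead field. The Raft type here is a cut-down stand-in with only that field:

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// Raft is a stand-in that keeps only the field Kill/killed touch.
type Raft struct {
	dead int32 // set by Kill()
}

// Kill marks the peer as dead; the tester calls this when it shuts a peer down.
func (rf *Raft) Kill() {
	atomic.StoreInt32(&rf.dead, 1)
}

// killed reports whether Kill has been called. It is safe to call without rf.mu.
func (rf *Raft) killed() bool {
	return atomic.LoadInt32(&rf.dead) == 1
}

func main() {
	rf := &Raft{}
	fmt.Println("before Kill:", rf.killed())
	rf.Kill()
	fmt.Println("after Kill:", rf.killed())
}
```

Because the flag is read atomically, long-running loops can poll killed() directly, which is what SendHeartbeat does alongside its state check.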

With the properties filled in and Make implemented, what remains is the election and the heartbeat


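II. Election implementation

Broadly, an election has two steps

  1. A timeout triggers the election
  2. The election RPC is sent to the peers

Implementation of the timeout-triggered election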

func (rf *Raft) LeaderElection() {
	for {
		electionTimeout := ElectionInterval + rand.Intn(200)
		startTime := time.Now()
		time.Sleep(time.Duration(electionTimeout) * time.Millisecond)
		rf.mu.Lock()
		if rf.currentState == Dead {
			rf.mu.Unlock()
			return
		}
		if rf.lastReceived.Before(startTime) {
			if rf.currentState != Leader {
				go rf.kickoffElection()
			}
		}
		rf.mu.Unlock()
	}
}
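
The values of ElectionInterval and HeartBeatInterval are not given here. A minimal sketch of the randomized timeout, assuming 300 ms and 100 ms respectively (both assumed values) and the same 200 ms of jitter that LeaderElection uses. The helper name randomElectionTimeout and the constant electionJitter are hypothetical:

```go
package main

import (
	"fmt"
	"math/rand"
	"time"
)

// Assumed values in milliseconds; the real ones are not stated in the post.
const (
	HeartBeatInterval = 100
	ElectionInterval  = 300
	electionJitter    = 200 // mirrors rand.Intn(200) in LeaderElection
)

// randomElectionTimeout returns a duration in
// [ElectionInterval, ElectionInterval+electionJitter) milliseconds.
func randomElectionTimeout() time.Duration {
	ms := ElectionInterval + rand.Intn(electionJitter)
	return time.Duration(ms) * time.Millisecond
}

func main() {
	for i := 0; i < 3; i++ {
		fmt.Println(randomElectionTimeout())
	}
}
```

Whatever values are chosen, the shortest election timeout should be several heartbeat intervals long, so that one delayed heartbeat does not trigger an election. The random jitter keeps peers from timing out together and splitting the vote.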

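Sending the election RPC to the peers: points to note

  1. kickoffElection needs to call convertToCandidate
  2. If reply.Term > rf.currentTerm, the current peer must convertToFollower
  3. At first I called rf.SendHeartbeat() inside rf.convertToLeader(), which caused a lot of locking problems. Splitting rf.SendHeartbeat() out fixed them: convertToLeader is a very quick operation, and once the two are separated a state transition no longer has to wait for SendHeartbeat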
func (rf *Raft) kickoffElection() {
	rf.mu.Lock()
	defer rf.mu.Unlock()
	rf.lastReceived = time.Now()
	rf.convertToCandidate()
	args := RequestVoteArgs{rf.currentTerm, rf.me, 0, 0}
	numVote := 1
	for i := 0; i < len(rf.peers); i++ {
		if i != rf.me {
			go func(p int) {
				reply := RequestVoteReply{}
				ok := rf.sendRequestVote(p, &args, &reply)
				if !ok {
					return
				}
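				rf.mu.Lock()
				defer rf.mu.Unlock()
				if reply.Term > rf.currentTerm {
					rf.convertToFollower(reply.Term)
					return
				}
				// ignore stale or late replies: the election may already be
				// decided, and a second win must not start a second heartbeat loop
				if rf.currentState != Candidate || rf.currentTerm != args.Term || !reply.VoteGranted {
					return
				}
				numVote++
				if numVote > len(rf.peers)/2 {
					rf.convertToLeader()
					go rf.SendHeartbeat()
					return
				}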
			}(i)
		}
	}
}
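
kickoffElection depends on three convertTo* helpers that are not listed here. A minimal sketch of what they might look like, assuming the state rules in Figure 2 of the Raft paper and assuming the caller already holds rf.mu. The Raft type is a cut-down stand-in, and the decision to clear votedFor only when the term advances is an assumption, not something the post confirms:

```go
package main

import "fmt"

type ServerState int

const (
	Follower ServerState = iota
	Candidate
	Leader
)

// Raft is a stand-in holding only the fields the helpers touch.
type Raft struct {
	me           int
	currentTerm  int
	votedFor     int // -1 means "voted for nobody in this term"
	currentState ServerState
}

// convertToCandidate starts a new term and votes for ourselves.
func (rf *Raft) convertToCandidate() {
	rf.currentState = Candidate
	rf.currentTerm++
	rf.votedFor = rf.me
}

// convertToFollower steps down. The vote is cleared only when the term
// advances, so a peer cannot vote twice within the same term.
func (rf *Raft) convertToFollower(term int) {
	rf.currentState = Follower
	if term > rf.currentTerm {
		rf.currentTerm = term
		rf.votedFor = -1
	}
}

// convertToLeader only flips the state; heartbeats are started separately.
func (rf *Raft) convertToLeader() {
	rf.currentState = Leader
}

func main() {
	rf := &Raft{me: 2, currentTerm: 1, votedFor: -1}
	rf.convertToCandidate()
	fmt.Printf("state=%d term=%d votedFor=%d\n", rf.currentState, rf.currentTerm, rf.votedFor)
	rf.convertToFollower(5)
	fmt.Printf("state=%d term=%d votedFor=%d\n", rf.currentState, rf.currentTerm, rf.votedFor)
}
```

Keeping these helpers free of locking and RPC calls is what makes them cheap to call from inside any critical section.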

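The RequestVote RPC
Just fill it in following the RequestVote RPC part of Figure 2 in the Raft paper. A few points to watch

  1. Every time a RequestVote arrives, the peer resets its timer: rf.lastReceived = time.Now()
  2. If args.Term < rf.currentTerm, besides setting reply.VoteGranted = false, also set reply.Term = rf.currentTerm
  3. If args.Term > rf.currentTerm, the current peer must become a follower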
func (rf *Raft) RequestVote(args *RequestVoteArgs, reply *RequestVoteReply) {
	// Your code here (2A, 2B).
	rf.mu.Lock()
	defer rf.mu.Unlock()
	rf.lastReceived = time.Now()

	//Reply false if term < currentTerm
	if args.Term < rf.currentTerm {
		reply.Term = rf.currentTerm
		reply.VoteGranted = false
		return
	}

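	if args.Term > rf.currentTerm {
		rf.convertToFollower(args.Term)
	}
	// Figure 2: the reply always carries currentTerm so the candidate can update itself
	reply.Term = rf.currentTerm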

	//If votedFor is null or candidateId, and
	//candidate’s log is at least as up-to-date as receiver’s log, grant vote
	if rf.votedFor == -1 || rf.votedFor == args.CandidateId {
		rf.votedFor = args.CandidateId
		reply.VoteGranted = true
	}

}

III. Heartbeat implementation

Once a Candidate succeeds in becoming Leader, it starts sending heartbeats to the peers

if numVote > len(rf.peers)/2 {
	rf.convertToLeader()
	go rf.SendHeartbeat()
	return
}
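
The check numVote > len(rf.peers)/2 relies on integer division, so it is worth seeing what it yields for small clusters. A small worked sketch, with hasMajority as a hypothetical helper name:

```go
package main

import "fmt"

// hasMajority reports whether votes is a strict majority in a cluster of n peers.
// Integer division rounds down, so for n = 5 the threshold is votes > 2.
func hasMajority(votes, n int) bool {
	return votes > n/2
}

func main() {
	for _, n := range []int{3, 4, 5} {
		for v := 1; v <= n; v++ {
			fmt.Printf("peers=%d votes=%d majority=%v\n", n, v, hasMajority(v, n))
		}
	}
}
```

With 3 peers, 2 votes win; with 5 peers, 3 votes win; with 4 peers, 2 votes are a tie and do not win. The candidate's own vote counts, which is why numVote starts at 1.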

SendHeartbeat is the actual heartbeat implementation. It sleeps after each round of sends, which acts as the timer

func (rf *Raft) SendHeartbeat() {
	for {
		rf.mu.Lock()
		if rf.currentState != Leader || rf.killed() {
			rf.mu.Unlock()
			return
		}
		args := AppendEntriesArgs{
			Term:     rf.currentTerm,
			LeaderId: rf.me,
		}
		rf.mu.Unlock()

		for i := 0; i < len(rf.peers); i++ {
			if i != rf.me {
				go func(p int) {
					reply := AppendEntriesReply{}
					ok := rf.sendAppendEntries(p, &args, &reply)
					if !ok {
						return
					}
					rf.mu.Lock()
					defer rf.mu.Unlock()
					if !reply.Success {
						if reply.Term > rf.currentTerm {
							rf.convertToFollower(reply.Term)
						}
					}
					return
				}(i)
			}
		}

		time.Sleep(time.Duration(HeartBeatInterval) * time.Millisecond)
	}
}

SendAppendEntry

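  1. Sending a heartbeat is an RPC, very similar to the vote
  2. When a peer receives the heartbeat AppendEntries RPC, just fill in the handler following the AppendEntries RPC part of Figure 2 in the Raft paper. Since Lab 2A has no requirements on the log, I have not yet implemented many of the paper's rules
  3. Remember to reset the timer here as well; forgetting it cost me quite some time to figure out
  4. TODO: this is where I convert a candidate to a follower. The AppendEntries RPC box in Figure 2 does not mention it, but the Rules for Servers box does list it for candidates (If AppendEntries RPC received from new leader: convert to follower). The tests pass for now; I am not sure yet whether it holds up in the later labs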
func (rf *Raft) AppendEntries(args *AppendEntriesArgs, reply *AppendEntriesReply) {
	rf.mu.Lock()
	defer rf.mu.Unlock()

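	if args.Term < rf.currentTerm {
		reply.Term = rf.currentTerm
		reply.Success = false
		return
	}
	// only a leader of the current (or a newer) term may reset the election timer
	rf.lastReceived = time.Now()
	if args.Term > rf.currentTerm || rf.currentState == Candidate {
		rf.convertToFollower(args.Term)
	}
	reply.Term = rf.currentTerm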
	reply.Success = true
}

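Summary

Raft 2A takes a lot of time spent understanding the paper; I recommend implementing the code alongside the animated walkthrough. Once it clicks, the implementation is not too complicated: election, sending heartbeats, and switching servers between states. Debugging, however, is quite a challenge. I share data almost entirely with locks; it is worth trying to share data by communicating instead, that is, with channels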
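
As an illustration of the channel-based alternative, here is a minimal sketch of an election timer that is reset by messages instead of by a lock-protected timestamp. It is not the implementation described in this post; runElectionTimer and its parameters are hypothetical names, and the durations in main are arbitrary:

```go
package main

import (
	"fmt"
	"time"
)

// runElectionTimer blocks until no reset arrives within timeout (returns true,
// meaning an election should start) or until stop is closed (returns false).
// Every value received on reset restarts the countdown.
func runElectionTimer(timeout time.Duration, reset <-chan struct{}, stop <-chan struct{}) bool {
	for {
		select {
		case <-reset:
			// heartbeat received: loop again, which arms a fresh time.After
		case <-time.After(timeout):
			return true
		case <-stop:
			return false
		}
	}
}

func main() {
	reset := make(chan struct{})
	stop := make(chan struct{})

	// Simulate a leader that sends three heartbeats and then goes silent.
	go func() {
		for i := 0; i < 3; i++ {
			time.Sleep(20 * time.Millisecond)
			reset <- struct{}{}
		}
	}()

	start := time.Now()
	expired := runElectionTimer(50*time.Millisecond, reset, stop)
	fmt.Println("expired:", expired, "elapsed at least 110ms:", time.Since(start) >= 110*time.Millisecond)
}
```

In this style the RPC handlers would send on reset instead of writing rf.lastReceived, so the timer state lives in a single goroutine and needs no mutex.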
