6.824 lab4 Part B

Reading

https://pdos.csail.mit.edu/6.824/labs/lab-shard.html

Definitions

Shard

All data is partitioned by hashing the key and taking the result modulo 10 (shardmaster.NShards), giving 10 slices. Each slice is called a shard and contains a subset of the key/value pairs.

ShardServer (Group)

Each group is a Raft cluster; Raft keeps the data on that set of servers consistent. One group serves the reads and writes for several shards, and the whole system consists of N such groups.

Client

The client, which issues read and write requests.

ShardMaster

The coordinator of the cluster. It decides how shards are assigned to groups and answers routing queries.

Scaling out: when a new group (say Group3) joins, the historical data of some shard, and all subsequent reads and writes for it, must be handed from Group1 to Group3.

Scaling in: when Group3 is decommissioned, the shards it owns, together with their data and future traffic, must be handed over to the remaining groups. After either operation the load should stay balanced across groups.

Routing query: suppose a client wants to Put(key: "name", value: "L") and the key "name" lives on shard 1. The client uses the configuration fetched from the ShardMaster to decide which group to send the write to.
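In code that lookup is only a few lines; groupFor below is a hypothetical helper name, but key2shard and the Clerk's cached config appear in the client code at the end of this post.

// groupFor sketches the lookup a Clerk performs with its cached Config
// before sending a request (hypothetical helper, not part of the lab skeleton).
func groupFor(ck *Clerk, key string) []string {
	shard := key2shard(key)          // which shard the key belongs to
	gid := ck.config.Shards[shard]   // which group currently serves that shard
	return ck.config.Groups[gid]     // that group's servers; try each to find the leader
}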

Config

Maintained by the ShardMaster and pulled by clients and shard servers. It contains the addresses of every server group and the shard-to-group routing table. Every membership or shard-assignment change produces a new Config with a higher version number (Num).
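For reference, the Config type as defined in the lab's shardmaster package:

type Config struct {
	Num    int              // config number (version)
	Shards [NShards]int     // shard -> gid
	Groups map[int][]string // gid -> server names
}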

Recoverable

Snapshots and Raft state must be persisted promptly, so that every server can restart after a crash and recover its state.

Linearizability

A read must observe every write that completed before it.

Load balancing

Shards should be spread as evenly as possible across groups. Consistent hashing is one option; the shardmaster code at the end of this post simply moves shards from the most-loaded group to the least-loaded one whenever groups join or leave.

Applying ShardMaster Config updates

A new Config changes which shards this group owns. When switching to it, the group must stop accepting writes for shards it no longer owns while continuing to serve the shards it keeps. The shards it gives up have to be handed to the groups that now own them, and the local copies cleaned up once the handover is done.

Suppose replica group A gains shard 1 under the new config and group B loses it. If A only waits for B to notice the extra shard and push it over, the wait can be long and it adds work for B.

So A asks B for the shard instead. When B sees the new config it can apply it immediately; A waits until the pull has succeeded and only then applies the new config.

In every Raft group only the leader sends and receives RPCs; followers just consume the apply channel to stay in sync with the leader.

We also cannot serve migrations straight out of the live DB. Since nothing has been cleaned up yet, stale data lingers: if this group first owns shard 1, then loses it, then receives it again, a migration taken from the live DB would be the union of both epochs, when only the most recent one is wanted. So for every config the data that has to leave is carved out separately, and migrations are always served from that per-config copy.

Additions to common.go

type MigrateArgs struct {
	Shard     int
	ConfigNum int
}

type MigrateReply struct {
	Err         Err
	ConfigNum   int
	Shard       int
	DB          map[string]string
	Cid2Seq     map[int64]int
}

New RPC handler in server.go

func (kv *ShardKV) ShardMigration(args *MigrateArgs, reply *MigrateReply) {
	reply.Err, reply.Shard, reply.ConfigNum = ErrWrongLeader, args.Shard, args.ConfigNum
	if _,isLeader := kv.rf.GetState(); !isLeader {return}
	kv.mu.Lock()
	defer kv.mu.Unlock()
	reply.Err = ErrWrongGroup
	if args.ConfigNum >= kv.cfg.Num {return}
	reply.Err,reply.ConfigNum, reply.Shard = OK, args.ConfigNum, args.Shard
	reply.DB, reply.Cid2Seq = kv.deepCopyDBAndDedupMap(args.ConfigNum,args.Shard)
}

func (kv *ShardKV) deepCopyDBAndDedupMap(config int,shard int) (map[string]string, map[int64]int) {
	db2 := make(map[string]string)
	cid2Seq2 := make(map[int64]int)
	for k, v := range kv.toOutShards[config][shard] {
		db2[k] = v
	}
	for k, v := range kv.cid2Seq {
		cid2Seq2[k] = v
	}
	return db2, cid2Seq2
}

PULL DATA

If the leader is the one doing the pulling, a new leader must take over the pull when the old one crashes.

So every replica has to persist where it still needs to pull data from. Once the data has been pulled, the leader puts a command into Raft; this command makes every replica install the data and remove the corresponding entry from that to-pull record.

A dedicated background loop is also required; otherwise, after leadership moves to another server, nobody would pull the data anymore.

Because the loop keeps retrying, the pulled data goes through Raft and comes back on the apply channel so that every replica installs it. Once it has been applied, the pending entry is cleared, so the background loop stops sending useless RPCs.

type ShardKV struct {
	mu           sync.Mutex
	me           int
	rf           *raft.Raft
	applyCh      chan raft.ApplyMsg
	make_end     func(string) *labrpc.ClientEnd
	gid          int
	masters      []*labrpc.ClientEnd
	maxraftstate int // snapshot if log grows this big

	// Your definitions here.
	mck             *shardmaster.Clerk
	cfg             shardmaster.Config
	persist         *raft.Persister
	db              map[string]string
	chMap           map[int]chan Op
	cid2Seq         map[int64]int

	toOutShards     map[int]map[int]map[string]string "cfg num -> (shard -> db)"
	comeInShards    map[int]int     "shard->config number"
	myShards        map[int]bool    "to record which shard i can offer service"
	garbages        map[int]map[int]bool              "cfg number -> shards"

	killCh      chan bool
}
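The Op type for this package is never listed in this post; from the way it is built in GarbageCollection and consumed in normal() and apply(), it has roughly the following shape (field names are an inference):

type Op struct {
	OpType string // "Get", "Put", "Append", "GC"; normal() also overwrites it with ErrWrongGroup to signal rejection
	Key    string // for "GC" ops this carries the config number as a string
	Value  string
	Cid    int64
	SeqNum int // for "GC" ops this carries the shard number
}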

Additions to server.go

func (kv *ShardKV) tryPullShard() {
	_, isLeader := kv.rf.GetState();
	kv.mu.Lock()
	if !isLeader || len(kv.comeInShards) == 0 { // nothing to pull yet, or not the leader
		kv.mu.Unlock()
		return
	}
	var wait sync.WaitGroup
	for shard, idx := range kv.comeInShards {
		wait.Add(1)
		go func(shard int, cfg shardmaster.Config) {
			defer wait.Done()
			args := MigrateArgs{shard, cfg.Num}
			gid := cfg.Shards[shard]
			for _, server := range cfg.Groups[gid] {
				srv := kv.make_end(server)
				reply := MigrateReply{}
				if ok := srv.Call("ShardKV.ShardMigration", &args, &reply); ok && reply.Err == OK {
					kv.rf.Start(reply)
				}

			}
		}(shard, kv.mck.Query(idx))
	}
	kv.mu.Unlock()
	wait.Wait()
}
func (kv *ShardKV) daemon(do func(), sleepMS int) {
	for {
		select {
		case <-kv.killCh:
			return
		default:
			do()
		}
		time.Sleep(time.Duration(sleepMS) * time.Millisecond)
	}
}
StartServer starts the periodic daemons:
go kv.daemon(kv.tryPollNewCfg,50)
go kv.daemon(kv.tryPullShard,80)
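tryPollNewCfg is referenced above but never listed; a minimal sketch, assuming configs are fetched one version at a time and only handed to Raft while no shard transfer is still pending:

func (kv *ShardKV) tryPollNewCfg() {
	_, isLeader := kv.rf.GetState()
	kv.mu.Lock()
	if !isLeader || len(kv.comeInShards) > 0 { // don't move to the next config until pending pulls finish
		kv.mu.Unlock()
		return
	}
	next := kv.cfg.Num + 1
	kv.mu.Unlock()
	cfg := kv.mck.Query(next)
	if cfg.Num == next {
		kv.rf.Start(cfg) // ordered through Raft, applied in updateInAndOutDataShard
	}
}

Because shardmaster.Config and MigrateReply now travel through the Raft log as commands, StartServer must also register them with labgob (mirroring the registrations in the shardmaster's StartServer at the end of this post); otherwise they cannot be decoded from applyCh:

labgob.Register(Op{})
labgob.Register(MigrateArgs{})
labgob.Register(MigrateReply{})
labgob.Register(shardmaster.Config{})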

Going through Raft

Newly fetched Configs and pulled migration data are both put into the Raft log, so they are ordered linearizably together with normal client operations.

StartServer launches a goroutine that handles the messages Raft delivers on applyCh:

go func() {
		for {
			select {
			case <- kv.killCh:
				return
			case applyMsg := <- kv.applyCh:
				if !applyMsg.CommandValid {
					kv.readSnapShot(applyMsg.CommandData)
					continue
				}
				kv.apply(applyMsg)
			}
		}
	}()
func (kv *ShardKV) apply(applyMsg raft.ApplyMsg) {
	if cfg, ok := applyMsg.Command.(shardmaster.Config); ok {
		kv.updateInAndOutDataShard(cfg)
	} else if migrationData, ok := applyMsg.Command.(MigrateReply); ok{
		kv.updateDBWithMigrateData(migrationData)
	}else {
		op := applyMsg.Command.(Op)
		if op.OpType == "GC" {
			cfgNum,_ := strconv.Atoi(op.Key)
			kv.gc(cfgNum,op.SeqNum);
		} else {
			kv.normal(&op)
		}
		if notifyCh := kv.put(applyMsg.CommandIndex,false); notifyCh != nil {
			send(notifyCh,op)
		}
	}
	if kv.needSnapShot() {
		go kv.doSnapShot(applyMsg.CommandIndex)
	}

}
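apply() and templateStart rely on a put helper (and a send helper) that are not listed here; they mirror getCh and send in the shardmaster code at the end of this post. A sketch:

// put returns (and optionally creates) the notification channel for a log index,
// mirroring sm.getCh in the shardmaster code.
func (kv *ShardKV) put(idx int, createIfNotExists bool) chan Op {
	kv.mu.Lock()
	defer kv.mu.Unlock()
	if _, ok := kv.chMap[idx]; !ok {
		if !createIfNotExists {
			return nil
		}
		kv.chMap[idx] = make(chan Op, 1)
	}
	return kv.chMap[idx]
}

// send drops any stale value before notifying, so a slow reader never blocks apply().
func send(notifyCh chan Op, op Op) {
	select {
	case <-notifyCh:
	default:
	}
	notifyCh <- op
}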

Out-of-order MIGRATION DATA replies

Replies fed into Raft come back in log order, but that order can be stale relative to the current state: if the config has already advanced to 9 and Raft only now delivers migration data for config 6, that data should simply be ignored.

When a config change is applied, the config is refreshed and comeInShards entries are added; the background loop then pulls the data. Between adding the comeInShards entry and the shard data actually arriving, every request for that shard must be rejected, so ErrWrongGroup cannot be decided from the config alone.

Add myShards to ShardKV to record which shards can currently be served, and install pulled data when it comes back on applyCh. Below is updateDBWithMigrateData as called from apply(); the stale-config check follows the reasoning above, and populating garbages here (so the GC loop can later ask the source group to drop its copy) is an assumption consistent with how garbages is used further down.

func (kv *ShardKV) updateDBWithMigrateData(migrationData MigrateReply) {
	kv.mu.Lock()
	defer kv.mu.Unlock()
	if migrationData.ConfigNum != kv.cfg.Num-1 {return} // ignore data belonging to an older transition
	delete(kv.comeInShards, migrationData.Shard)
	if _, ok := kv.myShards[migrationData.Shard]; !ok {
		kv.myShards[migrationData.Shard] = true
		for k, v := range migrationData.DB {
			kv.db[k] = v
		}
		for k, v := range migrationData.Cid2Seq {
			kv.cid2Seq[k] = Max(v, kv.cid2Seq[k])
		}
		// assumed: remember this shard so tryGC can ask the source group to delete its copy
		if _, ok := kv.garbages[migrationData.ConfigNum]; !ok {
			kv.garbages[migrationData.ConfigNum] = make(map[int]bool)
		}
		kv.garbages[migrationData.ConfigNum][migrationData.Shard] = true
	}
}

Implementing the shard handover

Given the new config, work out which shards' data has to be shipped out and which shards need to be pulled in.

func (kv *ShardKV) updateInAndOutDataShard(cfg shardmaster.Config) {
	kv.mu.Lock()
	defer kv.mu.Unlock()
	if cfg.Num <= kv.cfg.Num { //only consider newer config
		return
	}
	oldCfg, toOutShard := kv.cfg, kv.myShards
	kv.myShards, kv.cfg = make(map[int]bool), cfg
	for shard, gid := range cfg.Shards {
		if gid != kv.gid {continue}
		if _, ok := toOutShard[shard]; ok || oldCfg.Num == 0 {
			kv.myShards[shard] = true
			delete(toOutShard, shard)
		} else {
			kv.comeInShards[shard] = oldCfg.Num
		}
	}
	if len(toOutShard) > 0 { // prepare data that needed migration
		kv.toOutShards[oldCfg.Num] = make(map[int]map[string]string)
		for shard := range toOutShard {
			outDb := make(map[string]string)
			for k, v := range kv.db {
				if key2shard(k) == shard {
					outDb[k] = v
					delete(kv.db, k)
				}
			}
			kv.toOutShards[oldCfg.Num][shard] = outDb
		}
	}
}

WRONG GROUP

When a new config arrives on the apply channel, the data for outgoing shards is deleted from the DB. To make sure that deletion cannot affect the value a Get returns while it travels through the notify channel, the result must be written into the Op at the moment the apply-channel message is processed; reading the DB only after the Op has been delivered risks racing with another goroutine that is already deleting that data.

func (kv *ShardKV) normal(op *Op) {
	shard := key2shard(op.Key)
	kv.mu.Lock()
	if _, ok := kv.myShards[shard]; !ok {
		op.OpType = ErrWrongGroup
	} else {
		maxSeq,found := kv.cid2Seq[op.Cid]
		if !found || op.SeqNum > maxSeq {
			if op.OpType == "Put" {
				kv.db[op.Key] = op.Value
			} else if op.OpType == "Append" {
				kv.db[op.Key] += op.Value
			}
			kv.cid2Seq[op.Cid] = op.SeqNum
		}
		if op.OpType == "Get" {
			op.Value = kv.db[op.Key]
		}
	}
	kv.mu.Unlock()
}
func (kv *ShardKV) templateStart(originOp Op) (Err, string) {
	index,_,isLeader := kv.rf.Start(originOp)
	if isLeader {
		ch := kv.put(index, true)
		op := kv.beNotified(ch, index)
		if equalOp(originOp, op) { return OK, op.Value }
		if op.OpType == ErrWrongGroup { return ErrWrongGroup, "" }
	}
	return ErrWrongLeader,""
}
func equalOp(a Op, b Op) bool{
	return a.Key == b.Key &&  a.OpType == b.OpType && a.SeqNum == b.SeqNum && a.Cid == b.Cid
}
func (kv *ShardKV) beNotified(ch chan Op,index int) Op{
	select {
	case notifyArg,ok := <- ch :
		kv.mu.Lock()
		if ok {
			close(ch)
		}
		delete(kv.chMap,index)
		kv.mu.Unlock()
		return notifyArg
	case <- time.After(time.Duration(1000)*time.Millisecond):
		return Op{}
	}
}
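The Get and PutAppend RPC handlers themselves are not listed; built on templateStart they reduce to a few lines (a sketch, using the Op layout described earlier and a random Cid for reads; WrongLeader is left at its zero value, so the client's check effectively relies on Err):

func (kv *ShardKV) Get(args *GetArgs, reply *GetReply) {
	originOp := Op{"Get", args.Key, "", nrand(), 0}
	reply.Err, reply.Value = kv.templateStart(originOp)
}

func (kv *ShardKV) PutAppend(args *PutAppendArgs, reply *PutAppendReply) {
	originOp := Op{args.Op, args.Key, args.Value, args.Cid, args.SeqNum}
	reply.Err, _ = kv.templateStart(originOp)
}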

Implementing the new snapshot

func (kv *ShardKV) doSnapShot(index int) {
	w := new(bytes.Buffer)
	e := labgob.NewEncoder(w)
	kv.mu.Lock()
	e.Encode(kv.db)
	e.Encode(kv.cid2Seq)
	e.Encode(kv.comeInShards)
	e.Encode(kv.toOutShards)
	e.Encode(kv.myShards)
	e.Encode(kv.cfg)
	e.Encode(kv.garbages)
	kv.mu.Unlock()
	kv.rf.ReplaceLogWithSnapshot(index,w.Bytes())
}
func (kv *ShardKV) readSnapShot(snapshot []byte) {
	kv.mu.Lock()
	defer kv.mu.Unlock()
	if snapshot == nil || len(snapshot) < 1 {return}
	r := bytes.NewBuffer(snapshot)
	d := labgob.NewDecoder(r)
	var db map[string]string
	var cid2Seq map[int64]int
	var toOutShards map[int]map[int]map[string]string
	var comeInShards map[int]int
	var myShards    map[int]bool
	var garbages    map[int]map[int]bool
	var cfg shardmaster.Config
	if d.Decode(&db) != nil || d.Decode(&cid2Seq) != nil || d.Decode(&comeInShards) != nil ||
		d.Decode(&toOutShards) != nil || d.Decode(&myShards) != nil || d.Decode(&cfg) != nil ||
		d.Decode(&garbages) != nil {
		log.Fatal("readSnapShot ERROR for server ",kv.me)
	} else {
		kv.db, kv.cid2Seq, kv.cfg = db, cid2Seq, cfg
		kv.toOutShards, kv.comeInShards, kv.myShards, kv.garbages = toOutShards,comeInShards,myShards,garbages
	}
}
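needSnapShot, called from apply(), is not listed either; a sketch assuming the usual check of the Raft state size against maxraftstate (the 1/10 headroom threshold is an arbitrary choice):

func (kv *ShardKV) needSnapShot() bool {
	kv.mu.Lock()
	defer kv.mu.Unlock()
	threshold := 10
	return kv.maxraftstate > 0 &&
		kv.maxraftstate-kv.persist.RaftStateSize() < kv.maxraftstate/threshold
}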

GC

func (kv *ShardKV) GarbageCollection(args *MigrateArgs, reply *MigrateReply) {
	reply.Err = ErrWrongLeader
	if _, isLeader := kv.rf.GetState(); !isLeader {return}
	kv.mu.Lock()
	defer kv.mu.Unlock()
	if _,ok := kv.toOutShards[args.ConfigNum]; !ok {return}
	if _,ok := kv.toOutShards[args.ConfigNum][args.Shard]; !ok {return}
	originOp := Op{"GC",strconv.Itoa(args.ConfigNum),"",nrand(),args.Shard}
	// drop the lock while templateStart blocks on Raft, then re-acquire it so the deferred Unlock stays balanced
	kv.mu.Unlock()
	reply.Err,_ = kv.templateStart(originOp)
	kv.mu.Lock()
}
func (kv *ShardKV) gc(cfgNum int, shard int) {
	kv.mu.Lock()
	defer kv.mu.Unlock()
	if _, ok := kv.toOutShards[cfgNum]; ok {
		delete(kv.toOutShards[cfgNum], shard)
		if len(kv.toOutShards[cfgNum]) == 0 {
			delete(kv.toOutShards, cfgNum)
		}
	}
}

Background GC loop

func (kv *ShardKV) tryGC() {
	_, isLeader := kv.rf.GetState();
	kv.mu.Lock()
	if !isLeader || len(kv.garbages) == 0{
		kv.mu.Unlock()
		return
	}
	var wait sync.WaitGroup
	for cfgNum, shards := range kv.garbages {
		for shard := range shards {
			wait.Add(1)
			go func(shard int, cfg shardmaster.Config) {
				defer wait.Done()
				args := MigrateArgs{shard, cfg.Num}
				gid := cfg.Shards[shard]
				for _, server := range cfg.Groups[gid] {
					srv := kv.make_end(server)
					reply := MigrateReply{}
					if ok := srv.Call("ShardKV.GarbageCollection", &args, &reply); ok && reply.Err == OK {
						kv.mu.Lock()
						delete(kv.garbages[cfgNum], shard)
						if len(kv.garbages[cfgNum]) == 0 {
							delete(kv.garbages, cfgNum)
						}
						kv.mu.Unlock() // no defer here: a second OK reply would otherwise deadlock on re-locking
					}
				}
			}(shard, kv.mck.Query(cfgNum))
		}
	}
	kv.mu.Unlock()
	wait.Wait()
}
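Like the other two loops, tryGC has to be started from StartServer via the daemon helper; the 100ms interval here is an assumption:

go kv.daemon(kv.tryGC, 100)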

Code summary

common

package shardkv

//
// Sharded key/value server.
// Lots of replica groups, each running op-at-a-time paxos.
// Shardmaster decides which group serves each shard.
// Shardmaster may change shard assignment from time to time.
//
// You will have to modify these definitions.
//

const (
	OK            = "OK"
	ErrNoKey      = "ErrNoKey"
	ErrWrongGroup = "ErrWrongGroup"
	ErrWrongLeader = "ErrWrongLeader"
)

type Err string

// Put or Append
type PutAppendArgs struct {
	// You'll have to add definitions here.
	Key   string
	Value string
	Op    string // "Put" or "Append"
	Cid    int64 "client unique id"
	SeqNum int   "each request with a monotonically increasing sequence number"
}

type PutAppendReply struct {
	WrongLeader bool
	Err         Err
}

type GetArgs struct {
	Key string
	// You'll have to add definitions here.
}

type GetReply struct {
	WrongLeader bool
	Err         Err
	Value       string
}

type MigrateArgs struct {
	Shard     int
	ConfigNum int
}

type MigrateReply struct {
	Err         Err
	ConfigNum   int
	Shard       int
	DB          map[string]string
	Cid2Seq     map[int64]int
}

func Max(x, y int) int {
	if x > y {
		return x
	}
	return y
}

client

package shardkv

//
// client code to talk to a sharded key/value service.
//
// the client first talks to the shardmaster to find out
// the assignment of shards (keys) to groups, and then
// talks to the group that holds the key's shard.
//

import (
	"labrpc"
)
import "crypto/rand"
import "math/big"
import "shardmaster"
import "time"

//
// which shard is a key in?
// please use this function,
// and please do not change it.
//
func key2shard(key string) int {
	shard := 0
	if len(key) > 0 {
		shard = int(key[0])
	}
	shard %= shardmaster.NShards
	return shard
}

func nrand() int64 {
	max := big.NewInt(int64(1) << 62)
	bigx, _ := rand.Int(rand.Reader, max)
	x := bigx.Int64()
	return x
}

type Clerk struct {
	sm       *shardmaster.Clerk
	config   shardmaster.Config
	make_end func(string) *labrpc.ClientEnd
	// You will have to modify this struct.
	lastLeader  int
	id          int64
	seqNum      int
}

//
// the tester calls MakeClerk.
//
// masters[] is needed to call shardmaster.MakeClerk().
//
// make_end(servername) turns a server name from a
// Config.Groups[gid][i] into a labrpc.ClientEnd on which you can
// send RPCs.
//
func MakeClerk(masters []*labrpc.ClientEnd, make_end func(string) *labrpc.ClientEnd) *Clerk {
	ck := new(Clerk)
	ck.sm = shardmaster.MakeClerk(masters)
	ck.make_end = make_end
	// You'll have to add code here.
	ck.id = nrand()//give each client a unique identifier, and then have them
	ck.seqNum = 0// tag each request with a monotonically increasing sequence number.
	ck.lastLeader = 0
	return ck
}

//
// fetch the current value for a key.
// returns "" if the key does not exist.
// keeps trying forever in the face of all other errors.
// You will have to modify this function.
//
func (ck *Clerk) Get(key string) string {
	args := GetArgs{}
	args.Key = key

	for {
		shard := key2shard(key)
		gid := ck.config.Shards[shard]
		if servers, ok := ck.config.Groups[gid]; ok {
			// try each server for the shard.
			for si := 0; si < len(servers); si++ {
				srv := ck.make_end(servers[si])
				var reply GetReply
				ok := srv.Call("ShardKV.Get", &args, &reply)
				if ok && reply.WrongLeader == false && (reply.Err == OK || reply.Err == ErrNoKey) {
					return reply.Value
				}
				if ok && (reply.Err == ErrWrongGroup) {
					break
				}
			}
		}
		time.Sleep(100 * time.Millisecond)
		// ask master for the latest configuration.
		ck.config = ck.sm.Query(-1)
	}

	return ""
}

//
// shared by Put and Append.
// You will have to modify this function.
//
func (ck *Clerk) PutAppend(key string, value string, op string) {
	args := PutAppendArgs{key,value,op,ck.id,ck.seqNum}
	ck.seqNum++
	for {
		shard := key2shard(key)
		gid := ck.config.Shards[shard]
		if servers, ok := ck.config.Groups[gid]; ok {
			for si := 0; si < len(servers); si++ {
				srv := ck.make_end(servers[si])
				var reply PutAppendReply
				ok := srv.Call("ShardKV.PutAppend", &args, &reply)
				if ok && reply.WrongLeader == false && reply.Err == OK {
					return
				}
				if ok && reply.Err == ErrWrongGroup {
					break
				}
			}
		}
		time.Sleep(100 * time.Millisecond)
		// ask master for the latest configuration.
		ck.config = ck.sm.Query(-1)
	}
}

func (ck *Clerk) Put(key string, value string) {
	ck.PutAppend(key, value, "Put")
}
func (ck *Clerk) Append(key string, value string) {
	ck.PutAppend(key, value, "Append")
}

server (shardmaster)

package shardmaster

import (
	"log"
	"math"
	"raft"
	"time"
)
import "labrpc"
import "sync"
import "labgob"

type ShardMaster struct {
	mu      sync.Mutex
	me      int
	rf      *raft.Raft
	applyCh chan raft.ApplyMsg
	// Your data here.
	configs []Config // indexed by config num
	chMap   map[int]chan Op
	cid2Seq map[int64]int
	killCh  chan bool
}

type Op struct {
	OpType  string "operation type(eg. join/leave/move/query)"
	Args    interface{} // could be JoinArgs, LeaveArgs, MoveArgs and QueryArgs, in reply it could be config
	Cid     int64
	SeqNum  int
}

func (sm *ShardMaster) Join(args *JoinArgs, reply *JoinReply) {
	originOp := Op{"Join",*args,args.Cid,args.SeqNum}
	reply.WrongLeader = sm.templateHandler(originOp)
}

func (sm *ShardMaster) Leave(args *LeaveArgs, reply *LeaveReply) {
	originOp := Op{"Leave",*args,args.Cid,args.SeqNum}
	reply.WrongLeader = sm.templateHandler(originOp)
}

func (sm *ShardMaster) Move(args *MoveArgs, reply *MoveReply) {
	originOp := Op{"Move",*args,args.Cid,args.SeqNum}
	reply.WrongLeader = sm.templateHandler(originOp)
}

func (sm *ShardMaster) Query(args *QueryArgs, reply *QueryReply) {
	reply.WrongLeader = true;
	originOp := Op{"Query",*args,nrand(),-1}
	reply.WrongLeader = sm.templateHandler(originOp)
	if !reply.WrongLeader {
		sm.mu.Lock()
		defer sm.mu.Unlock()
		if args.Num >= 0 && args.Num < len(sm.configs) {
			reply.Config = sm.configs[args.Num]
		} else {
			reply.Config = sm.configs[len(sm.configs) - 1]
		}
	}
}

func (sm *ShardMaster) templateHandler(originOp Op) bool {
	wrongLeader := true
	index,_,isLeader := sm.rf.Start(originOp)
	if !isLeader {return wrongLeader}
	ch := sm.getCh(index,true)
	op := sm.beNotified(ch,index)
	if equalOp(op,originOp) {
		wrongLeader = false
	}
	return wrongLeader
}

func (sm *ShardMaster) beNotified(ch chan Op, index int) Op {
	select {
	case notifyArg := <- ch :
		sm.mu.Lock()
		close(ch)
		delete(sm.chMap,index)
		sm.mu.Unlock()
		return notifyArg
	case <- time.After(time.Duration(600)*time.Millisecond):
		return Op{}
	}
}

func equalOp(a Op, b Op) bool{
	return a.SeqNum == b.SeqNum && a.Cid == b.Cid && a.OpType == b.OpType
}

func (sm *ShardMaster) Kill() {
	sm.rf.Kill()
	sm.killCh <- true
}
// needed by shardkv tester
func (sm *ShardMaster) Raft() *raft.Raft {
	return sm.rf
}

func (sm *ShardMaster) getCh(idx int, createIfNotExists bool) chan Op{
	sm.mu.Lock()
	defer sm.mu.Unlock()
	if _, ok := sm.chMap[idx]; !ok {
		if !createIfNotExists {return nil}
		sm.chMap[idx] = make(chan Op,1)
	}
	return sm.chMap[idx]
}

func (sm *ShardMaster) updateConfig(op string, arg interface{}) {
	cfg := sm.createNextConfig()
	if op == "Move" {
		moveArg := arg.(MoveArgs)
		if _,exists := cfg.Groups[moveArg.GID]; exists {
			cfg.Shards[moveArg.Shard] = moveArg.GID
		} else {return}
	}else if op == "Join" {
		joinArg := arg.(JoinArgs)
		for gid,servers := range joinArg.Servers {
			newServers := make([]string, len(servers))
			copy(newServers, servers)
			cfg.Groups[gid] = newServers
			sm.rebalance(&cfg,op,gid)
		}
	} else if op == "Leave"{
		leaveArg := arg.(LeaveArgs)
		for _,gid := range leaveArg.GIDs {
			delete(cfg.Groups,gid)
			sm.rebalance(&cfg,op,gid)
		}
	} else {
		log.Fatal("invalid area",op)
	}
	sm.configs = append(sm.configs,cfg)
}

func (sm *ShardMaster) createNextConfig() Config {
	lastCfg := sm.configs[len(sm.configs)-1]
	nextCfg := Config{Num: lastCfg.Num + 1, Shards: lastCfg.Shards, Groups: make(map[int][]string)}
	for gid, servers := range lastCfg.Groups {
		nextCfg.Groups[gid] = append([]string{}, servers...)
	}
	return nextCfg
}

func (sm *ShardMaster) rebalance(cfg *Config, request string, gid int) {
	shardsCount := sm.groupByGid(cfg) // gid -> shards
	switch request {
	case "Join":
		avg := NShards / len(cfg.Groups)
		for i := 0; i < avg; i++ {
			maxGid := sm.getMaxShardGid(shardsCount)
			cfg.Shards[shardsCount[maxGid][0]] = gid
			shardsCount[maxGid] = shardsCount[maxGid][1:]
		}
	case "Leave":
		shardsArray,exists := shardsCount[gid]
		if !exists {return}
		delete(shardsCount,gid)
		if len(cfg.Groups) == 0 { // remove all gid
			cfg.Shards = [NShards]int{}
			return
		}
		for _,v := range shardsArray {
			minGid := sm.getMinShardGid(shardsCount)
			cfg.Shards[v] = minGid
			shardsCount[minGid] = append(shardsCount[minGid], v)
		}
	}
}
func (sm *ShardMaster) groupByGid(cfg *Config) map[int][]int {
	shardsCount := map[int][]int{}
	for k,_ := range cfg.Groups {
		shardsCount[k] = []int{}
	}
	for k, v := range cfg.Shards {
		shardsCount[v] = append(shardsCount[v], k)
	}
	return shardsCount
}
func (sm *ShardMaster) getMaxShardGid(shardsCount map[int][]int) int {
	max := -1
	var gid int
	for k, v := range shardsCount {
		if max < len(v) {
			max = len(v)
			gid = k
		}
	}
	return gid
}
func (sm *ShardMaster) getMinShardGid(shardsCount map[int][]int) int {
	min := math.MaxInt32
	var gid int
	for k, v := range shardsCount {
		if min > len(v) {
			min = len(v)
			gid = k
		}
	}
	return gid
}
func send(notifyCh chan Op,op Op) {
	select{
	case <-notifyCh:
	default:
	}
	notifyCh <- op
}
func StartServer(servers []*labrpc.ClientEnd, me int, persister *raft.Persister) *ShardMaster {
	sm := new(ShardMaster)
	sm.me = me
	sm.configs = make([]Config, 1)
	sm.configs[0].Groups = map[int][]string{}
	labgob.Register(Op{})
	labgob.Register(JoinArgs{})
	labgob.Register(LeaveArgs{})
	labgob.Register(MoveArgs{})
	labgob.Register(QueryArgs{})
	sm.applyCh = make(chan raft.ApplyMsg)
	sm.rf = raft.Make(servers, me, persister, sm.applyCh)
	// Your code here.
	sm.chMap = make(map[int]chan Op)
	sm.cid2Seq = make(map[int64]int)
	sm.killCh = make(chan bool,1)
	go func() {
		for {
			select {
			case <-sm.killCh:
				return
			case applyMsg := <-sm.applyCh:
				if !applyMsg.CommandValid {continue}
				op := applyMsg.Command.(Op)
				sm.mu.Lock()
				maxSeq,found := sm.cid2Seq[op.Cid]
				if op.SeqNum >= 0 && (!found || op.SeqNum > maxSeq) {
					sm.updateConfig(op.OpType,op.Args)
					sm.cid2Seq[op.Cid] = op.SeqNum
				}
				sm.mu.Unlock()
				if notifyCh := sm.getCh(applyMsg.CommandIndex,false); notifyCh != nil {
					send(notifyCh,op)
				}
			}
		}
	}()
	return sm
}

 
