Read the lab handout
https://pdos.csail.mit.edu/6.824/labs/lab-shard.html
Definitions
Shard
All data is split into 10 shards by hashing each key and taking the result modulo 10. Each shard is a set of key/value pairs.
ShardServer (Group)
Each group is a Raft cluster; Raft keeps the data on that set of servers consistent. One group serves the read/write requests for several shards, and the whole cluster consists of N groups.
Client
The client, which issues read and write requests.
ShardMaster
The coordinator of the cluster. It adjusts how shards are assigned to groups and answers routing queries.
Scaling out:
When a new group (say Group3) joins, the historical data of some shard and all subsequent reads/writes for it are handed from Group1 to Group3, growing the cluster.
Scaling in:
When Group3 is decommissioned, the shards it serves (historical data plus subsequent reads/writes) must be handed over to other groups. After scaling out or in, load must remain balanced across groups.
Routing query:
Suppose the client wants Put(key:"name", value:"L") and "name" is stored on Shard1. The client must use the config it fetched from the ShardMaster to decide which group to send the write to (see the sketch after these definitions).
Config
Maintained by the ShardMaster and pulled by clients and ShardServers. It contains the server addresses of every group and the shard-to-group routing. Every change to the cluster bumps the config version.
Recoverable
Snapshots and Raft state must be persisted promptly; every server must be able to restart and recover after a failure.
Linearizability
A read must immediately observe any write that completed before it.
Load balancing
Shards should be spread as evenly as possible across groups; consistent hashing or other schemes can be used.
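To make the routing query concrete, here is a minimal sketch of how a request is routed; it assumes the key2shard helper and the shardmaster.Config fields (Shards, Groups) that appear later in the code summary:

// Sketch only: find the servers of the group that currently owns key's shard.
func route(cfg shardmaster.Config, key string, make_end func(string) *labrpc.ClientEnd) []*labrpc.ClientEnd {
    shard := key2shard(key)  // key -> shard index (0..NShards-1)
    gid := cfg.Shards[shard] // shard -> group id, from the current config
    var ends []*labrpc.ClientEnd
    for _, name := range cfg.Groups[gid] { // group id -> server names
        ends = append(ends, make_end(name))
    }
    return ends
}

The real client code in the code summary does exactly this inside its retry loop, refreshing the config from the ShardMaster whenever it gets ErrWrongGroup.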
Config updates from the ShardMaster
A new config changes which shards a group is responsible for. When applying a new config, a group must make sure it no longer accepts writes for shards it no longer owns, while continuing to serve the shards it keeps. The shards it gives up must be handed to the groups that now own them, and cleaned up locally once the hand-off is done.
Suppose replica group A gains Shard 1 and group B correspondingly loses it. If A merely notices that it gained a shard and waits for B to push the data, the wait can be long and it adds work for B.
Instead, A asks B for the shard (a pull). When B sees the new config it can apply it immediately; A must wait until the pull succeeds before applying the config.
In every Raft group only the leader sends and receives RPCs; followers just synchronize their state with the leader from the messages on the apply channel.
We cannot migrate by reading straight out of the live DB. Without garbage collection, stale data is never removed, so the DB holds extra keys: if we first own Shard1, then lose it, then regain it, migrating straight from the DB would transfer the union of both epochs, while we only want the data from the latest hand-off. So for every config we carve out the data that has to leave, and migrations are always keyed by config.
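Concretely, the outgoing data ends up in a nested map keyed first by config number, then by shard (the toOutShards field of ShardKV below). A made-up illustration:

// Illustration only (made-up numbers): toOutShards[5][1] holds exactly the
// contents of shard 1 at the moment config 5 was replaced, regardless of what
// later ends up in the live DB.
func exampleToOutShards() map[int]map[int]map[string]string {
    return map[int]map[int]map[string]string{
        5: {1: {"name": "L"}},
    }
}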
New in common
type MigrateArgs struct {
    Shard     int
    ConfigNum int
}
type MigrateReply struct {
    Err       Err
    ConfigNum int
    Shard     int
    DB        map[string]string
    Cid2Seq   map[int64]int
}
New methods in server
func (kv *ShardKV) ShardMigration(args *MigrateArgs, reply *MigrateReply) {
    reply.Err, reply.Shard, reply.ConfigNum = ErrWrongLeader, args.Shard, args.ConfigNum
    if _, isLeader := kv.rf.GetState(); !isLeader {
        return
    }
    kv.mu.Lock()
    defer kv.mu.Unlock()
    reply.Err = ErrWrongGroup
    if args.ConfigNum >= kv.cfg.Num {
        return // we have not applied the config after args.ConfigNum yet, so its outgoing data is not carved out
    }
    reply.Err, reply.ConfigNum, reply.Shard = OK, args.ConfigNum, args.Shard
    reply.DB, reply.Cid2Seq = kv.deepCopyDBAndDedupMap(args.ConfigNum, args.Shard)
}
func (kv *ShardKV) deepCopyDBAndDedupMap(config int, shard int) (map[string]string, map[int64]int) {
    db2 := make(map[string]string)
    cid2Seq2 := make(map[int64]int)
    for k, v := range kv.toOutShards[config][shard] {
        db2[k] = v
    }
    for k, v := range kv.cid2Seq {
        cid2Seq2[k] = v
    }
    return db2, cid2Seq2
}
PULL DATA
If we let only the leader do the pulling, then when the leader crashes a new leader must take over the pulls.
So every replica must record where to pull data from. Once the data is pulled, the leader must put a command into Raft; that command makes every replica install the data and delete the corresponding "where to pull from" entry.
We also need a dedicated background loop for this; otherwise, after leadership moves to another node, nobody pulls the data anymore.
Because the background loop keeps pulling, the data we get is fed into Raft and comes back on the apply channel, so every replica installs it. Once it is installed we clear the pending entry, which saves the background loop a lot of useless RPCs.
type ShardKV struct {
    mu           sync.Mutex
    me           int
    rf           *raft.Raft
    applyCh      chan raft.ApplyMsg
    make_end     func(string) *labrpc.ClientEnd
    gid          int
    masters      []*labrpc.ClientEnd
    maxraftstate int // snapshot if log grows this big
    // Your definitions here.
    mck          *shardmaster.Clerk
    cfg          shardmaster.Config
    persist      *raft.Persister
    db           map[string]string
    chMap        map[int]chan Op
    cid2Seq      map[int64]int
    toOutShards  map[int]map[int]map[string]string "cfg num -> (shard -> db)"
    comeInShards map[int]int "shard->config number"
    myShards     map[int]bool "to record which shard i can offer service"
    garbages     map[int]map[int]bool "cfg number -> shards"
    killCh       chan bool
}
New in server.go
func (kv *ShardKV) tryPullShard() {
    _, isLeader := kv.rf.GetState()
    kv.mu.Lock()
    if !isLeader || len(kv.comeInShards) == 0 { // nothing to pull in (or not leader); the next config is not fetched until these pulls finish
        kv.mu.Unlock()
        return
    }
    var wait sync.WaitGroup
    for shard, idx := range kv.comeInShards {
        wait.Add(1)
        go func(shard int, cfg shardmaster.Config) {
            defer wait.Done()
            args := MigrateArgs{shard, cfg.Num}
            gid := cfg.Shards[shard]
            for _, server := range cfg.Groups[gid] {
                srv := kv.make_end(server)
                reply := MigrateReply{}
                if ok := srv.Call("ShardKV.ShardMigration", &args, &reply); ok && reply.Err == OK {
                    kv.rf.Start(reply)
                }
            }
        }(shard, kv.mck.Query(idx))
    }
    kv.mu.Unlock()
    wait.Wait()
}
func (kv *ShardKV) daemon(do func(), sleepMS int) {
    for {
        select {
        case <-kv.killCh:
            return
        default:
            do()
        }
        time.Sleep(time.Duration(sleepMS) * time.Millisecond)
    }
}
Register the periodic tasks in StartServer
go kv.daemon(kv.tryPollNewCfg,50)
go kv.daemon(kv.tryPullShard,80)
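tryPollNewCfg is registered here but never shown in these notes. A minimal sketch that fits the rest of the design (only the leader polls, it asks only for the next config number, and it does not move on while shards are still being pulled in) might look like the following; the exact form is an assumption:

func (kv *ShardKV) tryPollNewCfg() {
    _, isLeader := kv.rf.GetState()
    kv.mu.Lock()
    // Don't fetch the next config while shards from the previous
    // reconfiguration are still waiting to be pulled in.
    if !isLeader || len(kv.comeInShards) > 0 {
        kv.mu.Unlock()
        return
    }
    next := kv.cfg.Num + 1
    kv.mu.Unlock()
    cfg := kv.mck.Query(next)
    if cfg.Num == next {
        kv.rf.Start(cfg) // ordered through Raft, see "Raft handling" below
    }
}

Note that shardmaster.Config and MigrateReply values go into the Raft log directly, so StartServer presumably also needs labgob.Register(shardmaster.Config{}) and labgob.Register(MigrateReply{}); otherwise labgob cannot encode them.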
Raft handling
Newly received configs and the pulled migration data are both fed into the Raft log, so they are ordered linearizably together with client operations.
In StartServer, add a goroutine that receives Raft's messages from applyCh
go func() {
    for {
        select {
        case <-kv.killCh:
            return
        case applyMsg := <-kv.applyCh:
            if !applyMsg.CommandValid {
                kv.readSnapShot(applyMsg.CommandData)
                continue
            }
            kv.apply(applyMsg)
        }
    }
}()
func (kv *ShardKV) apply(applyMsg raft.ApplyMsg) {
    if cfg, ok := applyMsg.Command.(shardmaster.Config); ok {
        kv.updateInAndOutDataShard(cfg)
    } else if migrationData, ok := applyMsg.Command.(MigrateReply); ok {
        kv.updateDBWithMigrateData(migrationData)
    } else {
        op := applyMsg.Command.(Op)
        if op.OpType == "GC" {
            cfgNum, _ := strconv.Atoi(op.Key)
            kv.gc(cfgNum, op.SeqNum)
        } else {
            kv.normal(&op)
        }
        if notifyCh := kv.put(applyMsg.CommandIndex, false); notifyCh != nil {
            send(notifyCh, op)
        }
    }
    if kv.needSnapShot() {
        go kv.doSnapShot(applyMsg.CommandIndex)
    }
}
Out-of-order MIGRATION DATA replies
The replies are submitted to Raft, so they get an order, but they can come back out of order relative to config changes: our config may already be at 9 when Raft hands back a reply that belongs to config 6. Such a stale reply should simply be ignored.
When a config change is received, we refresh the config and record the incoming shards in comeInShards; the background thread then pulls them. Between updating comeInShards and the shard data actually arriving, all requests for that shard must be rejected, so we cannot decide "wrong group" from the config alone.
Add myShards to ShardKV to record whether a shard can currently be served
myShards map[int]bool
When migration data comes back through the apply channel it is merged into the DB and the dedup map (this is the core of updateDBWithMigrateData, called from apply above):
if _, ok := kv.myShards[migrationData.Shard]; !ok {
    kv.myShards[migrationData.Shard] = true
    for k, v := range migrationData.DB {
        kv.db[k] = v
    }
    for k, v := range migrationData.Cid2Seq {
        kv.cid2Seq[k] = Max(v, kv.cid2Seq[k])
    }
}
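The notes do not show the full function around this fragment. Based on the surrounding prose (stale replies for old configs are ignored, the pending comeInShards entry is cleared once the data is installed, and the background GC loop needs kv.garbages filled somewhere), a sketch of the assumed updateDBWithMigrateData could look like this; the config guard and the garbages bookkeeping are assumptions, not shown in the original:

func (kv *ShardKV) updateDBWithMigrateData(migrationData MigrateReply) {
    kv.mu.Lock()
    defer kv.mu.Unlock()
    // Assumption: replies that belong to an older config are ignored
    // (see the "out-of-order replies" discussion above).
    if migrationData.ConfigNum != kv.cfg.Num-1 {
        return
    }
    // Clear the pending pull so the background loop stops retrying it.
    delete(kv.comeInShards, migrationData.Shard)
    if _, ok := kv.myShards[migrationData.Shard]; !ok {
        kv.myShards[migrationData.Shard] = true
        for k, v := range migrationData.DB {
            kv.db[k] = v
        }
        for k, v := range migrationData.Cid2Seq {
            kv.cid2Seq[k] = Max(v, kv.cid2Seq[k])
        }
        // Assumption: remember that the previous owner still holds this shard's
        // data, so the tryGC loop (shown below) can ask it to delete it.
        if _, ok := kv.garbages[migrationData.ConfigNum]; !ok {
            kv.garbages[migrationData.ConfigNum] = make(map[int]bool)
        }
        kv.garbages[migrationData.ConfigNum][migrationData.Shard] = true
    }
}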
Implementing the shard update
From the new config, work out which shards must be sent away and which shards must be taken in.
func (kv *ShardKV) updateInAndOutDataShard(cfg shardmaster.Config) {
    kv.mu.Lock()
    defer kv.mu.Unlock()
    if cfg.Num <= kv.cfg.Num { // only consider newer config
        return
    }
    oldCfg, toOutShard := kv.cfg, kv.myShards
    kv.myShards, kv.cfg = make(map[int]bool), cfg
    for shard, gid := range cfg.Shards {
        if gid != kv.gid {
            continue
        }
        if _, ok := toOutShard[shard]; ok || oldCfg.Num == 0 {
            kv.myShards[shard] = true
            delete(toOutShard, shard)
        } else {
            kv.comeInShards[shard] = oldCfg.Num
        }
    }
    if len(toOutShard) > 0 { // prepare data that needed migration
        kv.toOutShards[oldCfg.Num] = make(map[int]map[string]string)
        for shard := range toOutShard {
            outDb := make(map[string]string)
            for k, v := range kv.db {
                if key2shard(k) == shard {
                    outDb[k] = v
                    delete(kv.db, k)
                }
            }
            kv.toOutShards[oldCfg.Num][shard] = outDb
        }
    }
}
WRONG GROUP
When a new config arrives on the apply channel, the outgoing data is deleted from the DB. To make sure that deletion cannot affect the value a Get returns while the result is still travelling through the notify channel, we inject the result into the Op at the moment we apply it. Otherwise, if the Op is sent first and the DB is read afterwards, there is a chance another thread has already deleted the entry.
func (kv *ShardKV) normal(op *Op) {
    shard := key2shard(op.Key)
    kv.mu.Lock()
    if _, ok := kv.myShards[shard]; !ok {
        op.OpType = ErrWrongGroup
    } else {
        maxSeq, found := kv.cid2Seq[op.Cid]
        if !found || op.SeqNum > maxSeq {
            if op.OpType == "Put" {
                kv.db[op.Key] = op.Value
            } else if op.OpType == "Append" {
                kv.db[op.Key] += op.Value
            }
            kv.cid2Seq[op.Cid] = op.SeqNum
        }
        if op.OpType == "Get" {
            op.Value = kv.db[op.Key]
        }
    }
    kv.mu.Unlock()
}
func (kv *ShardKV) templateStart(originOp Op) (Err, string) {
    index, _, isLeader := kv.rf.Start(originOp)
    if isLeader {
        ch := kv.put(index, true)
        op := kv.beNotified(ch, index)
        if equalOp(originOp, op) {
            return OK, op.Value
        }
        if op.OpType == ErrWrongGroup {
            return ErrWrongGroup, ""
        }
    }
    return ErrWrongLeader, ""
}
func equalOp(a Op, b Op) bool {
    return a.Key == b.Key && a.OpType == b.OpType && a.SeqNum == b.SeqNum && a.Cid == b.Cid
}
func (kv *ShardKV) beNotified(ch chan Op, index int) Op {
    select {
    case notifyArg, ok := <-ch:
        kv.mu.Lock()
        if ok {
            close(ch)
        }
        delete(kv.chMap, index)
        kv.mu.Unlock()
        return notifyArg
    case <-time.After(time.Duration(1000) * time.Millisecond):
        return Op{}
    }
}
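Several pieces used above never appear in these notes: the ShardKV Op type, the notify-channel helper kv.put (the counterpart of getCh in the shardmaster code at the end), and the Get/PutAppend RPC handlers that go through templateStart. The sketch below is an assumption; the Op field order is inferred from the literal Op{"GC", strconv.Itoa(args.ConfigNum), "", nrand(), args.Shard} used in GarbageCollection:

// Assumed definitions, not shown in the original notes.
type Op struct {
    OpType string // "Get" / "Put" / "Append" / "GC"; overwritten with ErrWrongGroup when rejected
    Key    string
    Value  string
    Cid    int64
    SeqNum int
}

// put returns the notify channel for a Raft log index, optionally creating it
// (mirrors getCh in the shardmaster server below).
func (kv *ShardKV) put(idx int, createIfNotExists bool) chan Op {
    kv.mu.Lock()
    defer kv.mu.Unlock()
    if _, ok := kv.chMap[idx]; !ok {
        if !createIfNotExists {
            return nil
        }
        kv.chMap[idx] = make(chan Op, 1)
    }
    return kv.chMap[idx]
}

func (kv *ShardKV) Get(args *GetArgs, reply *GetReply) {
    // A Get carries a fresh random Cid so the dedup table never suppresses it.
    originOp := Op{"Get", args.Key, "", nrand(), 0}
    reply.Err, reply.Value = kv.templateStart(originOp)
}

func (kv *ShardKV) PutAppend(args *PutAppendArgs, reply *PutAppendReply) {
    originOp := Op{args.Op, args.Key, args.Value, args.Cid, args.SeqNum}
    reply.Err, _ = kv.templateStart(originOp)
}

With the client code shown in the summary, leaving WrongLeader at its zero value is enough: the client only accepts a reply whose Err is OK (or ErrNoKey).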
Implementing the new snapshot
func (kv *ShardKV) doSnapShot(index int) {
    w := new(bytes.Buffer)
    e := labgob.NewEncoder(w)
    kv.mu.Lock()
    e.Encode(kv.db)
    e.Encode(kv.cid2Seq)
    e.Encode(kv.comeInShards)
    e.Encode(kv.toOutShards)
    e.Encode(kv.myShards)
    e.Encode(kv.cfg)
    e.Encode(kv.garbages)
    kv.mu.Unlock()
    kv.rf.ReplaceLogWithSnapshot(index, w.Bytes())
}
func (kv *ShardKV) readSnapShot(snapshot []byte) {
    kv.mu.Lock()
    defer kv.mu.Unlock()
    if snapshot == nil || len(snapshot) < 1 {
        return
    }
    r := bytes.NewBuffer(snapshot)
    d := labgob.NewDecoder(r)
    var db map[string]string
    var cid2Seq map[int64]int
    var toOutShards map[int]map[int]map[string]string
    var comeInShards map[int]int
    var myShards map[int]bool
    var garbages map[int]map[int]bool
    var cfg shardmaster.Config
    if d.Decode(&db) != nil || d.Decode(&cid2Seq) != nil || d.Decode(&comeInShards) != nil ||
        d.Decode(&toOutShards) != nil || d.Decode(&myShards) != nil || d.Decode(&cfg) != nil ||
        d.Decode(&garbages) != nil {
        log.Fatal("readSnapShot ERROR for server ", kv.me)
    } else {
        kv.db, kv.cid2Seq, kv.cfg = db, cid2Seq, cfg
        kv.toOutShards, kv.comeInShards, kv.myShards, kv.garbages = toOutShards, comeInShards, myShards, garbages
    }
}
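Two pieces around snapshots are used but not shown: needSnapShot, and re-loading the snapshot at startup. Presumably StartServer stores the persister and reads any existing snapshot, and needSnapShot compares the Raft state size against maxraftstate; a sketch under those assumptions (the 10% margin is arbitrary):

// In StartServer (assumed):
//   kv.persist = persister
//   kv.readSnapShot(kv.persist.ReadSnapshot())

func (kv *ShardKV) needSnapShot() bool {
    kv.mu.Lock()
    defer kv.mu.Unlock()
    // Snapshot once the Raft state has grown to within 10% of maxraftstate.
    threshold := 10
    return kv.maxraftstate > 0 &&
        kv.maxraftstate-kv.persist.RaftStateSize() < kv.maxraftstate/threshold
}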
GC
func (kv *ShardKV) GarbageCollection(args *MigrateArgs, reply *MigrateReply) {
    reply.Err = ErrWrongLeader
    if _, isLeader := kv.rf.GetState(); !isLeader {
        return
    }
    kv.mu.Lock()
    defer kv.mu.Unlock()
    if _, ok := kv.toOutShards[args.ConfigNum]; !ok {
        return
    }
    if _, ok := kv.toOutShards[args.ConfigNum][args.Shard]; !ok {
        return
    }
    originOp := Op{"GC", strconv.Itoa(args.ConfigNum), "", nrand(), args.Shard}
    kv.mu.Unlock()
    reply.Err, _ = kv.templateStart(originOp)
    kv.mu.Lock()
}
func (kv *ShardKV) gc(cfgNum int, shard int) {
    kv.mu.Lock()
    defer kv.mu.Unlock()
    if _, ok := kv.toOutShards[cfgNum]; ok {
        delete(kv.toOutShards[cfgNum], shard)
        if len(kv.toOutShards[cfgNum]) == 0 {
            delete(kv.toOutShards, cfgNum)
        }
    }
}
Background GC
func (kv *ShardKV) tryGC() {
    _, isLeader := kv.rf.GetState()
    kv.mu.Lock()
    if !isLeader || len(kv.garbages) == 0 {
        kv.mu.Unlock()
        return
    }
    var wait sync.WaitGroup
    for cfgNum, shards := range kv.garbages {
        for shard := range shards {
            wait.Add(1)
            // Pass cfgNum explicitly so the goroutine does not capture the loop variable.
            go func(shard int, cfgNum int, cfg shardmaster.Config) {
                defer wait.Done()
                args := MigrateArgs{shard, cfg.Num}
                gid := cfg.Shards[shard]
                for _, server := range cfg.Groups[gid] {
                    srv := kv.make_end(server)
                    reply := MigrateReply{}
                    if ok := srv.Call("ShardKV.GarbageCollection", &args, &reply); ok && reply.Err == OK {
                        kv.mu.Lock()
                        delete(kv.garbages[cfgNum], shard)
                        if len(kv.garbages[cfgNum]) == 0 {
                            delete(kv.garbages, cfgNum)
                        }
                        kv.mu.Unlock()
                        return // this shard's garbage is collected; stop trying other servers
                    }
                }
            }(shard, cfgNum, kv.mck.Query(cfgNum))
        }
    }
    kv.mu.Unlock()
    wait.Wait()
}
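Like the other two daemons, the GC loop has to be registered in StartServer; the notes omit that line, so the interval below is an assumption:

go kv.daemon(kv.tryGC, 100)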
Code summary
common
package shardkv
//
// Sharded key/value server.
// Lots of replica groups, each running op-at-a-time paxos.
// Shardmaster decides which group serves each shard.
// Shardmaster may change shard assignment from time to time.
//
// You will have to modify these definitions.
//
const (
OK = "OK"
ErrNoKey = "ErrNoKey"
ErrWrongGroup = "ErrWrongGroup"
ErrWrongLeader = "ErrWrongLeader"
)
type Err string
// Put or Append
type PutAppendArgs struct {
// You'll have to add definitions here.
Key string
Value string
Op string // "Put" or "Append"
Cid int64 "client unique id"
SeqNum int "each request with a monotonically increasing sequence number"
}
type PutAppendReply struct {
WrongLeader bool
Err Err
}
type GetArgs struct {
Key string
// You'll have to add definitions here.
}
type GetReply struct {
WrongLeader bool
Err Err
Value string
}
type MigrateArgs struct {
Shard int
ConfigNum int
}
type MigrateReply struct {
Err Err
ConfigNum int
Shard int
DB map[string]string
Cid2Seq map[int64]int
}
func Max(x, y int) int {
if x > y {
return x
}
return y
}
client
package shardkv
//
// client code to talk to a sharded key/value service.
//
// the client first talks to the shardmaster to find out
// the assignment of shards (keys) to groups, and then
// talks to the group that holds the key's shard.
//
import (
"labrpc"
)
import "crypto/rand"
import "math/big"
import "shardmaster"
import "time"
//
// which shard is a key in?
// please use this function,
// and please do not change it.
//
func key2shard(key string) int {
shard := 0
if len(key) > 0 {
shard = int(key[0])
}
shard %= shardmaster.NShards
return shard
}
func nrand() int64 {
max := big.NewInt(int64(1) << 62)
bigx, _ := rand.Int(rand.Reader, max)
x := bigx.Int64()
return x
}
type Clerk struct {
sm *shardmaster.Clerk
config shardmaster.Config
make_end func(string) *labrpc.ClientEnd
// You will have to modify this struct.
lastLeader int
id int64
seqNum int
}
//
// the tester calls MakeClerk.
//
// masters[] is needed to call shardmaster.MakeClerk().
//
// make_end(servername) turns a server name from a
// Config.Groups[gid][i] into a labrpc.ClientEnd on which you can
// send RPCs.
//
func MakeClerk(masters []*labrpc.ClientEnd, make_end func(string) *labrpc.ClientEnd) *Clerk {
ck := new(Clerk)
ck.sm = shardmaster.MakeClerk(masters)
ck.make_end = make_end
// You'll have to add code here.
ck.id = nrand()//give each client a unique identifier, and then have them
ck.seqNum = 0// tag each request with a monotonically increasing sequence number.
ck.lastLeader = 0
return ck
}
//
// fetch the current value for a key.
// returns "" if the key does not exist.
// keeps trying forever in the face of all other errors.
// You will have to modify this function.
//
func (ck *Clerk) Get(key string) string {
args := GetArgs{}
args.Key = key
for {
shard := key2shard(key)
gid := ck.config.Shards[shard]
if servers, ok := ck.config.Groups[gid]; ok {
// try each server for the shard.
for si := 0; si < len(servers); si++ {
srv := ck.make_end(servers[si])
var reply GetReply
ok := srv.Call("ShardKV.Get", &args, &reply)
if ok && reply.WrongLeader == false && (reply.Err == OK || reply.Err == ErrNoKey) {
return reply.Value
}
if ok && (reply.Err == ErrWrongGroup) {
break
}
}
}
time.Sleep(100 * time.Millisecond)
// ask master for the latest configuration.
ck.config = ck.sm.Query(-1)
}
return ""
}
//
// shared by Put and Append.
// You will have to modify this function.
//
func (ck *Clerk) PutAppend(key string, value string, op string) {
args := PutAppendArgs{key,value,op,ck.id,ck.seqNum}
ck.seqNum++
for {
shard := key2shard(key)
gid := ck.config.Shards[shard]
if servers, ok := ck.config.Groups[gid]; ok {
for si := 0; si < len(servers); si++ {
srv := ck.make_end(servers[si])
var reply PutAppendReply
ok := srv.Call("ShardKV.PutAppend", &args, &reply)
if ok && reply.WrongLeader == false && reply.Err == OK {
return
}
if ok && reply.Err == ErrWrongGroup {
break
}
}
}
time.Sleep(100 * time.Millisecond)
// ask master for the latest configuration.
ck.config = ck.sm.Query(-1)
}
}
func (ck *Clerk) Put(key string, value string) {
ck.PutAppend(key, value, "Put")
}
func (ck *Clerk) Append(key string, value string) {
ck.PutAppend(key, value, "Append")
}
server (shardmaster)
package shardmaster
import (
"log"
"math"
"raft"
"time"
)
import "labrpc"
import "sync"
import "labgob"
type ShardMaster struct {
mu sync.Mutex
me int
rf *raft.Raft
applyCh chan raft.ApplyMsg
// Your data here.
configs []Config // indexed by config num
chMap map[int]chan Op
cid2Seq map[int64]int
killCh chan bool
}
type Op struct {
OpType string "operation type(eg. join/leave/move/query)"
Args interface{} // could be JoinArgs, LeaveArgs, MoveArgs and QueryArgs, in reply it could be config
Cid int64
SeqNum int
}
func (sm *ShardMaster) Join(args *JoinArgs, reply *JoinReply) {
originOp := Op{"Join",*args,args.Cid,args.SeqNum}
reply.WrongLeader = sm.templateHandler(originOp)
}
func (sm *ShardMaster) Leave(args *LeaveArgs, reply *LeaveReply) {
originOp := Op{"Leave",*args,args.Cid,args.SeqNum}
reply.WrongLeader = sm.templateHandler(originOp)
}
func (sm *ShardMaster) Move(args *MoveArgs, reply *MoveReply) {
originOp := Op{"Move",*args,args.Cid,args.SeqNum}
reply.WrongLeader = sm.templateHandler(originOp)
}
func (sm *ShardMaster) Query(args *QueryArgs, reply *QueryReply) {
reply.WrongLeader = true;
originOp := Op{"Query",*args,nrand(),-1}
reply.WrongLeader = sm.templateHandler(originOp)
if !reply.WrongLeader {
sm.mu.Lock()
defer sm.mu.Unlock()
if args.Num >= 0 && args.Num < len(sm.configs) {
reply.Config = sm.configs[args.Num]
} else {
reply.Config = sm.configs[len(sm.configs) - 1]
}
}
}
func (sm *ShardMaster) templateHandler(originOp Op) bool {
wrongLeader := true
index,_,isLeader := sm.rf.Start(originOp)
if !isLeader {return wrongLeader}
ch := sm.getCh(index,true)
op := sm.beNotified(ch,index)
if equalOp(op,originOp) {
wrongLeader = false
}
return wrongLeader
}
func (sm *ShardMaster) beNotified(ch chan Op, index int) Op {
select {
case notifyArg := <- ch :
sm.mu.Lock()
close(ch)
delete(sm.chMap,index)
sm.mu.Unlock()
return notifyArg
case <- time.After(time.Duration(600)*time.Millisecond):
return Op{}
}
}
func equalOp(a Op, b Op) bool{
return a.SeqNum == b.SeqNum && a.Cid == b.Cid && a.OpType == b.OpType
}
func (sm *ShardMaster) Kill() {
sm.rf.Kill()
sm.killCh <- true
}
// needed by shardkv tester
func (sm *ShardMaster) Raft() *raft.Raft {
return sm.rf
}
func (sm *ShardMaster) getCh(idx int, createIfNotExists bool) chan Op{
sm.mu.Lock()
defer sm.mu.Unlock()
if _, ok := sm.chMap[idx]; !ok {
if !createIfNotExists {return nil}
sm.chMap[idx] = make(chan Op,1)
}
return sm.chMap[idx]
}
func (sm *ShardMaster) updateConfig(op string, arg interface{}) {
cfg := sm.createNextConfig()
if op == "Move" {
moveArg := arg.(MoveArgs)
if _,exists := cfg.Groups[moveArg.GID]; exists {
cfg.Shards[moveArg.Shard] = moveArg.GID
} else {return}
}else if op == "Join" {
joinArg := arg.(JoinArgs)
for gid,servers := range joinArg.Servers {
newServers := make([]string, len(servers))
copy(newServers, servers)
cfg.Groups[gid] = newServers
sm.rebalance(&cfg,op,gid)
}
} else if op == "Leave"{
leaveArg := arg.(LeaveArgs)
for _,gid := range leaveArg.GIDs {
delete(cfg.Groups,gid)
sm.rebalance(&cfg,op,gid)
}
} else {
log.Fatal("invalid area",op)
}
sm.configs = append(sm.configs,cfg)
}
func (sm *ShardMaster) createNextConfig() Config {
lastCfg := sm.configs[len(sm.configs)-1]
nextCfg := Config{Num: lastCfg.Num + 1, Shards: lastCfg.Shards, Groups: make(map[int][]string)}
for gid, servers := range lastCfg.Groups {
nextCfg.Groups[gid] = append([]string{}, servers...)
}
return nextCfg
}
func (sm *ShardMaster) rebalance(cfg *Config, request string, gid int) {
shardsCount := sm.groupByGid(cfg) // gid -> shards
switch request {
case "Join":
avg := NShards / len(cfg.Groups)
for i := 0; i < avg; i++ {
maxGid := sm.getMaxShardGid(shardsCount)
cfg.Shards[shardsCount[maxGid][0]] = gid
shardsCount[maxGid] = shardsCount[maxGid][1:]
}
case "Leave":
shardsArray,exists := shardsCount[gid]
if !exists {return}
delete(shardsCount,gid)
if len(cfg.Groups) == 0 { // remove all gid
cfg.Shards = [NShards]int{}
return
}
for _,v := range shardsArray {
minGid := sm.getMinShardGid(shardsCount)
cfg.Shards[v] = minGid
shardsCount[minGid] = append(shardsCount[minGid], v)
}
}
}
func (sm *ShardMaster) groupByGid(cfg *Config) map[int][]int {
shardsCount := map[int][]int{}
for k,_ := range cfg.Groups {
shardsCount[k] = []int{}
}
for k, v := range cfg.Shards {
shardsCount[v] = append(shardsCount[v], k)
}
return shardsCount
}
func (sm *ShardMaster) getMaxShardGid(shardsCount map[int][]int) int {
max := -1
var gid int
for k, v := range shardsCount {
if max < len(v) {
max = len(v)
gid = k
}
}
return gid
}
func (sm *ShardMaster) getMinShardGid(shardsCount map[int][]int) int {
min := math.MaxInt32
var gid int
for k, v := range shardsCount {
if min > len(v) {
min = len(v)
gid = k
}
}
return gid
}
func send(notifyCh chan Op,op Op) {
select{
case <-notifyCh:
default:
}
notifyCh <- op
}
func StartServer(servers []*labrpc.ClientEnd, me int, persister *raft.Persister) *ShardMaster {
sm := new(ShardMaster)
sm.me = me
sm.configs = make([]Config, 1)
sm.configs[0].Groups = map[int][]string{}
labgob.Register(Op{})
labgob.Register(JoinArgs{})
labgob.Register(LeaveArgs{})
labgob.Register(MoveArgs{})
labgob.Register(QueryArgs{})
sm.applyCh = make(chan raft.ApplyMsg)
sm.rf = raft.Make(servers, me, persister, sm.applyCh)
// Your code here.
sm.chMap = make(map[int]chan Op)
sm.cid2Seq = make(map[int64]int)
sm.killCh = make(chan bool,1)
go func() {
for {
select {
case <-sm.killCh:
return
case applyMsg := <-sm.applyCh:
if !applyMsg.CommandValid {continue}
op := applyMsg.Command.(Op)
sm.mu.Lock()
maxSeq,found := sm.cid2Seq[op.Cid]
if op.SeqNum >= 0 && (!found || op.SeqNum > maxSeq) {
sm.updateConfig(op.OpType,op.Args)
sm.cid2Seq[op.Cid] = op.SeqNum
}
sm.mu.Unlock()
if notifyCh := sm.getCh(applyMsg.CommandIndex,false); notifyCh != nil {
send(notifyCh,op)
}
}
}
}()
return sm
}