DS-lab2

lab2


Part A: The Viewservice

Notes

Hint: you’ll want to add field(s) to ViewServer in server.go in order to keep track of the most recent time at which the viewservice has heard a Ping from each server. Perhaps a map from server names to time.Time. You can find the current time with time.Now().

Hint: add field(s) to ViewServer to keep track of the current view.

Hint: you’ll need to keep track of whether the primary for the current view has acknowledged it (in PingArgs.Viewnum).

Hint: your viewservice needs to make periodic decisions, for example to promote the backup if the viewservice has missed DeadPings pings from the primary. Add this code to the tick() function, which is called once per PingInterval.

Hint: there may be more than two servers sending Pings. The extra ones (beyond primary and backup) are volunteering to be backup if needed.

Hint: the viewservice needs a way to detect that a primary or backup has failed and re-started. For example, the primary may crash and quickly restart without missing sending a single Ping.
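One possible way to detect such a restart, sketched under the assumption that a freshly started server knows no view yet and therefore pings with view number 0. The `View` struct below only mirrors the shape used in the lab, and `restarted` is a hypothetical helper name.

```go
package main

// View mirrors the shape of a view: a number plus primary and backup names.
type View struct {
	Viewnum uint
	Primary string
	Backup  string
}

// restarted reports whether a Ping carrying pingViewnum from server
// suggests that a current primary or backup has crashed and come back.
// Assumption: a restarted server has lost its state and pings with 0.
func restarted(v View, server string, pingViewnum uint) bool {
	if v.Viewnum == 0 {
		return false // no view has been formed yet, so Ping(0) is normal
	}
	if pingViewnum != 0 {
		return false
	}
	return server == v.Primary || server == v.Backup
}
```

A server flagged this way can be treated exactly like one that missed DeadPings pings, since its key/value state is gone even though it never stopped pinging.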

Hint: study the test cases before you start programming. If you fail a test, you may have to look at the test code in test_test.go to figure out what the failure scenario is.

Practical notes

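  1. Implement the view-change logic in tick().
  2. In Ping, record the state carefully: ping time, idle servers, primary ack, and so on. The view returned to the caller may be an old one; it does not have to be the latest, because the caller will pick up the new view when it pings again.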
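The two notes above can be sketched roughly as follows. This is not the lab's code: `viewState`, `onPing`, and `advance` are hypothetical names, the liveness flags are assumed to be computed elsewhere (for example from ping timestamps), and the sketch only covers the case where a primary already exists.

```go
package main

// View mirrors the shape of a view: a number plus primary and backup names.
type View struct {
	Viewnum uint
	Primary string
	Backup  string
}

// viewState is a hypothetical stand-in for fields added to ViewServer.
type viewState struct {
	cur   View
	acked bool     // has cur.Primary acknowledged cur.Viewnum?
	idle  []string // live servers that are neither primary nor backup
}

// onPing records an acknowledgement when the primary pings with the
// current view number. It never changes the view itself.
func (s *viewState) onPing(server string, viewnum uint) {
	if server == s.cur.Primary && viewnum == s.cur.Viewnum {
		s.acked = true
	}
}

// advance is what tick() would call. It moves to the next view if one is
// needed and allowed, and reports whether the view changed.
func (s *viewState) advance(primaryDead, backupDead bool) bool {
	if !s.acked {
		return false // never move on before the primary has acked
	}
	next := s.cur
	if primaryDead {
		if next.Backup == "" || backupDead {
			return false // nobody holds the data, so nobody can be promoted
		}
		next.Primary, next.Backup = next.Backup, ""
	} else if backupDead {
		next.Backup = ""
	}
	if next.Backup == "" && len(s.idle) > 0 {
		next.Backup = s.idle[0] // an idle volunteer becomes backup
		s.idle = s.idle[1:]
	}
	if next == s.cur {
		return false
	}
	next.Viewnum++
	s.cur = next
	s.acked = false
	return true
}
```

Keeping all view changes in `advance` means Ping may return a view that is about to be replaced, which matches note 2: the caller sees the new view on a later ping.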

Part B: The primary/backup key/value service

Steps

  1. You should start by modifying pbservice/server.go to Ping the viewservice to find the current view. Do this in the tick() function. Once a server knows the current view, it knows if it is the primary, the backup, or neither.
  2. Implement Get, Put, and Append handlers in pbservice/server.go; store keys and values in a map[string]string. If a key does not exist, Append should use an empty string for the previous value. Implement the client.go RPC stubs.
  3. Modify your handlers so that the primary forwards updates to the backup.
  4. When a server becomes the backup in a new view, the primary should send it the primary’s complete key/value database.
  5. Modify client.go so that clients keep re-trying until they get an answer. Make sure that you include enough information in PutAppendArgs, and GetArgs (see common.go) so that the key/value service can detect duplicates. Modify the key/value service to handle duplicates correctly.
  6. Modify client.go to cope with a failed primary. If the current primary doesn’t respond, or doesn’t think it’s the primary, have the client consult the viewservice (in case the primary has changed) and try again. Sleep for viewservice.PingInterval between re-tries to avoid burning up too much CPU time.
    Original text: see [1].

Notes

Hint: you’ll probably need to create new RPCs to forward client requests from primary to backup, since the backup should reject a direct client request but should accept a forwarded request.

Hint: you’ll probably need to create new RPCs to handle the transfer of the complete key/value database from the primary to a new backup. You can send the whole database in one RPC (for example, include a map[string]string in the RPC arguments).

Hint: the state to filter duplicates must be replicated along with the key/value state.
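The two hints above suggest that the transfer RPC carries both the database and the duplicate-detection state. A small sketch with hypothetical names (`TransferArgs`, `snapshot`, `backupState`, `install`); representing the duplicate state as a set of seen request ids is one option among several.

```go
package main

// TransferArgs is a hypothetical argument type for the primary-to-backup
// state transfer RPC.
type TransferArgs struct {
	Data map[string]string // complete key/value database
	Seen map[int64]bool    // ids of requests that were already applied
}

// snapshot copies the primary's state into fresh maps, so the primary can
// keep serving (and mutating its own maps) after the copy has been taken.
func snapshot(data map[string]string, seen map[int64]bool) TransferArgs {
	args := TransferArgs{
		Data: make(map[string]string, len(data)),
		Seen: make(map[int64]bool, len(seen)),
	}
	for k, v := range data {
		args.Data[k] = v
	}
	for id, ok := range seen {
		args.Seen[id] = ok
	}
	return args
}

// backupState is a hypothetical stand-in for the backup's fields.
type backupState struct {
	data map[string]string
	seen map[int64]bool
}

// install replaces the backup's state with the transferred state.
func (b *backupState) install(args TransferArgs) {
	b.data = args.Data
	b.seen = args.Seen
}
```

If only `Data` were transferred, a client retry that reaches the new backup (or a later promoted primary) could be applied a second time.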

Hint: the tester arranges for RPC replies to be lost in tests whose description includes “unreliable”. This will cause RPCs to be executed by the receiver, but since the sender sees no reply, it cannot tell whether the server executed the RPC.

Hint: you may need to generate numbers that have a high probability of being unique. Try this:

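import (
  "crypto/rand"
  "math/big"
)

// nrand returns a random non-negative int64 below 2^62, drawn from
// crypto/rand so that two request ids are very unlikely to collide.
func nrand() int64 {
  max := big.NewInt(int64(1) << 62)
  bigx, _ := rand.Int(rand.Reader, max)
  return bigx.Int64()
}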

Hint: the tests kill a server by setting its dead flag. You must make sure that your server terminates correctly when that flag is set, otherwise you may fail to complete the test cases.

Hint: even if your viewserver passed all the tests in Part A, it may still have bugs that cause failures in Part B.

Hint: study the test cases before you start programming.

Practical notes

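1. Choosing when to commit
  1. The simple approach: for every update, the primary keeps retrying the backup and replies to the client only once the backup has succeeded.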

    [Sequence diagram: Client -> Primary: 1. Request; Primary -> Backup: 2. Update; Backup executes, then Backup -> Primary: 3. Ok/ErrWrongServer/Duplicated; Primary executes, then Primary -> Client: 4. Ok/ErrWrongServer/Duplicated]

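    As the diagram shows, the following cases have to be considered:
    1. The primary loses the request from the client. The client simply retries.
    2. The backup loses the request from the primary. The primary retries until (1) the backup succeeds, or (2) the backup has died (which the primary can learn from the viewservice, that is, from the latest view).
    3. The backup applies the update, but its reply to the primary is lost. The primary retries as in case 2. (Note that the backup has already applied the update, so it needs duplicate detection to avoid applying the same update again when it is re-sent.)
    4. The primary's reply to the client is lost. The client retries and the primary follows its normal path, but it must recognise that this is the same update.

    A simple way to detect duplicates is to attach an id to every request and have the server record every id it has seen; it can then tell when the same update arrives again. The ids should collide as rarely as possible. One way to generate them is given in the notes above (nrand).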

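  2. A more elaborate approach is a two-phase commit protocol.

  3. Solving the problem completely requires a consensus protocol such as Paxos.
2. Coping with partitions
  1. Prevent the old primary from still serving reads or updates.
    1. Every Get and every update must be forwarded by the primary to the backup; the primary replies to the client only after the backup has returned success and the local execution has succeeded.
    2. When the backup believes that it is the primary, it returns ErrWrongServer to the (old) primary. The primary can pass this error on to the client, so that the client fetches the new view and sends its request to the new primary.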
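The notes above (request ids for duplicate detection, forward to the backup before replying, pass ErrWrongServer back to the client) can be sketched together. Everything here is illustrative: `store`, `apply`, and `primaryAppend` are hypothetical names, the forwarding RPC is modelled as a plain function, and retries on lost messages are left to the caller.

```go
package main

// Err and its values follow the naming used in the notes.
type Err string

const (
	OK             Err = "OK"
	ErrWrongServer Err = "ErrWrongServer"
)

// store is a hypothetical stand-in for one server's replicated state.
type store struct {
	data map[string]string
	seen map[int64]bool // ids of updates already applied
}

func newStore() *store {
	return &store{data: map[string]string{}, seen: map[int64]bool{}}
}

// apply executes an Append at most once per request id. A missing key
// behaves like the empty string.
func (s *store) apply(id int64, key, value string) {
	if s.seen[id] {
		return // duplicate: already applied, do nothing
	}
	s.data[key] += value
	s.seen[id] = true
}

// primaryAppend forwards the update to the backup first and applies it
// locally only if the backup accepted it. forward stands for the RPC to
// the backup (with its own retries).
func primaryAppend(p *store, forward func(id int64, key, value string) Err,
	id int64, key, value string) Err {
	if p.seen[id] {
		return OK // client retry of an update that already succeeded
	}
	if e := forward(id, key, value); e != OK {
		return e // e.g. ErrWrongServer: let the client re-fetch the view
	}
	p.apply(id, key, value)
	return OK
}
```

Because the backup filters by id as well, the primary can safely re-send an update whose reply was lost (case 3 above), and an old primary that is cut off from the viewservice cannot complete an update without the new primary rejecting it.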

References

[1] http://nil.csail.mit.edu/6.824/2015/labs/lab-2.html
