Distributed Systems: MIT 6.824 Lab 1, Part 4


B$oodyCoder


Category: MIT 6.824. Tags: go, golang, mit

Article link: https://blog.csdn.net/u012327735/article/details/105394767


Part IV: Handling worker failures

Task description

Handle worker failures.

In other words, the call function that issues the RPC may return false, possibly because of a timeout.

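Solution

When a worker fails, its task has to be rescheduled onto another worker. One way to do this is to keep all tasks in an array.

Each time, a task is taken from the array; if it fails, it is added back to the array.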
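The take-and-requeue idea can be sketched as follows. This is a minimal single-goroutine illustration, not the author's code: drainWithRetry and the runTask callback are hypothetical names, and runTask stands in for the lab's RPC call that returns true or false.

```go
package main

// runTask is a hypothetical stand-in for the RPC to a worker.
// It reports whether this attempt at the task succeeded.
type runTask func(task int) bool

// drainWithRetry takes tasks from the front of pending and appends any
// failed task back to the end, until nothing is left. It returns the
// total number of attempts, so retries are visible to the caller.
func drainWithRetry(pending []int, run runTask) int {
	attempts := 0
	for len(pending) > 0 {
		task := pending[0]
		pending = pending[1:]
		attempts++
		if !run(task) {
			// The attempt failed: put the task back for another try.
			pending = append(pending, task)
		}
	}
	return attempts
}
```

With three tasks where one of them fails on its first attempt, the loop makes four attempts in total. Note that the loop never ends if a task fails on every attempt.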

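But this leaves one problem:

A healthy worker may finish its own task and exit while a task that failed elsewhere has just been put back, leaving nobody to run it. So we have to make sure every task has completed before exiting.

Add a field finishTaskCnt that records the number of completed tasks. A worker may exit only after all tasks have completed; until then it keeps waiting for new tasks. In the code from the previous part, a worker could exit as soon as all tasks had been handed out and its own task was done. Now we must guarantee that every task has actually finished before exiting.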
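A minimal sketch of the finishTaskCnt idea, under some assumptions: a buffered channel replaces the task array, runAll is a hypothetical name, and the do callback is a hypothetical stand-in for the RPC to a worker. Each worker goroutine returns only after the counter has reached the total number of tasks.

```go
package main

import (
	"sync"
	"sync/atomic"
)

// runAll runs ntasks tasks on nworkers goroutines and returns the number
// of completed tasks. do is a hypothetical stand-in for the RPC; it
// returns false when the attempt failed. If do never succeeds for some
// task, runAll does not return.
func runAll(ntasks, nworkers int, do func(worker, task int) bool) int {
	if ntasks == 0 {
		return 0
	}
	// Buffered to ntasks, so putting a failed task back never blocks.
	tasks := make(chan int, ntasks)
	for t := 0; t < ntasks; t++ {
		tasks <- t
	}
	var finishTaskCnt int32
	done := make(chan struct{}) // closed once every task has finished
	var wg sync.WaitGroup
	for w := 0; w < nworkers; w++ {
		wg.Add(1)
		go func(w int) {
			defer wg.Done()
			for {
				select {
				case t := <-tasks:
					if !do(w, t) {
						tasks <- t // failed: requeue for any worker
						continue
					}
					if atomic.AddInt32(&finishTaskCnt, 1) == int32(ntasks) {
						close(done)
					}
				case <-done:
					return // all tasks finished, safe to exit
				}
			}
		}(w)
	}
	wg.Wait()
	return int(atomic.LoadInt32(&finishTaskCnt))
}
```

An idle worker blocks in the select instead of exiting, so a task requeued late by a failing worker is still picked up by a healthy one.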

Source code

func schedule(jobName string, mapFiles []string, nReduce int, phase jobPhase, registerChan chan string) {
	var ntasks int
	var n_other int // number of inputs (for reduce) or outputs (for map)
	switch phase {
	case mapPhase:
		ntasks = len(mapFiles)
		n_other = nReduce
	case reducePhase:
		ntasks = nReduce
		n_other = len(mapFiles)
	}

	fmt.Printf("Schedule: %v %v tasks (%d I/Os)\n", ntasks, phase, n_other)
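The listing stops after the Printf, so the dispatch loop itself is not shown. As a hedged sketch of how failure handling could use registerChan, here is a variant that starts one goroutine per task and retries on a different worker whenever the RPC fails. It is an alternative to the task array and finishTaskCnt described above, not the author's code: scheduleSketch, doTask, and doneBy are hypothetical names, and doTask stands in for the real RPC to a worker.

```go
package main

import "sync"

// scheduleSketch runs ntasks tasks on workers read from registerChan and
// returns, for each task, the worker that finally completed it.
// doTask is a hypothetical stand-in for the RPC; false means failure.
func scheduleSketch(ntasks int, registerChan chan string,
	doTask func(worker string, task int) bool) []string {
	doneBy := make([]string, ntasks) // each goroutine writes only its own slot
	var wg sync.WaitGroup
	for task := 0; task < ntasks; task++ {
		wg.Add(1)
		go func(task int) {
			defer wg.Done()
			for {
				worker := <-registerChan // wait for an available worker
				if doTask(worker, task) {
					doneBy[task] = worker
					// Hand the worker back without blocking this goroutine.
					go func() { registerChan <- worker }()
					return
				}
				// Failure: the worker is not handed back, and the task
				// is retried on the next worker that becomes available.
			}
		}(task)
	}
	wg.Wait() // return only after every task has actually finished
	return doneBy
}
```

Waiting on the WaitGroup plays the same role as comparing finishTaskCnt with ntasks: the function cannot return while any task is still being retried.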
