Statistics

- time spent: 12 h
- lines added: 397
- GitHub: https://github.com/ztzhu1/MIT6.824
Data Structure

```go
type Coordinator struct {
	nReduce          int
	tasks_unassigned chan *Task // tasks waiting to be handed out to workers
	map_tasks        []*Task    // unfinished map tasks
	reduce_tasks     []*Task    // unfinished reduce tasks
	mu               sync.Mutex
	mapDone          bool // all map tasks finished
	cleanDone        bool // temp files already cleaned up
}
```
```go
type Task struct {
	Type       TaskType
	ID         int
	InputName  string
	OutputName string
	processing bool      // currently assigned to a worker
	procStart  time.Time // time when the worker started processing the task
}
```
```go
type TaskType int

const (
	MAP    TaskType = 0
	REDUCE TaskType = 1
	REREQ  TaskType = 2 // re-request: no task available right now, ask again later
	QUIT   TaskType = 3
)
```
Naming style

map task:

| official | temp      |
| -------- | --------- |
| mr-X-Y   | mr-X-Y-ID |

X denotes the Xth map task and Y the Yth reduce task. ID is a random value generated by `os.CreateTemp`.
reduce task:

| official | temp        |
| -------- | ----------- |
| mr-out-Y | mr-out-Y-ID |

Same as above.
Idea

- Tasks are stored in a channel, which naturally prevents race conditions, since a channel is essentially a thread-safe queue. That is often useful and convenient, but it is not always enough, so a mutex is still needed in some places.
- Two additional slices maintain the status of the tasks. When a task is done, it is removed from its slice; when both slices are empty, all tasks are finished.
- When a worker finishes its task, it notifies the coordinator, so the latter can rename the temp files, do the cleanup work, and update the tasks' status.
- There is a loop in `mrcoordinator.go` which won't stop until the coordinator says "Done!". On every iteration it invokes a method called `Tick()`. `Tick()` checks, against wall-clock time, how long each task has been processing. If a task takes too long, there may be something wrong with the corresponding worker, so the coordinator pushes the task into `tasks_unassigned` again and another worker can adopt it.
- Reduce tasks should not be assigned to workers until all the map tasks are done, so I added the field `mapDone`.
- When all the tasks are done and the coordinator wants to quit, it cleans up all the temp files if `cleanDone` is not yet set.
Bugs I met

- RPC requires the fields in `args` and `reply` to start with an upper-case letter (i.e. to be exported), or the values of the variables may be wrong.
- The naming scheme for map output is a little confusing. When a worker runs a reduce task, it collects the files matching `mr-*-Y`. Seems fine, right? Actually, a temp map file named `mr-X-Y-ID` can also match this pattern when `Y == ID`. It's subtle.
- At the very beginning, I misunderstood the meaning of `map` and wrote all the map results into a single file. But the point is to let every reduce task process a different set of words, so the map output must be partitioned into different files, one per reduce task, according to the hash value of each word.
- All the workers should quit together. As long as the coordinator is alive and hasn't told a worker to quit, the worker should keep asking for tasks every few seconds. But I made the worker quit directly when it finished its work and didn't receive a new task for a while.
TODO

- For simplicity, I assigned one map task per input file (although that one task can produce many output files, one per reduce task). If a file is large, we should schedule more map tasks for it, or split it into smaller files in advance.
- The input-name matching pattern is not very elegant.
- Some locks are unnecessary.
- I still haven't made full use of the channel mechanism.
Test Result

To make the test result easy to recognize, I added color for PASS and FAIL. All other code in the test script remains unchanged.