MIT 6.824 Lab 1 in practice, part 1

Step 1: use git to clone the project from the following address

$ git clone git://g.csail.mit.edu/6.824-golabs-2020 6.824
$ cd 6.824
$ ls
Makefile src
$

Run the following commands

$ cd ~/6.824
$ cd src/main
$ go build -buildmode=plugin ../mrapps/wc.go
$ rm mr-out*
$ go run mrsequential.go wc.so pg*.txt
$ more mr-out-0
A 509
ABOUT 2
ACT 8
...

This fails with an error caused by go mod. The following changes fix it (the solution was found on Stack Overflow)

cd 6.824
go mod init "6.824-golabs-2020" 
# change file src/mrapps/wc.go line9 to `import "6.824-golabs-2020/src/mr"`
cd src
go build -buildmode=plugin mrapps/wc.go

The mr import in mrsequential.go has to be changed in the same way. After that the test passes and the environment is set up.

Now let's look at the main function of mrsequential.go

The first argument, wc.so, provides the map and reduce functions; every argument after it is an input file name

func main() {
	if len(os.Args) < 3 {
		fmt.Fprintf(os.Stderr, "Usage: mrsequential xxx.so inputfiles...\n")
		os.Exit(1)
	}

	mapf, reducef := loadPlugin(os.Args[1])

	//
	// read each input file,
	// pass it to Map,
	// accumulate the intermediate Map output.
	//
	intermediate := []mr.KeyValue{}
	for _, filename := range os.Args[2:] {
		file, err := os.Open(filename)
		if err != nil {
			log.Fatalf("cannot open %v", filename)
		}
		content, err := ioutil.ReadAll(file)
		if err != nil {
			log.Fatalf("cannot read %v", filename)
		}
		file.Close()
		kva := mapf(filename, string(content))
		intermediate = append(intermediate, kva...)
	}

	//
	// a big difference from real MapReduce is that all the
	// intermediate data is in one place, intermediate[],
	// rather than being partitioned into NxM buckets.
	//

	sort.Sort(ByKey(intermediate))

	oname := "mr-out-0"
	ofile, _ := os.Create(oname)

	//
	// call Reduce on each distinct key in intermediate[],
	// and print the result to mr-out-0.
	//
	i := 0
	for i < len(intermediate) {
		j := i + 1
		for j < len(intermediate) && intermediate[j].Key == intermediate[i].Key {
			j++
		}
		values := []string{}
		for k := i; k < j; k++ {
			values = append(values, intermediate[k].Value)
		}
		output := reducef(intermediate[i].Key, values)

		// this is the correct format for each line of Reduce output.
		fmt.Fprintf(ofile, "%v %v\n", intermediate[i].Key, output)

		i = j
	}

	ofile.Close()
}
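The listing above relies on a ByKey type that the excerpt does not show, and its second half is a sort-then-group loop. The following is a minimal, self-contained sketch of that step. KeyValue is redefined locally to mirror mr.KeyValue, and reduceAll and countValues are hypothetical names introduced only for this illustration.

```go
package main

import (
	"fmt"
	"sort"
	"strconv"
)

// KeyValue mirrors the shape of mr.KeyValue; it is redefined here
// only so that the sketch compiles on its own.
type KeyValue struct {
	Key   string
	Value string
}

// ByKey implements sort.Interface so that sort.Sort orders pairs by Key.
type ByKey []KeyValue

func (a ByKey) Len() int           { return len(a) }
func (a ByKey) Swap(i, j int)      { a[i], a[j] = a[j], a[i] }
func (a ByKey) Less(i, j int) bool { return a[i].Key < a[j].Key }

// reduceAll (hypothetical helper) sorts the pairs by key, then calls
// reducef once per run of equal keys, as the loop in main() above does.
func reduceAll(intermediate []KeyValue, reducef func(string, []string) string) []KeyValue {
	sort.Sort(ByKey(intermediate))
	out := []KeyValue{}
	i := 0
	for i < len(intermediate) {
		// advance j to the first index whose key differs from intermediate[i].Key
		j := i + 1
		for j < len(intermediate) && intermediate[j].Key == intermediate[i].Key {
			j++
		}
		values := []string{}
		for k := i; k < j; k++ {
			values = append(values, intermediate[k].Value)
		}
		key := intermediate[i].Key
		out = append(out, KeyValue{Key: key, Value: reducef(key, values)})
		i = j
	}
	return out
}

// countValues (hypothetical) behaves like the word-count Reduce:
// it returns the number of values seen for the key.
func countValues(key string, values []string) string {
	return strconv.Itoa(len(values))
}

func main() {
	pairs := []KeyValue{{Key: "b", Value: "1"}, {Key: "a", Value: "1"}, {Key: "b", Value: "1"}}
	for _, kv := range reduceAll(pairs, countValues) {
		fmt.Printf("%v %v\n", kv.Key, kv.Value)
	}
}
```

Sorting first is what makes the grouping loop correct: equal keys end up adjacent, so a single pass with two indices is enough.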

 

Next, take a look at wc.go

//
// The map function is called once for each file of input. The first
// argument is the name of the input file, and the second is the
// file's complete contents. You should ignore the input file name,
// and look only at the contents argument. The return value is a slice
// of key/value pairs.
//
func Map(filename string, contents string) []mr.KeyValue {
	// function to detect word separators.
	ff := func(r rune) bool { return !unicode.IsLetter(r) }

	// split contents into an array of words.
	words := strings.FieldsFunc(contents, ff)

	kva := []mr.KeyValue{}
	for _, w := range words {
		kv := mr.KeyValue{w, "1"}
		kva = append(kva, kv)
	}
	return kva
}

//
// The reduce function is called once for each key generated by the
// map tasks, with a list of all the values created for that key by
// any map task.
//
func Reduce(key string, values []string) string {
	// return the number of occurrences of this word.
	return strconv.Itoa(len(values))
}
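To see what Map actually emits, it helps to run its splitting rule on a short string. The sketch below reuses the same strings.FieldsFunc call with the same separator test; splitWords is a hypothetical name, and the sample sentence is made up for illustration.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// splitWords (hypothetical helper) applies the same rule as Map above:
// every rune that is not a letter acts as a word separator.
func splitWords(contents string) []string {
	return strings.FieldsFunc(contents, func(r rune) bool { return !unicode.IsLetter(r) })
}

func main() {
	// Digits, spaces, commas and apostrophes all separate words,
	// and the case of each word is preserved.
	fmt.Println(splitWords("It's 2 dogs, it's 1 cat"))
	// prints: [It s dogs it s cat]
}
```

Two consequences follow from this rule: digits never become words, and "It" and "it" are distinct keys because nothing lowercases the input.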

 

The main function of mrmaster.go

func main() {
	if len(os.Args) < 2 {
		fmt.Fprintf(os.Stderr, "Usage: mrmaster inputfiles...\n")
		os.Exit(1)
	}

	m := mr.MakeMaster(os.Args[1:], 10)
	for m.Done() == false {
		time.Sleep(time.Second)
	}

	time.Sleep(time.Second)
}

 

The main function of mrworker.go

func main() {
	if len(os.Args) != 2 {
		fmt.Fprintf(os.Stderr, "Usage: mrworker xxx.so\n")
		os.Exit(1)
	}

	mapf, reducef := loadPlugin(os.Args[1])

	mr.Worker(mapf, reducef)
}

 

 

Start with this step from the lab hints

  • One way to get started is to modify mr/worker.go's Worker() to send an RPC to the master asking for a task. Then modify the master to respond with the file name of an as-yet-unstarted map task. Then modify the worker to read that file and call the application Map function, as in mrsequential.go.
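The first part of that hint, a worker asking the master for an unstarted map task, can be sketched with net/rpc as follows. This is only an illustration under stated assumptions: TaskArgs, TaskReply, AssignTask, requestAll and the file names are hypothetical, the Master here is a stripped-down stand-in rather than the lab's mr.Master, and the connection is an in-process net.Pipe instead of the lab's own transport.

```go
package main

import (
	"fmt"
	"net"
	"net/rpc"
	"sync"
)

// TaskArgs and TaskReply are hypothetical message types for this sketch.
type TaskArgs struct {
	WorkerID int
}

type TaskReply struct {
	Filename string // input file of the assigned map task
	HasTask  bool   // false when no unstarted task is left
}

// Master is a stand-in that only tracks unstarted map tasks.
type Master struct {
	mu      sync.Mutex
	pending []string // files whose map task has not been handed out yet
}

// AssignTask hands out the next unstarted map task, if there is one.
func (m *Master) AssignTask(args *TaskArgs, reply *TaskReply) error {
	m.mu.Lock()
	defer m.mu.Unlock()
	if len(m.pending) == 0 {
		return nil // reply.HasTask stays false
	}
	reply.Filename = m.pending[0]
	reply.HasTask = true
	m.pending = m.pending[1:]
	return nil
}

// requestAll serves m over an in-process pipe and keeps asking for
// tasks until the master reports that none are left.
func requestAll(m *Master) ([]string, error) {
	server := rpc.NewServer()
	if err := server.Register(m); err != nil {
		return nil, err
	}
	clientConn, serverConn := net.Pipe()
	go server.ServeConn(serverConn)
	client := rpc.NewClient(clientConn)
	defer client.Close()

	var files []string
	for {
		// Use a fresh reply per call: gob does not transmit zero values,
		// so a reused reply could keep stale fields from the last call.
		reply := TaskReply{}
		if err := client.Call("Master.AssignTask", &TaskArgs{}, &reply); err != nil {
			return nil, err
		}
		if !reply.HasTask {
			return files, nil
		}
		files = append(files, reply.Filename)
	}
}

func main() {
	m := &Master{pending: []string{"pg-a.txt", "pg-b.txt"}}
	files, err := requestAll(m)
	if err != nil {
		fmt.Println("rpc failed:", err)
		return
	}
	fmt.Println(files)
}
```

In the real worker, each returned file name would be opened, read, and passed to the application Map function, as mrsequential.go does.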

 

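First test the RPC between the master and a worker. The master keeps running until the job is done, so start the two commands in separate terminals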

go run mrmaster.go pg-*.txt
go run mrworker.go wc.so

 

Building on this RPC, implement worker registration. There can be many workers, so each worker needs an ID, together with its own map and reduce functions (used later)

type worker struct {
	workerID int
	mapf func(string, string) []KeyValue
	reducef func(string, []string) string
}

Implement the remote procedure call


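// callRegisterWorker asks the master for an ID and stores it on the worker.
// The receiver is a pointer so that the assignment to w.workerID is
// visible to the caller; a value receiver would only modify a copy.
func (w *worker) callRegisterWorker() {
	args := &RegisterArgs{}
	reply := &RegisterReply{}
	if ok := call("Master.RegisterWorker", args, reply); !ok {
		log.Fatal("error: register worker failed.")
	}
	w.workerID = reply.ID

	fmt.Printf("worker %v registered successfully.\n", w.workerID)
}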
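One detail worth checking in a method like callRegisterWorker: it stores the ID on its receiver, and that assignment only survives the call when the receiver is a pointer. The sketch below isolates the difference; setIDByValue and setIDByPointer are hypothetical names used only here.

```go
package main

import "fmt"

type worker struct {
	workerID int
}

// setIDByValue has a value receiver: it assigns to a copy of the
// worker, so the caller's struct is left unchanged.
func (w worker) setIDByValue(id int) { w.workerID = id }

// setIDByPointer has a pointer receiver: the assignment is applied
// to the caller's struct.
func (w *worker) setIDByPointer(id int) { w.workerID = id }

func main() {
	w := worker{workerID: -1}

	w.setIDByValue(7)
	fmt.Println(w.workerID) // prints -1: the copy was modified, not w

	w.setIDByPointer(7)
	fmt.Println(w.workerID) // prints 7
}
```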

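Because the master assigns the IDs, it has to keep track of how many workers have registered so far. Registrations can arrive concurrently, so the counter also needs a lock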

type Master struct {
	// Your definitions here.
	numWorkers  int
	mutex       sync.Mutex
}

func (m *Master) RegisterWorker(args *RegisterArgs, reply *RegisterReply) error {
	m.mutex.Lock()
	defer m.mutex.Unlock()
	reply.ID = m.numWorkers
	m.numWorkers++
	return nil
}
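To check that the lock does its job, the registration handler can be exercised by several workers at once. The sketch below is a self-contained approximation: RegisterArgs and RegisterReply are not shown in this post, so their definitions here are assumptions, registerN is a hypothetical test helper, and each worker talks to the master over an in-process net.Pipe rather than through the lab's call helper.

```go
package main

import (
	"fmt"
	"net"
	"net/rpc"
	"sort"
	"sync"
)

// Assumed message types; the post does not show their definitions.
type RegisterArgs struct{}

type RegisterReply struct {
	ID int
}

type Master struct {
	numWorkers int
	mutex      sync.Mutex
}

// RegisterWorker hands out consecutive IDs; the mutex keeps the
// read-then-increment of numWorkers atomic across concurrent calls.
func (m *Master) RegisterWorker(args *RegisterArgs, reply *RegisterReply) error {
	m.mutex.Lock()
	defer m.mutex.Unlock()
	reply.ID = m.numWorkers
	m.numWorkers++
	return nil
}

// registerN (hypothetical helper) registers n workers concurrently,
// each over its own in-process connection, and returns the sorted IDs.
func registerN(m *Master, n int) ([]int, error) {
	server := rpc.NewServer()
	if err := server.Register(m); err != nil {
		return nil, err
	}

	ids := make([]int, n)
	errs := make([]error, n)
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			clientConn, serverConn := net.Pipe()
			go server.ServeConn(serverConn)
			client := rpc.NewClient(clientConn)
			defer client.Close()

			reply := RegisterReply{}
			errs[i] = client.Call("Master.RegisterWorker", &RegisterArgs{}, &reply)
			ids[i] = reply.ID
		}(i)
	}
	wg.Wait()

	for _, err := range errs {
		if err != nil {
			return nil, err
		}
	}
	sort.Ints(ids)
	return ids, nil
}

func main() {
	ids, err := registerN(&Master{}, 5)
	if err != nil {
		fmt.Println("registration failed:", err)
		return
	}
	fmt.Println(ids) // prints [0 1 2 3 4]
}
```

Without the mutex, two handlers could read the same numWorkers value and hand out the same ID, and running the program with go run -race would be expected to report the data race.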

Run the test. To be continued.

 

 
