A Memory Locality Test Experiment in Golang


dunzane


Tags: data structures, algorithms

Article link: https://blog.csdn.net/weixin_43495948/article/details/129850230


1. Test code

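package main

// Loop writes to every element of nums exactly once,
// visiting the indices in strides of step.
func Loop(nums []int, step int) {
	l := len(nums)
	for i := 0; i < step; i++ {
		for j := i; j < l; j += step {
			nums[j] = 4 // touch the memory and write a value
		}
	}
}

func main() {
	mySlice := make([]int, 10)
	Loop(mySlice, 3) // step = 3, the value used in the walkthrough
}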

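The code above is used to verify the memory locality of a program. With step set to 3, the first pass of the outer loop visits indices 0, 3, 6, 9, 12, ... of nums, the second pass visits indices 1, 4, 7, 10, 13, ..., and the third pass visits indices 2, 5, 8, 11, 14, .... Together the three passes cover the entire nums slice, touching each element exactly once. The program therefore controls the locality of the slice accesses: the smaller step is, the closer consecutive accesses are in memory and the better the locality; the larger step is, the worse the locality.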

2. Benchmark test

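Next, we use Golang's Benchmark facility to measure Loop() with different values of step and see how much the elapsed time differs between the cases. First create the file loop_test.go and implement CreateSource(), a function that builds a slice and initializes its memory with values: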


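// CreateSource builds a slice of n ints filled with 0..n-1, so that the
// backing memory is allocated and written before the benchmark starts.
func CreateSource(n int) []int {
	nums := make([]int, 0, n) // length 0, capacity n: append fills it to exactly n
	for i := 0; i < n; i++ {
		nums = append(nums, i)
	}
	return nums
}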

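Next, implement a Benchmark that builds a slice of length 10000. Note that b.ResetTimer() must be called after the slice has been created, so that the time spent in CreateSource() is excluded from the measurement. The code for a step of 1 is as follows: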


func BenchmarkLoopStep1(b *testing.B) {
	// Build the source data: a slice of length 10000
	src := CreateSource(10000)

	b.ResetTimer() // exclude the setup time from the measurement
	for i := 0; i < b.N; i++ {
		Loop(src, 1)
	}
}

3. Complete code:

loop.go

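package main

// CreateSource builds a slice of n ints filled with 0..n-1.
func CreateSource(n int) []int {
	nums := make([]int, 0, n)

	for i := 0; i < n; i++ {
		nums = append(nums, i)
	}

	return nums
}

// Loop writes to every element of nums exactly once,
// visiting the indices in strides of step.
func Loop(nums []int, step int) {
	l := len(nums)
	for i := 0; i < step; i++ {
		for j := i; j < l; j += step {
			nums[j] = 4 // touch the memory and write a value
		}
	}
}

// main repeats the small driver from section 1 so the package also builds on its own.
func main() {
	mySlice := make([]int, 10)
	Loop(mySlice, 3)
}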

loop_test.go

package main

import "testing"

func BenchmarkLoopStep1(b *testing.B) {
	// Build the source data: a slice of length 10000
	src := CreateSource(10000)

	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		Loop(src, 1)
	}
}

func BenchmarkLoopStep2(b *testing.B) {
	// Build the source data: a slice of length 10000
	src := CreateSource(10000)

	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		Loop(src, 2)
	}
}

func BenchmarkLoopStep3(b *testing.B) {
	// Build the source data: a slice of length 10000
	src := CreateSource(10000)

	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		Loop(src, 3)
	}
}

func BenchmarkLoopStep4(b *testing.B) {
	// Build the source data: a slice of length 10000
	src := CreateSource(10000)

	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		Loop(src, 4)
	}
}

func BenchmarkLoopStep5(b *testing.B) {
	// Build the source data: a slice of length 10000
	src := CreateSource(10000)

	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		Loop(src, 5)
	}
}

func BenchmarkLoopStep6(b *testing.B) {
	// Build the source data: a slice of length 10000
	src := CreateSource(10000)

	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		Loop(src, 6)
	}
}

func BenchmarkLoopStep12(b *testing.B) {
	// Build the source data: a slice of length 10000
	src := CreateSource(10000)

	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		Loop(src, 12)
	}
}

func BenchmarkLoopStep16(b *testing.B) {
	// Build the source data: a slice of length 10000
	src := CreateSource(10000)

	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		Loop(src, 16)
	}
}

4. Analysis of the output

Run the command:

go test -bench=.  -count=3

The output is as follows:

goos: darwin
goarch: arm64
pkg: v1
BenchmarkLoopStep1-8              405445              2890 ns/op
BenchmarkLoopStep1-8              413742              2881 ns/op
BenchmarkLoopStep1-8              411201              2884 ns/op
BenchmarkLoopStep2-8              412641              2902 ns/op
BenchmarkLoopStep2-8              412040              2902 ns/op
BenchmarkLoopStep2-8              412099              2903 ns/op
BenchmarkLoopStep3-8              409592              2930 ns/op
BenchmarkLoopStep3-8              404161              2947 ns/op
BenchmarkLoopStep3-8              407128              2922 ns/op
BenchmarkLoopStep4-8              407964              2931 ns/op
BenchmarkLoopStep4-8              407895              2932 ns/op
BenchmarkLoopStep4-8              408778              2928 ns/op
BenchmarkLoopStep5-8              403932              2952 ns/op
BenchmarkLoopStep5-8              405253              2950 ns/op
BenchmarkLoopStep5-8              404827              2951 ns/op
BenchmarkLoopStep6-8              400930              2963 ns/op
BenchmarkLoopStep6-8              403382              2963 ns/op
BenchmarkLoopStep6-8              396916              2965 ns/op
BenchmarkLoopStep12-8             387514              3056 ns/op
BenchmarkLoopStep12-8             391561              3056 ns/op
BenchmarkLoopStep12-8             389544              3055 ns/op
BenchmarkLoopStep16-8             383607              3112 ns/op
BenchmarkLoopStep16-8             377530              3115 ns/op
BenchmarkLoopStep16-8             380583              3121 ns/op
PASS
ok      v1      32.574s

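First, an explanation of the fields in the output above:
- BenchmarkLoopStep1-8: the benchmark name; the -8 suffix is the value of GOMAXPROCS during the run
- '405445' is the number of iterations executed (b.N)
- '2890' is the average time per iteration, in nanoseconds (ns/op)
The results show that the better the memory locality of the code (the smaller step is), the better its performance. In this run the effect is modest: the average time rises from about 2890 ns/op at step 1 to about 3115 ns/op at step 16, roughly 8% slower.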
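One way to explore the effect further is to vary the slice size as well as the step. On a 64-bit platform, 10000 ints occupy about 80 KB, so the whole slice may fit in the CPU cache on many machines, which could be one reason the gap stays small; this is an assumption, not something measured in this article. The sketch below uses testing.Benchmark from the standard library to run the same Loop outside of go test. MeasureStep and the chosen sizes are hypothetical, and no results are claimed here.

```go
package main

import (
	"fmt"
	"testing"
)

// Loop is the function under test, copied from the article.
func Loop(nums []int, step int) {
	l := len(nums)
	for i := 0; i < step; i++ {
		for j := i; j < l; j += step {
			nums[j] = 4
		}
	}
}

// MeasureStep benchmarks Loop on a slice of n ints with the given step.
// It is a hypothetical helper for experimenting with other sizes.
func MeasureStep(n, step int) testing.BenchmarkResult {
	testing.Init() // registers the testing flags; a no-op if already called
	src := make([]int, n)
	return testing.Benchmark(func(b *testing.B) {
		for i := 0; i < b.N; i++ {
			Loop(src, step)
		}
	})
}

func main() {
	// The sizes and steps below are arbitrary choices for illustration.
	for _, n := range []int{10000, 1000000} {
		for _, step := range []int{1, 16} {
			r := MeasureStep(n, step)
			fmt.Printf("n=%d step=%d: %d ns/op\n", n, step, r.NsPerOp())
		}
	}
}
```

Because the numbers depend on the machine, the output is best compared only within a single run on the same hardware.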

5. Further thinking

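In Golang's GPM scheduler model, why is a child G spawned by a G placed preferentially in the local G queue of the current P, rather than in the local queue of a P bound to another M? Why is the GPM scheduler designed to preserve locality?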

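First, recall the GPM architecture.
A child G spawned by a G is placed preferentially in the local G queue in order to improve memory locality as much as possible: a child often works on data its parent has just touched, so running it on the same P makes it more likely that the data is still in that CPU's cache.
The GPM scheduler is designed to preserve locality in order to maximize efficiency. (Try reasoning about what would happen if it were not designed this way.)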
