How the lat_mem_rd memory latency tool works, and fixing the undefined llseek link error when building lmbench

Command overview:

Manual page:

 LAT_MEM_RD(8) manual page

lat_mem_rd is a tool in lmbench whose main purpose is to measure memory access latency.

Source code:

https://github.com/keith-packard/lmbench3

cd lmbench3

make 

This builds the tools; the generated files are placed under the ./bin directory.

The build may fail at link time because llseek cannot be resolved:

gcc -O -DRUSAGE -DHAVE_uint=1 -DHAVE_int64_t=1 -DHAVE_DRAND48 -DHAVE_SCHED_SETAFFINITY=1   -o ../bin/x86_64-linux-gnu/disk disk.c ../bin/x86_64-linux-gnu/lmbench.a -lm
/usr/bin/ld: /tmp/cc7D60jo.o: in function `seekto':
disk.c:(.text+0x37): undefined reference to `llseek'
collect2: error: ld returned 1 exit status

Changing the two occurrences of llseek in disk.c to lseek64 fixes it.

#ifdef	__linux__
	//extern	loff_t llseek(int, loff_t, int);
	extern	loff_t lseek64(int, loff_t, int);

	//if (llseek(fd, (loff_t)off, SEEK_SET) == (loff_t)-1) {
	if (lseek64(fd, (loff_t)off, SEEK_SET) == (loff_t)-1) {
		return(-1);
	}
	return (0);
#else
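
As a hedged alternative to the lseek64 edit above, the same seek can be written with the POSIX lseek call and off_t, asking for a 64-bit off_t through _FILE_OFFSET_BITS. This is a swapped-in technique, not what the post or lmbench does, and the seekto signature below is an assumption made for illustration.

```c
#define _FILE_OFFSET_BITS 64    /* must come before any system header */
#include <sys/types.h>
#include <unistd.h>

/*
 * Position fd at absolute byte offset `off`.
 * Returns 0 on success and -1 on failure, like the excerpt above.
 * The parameter type is an assumption, not copied from disk.c.
 */
int seekto(int fd, unsigned long long off)
{
    if (lseek(fd, (off_t)off, SEEK_SET) == (off_t)-1)
        return -1;
    return 0;
}
```

On 64-bit Linux off_t is already 64 bits wide, so the macro mainly matters for 32-bit builds.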

Related link:

On Intel platforms you can also use Intel's official memory test tool:

Intel® Memory Latency Checker v3.9a

Usage:

lat_mem_rd size_in_megabytes stride [stride stride...]

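For example: lat_mem_rd 128 64 1024

Here the size is 128 MB and the strides are 64 bytes and 1024 bytes. If no stride is given, the default is 512. Several strides can be listed so that a single command runs several tests.

Output:

Different parameter choices are used to test main memory or the caches.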

The output is best examined in a graph where you typically get a graph that has four plateaus. The graph should be plotted in log base 2 of the array size on the X axis and the latency on the Y axis. Each stride is then plotted as a curve. The plateaus that appear correspond to the onboard cache (if present), external cache (if present), main memory latency, and TLB miss latency.

As a rough guide, you may be able to extract the latencies of the various parts as follows, but you should really look at the graphs, since these rules of thumb do not always work (some systems do not have onboard cache, for example).

onboard cache

Try stride of 128 and array size of .00098.

external cache

Try stride of 128 and array size of .125.

main memory

Try stride of 128 and array size of 8.

TLB miss

Try the largest stride and the largest array.

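Below is a test result taken from Zoran's Blog: Memory latency using lat_mem_rd from lmbench

The first results, about 1.2 ns, are L1 cache accesses.

The next plateau, around 3.6 ns, is L2 cache access.

About 6 ns is L3 cache access.

Main memory latency is about 21 to 22 ns.

The raw output: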

> numactl --membind=0 --cpunodebind=0 ./lat_mem_rd 2000 128
"stride=128
0.00049 1.205
0.00098 1.198
0.00195 1.195
0.00293 1.209
0.00391 1.211
0.00586 1.201
0.00781 1.199
0.01172 1.201
0.01562 1.194
0.02344 1.200
0.03125 1.217
0.04688 3.523
0.06250 3.646
0.09375 3.616
0.12500 3.611
0.18750 3.658
0.25000 4.928
0.37500 5.837
0.50000 5.791
0.75000 5.843
1.00000 5.883
1.50000 5.959
2.00000 5.983
3.00000 6.174
4.00000 9.150
6.00000 15.852
8.00000 19.982
12.00000 21.567
16.00000 21.585
24.00000 21.735
32.00000 21.610
48.00000 22.535
64.00000 22.093
96.00000 22.033
128.00000 22.608
192.00000 21.498
256.00000 21.594
384.00000 21.492
512.00000 21.473
768.00000 22.752
1024.00000 22.462
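
The plateau boundaries in a listing like this can also be located programmatically. The following is a minimal sketch, not part of lmbench: find_steps and mb_to_kb are hypothetical helper names, and the jump ratio passed by the caller is an arbitrary threshold rather than a value from the manual page.

```c
#include <assert.h>
#include <stddef.h>

/* Convert the first output column (array size in MB) to KB. */
double mb_to_kb(double mb)
{
    return mb * 1024.0;
}

/*
 * Record every index i where lat[i + 1] > ratio * lat[i], that is, the
 * last array size before a latency step. Returns the number of steps
 * found; at most `cap` indices are written to `out`.
 */
size_t find_steps(const double *lat, size_t n, double ratio,
                  size_t *out, size_t cap)
{
    assert(ratio > 1.0);
    size_t found = 0;
    for (size_t i = 0; i + 1 < n; i++) {
        if (lat[i + 1] > ratio * lat[i]) {
            if (found < cap)
                out[found] = i;
            found++;
        }
    }
    return found;
}
```

Given the four rows from 0.01562 to 0.06250 above and a ratio of 1.5, the only step it reports is after the 0.03125 MB row, and mb_to_kb(0.03125) gives 32 KB.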

II. Internal implementation

The latency measurement code in lat_mem_rd looks like this:

#define ONE     p = (char **)*p;
#define FIVE    ONE ONE ONE ONE ONE
#define TEN     FIVE FIVE
#define FIFTY   TEN TEN TEN TEN TEN
#define HUNDRED FIFTY FIFTY

    while (iterations-- > 0) {
        for (i = 0; i < count; ++i) {
            HUNDRED;
        }
    }

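Each location holds a pointer to the next location to visit, so the loop walks the array by chasing pointers, and every load depends on the previous one. For example, an output line of 0.00049 1.584 means the loop kept cycling through a 512-byte range (0.00049 MB) with a stride of 16 bytes; the latency is the total time divided by the number of accesses.

Once the range exceeds the 32 KB L1 cache, the latency steps up to the next plateau.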
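
To make the pointer-chasing idea concrete, here is a minimal sketch that reuses the ONE to HUNDRED macros. It is a simplification under stated assumptions, not lmbench's implementation: build_ring, chase and chase_latency_ns are hypothetical names, the ring is linked in plain forward order, and timing uses the standard clock() function (CPU time) instead of lmbench's own timing harness.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <time.h>

#define ONE     p = (char **)*p;
#define FIVE    ONE ONE ONE ONE ONE
#define TEN     FIVE FIVE
#define FIFTY   TEN TEN TEN TEN TEN
#define HUNDRED FIFTY FIFTY

/* Link `range` bytes into a ring: the slot at offset k*stride holds the
 * address of the slot at (k+1)*stride, and the last slot points back to
 * offset 0. Returns the number of slots. */
static size_t build_ring(char *buf, size_t range, size_t stride)
{
    assert(stride >= sizeof(char *) && stride % sizeof(char *) == 0);
    assert(range >= stride);
    size_t slots = range / stride;
    for (size_t k = 0; k < slots; k++)
        *(char **)(buf + k * stride) = buf + ((k + 1) % slots) * stride;
    return slots;
}

/* Perform hundreds * 100 dependent loads starting at p. */
static char **chase(char **p, size_t hundreds)
{
    for (size_t i = 0; i < hundreds; i++) {
        HUNDRED
    }
    return p;
}

/* Average nanoseconds per load, or -1.0 if allocation fails. */
double chase_latency_ns(size_t range, size_t stride, size_t hundreds)
{
    assert(hundreds > 0);
    char *buf = malloc(range);
    if (buf == NULL)
        return -1.0;
    build_ring(buf, range, stride);

    /* The volatile sink keeps the compiler from discarding the loads. */
    char **volatile sink = chase((char **)buf, 1);   /* warm-up pass */
    clock_t start = clock();
    sink = chase((char **)buf, hundreds);
    clock_t end = clock();
    (void)sink;

    free(buf);
    return (double)(end - start) / CLOCKS_PER_SEC * 1e9
           / ((double)hundreds * 100.0);
}
```

Calling chase_latency_ns(32 * 1024, 64, 1000000) and then repeating with larger ranges should show the same kind of step pattern as the listing above, though absolute numbers depend on the machine and on hardware prefetching.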
