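Notes
To measure the best read/write performance of an eMMC with fio, keep the following points in mind.
- eMMC performance is bounded by the eMMC chip itself, so take the performance figures in the datasheet as a reference.
- The eMMC speed mode and clock frequency determine the transfer speed. The fastest speed mode in eMMC 5.1 is HS400: an 8-bit data bus at 200 MHz with data transferred on both clock edges, which gives up to 400 MB/s on the bus.
- Sequential reads and writes are generally faster than random ones.
- The eMMC vendor may have implemented a buffer inside the chip, for example a 200 MB buffer. If the amount of data transferred is smaller than the buffer, the transfer is very fast. The datasheet may not mention that the buffer exists, so when testing, try a small file size and see whether performance improves.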
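The last point can be checked by sweeping the test size and comparing the write bandwidth at each step. The sketch below only prints the fio command lines instead of running them. The size list, the job names, and the helper name `print_sweep` are illustrative assumptions, and 200 MB is just the example figure from the note above. Writing to a raw block device destroys the data on it, so review the commands before running any of them.

```shell
#!/bin/sh
# Print (do not run) one fio command per test size, so the write bandwidth
# at each size can be compared. Sizes and job names are illustrative.
print_sweep() {
    dev="$1"; shift
    for size in "$@"; do
        echo "fio --name=write-$size --filename=$dev --rw=write --bs=1024k" \
             "--ioengine=psync --direct=1 --size=$size"
    done
}

# Example: sizes below and above a hypothetical 200 MB buffer.
print_sweep /dev/mmcblk1 50m 100m 400m
```

If the bandwidth drops sharply once the size exceeds some threshold, that threshold is a plausible estimate of the buffer size.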
Cross-compiling fio
My tests were run on an arm64 board.
Install the toolchain
apt install gcc-aarch64-linux-gnu
export ARCH=aarch64
export CROSS_COMPILE=aarch64-linux-gnu-
Download and build fio
With the environment variables above set, a plain make builds for the aarch64 platform automatically.
git clone git://git.kernel.dk/fio.git
cd fio
git reset --hard fio-3.25
make
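Before copying the binary to the board, it is worth confirming that it really targets aarch64. A minimal sketch, assuming the `file` utility is installed on the build host; `is_aarch64` is a hypothetical helper that only inspects the text `file` prints.

```shell
#!/bin/sh
# Return success if the given `file` output describes an aarch64 ELF binary.
is_aarch64() {
    case "$1" in
        *ELF*aarch64*) return 0 ;;
        *) return 1 ;;
    esac
}

# Run from the fio source directory after `make`.
if [ -f ./fio ]; then
    if is_aarch64 "$(file ./fio)"; then
        echo "fio was built for aarch64"
    else
        echo "fio was NOT built for aarch64; check CROSS_COMPILE" >&2
    fi
fi
```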
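Testing with fio
First check the current eMMC settings: speed mode, data bus width, and clock frequency. The speed mode and the bus width are both at their maximum, but the actual clock of 150 MHz is lower than the 200 MHz maximum allowed by the specification, and this affects performance.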
# cat /sys/kernel/debug/mmc1/ios
clock: 200000000 Hz
actual clock: 150000000 Hz
vdd: 21 (3.3 ~ 3.4 V)
bus mode: 2 (push-pull)
chip select: 0 (don't care)
power mode: 2 (on)
bus width: 3 (8 bits)
timing spec: 10 (mmc HS400)
signal voltage: 1 (1.80 V)
driver type: 0 (driver type B)
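The effect of the lower clock can be estimated from the bus parameters alone. HS400 moves one byte per clock edge over the 8-bit bus, so the peak bus rate in MB/s is twice the clock in MHz. The sketch below reads the `actual clock` field from text in the format shown above; both function names are hypothetical, and the result is a ceiling for the bus, not a figure the flash itself is guaranteed to sustain.

```shell
#!/bin/sh
# Extract the "actual clock" value in Hz from ios-style text on stdin.
actual_clock_hz() {
    awk -F':[ \t]*' '$1 == "actual clock" { printf "%d\n", $2 + 0 }'
}

# Peak HS400 bus rate in MB/s: 8-bit bus (1 byte) on both clock edges.
hs400_peak_mb() {
    awk -v hz="$1" 'BEGIN { printf "%d\n", hz * 2 / 1000000 }'
}

# On the board (debugfs must be mounted):
#   hs400_peak_mb "$(actual_clock_hz < /sys/kernel/debug/mmc1/ios)"
hs400_peak_mb 150000000   # 300
hs400_peak_mb 200000000   # 400
```

At 150 MHz the bus ceiling therefore drops from 400 MB/s to 300 MB/s.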
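Create a job file, mmc1.ini. Note that the write job writes directly to the raw block device /dev/mmcblk1, which overwrites whatever data is stored there.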
[global]
ioengine=psync
direct=1
invalidate=1
iodepth=1
time_based
runtime=60
ramp_time=10
[read-1024k-seq]
stonewall
rw=read
bs=1024k
filename=/dev/mmcblk1
filesize=100mb
[write-1024k-seq]
stonewall
rw=write
bs=1024k
filename=/dev/mmcblk1
filesize=100mb
Run fio
# fio mmc1.ini
read-1024k-seq: (g=0): rw=read, bs=(R) 1024KiB-1024KiB, (W) 1024KiB-1024KiB, (T) 1024KiB-1024KiB, ioengine=psync, iodepth=1
write-1024k-seq: (g=1): rw=write, bs=(R) 1024KiB-1024KiB, (W) 1024KiB-1024KiB, (T) 1024KiB-1024KiB, ioengine=psync, iodepth=1
fio-3.25
Starting 2 processes
Jobs: 1 (f=1): [_(1),W(1)][100.0%][w=20.0MiB/s][w=20 IOPS][eta 00m:00s]
read-1024k-seq: (groupid=0, jobs=1): err= 0: pid=769: Mon Jan 4 07:46:38 2021
read: IOPS=170, BW=171MiB/s (179MB/s)(10.0GiB/60002msec)
clat (usec): min=5774, max=6906, avg=5839.52, stdev=100.22
lat (usec): min=5775, max=6907, avg=5840.73, stdev=100.22
clat percentiles (usec):
| 1.00th=[ 5800], 5.00th=[ 5800], 10.00th=[ 5800], 20.00th=[ 5800],
| 30.00th=[ 5800], 40.00th=[ 5800], 50.00th=[ 5800], 60.00th=[ 5800],
| 70.00th=[ 5866], 80.00th=[ 5866], 90.00th=[ 5866], 95.00th=[ 5866],
| 99.00th=[ 6259], 99.50th=[ 6783], 99.90th=[ 6849], 99.95th=[ 6849],
| 99.99th=[ 6915]
bw ( KiB/s): min=174080, max=176480, per=100.00%, avg=175173.47, stdev=1019.27, samples=120
iops : min= 170, max= 172, avg=170.89, stdev= 0.99, samples=120
lat (msec) : 10=100.00%
cpu : usr=0.24%, sys=2.13%, ctx=31187, majf=0, minf=60
IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
issued rwts: total=10255,0,0,0 short=0,0,0,0 dropped=0,0,0,0
latency : target=0, window=0, percentile=100.00%, depth=1
write-1024k-seq: (groupid=1, jobs=1): err= 0: pid=770: Mon Jan 4 07:46:38 2021
write: IOPS=19, BW=19.9MiB/s (20.8MB/s)(1193MiB/60038msec); 0 zone resets
clat (usec): min=45401, max=72577, avg=50211.56, stdev=5073.69
lat (usec): min=45452, max=72686, avg=50313.43, stdev=5074.01
clat percentiles (usec):
| 1.00th=[47449], 5.00th=[47973], 10.00th=[47973], 20.00th=[48497],
| 30.00th=[48497], 40.00th=[49021], 50.00th=[49021], 60.00th=[49021],
| 70.00th=[49021], 80.00th=[49546], 90.00th=[50070], 95.00th=[68682],
| 99.00th=[70779], 99.50th=[71828], 99.90th=[71828], 99.95th=[72877],
| 99.99th=[72877]
bw ( KiB/s): min=18432, max=22528, per=100.00%, avg=20356.54, stdev=782.41, samples=120
iops : min= 18, max= 22, avg=19.87, stdev= 0.77, samples=120
lat (msec) : 50=91.87%, 100=8.13%
cpu : usr=0.31%, sys=0.16%, ctx=3676, majf=0, minf=61
IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
issued rwts: total=0,1193,0,0 short=0,0,0,0 dropped=0,0,0,0
latency : target=0, window=0, percentile=100.00%, depth=1
Run status group 0 (all jobs):
READ: bw=171MiB/s (179MB/s), 171MiB/s-171MiB/s (179MB/s-179MB/s), io=10.0GiB (10.8GB), run=60002-60002msec
Run status group 1 (all jobs):
WRITE: bw=19.9MiB/s (20.8MB/s), 19.9MiB/s-19.9MiB/s (20.8MB/s-20.8MB/s), io=1193MiB (1251MB), run=60038-60038msec
Disk stats (read/write):
mmcblk1: ios=23933/2783, merge=0/0, ticks=101957/104857, in_queue=150068, util=58.09%
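To pull only the headline numbers out of a saved log, the `READ:` and `WRITE:` summary lines can be filtered. This is a sketch that assumes the human-readable output shown above was saved to a file and that fio reported the bandwidth in MB/s; `summary_bw` is a hypothetical helper name.

```shell
#!/bin/sh
# Print "<direction> <bandwidth in MB/s>" for each run-status summary line
# read from stdin, e.g. "READ 179". Assumes the value is reported in MB/s.
summary_bw() {
    awk '$1 == "READ:" || $1 == "WRITE:" {
        dir = $1; sub(/:$/, "", dir)           # "READ:" -> "READ"
        bw = $3;  gsub(/[()]|MB\/s|,/, "", bw) # "(179MB/s)," -> "179"
        print dir, bw
    }'
}

# Usage, with the fio output saved as fio.log:
#   summary_bw < fio.log
```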
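Result analysis
The measured read performance is 179 MB/s and the write performance is 20.8 MB/s. The eMMC part we use is MTFC8GAKAJCN-4M IT; according to its datasheet, the vendor specifies 190 MB/s for reads and 22 MB/s for writes. The test results match the datasheet, and are arguably even better than they look, because our clock runs at 150 MHz; raising it to 200 MHz should yield higher performance.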
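The comparison can be made concrete by expressing each measured figure as a percentage of the datasheet value. A small sketch using the numbers quoted above; `pct_of` is a hypothetical helper.

```shell
#!/bin/sh
# Print measured/reference as a percentage with one decimal place.
pct_of() {
    awk -v m="$1" -v r="$2" 'BEGIN { printf "%.1f\n", m / r * 100 }'
}

pct_of 179 190    # read:  94.2
pct_of 20.8 22    # write: 94.5
```

Both results land at roughly 94 to 95 percent of the datasheet values.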