Adding WRITE SAME Support to SPDK

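WRITE SAME is an optional SCSI command whose main use is resetting the contents of a device.
A typical scenario is ESXi zeroing an entire volume when it creates a thick provision eager zeroed disk.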

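In cloud environments a VM usually has several volumes attached, each ranging from gigabytes to terabytes in size.
For stable performance, many distributed storage systems need each volume to be written once in full before running workloads or benchmarks.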

The point of the full write is to let the backend storage allocate its metadata ahead of time and warm up.

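If the upper layer issues the full-volume write itself, for example with write, a terabyte-scale volume takes a very long time.
For example, a 1 TB volume at a sequential write speed of 500 MB/s needs roughly 2000 s, a little over 30 minutes.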
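
The estimate above can be reproduced with a few lines of Python. This is only a sketch of the arithmetic; it assumes binary units (1 TiB and 500 MiB/s), which the article does not specify, and the function name is made up for illustration.

```python
def full_write_seconds(volume_bytes: int, throughput_bytes_per_s: float) -> float:
    """Time to write every byte of a volume once at a constant throughput."""
    return volume_bytes / throughput_bytes_per_s


MIB = 1024 ** 2
TIB = 1024 ** 4

# Assumed figures from the example: a 1 TiB volume written at 500 MiB/s.
seconds = full_write_seconds(1 * TIB, 500 * MIB)
print(f"{seconds:.0f} s, about {seconds / 60:.0f} minutes")  # 2097 s, about 35 minutes
```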

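Write Same exists for exactly this case.
By issuing a Write Same command at the block device layer, the upper layer no longer has to transfer all that data, only a very small amount (512B).
The backend then writes that block over and over, so the full write is offloaded to the backend.
In short, the goals of Write Same are: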

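- greatly reduce the amount of data transferred;
- offload the full-volume write to the backend;
- when combined with UNMAP, avoid writing as much as possible and improve performance even further.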
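
To make the "very little data" point concrete, here is a minimal sketch that builds a WRITE SAME (16) CDB by hand. The field layout follows the SBC definition as I understand it (opcode 0x93, UNMAP bit 0x08 and NDOB bit 0x01 in byte 1, an 8-byte LBA, a 4-byte block count); the helper name `write_same16_cdb` is hypothetical and is not part of SPDK or sg3_utils.

```python
import struct

WRITE_SAME_16 = 0x93  # WRITE SAME (16) operation code


def write_same16_cdb(lba: int, num_blocks: int,
                     unmap: bool = False, ndob: bool = False) -> bytes:
    """Build a 16-byte WRITE SAME (16) CDB (WRPROTECT and ANCHOR left at 0)."""
    flags = 0
    if unmap:
        flags |= 0x08  # UNMAP bit: the device may deallocate instead of writing
    if ndob:
        flags |= 0x01  # NDOB bit: no data-out buffer is sent at all
    # opcode, flags, 8-byte LBA, 4-byte block count, group number, control
    return struct.pack(">BBQIBB", WRITE_SAME_16, flags, lba, num_blocks, 0, 0)


cdb = write_same16_cdb(lba=1024, num_blocks=128, unmap=True)
print(cdb.hex(" "))  # 93 08 00 00 00 00 00 00 04 00 00 00 00 80 00 00

# Bytes on the wire: one CDB plus one 512-byte block, versus 128 full blocks.
print(len(cdb) + 512, "bytes instead of", 128 * 512)  # 528 bytes instead of 65536
```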

This article covers:

- a short introduction to SCSI provisioning;
- how to view the SCSI provisioning and Write Same/Unmap capabilities a device reports;
- how to test WRITE SAME.
Provisioning

Provisioning determines how logical blocks map to physical blocks.
According to the SBC-3 specification, provisioning falls into the following categories:

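- full provisioning: logical blocks and physical blocks map one to one
- logical block provisioning:
  - resource provisioning: there are enough resources for every logical block to map to a physical block, but some blocks are currently unmapped or anchored;
  - thin provisioning: overcommit is allowed, that is, the number of LBs can exceed the number of physical blocks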
Terminology

- anchored: reserved; the LB has an associated physical block, but the block is not in use
- unmapped: the LB has no associated physical block
Inspecting SCSI features with sg3_utils

Check whether logical block provisioning is supported:

$ sg_readcap /dev/sda -l
Read Capacity results:
 Protection: prot_en=0, p_type=0, p_i_exponent=0
 Logical block provisioning: lbpme=1, lbprz=0
 Last logical block address=629145599 (0x257fffff), Number of logical blocks=629145600
 Logical block length=512 bytes
 Logical blocks per physical block exponent=0
 Lowest aligned logical block address=0
Hence:
 Device size: 322122547200 bytes, 307200.0 MiB, 322.12 GB
View the Block Limits VPD page:

 $ sg_vpd -p bl /dev/sdb
Block limits VPD page (SBC):
  Write same no zero (WSNZ): 0
  Maximum compare and write length: 0 blocks
  Optimal transfer length granularity: 1 blocks
  Maximum transfer length: 2097152 blocks
  Optimal transfer length: 64 blocks
  Maximum prefetch length: 0 blocks
  Maximum unmap LBA count: 4294967295
  Maximum unmap block descriptor count: 256
  Optimal unmap granularity: 1
  Unmap granularity alignment valid: 0
  Unmap granularity alignment: 0
  Maximum write same length: 0xffff blocks
View a disk's logical block provisioning (LBP) information:
 $ sg_vpd -p lbpv /dev/sda                                                                           
  Logical block provisioning VPD page (SBC):
  Unmap command supported (LBPU): 0
  Write same (16) with unmap bit supported (LBWS): 1
  Write same (10) with unmap bit supported (LBWS10): 0
  Logical block provisioning read zeros (LBPRZ): 0
  Anchored LBAs supported (ANC_SUP): 0
  Threshold exponent: 0
  Descriptor present (DP): 0
  Provisioning type: 0
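The commands above are the ones used most often when checking SCSI provisioning, Write Same, and Unmap support. A SCSI device exposes these features through READ CAPACITY (16) and the INQUIRY VPD pages.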

The most important abbreviations in the output above, and what they mean:

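- LBPU: logical block provisioning unmap; the UNMAP command is supported
- LBWS: logical block provisioning write same; WRITE SAME (16) with the unmap bit is supported
- LBWS10: logical block provisioning write same (10); WRITE SAME (10) with the unmap bit is supported
- LBPRZ: logical block provisioning read zeros
- lbpme: logical block provisioning management enabled; 1 means the device supports logical block provisioning
- lbprz: logical block provisioning read zeros; 1 means reading an unmapped LBA returns zeros
- Maximum write same length: 0xffff blocks is the largest number of blocks a single Write Same command can write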
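
When several devices need to be checked, the flags can be pulled out of the sg_vpd text with a small parser. The sketch below works on a pasted copy of the scsi_debug output shown in this article; `parse_lbp_flags` is a hypothetical helper, and on a real system the text would come from running `sg_vpd -p lbpv` against the device.

```python
import re

# Sample text copied from the `sg_vpd -p lbpv` output in this article.
SAMPLE = """\
Logical block provisioning VPD page (SBC):
  Unmap command supported (LBPU): 1
  Write same (16) with unmap bit supported (LBWS): 1
  Write same (10) with unmap bit supported (LBWS10): 0
  Logical block provisioning read zeros (LBPRZ): 1
  Anchored LBAs supported (ANC_SUP): 0
  Threshold exponent: 0
  Descriptor present (DP): 0
  Provisioning type: 0
"""

# Matches a trailing "(ABBREV): <number>" on a line.
FIELD = re.compile(r"\((\w+)\):\s*(\d+)\s*$")


def parse_lbp_flags(text: str) -> dict:
    """Map each abbreviated flag name to its integer value."""
    flags = {}
    for line in text.splitlines():
        match = FIELD.search(line)
        if match:
            flags[match.group(1)] = int(match.group(2))
    return flags


flags = parse_lbp_flags(SAMPLE)
print("UNMAP supported:", bool(flags["LBPU"]))
print("WRITE SAME (16) with unmap supported:", bool(flags["LBWS"]))
```

Lines without an abbreviation in parentheses, such as "Threshold exponent", are skipped on purpose; only the flag fields are of interest here.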
View the mapping status:

 $ sg_get_lba_status /dev/sda
descriptor LBA: 0x0000000000000000  blocks: 838860800  mapped
Testing with scsi_debug

 $ modprobe scsi_debug lbprz=1 lbpu=1 lbpws=1 dev_size_mb=1024
 $ sg_readcap /dev/sdb -l
Read Capacity results:
   Protection: prot_en=0, p_type=0, p_i_exponent=0
   Logical block provisioning: lbpme=1, lbprz=1
   Last logical block address=2097151 (0x1fffff), Number of logical blocks=2097152
   Logical block length=512 bytes
   Logical blocks per physical block exponent=0
   Lowest aligned logical block address=0
Hence:
   Device size: 1073741824 bytes, 1024.0 MiB, 1.07 GB

 $ sg_vpd -p lbpv /dev/sdb
Logical block provisioning VPD page (SBC):
  Unmap command supported (LBPU): 1
  Write same (16) with unmap bit supported (LBWS): 1
  Write same (10) with unmap bit supported (LBWS10): 0
  Logical block provisioning read zeros (LBPRZ): 1
  Anchored LBAs supported (ANC_SUP): 0
  Threshold exponent: 0
  Descriptor present (DP): 0
  Provisioning type: 0
 $ rmmod scsi_debug && modprobe scsi_debug lbprz=1 lbpu=0 lbpws=0 dev_size_mb=1024
 $ sg_readcap /dev/sdb -l && sg_vpd -p lbpv /dev/sdb
Read Capacity results:
   Protection: prot_en=0, p_type=0, p_i_exponent=0
   Logical block provisioning: lbpme=0, lbprz=0
   Last logical block address=2097151 (0x1fffff), Number of logical blocks=2097152
   Logical block length=512 bytes
   Logical blocks per physical block exponent=0
   Lowest aligned logical block address=0
Hence:
   Device size: 1073741824 bytes, 1024.0 MiB, 1.07 GB
Logical block provisioning VPD page (SBC):
  Unmap command supported (LBPU): 0
  Write same (16) with unmap bit supported (LBWS): 0
  Write same (10) with unmap bit supported (LBWS10): 0
  Logical block provisioning read zeros (LBPRZ): 1
  Anchored LBAs supported (ANC_SUP): 0
  Threshold exponent: 0
  Descriptor present (DP): 0
  Provisioning type: 0
 $ rmmod scsi_debug && modprobe scsi_debug lbprz=1 lbpu=1 lbpws=0 dev_size_mb=1024

 $ sg_readcap /dev/sdb -l && sg_vpd -p lbpv /dev/sdb
Read Capacity results:
   Protection: prot_en=0, p_type=0, p_i_exponent=0
   Logical block provisioning: lbpme=1, lbprz=1
   Last logical block address=2097151 (0x1fffff), Number of logical blocks=2097152
   Logical block length=512 bytes
   Logical blocks per physical block exponent=0
   Lowest aligned logical block address=0
Hence:
   Device size: 1073741824 bytes, 1024.0 MiB, 1.07 GB
Logical block provisioning VPD page (SBC):
  Unmap command supported (LBPU): 1
  Write same (16) with unmap bit supported (LBWS): 0
  Write same (10) with unmap bit supported (LBWS10): 0
  Logical block provisioning read zeros (LBPRZ): 1
  Anchored LBAs supported (ANC_SUP): 0
  Threshold exponent: 0
  Descriptor present (DP): 0
  Provisioning type: 0
Turning off logical block provisioning

 $ rmmod scsi_debug && modprobe scsi_debug lbprz=1 lbpu=0 lbpws=0 dev_size_mb=1024
 $ sg_readcap /dev/sdb -l && sg_vpd -p lbpv /dev/sdb
Read Capacity results:
   Protection: prot_en=0, p_type=0, p_i_exponent=0
   Logical block provisioning: lbpme=0, lbprz=0
   Last logical block address=2097151 (0x1fffff), Number of logical blocks=2097152
   Logical block length=512 bytes
   Logical blocks per physical block exponent=0
   Lowest aligned logical block address=0
Hence:
   Device size: 1073741824 bytes, 1024.0 MiB, 1.07 GB
Logical block provisioning VPD page (SBC):
  Unmap command supported (LBPU): 0
  Write same (16) with unmap bit supported (LBWS): 0
  Write same (10) with unmap bit supported (LBWS10): 0
  Logical block provisioning read zeros (LBPRZ): 1
  Anchored LBAs supported (ANC_SUP): 0
  Threshold exponent: 0
  Descriptor present (DP): 0
  Provisioning type: 0

 $ sg_get_lba_status -l 1024 /dev/sdb
Get LBA Status command not supported
Testing unmap

 $ rmmod scsi_debug && modprobe scsi_debug lbprz=1 lbpu=1 lbpws=1 dev_size_mb=1024
 $ dd if=/dev/zero of=/dev/sdb bs=512 seek=1024 count=138 && sg_get_lba_status -l 1024 /dev/sdb
138+0 records in
138+0 records out
70656 bytes (71 kB) copied, 0.0192505 s, 3.7 MB/s
descriptor LBA: 0x0000000000000400  blocks: 144  mapped


 $ sg_unmap -v -l 1024 -n 16 /dev/sdb
    unmap cdb: 42 00 00 00 00 00 00 00 18 00

 $ sg_get_lba_status -l 1024 /dev/sdb
descriptor LBA: 0x0000000000000400  blocks: 16  deallocated

 $ sg_get_lba_status -l 1040 /dev/sdb
descriptor LBA: 0x0000000000000410  blocks: 128  mapped
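
The CDB printed by `sg_unmap -v` above can be decoded by rebuilding it. The sketch below assumes the standard UNMAP layout (opcode 0x42, an 8-byte parameter list header followed by 16-byte block descriptors); the helper names are made up for illustration. The parameter list length of 0x18 in the CDB is the 8-byte header plus one 16-byte descriptor.

```python
import struct

UNMAP = 0x42  # UNMAP operation code


def unmap_param_list(lba: int, num_blocks: int) -> bytes:
    """Parameter list carrying a single block descriptor."""
    # Descriptor: 8-byte LBA, 4-byte block count, 4 reserved bytes.
    descriptor = struct.pack(">QI4x", lba, num_blocks)
    # Header: UNMAP data length (bytes that follow this field),
    # block descriptor data length, 4 reserved bytes.
    header = struct.pack(">HH4x", 6 + len(descriptor), len(descriptor))
    return header + descriptor


def unmap_cdb(param_list_len: int) -> bytes:
    """10-byte UNMAP CDB with ANCHOR, group number, and control left at 0."""
    return struct.pack(">BB4xBHB", UNMAP, 0, 0, param_list_len, 0)


params = unmap_param_list(lba=1024, num_blocks=16)
cdb = unmap_cdb(len(params))
print(cdb.hex(" "))  # 42 00 00 00 00 00 00 00 18 00
```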
write same

See the man page for details: https://www.systutorials.com/docs/linux/man/8-sg_write_same/

 $ rmmod scsi_debug && modprobe scsi_debug lbprz=1 lbpu=1 lbpws=1 dev_size_mb=1024
 $ dd if=/dev/zero of=/dev/sdb bs=512 seek=1024 count=138 && sg_get_lba_status -l 1024 /dev/sdb
138+0 records in
138+0 records out
70656 bytes (71 kB) copied, 0.0187926 s, 3.8 MB/s
descriptor LBA: 0x0000000000000400  blocks: 144  mapped

 $ sg_write_same -U --in /dev/zero --num=128 --lba=1024 /dev/sdb

 $ sg_get_lba_status -l 1024 /dev/sdb
descriptor LBA: 0x0000000000000400  blocks: 128  deallocated

 $ dd if=/dev/zero of=/dev/sdb bs=512 seek=1024 count=138 && sg_get_lba_status -l 1024 /dev/sdb
138+0 records in
138+0 records out
70656 bytes (71 kB) copied, 0.0186514 s, 3.8 MB/s
descriptor LBA: 0x0000000000000400  blocks: 144  mapped

 $ sg_write_same --in /dev/zero --num=128 --lba=1024 /dev/sdb

 $ sg_get_lba_status -l 1024 /dev/sdb
descriptor LBA: 0x0000000000000400  blocks: 144  mapped

 $ cat /sys/bus/pseudo/drivers/scsi_debug/map
1152-1167
WRITE SAME test cases once the SPDK implementation is done

1. Write 512 bytes of zeros
dd if=/dev/urandom of=/dev/sdf bs=512 seek=0 count=2 && sg_get_lba_status -l 0 /dev/sdf
hexdump -C -n1024 /dev/sdf
sg_write_same --num=1 --lba=0 /dev/sdf -vvv
hexdump -C -n1024 /dev/sdf

2. unmap
# not supported
sg_write_same -U --in buf --num=1 --lba=0 /dev/sdf -vvv

3. Write a buffer smaller than 512 bytes
perl -e 'print("-" x 504, "+" x 4);' >buf
time sg_write_same -U --in buf --num=4 --lba=0 /dev/sdf -vvv

4. Write a buffer larger than 512 bytes
Unaligned
perl -e 'print("-" x 512, "+" x 4);' >buf
time sg_write_same -U --in buf --num=4 --lba=0 /dev/sdf -vvv

Aligned
perl -e 'print("-" x 510, "+" x 514);' >buf
time sg_write_same --in buf --num=4 --lba=0 /dev/sdf -vvv
hexdump -C -n1024 /dev/sdf

5. Performance test
Write 256 MB of data
time sg_write_same --num=$((256*1024*1024/512)) --lba=0 /dev/sdf -vvv


no-data-out
time sg_write_same -N --num=$((256*1024*1024/512)) --lba=0 /dev/sdf -vvv
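
The 256 MB test asks for 524288 blocks in one command. If the device reports a smaller Maximum write same length in its Block Limits page (the /dev/sdb output earlier shows 0xffff; the limit for /dev/sdf is not shown in this article), the range would have to be covered by several commands. A minimal sketch of that splitting, with a hypothetical helper name:

```python
def split_write_same(lba: int, num_blocks: int, max_blocks: int):
    """Yield (lba, count) pairs covering the range, each at most max_blocks long."""
    if max_blocks <= 0:
        raise ValueError("max_blocks must be positive")
    end = lba + num_blocks
    while lba < end:
        count = min(max_blocks, end - lba)
        yield lba, count
        lba += count


total_blocks = 256 * 1024 * 1024 // 512  # same expression as in the test above
chunks = list(split_write_same(0, total_blocks, 0xFFFF))
print(len(chunks), "commands, last one:", chunks[-1])
```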
ESXi test case

- Create a VM
- Create a volume of type "thick provision eager zeroed" with wireshark capturing; the Write Same commands issued by ESXi show up in the capture
Summary

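Implementing SCSI write same is fairly simple: just follow the spec. I will clean up the patch and post it when I have time.
With write same implemented, ESXi thick provision eager zeroed runs more than 5x faster on my test machine.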

Original article: https://zhuanlan.zhihu.com/p/44606912
