Original: https://leshared.com/148.html
This article may be updated; see the original for the latest version.
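flashcache and bcache are two of the better-known open-source projects for hybrid storage. An earlier article covered flashcache in detail; this one describes how to install and use bcache.
bcache-tools source: https://github.com/koverstreet/bcache-tools
System information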
$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 16.04.6 LTS
Release: 16.04
Codename: xenial
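Installing bcache
bcache consists of roughly two parts: a Linux kernel module and the bcache-tools userspace utilities. The kernel module is only available in Linux 3.10 and later, so to use bcache the kernel must be upgraded to at least 3.10.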
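The version requirement above can be checked before going any further. The following is a minimal sketch; the function name kernel_supports_bcache is made up for this example, and it only compares the release string reported by uname -r, it does not prove that the distribution actually built the module.

```shell
#!/bin/sh
# Hypothetical helper: succeed if a kernel release string is 3.10 or newer.
# Usage: kernel_supports_bcache "$(uname -r)"
kernel_supports_bcache() {
    release=$1
    major=${release%%.*}       # text before the first dot, e.g. "4"
    rest=${release#*.}         # text after the first dot, e.g. "4.0-142-generic"
    minor=${rest%%[!0-9]*}     # leading digits of the remainder, e.g. "4"
    [ "$major" -gt 3 ] && return 0
    [ "$major" -eq 3 ] && [ "${minor:-0}" -ge 10 ]
}

if kernel_supports_bcache "$(uname -r)"; then
    echo "kernel $(uname -r): new enough for bcache"
else
    echo "kernel $(uname -r): older than 3.10, upgrade first"
fi
```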
Install dependencies
sudo apt install libblkid-dev
sudo apt install pkg-config
Download and build bcache-tools
git clone https://evilpiepirate.org/git/bcache-tools.git
cd bcache-tools
make
Problem: the build may fail
$ make
cc -O2 -Wall -g `pkg-config --cflags uuid blkid` make-bcache.c bcache.o `pkg-config --libs uuid blkid` -o make-bcache
/tmp/cc5vHcs9.o: In function `write_sb':
/home/nvm/bcache-tools/make-bcache.c:277: undefined reference to `crc64'
collect2: error: ld returned 1 exit status
<builtin>: recipe for target 'make-bcache' failed
make: *** [make-bcache] Error 1
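A quick search shows that this is a bug in how the function is defined (see https://www.spinics.net/lists/linux-bcache/msg02847.html). In C99 and later, a plain inline definition does not emit an external symbol, so the linker cannot resolve the call when the compiler decides not to inline it.
Fix: open bcache.c in the build directory and remove inline from the function header inline uint64_t crc64(const void *_data, size_t len):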
--- a/bcache.c
+++ b/bcache.c
@@ -115,7 +115,7 @@ static const uint64_t crc_table[256] = {
0x9AFCE626CE85B507ULL
};
-inline uint64_t crc64(const void *_data, size_t len)
+uint64_t crc64(const void *_data, size_t len)
{
uint64_t crc = 0xFFFFFFFFFFFFFFFFULL;
const unsigned char *data = _data;
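The same one-word change can be applied without opening an editor. This is only a sketch: strip_inline_crc64 is a name invented here, and it assumes the definition starts at the beginning of a line exactly as in the diff above.

```shell
#!/bin/sh
# Hypothetical helper: drop the leading "inline" from the crc64 definition.
# Usage (inside the bcache-tools checkout): strip_inline_crc64 bcache.c && make
strip_inline_crc64() {
    src=$1
    tmp="$src.tmp.$$"
    # Only a line that starts with "inline uint64_t crc64(" is rewritten;
    # every other line is copied through unchanged.
    sed 's/^inline uint64_t crc64(/uint64_t crc64(/' "$src" > "$tmp" && mv "$tmp" "$src"
}
```

If the upstream source has already been fixed, the sed expression matches nothing and the file is left as it was.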
Install bcache-tools
sudo make install
Load the bcache kernel module
Linux kernels 3.10 and later ship with the bcache module, so it only needs to be loaded
$ sudo modprobe bcache
$ lsmod | grep bcache
bcache 233472 0
If loading fails, check the kernel version and try again
$ sudo modprobe bcache
modprobe: FATAL: Module bcache not found.
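A script can make the same check that lsmod does by reading /proc/modules directly, and can confirm that the bcache sysfs entry point is present. This is a sketch: module_loaded and bcache_ready are names invented for this example, and the optional path arguments exist only so the functions can be tried against sample files.

```shell
#!/bin/sh
# Hypothetical helper: succeed if a module name appears in a
# /proc/modules-style file (first column is the module name).
module_loaded() {
    awk -v name="$1" '$1 == name { found = 1 } END { exit !found }' "${2:-/proc/modules}"
}

# Hypothetical helper: succeed if the bcache register file exists.
# This also covers kernels where bcache is built in rather than a module.
bcache_ready() {
    [ -e "${1:-/sys/fs/bcache}/register" ]
}

# Usage:
#   module_loaded bcache || sudo modprobe bcache
#   bcache_ready || echo "bcache is not available on this kernel"
```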
Using bcache
Related commands
The bcache-related commands are make-bcache and bcache-super-show
$ make-bcache
Please supply a device
Usage: make-bcache [options] device
-C, --cache Format a cache device
-B, --bdev Format a backing device
-b, --bucket bucket size
-w, --block block size (hard sector size of SSD, often 2k)
-o, --data-offset data offset in sectors
--cset-uuid UUID for the cache set
--writeback enable writeback
--discard enable discards
--cache_replacement_policy=(lru|fifo)
-h, --help display this help and exit
$ bcache-super-show
Usage: bcache-super-show [-f] <device>
Check the disks
$ lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
nvme1n1 259:8 0 13.4G 0 disk
sda 8:0 0 223.6G 0 disk
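Here nvme1n1 is used as the cache device and sda as the backing device
Creating the bcache hybrid storage
Method 1: create everything in one step
This formats the backing device and the cache device and attaches them to each other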
$ sudo make-bcache -C /dev/nvme1n1 -B /dev/sda
UUID: bb156414-f6a2-49ed-a91b-90b7b55f4d10
Set UUID: 0b9c134e-cd04-41f1-8f3f-4a783c716707
version: 0
nbuckets: 27472
block_size: 1
bucket_size: 1024
nr_in_set: 1
nr_this_dev: 0
first_bucket: 1
UUID: 8682157d-e75b-441e-9f6e-0b3c00610f55
Set UUID: 0b9c134e-cd04-41f1-8f3f-4a783c716707
version: 1
block_size: 1
data_offset: 16
$ lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
...
nvme1n1 259:8 0 13.4G 0 disk
└─bcache0 252:0 0 223.6G 0 disk
sda 8:0 0 223.6G 0 disk
└─bcache0 252:0 0 223.6G 0 disk
Method 2: create step by step
- Create the backing device
$ sudo make-bcache -B /dev/sda
UUID: 8682157d-e75b-441e-9f6e-0b3c00610f55
Set UUID: 0b9c134e-cd04-41f1-8f3f-4a783c716707
version: 1
block_size: 1
data_offset: 16
$ sudo bcache-super-show /dev/sda
sb.magic ok
sb.first_sector 8 [match]
sb.csum 896121C60502BF51 [match]
sb.version 1 [backing device]
dev.label (empty)
dev.uuid 8682157d-e75b-441e-9f6e-0b3c00610f55
dev.sectors_per_block 1
dev.sectors_per_bucket 1024
dev.data.first_sector 16
dev.data.cache_mode 1 [writeback]
dev.data.cache_state 2 [dirty]
cset.uuid 0b9c134e-cd04-41f1-8f3f-4a783c716707
- Create the cache device
$ sudo make-bcache -C /dev/nvme1n1
UUID: bb156414-f6a2-49ed-a91b-90b7b55f4d10
Set UUID: 0b9c134e-cd04-41f1-8f3f-4a783c716707
version: 0
nbuckets: 102400
block_size: 1
bucket_size: 1024
nr_in_set: 1
nr_this_dev: 0
first_bucket: 1
$ sudo bcache-super-show /dev/nvme1n1
sb.magic ok
sb.first_sector 8 [match]
sb.csum 4C4CD30F3808C062 [match]
sb.version 3 [cache device]
dev.label (empty)
dev.uuid bb156414-f6a2-49ed-a91b-90b7b55f4d10
dev.sectors_per_block 1
dev.sectors_per_bucket 1024
dev.cache.first_sector 1024
dev.cache.cache_sectors 28130304
dev.cache.total_sectors 28131328
dev.cache.ordered yes
dev.cache.discard no
dev.cache.pos 0
dev.cache.replacement 0 [lru]
cset.uuid 0b9c134e-cd04-41f1-8f3f-4a783c716707
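- Attach the cache device to the backing device
The string written here is the cset.uuid of the cache set, as reported for the cache device by bcache-super-show; in the output above the backing device shows the same cset.uuid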
# echo "0b9c134e-cd04-41f1-8f3f-4a783c716707" > /sys/block/bcache0/bcache/attach
$ lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
...
nvme1n1 259:8 0 13.4G 0 disk
└─bcache0 252:0 0 223.6G 0 disk
sda 8:0 0 223.6G 0 disk
└─bcache0 252:0 0 223.6G 0 disk
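The attach step can be scripted so that the UUID is read from bcache-super-show instead of being copied by hand. This is a sketch under a few assumptions: cset_uuid_from_show and bcache_attach are names invented here, the third argument of bcache_attach exists only for trying the function on a scratch directory, and /sys/block/bcache0 is assumed to exist already. That directory only appears once the backing device has been registered; the udev rules installed with bcache-tools normally do this, and otherwise the device path can be written to /sys/fs/bcache/register.

```shell
#!/bin/sh
# Hypothetical helper: print the cset.uuid field from bcache-super-show
# output read on stdin.
cset_uuid_from_show() {
    awk '$1 == "cset.uuid" { print $2 }'
}

# Hypothetical helper: write a cache set UUID to a bcache device's attach file.
# $1 = cset.uuid, $2 = bcache device name, $3 = sysfs root (for testing only).
bcache_attach() {
    uuid=$1
    dev=$2
    root=${3:-/sys/block}
    attach="$root/$dev/bcache/attach"
    [ -n "$uuid" ] || { echo "empty cset.uuid" >&2; return 1; }
    [ -e "$attach" ] || { echo "$attach not found; is the backing device registered?" >&2; return 1; }
    printf '%s\n' "$uuid" > "$attach"
}

# Usage (as root):
#   uuid=$(bcache-super-show /dev/nvme1n1 | cset_uuid_from_show)
#   bcache_attach "$uuid" bcache0
```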
Problems encountered
- Error: the device already has a non-bcache superblock
$ sudo make-bcache -C /dev/nvme1n1 -B /dev/sda
Device /dev/nvme1n1 already has a non-bcache superblock, remove it using wipefs and wipefs -a
Fix: as the message suggests, wipe the existing signatures from the device
$ sudo wipefs -a /dev/nvme1n1
/dev/nvme1n1: 8 bytes were erased at offset 0x00000200 (gpt): 45 46 49 20 50 41 52 54
/dev/nvme1n1: 8 bytes were erased at offset 0x35a7ffe00 (gpt): 45 46 49 20 50 41 52 54
/dev/nvme1n1: 2 bytes were erased at offset 0x000001fe (PMBR): 55 aa
/dev/nvme1n1: calling ioctl to re-read partition table: Success
- Error: Device or resource busy
$ sudo make-bcache -C /dev/nvme1n1 -B /dev/sda
Can't open dev /dev/nvme1n1: Device or resource busy
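The device is still held open by something, for example a mount or an existing bcache registration. Reboot the system and try again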
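Before rebooting for a busy device, it can be worth checking whether it is simply mounted. The sketch below reads a /proc/mounts-style file; mounts_of is a name invented here, and the prefix match is deliberately loose so that partitions of the device are caught too (which means /dev/sda would also match a hypothetical /dev/sdaa).

```shell
#!/bin/sh
# Hypothetical helper: print the mount points of a device or any of its
# partitions, based on the first column of a /proc/mounts-style file.
# $1 = device path, $2 = mounts file (defaults to /proc/mounts).
mounts_of() {
    awk -v dev="$1" 'index($1, dev) == 1 { print $2 }' "${2:-/proc/mounts}"
}

# Usage:
#   mounts_of /dev/nvme1n1      # prints nothing if no part of it is mounted
```

An empty result does not prove the device is free: another kernel subsystem may still hold it, in which case the entries under /sys/block/nvme1n1/holders are the next place to look.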
Usage
bcache0 can be used like any ordinary block device
$ sudo mkfs.ext4 /dev/bcache0
$ sudo mount /dev/bcache0 /mnt/
$ df -T -h
Filesystem Type Size Used Avail Use% Mounted on
...
/dev/bcache0 ext4 220G 60M 209G 1% /mnt
$ sudo umount /mnt/
Viewing status information
- state
# cat /sys/block/bcache0/bcache/state
clean
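Possible values of state:
- no cache: this backing device has no cache device attached
- clean: everything is fine and the cache is clean
- dirty: everything is fine, writeback is enabled and the cache holds dirty data
- inconsistent: something went wrong and the backing device is out of sync with the cache device
- Amount of dirty data in the cache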
# cat /sys/block/bcache0/bcache/dirty_data
0.0k
- Writeback settings
# cat /sys/block/bcache0/bcache/writeback_
writeback_delay writeback_percent writeback_rate_debug writeback_rate_p_term_inverse writeback_running
writeback_metadata writeback_rate writeback_rate_d_term writeback_rate_update_seconds
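The individual cat commands above can be gathered into one summary. This is a sketch: describe_state and bcache_summary are names invented here, the descriptions simply restate the list of states given earlier, and only three of the writeback_* attributes from the listing are printed.

```shell
#!/bin/sh
# Hypothetical helper: turn a bcache state value into a short description.
describe_state() {
    case $1 in
        "no cache")   echo "backing device has no cache device attached" ;;
        clean)        echo "ok: cache is clean" ;;
        dirty)        echo "ok: writeback enabled, cache holds dirty data" ;;
        inconsistent) echo "problem: backing device and cache device are out of sync" ;;
        *)            echo "unknown state" ;;
    esac
}

# Hypothetical helper: print state, dirty_data and a few writeback settings.
# $1 = bcache sysfs directory (defaults to the bcache0 device used above).
bcache_summary() {
    dir=${1:-/sys/block/bcache0/bcache}
    state=$(cat "$dir/state")
    printf 'state: %s (%s)\n' "$state" "$(describe_state "$state")"
    printf 'dirty_data: %s\n' "$(cat "$dir/dirty_data")"
    for name in writeback_running writeback_percent writeback_delay; do
        if [ -r "$dir/$name" ]; then
            printf '%s: %s\n' "$name" "$(cat "$dir/$name")"
        fi
    done
}

# Usage: bcache_summary /sys/block/bcache0/bcache
```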
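Changing the cache mode
bcache supports three cache modes:
- Writeback: a write goes to the cache first and the dirty bit is set in the metadata of the corresponding block; the data is not written to the backing device immediately
- Writethrough: a write goes to both the cache and the backing device, and completes only once the backing device has finished writing
- Writearound: a write bypasses the cache and goes straight to the backing device, so the cache device serves only as a read cache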
[Figure: comparison of the writeback, writethrough and writearound modes]
- View the cache mode
The default cache mode of bcache is writethrough
$ cat /sys/block/bcache0/bcache/cache_mode
[writethrough] writeback writearound none
- Change the cache mode
The cache mode is flexible and can be changed at any time; the change must be made as root
# echo writearound > /sys/block/bcache0/bcache/cache_mode
$ cat /sys/block/bcache0/bcache/cache_mode
writethrough writeback [writearound] none
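In a script it helps to extract just the active mode (the word in square brackets) and to reject typos before writing to sysfs. This is a sketch: active_cache_mode and set_cache_mode are names invented here, and the list of accepted modes is taken from the cache_mode output shown above.

```shell
#!/bin/sh
# Hypothetical helper: print the active mode, i.e. the bracketed word,
# from cache_mode content read on stdin.
active_cache_mode() {
    tr ' ' '\n' | sed -n 's/^\[\(.*\)\]$/\1/p'
}

# Hypothetical helper: validate a mode name, then write it to cache_mode.
# $1 = mode, $2 = cache_mode file (defaults to the bcache0 device used above).
set_cache_mode() {
    mode=$1
    file=${2:-/sys/block/bcache0/bcache/cache_mode}
    case $mode in
        writethrough|writeback|writearound|none) ;;
        *) echo "unsupported mode: $mode" >&2; return 1 ;;
    esac
    printf '%s\n' "$mode" > "$file"
}

# Usage (as root):
#   set_cache_mode writeback
#   active_cache_mode < /sys/block/bcache0/bcache/cache_mode
```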
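Detaching and removing
Note: before detaching, move the data off the hybrid device so that nothing is lost!
Detaching the cache device from the backing device returns the devices to the state they were in before bcache was used
- Detach the cache device from the backing device
To remove the cache device from its current backing device, write the cset.uuid to the detach file of the bcache device
First look up the cset.uuid; it is the same for the cache device and the backing device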
$ sudo umount /test
$ sudo bcache-super-show /dev/nvme1n1
sb.magic ok
sb.first_sector 8 [match]
sb.csum 4C4CD30F3808C062 [match]
sb.version 3 [cache device]
dev.label (empty)
dev.uuid bb156414-f6a2-49ed-a91b-90b7b55f4d10
dev.sectors_per_block 1
dev.sectors_per_bucket 1024
dev.cache.first_sector 1024
dev.cache.cache_sectors 28130304
dev.cache.total_sectors 28131328
dev.cache.ordered yes
dev.cache.discard no
dev.cache.pos 0
dev.cache.replacement 0 [lru]
cset.uuid 0b9c134e-cd04-41f1-8f3f-4a783c716707
Then detach:
# echo "0b9c134e-cd04-41f1-8f3f-4a783c716707" > /sys/block/bcache0/bcache/detach
$ cat /sys/block/sda/bcache/state
no cache
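Note that before unregistering the cache device you must make sure that no backing device is still using it; otherwise data may be lost.
- Remove the cache device
Using the cache device's cset.uuid, write 1 to /sys/fs/bcache/<cset.uuid>/unregister to unregister it (the value echoed does not matter; any value works)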
echo 1 >/sys/fs/bcache/0b9c134e-cd04-41f1-8f3f-4a783c716707/unregister
Then list /sys/fs/bcache/ with ls; if the directory 0b9c134e-cd04-41f1-8f3f-4a783c716707 is no longer there, the unregistration succeeded.
ls /sys/fs/bcache/
- Remove the backing device
echo 1 > /sys/block/sda/bcache/stop
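When the teardown is scripted, each step should be confirmed before the next one runs, because a detach in writeback mode has to flush dirty data first and may take a while. The following is a sketch: wait_for_state and cset_registered are names invented here, and the retry count and one-second interval are arbitrary.

```shell
#!/bin/sh
# Hypothetical helper: poll a state file until it reads the wanted value.
# $1 = wanted value, $2 = state file, $3 = number of attempts (default 30).
wait_for_state() {
    want=$1
    file=$2
    tries=${3:-30}
    while [ "$tries" -gt 0 ]; do
        if [ "$(cat "$file" 2>/dev/null)" = "$want" ]; then
            return 0
        fi
        tries=$((tries - 1))
        if [ "$tries" -gt 0 ]; then
            sleep 1
        fi
    done
    return 1
}

# Hypothetical helper: succeed while a cache set directory still exists.
# $1 = cset.uuid, $2 = bcache sysfs root (defaults to /sys/fs/bcache).
cset_registered() {
    [ -d "${2:-/sys/fs/bcache}/$1" ]
}

# Usage (as root), following the steps above:
#   uuid=0b9c134e-cd04-41f1-8f3f-4a783c716707
#   echo "$uuid" > /sys/block/bcache0/bcache/detach
#   wait_for_state "no cache" /sys/block/bcache0/bcache/state 600 || exit 1
#   echo 1 > "/sys/fs/bcache/$uuid/unregister"
#   cset_registered "$uuid" && echo "cache set still registered"
#   echo 1 > /sys/block/sda/bcache/stop
```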
References:
http://www.yangguanjun.com/2018/03/26/lvm-sata-ssd-bcache/
https://ypdai.github.io/2018/07/13/bcache%E9%85%8D%E7%BD%AE%E4%BD%BF%E7%94%A8/