This article is purely an experimental attempt, based on an understanding of the underlying principles and made without modifying any source code.
Experiment Overview
Environment:
Two CentOS 7 cloud hosts (any specification), each running ClickHouse 20.8.3.18. Multiple readers work on the same principle as a single reader, so two machines are enough for a functional test.
Method and expected result:
Create a table with the same S3 storage policy on both clusters. Writes on cluster A generate S3 metadata files on cluster A's local disk. Sync those metadata files via rsync into the detached directory of the same table on cluster B, then run ATTACH PARTITION on cluster B to load the S3 metadata. Cluster B can then read the same data.
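The end-to-end flow can be outlined as below (a sketch only; the IP placeholder and paths are the ones used in the examples later in this article):
# On both clusters: install s3.xml under config.d and create the t_s3 table.
# 1. Write on cluster A only:
clickhouse-client --query "INSERT INTO t_s3 VALUES ('2020-08-18', 1, 'A')"
# 2. On cluster B, pull A's part directory (the S3 metadata files) into detached/:
rsync -avzP root@{ip_of_A}:/var/lib/clickhouse/disks/s3/data/default/t_s3/20200818_1_1_0 \
      /var/lib/clickhouse/disks/s3/data/default/t_s3/detached
# 3. On cluster B, attach the partition so the part becomes queryable:
clickhouse-client --query "ALTER TABLE t_s3 ATTACH PARTITION '2020-08-18'"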
Configure ClickHouse and Verify
Configure s3.xml
Place the configuration file s3.xml under /etc/clickhouse-server/config.d. Both clusters MUST be configured with the same endpoint!
<yandex>
<storage_configuration>
<disks>
<s3>
<type>s3</type>
<endpoint>http://{s3url}/{bucket}/{path}/</endpoint>
<access_key_id>{ak}</access_key_id>
<secret_access_key>{sk}</secret_access_key>
</s3>
</disks>
<policies>
<s3>
<volumes>
<main>
<disk>s3</disk>
</main>
</volumes>
</s3>
</policies>
</storage_configuration>
</yandex>
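Before relying on the config, it can help to confirm the endpoint is reachable from both hosts and to restart the server so the new storage configuration is loaded (a quick sketch; {s3url} and {bucket} are the same placeholders as in the config above):
# Expect an HTTP response (e.g. 403/404 without credentials) rather than a timeout:
curl -sv "http://{s3url}/{bucket}/" -o /dev/null
# A restart is the safe way to ensure the new storage configuration takes effect:
systemctl restart clickhouse-server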
Verify that the s3.xml configuration has taken effect; check on both clusters:
SELECT *
FROM system.storage_policies
WHERE policy_name = 's3'
Row 1:
──────
policy_name: s3
volume_name: main
volume_priority: 1
disks: ['s3']
volume_type: JBOD
max_data_part_size: 0
move_factor: 0.1
1 rows in set. Elapsed: 0.002 sec.
SELECT *
FROM system.disks
WHERE name = 's3'
Row 1:
──────
name: s3
path: /var/lib/clickhouse/disks/s3/
free_space: 18446744073709551615
total_space: 18446744073709551615
keep_free_space: 0
type: s3
1 rows in set. Elapsed: 0.002 sec.
Create a table with the S3 storage policy. Create it on each cluster separately, using MergeTree as the example:
CREATE TABLE t_s3 (
dt Date,
id Int64,
data String,
INDEX min_max (id) TYPE minmax GRANULARITY 3
) ENGINE=MergeTree()
PARTITION BY dt
ORDER BY (dt, id)
SETTINGS
storage_policy='s3',
old_parts_lifetime=0,
index_granularity=512;
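Beyond system.storage_policies, the table itself can be double-checked via the storage_policy column of system.tables (a small verification sketch):
# Expected output: t_s3    s3
clickhouse-client --query "
  SELECT name, storage_policy
  FROM system.tables
  WHERE database = 'default' AND name = 't_s3'"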
Insert data. Note: insert into cluster A only!
INSERT INTO t_s3 VALUES ('2020-08-18',1,'A'),('2020-08-18',2,'B'),('2020-08-18',3,'C'),('2020-08-18',4,'D'),('2020-08-18',5,'E'),('2020-08-18',6,'F');
Install and Verify rsync
Install rsync:
yum -y install rsync
Verify rsync connectivity:
rsync -avzP --delete root@{ip}:{remote_path} {local_path}
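Before the real copy, rsync's --dry-run mode is a safe way to confirm that SSH access and both paths are correct ({ip}, {remote_path}, {local_path} are the same placeholders as above):
# -n (--dry-run) lists what would be transferred without writing anything:
rsync -avzPn --delete root@{ip}:{remote_path} {local_path}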
Testing ClickHouse One-Write-Multiple-Read
1. Verify that cluster A can read and write normally
ck-A :) INSERT INTO t_s3 VALUES ('2020-08-18',1,'A'),('2020-08-18',2,'B'),('2020-08-18',3,'C'),('2020-08-18',4,'D'),('2020-08-18',5,'E'),('2020-08-18',6,'F');
INSERT INTO t_s3 VALUES
Ok.
6 rows in set. Elapsed: 0.517 sec.
ck-A :) select * from t_s3;
SELECT *
FROM t_s3
┌─────────dt─┬─id─┬─data─┐
│ 2020-08-18 │ 1 │ A │
│ 2020-08-18 │ 2 │ B │
│ 2020-08-18 │ 3 │ C │
│ 2020-08-18 │ 4 │ D │
│ 2020-08-18 │ 5 │ E │
│ 2020-08-18 │ 6 │ F │
└────────────┴────┴──────┘
6 rows in set. Elapsed: 0.067 sec.
2. Inspect cluster A's data files
[root@ck-A t_s3]# pwd
/var/lib/clickhouse/disks/s3/data/default/t_s3
[root@ck-A t_s3]# ll 20200818_1_1_0/
total 56
-rw-r----- 1 clickhouse clickhouse 47 Oct 29 14:56 checksums.txt
-rw-r----- 1 clickhouse clickhouse 45 Oct 29 14:56 columns.txt
-rw-r----- 1 clickhouse clickhouse 43 Oct 29 14:56 count.txt
-rw-r----- 1 clickhouse clickhouse 45 Oct 29 14:56 data.bin
-rw-r----- 1 clickhouse clickhouse 45 Oct 29 14:56 data.mrk2
-rw-r----- 1 clickhouse clickhouse 45 Oct 29 14:56 dt.bin
-rw-r----- 1 clickhouse clickhouse 45 Oct 29 14:56 dt.mrk2
-rw-r----- 1 clickhouse clickhouse 45 Oct 29 14:56 id.bin
-rw-r----- 1 clickhouse clickhouse 45 Oct 29 14:56 id.mrk2
-rw-r----- 1 clickhouse clickhouse 43 Oct 29 14:56 minmax_dt.idx
-rw-r----- 1 clickhouse clickhouse 43 Oct 29 14:56 partition.dat
-rw-r----- 1 clickhouse clickhouse 45 Oct 29 14:56 primary.idx
-rw-r----- 1 clickhouse clickhouse 45 Oct 29 14:56 skp_idx_min_max.idx
-rw-r----- 1 clickhouse clickhouse 45 Oct 29 14:56 skp_idx_min_max.mrk2
[root@ck-A t_s3]# cat 20200818_1_1_0/dt.bin
2
1 38
38 dfznsxlsuklaeokvgbvuxeesxyezaqsv
0
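The content above is the local metadata file that points at the real object on S3. Our reading of its layout (an assumption based on this output and the 20.8-era DiskS3 metadata format, not an official spec) is sketched below:
# line 1: metadata format version               -> "2"
# line 2: <number of S3 objects> <total bytes>  -> "1 38"
# line 3: <object bytes> <S3 object key>        -> "38 dfznsxlsuklaeokvgbvuxeesxyezaqsv"
# line 4: reference count                       -> "0"
# Extract the S3 object key referenced by a given column file:
awk 'NR == 3 { print $2 }' 20200818_1_1_0/dt.bin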
3. Cluster B cannot query the data yet
ck-B :) select * from t_s3;
SELECT *
FROM t_s3
Ok.
0 rows in set. Elapsed: 0.001 sec.
4. Verify that the corresponding path on cluster B contains no data files
[root@ck-B t_s3]# pwd
/var/lib/clickhouse/disks/s3/data/default/t_s3
[root@ck-B t_s3]# ll
total 4
drwxr-x--- 2 clickhouse clickhouse 6 Oct 29 14:53 detached
-rw-r----- 1 clickhouse clickhouse 43 Oct 29 14:53 format_version.txt
[root@ck-B t_s3]# ll detached/
total 0
5. On cluster B's machine, run rsync to copy the S3 metadata files
[root@ck-B t_s3]# rsync -avzP --delete root@114.67.249.95:/var/lib/clickhouse/disks/s3/data/default/t_s3/20200818_1_1_0 /var/lib/clickhouse/disks/s3/data/default/t_s3/detached
receiving incremental file list
20200818_1_1_0/
20200818_1_1_0/checksums.txt
47 100% 45.90kB/s 0:00:00 (xfr#1, to-chk=13/15)
20200818_1_1_0/columns.txt
45 100% 43.95kB/s 0:00:00 (xfr#2, to-chk=12/15)
20200818_1_1_0/count.txt
43 100% 41.99kB/s 0:00:00 (xfr#3, to-chk=11/15)
20200818_1_1_0/data.bin
45 100% 43.95kB/s 0:00:00 (xfr#4, to-chk=10/15)
20200818_1_1_0/data.mrk2
45 100% 43.95kB/s 0:00:00 (xfr#5, to-chk=9/15)
20200818_1_1_0/dt.bin
45 100% 43.95kB/s 0:00:00 (xfr#6, to-chk=8/15)
20200818_1_1_0/dt.mrk2
45 100% 43.95kB/s 0:00:00 (xfr#7, to-chk=7/15)
20200818_1_1_0/id.bin
45 100% 43.95kB/s 0:00:00 (xfr#8, to-chk=6/15)
20200818_1_1_0/id.mrk2
45 100% 43.95kB/s 0:00:00 (xfr#9, to-chk=5/15)
20200818_1_1_0/minmax_dt.idx
43 100% 41.99kB/s 0:00:00 (xfr#10, to-chk=4/15)
20200818_1_1_0/partition.dat
43 100% 41.99kB/s 0:00:00 (xfr#11, to-chk=3/15)
20200818_1_1_0/primary.idx
45 100% 43.95kB/s 0:00:00 (xfr#12, to-chk=2/15)
20200818_1_1_0/skp_idx_min_max.idx
45 100% 43.95kB/s 0:00:00 (xfr#13, to-chk=1/15)
20200818_1_1_0/skp_idx_min_max.mrk2
45 100% 43.95kB/s 0:00:00 (xfr#14, to-chk=0/15)
sent 294 bytes received 1,572 bytes 1,244.00 bytes/sec
total size is 626 speedup is 0.34
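Step 5 copies a single part. To pull every part of the table at once, the whole table directory can be synced into detached/, skipping the entries that are local to B (a sketch using the same paths; concurrent merges on A during the sync are not handled):
rsync -avzP --exclude 'detached' --exclude 'format_version.txt' \
      root@114.67.249.95:/var/lib/clickhouse/disks/s3/data/default/t_s3/ \
      /var/lib/clickhouse/disks/s3/data/default/t_s3/detached/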
6. Verify the data in cluster B's detached directory
[root@ck-B t_s3]# ll detached/20200818_1_1_0/
total 56
-rw-r----- 1 clickhouse clickhouse 47 Oct 29 14:56 checksums.txt
-rw-r----- 1 clickhouse clickhouse 45 Oct 29 14:56 columns.txt
-rw-r----- 1 clickhouse clickhouse 43 Oct 29 14:56 count.txt
-rw-r----- 1 clickhouse clickhouse 45 Oct 29 14:56 data.bin
-rw-r----- 1 clickhouse clickhouse 45 Oct 29 14:56 data.mrk2
-rw-r----- 1 clickhouse clickhouse 45 Oct 29 14:56 dt.bin
-rw-r----- 1 clickhouse clickhouse 45 Oct 29 14:56 dt.mrk2
-rw-r----- 1 clickhouse clickhouse 45 Oct 29 14:56 id.bin
-rw-r----- 1 clickhouse clickhouse 45 Oct 29 14:56 id.mrk2
-rw-r----- 1 clickhouse clickhouse 43 Oct 29 14:56 minmax_dt.idx
-rw-r----- 1 clickhouse clickhouse 43 Oct 29 14:56 partition.dat
-rw-r----- 1 clickhouse clickhouse 45 Oct 29 14:56 primary.idx
-rw-r----- 1 clickhouse clickhouse 45 Oct 29 14:56 skp_idx_min_max.idx
-rw-r----- 1 clickhouse clickhouse 45 Oct 29 14:56 skp_idx_min_max.mrk2
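If the copy is performed by a user other than clickhouse, ATTACH may later fail on permissions; restoring ownership is a cheap precaution (a sketch):
chown -R clickhouse:clickhouse /var/lib/clickhouse/disks/s3/data/default/t_s3/detached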
7. Log in to cluster B, execute ATTACH, and verify the data
ck-B :) ALTER TABLE t_s3 ATTACH PARTITION '2020-08-18';
ALTER TABLE t_s3
ATTACH PARTITION '2020-08-18'
Ok.
0 rows in set. Elapsed: 0.001 sec.
ck-B :) select * from t_s3;
SELECT *
FROM t_s3
┌─────────dt─┬─id─┬─data─┐
│ 2020-08-18 │ 1 │ A │
│ 2020-08-18 │ 2 │ B │
│ 2020-08-18 │ 3 │ C │
│ 2020-08-18 │ 4 │ D │
│ 2020-08-18 │ 5 │ E │
│ 2020-08-18 │ 6 │ F │
└────────────┴────┴──────┘
6 rows in set. Elapsed: 0.382 sec.
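The result can also be confirmed from system.parts; note that ATTACH typically assigns the part a new block number on cluster B, so the name may differ from 20200818_1_1_0 (a verification sketch):
clickhouse-client --query "
  SELECT partition, name, active, disk_name
  FROM system.parts
  WHERE table = 't_s3' AND active"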
Summary
Conclusion
With the storage policies ClickHouse provides, multiple ClickHouse clusters can achieve one-write-multiple-read on top of S3.
Possible improvements
S3 metadata has to be synced between clusters, and after the sync the read cluster must execute ATTACH before the data becomes queryable (see the automation sketch at the end of this section).
Even with S3, ClickHouse still generates disk-like local files; it cannot put everything directly on S3. These local files only store references to the S3 objects, whose names on S3 are opaque strings, for example:
[root@ck-A t_s3]# cat 20200818_1_1_0/dt.bin
2
1 38
38 dfznsxlsuklaeokvgbvuxeesxyezaqsv
0
This is why this article uses rsync to sync that metadata, followed by ATTACH, before other clusters can query the data.
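A rough sketch of automating the sync-and-attach loop on a read cluster follows (a hypothetical helper script; WRITER, the paths, and the error handling are assumptions to adapt, not a hardened tool):
#!/bin/bash
# Hypothetical helper for a read cluster: pull new S3 metadata from the
# writer, then attach each detached part. Adapt WRITER/TABLE_DIR as needed.
WRITER=root@114.67.249.95
TABLE_DIR=/var/lib/clickhouse/disks/s3/data/default/t_s3

# 1. Pull all part directories (they contain only S3 metadata) into detached/.
rsync -avzP --exclude 'detached' --exclude 'format_version.txt' \
      "$WRITER:$TABLE_DIR/" "$TABLE_DIR/detached/"

# 2. Attach every detached part by name; parts that were already attached
#    will make clickhouse-client return an error, which is ignored here.
for part in "$TABLE_DIR"/detached/*_*_*_*; do
    clickhouse-client --query \
        "ALTER TABLE t_s3 ATTACH PART '$(basename "$part")'" || true
done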