1. Background
When we installed the ClickHouse cluster earlier, each configured data path was a single path. In this section we configure storage paths across multiple disks, which spreads the load and improves write and read throughput.
[root@clickhouse1 ~]#
[root@clickhouse1 ~]# lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sda 8:0 0 100G 0 disk
├─sda1 8:1 0 1G 0 part /boot
└─sda2 8:2 0 99G 0 part
├─centos_centos-root 253:0 0 50G 0 lvm /
├─centos_centos-swap 253:1 0 2G 0 lvm [SWAP]
└─centos_centos-home 253:2 0 47G 0 lvm /home
sr0 11:0 1 973M 0 rom
[root@clickhouse1 ~]#
As the output above shows, only one disk is mounted in our test environment, so we simulate multiple disks with multiple paths on a single disk.
2. Hands-on practice
2.1 Modify config.xml
Add the following configuration on all four servers:
<storage_configuration>
<disks>
<disk_hot1> <!-- custom disk name -->
<path>/root/clickhouse_storage/hot1/</path>
</disk_hot1>
<disk_hot2>
<path>/root/clickhouse_storage/hot2/</path>
</disk_hot2>
<disk_cold>
<path>/root/clickhouse_storage/cold/</path>
<keep_free_space_bytes>1073741824</keep_free_space_bytes>
</disk_cold>
</disks>
<policies>
<jbod_police> <!-- custom policy name -->
<volumes>
<jbod> <!-- custom volume (disk group) name -->
<disk>disk_hot1</disk>
<disk>disk_hot2</disk>
</jbod>
</volumes>
</jbod_police>
<hot_cold_police>
<volumes>
<hot>
<disk>disk_hot1</disk>
<disk>disk_hot2</disk>
<max_data_part_size_bytes>1048576</max_data_part_size_bytes>
</hot>
<cold>
<disk>disk_cold</disk>
</cold>
</volumes>
<move_factor>0.2</move_factor>
</hot_cold_police>
</policies>
</storage_configuration>
Notes:
- keep_free_space_bytes: optional; the amount of space on the disk to keep free (ClickHouse will not use it).
- The jbod policy defines a single volume; each new part (e.g. 202107_0_0_0) is written to the volume's disks in round-robin order. This suits servers with several disks mounted but no RAID.
- The hot/cold policy defines two volumes, hot and cold. A part whose size does not exceed max_data_part_size_bytes (optional) is stored on the hot volume; a larger part is stored on the cold volume. move_factor (optional, default 0.1) is the free-space ratio: when the share of free space on a volume drops below it, parts are moved to the next volume. This suits servers with both SSD and HDD disks mounted.
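The hot/cold placement rule can be sketched in Python. This is an illustration of the rule described above, not ClickHouse's actual implementation; the constants mirror the values in config.xml:

```python
# Sketch of hot/cold volume selection, mirroring the policy above.
# Illustration only -- not ClickHouse's real code.

GiB = 1024 ** 3
MiB = 1024 ** 2

# keep_free_space_bytes on disk_cold: 1 GiB
KEEP_FREE_SPACE_BYTES = 1 * GiB
assert KEEP_FREE_SPACE_BYTES == 1073741824  # the value used in config.xml

# max_data_part_size_bytes on the hot volume: 1 MiB
MAX_DATA_PART_SIZE_BYTES = 1 * MiB
assert MAX_DATA_PART_SIZE_BYTES == 1048576  # the value used in config.xml

def choose_volume(part_size_bytes: int) -> str:
    """A part no larger than max_data_part_size_bytes lands on the hot
    volume; anything larger falls through to the cold volume."""
    if part_size_bytes <= MAX_DATA_PART_SIZE_BYTES:
        return "hot"
    return "cold"

print(choose_volume(540 * 1024))       # 540 KiB part -> hot
print(choose_volume(int(1.01 * MiB)))  # 1.01 MiB part -> cold
```

This is exactly the behavior we verify later: small inserted parts stay on the hot disks, while a merged part over 1 MiB ends up on disk_cold.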
2.2 Restart the clickhouse server
Create the paths on all four servers:
[root@clickhouse1 ~]# mkdir -p /root/clickhouse_storage/hot1
[root@clickhouse1 ~]# mkdir /root/clickhouse_storage/hot2
[root@clickhouse1 ~]# mkdir /root/clickhouse_storage/cold
Restart the clickhouse server on all four servers:
[root@clickhouse1 ~]#
[root@clickhouse1 ~]# clickhouse stop
Found pid = 2660 in the list of running processes.
The process with pid = 2660 is running.
Sent terminate signal to process with pid 2660.
Waiting for server to stop
Found pid = 2660 in the list of running processes.
The process with pid = 2660 is running.
Waiting for server to stop
Found pid = 2660 in the list of running processes.
The process with pid = 2660 is running.
Waiting for server to stop
Found pid = 2660 in the list of running processes.
The process with pid = 2660 is running.
Waiting for server to stop
Now there is no clickhouse-server process.
Server stopped
[root@clickhouse1 ~]#
[root@clickhouse1 ~]# clickhouse-server --config=/etc/clickhouse-server/config.xml --daemon start
[root@clickhouse1 ~]#
2.3 Verify the configuration
clickhouse1 :)
clickhouse1 :) select name, path, formatReadableSize(free_space) as free, formatReadableSize(total_space) as total, formatReadableSize(keep_free_space) as reserved from system.disks;
SELECT
name,
path,
formatReadableSize(free_space) AS free,
formatReadableSize(total_space) AS total,
formatReadableSize(keep_free_space) AS reserved
FROM system.disks
Query id: 3bcffa44-6427-450e-8b0a-de067b6d7706
┌─name──────┬─path───────────────────────────┬─free──────┬─total─────┬─reserved─┐
│ default │ /root/clickhouse/ │ 43.59 GiB │ 49.98 GiB │ 0.00 B │
│ disk_cold │ /root/clickhouse_storage/cold/ │ 42.59 GiB │ 48.98 GiB │ 1.00 GiB │
│ disk_hot1 │ /root/clickhouse_storage/hot1/ │ 43.59 GiB │ 49.98 GiB │ 0.00 B │
│ disk_hot2 │ /root/clickhouse_storage/hot2/ │ 43.59 GiB │ 49.98 GiB │ 0.00 B │
└───────────┴────────────────────────────────┴───────────┴───────────┴──────────┘
4 rows in set. Elapsed: 0.004 sec.
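The reserved column on disk_cold comes from keep_free_space_bytes: ClickHouse subtracts the reserved bytes from the free and total space it reports. A quick sanity check of the numbers above (sizes as printed by formatReadableSize, in GiB):

```python
# Sanity-check the system.disks output above: disk_cold reports exactly
# 1 GiB less free and total space than the other disks on the same
# filesystem, because keep_free_space_bytes = 1 GiB is subtracted.
default_free, default_total = 43.59, 49.98  # GiB, disk "default"
cold_free, cold_total = 42.59, 48.98        # GiB, disk "disk_cold"
reserved = 1.00                             # GiB, reserved on disk_cold

assert round(default_free - cold_free, 2) == reserved
assert round(default_total - cold_total, 2) == reserved
```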
clickhouse1 :)
clickhouse1 :) select policy_name, volume_name, volume_priority, disks, formatReadableSize(max_data_part_size) max_data_part_size, move_factor from system.storage_policies;
SELECT
policy_name,
volume_name,
volume_priority,
disks,
formatReadableSize(max_data_part_size) AS max_data_part_size,
move_factor
FROM system.storage_policies
Query id: aa9e26e7-0580-44ae-bfb3-d9855414c44e
┌─policy_name─────┬─volume_name─┬─volume_priority─┬─disks─────────────────────┬─max_data_part_size─┬─move_factor─┐
│ default │ default │ 1 │ ['default'] │ 0.00 B │ 0 │
│ hot_cold_police │ hot │ 1 │ ['disk_hot1','disk_hot2'] │ 1.00 MiB │ 0.2 │
│ hot_cold_police │ cold │ 2 │ ['disk_cold'] │ 0.00 B │ 0.2 │
│ jbod_police │ jbod │ 1 │ ['disk_hot1','disk_hot2'] │ 0.00 B │ 0 │
└─────────────────┴─────────────┴─────────────────┴───────────────────────────┴────────────────────┴─────────────┘
4 rows in set. Elapsed: 0.014 sec.
clickhouse1 :)
2.4 Using the jbod policy
Create the tables:
clickhouse1 :)
clickhouse1 :) create table jbod_table_local on cluster sharding_ha(
:-] id UInt64
:-] ) engine = ReplicatedMergeTree('/clickhouse/tables/jbod_table/{shard}', '{replica}')
:-] order by id
:-] settings storage_policy = 'jbod_police';
CREATE TABLE jbod_table_local ON CLUSTER sharding_ha
(
`id` UInt64
)
ENGINE = ReplicatedMergeTree('/clickhouse/tables/jbod_table/{shard}', '{replica}')
ORDER BY id
SETTINGS storage_policy = 'jbod_police'
Query id: 9ad62172-2dee-49c3-ba46-bff047c9635e
┌─host────────┬─port─┬─status─┬─error─┬─num_hosts_remaining─┬─num_hosts_active─┐
│ clickhouse2 │ 9000 │ 0 │ │ 3 │ 3 │
└─────────────┴──────┴────────┴───────┴─────────────────────┴──────────────────┘
┌─host────────┬─port─┬─status─┬─error─┬─num_hosts_remaining─┬─num_hosts_active─┐
│ clickhouse3 │ 9000 │ 0 │ │ 2 │ 1 │
│ clickhouse1 │ 9000 │ 0 │ │ 1 │ 1 │
└─────────────┴──────┴────────┴───────┴─────────────────────┴──────────────────┘
┌─host────────┬─port─┬─status─┬─error─┬─num_hosts_remaining─┬─num_hosts_active─┐
│ clickhouse4 │ 9000 │ 0 │ │ 0 │ 0 │
└─────────────┴──────┴────────┴───────┴─────────────────────┴──────────────────┘
4 rows in set. Elapsed: 0.451 sec.
clickhouse1 :)
clickhouse1 :) create table jbod_table_all on cluster sharding_ha(
:-] id UInt64
:-] ) engine = Distributed(sharding_ha, default, jbod_table_local, rand());
CREATE TABLE jbod_table_all ON CLUSTER sharding_ha
(
`id` UInt64
)
ENGINE = Distributed(sharding_ha, default, jbod_table_local, rand())
Query id: 46efcdf3-466c-4c6e-a694-cda568ca68f1
┌─host────────┬─port─┬─status─┬─error─┬─num_hosts_remaining─┬─num_hosts_active─┐
│ clickhouse2 │ 9000 │ 0 │ │ 3 │ 3 │
└─────────────┴──────┴────────┴───────┴─────────────────────┴──────────────────┘
┌─host────────┬─port─┬─status─┬─error─┬─num_hosts_remaining─┬─num_hosts_active─┐
│ clickhouse3 │ 9000 │ 0 │ │ 2 │ 0 │
│ clickhouse1 │ 9000 │ 0 │ │ 1 │ 0 │
│ clickhouse4 │ 9000 │ 0 │ │ 0 │ 0 │
└─────────────┴──────┴────────┴───────┴─────────────────────┴──────────────────┘
4 rows in set. Elapsed: 0.081 sec.
clickhouse1 :)
First insert:
clickhouse1 :)
clickhouse1 :) insert into jbod_table_all select rand() from numbers(20);
INSERT INTO jbod_table_all SELECT rand()
FROM numbers(20)
Query id: 3f933193-0edd-4d64-8129-ed9a68b016db
Ok.
0 rows in set. Elapsed: 0.023 sec.
clickhouse1 :)
clickhouse1 :) select name, disk_name, active from system.parts where table = 'jbod_table_local';
SELECT
name,
disk_name,
active
FROM system.parts
WHERE table = 'jbod_table_local'
Query id: d0a90599-1c40-4055-b088-55b06d5b1676
┌─name──────┬─disk_name─┬─active─┐
│ all_0_0_0 │ disk_hot1 │ 1 │
└───────────┴───────────┴────────┘
1 rows in set. Elapsed: 0.004 sec.
clickhouse1 :)
Second insert:
clickhouse1 :)
clickhouse1 :) insert into jbod_table_all select rand() from numbers(20);
INSERT INTO jbod_table_all SELECT rand()
FROM numbers(20)
Query id: 55875762-df39-4e55-b2f3-adf4550e2741
Ok.
0 rows in set. Elapsed: 0.027 sec.
clickhouse1 :)
clickhouse1 :) select name, disk_name, active from system.parts where table = 'jbod_table_local';
SELECT
name,
disk_name,
active
FROM system.parts
WHERE table = 'jbod_table_local'
Query id: dd2b6285-b082-4984-8903-e074d002794a
┌─name──────┬─disk_name─┬─active─┐
│ all_0_0_0 │ disk_hot1 │ 1 │
│ all_1_1_0 │ disk_hot2 │ 1 │
└───────────┴───────────┴────────┘
2 rows in set. Elapsed: 0.005 sec.
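The two inserts show the jbod policy's behavior: the first part landed on disk_hot1, the second on disk_hot2. New parts rotate across the volume's disks, which can be sketched as follows (an illustration of the round-robin rule, not ClickHouse's implementation):

```python
from itertools import cycle

# Round-robin disk selection within a JBOD volume: each new part is
# written to the next disk in the volume's disk list, as configured in
# the jbod_police policy above.
jbod_disks = cycle(["disk_hot1", "disk_hot2"])

parts = [f"all_{i}_{i}_0" for i in range(4)]
placement = {part: next(jbod_disks) for part in parts}
print(placement)
# all_0_0_0 and all_2_2_0 go to disk_hot1; all_1_1_0 and all_3_3_0 to disk_hot2
```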
clickhouse1 :)
Force a merge:
clickhouse1 :)
clickhouse1 :) optimize table jbod_table_local on cluster sharding_ha final;
OPTIMIZE TABLE jbod_table_local ON CLUSTER sharding_ha FINAL
Query id: a58f3b6e-6d46-4a99-bfe4-bf5405300073
┌─host────────┬─port─┬─status─┬─error─┬─num_hosts_remaining─┬─num_hosts_active─┐
│ clickhouse2 │ 9000 │ 0 │ │ 3 │ 1 │
│ clickhouse3 │ 9000 │ 0 │ │ 2 │ 1 │
│ clickhouse4 │ 9000 │ 0 │ │ 1 │ 1 │
└─────────────┴──────┴────────┴───────┴─────────────────────┴──────────────────┘
┌─host────────┬─port─┬─status─┬─error─┬─num_hosts_remaining─┬─num_hosts_active─┐
│ clickhouse1 │ 9000 │ 0 │ │ 0 │ 0 │
└─────────────┴──────┴────────┴───────┴─────────────────────┴──────────────────┘
4 rows in set. Elapsed: 0.203 sec.
clickhouse1 :)
clickhouse1 :) select name, disk_name, active from system.parts where table = 'jbod_table_local';
SELECT
name,
disk_name,
active
FROM system.parts
WHERE table = 'jbod_table_local'
Query id: ee17f981-d309-4f44-b7e2-b9220cda45dc
┌─name──────┬─disk_name─┬─active─┐
│ all_0_0_0 │ disk_hot1 │ 0 │
│ all_0_1_1 │ disk_hot1 │ 1 │
│ all_1_1_0 │ disk_hot2 │ 0 │
└───────────┴───────────┴────────┘
3 rows in set. Elapsed: 0.005 sec.
clickhouse1 :)
2.5 Using the hot/cold policy
Create the tables:
clickhouse1 :)
clickhouse1 :) create table hot_cold_table_local on cluster sharding_ha(
:-] id UInt64
:-] ) engine = ReplicatedMergeTree('/clickhouse/tables/hot_cold_table/{shard}', '{replica}')
:-] order by id
:-] settings storage_policy = 'hot_cold_police';
CREATE TABLE hot_cold_table_local ON CLUSTER sharding_ha
(
`id` UInt64
)
ENGINE = ReplicatedMergeTree('/clickhouse/tables/hot_cold_table/{shard}', '{replica}')
ORDER BY id
SETTINGS storage_policy = 'hot_cold_police'
Query id: 8331bece-656a-4dba-b73a-834103d507c9
┌─host────────┬─port─┬─status─┬─error─┬─num_hosts_remaining─┬─num_hosts_active─┐
│ clickhouse2 │ 9000 │ 0 │ │ 3 │ 2 │
│ clickhouse4 │ 9000 │ 0 │ │ 2 │ 2 │
└─────────────┴──────┴────────┴───────┴─────────────────────┴──────────────────┘
┌─host────────┬─port─┬─status─┬─error─┬─num_hosts_remaining─┬─num_hosts_active─┐
│ clickhouse3 │ 9000 │ 0 │ │ 1 │ 0 │
│ clickhouse1 │ 9000 │ 0 │ │ 0 │ 0 │
└─────────────┴──────┴────────┴───────┴─────────────────────┴──────────────────┘
4 rows in set. Elapsed: 0.233 sec.
clickhouse1 :)
clickhouse1 :) create table hot_cold_table_all on cluster sharding_ha(
:-] id UInt64
:-] ) engine = Distributed(sharding_ha, default, hot_cold_table_local, rand());
CREATE TABLE hot_cold_table_all ON CLUSTER sharding_ha
(
`id` UInt64
)
ENGINE = Distributed(sharding_ha, default, hot_cold_table_local, rand())
Query id: 03bf7af8-261e-4dad-b0c3-b267dc8738b2
┌─host────────┬─port─┬─status─┬─error─┬─num_hosts_remaining─┬─num_hosts_active─┐
│ clickhouse3 │ 9000 │ 0 │ │ 3 │ 2 │
└─────────────┴──────┴────────┴───────┴─────────────────────┴──────────────────┘
┌─host────────┬─port─┬─status─┬─error─┬─num_hosts_remaining─┬─num_hosts_active─┐
│ clickhouse2 │ 9000 │ 0 │ │ 2 │ 0 │
│ clickhouse1 │ 9000 │ 0 │ │ 1 │ 0 │
│ clickhouse4 │ 9000 │ 0 │ │ 0 │ 0 │
└─────────────┴──────┴────────┴───────┴─────────────────────┴──────────────────┘
4 rows in set. Elapsed: 0.090 sec.
clickhouse1 :)
First insert:
clickhouse1 :)
clickhouse1 :) insert into hot_cold_table_all select rand() from numbers(200000);
INSERT INTO hot_cold_table_all SELECT rand()
FROM numbers(200000)
Query id: 9bafe2c8-f937-46d4-a29d-86ff7f35f9e6
Ok.
0 rows in set. Elapsed: 0.073 sec. Processed 200.00 thousand rows, 1.60 MB (2.76 million rows/s., 22.04 MB/s.)
clickhouse1 :)
clickhouse1 :) select name, disk_name, formatReadableSize(bytes_on_disk) size, active from system.parts where table = 'hot_cold_table_local';
SELECT
name,
disk_name,
formatReadableSize(bytes_on_disk) AS size,
active
FROM system.parts
WHERE table = 'hot_cold_table_local'
Query id: f7924c10-31fe-4d18-9736-4e095ae68c32
┌─name──────┬─disk_name─┬─size───────┬─active─┐
│ all_0_0_0 │ disk_hot1 │ 540.15 KiB │ 1 │
└───────────┴───────────┴────────────┴────────┘
1 rows in set. Elapsed: 0.002 sec.
clickhouse1 :)
Second insert:
clickhouse1 :)
clickhouse1 :) insert into hot_cold_table_all select rand() from numbers(200000);
INSERT INTO hot_cold_table_all SELECT rand()
FROM numbers(200000)
Query id: 84846376-3d19-448e-8f8a-242f2e2b999d
Ok.
0 rows in set. Elapsed: 0.057 sec. Processed 200.00 thousand rows, 1.60 MB (3.48 million rows/s., 27.84 MB/s.)
clickhouse1 :)
clickhouse1 :) select name, disk_name, formatReadableSize(bytes_on_disk) size, active from system.parts where table = 'hot_cold_table_local';
SELECT
name,
disk_name,
formatReadableSize(bytes_on_disk) AS size,
active
FROM system.parts
WHERE table = 'hot_cold_table_local'
Query id: 6c8081f1-5fd0-4ba3-a997-6f19db554e45
┌─name──────┬─disk_name─┬─size───────┬─active─┐
│ all_0_0_0 │ disk_hot1 │ 540.15 KiB │ 1 │
│ all_1_1_0 │ disk_hot2 │ 539.10 KiB │ 1 │
└───────────┴───────────┴────────────┴────────┘
2 rows in set. Elapsed: 0.003 sec.
clickhouse1 :)
Force a merge:
clickhouse1 :)
clickhouse1 :) optimize table hot_cold_table_local on cluster sharding_ha final;
OPTIMIZE TABLE hot_cold_table_local ON CLUSTER sharding_ha FINAL
Query id: ec7435f0-7ef9-4cf0-b944-9d0b5f4973d8
┌─host────────┬─port─┬─status─┬─error─┬─num_hosts_remaining─┬─num_hosts_active─┐
│ clickhouse2 │ 9000 │ 0 │ │ 3 │ 0 │
│ clickhouse3 │ 9000 │ 0 │ │ 2 │ 0 │
│ clickhouse1 │ 9000 │ 0 │ │ 1 │ 0 │
│ clickhouse4 │ 9000 │ 0 │ │ 0 │ 0 │
└─────────────┴──────┴────────┴───────┴─────────────────────┴──────────────────┘
4 rows in set. Elapsed: 0.294 sec.
clickhouse1 :)
clickhouse1 :) select name, disk_name, formatReadableSize(bytes_on_disk) size, active from system.parts where table = 'hot_cold_table_local';
SELECT
name,
disk_name,
formatReadableSize(bytes_on_disk) AS size,
active
FROM system.parts
WHERE table = 'hot_cold_table_local'
Query id: 33d7721b-d953-4325-9742-400dcf5a7c95
┌─name──────┬─disk_name─┬─size───────┬─active─┐
│ all_0_0_0 │ disk_hot1 │ 540.15 KiB │ 0 │
│ all_0_1_1 │ disk_cold │ 1.01 MiB │ 1 │
│ all_1_1_0 │ disk_hot2 │ 539.10 KiB │ 0 │
└───────────┴───────────┴────────────┴────────┘
3 rows in set. Elapsed: 0.013 sec.
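The merge result illustrates the hot/cold rule: each original part (~540 KiB) fit under the hot volume's max_data_part_size_bytes of 1 MiB, but the merged part (1.01 MiB) exceeds it, so it was written to the cold volume. Checking the arithmetic against the sizes reported above:

```python
# Why the merged part moved to disk_cold: it crossed the hot volume's
# max_data_part_size_bytes threshold from config.xml.
MiB = 1024 ** 2
KiB = 1024

max_data_part_size = 1 * MiB   # hot volume limit (1048576 bytes)
part1 = 540.15 * KiB           # all_0_0_0, written to disk_hot1
part2 = 539.10 * KiB           # all_1_1_0, written to disk_hot2
merged = 1.01 * MiB            # all_0_1_1, produced by OPTIMIZE ... FINAL

assert part1 <= max_data_part_size  # original parts fit on the hot volume
assert part2 <= max_data_part_size
assert merged > max_data_part_size  # merged part spills to the cold volume
```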
clickhouse1 :)
2.6 Moving parts
clickhouse1 :)
clickhouse1 :) alter table hot_cold_table_local on cluster sharding_ha move part 'all_0_1_1' to disk 'disk_hot2';
ALTER TABLE hot_cold_table_local ON CLUSTER sharding_ha
MOVE PART 'all_0_1_1' TO DISK 'disk_hot2'
Query id: 8a367d9e-2e9c-4dc6-acc3-55f6ff363de4
┌─host────────┬─port─┬─status─┬─error─┬─num_hosts_remaining─┬─num_hosts_active─┐
│ clickhouse2 │ 9000 │ 0 │ │ 3 │ 0 │
│ clickhouse3 │ 9000 │ 0 │ │ 2 │ 0 │
│ clickhouse1 │ 9000 │ 0 │ │ 1 │ 0 │
│ clickhouse4 │ 9000 │ 0 │ │ 0 │ 0 │
└─────────────┴──────┴────────┴───────┴─────────────────────┴──────────────────┘
4 rows in set. Elapsed: 0.150 sec.
clickhouse1 :)
clickhouse1 :) select name, disk_name, formatReadableSize(bytes_on_disk) size, active from system.parts where table = 'hot_cold_table_local';
SELECT
name,
disk_name,
formatReadableSize(bytes_on_disk) AS size,
active
FROM system.parts
WHERE table = 'hot_cold_table_local'
Query id: 31754c1a-8671-4af9-ba18-44fac1263161
┌─name──────┬─disk_name─┬─size───────┬─active─┐
│ all_0_0_0 │ disk_hot1 │ 540.15 KiB │ 0 │
│ all_0_1_1 │ disk_cold │ 1.01 MiB │ 1 │
│ all_1_1_0 │ disk_hot2 │ 539.10 KiB │ 0 │
└───────────┴───────────┴────────────┴────────┘
3 rows in set. Elapsed: 0.003 sec.
clickhouse1 :)
The output above shows that part 'all_0_1_1' was not moved to disk 'disk_hot2'. The ON CLUSTER operation only affected the shards, not the replicas, so the move has to be run locally on each node. (Note also that, at 1.01 MiB, this part exceeds the hot volume's max_data_part_size_bytes of 1 MiB, which by itself can prevent moving it onto a hot-volume disk.)
clickhouse1 :)
clickhouse1 :) alter table hot_cold_table_local on cluster sharding_ha move part 'all_0_1_1' to volume 'cold';
ALTER TABLE hot_cold_table_local ON CLUSTER sharding_ha
MOVE PART 'all_0_1_1' TO VOLUME 'cold'
Query id: b9a0384e-990b-4173-afaf-ab1a9ee8f481
┌─host────────┬─port─┬─status─┬─error─┬─num_hosts_remaining─┬─num_hosts_active─┐
│ clickhouse2 │ 9000 │ 0 │ │ 3 │ 2 │
│ clickhouse1 │ 9000 │ 0 │ │ 2 │ 2 │
└─────────────┴──────┴────────┴───────┴─────────────────────┴──────────────────┘
┌─host────────┬─port─┬─status─┬─error─┬─num_hosts_remaining─┬─num_hosts_active─┐
│ clickhouse3 │ 9000 │ 0 │ │ 1 │ 0 │
│ clickhouse4 │ 9000 │ 0 │ │ 0 │ 0 │
└─────────────┴──────┴────────┴───────┴─────────────────────┴──────────────────┘
4 rows in set. Elapsed: 0.237 sec.
clickhouse1 :)
clickhouse1 :) select name, disk_name, formatReadableSize(bytes_on_disk) size, active from system.parts where table = 'hot_cold_table_local';
SELECT
name,
disk_name,
formatReadableSize(bytes_on_disk) AS size,
active
FROM system.parts
WHERE table = 'hot_cold_table_local'
Query id: 985623b6-55f0-49cd-83cb-8cb94a658196
┌─name──────┬─disk_name─┬─size─────┬─active─┐
│ all_0_1_1 │ disk_cold │ 1.01 MiB │ 1 │
└───────────┴───────────┴──────────┴────────┘
1 rows in set. Elapsed: 0.004 sec.
clickhouse1 :)
ON CLUSTER operations only affect the shards, not the replicas; if a part was not moved, run the operation locally on each node.