背景
Clickhouse的MergeTree只支持一种排序规则,如建表时定义Order BY A,B,C ,查询where A很快,查询where C无法命中索引。
由于业务规则变更,原来宽表的排序键是issueId,产品修改了问题Id的计算规则,引入了一个新字段bugIssueId,为了查询效率,应该把排序键换成bugIssueId,但是Clickhouse并不支持更换排序键。于是我查了一下针对这个问题是否有解决办法,首先物化视图(materialized view)是可以考虑的,但是存在数据一致性问题。另外Clickhouse在上个月正式投产的Projection这个功能非常优秀,可以完美解决上述问题。Projection原来在Experimental Feature下,需要开启配置才能使用,于2022-02-17正式投入生产使用。
1. Projection简介
Projection的概念出自《C-Store A Column-oriented DBMS》这篇论文,作者是2015年图领奖获得者、Vertica之父——Mike Stonebraker。来自快手的Amos Bird(郑天祺)借鉴了这个思想,在Clickhouse中实现了Projection功能,并贡献到社区。
Projection的特点:
1) part-level:Projection类型物化视图,但物化视图是完完全全的一张新表,而Projection是原表的一个部分,存储在分区目录中的一个子目录。
2) consistency:Projection保存在原表的分区,数据的更新、合并都是同源的,不存在不一致的问题。在读写一致性方面,Projection采用版本控制(multi-versioning),没有冗长的锁机制,插入不会影响读操作。
3) automatic: 查询数据时,自动匹配最佳的projection。匹配规则如下:
2.Projection的用法
1)创建Projection
ALTER TABLE [db].name ADD PROJECTION name ( SELECT <COLUMN LIST EXPR> [GROUP BY] [ORDER BY] )
2) 删除Projection
ALTER TABLE [db].name DROP PROJECTION name
3)手动触发物化(首次创建Projection时使用,否则只有新插入数据会自动物化)
ALTER TABLE [db.]table MATERIALIZE PROJECTION name IN PARTITION partition_name
4)删除Projection磁盘文件,但不删除说明(类似Mysql的truncate)
ALTER TABLE [db.]table CLEAR PROJECTION name IN PARTITION partition_name
Projection的ADD,DROP,CLEAR只是对元数据进行修改和文件的删除,因此这些操作都是轻量级的。但是MATERIALIZE 是一个重量级的操作,需要对原有数据进行重排序,所以尽量分区内进行操作或在一个合适的时间节点进行操作。
3. Projection的示例及性能
案例来自官网:UK Property Price Paid | ClickHouse Documentation
1)下载数据源
wget http://prod.publicdata.landregistry.gov.uk.s3-website-eu-west-1.amazonaws.com/pp-complete.csv
2)建表
CREATE TABLE uk_price_paid
(
price UInt32,
date Date,
postcode1 LowCardinality(String),
postcode2 LowCardinality(String),
type Enum8('terraced' = 1, 'semi-detached' = 2, 'detached' = 3, 'flat' = 4, 'other' = 0),
is_new UInt8,
duration Enum8('freehold' = 1, 'leasehold' = 2, 'unknown' = 0),
addr1 String,
addr2 String,
street LowCardinality(String),
locality LowCardinality(String),
town LowCardinality(String),
district LowCardinality(String),
county LowCardinality(String),
category UInt8
) ENGINE = MergeTree ORDER BY (postcode1, postcode2, addr1, addr2);
3)导入数据
clickhouse-local --input-format CSV --structure '
uuid String,
price UInt32,
time DateTime,
postcode String,
a String,
b String,
c String,
addr1 String,
addr2 String,
street String,
locality String,
town String,
district String,
county String,
d String,
e String
' --query "
WITH splitByChar(' ', postcode) AS p
SELECT
price,
toDate(time) AS date,
p[1] AS postcode1,
p[2] AS postcode2,
transform(a, ['T', 'S', 'D', 'F', 'O'], ['terraced', 'semi-detached', 'detached', 'flat', 'other']) AS type,
b = 'Y' AS is_new,
transform(c, ['F', 'L', 'U'], ['freehold', 'leasehold', 'unknown']) AS duration,
addr1,
addr2,
street,
locality,
town,
district,
county,
d = 'B' AS category
FROM table" --date_time_input_format best_effort < pp-complete.csv | clickhouse-client --query "INSERT INTO uk_price_paid FORMAT TSV"
4)验证
SELECT count() FROM uk_price_paid;
5)查询以下三条语句
--Query 1. Average Price Per Year
SELECT toYear(date) AS year, round(avg(price)) AS price, bar(price, 0, 1000000, 80) FROM uk_price_paid GROUP BY year ORDER BY year;
--Query 2. Average Price per Year in London
SELECT toYear(date) AS year, round(avg(price)) AS price, bar(price, 0, 2000000, 100) FROM uk_price_paid WHERE town = 'LONDON' GROUP BY year ORDER BY year;
--Query 3. The Most Expensive Neighborhoods
SELECT
town,
district,
count() AS c,
round(avg(price)) AS price,
bar(price, 0, 5000000, 100)
FROM uk_price_paid
WHERE date >= '2020-01-01'
GROUP BY
town,
district
HAVING c >= 100
ORDER BY price DESC
LIMIT 100;
6)建立并物化Projection
ALTER TABLE uk_price_paid
ADD PROJECTION projection_by_year_district_town
(
SELECT
toYear(date),
district,
town,
avg(price),
sum(price),
count()
GROUP BY
toYear(date),
district,
town
);
ALTER TABLE uk_price_paid
MATERIALIZE PROJECTION projection_by_year_district_town
SETTINGS mutations_sync = 1;
7)再次执行前面三条查询语句
--Query 1. Average Price Per Year
SELECT toYear(date) AS year, round(avg(price)) AS price, bar(price, 0, 1000000, 80) FROM uk_price_paid GROUP BY year ORDER BY year;
--Query 2. Average Price per Year in London
SELECT toYear(date) AS year, round(avg(price)) AS price, bar(price, 0, 2000000, 100) FROM uk_price_paid WHERE town = 'LONDON' GROUP BY year ORDER BY year;
--Query 3. The Most Expensive Neighborhoods
SELECT
town,
district,
count() AS c,
round(avg(price)) AS price,
bar(price, 0, 5000000, 100)
FROM uk_price_paid
WHERE date >= '2020-01-01'
GROUP BY
town,
district
HAVING c >= 100
ORDER BY price DESC
LIMIT 100;
8)查看执行计划
EXPLAIN
SELECT
toYear(date) AS year,
round(avg(price)) AS price,
bar(price, 0, 1000000, 80)
FROM uk_price_paid
GROUP BY year
ORDER BY year ASC
看到 ReadFromStorage (MergeTree(with * projection *))说明命中了projection。
9)结果对比
Query 1
no projection: 27 rows in set. Elapsed: 0.158 sec. Processed 26.32 million rows, 157.93 MB (166.57 million rows/s., 999.39 MB/s.)
projection: 27 rows in set. Elapsed: 0.007 sec. Processed 105.96 thousand rows, 3.33 MB (14.58 million rows/s., 458.13 MB/s.)
Query 2
no projection: 27 rows in set. Elapsed: 0.163 sec. Processed 26.32 million rows, 80.01 MB (161.75 million rows/s., 491.64 MB/s.)
projection: 27 rows in set. Elapsed: 0.008 sec. Processed 105.96 thousand rows, 3.67 MB (13.29 million rows/s., 459.89 MB/s.)
Query 3
no projection: 100 rows in set. Elapsed: 0.069 sec. Processed 26.32 million rows, 62.47 MB (382.13 million rows/s., 906.93 MB/s.)
projection: 100 rows in set. Elapsed: 0.029 sec. Processed 8.08 thousand rows, 511.08 KB (276.06 thousand rows/s., 17.47 MB/s.)
可以看出,无论是在时间还是空间上,Projection的表现都是相当优秀的。
4. Projection原理解析
projection在功能上有点类似mysql的index,但跟MergeTree的index完全是两码事,MergeTree的index是主键基础上的跳数索引。Projection的主要构件包含三个部分,分别是:Projection定义、Projection存储、Projection查询分析。
4.1 Projection定义
Projection的定义可以看作CREATE TABLE AS SELECT (CTAS)语句
1) select中的别名与表达式转换为标准命名
2)group by语句生成Aggregate projections,使用Group by指定的聚合维度作为主键
3)聚合维度将生成中间状态类型AggregateFunction(...),如AggregateFunction(sum, UInt64), AggregateFunction(duration, UInt64) --注:这一点我不是很理解
4.2 Projection存储
Projection是底表Part的伴生Part(即分区目录的子目录)
1)存储结果完全跟底表的MergeTree结构相同
2)Projection Part复用底表Part的分区信息
3)merge, mutation 和replication均对照父part进行处理
PS:在我自己做的实验中,Projection的存储跟父Part的存储结构并非是完全一致的。我想应该是Projection的数据量较少的时候,.bin文件和.mrk2文件均合并成一个文件,而不是按列独立存储。
4.3 Projection查询分析
原理:
一致性保障:
5. 参考
Clickhouse官方文档:MergeTree | ClickHouse Documentation
《C-Store A Column-oriented DBMS》:http://www.doczj.com/doc/f236381bfc4ffe473368ab67-1.html
【clickhouse】clickhouse强大智能的 Projection (投影) 功能_九师兄的博客-CSDN博客