clickhouse实现排序的开窗函数, 学习这个需要先了解clickhouse 一些函数,了解过直接跳到步骤3
1.数组相关函数
arrayEnumerate -- 传入数组,返回序号数组,序号不重复
arrayEnumerateDense -- 传入数组,返回序号数组,序号可重复
arrayEnumerateUniq -- 传入数组,根据数组位置返回当前元素出现的次数
SELECT
arrayEnumerate([10, 20, 30, 10, 40]) AS row_number,
arrayEnumerateDense([10, 20, 30, 10, 40]) AS dense_rank,
arrayEnumerateUniq([10, 20, 30, 10, 40]) AS uniq_rank
┌─row_number──┬─dense_rank──┬─uniq_rank───┐
│ [1,2,3,4,5] │ [1,2,3,1,4] │ [1,1,1,2,1] │
└─────────────┴─────────────┴─────────────┘
2 arrry join
array join可以将数组1行转多行,有点类似与hive的炸裂函数
SELECT
key,
value
FROM
(
SELECT
'a' AS key,
[10, 20, 30, 10, 40] AS values
) AS a
ARRAY JOIN values AS value
┌─key─┬─value─┐
│ a │ 10 │
│ a │ 20 │
│ a │ 30 │
│ a │ 10 │
│ a │ 40 │
└─────┴───────┘
4 实现rank,row_number,dense_rank 开窗函数
1.创建测试数据
CREATE TABLE test_data engine = Memory AS
WITH(
SELECT ['A','A','A','A','B','B','B','B','B','A','59','90','80','80','65','75','78','88','99','70']
)AS dict
SELECT dict[number%10+1] AS id, dict[number+11] AS val FROM system.numbers LIMIT 10
2.详细sql参考
SELECT
id,
val,
row_number,
dense_rank,
uniq_rank
FROM
(
SELECT
id,
groupArray(val) AS arr_val,
arrayEnumerate(arr_val) AS row_number,
arrayEnumerateDense(arr_val) AS dense_rank,
arrayEnumerateUniq(arr_val) AS uniq_rank
FROM
(
SELECT *
FROM test_data
ORDER BY val ASC
)
GROUP BY id
)
ARRAY JOIN
arr_val AS val,
row_number,
dense_rank,
uniq_rank
ORDER BY
id ASC,
row_number ASC,
dense_rank ASC
┌─id─┬─val─┬─row_number─┬─dense_rank─┬─uniq_rank─┐
│ A │ 59 │ 1 │ 1 │ 1 │
│ A │ 70 │ 2 │ 2 │ 1 │
│ A │ 80 │ 3 │ 3 │ 1 │
│ A │ 80 │ 4 │ 3 │ 2 │
│ A │ 90 │ 5 │ 4 │ 1 │
│ B │ 65 │ 1 │ 1 │ 1 │
│ B │ 75 │ 2 │ 2 │ 1 │
│ B │ 78 │ 3 │ 3 │ 1 │
│ B │ 88 │ 4 │ 4 │ 1 │
│ B │ 99 │ 5 │ 5 │ 1 │
└────┴─────┴────────────┴────────────┴───────────┘
tip: 比较新的clickhouse支持了开窗函数。
推荐一下:关注了一位clickhouse大佬微信公众号,推送的内容挺好的,"ClickHouse的秘密基地", 有兴趣自己添加哈