clickhouse分组排序,行号,取特定数量数据

目前应用很多需求设计对数据分组并去特定数量的数据;
clickhouse 新版本增加了row_number(),rank() 函数,可以直接对分组数据添加行号;下面是记录了这两个函数的基本使用;另外用 groupArray方式也大概实现了添加行号(官网上有说 groupArrayLast 可以实现自动排序的功能),但是实测这函数没法使用,所以暂不记录;

1、源数据

select 'a' as name, 25 as age,165 as height union all
select 'b' as name, 21 as age,182 as height union all
select 'a' as name, 21 as age,187 as height union all
select 'a' as name, 25 as age,158 as height union all
select 'b' as name, 22 as age,168 as height

2、生成数组

2.1 groupArray 分组合并为数组

groupArray 会把同类型的值合并为数组,并过滤NULL值数据;格式groupArray(max_size)(fields)

-- 不限制分组数量
select name,groupArray(age) from 
(select 'a' as name, 25 as age,165 as height union all 
 select 'b' as name, 21 as age,182 as height union all 
 select 'a' as name, 21 as age,187 as height union all 
 select 'a' as name, 25 as age,158 as height union all 
 select 'b' as name, 22 as age,168 as height
) a group by name;
namegroupArray(age)
b[21,22]
a[25,21,25]
-- 限制分组数量为1
-- 如果先对内部数据排序再分组可拿取age最大的一条
-- arrayStringConcat 将数据用特定字符合并
select name,groupArray(1)(age),arrayStringConcat(groupArray(1)(age),'') from 
(select * from 
 (select 'a' as name, 25 as age,165 as height union all 
  select 'b' as name, 21 as age,182 as height union all 
  select 'a' as name, 21 as age,187 as height union all 
  select 'a' as name, 25 as age,158 as height union all 
  select 'b' as name, 22 as age,168 as height
 ) a  
 order by age desc 
) r group by name;
namegroupArray(1)(age)arrayStringConcat(groupArray(1)(age),‘’)
b[22]22
a[25]25
2.2 arrayEnumerate 标记数据
select name,groupArray(age) as values,arrayEnumerate(values) as indexs from 
(select 'a' as name, 25 as age,165 as height union all 
 select 'b' as name, 21 as age,182 as height union all 
 select 'a' as name, 21 as age,187 as height union all 
 select 'a' as name, 25 as age,158 as height union all 
 select 'b' as name, 22 as age,168 as height
) a group by name;
namevaluesindexs
b[21,22][1,2]
a[25,21,25][1,2,3]

3、rank()、row_number()

3.1 说明

https://clickhouse.com/docs/en/sql-reference/window-functions

3.2 使用
--rank()
select name,age,rank() over(partition by name order by age asc ) from 
(select 'a' as name, 25 as age,165 as height union all 
 select 'b' as name, 21 as age,182 as height union all 
 select 'a' as name, 21 as age,187 as height union all 
 select 'a' as name, 25 as age,158 as height union all 
 select 'b' as name, 22 as age,168 as height
) a group by name,age;

-- row_number()
select name,age,row_number() over(partition by name order by age asc ) from 
(select 'a' as name, 25 as age,165 as height union all 
 select 'b' as name, 21 as age,182 as height union all 
 select 'a' as name, 21 as age,187 as height union all 
 select 'a' as name, 25 as age,158 as height union all 
 select 'b' as name, 22 as age,168 as height
) a group by name,age;
nameagerank() OVER (PARTITION BY name ORDER BY age ASC)
a211
a252
b211
b222

PS. 行号已经标明,后续需要取多少数据设置行号条件即可

  • 1
    点赞
  • 2
    收藏
    觉得还不错? 一键收藏
  • 1
    评论

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论 1
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值