大数据大部分是sqlboy,整理一下常用的开窗函数
常用的开窗函数:序号函数
row_number() over() : 相同值排名顺延,返回结果1、2、3、4
rank() over():相同结果排名相同,后续排名不连续,返回结果1、2、2、4
dense_rank() over():相同结果排名相同,后续排名连续,返回结果为 1、2、2、3
ntile(n) over(): 分组,将数据均匀划分成N等分,将数据分为n组并返回对应组号1、2......n
样例
select *,
row_number() over(partition by project order by score desc ) as en ,
rank() over(partition by project order by score desc ) as rank_n,
dense_rank() over(partition by project order by score desc ) as dense_n,
ntile(2) over(partition by project order by score) as ntile_n
from test1 t
样例数据
INSERT INTO test1 (name, score, score2, project) VALUES('zhangsan', 100, 80, '物理');
INSERT INTO test1 (name, score, score2, project) VALUES('zhangsan', 80, 20, '化学');
INSERT INTO test1 (name, score, score2, project) VALUES('zhangsan', 90, 23, '数学');
INSERT INTO test1 (name, score, score2, project) VALUES('lisi', 30, 100, '化学');
INSERT INTO test1 (name, score, score2, project) VALUES('lisi', 50, 100, '数学');
INSERT INTO test1 (name, score, score2, project) VALUES('lisi', 80, 85, '物理');
INSERT INTO test1 (name, score, score2, project) VALUES('wangwu', 80, 70, '物理');
INSERT INTO test1 (name, score, score2, project) VALUES('a', 80, 80, '物理');
INSERT INTO test1 (name, score, score2, project) VALUES('b', 90, 90, '物理');
INSERT INTO test1 (name, score, score2, project) VALUES('c', 36, 50, '物理');
聚合开窗函数
- sum() 分组求和
- count() 分组统计
- min() 分组求最大值
- max() 分组求最小值
- avg() 分组求平均值
select name,score ,
sum(score) over(partition by name ) as sum_n,
max(score) over(partition by name ) as max_n,
min(score) over(partition by name ) as min_n,
avg(score) over(partition by name ) as avg_n
from test1 t
其他开窗函数或前后函数
- lag(字段名,offset,default_value) 移动开窗函数,当前排序规则,返回向下第n行制定字段对应数据。其中n代表向下第n行制定字段对应数据。其中n代表向下偏移n行,如果偏移n超出范围返回的默认值,不写返回null,
- lead(字段名,offset,default_value) 移动开窗函数,与lag相反,表示返回向上第n行制定字段对应数据
- first_value() 取分组内排序后,截止到当前行的第一个值
- last_value() 取分组内排序后,截止到当前行最后一个值
select name,score ,
lag(score,2,0) over(partition by name order by score ) as lag_n,
lead(score,2,0) over(partition by name order by score ) as lead_n,
first_value(score) over(partition by name order by score ) as first_n,
last_value(score) over(partition by name order by score desc ) as last_n
from test1 t
执行结果
(注意:lag和lead 当前行为0,下标从零开始计数)
NTH_VALUE()函数
nth_value(expression, index) :获取指定列的有序几何指定位置的值
select name,score ,
nth_value(score,2) over(partition by name order by score ) as lag_n
from test1 t
参考学习:https://download.csdn.net/blog/column/11851938/131011696