MySQL Window Function Descriptions

最新推荐文章于 2024-07-20 10:15:00 发布

Claroja

最新推荐文章于 2024-07-20 10:15:00 发布

阅读量534

点赞数

分类专栏： mysql 文章标签： mysql

本文链接：https://blog.csdn.net/claroja/article/details/103067762

版权

mysql 专栏收录该内容

33 篇文章 0 订阅

订阅专栏

聚合函数是将多条记录聚合为一条,而窗口函数是每条记录都会执行

win_fn()over (partition by order by frame) as new_row,

partition:分区,窗口函数在不同分组分别执行

order:排序

frame:在分区滑动窗口,可使用一般聚合函数比如avg等

固定窗口函数

Name	Description
CUME_DIST() over_clause	Cumulative distribution value
DENSE_RANK() over_clause	Rank of current row within its partition, without gaps
FIRST_VALUE(expr) [null_treatment] over_clause	Value of argument from first row of window frame
LAG(expr [, N[, default]]) [null_treatment] over_clause	Value of argument from row lagging current row within partition
LAST_VALUE(expr) [null_treatment] over_clause	Value of argument from last row of window frame
LEAD(expr [, N[, default]]) [null_treatment] over_clause	Value of argument from row leading current row within partition
NTH_VALUE(expr, N) [from_first_last] [null_treatment] over_clause	Value of argument from N-th row of window frame
NTILE(N) over_clause	Bucket number of current row within its partition.
PERCENT_RANK() over_clause	Percentage rank value
RANK() over_clause	Rank of current row within its partition, with gaps
ROW_NUMBER() over_clause	Number of current row within its partition

创建测试数据框

create table ss.test(
	id INT,
	c1 INT,
	c2 INT
)


insert into ss.test (c1,c2) values(2,6);
insert into ss.test (c1,c2) values(1,3);
insert into ss.test (c1,c2) values(1,2);
insert into ss.test (c1,c2) values(1,1);
insert into ss.test (c1,c2) values(2,4);
insert into ss.test (c1,c2) values(2,4);

c2	c1
6	2
3	1
2	1
1	1
4	2
4	2

测试

序号函数——row_number() / rank() / dense_rank()

比较区别

row_number不对重复值处理
rank()重复值会并列,后面的值按整体的序号
dense_rank()重复值会并列,后面的值不按整体的序号

select c1,c2,row_number() over (partition by c1 order by c2 asc) as 'new' from ss.test; # 按c1分组,c2排序,给出每行在分组中的序号

c1	c2	new
1	1	1
1	2	2
1	3	3
2	4	1
2	4	2
2	6	3

select c1,c2,rank() over (partition by c1 order by c2 asc) as 'new' from ss.test; # c1分组,c2排序,求排名

c1	c2	new
1	1	1
1	2	2
1	3	3
2	4	1
2	4	1
2	6	3

select c1,c2,dense_rank() over (partition by c1 order by c2 asc) as 'new' from ss.test; # c1分组,c2排序,有重复的排名只计算一次

c1	c2	new
1	1	1
1	2	2
1	3	3
2	4	1
2	4	1
2	6	2

应用分组topn

select * from (select c1,c2,row_number() over (partition by c1 order by c2 asc) as 'new' from ss.test) as temp where new <= 2 ;

c1	c2	new
1	1	1
1	2	2
2	4	1
2	4	2

分布函数——percent_rank()/cume_dist()。

percent_rank() : (rank - 1) / (rows - 1),rank为RANK()函数产生的序号，rows为当前窗口的记录总行数
cume_dist():分组内小于等于当前rank值的行数/分组内总行数
select c1,c2,percent_rank() over (partition by c1 order by c2 asc) as 'new' from ss.test; # c1分组,c2排序,求排名百分比

c1	c2	new
1	1	0
1	2	0.5
1	3	1
2	4	0
2	4	0
2	6	1

select c1,c2,cume_dist() over (partition by c1 order by c2 asc) as 'new' from ss.test; # c1分组,c2排序,小于等于当前rank行的百分比

c1	c2	new
1	1	0.3333333333333333
1	2	0.6666666666666666
1	3	1
2	4	0.6666666666666666
2	4	0.6666666666666666
2	6	1

应用

查看分区后,小于当前值占的百分比,比如小于某个人考试成绩的百分比

前后函数——lead(n)/lag(n)

lead(n):都是错行取值
lag(n):都是错行取值
select c1,c2,lead(c2) over (partition by c1 order by c2 asc) as 'new' from ss.test; # c1分组,c2排序,错一行取值

c1	c2	new
1	1	2
1	2	3
1	3
2	4	4
2	4	6
2	6

select c1,c2,lag(c2) over (partition by c1 order by c2 asc) as 'new' from ss.test; # c1分组,c2排序,错一行取值

c1	c2	new
1	1
1	2	1
1	3	2
2	4
2	4	4
2	6	4

应用

按时间排序后,查询两笔订单之间的时间差

取记录函数——first_val(expr)/last_val(expr)/nth_value(expr)

select c1,c2,first_value(c2) over (partition by c1 order by c2 asc) as 'new' from ss.test;# c1分组,c2排序,取每组c2第一个值

c1	c2	new
1	1	1
1	2	1
1	3	1
2	4	4
2	4	4
2	6	4

select c1,c2,last_value(c2) over (partition by c1 order by c2 asc) as 'new' from ss.test; # c1分组,c2排序,取每组c2最后一个值

c1	c2	new
1	1	1
1	2	2
1	3	3
2	4	4
2	4	4
2	6	6

select c1,c2,nth_value(c2,2) over (partition by c1 order by c2 asc) as 'new' from ss.test; # c1分组,c2排序,取每组c2第二个值

c1	c2	new
1	1
1	2	2
1	3	2
2	4	4
2	4	4
2	6	4

应用

取分区后指定位置的值,比如取每个班第二名的成绩

顺序分组

select c1,c2,ntile(2) over (partition by c1 order by c2 asc) as 'new' from ss.test; # c1分组,c2排序,按排序进行分组

c1	c2	new
1	1	1
1	2	1
1	3	2
2	4	1
2	4	1
2	6	2

滑动frame

frame的语法是BETWEEN frame_start AND frame_end

WINDOW w AS (partition by c1 order by c2 asc ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING)
生成一个frame在partitioner分区里滑动
rows BETWEEN 1 PRECEDING AND 1 FOLLOWING 当前行的前一行和后一行,共三行记录
rows UNBOUNDED FOLLOWING 当前行到分区后的最后一行
INTERVAL 7 DAY PRECEDING 一周以内的

窗口命名

1.窗口名称
使用WINDOW w AS将替换over后面的语句,让语句更加清晰

select c1,
	c2,
	cume_dist() over w as 'new' 
	from ss.test
	WINDOW w AS (partition by c1 order by c2 asc);

参考:
https://dev.mysql.com/doc/refman/8.0/en/window-function-descriptions.html
https://www.cnblogs.com/DataArt/p/9961676.html

Claroja

关注

0
点赞
踩
0

收藏

觉得还不错? 一键收藏
0
评论
复制链接

分享到 QQ

分享到新浪微博

扫一扫

专栏目录