MySQL Window Function Descriptions

聚合函数是将多条记录聚合为一条,而窗口函数是每条记录都会执行

win_fn()over (partition by order by frame) as new_row,

partition:分区,窗口函数在不同分组分别执行

order:排序

frame:在分区滑动窗口,可使用一般聚合函数比如avg等

固定窗口函数

NameDescription
CUME_DIST() over_clauseCumulative distribution value
DENSE_RANK() over_clauseRank of current row within its partition, without gaps
FIRST_VALUE(expr) [null_treatment] over_clauseValue of argument from first row of window frame
LAG(expr [, N[, default]]) [null_treatment] over_clauseValue of argument from row lagging current row within partition
LAST_VALUE(expr) [null_treatment] over_clauseValue of argument from last row of window frame
LEAD(expr [, N[, default]]) [null_treatment] over_clauseValue of argument from row leading current row within partition
NTH_VALUE(expr, N) [from_first_last] [null_treatment] over_clauseValue of argument from N-th row of window frame
NTILE(N) over_clauseBucket number of current row within its partition.
PERCENT_RANK() over_clausePercentage rank value
RANK() over_clauseRank of current row within its partition, with gaps
ROW_NUMBER() over_clauseNumber of current row within its partition

创建测试数据框

create table ss.test(
	id INT,
	c1 INT,
	c2 INT
)


insert into ss.test (c1,c2) values(2,6);
insert into ss.test (c1,c2) values(1,3);
insert into ss.test (c1,c2) values(1,2);
insert into ss.test (c1,c2) values(1,1);
insert into ss.test (c1,c2) values(2,4);
insert into ss.test (c1,c2) values(2,4);
c2c1
62
31
21
11
42
42

测试

序号函数——row_number() / rank() / dense_rank()

比较区别

row_number不对重复值处理
rank()重复值会并列,后面的值按整体的序号
dense_rank()重复值会并列,后面的值不按整体的序号

select c1,c2,row_number() over (partition by c1 order by c2 asc) as 'new' from ss.test; # 按c1分组,c2排序,给出每行在分组中的序号

c1c2new
111
122
133
241
242
263

select c1,c2,rank() over (partition by c1 order by c2 asc) as 'new' from ss.test; # c1分组,c2排序,求排名

c1c2new
111
122
133
241
241
263

select c1,c2,dense_rank() over (partition by c1 order by c2 asc) as 'new' from ss.test; # c1分组,c2排序,有重复的排名只计算一次

c1c2new
111
122
133
241
241
262
应用分组topn

select * from (select c1,c2,row_number() over (partition by c1 order by c2 asc) as 'new' from ss.test) as temp where new <= 2 ;

c1c2new
111
122
241
242

分布函数——percent_rank()/cume_dist()。

percent_rank() : (rank - 1) / (rows - 1),rank为RANK()函数产生的序号,rows为当前窗口的记录总行数
cume_dist():分组内小于等于当前rank值的行数/分组内总行数
select c1,c2,percent_rank() over (partition by c1 order by c2 asc) as 'new' from ss.test; # c1分组,c2排序,求排名百分比

c1c2new
110
120.5
131
240
240
261

select c1,c2,cume_dist() over (partition by c1 order by c2 asc) as 'new' from ss.test; # c1分组,c2排序,小于等于当前rank行的百分比

c1c2new
110.3333333333333333
120.6666666666666666
131
240.6666666666666666
240.6666666666666666
261
应用

查看分区后,小于当前值占的百分比,比如小于某个人考试成绩的百分比

前后函数——lead(n)/lag(n)

lead(n):都是错行取值
lag(n):都是错行取值
select c1,c2,lead(c2) over (partition by c1 order by c2 asc) as 'new' from ss.test; # c1分组,c2排序,错一行取值

c1c2new
112
123
13
244
246
26

select c1,c2,lag(c2) over (partition by c1 order by c2 asc) as 'new' from ss.test; # c1分组,c2排序,错一行取值

c1c2new
11
121
132
24
244
264
应用

按时间排序后,查询两笔订单之间的时间差

取记录函数——first_val(expr)/last_val(expr)/nth_value(expr)

select c1,c2,first_value(c2) over (partition by c1 order by c2 asc) as 'new' from ss.test;# c1分组,c2排序,取每组c2第一个值

c1c2new
111
121
131
244
244
264

select c1,c2,last_value(c2) over (partition by c1 order by c2 asc) as 'new' from ss.test; # c1分组,c2排序,取每组c2最后一个值

c1c2new
111
122
133
244
244
266

select c1,c2,nth_value(c2,2) over (partition by c1 order by c2 asc) as 'new' from ss.test; # c1分组,c2排序,取每组c2第二个值

c1c2new
11
122
132
244
244
264

应用

取分区后指定位置的值,比如取每个班第二名的成绩

顺序分组

select c1,c2,ntile(2) over (partition by c1 order by c2 asc) as 'new' from ss.test; # c1分组,c2排序,按排序进行分组

c1c2new
111
121
132
241
241
262

滑动frame

frame的语法是BETWEEN frame_start AND frame_end

WINDOW w AS (partition by c1 order by c2 asc ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING)
生成一个frame在partitioner分区里滑动
rows BETWEEN 1 PRECEDING AND 1 FOLLOWING 当前行的前一行和后一行,共三行记录
rows UNBOUNDED FOLLOWING 当前行到分区后的最后一行
INTERVAL 7 DAY PRECEDING 一周以内的

窗口命名

1.窗口名称
使用WINDOW w AS将替换over后面的语句,让语句更加清晰

select c1,
	c2,
	cume_dist() over w as 'new' 
	from ss.test
	WINDOW w AS (partition by c1 order by c2 asc);

参考:
https://dev.mysql.com/doc/refman/8.0/en/window-function-descriptions.html
https://www.cnblogs.com/DataArt/p/9961676.html

评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值