窗口分析函数
函数中用到的表数据如下图:
1. 分析函数
- row_number()
- rank()
- dense_rank()
这3个函数通常用在组内排序中,但实现的效果却不相同,用法如下:
select
name,subject,score,
row_number() over(partition by name order by score) rn,
rank() over(partition by name order by score) rk,
dense_rank() over(partition by name order by score) dr
from grade;
排序之后编号对比, 如下图:
2. 聚合函数
- count()组内计数
- sum()组内求和
- avg()组内求平均值
- max()&min()组内求最大最小值
下面SQL以sum函数为例展示聚合函数的用法,其他函数的用法类似。
select
name,subject,score,
sum(score) over(partition by name order by subject) as sum1, -- 分组内起始行到当前行的和
sum(score) over(partition by name order by subject rows between 1 preceding and current row) as sum2, --分组内当前行与前一行的和
sum(score) over(partition by name order by subject rows between 1 preceding and 2 following) as sum3, --分组内当前行与后两行的和
sum(score) over(partition by name order by subject rows between unbounded preceding and current row) as sum4, --分组内起始行到当前行的和
sum(score) over(partition by name order by subject rows between current row and unbounded following) as sum5 ---分组内当前行到终止行的和
from grade;
计算结果如下图:
3. 窗口函数
- lag(col, n, default) 表示分组内列(col)的当前行之前的第n行, default为默认值
- lead(col, n, default) 表示分组内列(col)的当前行之后的第n行, default为默认值
用法如下图:
select
name,subject,score,
lag(score, 1, 0) over(partition by name order by score) lag, -- lag(score, 1, 0)表示分组内列(score)的当前行之前的第1行, 0为默认值
lead(score, 1, 0) over(partition by name order by score) lead -- lead(score, 1, 0)表示分组内列(score)的当前行之后的第1行, 0为默认值
from grade;
结果如下图:
- first_value(col) 组内排序第一个值
- last_value(col) 组内排序最后一个值
用法如下图:
select
name,subject,score,
first_value(score) over(partition by name order by score) f
from grade;