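1. What is a window function?
A window function defines a window for each row (the window being the set of rows the computation operates on). It computes over that set of values without needing a GROUP BY clause to group the data, so a single result row can carry both the columns of the underlying row and an aggregated column.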
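2. What are window functions for?
In essence a window function is still an aggregation. The difference is that a GROUP BY aggregate collapses each group into one row, while a window function keeps every input row and attaches the computed value to it, so the result carries more information.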
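To make the contrast concrete, here is a minimal sketch that runs both styles of query through Python's built-in sqlite3 module. It assumes the bundled SQLite is 3.25 or newer (the first release with window functions); the table name `t` and the helper `run` are made up for illustration, and the sample rows are the ones used in the queries later in this post.

```python
import sqlite3

# Sample rows borrowed from the examples in this post.
ROWS = [
    ("张三", "2021-04-11"),
    ("李四", "2021-04-09"),
    ("赵四", "2021-04-16"),
    ("张三", "2021-03-10"),
    ("李四", "2020-01-01"),
]


def run(sql):
    """Load ROWS into an in-memory table t(name, date) and run one query."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE t (name TEXT, date TEXT)")
        conn.executemany("INSERT INTO t VALUES (?, ?)", ROWS)
        return conn.execute(sql).fetchall()
    finally:
        conn.close()


# GROUP BY collapses each name into a single row: 3 rows come back.
grouped = run("SELECT name, COUNT(*) AS cnt FROM t GROUP BY name")

# The window version keeps all 5 rows and attaches the per-name count to each.
windowed = run(
    "SELECT name, date, COUNT(*) OVER (PARTITION BY name) AS cnt FROM t"
)

print(len(grouped), len(windowed))
for row in windowed:
    print(row)
```

The windowed result still has the `date` column of every original row, which the grouped result cannot return without aggregating it.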
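3. The first_value/last_value functions
first_value(col) over (partition by col1, col2 order by col1, col2) returns the first value in the ordered window of the current row.
last_value(col) over (partition by col1, col2 order by col1, col2) returns the last value in the ordered window of the current row.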
first_value usage:
select distinct a.date, a.name
    -- ascending order: the first value is each person's earliest date
    ,first_value(date) over (partition by name order by date asc) as `earliest_date_per_person`
    -- descending order: the first value is each person's latest date
    ,first_value(date) over (partition by name order by date desc) as `latest_date_per_person`
from
(
    select '张三' as name, '2021-04-11' as date
    union all
    select '李四' as name, '2021-04-09' as date
    union all
    select '赵四' as name, '2021-04-16' as date
    union all
    select '张三' as name, '2021-03-10' as date
    union all
    select '李四' as name, '2020-01-01' as date
) a
last_value usage:
select distinct a.date, a.name
    -- intent: each person's latest date (no frame clause is given here)
    ,last_value(date) over (partition by name order by date asc) as `latest_date_per_person`
from
(
    select '张三' as name, '2021-04-11' as date
    union all
    select '李四' as name, '2021-04-09' as date
    union all
    select '赵四' as name, '2021-04-16' as date
    union all
    select '张三' as name, '2021-03-10' as date
    union all
    select '李四' as name, '2020-01-01' as date
) a
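To see what this query returns, here is a minimal sketch that replays it through Python's sqlite3 module. The assumptions: SQLite 3.25 or newer, a made-up table `t` in place of the inline subquery, and the alias `last_seen` chosen for illustration. SQLite is a stand-in engine here, not the one the query was written for, though it treats a window with no frame clause the same way standard SQL does.

```python
import sqlite3

# Same sample rows as the query above.
ROWS = [
    ("张三", "2021-04-11"),
    ("李四", "2021-04-09"),
    ("赵四", "2021-04-16"),
    ("张三", "2021-03-10"),
    ("李四", "2020-01-01"),
]


def run(sql):
    """Load ROWS into an in-memory table t(name, date) and run one query."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE t (name TEXT, date TEXT)")
        conn.executemany("INSERT INTO t VALUES (?, ?)", ROWS)
        return conn.execute(sql).fetchall()
    finally:
        conn.close()


# last_value with an ORDER BY but no frame clause, as in the query above.
rows = run(
    """
    SELECT name, date,
           LAST_VALUE(date) OVER (PARTITION BY name ORDER BY date ASC) AS last_seen
    FROM t
    ORDER BY name, date
    """
)

for name, date, last_seen in rows:
    # last_seen equals the row's own date, not the latest date of the person.
    print(name, date, last_seen)
```

For 张三, the row dated 2021-03-10 gets 2021-03-10 back rather than 2021-04-11.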
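Running this query shows that last_value does not return each person's latest date: every row simply gets its own date back. The reason is the default window frame. When an order by is present and no frame is specified, standard SQL (and engines that follow it, such as Hive) uses "range between unbounded preceding and current row", that is, from the start of the partition up to the current row. The last value in that frame is the current row itself (or its last peer, when several rows tie on the order by key). With X as the position of the current row, the frame clauses can be read as:
rows between unbounded preceding and current row: x ∈ (-∞, X]
rows between unbounded preceding and unbounded following: x ∈ (-∞, +∞)
rows between current row and unbounded following: x ∈ [X, +∞)
order by sorts ascending by default. Over the whole partition (unbounded preceding to unbounded following), last_value with a descending order by returns the same value as first_value with an ascending order by.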
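The relationship between last_value with a descending sort and first_value with an ascending sort can be checked directly. The sketch below again uses Python's sqlite3 module as a stand-in engine (assuming SQLite 3.25 or newer); the table `t`, the helper `run`, and the column aliases are all hypothetical names for this example.

```python
import sqlite3

ROWS = [
    ("张三", "2021-04-11"),
    ("李四", "2021-04-09"),
    ("赵四", "2021-04-16"),
    ("张三", "2021-03-10"),
    ("李四", "2020-01-01"),
]


def run(sql):
    """Load ROWS into an in-memory table t(name, date) and run one query."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE t (name TEXT, date TEXT)")
        conn.executemany("INSERT INTO t VALUES (?, ?)", ROWS)
        return conn.execute(sql).fetchall()
    finally:
        conn.close()


rows = run(
    """
    SELECT name, date,
           -- first value, ascending: the earliest date per person
           FIRST_VALUE(date) OVER (PARTITION BY name ORDER BY date ASC)
               AS first_asc,
           -- last value, descending, frame widened to the whole partition
           LAST_VALUE(date) OVER (PARTITION BY name ORDER BY date DESC
               ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING)
               AS last_desc_full,
           -- last value, descending, frame left at its default
           LAST_VALUE(date) OVER (PARTITION BY name ORDER BY date DESC)
               AS last_desc_default
    FROM t
    ORDER BY name, date
    """
)

for row in rows:
    print(row)
```

With the frame widened to the whole partition the two columns agree on every row; with the frame left at its default, last_value falls back to returning the current row's date, whichever direction the sort runs.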
-- last_value under three explicit frames (X is the current row)
select distinct a.date, a.name
    -- start of partition up to and including the current row
    ,last_value(date) over (partition by name order by date rows between unbounded preceding and current row) as `(-∞,X]`
    -- the whole partition
    ,last_value(date) over (partition by name order by date rows between unbounded preceding and unbounded following) as `(-∞,+∞)`
    -- the current row through the end of the partition
    ,last_value(date) over (partition by name order by date rows between current row and unbounded following) as `[X,+∞)`
from
(
    select '张三' as name, '2021-04-11' as date
    union all
    select '李四' as name, '2021-04-09' as date
    union all
    select '赵四' as name, '2021-04-16' as date
    union all
    select '张三' as name, '2021-03-10' as date
    union all
    select '李四' as name, '2020-01-01' as date
) a
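For readers without a Hive session at hand, the three frames can be compared with a minimal sketch in Python's sqlite3 module (assuming SQLite 3.25 or newer). The table `t`, the helper `run`, and the plain-ASCII aliases `up_to_current`, `whole_partition`, and `from_current` are illustrative names only.

```python
import sqlite3

ROWS = [
    ("张三", "2021-04-11"),
    ("李四", "2021-04-09"),
    ("赵四", "2021-04-16"),
    ("张三", "2021-03-10"),
    ("李四", "2020-01-01"),
]


def run(sql):
    """Load ROWS into an in-memory table t(name, date) and run one query."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE t (name TEXT, date TEXT)")
        conn.executemany("INSERT INTO t VALUES (?, ?)", ROWS)
        return conn.execute(sql).fetchall()
    finally:
        conn.close()


rows = run(
    """
    SELECT name, date,
           LAST_VALUE(date) OVER (PARTITION BY name ORDER BY date
               ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
               AS up_to_current,
           LAST_VALUE(date) OVER (PARTITION BY name ORDER BY date
               ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING)
               AS whole_partition,
           LAST_VALUE(date) OVER (PARTITION BY name ORDER BY date
               ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING)
               AS from_current
    FROM t
    ORDER BY name, date
    """
)

for row in rows:
    print(row)
```

Only the two frames that reach the end of the partition return each person's latest date on every row; the frame that stops at the current row returns the row's own date.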
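rows can be replaced with range, which bounds the frame by order by values instead of by row positions; more on that in a follow-up.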
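Until then, here is a small preview of where rows and range part ways: they only differ when several rows share the same order by value. The sketch uses Python's sqlite3 module (assuming SQLite 3.25 or newer) and a hypothetical table `s` with a duplicated `day` value; none of these names come from the queries above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE s (day INTEGER)")
# day = 2 appears twice, so those two rows are peers under ORDER BY day.
conn.executemany("INSERT INTO s VALUES (?)", [(1,), (2,), (2,), (3,)])

result = conn.execute(
    """
    SELECT day,
           -- ROWS counts physical rows up to the current one
           COUNT(*) OVER (ORDER BY day
               ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS rows_cnt,
           -- RANGE also pulls in every peer of the current row
           COUNT(*) OVER (ORDER BY day
               RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS range_cnt
    FROM s
    ORDER BY day, rows_cnt
    """
).fetchall()
conn.close()

for row in result:
    print(row)
# (1, 1, 1)
# (2, 2, 3)
# (2, 3, 3)
# (3, 4, 4)
```

The two rows with day = 2 get different counts under rows but the same count under range, because range treats them as one step in the ordering.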