窗口函数
- 语法
函数名([expr]) over(partition by <要分列的组> order by <要排序的列> rows between <数据范围>)
over是关键字,用来指定函数执行的窗口范围,包between含三个分析子句:
- 分组子句 (partition by)
- 排序子句(order by)
- 窗口子句(rows between)
当有 order by 无 rows between,窗口规范默认是
rows between unbounded preceding and current row. (当前及以上)
当无 order by 无 rows between, 窗口规范默认是
rows between unbounded preceding and unbounded following (所有)
rows between 2 preceding and current row # 取当前及上两行
rows between current row and unbounded following #当前及以下
rows between 3 preceding and 1 following # 从上三行到下一行,总共五行
- 函数
聚合类窗口函数:
sum()
count()
avg()
max()
min()
排序窗口函数:
rank() over (…)
dense_rank() over (…)
row_number() over (…)
偏移函数:
lag(…) over(…)
lead(…) over(…)
切片函数:
ntile(n) over (…)
聚合类窗口函数 vs 聚合函数
差异:
普通场景下的聚合函数是将多条记录聚合为一条(多到一)
窗口函数原来有几条记录,执行完还是几条(多到多)
user_num表:
date | user | num |
---|---|---|
0101 | A | 1 |
0101 |