PRECEDING:往前
FOLLOWING:往后
CURRENT ROW:当前行
UNBOUNDED:起点(一般结合PRECEDING,FOLLOWING使用)
UNBOUNDED PRECEDING 表示该窗口最前面的行(起点)
UNBOUNDED FOLLOWING:表示该窗口最后面的行(终点)
比如说:
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW(表示从起点到当前行)
ROWS BETWEEN 2 PRECEDING AND 1 FOLLOWING(表示往前2行到往后1行)
ROWS BETWEEN 2 PRECEDING AND 1 CURRENT ROW(表示往前2行到当前行)
ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING(表示当前行到终点)
测试数据
20191020,11111,85
20191020,22222,83
20191020,33333,86
20191021,11111,87
20191021,22222,65
20191021,33333,98
20191022,11111,67
20191022,22222,34
20191022,33333,88
20191023,11111,99
20191023,22222,33
# 1、使用 over() 函数进行数据统计, 统计每个用户及表中数据的总数
select *,
count(*) over ()
from test_window;
# 2、求用户明细并统计每天的用户总数
select *,
count(userid) over (partition by logday order by logday) as count_day
from test_window;
# 3、计算从第一天到现在的所有 score 大于80分的用户总数
select *,
count(userid) over (order by logday rows between unbounded preceding and current row)
from test_window
where score > 80;
# 4、计算每个用户到当前日期分数大于80的天数
select *,
count(*) over (partition by userid order by logday rows between unbounded preceding and current row) as count_day
from test_window
where score > 80
order by logday, userid;
Hive&Mysql窗口函数详解—及3套案例练习
最新推荐文章于 2024-02-11 16:48:58 发布