Hive窗口函数之preceding and following

最近发现两个特别实用的Hive函数。preceding:向前    following:向后,这两个窗口函数可以实现滑窗求和(指定rows范围)或者指定范围内数据求和(指定range范围)

学习链接:Hive窗口函数之preceding and following,试运行了下,结果如下:

1.数据源:

select
   *
from
    stu_score
order by
    score;

2.函数使用:

select
    name,
    score,
    sum(score) over(order by score range between 2 preceding and 2 following) s1, -- 当前行的score值加减2的范围内的所有行
    sum(score) over(order by score rows between 2 preceding and 2 following) s2, -- 当前行+前后2行,一共5行
    sum(score) over(order by score range between unbounded preceding and unbounded following) s3, -- 全部行,不做限制
    sum(score) over(order by score rows between unbounded preceding and unbounded following) s4, -- 全部行,不做限制
    sum(score) over(order by score) s5, -- 第一行到当前行(和当前行相同score值的所有行都会包含进去)
    sum(score) over(order by score rows between unbounded preceding and current row) s6, -- 第一行到当前行(和当前行相同score值的其他行不会包含进去,这是和上面的区别)
    sum(score) over(order by score rows between 3 preceding and current row) s7, -- 当前行+往前3行
    sum(score) over(order by score rows between 3 preceding and 1 following) s8, --当前行+往前3行+往后1行
    sum(score) over(order by score rows between current row and unbounded following) s9 --当前行+往后所有行
from
    stu_score
order by 
    score;

3.验证数据:

s2~s9都是按照对应逻辑计算的,s1的好像不是,有点像是group by之后的求和。

select
    name,
    score,
    sum(score) over(order by score range between 0 preceding and 10 following) s1 -
from
    stu_score_1
order by 
    score;


 补充: 下述代码主要是求每个团长近七天的用户数,当然只有6月30号的那一天数据是准确的,其他的日期下数据都会偏少。

--每个团长近七天数据
select
    dt,
    leader_uid,
    sum(user_cnt) over(partition by leader_uid order by dt asc rows 6 preceding) as pre_7d_user_num,
    sum(user_cnt) over(partition by leader_uid order by dt asc rows between 6 preceding and current row) pre_7d_user_num
from
   leader_data_di
where
    dt <= '2021-06-30'
    and dt >= date_sub('2021-06-30', 6)
order by
    dt desc
limit
    10;

上述代码里的

  •    sum(user_cnt) over(partition by leader_uid order by dt asc rows 6 preceding) as pre_7d_user_num,
  •     sum(user_cnt) over(partition by leader_uid order by dt asc rows between 6 preceding and current row) pre_7d_user_num

效果是一样的。 

  • 3
    点赞
  • 19
    收藏
    觉得还不错? 一键收藏
  • 2
    评论
评论 2
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值