hive分析函数

排序

1.row_number() 直排,相同的也有序

row_number() over(partition by col1 order by col2) as rowid
结果:1,2,3,4

2.rank() 并列的,下一个数字会跳过

rank() over(partition by col1 order by col2) as rowid
结果:1,2,2,4,5

3.dense_rank() 并列的,下一个数据不会跳过

dense_rank() over(partition by col1 order by col2) as rowid
结果:1,2,2,3,4

4.ntile() 将分组数据依照顺序切分成n片。返回当前切片值

ntile(2) over(partition by xx order by createtime) as col1, --分组内将数据分成2片
ntile(3) over(partition by xx order by createtime) as col2, --分组内将数据分成3片
ntile不支持rows between,比方 ntile(2) over(partition by xx order by createtime rows between 3 preceding and current row)
假设切片不均匀。默认添加第一个切片的分布。

最值

1.first_value() 取统计结果中第一个值

first_value(col) over(partition by col1 order by col2) as col_new
取分组内排序后,截止到当前行,col第一个值

2.last_value() 取统计结果中最后一个值

last_value(col) over(partition by col1 order by col2) as col_new
取分组内排序后,截止到当前行,col最后一个值

跨行获取

1.lag() 向上取

lag(col1,n,DEFAULT) over(partition by col1 order by col2) as up 
用于统计窗口内往上第n行值,第一个参数为列名,第二个参数为往上第n行(可选,默认为1),第三个参数为默认值(当往上第n行为NULL时候,取默认值,如不指定,则为NULL)

2.lead() 向下取

lead(col1,n,DEFAULT) over(partition by col1 order by col2) as down 
用于统计窗口内往下第n行值,第一个参数为列名,第二个参数为往下第n行(可选,默认为1),第三个参数为默认值(当往下第n行为NULL时候,取默认值,如不指定,则为NULL)

cume_dist() 指定值占总数的百分比

cume_dist()over(partition by col1 order by col2) 
分组内小于等于当前值的行数/分组内总行数,asc小于等于,desc大于等于

with tmp as
(
select '李一' as name,'001' as dept,2900 as salary union all 
select '赵二' as name,'001' as dept,3000 as salary union all 
select '钱三' as name,'002' as dept,2900 as salary union all 
select '孙四' as name,'002' as dept,2300 as salary union all 
select '王五' as name,'002' as dept,2900 as salary
)
select name,
dept,
salary,
cume_dist() over(partition by dept order by salary asc) as d1,--小于等于当前值的行数/分组内总行数
cume_dist() over(partition by dept order by salary desc) as d2--大于等于当前值的行数/分组内总行数
from tmp

percent_rank() 分组内当前行的RANK值-1/分组内总行数-1

percent_rank() over (partition by col1 order by col2 )
等同于
rank()over(partition by col1 order by col2) -1/sum(1)over(partition by col1 order by col2)-1

  • 0
    点赞
  • 8
    收藏
    觉得还不错? 一键收藏
  • 0
    评论

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值