Learning Hive, Part 6: Using the row_number() Ranking Function

anickname

Tags: hive, row_number, sorting

Article link: https://blog.csdn.net/javajxz008/article/details/53493509

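Sorting comes up constantly in Hive, and Hive offers several ranking functions whose usage is covered in the official documentation. In a recent project I used row_number() for ranking. I will skip the basic usage here and go straight to the project requirement: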

The table tbl_custinfo is defined as follows:

create table tbl_custinfo(
custno   string, -- customer number
acctno   string, -- account number
cardno   string, -- card number
recdate  string, -- card review date
appid    string, -- card application ID
product  string, -- card product
cardtype string  -- card product at application time
)
partitioned by(pt string)
row format delimited
fields terminated by ','
stored as rcfile;

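The requirement: for each customer, pick the card with the earliest card review date. If the review dates are equal, pick the card with the smallest card application ID. If the application IDs are also equal, pick the card whose card product (product) matches the card product at application time (cardtype).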

Raw data:

1002 65898 622589785645238 20161101 V9875624201 106  1027------①
1002 65898 655812318977101 20161101 V9875624201 1027 1027------②
1003 12876 688951011942014 20050521 Z1021540014 301  301-------③

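At first I did not know how to implement this. Ordering by review date and application ID is easy, but preferring the card whose product equals cardtype is less obvious. My first attempt was: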

select * from
  (select *,row_number() over(distribute by custno sort by recdate asc,appid asc,product=cardtype) as rank  from tbl_custinfo where pt='20161015') a
where a.rank=1;

The MapReduce job runs without errors. The ranks assigned by row_number() (last column) are:

1002 65898 622589785645238 20161101 V9875624201 106  1027 1------①
1002 65898 655812318977101 20161101 V9875624201 1027 1027 2------②
1003 12876 688951011942014 20050521 Z1021540014 301  301  1------③

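However, the ranking for customer 1002 is wrong: ② should be ranked 1. The likely cause is that the boolean expression product=cardtype is sorted in ascending order, which places false before true, so the non-matching row ① comes first. At a colleague's suggestion I took a different approach and mapped the comparison to an explicit sort key: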

select * from
  (select *,row_number() over(distribute by custno sort by recdate asc,appid asc,case when product=cardtype then '1' else '2' end asc) as rank  from tbl_custinfo where pt='20161015') a
where a.rank=1;

This time the ranking is correct:

1002 65898 655812318977101 20161101 V9875624201 1027 1027 1------②
1002 65898 622589785645238 20161101 V9875624201 106  1027 2------①
1003 12876 688951011942014 20050521 Z1021540014 301  301  1------③
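
To check the tie-breaking logic without loading any data, the three raw rows can be written inline. This is only a sketch: the CTE name sample_cards and the alias rn are my own choices, it uses the PARTITION BY / ORDER BY spelling of the window clause with an integer sort key in place of the string '1'/'2', and it assumes a Hive version recent enough to accept CTEs and SELECT without a FROM clause.

```sql
-- Inline copy of the three raw rows shown earlier (hypothetical CTE name).
WITH sample_cards AS (
  SELECT '1002' AS custno, '65898' AS acctno, '622589785645238' AS cardno,
         '20161101' AS recdate, 'V9875624201' AS appid,
         '106' AS product, '1027' AS cardtype
  UNION ALL
  SELECT '1002', '65898', '655812318977101', '20161101', 'V9875624201', '1027', '1027'
  UNION ALL
  SELECT '1003', '12876', '688951011942014', '20050521', 'Z1021540014', '301', '301'
)
SELECT custno, cardno, product, cardtype,
       row_number() OVER (
         PARTITION BY custno
         ORDER BY recdate ASC,
                  appid ASC,
                  -- 0 when product matches cardtype, so matching cards sort first
                  CASE WHEN product = cardtype THEN 0 ELSE 1 END ASC
       ) AS rn
FROM sample_cards;
```

With this ordering, card 655812318977101 (product equal to cardtype) should receive rn = 1 for customer 1002 and card 622589785645238 should receive rn = 2. Wrapping the query in an outer select with where rn = 1 keeps one card per customer, as in the queries above.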
