Hive tuning: turning a vertical table into a horizontal table

Latest recommended article published on 2024-03-19 18:36:23

幸运六叶草

Suppose there is a table t_buy_buyer_time_hongbao_asc like this:

    user id (uid)   sequence (row_num)   purchase time (order_time)
    25560           1                    1325345254
    25560           2                    1331043510
    25560           3                    1331999999
    25720           1                    1320381121
    25720           2                    1320461154
    25720           3                    1320639271
    26840           1                    1337214675
    26840           2                    1337214694
    26840           3                    1337214768
    37160           1                    1328583075

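The requirement is to list, in a single table, each user's first, second, and third purchase times.

For example:

    user id   first purchase   second purchase   third purchase
    25560     1325345254       1331043510        1331999999
    25720     1320381121       1320461154        1320639271
    26840     1337214675       1337214694        1337214768
    ......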

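Put vividly, the requirement is to turn a vertical table into a horizontal one: one row per purchase becomes one row per user.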
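To make the target shape concrete before looking at the Hive scripts, here is a minimal Python sketch of the same reshaping on the sample rows shown earlier. The function name `pivot` and the `max_rank` parameter are made up for this illustration and are not part of the original post; users with fewer than three purchases get `None` in the missing slots.

```python
from collections import defaultdict

# Sample rows from the table above: (uid, row_num, order_time).
rows = [
    (25560, 1, 1325345254), (25560, 2, 1331043510), (25560, 3, 1331999999),
    (25720, 1, 1320381121), (25720, 2, 1320461154), (25720, 3, 1320639271),
    (26840, 1, 1337214675), (26840, 2, 1337214694), (26840, 3, 1337214768),
    (37160, 1, 1328583075),
]


def pivot(rows, max_rank=3):
    """Return {uid: [1st, 2nd, 3rd purchase time]}, with None where missing."""
    wide = defaultdict(lambda: [None] * max_rank)
    for uid, row_num, order_time in rows:
        if 1 <= row_num <= max_rank:
            # row_num is 1-based, list positions are 0-based.
            wide[uid][row_num - 1] = order_time
    return dict(wide)


wide = pivot(rows)
for uid in sorted(wide):
    print(uid, *wide[uid])
```

Each input row lands in exactly one cell of the wide result, which is the behaviour both Hive scripts have to reproduce.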

Two different Hive scripts can be used for this query.

Hive script 1

    -- Script 1: self-join the row_num = 1, 2, 3 slices of the table on uid.
    select
        tb1.uid        as uid,
        tb1.order_time as s1t_deal_time,  -- first purchase time
        tb2.order_time as c2d_deal_time,  -- second purchase time
        tb3.order_time as r3d_deal_time   -- third purchase time
    from
        (select * from t_buy_buyer_time_hongbao_asc where row_num=1 and pt='20121010000000') tb1
    left outer join
        (select * from t_buy_buyer_time_hongbao_asc where row_num=2 and pt='20121010000000') tb2
        on tb1.uid = tb2.uid
    left outer join
        (select * from t_buy_buyer_time_hongbao_asc where row_num=3 and pt='20121010000000') tb3
        on tb1.uid = tb3.uid

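This Hive script needs only one job, because both joins use the same key (uid) and Hive can merge them into a single MapReduce job. Execution time: 376.005 s.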
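The join-based script can be tried on the sample rows without a Hive cluster. The sketch below uses Python's built-in `sqlite3` module as a stand-in, which is an assumption made purely for illustration: SQLite is not Hive, `pt` is an ordinary column here rather than a partition, and nothing about the job count or timing carries over. Only the shape of the result is being checked.

```python
import sqlite3

# In-memory SQLite database standing in for Hive (assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE t_buy_buyer_time_hongbao_asc "
    "(uid INTEGER, row_num INTEGER, order_time INTEGER, pt TEXT)"
)
sample = [
    (25560, 1, 1325345254), (25560, 2, 1331043510), (25560, 3, 1331999999),
    (25720, 1, 1320381121), (25720, 2, 1320461154), (25720, 3, 1320639271),
    (26840, 1, 1337214675), (26840, 2, 1337214694), (26840, 3, 1337214768),
    (37160, 1, 1328583075),
]
conn.executemany(
    "INSERT INTO t_buy_buyer_time_hongbao_asc VALUES (?, ?, ?, '20121010000000')",
    sample,
)

# Same structure as script 1: three slices of the table joined on uid.
join_sql = """
SELECT tb1.uid        AS uid,
       tb1.order_time AS s1t_deal_time,
       tb2.order_time AS c2d_deal_time,
       tb3.order_time AS r3d_deal_time
FROM (SELECT * FROM t_buy_buyer_time_hongbao_asc
      WHERE row_num = 1 AND pt = '20121010000000') tb1
LEFT OUTER JOIN
     (SELECT * FROM t_buy_buyer_time_hongbao_asc
      WHERE row_num = 2 AND pt = '20121010000000') tb2
  ON tb1.uid = tb2.uid
LEFT OUTER JOIN
     (SELECT * FROM t_buy_buyer_time_hongbao_asc
      WHERE row_num = 3 AND pt = '20121010000000') tb3
  ON tb1.uid = tb3.uid
ORDER BY tb1.uid
"""
join_rows = conn.execute(join_sql).fetchall()
for row in join_rows:
    print(row)
```

Because the joins are left outer joins driven by the row_num = 1 slice, user 37160, who has only one purchase in the sample, still appears, with NULL (Python `None`) in the second and third columns.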

Hive script 2

    -- Script 2: conditional aggregation in one group by (a missing purchase yields 0, not NULL).
    select
        tb1.uid as uid,
        s1t_deal_time,
        c2d_deal_time,
        r3d_deal_time
    from
        (select uid,
                sum(if(row_num=1, order_time, 0)) as s1t_deal_time,
                sum(if(row_num=2, order_time, 0)) as c2d_deal_time,
                sum(if(row_num=3, order_time, 0)) as r3d_deal_time
         from t_buy_buyer_time_hongbao_asc
         where pt='20121010000000'
         group by uid) tb1
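The aggregation-based script can be checked on the sample rows in the same way. The sketch below again uses Python's `sqlite3` as a stand-in for Hive (an assumption for illustration only), and writes Hive's `if(cond, a, b)` as the equivalent `CASE WHEN cond THEN a ELSE b END` so that it runs on any SQLite version. The `MAX(CASE ...)` variant is not from the original post; it is included only to show one way of getting NULL instead of 0 for a missing purchase.

```python
import sqlite3

# In-memory SQLite database standing in for Hive (assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE t_buy_buyer_time_hongbao_asc "
    "(uid INTEGER, row_num INTEGER, order_time INTEGER, pt TEXT)"
)
sample = [
    (25560, 1, 1325345254), (25560, 2, 1331043510), (25560, 3, 1331999999),
    (25720, 1, 1320381121), (25720, 2, 1320461154), (25720, 3, 1320639271),
    (26840, 1, 1337214675), (26840, 2, 1337214694), (26840, 3, 1337214768),
    (37160, 1, 1328583075),
]
conn.executemany(
    "INSERT INTO t_buy_buyer_time_hongbao_asc VALUES (?, ?, ?, '20121010000000')",
    sample,
)

# Same idea as script 2: one scan, one GROUP BY, SUM over a conditional value.
sum_sql = """
SELECT uid,
       SUM(CASE WHEN row_num = 1 THEN order_time ELSE 0 END) AS s1t_deal_time,
       SUM(CASE WHEN row_num = 2 THEN order_time ELSE 0 END) AS c2d_deal_time,
       SUM(CASE WHEN row_num = 3 THEN order_time ELSE 0 END) AS r3d_deal_time
FROM t_buy_buyer_time_hongbao_asc
WHERE pt = '20121010000000'
GROUP BY uid
ORDER BY uid
"""

# Hypothetical variant: MAX over a CASE with no ELSE leaves missing slots NULL.
max_sql = """
SELECT uid,
       MAX(CASE WHEN row_num = 1 THEN order_time END) AS s1t_deal_time,
       MAX(CASE WHEN row_num = 2 THEN order_time END) AS c2d_deal_time,
       MAX(CASE WHEN row_num = 3 THEN order_time END) AS r3d_deal_time
FROM t_buy_buyer_time_hongbao_asc
WHERE pt = '20121010000000'
GROUP BY uid
ORDER BY uid
"""

sum_rows = conn.execute(sum_sql).fetchall()
max_rows = conn.execute(max_sql).fetchall()
for row in sum_rows:
    print(row)
```

For users with all three purchases the two variants agree. They differ only for a user like 37160: the SUM form reports 0 for the missing second and third purchases, while the MAX form reports NULL, matching what the join-based script returns.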