hive大字段拼接与切割实战(一个大字段保存每天历史数据)

前言

去年的时候项目组接到需求要将每个客户每个产品的昨日日累计收益在hive中统计出来推送到hbase,此为对接下游1。后来又接到另一个需求,要保留最近30天的数据,还需要推送到hbase中,此为对接下游2。
经过分析后可以在hive结果表中增加一个保存历史数据字段,对每天的收益进行拼接,然后对此字段进行截取30天,推送到hbase中。下面谈一下实现过程。

# 对每个产品收益分别求和

select 
id
sum(if (sign in ('pro1_a','pro1_b','pro1_c','pro1_d'),income ,null) as pro1,
sum(if (sign='pro2',income ,null) as pro2,
sum(if (sign='pro3',income ,null) as pro3,
sum(if (sign='pro4',income ,null) as pro4,
sum(if (sign='pro1_a',income ,null) as pro1_a,
sum(if (sign='pro1_b',income ,null) as pro1_b,
sum(if (sign='pro1_c',income ,null) as pro1_c,
sum(if (sign='pro1_d',income ,null) as pro1_d
 from (select pro1.id,pro1.income as income,'pro1' as sign,from pro1 where day=${today} and type='pro1'
union all 
select pro2.id,pro2.income as income,'pro2' as sign,from pro2 where day=${today} and type='pro2'
union all
select pro3.id,pro3.income as income,'pro3' as sign,from pro3 where day=${today} and type='pro3'
union all
select pro4.id,pro4.income as income,'pro4.type' as sign from pro4  where day=${today} and type='pro3' and  type in ('pro4_a','pro4_b','pro4_c','pro4_d')f
group by id)aa
full join (
select id,total_income,pro1_income_his,pri2_income_his,pri3_income_his,pri4_income_his
from table_his where day =${yesterday}
    ) bb
aa.id=bb.id;

1.首先统计每个产品的日收益后进行拼接

1.1 字段:

         product_total_income_his:产品历史累计收益
         product_total_income:产品累计收益
         product_income 产品日收益
         date:日期
         id:客户id
         产品历史累计收益字段()product_total_income_his)内容:
         20200101:100\;20200102:105\;20200103:110\;.......

1.2 表:

       tableName:拼接收益结果表
       tableTmp:统计日累计收益结果表

1.3 sql逻辑:

insert overwrite table tableName partition(day='{today}')   
select 
      id,
      concat(nvl(b.product_total_income_his,''),"{data}",':',nvl(a.product_total_income,''),'\;') as product_income_his
from 
      (select 
             id,
             sum(product_income) as product_total_income 
         from tableTmp
      )a
full jion
        (select 
               id,
               product_income_his 
           from tableName 
          where day='${yesterday}'
         ) b 
on a.id=b.id;

2.切割最近30天

2.1 表:

        tableResult:hive结果表
        tableName:拼接收益结果表

2.2 字段:

         product_total_income_his:产品历史累计收益
         product_total_income:产品累计收益
         product_total_income_his_30 :产品历史累计收益(最近30天)
         dt:日期

2.3 sql逻辑:

select
         id,
         product_total_income,
         dt,
         substring_index(product_total_income_his,'\:',-30) as product_total_income_his_30
  from tableName 
 where day ="{today}"

3 番外

本人小白一个,如有问题,多多担待,多多指教!

  • 0
    点赞
  • 0
    收藏
    觉得还不错? 一键收藏
  • 打赏
    打赏
  • 0
    评论
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包

打赏作者

weixin_44352020

你的鼓励将是我创作的最大动力

¥1 ¥2 ¥4 ¥6 ¥10 ¥20
扫码支付:¥1
获取中
扫码支付

您的余额不足,请更换扫码支付或充值

打赏作者

实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值