前言
去年的时候项目组接到需求要将每个客户每个产品的昨日日累计收益在hive中统计出来推送到hbase,此为对接下游1。后来又接到另一个需求,要保留最近30天的数据,还需要推送到hbase中,此为对接下游2。
经过分析后可以在hive结果表中增加一个保存历史数据字段,对每天的收益进行拼接,然后对此字段进行截取30天,推送到hbase中。下面谈一下实现过程。
# 对每个产品收益分别求和
select
id
sum(if (sign in ('pro1_a','pro1_b','pro1_c','pro1_d'),income ,null) as pro1,
sum(if (sign='pro2',income ,null) as pro2,
sum(if (sign='pro3',income ,null) as pro3,
sum(if (sign='pro4',income ,null) as pro4,
sum(if (sign='pro1_a',income ,null) as pro1_a,
sum(if (sign='pro1_b',income ,null) as pro1_b,
sum(if (sign='pro1_c',income ,null) as pro1_c,
sum(if (sign='pro1_d',income ,null) as pro1_d
from (select pro1.id,pro1.income as income,'pro1' as sign,from pro1 where day=${today} and type='pro1'
union all
select pro2.id,pro2.income as income,'pro2' as sign,from pro2 where day=${today} and type='pro2'
union all
select pro3.id,pro3.income as income,'pro3' as sign,from pro3 where day=${today} and type='pro3'
union all
select pro4.id,pro4.income as income,'pro4.type' as sign from pro4 where day=${today} and type='pro3' and type in ('pro4_a','pro4_b','pro4_c','pro4_d')f
group by id)aa
full join (
select id,total_income,pro1_income_his,pri2_income_his,pri3_income_his,pri4_income_his
from table_his where day =${yesterday}
) bb
aa.id=bb.id;
1.首先统计每个产品的日收益后进行拼接
1.1 字段:
product_total_income_his:产品历史累计收益
product_total_income:产品累计收益
product_income 产品日收益
date:日期
id:客户id
产品历史累计收益字段()product_total_income_his)内容:
20200101:100\;20200102:105\;20200103:110\;.......
1.2 表:
tableName:拼接收益结果表
tableTmp:统计日累计收益结果表
1.3 sql逻辑:
insert overwrite table tableName partition(day='{today}')
select
id,
concat(nvl(b.product_total_income_his,''),"{data}",':',nvl(a.product_total_income,''),'\;') as product_income_his
from
(select
id,
sum(product_income) as product_total_income
from tableTmp
)a
full jion
(select
id,
product_income_his
from tableName
where day='${yesterday}'
) b
on a.id=b.id;
2.切割最近30天
2.1 表:
tableResult:hive结果表
tableName:拼接收益结果表
2.2 字段:
product_total_income_his:产品历史累计收益
product_total_income:产品累计收益
product_total_income_his_30 :产品历史累计收益(最近30天)
dt:日期
2.3 sql逻辑:
select
id,
product_total_income,
dt,
substring_index(product_total_income_his,'\:',-30) as product_total_income_his_30
from tableName
where day ="{today}"
3 番外
本人小白一个,如有问题,多多担待,多多指教!