一个数据计算的需求中需要用到当前月以及上月的数据,不利用map数据类型的话计算起来很麻烦,这里记录一下过程。
以下是根据实际业务分析出的计算逻辑。
ps:24合算法
30:资金
31:资金平均数
Q :12月=(P31*11+(P30+Q30)/2)/12
P:11月=(O31*10+(O30+P30)/2)/11
O:10月=(N31*9+(N30+O30)/2)/10
N:09月=(M31*8+(M30+N30)/2)/9
M:08月=(L31*7+(L30+M30)/2)/8
L:07月=(K31*6+(K30+L30)/2)/7
K:06月=(J31*5+(J30+K30)/2)/6
J:05月=(I31*4+(I30+J30)/2)/5
I:04月=(H31*3+(H30+I30)/2)/4
H:03月=(G31*2+(G30+H30)/2)/3
G:02月=(F31+(F30+G30)/2)/2
F:01月=(E30+F30)/2
资金平均数计算:
1月: (上月资金+当月资金)/2
2月至11月:当月平均资金 = [(上月资金平均数)*(当前月数-1)+ (上月资金+当月资金)/2 ] / 当前月数
每月数据的格式如下图片所示:
先计算出每年01月的资金平均数:
用当前表左关联当前表,关联条件为f1.zyear-1=f2.zyear and f1.zmonth+11=f2.zmonth
上一年对应今年,上年12月对应今年1月
select
f1.zyear,
f1.zmonth,
'01' date_type,
f1.base_name,
'资金平均数' as inv_main_class,
'资金平均数' as inv_class,
sum(coalesce(f1.inv_value,0)+coalesce(f2.inv_value,0)) / 2 as inv_value,
null as inv_value2,
null as inv_value3,
current_timestamp as insert_time
from (select * from db_name.table_name a1 where a1.inv_main_class='资金' and a1.date_type='01') f1 left join
(select * from db_name.table_name a1 where a1.inv_main_class='资金' and a1.date_type='12') f2
on f1.base_name=f2.base_name and f1.zyear-1=f2.zyear and f1.zmonth+11=f2.zmonth
group by
f1.zyear,
f1.zmonth,
f1.base_name;
用SQL我还没想到怎么实现递归调用,输出每一轮的结果,没办法只能12个月写12段SQL
分段计算2月至12月每月的平均资金
这里就要用到map数据类型了,
使用方式如下:
str_to_map(concat_ws(',',collect_list(concat_ws(':',cast(t1.zmonth as string),cast(t1.inv_value as string))))) as map_value,
说明:
用cast将数据类型转为string
用concat_ws拼接key和value,key、value之间用:分隔
Hive中concat函数和concat_ws函数的区别
https://blog.csdn.net/henrrywan/article/details/86543202
用collect_list将每个键值对存放到列表中
Hive的collect_list&collect_set函数
https://blog.csdn.net/qq_43665254/article/details/112552005
用concat_ws拼接每个键值对,键值对之间用,分隔
用str_to_map将字符串转为mao数据类型
效果如下:
用这个结果集作为主表关联取上月的资金平均数
select
f1.zyear,
f1.zmonth,
'02' date_type,
f1.base_name,
'库存资金平均数(财务报表口径)' as inv_main_class,
'库存资金平均数(财务报表口径)' as inv_class,
sum(
(coalesce(f2.inv_value,0)*(f1.zmonth-1)
+
(coalesce(f1.map_value[f1.zmonth],0)+coalesce(f1.map_value[case when length(cast(f1.zmonth+1 as string))=1 then concat('0',cast(f1.zmonth+1 as string)) else f1.zmonth+1 end ],0))
/2)
/f1.zmonth) as inv_value,
sum(coalesce(f2.inv_value,0)*(f1.zmonth-1)) inv_value2,
sum((coalesce(f1.map_value[f1.zmonth],0)+coalesce(f1.map_value[case when length(cast(f1.zmonth+1 as string))=1 then concat('0',cast(f1.zmonth+1 as string)) else f1.zmonth+1 end ],0))
/2 )as inv_value2,
current_timestamp as insert_time
from (select
t1.zyear,
'02' as zmonth,
t1.base_name,
'map' as date_type,
'资金map' as inv_main_class,
'资金map' as inv_class,
str_to_map(concat_ws(',',collect_list(concat_ws(':',cast(t1.zmonth as string),cast(t1.inv_value as string))))) as map_value,
current_timestamp as insert_time
from db_name.table_name t1
where t1.inv_main_class='资金' and t1.date_type='当月'
group by
t1.zyear,
t1.base_name)f1
left join (select * from db_name.table_name a1
where a1.inv_main_class='资金平均数'
and a1.date_type='01')f2
on f1.base_name=f2.base_name and f1.zyear=f2.zyear and f1.zmonth-1=f2.zmonth
group by
f1.zyear,
f1.zmonth,
f1.base_name;