HIVE/SQL 实现同一列数据累加和累乘

一、累加

hive sql 实现同一列数据的累加,相信大家都会,这里就不过多解释了,贴一个例子结束:

with base_data as (
			  select 1 as fee, '20220101' as dt 
	union all select 2 as fee, '20220101' as dt 
	union all select 3 as fee, '20220102' as dt 
	union all select 4 as fee, '20220102' as dt 
	union all select 5 as fee, '20220103' as dt 
	union all select 6 as fee, '20220103' as dt 
	union all select 7 as fee, '20220104' as dt 
	union all select 8 as fee, '20220105' as dt 
	union all select 9 as fee, '20220105' as dt 
)

select sum(fee) as fee from base_data

二、累乘

仔细一想,hive 好像没有直接对同一列求累乘的函数,这里需要用到高中的数学知识——对数,一起来回忆一下:

1. 对数定义

2.对数的性质

以上是对数的一些运算性质,其中我重点圈出了两个性质,这将是我们使用 hive sql 实现同一列数据累乘的关键

1)左边的红框中,两个底数(a)相同的对数相加 = 以a为底(N*M)的对数,其中(N*M)就是我们想要的计算结果,应该如何获取(N*M)呢?

2)看右边红框的性质,我们可以利用这个性质获取(N*M)

映射到 hive ,可以将同一列的相乘转为同一列的对数相加,在求真数即可;

具体做法:

1)先将该列每一个值转为以10为底的对数(底数可随意),再对该列求sum,最后就得到以10为底【该列所有值相加】的结果为真数的对数。记为结果A(利用hive的log()函数)

2)再对A取真数(利用hive的power()函数)

用到的 log() 和 power() 两个函数不了解的可以自行百度一下

写个例子实战一下:

with base_data as (
			  select 1 as fee, '20220101' as dt 
	union all select 2 as fee, '20220101' as dt 
	union all select 3 as fee, '20220102' as dt 
	union all select 4 as fee, '20220102' as dt 
	union all select 5 as fee, '20220103' as dt 
	union all select 6 as fee, '20220103' as dt 
	union all select 7 as fee, '20220104' as dt 
	union all select 8 as fee, '20220105' as dt 
	union all select 9 as fee, '20220105' as dt 
)

select 	sum(fee) 						as `单列累加`,
		power(10, sum(log(10, fee))) 	as `单列累乘`
from 	base_data

结果:

可以发现,理论上来说 1*2*3*...*9 = 362880,但是为什么结果会是 362879.9999999994呢?

这是因为在累乘过程中,由于进行了log转换,存在较小精度损失;在真正使用时一般会用round()进行四舍五入处理;

再看:

with base_data as (
			  select 1 as fee, '20220101' as dt 
	union all select 2 as fee, '20220101' as dt 
	union all select 3 as fee, '20220102' as dt 
	union all select 4 as fee, '20220102' as dt 
	union all select 5 as fee, '20220103' as dt 
	union all select 6 as fee, '20220103' as dt 
	union all select 7 as fee, '20220104' as dt 
	union all select 8 as fee, '20220105' as dt 
	union all select 9 as fee, '20220105' as dt 
)

select 	sum(fee) 							as `单列累加`,
		power(10, sum(log(10, fee))) 		as `单列累乘`,
		round(power(10, sum(log(10, fee))))	as `单列累乘-精度处理`
from 	base_data

结果:

 至此,DONE

  • 2
    点赞
  • 11
    收藏
    觉得还不错? 一键收藏
  • 1
    评论
评论 1
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值