Hive级联求和

基本需求:

根据访客的每日访问信息,进行累计访问,计算每个月访问量,总访问量。

输入数据:

有如下访客访问次数统计表 t_access_times。为了减轻计算复杂度,去掉了天的信息只留下了年月。

+--------------------------+-----------------------+-----------------------------+--+
| t_access_times.username  | t_access_times.month  | t_access_times.access_time  |
+--------------------------+-----------------------+-----------------------------+--+
| A                        | 2088-01               | 8                           |
| A                        | 2088-01               | 88                          |
| B                        | 2088-01               | 8                           |
| A                        | 2088-01               | 8                           |
| B                        | 2088-01               | 28                          |
| A                        | 2088-01               | 8                           |
| A                        | 2088-02               | 4                           |
| A                        | 2088-02               | 6                           |
| B                        | 2088-02               | 80                          |
| B                        | 2088-02               | 8                           |
+--------------------------+-----------------------+-----------------------------+--+

输出数据:

需要输出报表:t_access_times_accumulate

+-------------+----------+------+-------------+--+
| a.username  | a.month  | _c2  | accumulate  |
+-------------+----------+------+-------------+--+
| A           | 2088-01  | 112  | 112         |
| A           | 2088-02  | 10   | 122         |
| B           | 2088-01  | 36   | 36          |
| B           | 2088-02  | 88   | 124         |
+-------------+----------+------+-------------+--+

实现步骤:

  1. 创建表,加载数据
数据源:
A,2088-01,8
A,2088-01,88
B,2088-01,8
A,2088-01,8
B,2088-01,28
A,2088-01,8
A,2088-02,4
A,2088-02,6
B,2088-02,80
B,2088-02,8

创建表:
create table t_access_times(username string,month string,access_time int) 
row format delimited fields terminated by ',';

加载数据:
load data local inpath '/home/chunsoft/t_access_times.dat' into table t_access_times;
  1. 计算单个用户的月访问次数

select username,month,sum(access_time ) as nums from t_access_times group by username,month;

+-----------+----------+-------+--+
| username  |  month   | nums  |
+-----------+----------+-------+--+
| A         | 2088-01  | 112   |
| A         | 2088-02  | 10    |
| B         | 2088-01  | 36    |
| B         | 2088-02  | 88    |
+-----------+----------+-------+--+

  1. 上面的表自己跟自己进行inner join 方便求总访问量
select  a.*,b.*from
(select username,month,sum(access_time) as nums from t_access_times group by username,month) a  inner join
(select username,month,sum(access_time) as nums from t_access_times group by username,month) b 
on a.username = b.username;


+-------------+----------+---------+-------------+----------+---------+--+
| a.username  | a.month  | a.nums  | b.username  | b.month  | b.nums  |
+-------------+----------+---------+-------------+----------+---------+--+
| A           | 2088-01  | 112     | A           | 2088-01  | 112     |
| A           | 2088-01  | 112     | A           | 2088-02  | 10      |
| A           | 2088-02  | 10      | A           | 2088-01  | 112     |
| A           | 2088-02  | 10      | A           | 2088-02  | 10      |
| B           | 2088-01  | 36      | B           | 2088-01  | 36      |
| B           | 2088-01  | 36      | B           | 2088-02  | 88      |
| B           | 2088-02  | 88      | B           | 2088-01  | 36      |
| B           | 2088-02  | 88      | B           | 2088-02  | 88      |
+-------------+----------+---------+-------------+----------+---------+--+

  1. 根据当前月份的大小来判断累计求和并排序
select a.username,a.month,max(a.nums) ,sum(b.nums) as accumulate from 
(select username,month,sum(access_time) as nums from t_access_times group by username,month) a  inner join
(select username,month,sum(access_time) as nums from t_access_times group by username,month) b 
on a.username = b.username
where a.month>=b.month 
group by a.username,a.month 
order by a.username,a.month;

+-------------+----------+------+-------------+--+
| a.username  | a.month  | _c2  | accumulate  |
+-------------+----------+------+-------------+--+
| A           | 2088-01  | 112  | 112         |
| A           | 2088-02  | 10   | 122         |
| B           | 2088-01  | 36   | 36          |
| B           | 2088-02  | 88   | 124         |
+-------------+----------+------+-------------+--+


  • 0
    点赞
  • 1
    收藏
    觉得还不错? 一键收藏
  • 打赏
    打赏
  • 0
    评论

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包

打赏作者

刘金超DT

你的鼓励将是我创作的最大动力

¥1 ¥2 ¥4 ¥6 ¥10 ¥20
扫码支付:¥1
获取中
扫码支付

您的余额不足,请更换扫码支付或充值

打赏作者

实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值