Hive经典案例之蚂蚁森林(一)

Hive经典案例之蚂蚁森林

以下表记录了用户每天的蚂蚁森林低碳生活领取的记录流水。
table_name:user_low_carbon

user_id data_dt low_carbon
用户 日期 减少碳排放(g)

蚂蚁森林植物换购表,用于记录申领环保植物所需要减少的碳排放量
table_name: plant_carbon

plant_id plant_name low_carbon
植物编号 植物名 换购植物所需要的碳

表数据:plant_carbon.txt

p001	梭梭树	17
p002	沙柳	19
p003	樟子树	146
p004	胡杨	215

表数据 :user_low_carbon.txt

u_001	2017/1/1	10
u_001	2017/1/2	150
u_001	2017/1/2	110
u_001	2017/1/2	10
u_001	2017/1/4	50
u_001	2017/1/4	10
u_001	2017/1/6	45
u_001	2017/1/6	90
u_002	2017/1/1	10
u_002	2017/1/2	150
u_002	2017/1/2	70
u_002	2017/1/3	30
u_002	2017/1/3	80
u_002	2017/1/4	150
u_002	2017/1/5	101
u_002	2017/1/6	68
u_003	2017/1/1	20
u_003	2017/1/2	10
u_003	2017/1/2	150
u_003	2017/1/3	160
u_003	2017/1/4	20
u_003	2017/1/5	120
u_003	2017/1/6	20
u_003	2017/1/7	10
u_003	2017/1/7	110
u_004	2017/1/1	110
u_004	2017/1/2	20
u_004	2017/1/2	50
u_004	2017/1/3	120
u_004	2017/1/4	30
u_004	2017/1/5	60
u_004	2017/1/6	120
u_004	2017/1/7	10
u_004	2017/1/7	120
u_005	2017/1/1	80
u_005	2017/1/2	50
u_005	2017/1/2	80
u_005	2017/1/3	180
u_005	2017/1/4	180
u_005	2017/1/4	10
u_005	2017/1/5	80
u_005	2017/1/6	280
u_005	2017/1/7	80
u_005	2017/1/7	80
u_006	2017/1/1	40
u_006	2017/1/2	40
u_006	2017/1/2	140
u_006	2017/1/3	210
u_006	2017/1/3	10
u_006	2017/1/4	40
u_006	2017/1/5	40
u_006	2017/1/6	20
u_006	2017/1/7	50
u_006	2017/1/7	240
u_007	2017/1/1	130
u_007	2017/1/2	30
u_007	2017/1/2	330
u_007	2017/1/3	30
u_007	2017/1/4	530
u_007	2017/1/5	30
u_007	2017/1/6	230
u_007	2017/1/7	130
u_007	2017/1/7	30
u_008	2017/1/1	160
u_008	2017/1/2	60
u_008	2017/1/2	60
u_008	2017/1/3	60
u_008	2017/1/4	260
u_008	2017/1/5	360
u_008	2017/1/6	160
u_008	2017/1/7	60
u_008	2017/1/7	60
u_009	2017/1/1	70
u_009	2017/1/2	70
u_009	2017/1/2	70
u_009	2017/1/3	170
u_009	2017/1/4	270
u_009	2017/1/5	70
u_009	2017/1/6	70
u_009	2017/1/7	70
u_009	2017/1/7	70
u_010	2017/1/1	90
u_010	2017/1/2	90
u_010	2017/1/2	90
u_010	2017/1/3	90
u_010	2017/1/4	90
u_010	2017/1/4	80
u_010	2017/1/5	90
u_010	2017/1/5	90
u_010	2017/1/6	190
u_010	2017/1/7	90
u_010	2017/1/7	90
u_011	2017/1/1	110
u_011	2017/1/2	100
u_011	2017/1/2	100
u_011	2017/1/3	120
u_011	2017/1/4	100
u_011	2017/1/5	100
u_011	2017/1/6	100
u_011	2017/1/7	130
u_011	2017/1/7	100
u_012	2017/1/1	10
u_012	2017/1/2	120
u_012	2017/1/2	10
u_012	2017/1/3	10
u_012	2017/1/4	50
u_012	2017/1/5	10
u_012	2017/1/6	20
u_012	2017/1/7	10
u_012	2017/1/7	10
u_013	2017/1/1	50
u_013	2017/1/2	150
u_013	2017/1/2	50
u_013	2017/1/3	150
u_013	2017/1/4	550
u_013	2017/1/5	350
u_013	2017/1/6	50
u_013	2017/1/7	20
u_013	2017/1/7	60
u_014	2017/1/1	220
u_014	2017/1/2	120
u_014	2017/1/2	20
u_014	2017/1/3	20
u_014	2017/1/4	20
u_014	2017/1/5	250
u_014	2017/1/6	120
u_014	2017/1/7	270
u_014	2017/1/7	20
u_015	2017/1/1	10
u_015	2017/1/2	20
u_015	2017/1/2	10
u_015	2017/1/3	10
u_015	2017/1/4	20
u_015	2017/1/5	70
u_015	2017/1/6	10
u_015	2017/1/7	80
u_015	2017/1/7	60

问题一:
蚂蚁森林植物申领统计
问题:假设2017年1月1日开始记录低碳数据(user_low_carbon),
假设2017年10月1日之前满足申领条件的用户都申领了一颗p004-胡杨,
剩余的能量全部用来领取“p002-沙柳” 。
统计在10月1日累计申领“p002-沙柳” 排名前10
的用户信息;以及他比后一名多领了几颗沙柳。

1.时间转换,将2017/1/3 转换成 2017-01-03

SELECT
     user_id,
     DATE_FORMAT(regexp_replace(data_dt,'/','-'),'yyyy-MM-dd') data_dt,
     low_carbon
FROM
     user_low_carbon  t1

在这里插入图片描述
2.计算时间在2017.10.1号之前的能量总和

SELECT
    t1.user_id,
    SUM(low_carbon) sum_low_carbon
FROM
(
   SELECT
     user_id,
     DATE_FORMAT(regexp_replace(data_dt,'/','-'),'yyyy-MM-dd') data_dt,
     low_carbon
   FROM
     user_low_carbon
)
t1
WHERE t1.data_dt<"2017-10-01"
GROUP BY t1.user_id
ORDER BY sum_low_carbon DESC LIMIT 11   t2
+-------------+-----------------+
| t1.user_id  | sum_low_carbon  |
+-------------+-----------------+
| u_007       | 1470            |
| u_013       | 1430            |
| u_008       | 1240            |
| u_005       | 1100            |
| u_010       | 1080            |
| u_014       | 1060            |
| u_011       | 960             |
| u_009       | 930             |
| u_006       | 830             |
| u_002       | 659             |
| u_004       | 640             |
+-------------+-----------------+

3.求出每个人领取的沙柳的数量 (先减去胡杨的能量,在除以沙柳的能量)
注意:最后相除的结果要进行向下转型

SELECT
   t2.user_id,
   t2.sum_low_carbon,
   t2.sum_low_carbon-(SELECT low_carbon  FROM plant_carbon WHERE plant_id='p004') aa ,
   FLOOR((t2.sum_low_carbon-(SELECT low_carbon  FROM plant_carbon WHERE plant_id='p004'))/(SELECT low_carbon  FROM plant_carbon WHERE plant_id='p002'))  bb
FROM  t2

组合

SELECT
   t2.user_id,
   t2.sum_low_carbon,
   t2.sum_low_carbon-(SELECT low_carbon  FROM plant_carbon WHERE plant_id='p004') aa,
   FLOOR((t2.sum_low_carbon-(SELECT low_carbon  FROM plant_carbon WHERE plant_id='p004'))/(SELECT low_carbon  FROM plant_carbon WHERE plant_id='p002'))  bb

FROM (
SELECT
    t1.user_id,
    SUM(low_carbon) sum_low_carbon
FROM
(
   SELECT
     user_id,
     DATE_FORMAT(regexp_replace(data_dt,'/','-'),'yyyy-MM-dd') data_dt,
     low_carbon
   FROM
     user_low_carbon
)
t1
WHERE t1.data_dt<"2017-10-01"
GROUP BY t1.user_id
ORDER BY sum_low_carbon DESC LIMIT 11
) t2      t3
+-------------+--------------------+-------+-----+
| t2.user_id  | t2.sum_low_carbon  |  aa   | bb  |
+-------------+--------------------+-------+-----+
| u_007       | 1470               | 1255  | 66  |
| u_013       | 1430               | 1215  | 63  |
| u_008       | 1240               | 1025  | 53  |
| u_005       | 1100               | 885   | 46  |
| u_010       | 1080               | 865   | 45  |
| u_014       | 1060               | 845   | 44  |
| u_011       | 960                | 745   | 39  |
| u_009       | 930                | 715   | 37  |
| u_006       | 830                | 615   | 32  |
| u_002       | 659                | 444   | 23  |
| u_004       | 640                | 425   | 22  |
+-------------+--------------------+-------+-----+


4.把下一行提取到前一行

    -- 把下一行提取到前面
SELECT 
    t3.user_id,
    t3.sum_low_carbon,
    t3.aa,
    t3.bb,
    lead(t3.bb,1) over(ORDER BY t3.bb DESC) next_sl
FROM t3
LIMIT 10

组合

SELECT 
    t3.user_id,
    t3.sum_low_carbon,
    t3.aa,
    t3.bb,
    lead(t3.bb,1) over(ORDER BY t3.bb DESC) next_sl
FROM 
(
SELECT
   t2.user_id,
   t2.sum_low_carbon,
   t2.sum_low_carbon-(SELECT low_carbon  FROM plant_carbon WHERE plant_id='p004') aa,
   FLOOR((t2.sum_low_carbon-(SELECT low_carbon  FROM plant_carbon WHERE plant_id='p004'))/(SELECT low_carbon  FROM plant_carbon WHERE plant_id='p002'))  bb

FROM (
SELECT
    t1.user_id,
    SUM(low_carbon) sum_low_carbon
FROM
(
   SELECT
     user_id,
     DATE_FORMAT(regexp_replace(data_dt,'/','-'),'yyyy-MM-dd') data_dt,
     low_carbon
   FROM
     user_low_carbon
)
t1
WHERE t1.data_dt<"2017-10-01"
GROUP BY t1.user_id
ORDER BY sum_low_carbon DESC LIMIT 11
) t2
)
t3
LIMIT 10
+-------------+--------------------+--------+--------+----------+
| t3.user_id  | t3.sum_low_carbon  | t3.aa  | t3.bb  | next_sl  |
+-------------+--------------------+--------+--------+----------+
| u_007       | 1470               | 1255   | 66     | 63       |
| u_013       | 1430               | 1215   | 63     | 53       |
| u_008       | 1240               | 1025   | 53     | 46       |
| u_005       | 1100               | 885    | 46     | 45       |
| u_010       | 1080               | 865    | 45     | 44       |
| u_014       | 1060               | 845    | 44     | 39       |
| u_011       | 960                | 745    | 39     | 37       |
| u_009       | 930                | 715    | 37     | 32       |
| u_006       | 830                | 615    | 32     | 23       |
| u_002       | 659                | 444    | 23     | 22       |
+-------------+--------------------+--------+--------+----------+

  1. 直接两行相减 即可得出
SELECT 
    t3.user_id,
    t3.sum_low_carbon,
    t3.aa,
    t3.bb,
    lead(t3.bb,1) over(ORDER BY t3.bb DESC) next_sl,
    t3.bb-lead(t3.bb,1) over(ORDER BY t3.bb DESC)  lastnum
FROM 
(
SELECT
   t2.user_id,
   t2.sum_low_carbon,
   t2.sum_low_carbon-(SELECT low_carbon  FROM plant_carbon WHERE plant_id='p004') aa,
   FLOOR((t2.sum_low_carbon-(SELECT low_carbon  FROM plant_carbon WHERE plant_id='p004'))/(SELECT low_carbon  FROM plant_carbon WHERE plant_id='p002'))  bb

FROM (
SELECT
    t1.user_id,
    SUM(low_carbon) sum_low_carbon
FROM
(
   SELECT
     user_id,
     DATE_FORMAT(regexp_replace(data_dt,'/','-'),'yyyy-MM-dd') data_dt,
     low_carbon
   FROM
     user_low_carbon
)
t1
WHERE t1.data_dt<"2017-10-01"
GROUP BY t1.user_id
ORDER BY sum_low_carbon DESC LIMIT 11
) t2
)
t3
LIMIT 10
+-------------+--------------------+--------+--------+----------+----------+
| t3.user_id  | t3.sum_low_carbon  | t3.aa  | t3.bb  | next_sl  | lastnum  |
+-------------+--------------------+--------+--------+----------+----------+
| u_007       | 1470               | 1255   | 66     | 63       | 3        |
| u_013       | 1430               | 1215   | 63     | 53       | 10       |
| u_008       | 1240               | 1025   | 53     | 46       | 7        |
| u_005       | 1100               | 885    | 46     | 45       | 1        |
| u_010       | 1080               | 865    | 45     | 44       | 1        |
| u_014       | 1060               | 845    | 44     | 39       | 5        |
| u_011       | 960                | 745    | 39     | 37       | 2        |
| u_009       | 930                | 715    | 37     | 32       | 5        |
| u_006       | 830                | 615    | 32     | 23       | 9        |
| u_002       | 659                | 444    | 23     | 22       | 1        |
+-------------+--------------------+--------+--------+----------+----------+


  • 1
    点赞
  • 11
    收藏
    觉得还不错? 一键收藏
  • 0
    评论
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值