impala-1

最新推荐文章于 2022-02-28 14:15:50 发布

leoZ231

最新推荐文章于 2022-02-28 14:15:50 发布

阅读量2.1k

点赞数

分类专栏：故纸堆

本文链接：https://blog.csdn.net/jiangfuofu555/article/details/48374547

版权

故纸堆专栏收录该内容

88 篇文章 0 订阅

订阅专栏

impala-1

分析函数

over

row_number()：
select goods_price,row_number() over(order by goods_price desc) as id from h2_td_url_price limit 10;
max(x):
select spread_name, goods_price, max(goods_price) over (partition by spread_name) from h2_td_url_price order by spread_name;
窗口子句受 AVG()、COUNT()、FIRST_VALUE()、LAST_VALUE() 和 SUM() 函数支持。对于 MAX() 和 MIN()，允许窗口子句的前提是开始边界为UNBOUNDED PRECEDING。
ROWS BETWEEN [ { m | UNBOUNDED } PRECEDING | CURRENT ROW] [ AND [CURRENT ROW | { UNBOUNDED | n } FOLLOWING] ]的使用：

比如我想看收入的每天收入的5日累积变化可以像下面这么写：

select dt as '日期', price as '收入',
    sum(price) over (partition by spread_name order by dt rows between current row and 1 following) as '+1收入',
    sum(price) over (partition by spread_name order by dt rows between current row and 2 following) as '+2收入',
    sum(price) over (partition by spread_name order by dt rows between current row and 3 following) as '+3收入',
    sum(price) over (partition by spread_name order by dt rows between current row and 4 following) as '+4收入',
    sum(price) over (partition by spread_name order by dt rows between current row and 5 following) as '+5收入'
from (
    select dt,spread_name,sum(goods_price) price
    from h2_td_url_price
    where spread_name='广点通'
    group by dt,spread_name
) t
order by dt;

结果

+------------+-------+--------+--------+--------+--------+--------+
| 日期       | 收入  | +1收入 | +2收入 | +3收入 | +4收入 | +5收入 |
+------------+-------+--------+--------+--------+--------+--------+
| 2015-08-27 | 290   | 3180   | 7818   | 13167  | 19627  | 28918  |
| 2015-08-28 | 2890  | 7528   | 12877  | 19337  | 28628  | 37639  |
| 2015-08-29 | 4638  | 9987   | 16447  | 25738  | 34749  | 40750  |
| 2015-08-30 | 5349  | 11809  | 21100  | 30111  | 36112  | 41393  |
| 2015-08-31 | 6460  | 15751  | 24762  | 30763  | 36044  | 48733  |
| 2015-09-01 | 9291  | 18302  | 24303  | 29584  | 42273  | 51612  |
| 2015-09-02 | 9011  | 15012  | 20293  | 32982  | 42321  | 49526  |
| 2015-09-03 | 6001  | 11282  | 23971  | 33310  | 40515  | 53029  |
| 2015-09-04 | 5281  | 17970  | 27309  | 34514  | 47028  | 57571  |
| 2015-09-05 | 12689 | 22028  | 29233  | 41747  | 52290  | 52290  |
| 2015-09-06 | 9339  | 16544  | 29058  | 39601  | 39601  | 39601  |
| 2015-09-07 | 7205  | 19719  | 30262  | 30262  | 30262  | 30262  |
| 2015-09-08 | 12514 | 23057  | 23057  | 23057  | 23057  | 23057  |
| 2015-09-09 | 10543 | 10543  | 10543  | 10543  | 10543  | 10543  |
+------------+-------+--------+--------+--------+--------+--------+

select dt as '日期', price as '收入',
    sum(price) over (partition by spread_name order by dt range between unbounded preceding and current row) as '累积当今收入',
    sum(price) over (partition by spread_name order by dt range between unbounded preceding and unbounded following) as '累积收入',
    sum(price) over (partition by spread_name order by dt range between current row and unbounded following ) as '当日至今收入'
from (
    select dt,spread_name,sum(goods_price) price
    from h2_td_url_price
    where spread_name='广点通'
    group by dt,spread_name
) t
order by dt;

+------------+-------+--------------+----------+--------------+
| 日期       | 收入  | 累积当今收入 | 累积收入 | 当日至今收入 |
+------------+-------+--------------+----------+--------------+
| 2015-08-27 | 290   | 290          | 101501   | 101501       |
| 2015-08-28 | 2890  | 3180         | 101501   | 101211       |
| 2015-08-29 | 4638  | 7818         | 101501   | 98321        |
| 2015-08-30 | 5349  | 13167        | 101501   | 93683        |
| 2015-08-31 | 6460  | 19627        | 101501   | 88334        |
| 2015-09-01 | 9291  | 28918        | 101501   | 81874        |
| 2015-09-02 | 9011  | 37929        | 101501   | 72583        |
| 2015-09-03 | 6001  | 43930        | 101501   | 63572        |
| 2015-09-04 | 5281  | 49211        | 101501   | 57571        |
| 2015-09-05 | 12689 | 61900        | 101501   | 52290        |
| 2015-09-06 | 9339  | 71239        | 101501   | 39601        |
| 2015-09-07 | 7205  | 78444        | 101501   | 30262        |
| 2015-09-08 | 12514 | 90958        | 101501   | 23057        |
| 2015-09-09 | 10543 | 101501       | 101501   | 10543        |
+------------+-------+--------------+----------+--------------+

rank和dense_rank的区别

dense_rank
返回整数的升序顺序，从 1 开始。输出序列为 ORDER BY 表达式的重复值生成重复整数。为”连接的” 输入值生成重复输出值后，函数通过下一个更高的整数来继续序列。
因此，在输入包含重复值时，序列将包含重复值但无间隔。启动通过 PARTITIONED BY 子句生成的每个组的序列。
rank
返回整数的升序顺序，从 1 开始。输出序列为 ORDER BY 表达式的重复值生成重复整数。为”连接的” 输入值生成重复输出值后，函数按连接值的数量增加序列。
因此，在输入包含重复值时，序列将包含重复和间隔。启动通过 PARTITIONED BY 子句生成的每个组的序列。
举个例子:

select spread_name, 
       row_number() over(order by spread_name) id, 
       rank() over (order by spread_name) rank_id,
       dense_rank() over (order by spread_name) dense_id
from  h2_td_url_price group by spread_name,appkey
order by spread_name;

+---------------------+----+---------+----------+
| spread_name         | id | rank_id | dense_id |
+---------------------+----+---------+----------+
| adSage              | 1  | 1       | 1        |
| adwords             | 2  | 2       | 2        |
| appTao              | 3  | 3       | 3        |
| cocounion           | 4  | 4       | 4        |
| gdt                 | 5  | 5       | 5        |
| tendcloud           | 6  | 6       | 6        |
| 广点通              | 7  | 7       | 7        |
| 广点通              | 8  | 7       | 7        |
| 搜狗-无线专区0316   | 9  | 9       | 8        |
| 搜狗-无线关键字0316 | 10 | 10      | 9        |
| 百度-无线专区0316   | 11 | 11      | 10       |
| 百度-无线关键字0316 | 12 | 12      | 11       |
| 限时免费大全        | 13 | 13      | 12       |
+---------------------+----+---------+----------+

lead和leg的用法

select dt,sum(goods_price) as '收入',
        lead(sum(goods_price),1) over (order by dt) as '加一日收入',
        lag(sum(goods_price),1) over (order by dt) as '减一日收入'
from h2_td_url_price
group by dt;

+------------+-------+------------+------------+
| dt         | 收入  | 加一日收入 | 减一日收入 |
+------------+-------+------------+------------+
| 2015-08-27 | 985   | 3954       | NULL       |
| 2015-08-28 | 3954  | 7034       | 985        |
| 2015-08-29 | 7034  | 6281       | 3954       |
| 2015-08-30 | 6281  | 7199       | 7034       |
| 2015-08-31 | 7199  | 11874      | 6281       |
| 2015-09-01 | 11874 | 9637       | 7199       |
| 2015-09-02 | 9637  | 7277       | 11874      |
| 2015-09-03 | 7277  | 7652       | 9637       |
| 2015-09-04 | 7652  | 13592      | 7277       |
| 2015-09-05 | 13592 | 11036      | 7652       |
| 2015-09-06 | 11036 | 9399       | 13592      |
| 2015-09-07 | 9399  | 13299      | 11036      |
| 2015-09-08 | 13299 | 11928      | 9399       |
| 2015-09-09 | 11928 | NULL       | 13299      |
+------------+-------+------------+------------+

leoZ231

关注

0
点赞
踩
1

收藏

觉得还不错? 一键收藏
0
评论
impala-1

impala-1分析函数overrow_number()： select goods_price,row_number() over(order by goods_price desc) as id from h2_td_url_price limit 10;max(x): select spread_name, goods_price, max(goods_price) over (par
复制链接

扫一扫