Hive preceding and following理解
在了解hive开窗函数前我们来看看Hive窗口函数preceding and following是怎么回事呢.
Hive窗口函数中,有一个功能是统计当前行之前或之后指定行作为一个聚合,关键字是 preceding 和 following,举例说明其使用方法.
常规的窗口函数比较简单,这里介绍一下分组的,重点是分组排序之后的rows between用法。
关键是理解rows between中关键字含义:
关键字 | 含义 |
---|---|
preceding | 往前 |
following | 往后 |
current row | 当前行 |
unbounded | 开始行 |
unbounded preceding | 表示从前面的起点 |
unbounded following | 表示到后面的终点 |
案例
select country,time,charge,
max(charge) over (partition by country order by time) as normal,
max(charge) over (partition by country order by time rows between unbounded preceding and current row) as unb_pre_cur,
max(charge) over (partition by country order by time rows between 2 preceding and 1 following) as pre2_fol1,
max(charge) over (partition by country order by time rows between current row and unbounded following) as cur_unb_fol
from temp
注意:默认是在分组类的当前行之前的行中计算。
rows between unbounded preceding and current row和默认的一样
rows between 2 preceding and 1 following表示在当前行的前2行和后1行中计算
rows between current row and unbounded following表示在当前行和到最后行中计算
rows between对于avg、min、max、sum这几个窗口函数的含义基本是一致的,注意查看当前结果
在 hive 环境中创建临时表
create table tmp_student
(
name string,
class tinyint,
cooperator_name string,
score tinyint
)row format delimited fields terminated by '|';
加载测试数据
load data local inpath ‘text.txt’ into table tmp_student;
其中text.txt中内容为:
adf|3|测试公司1|45
xx|3|测试公司2|55
cfe|2|测试公司2|74
3dd|3|测试公司5|n
fda|1|测试公司7|80
gds|2|测试公司9|92
ffd|1|测试公司10|95
dss|1|测试公司4|95
ddd|3|测试公司3|99
gf|3|测试公司9|99
查看是否加载成功
hive> select * from tmp_student;
adf 3 测试公司1 45
xx 3 测试公司2 55
cfe 2 测试公司2 74
3dd 3 测试公司5 NULL
fda 1 测试公司7 80
gds 2 测试公司9 92
ffd 1 测试公司10 95
dss 1 测试公司4 95
ddd 3 测试公司3 99
gf 3 测试公司9 99
Time taken: 1.314 seconds, Fetched: 10 row(s)
下面来了解preceding and following函数用法,执行下面sql
select
name,
score,
sum(score) over(order by score range between 2 preceding and 2 following) s1, -- 当前行的score值加减2的范围内的所有行
sum(score) over(order by score rows between 2 preceding and 2 following) s2, -- 当前行+前后2行,一共5行
sum(score) over(order by score range between unbounded preceding and unbounded following) s3, -- 全部行,不做限制
sum(score) over(order by score rows between unbounded preceding and unbounded following) s4, -- 全部行,不做限制
sum(score) over(order by score) s5, -- 第一行到当前行(和当前行相同score值的所有行都会包含进去)
sum(score) over(order by score rows between unbounded preceding and current row) s6, -- 第一行到当前行(和当前行相同score值的其他行不会包含进去,这是和上面的区别)
sum(score) over(order by score rows between 3 preceding and current row) s7, -- 当前行+往前3行
sum(score) over(order by score rows between 3 preceding and 1 following) s8, --当前行+往前3行+往后1行
sum(score) over(order by score rows between current row and unbounded following) s9 --当前行+往后所有行
from
tmp.tmp_student
order by
score;
注意:
当ORDER BY后面缺少窗口从句条件,窗口规范默认是 RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW.
当ORDER BY和窗口从句都缺失, 窗口规范默认是 ROW BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING.
rows是物理窗口,是哪一行就是哪一行,与当前行的值(order by key的key的值)无关,只与排序后的行号相关,就是我们常规理解的那样。
range是逻辑窗口,与当前行的值有关(order by key的key的值),在key上操作range范围。
得到相关结果如下
通过上面的练习我们主要是对preceding and following有了一个比较全面的理解,所谓开窗函数其实就相当于flink中的滚动窗口,统计分析都是基于这个滚动窗口内完成的,所有的聚合计算统计都需要先根据range或者row确定窗口内的数据,然后就很容易得到正确的计算结果,在确定行数的过程中需要根据range和row确定是逻辑范围还是物理范围,最终都可以看作是第N行到第M行内数据的聚合统计.
Flink窗口说明
窗口函Windowing functions
-
FIRST_VALUE(col, bool DEFAULT)
返回分组窗口内第一行col的值,DEFAULT默认为false,如果指定为true,则跳过NULL后再取值,对于FIRST_VALUE每个分组第一行数据的FIRST_VALUE(col, bool DEFAULT) 就等于col,接下来几行数据会参考第一行数据是否为NULL根据True/False进行取舍.
WITH tmp AS (
SELECT 1 AS group_id, 'a' AS col
UNION ALL
SELECT 1 AS group_id, 'b' AS col
UNION ALL
SELECT 1 AS group_id, 'c' AS col
UNION ALL
SELECT 2 AS group_id, NULL AS col
UNION ALL
SELECT 2 AS group_id, 'e' AS col
)
SELECT group_id, col, FIRST_VALUE(col) OVER (PARTITION BY group_id ORDER BY col) AS col_new
FROM tmp;
返回结果为:
group_id col col_new
1 a a
1 b a
1 c a
2 NULL NULL
2 e NULL
如果是True
WITH tmp AS (
SELECT 1 AS group_id, NULL AS col
UNION ALL
SELECT 1 AS group_id, 'b' AS col
UNION ALL
SELECT 1 AS group_id, 'c' AS col
UNION ALL
SELECT 2 AS group_id, NULL AS col
UNION ALL
SELECT 2 AS group_id, 'e' AS col
)
SELECT group_id, col, FIRST_VALUE(col, true) OVER (PARTITION BY group_id ORDER BY col) AS col_new
FROM tmp;
返回结果为:
group_id col col_new
1 NULL NULL
1 b b
1 c b
2 NULL NULL
2 e e
- LAST_VALUE(col, bool DEFAULT)
返回分组窗口内第后一行col的值,DEFAULT默认为false,如果指定为true,则跳过NULL后再取值.
WITH tmp AS (
SELECT 1 AS group_id, 'a' AS col
UNION ALL
SELECT 1 AS group_id, NULL AS col
UNION ALL
SELECT 1 AS group_id, 'c' AS col
UNION ALL
SELECT 2 AS group_id, 'd' AS col
UNION ALL
SELECT 2 AS group_id, 'e' AS col
)
SELECT group_id, col, LAST_VALUE(col) OVER (PARTITION BY group_id ORDER BY col DESC) AS col_new FROM tmp;
返回结果为:
group_id col col_new
1 c c
1 a a
1 NULL NULL
2 e e
2 d d
如果是True
WITH tmp AS (
SELECT 1 AS group_id, 'a' AS col
UNION ALL
SELECT 1 AS group_id, NULL AS col
UNION ALL
SELECT 1 AS group_id, 'c' AS col
UNION ALL
SELECT 2 AS group_id, 'd' AS col
UNION ALL
SELECT 2 AS group_id, 'e' AS col
)
SELECT group_id, col, LAST_VALUE(col, true) OVER (ORDER BY group_id,col DESC ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING) AS col_new FROM tmp;
返回结果为:
group_id col col_new
1 c a
1 a a
1 NULL e
2 e d
2 d d
开窗函数不同于group by函数,开窗函数能够把所有的记录都显示出来,一般select所选择的列也都与over里面的分组和排序字段相同,这样才能比较清楚地看到当前记录在聚合函数中的区别和贡献,上面两个窗口函数我们针对最后一个案例进行说明下.
使用了开窗函数首先要确定窗口的大小,根据上面的PRECEDING和FOLLOWING讲解我们可以知道在分析时候窗口大小为[前一行,当前行,后一行],那么对于第一行1 c取出last_value就是从[空值,c,a]取出集合中最后一个就是a,同理对于第二行1 a取出last_value就是从[c,a,Null]中取出最后一个Null跳过再取得到a,对于2 e从集合[Null,e,d]last_value=d*
- LEAD(col, n, DEFAULT)
返回分组窗口内往下第n行col的值,n默认为1,往下第n没有时返回DEFAULT(DEFAULT默认为NULL)使用分组后那么分组之间就不交叉计算.
WITH tmp AS
(
SELECT 1 AS group_id, 'a' AS col
UNION ALL SELECT 1 AS group_id, 'b' AS col
UNION ALL SELECT 1 AS group_id, 'c' AS col
UNION ALL SELECT 2 AS group_id, 'd' AS col
UNION ALL SELECT 2 AS group_id, 'e' AS col
)
SELECT group_id,
col,
LEAD(col) over(partition by group_id order by col) as col_new
FROM tmp;
返回结果
group_id col col_new
1 a b
1 b c
1 c NULL
2 d e
2 e NULL
等同于
WITH tmp AS
(
SELECT 1 AS group_id, 'a' AS col
UNION ALL SELECT 1 AS group_id, 'b' AS col
UNION ALL SELECT 1 AS group_id, 'c' AS col
UNION ALL SELECT 2 AS group_id, 'd' AS col
UNION ALL SELECT 2 AS group_id, 'e' AS col
)
SELECT group_id,
col,
LAST_VALUE(col) over(partition by group_id order by col rows between 1 FOLLOWING and 1 FOLLOWING) as col_new
FROM tmp;
其中rows between 1 FOLLOWING and 1 FOLLOWING为从往后一行开始到往后一行结束=往后一行
返回结果
group_id col col_new
1 a b
1 b c
1 c NULL
2 d e
2 e NULL
使用LEAD默认值
WITH tmp AS
(
SELECT 1 AS group_id, 'a' AS col
UNION ALL SELECT 1 AS group_id, 'b' AS col
UNION ALL SELECT 1 AS group_id, 'c' AS col
UNION ALL SELECT 2 AS group_id, 'd' AS col
UNION ALL SELECT 2 AS group_id, 'e' AS col
)
SELECT group_id,
col,
LEAD(col, 2, 'z') over(partition by group_id order by col) as col_new
FROM tmp;
返回结果
group_id col col_new
1 a c
1 b z
1 c z
2 d z
2 e z
- LAG(col, n, DEFAULT)
返回分组窗口内往上第n行col的值,n默认为1,往上第n没有时返回DEFAULT(DEFAULT默认为NULL)
WITH tmp AS
(
SELECT 1 AS group_id, 'a' AS col
UNION ALL SELECT 1 AS group_id, 'b' AS col
UNION ALL SELECT 1 AS group_id, 'c' AS col
UNION ALL SELECT 2 AS group_id, 'd' AS col
UNION ALL SELECT 2 AS group_id, 'e' AS col
)
SELECT group_id,
col,
LAG(col) over(partition by group_id order by col) as col_new
FROM tmp;
等同于
WITH tmp AS
(
SELECT 1 AS group_id, 'a' AS col
UNION ALL SELECT 1 AS group_id, 'b' AS col
UNION ALL SELECT 1 AS group_id, 'c' AS col
UNION ALL SELECT 2 AS group_id, 'd' AS col
UNION ALL SELECT 2 AS group_id, 'e' AS col
)
SELECT group_id,
col,
FIRST_VALUE(col) over(partition by group_id order by col rows BETWEEN 1 PRECEDING and 1 PRECEDING) as col_new
FROM tmp;
返回结果都是
group_id col col_new
1 a NULL
1 b a
1 c b
2 d NULL
2 e d
使用默认值
WITH tmp AS
(
SELECT 1 AS group_id, 'a' AS col
UNION ALL SELECT 1 AS group_id, 'b' AS col
UNION ALL SELECT 1 AS group_id, 'c' AS col
UNION ALL SELECT 2 AS group_id, 'd' AS col
UNION ALL SELECT 2 AS group_id, 'e' AS col
)
SELECT group_id,
col,
LAG(col, 2, 'zz') over(partition by group_id order by col) as col_new
FROM tmp;
返回结果
group_id col col_new
1 a zz
1 b zz
1 c a
2 d zz
2 e zz
OVER详解 The OVER clause
** FUNCTION(expr) OVER([PARTITION BY statement] [ORDER BY statement] [window clause]) **
中括号为可选参数
FUNCTION:包括标准聚合函数(COUNT/SUM/MIN/MAX/AVG)和一些分析函数(RANK/ROW_NUMBER/DENSE_RANK等)
PARTITION BY:可以由一个或者多个列组成
ORDER BY:可以由一个或者多个列组成
window clause:(ROWS | RANGE) BETWEEN (UNBOUNDED PRECEDING | num PRECEDING | CURRENT ROW) AND (UNBOUNDED PRECEDING | num PRECEDING | CURRENT ROW)
当window clause 未指定时默认为RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW,即分组内第一行至当前行作为窗口
当 window clause和ORDER BY都未指定时,默认为ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
**即分组内第一行至最后一行作为窗口.**
标准聚合函数
COUNT(expr) OVER()
返回窗口内行数
WITH tmp AS
(
SELECT 1 AS group_id, 'a' AS col
UNION ALL SELECT 1 AS group_id, 'b' AS col
UNION ALL SELECT 1 AS group_id, 'c' AS col
UNION ALL SELECT 2 AS group_id, 'e' AS col
UNION ALL SELECT 2 AS group_id, 'e' AS col
)
SELECT group_id,
col,
count(col) over(partition by group_id) as cnt1,
count(col) over(partition by group_id order by col) as cnt2,
count(col) over(partition by group_id order by col rows between CURRENT ROW and UNBOUNDED following) as cnt3,
count(distinct col) over(partition by group_id order by col rows between CURRENT ROW and UNBOUNDED following) as cnt4
FROM tmp;
返回结果为
group_id col cnt1 cnt2 cnt3 cnt4
1 a 3 1 3 3
1 b 3 2 2 2
1 c 3 3 1 1
2 e 2 2 2 1
2 e 2 2 1 1
SUM(expr) OVER()
返回窗口内求和值
WITH tmp AS
(
SELECT 1 AS group_id, 1 AS col
UNION ALL SELECT 1 AS group_id, 2 AS col
UNION ALL SELECT 1 AS group_id, 3 AS col
UNION ALL SELECT 2 AS group_id, 4 AS col
UNION ALL SELECT 2 AS group_id, 4 AS col
)
SELECT group_id,
col,
SUM(col) over(partition by group_id) as sum1,
SUM(col) over(partition by group_id order by col) as sum2,
SUM(col) over(partition by group_id order by col rows between CURRENT ROW and UNBOUNDED following) as sum3,
SUM(distinct col) over(partition by group_id order by col rows between CURRENT ROW and UNBOUNDED following) as sum4
FROM tmp;
返回结果为
group_id col sum1 sum2 sum3 sum4
1 1 6 1 6 6
1 2 6 3 5 5
1 3 6 6 3 3
2 4 8 8 8 4
2 4 8 8 4 4
MIN(expr) OVER()
返回窗口内最小值
WITH tmp AS
(
SELECT 1 AS group_id, 1 AS col
UNION ALL SELECT 1 AS group_id, 2 AS col
UNION ALL SELECT 1 AS group_id, 3 AS col
UNION ALL SELECT 2 AS group_id, 4 AS col
UNION ALL SELECT 2 AS group_id, 5 AS col
)
SELECT group_id,
col,
MIN(col) over(partition by group_id) as min1,
MIN(col) over(partition by group_id order by col) as min2,
MIN(col) over(partition by group_id order by col rows between CURRENT ROW and UNBOUNDED following) as min3
FROM tmp;
group_id col min1 min2 min3
1 1 1 1 1
1 2 1 1 2
1 3 1 1 3
2 4 4 4 4
2 5 4 4 5
MAX(expr) OVER()
返回窗口内最大值
WITH tmp AS
(
SELECT 1 AS group_id, 1 AS col
UNION ALL SELECT 1 AS group_id, 2 AS col
UNION ALL SELECT 1 AS group_id, 3 AS col
UNION ALL SELECT 2 AS group_id, 4 AS col
UNION ALL SELECT 2 AS group_id, 5 AS col
)
SELECT group_id,
col,
MAX(col) over(partition by group_id) as max1,
MAX(col) over(partition by group_id order by col) as max2,
MAX(col) over(partition by group_id order by col rows between CURRENT ROW and UNBOUNDED following) as max3
FROM tmp;
返回结果为
group_id col max1 max2 max3
1 1 3 1 3
1 2 3 2 3
1 3 3 3 3
2 4 5 4 5
2 5 5 5 5
AVG(expr) OVER()
返回窗口内平均值
WITH tmp AS
(
SELECT 1 AS group_id, 1 AS col
UNION ALL SELECT 1 AS group_id, 2 AS col
UNION ALL SELECT 1 AS group_id, 3 AS col
UNION ALL SELECT 2 AS group_id, 4 AS col
UNION ALL SELECT 2 AS group_id, 4 AS col
)
SELECT group_id,
col,
AVG(col) over(partition by group_id) as avg1,
AVG(col) over(partition by group_id order by col) as avg2,
AVG(col) over(partition by group_id order by col rows between CURRENT ROW and UNBOUNDED following) as avg3,
AVG(distinct col) over(partition by group_id order by col rows between CURRENT ROW and UNBOUNDED following) as avg4
FROM tmp;
返回结果为
|group_id|col|avg1|avg2|avg3|avg4|
|1|1|2.0|1.0|2.0|2.0|
|1|2|2.0|1.5|2.5|2.5|
|1|3|2.0|2.0|3.0|3.0|
|2|4|4.0|4.0|4.0|4.0|
|2|4|4.0|4.0|4.0|4.0|
分析函数 Analytics functions
RANK() OVER()
返回分组内排名(不支持自定义窗口)
WITH tmp AS
(
SELECT 1 AS group_id, 1 AS col
UNION ALL SELECT 1 AS group_id, 3 AS col
UNION ALL SELECT 1 AS group_id, 3 AS col
UNION ALL SELECT 2 AS group_id, 4 AS col
UNION ALL SELECT 2 AS group_id, 5 AS col
)
SELECT group_id,
col,
RANK() over(partition by group_id order by col desc) as r
FROM tmp;
返回结果为
|group_id|col|r|
|1|3|1|
|1|3|1|
|1|1|3|
|2|5|1|
|2|4|2|
ROW_NUMBER() OVER()
返回分组内行号(不支持自定义窗口)
WITH tmp AS
(
SELECT 1 AS group_id, 1 AS col
UNION ALL SELECT 1 AS group_id, 3 AS col
UNION ALL SELECT 1 AS group_id, 3 AS col
UNION ALL SELECT 2 AS group_id, 4 AS col
UNION ALL SELECT 2 AS group_id, 5 AS col
)
SELECT group_id,
col,
ROW_NUMBER() over(partition by group_id order by col desc) as r
FROM tmp;
返回结果为
|group_id|col|r|
|1|3|1|
|1|3|2|
|1|1|3|
|2|5|1|
|2|4|2|
DENSE_RANK() OVER()
返回分组内排名(排名相等不会留下空位,不支持自定义窗口)
WITH tmp AS
(
SELECT 1 AS group_id, 1 AS col
UNION ALL SELECT 1 AS group_id, 3 AS col
UNION ALL SELECT 1 AS group_id, 3 AS col
UNION ALL SELECT 2 AS group_id, 4 AS col
UNION ALL SELECT 2 AS group_id, 5 AS col
)
SELECT group_id,
col,
DENSE_RANK() over(partition by group_id order by col desc) as r
FROM tmp;
返回结果为
|group_id|col|r|
|1|3|1|
|1|3|1|
|1|1|2|
|2|5|1|
|2|4|2|
CUME_DIST() OVER()
返回分组内累计分布值,即分组内小于(或者大于)等于当前值行数/分组内总行数
WITH tmp AS
(
SELECT 1 AS group_id, 1 AS col
UNION ALL SELECT 1 AS group_id, 3 AS col
UNION ALL SELECT 1 AS group_id, 3 AS col
UNION ALL SELECT 2 AS group_id, 4 AS col
UNION ALL SELECT 2 AS group_id, 5 AS col
)
SELECT group_id,
col,
CUME_DIST() over(partition by group_id order by col asc) as d1,
CUME_DIST() over(partition by group_id order by col desc) as d2
FROM tmp;
返回结果为
|group_id|col|d1|d2|
|1|3|1.0|0.6666666666666666|
|1|3|1.0|0.6666666666666666|
|1|1|0.3333333333333333|1.0|
|2|5|1.0|0.5|
|2|4|0.5|1.0|
PERCENT_RANK() OVER()
返回百分比排序值,即分组内当前行的RANK值-1/分组内总行数-1
WITH tmp AS
(
SELECT 1 AS group_id, 1 AS col
UNION ALL SELECT 1 AS group_id, 3 AS col
UNION ALL SELECT 1 AS group_id, 3 AS col
UNION ALL SELECT 2 AS group_id, 4 AS col
UNION ALL SELECT 2 AS group_id, 5 AS col
)
SELECT group_id,
col,
RANK() over(partition by group_id order by col asc) as r1,
PERCENT_RANK() over(partition by group_id order by col asc) as p1,
RANK() over(partition by group_id order by col desc) as r2,
PERCENT_RANK() over(partition by group_id order by col desc) as p2
FROM tmp;
返回结果为
|group_id|col|r1|p1|r2|p2|
|1|3|2|0.5|1|0.0|
|1|3|2|0.5|1|0.0|
|1|1|1|0.0|3|1.0|
|2|5|2|1.0|1|0.0|
|2|4|1|0.0|2|1.0|
NTILE(INTEGER x) OVER()
返回分区编号(将有序分区划分为x个组,称为bucket,并为分区中的每一行分配一个bucket编号)
WITH tmp AS
(
SELECT 1 AS group_id, 1 AS col
UNION ALL SELECT 1 AS group_id, 3 AS col
UNION ALL SELECT 1 AS group_id, 3 AS col
UNION ALL SELECT 1 AS group_id, 3 AS col
UNION ALL SELECT 2 AS group_id, 4 AS col
UNION ALL SELECT 2 AS group_id, 5 AS col
)
SELECT group_id,
col,
NTILE(2) over(partition by group_id order by col asc) as bucket_id
FROM tmp;
返回结果为
|group_id|col|bucket_id|
|1|1|1|
|1|3|1|
|1|3|2|
|1|3|2|
|2|4|1|
|2|5|2|
OVER子句也支持聚合函数
Hive 2.1.0及之后版本,OVER子句也支持聚合函数,如:
WITH tmp AS
(
SELECT 1 AS group_id, 1 AS col
UNION ALL SELECT 1 AS group_id, 3 AS col
UNION ALL SELECT 1 AS group_id, 3 AS col
UNION ALL SELECT 2 AS group_id, 4 AS col
UNION ALL SELECT 2 AS group_id, 5 AS col
)
SELECT group_id,
RANK() over(order by sum(col) desc) as r
FROM tmp
group by group_id;
结果为
|group_id|r|
|2|1|
|1|2|
window clause 的另一种写法
将window子句写在from后面,在over后使用别名进行引用,如下:
WITH tmp AS
(
SELECT 1 AS group_id, 1 AS col
UNION ALL SELECT 1 AS group_id, 2 AS col
UNION ALL SELECT 1 AS group_id, 3 AS col
UNION ALL SELECT 2 AS group_id, 4 AS col
UNION ALL SELECT 2 AS group_id, 4 AS col
)
SELECT group_id,
col,
AVG(col) over w1 as avg1,
AVG(distinct col) over(partition by group_id order by col rows between CURRENT ROW and UNBOUNDED following) as avg2
FROM tmp
WINDOW w1 AS (partition by group_id order by col rows between CURRENT ROW and UNBOUNDED following);
结果为
|group_id|col|avg1|avg2|
|1|1|2.0|2.0|
|1|2|2.5|2.5|
|1|3|3.0|3.0|
|2|4|4.0|4.0|
|2|4|4.0|4.0|
WITH tmp AS
(
SELECT 1 AS group_id, 1 AS col
UNION ALL SELECT 1 AS group_id, 2 AS col
UNION ALL SELECT 1 AS group_id, 3 AS col
UNION ALL SELECT 2 AS group_id, 4 AS col
UNION ALL SELECT 2 AS group_id, 4 AS col
)
SELECT group_id,
col,
AVG(col) over w1 as avg1,
AVG(distinct col) over w2 as avg2
FROM tmp
WINDOW w1 AS (partition by group_id order by col rows between CURRENT ROW and UNBOUNDED following),
w2 AS (partition by group_id order by col rows between CURRENT ROW and UNBOUNDED following);
结果为
|group_id|col|avg1|avg2|
|1|1|2.0|2.0|
|1|2|2.5|2.5|
|1|3|3.0|3.0|
|2|4|4.0|4.0|
|2|4|4.0|4.0|
本文完.
Any suggestions and criticisms will be sincerely welcomed.
资料
https://blog.csdn.net/happyrocking/article/details/105369558
https://docs.aws.amazon.com/redshift/latest/dg/redshift
https://www.jianshu.com/p/3f3cf58472ca
https://www.cnblogs.com/hyunbar/p/13524855.html
https://blog.csdn.net/weixin_42307036/article/details/112381387