12.21.1 Window Function Descriptions
12.21.2 Window Function Concepts and Syntax
12.21.3 Window Function Frame Specification
12.21.5 Window Function Restrictions
Frame定义
MySQL中的窗口定义是中包含一个frame定义,frame是当前窗口中的一个子数据集,frame子句便定义了如何定义子数据集(frame可以认为是窗口数据中的一个子数据集,数据帧)。
Frame的范围是根据当前row进行确定的,frame可以使用的场景:
1:通过定义frame计算截止每一订单时刻的总共金额。
2:还可以借助定义frame进行滚动计算平均值。
具体例子如下:
mysql> SELECT
time, subject, val,
SUM(val) OVER (PARTITION BY subject ORDER BY time
ROWS UNBOUNDED PRECEDING)
AS running_total,
AVG(val) OVER (PARTITION BY subject ORDER BY time
ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING)
AS running_average
FROM observations;
+----------+---------+------+---------------+-----------------+
| time | subject | val | running_total | running_average |
+----------+---------+------+---------------+-----------------+
| 07:00:00 | st113 | 10 | 10 | 9.5000 |
| 07:15:00 | st113 | 9 | 19 | 14.6667 |
| 07:30:00 | st113 | 25 | 44 | 18.0000 |
| 07:45:00 | st113 | 20 | 64 | 22.5000 |
| 07:00:00 | xh458 | 0 | 0 | 5.0000 |
| 07:15:00 | xh458 | 10 | 10 | 5.0000 |
| 07:30:00 | xh458 | 5 | 15 | 15.0000 |
| 07:45:00 | xh458 | 30 | 45 | 20.0000 |
| 08:00:00 | xh458 | 25 | 70 | 27.5000 |
+----------+---------+------+---------------+-----------------+
分析1:
按照subject分类计算每一个订单时刻的订单金额:
SUM(val) OVER (PARTITION BY subject ORDER BY time ROWS UNBOUNDED PRECEDING) AS running_total
ROWS UNBOUNDED PRECEDING为frame的定义,含义是每次计算当前行到第一行的所有val的和,例如 :第三行running_total=44的计算为10+9+25。
分析2:
指定frame大小为3进行滚定计算平均值:
AVG(val) OVER (PARTITION BY subject ORDER BY time ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING) AS running_average
ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING) AS running_average为frame的定义,对当前行的前一行,当前行以及当前行的后一行三行进行计算平均值,对于第一行与最后一行的处理会使用两行进行平均值。(这就是一个滚动窗口,窗口大小为3);例如第三行的running_average=18计算过程为:(9+25+20)/3=18 。
Frame语法定义
frame_clause:
frame_units frame_extent
frame_units:
{ROWS | RANGE}
frame的范围有两个单位,分别是ROWS与RANGE,分别是使用行确定frame,另外一个就是使用数值范围。
frame_extent:
{frame_start | frame_between}
frame_between:
BETWEEN frame_start AND frame_end
frame_start, frame_end: {
CURRENT ROW
| UNBOUNDED PRECEDING
| UNBOUNDED FOLLOWING
| expr PRECEDING
| expr FOLLOWING
}
可以使用BETWEEN定义frame的开始与结束范围,frame_start
必须大于frame_end
. 范围定义关键字含义:
-
CURRENT ROW
: ForROWS
, the bound is the current row. ForRANGE
, the bound is the peers of the current row. -
UNBOUNDED PRECEDING
: The bound is the first partition row. -
UNBOUNDED FOLLOWING
: The bound is the last partition row. -
: Forexpr
PRECEDINGROWS
, the bound isexpr
rows before the current row. ForRANGE
, the bound is the rows with values equal to the current row value minusexpr
; if the current row value isNULL
, the bound is the peers of the row. -
: Forexpr
FOLLOWINGROWS
, the bound isexpr
rows after the current row. ForRANGE
, the bound is the rows with values equal to the current row value plusexpr
; if the current row value isNULL
, the bound is the peers of the row.
示例:
数据集:
查询语句:
select *, sum(cnt) over w as from_7_days_ago_to_current_cnt
from open_id_to_pin
window w as (partition by open_id order by expire_time desc RANGE between INTERVAL 7 DAY PRECEDING and current row );
解读:
根据open_id划分窗口之后按照expire_time排序,frame的范围确定是根据当前行的expire_time开始到当前expire_time-7天作为结束的frame,查询结果为: