Oracle 开窗函数

Oracle 开窗函数(Window Functions)允许在不合并行的前提下对数据进行复杂分析,常用于排名、累计计算、前后行对比等场景。

一、核心语法结构

函数名() OVER (
  [PARTITION BY 分区列] 
  [ORDER BY 排序列 [ASC|DESC]] 
  [窗口帧子句 (ROWS | RANGE BETWEEN ... AND ...)]
)

PARTITION BY: 将数据划分为独立的分区,函数在每个分区内单独计算。

ORDER BY: 定义分区内的排序规则。

窗口帧子句: 指定计算范围(如当前行前 N 行、后 N 行等)。

二、常用开窗函数分类
1. 排名函数
(1) ROW_NUMBER()
功能: 为每行分配唯一的连续序号(相同值序号不同)。

-- 语法:
ROW_NUMBER() OVER ([PARTITION BY col1] ORDER BY col2)

-- 示例:
SELECT 
  name, score,
  ROW_NUMBER() OVER (ORDER BY score DESC) AS rank
FROM students;

结果:

name	score	rank
Alice	95    	1
Bob 	90  	2
Carol	90  	3

(2) RANK()
**功能:**允许并列排名,后续序号跳跃(如 1, 2, 2, 4)。

-- 语法:
RANK() OVER ([PARTITION BY col1] ORDER BY col2)

-- 示例:
SELECT 
  name, score,
  RANK() OVER (ORDER BY score DESC) AS rank
FROM students;

结果:

name	score	rank
Alice	95	    1
Bob   	90	    2
Carol	90	    2
Dave	85	    4

(3) DENSE_RANK()
**功能:**允许并列排名,后续序号连续(如 1, 2, 2, 3)。

-- 语法:
DENSE_RANK() OVER ([PARTITION BY col1] ORDER BY col2)

-- 示例:
SELECT 
  name, score,
  DENSE_RANK() OVER (ORDER BY score DESC) AS rank
FROM students;

结果:

name	score	rank
Alice	95	    1
Bob 	90    	2
Carol	90  	2
Dave	85	    3

2. 聚合函数 + 窗口
(1) SUM()
功能: 计算分区内的总和。

-- 语法:
SUM(col) OVER ([PARTITION BY col1] [ORDER BY col2] [窗口帧子句])

示例(累计销售额):
SELECT 
  month, sales,
  SUM(sales) OVER (ORDER BY month) AS cumulative_sales
FROM monthly_sales;

结果:

month	sales	cumulative_sales
2023-01	100	        100
2023-02	200	        300
2023-03	150     	450

(2) AVG()
功能: 计算分区内的平均值。

-- 语法:
AVG(col) OVER ([PARTITION BY col1] [ORDER BY col2] [窗口帧子句])

-- 示例(部门平均工资):
SELECT 
  department, salary,
  AVG(salary) OVER (PARTITION BY department) AS avg_salary
FROM employees;

(3) COUNT()
功能: 计算分区内的行数。

-- 语法:
COUNT(col) OVER ([PARTITION BY col1] [ORDER BY col2] [窗口帧子句])

-- 示例(统计每个部门的员工数):
SELECT 
  department, employee_id,
  COUNT(*) OVER (PARTITION BY department) AS dept_emp_count
FROM employees;

3. 前后行分析函数
(1) LAG(col, n, default)
功能: 获取当前行前第 n 行的值(默认 n=1)。

-- 语法:
LAG(col, n, default) OVER ([PARTITION BY col1] ORDER BY col2)

-- 示例(计算环比增长):
SELECT 
  month, sales,
  LAG(sales, 1, 0) OVER (ORDER BY month) AS prev_sales,
  sales - LAG(sales, 1, 0) OVER (ORDER BY month) AS growth
FROM monthly_sales;

(2) LEAD(col, n, default)
功能: 获取当前行后第 n 行的值(默认 n=1)。

-- 语法:
LEAD(col, n, default) OVER ([PARTITION BY col1] ORDER BY col2)
-- 示例(预测下月销售额):
SELECT 
  month, sales,
  LEAD(sales, 1, 0) OVER (ORDER BY month) AS next_sales
FROM monthly_sales;

4. 首尾值函数
(1) FIRST_VALUE(col)
功能: 返回窗口内的第一个值。

-- 语法:
FIRST_VALUE(col) OVER ([PARTITION BY col1] ORDER BY col2 [窗口帧子句])
-- 示例(获取部门最高工资):
SELECT 
  department, salary,
  FIRST_VALUE(salary) OVER (PARTITION BY department ORDER BY salary DESC) AS highest_salary
FROM employees;

(2) LAST_VALUE(col)
功能: 返回窗口内的最后一个值(需结合窗口帧子句)。

-- 语法:
LAST_VALUE(col) OVER ([PARTITION BY col1] ORDER BY col2 [窗口帧子句])
-- 示例(默认范围需扩展至末尾):
SELECT 
  month, sales,
  LAST_VALUE(sales) OVER (
    ORDER BY month 
    ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING
  ) AS last_sales
FROM monthly_sales;

(3) NTH_VALUE(col, n)
功能: 返回窗口内的第 n 个值。

-- 语法:
NTH_VALUE(col, n) OVER ([PARTITION BY col1] ORDER BY col2 [窗口帧子句])

-- 示例(获取每个部门第二高的工资):
SELECT 
  department, salary,
  NTH_VALUE(salary, 2) OVER (
    PARTITION BY department 
    ORDER BY salary DESC 
    ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
  ) AS second_highest
FROM employees;

5. 分布函数
(1) NTILE(n)
功能: 将数据分为 n 个桶,并为每行分配桶号。

-- 语法:
NTILE(n) OVER ([PARTITION BY col1] ORDER BY col2)

--示例(将学生分为三等):
SELECT 
  student_id, score,
  NTILE(3) OVER (ORDER BY score DESC) AS tier
FROM students;

(2) CUME_DIST()
功能: 计算累积分布值(当前行值在分区内的相对位置,范围 [0,1])。

-- 语法:
CUME_DIST() OVER ([PARTITION BY col1] ORDER BY col2)
-- 示例(计算工资累积分布):
SELECT 
  employee_id, salary,
  CUME_DIST() OVER (ORDER BY salary) AS cume_dist
FROM employees;

(3) PERCENT_RANK()
功能: 计算百分比排名(基于 RANK(),范围 [0,1])。

-- 语法:
PERCENT_RANK() OVER ([PARTITION BY col1] ORDER BY col2)

-- 示例:
SELECT 
  student_id, score,
  PERCENT_RANK() OVER (ORDER BY score) AS percent_rank
FROM students;

6. 窗口帧子句详解
(1) ROWS BETWEEN
范围定义:基于物理行偏移。

常用选项:
UNBOUNDED PRECEDING: 分区第一行。

n PRECEDING: 当前行前 n 行。

CURRENT ROW: 当前行。

n FOLLOWING: 当前行后 n 行。

UNBOUNDED FOLLOWING: 分区最后一行。

示例(计算近3个月移动平均):
sql
SELECT 
  month, sales,
  AVG(sales) OVER (
    ORDER BY month 
    ROWS BETWEEN 2 PRECEDING AND CURRENT ROW
  ) AS moving_avg
FROM monthly_sales;

(2) RANGE BETWEEN
范围定义:基于逻辑值范围(如日期差值)。

-- 常用选项:
RANGE BETWEEN INTERVAL '7' DAY PRECEDING AND CURRENT ROW

-- 示例(计算近7天累计销售额):
SELECT 
  sale_date, sales,
  SUM(sales) OVER (
    ORDER BY sale_date 
    RANGE BETWEEN INTERVAL '7' DAY PRECEDING AND CURRENT ROW
  ) AS weekly_sales
FROM daily_sales;

三、总结

需求推荐函数
唯一连续序号ROW_NUMBER()
允许并列的排名RANK() 或 DENSE_RANK()
累计计算(总和/平均)SUM() OVER() 或 AVG() OVER()
前后行数据对比LAG() 或 LEAD()
分组首尾值FIRST_VALUE() 或 LAST_VALUE()
数据分桶NTILE(n)
累积分布分析CUME_DIST() 或 PERCENT_RANK()

四、注意事项
性能优化:
避免过度分区(PARTITION BY 列过多)。

结合索引优化排序字段(ORDER BY)。

窗口帧范围:

默认范围:RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW。

显式指定 ROWS 或 RANGE 以控制计算逻辑。

NULL 处理:

多数函数忽略 NULL,但需注意 ORDER BY 中 NULL 的排序位置。

通过灵活组合开窗函数,可以高效实现复杂分析需求,提升数据处理能力!

评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值