over()分析函数
聚合函数(count,sum、max,min,avg等)可以计算基于分组的某种聚合值,但聚合函数对于某个分组只能返回一行记录。若想对于某组返回多行记录,则需要使用分析函数。
over()分析函数主要有以下用处:
1,排序
rank()/dense_rank over(partition by columns1 order by column2)
2,取最值
min()/max() over(partition by columns1 order by column2)
3,取前后记录
lead(column1,n,m)/lag(column1,n,m) over(partition by columns1 order by column2) 取前面/后面第n行记录
4,取首尾记录
first_value/last_value() over(partition by columns1 order by column2 asc) 取首尾记录
5,分页排序
row_number() over(partition by columns1 order by column2) 排序(应用:分页)
6,统计分析
sum/avg/count() over(partition by columns1 )
7,范围内求值
rows/range between … preceding and … following 上下范围内求值
rows between … preceding and … following
1,排序
rank()/dense_rank over(partition by columns1 order by column2)
说明
over()在什么条件之上进行展示,按columns1 字段划分组,按column2字段排序。
注意:
(1)使用rank()/dense_rank() 时,必须要有order by子句,否则不合符语法。
(2)rank(): 跳跃排序,如果有两个一级,接下来就是三级;dense_rank(): 连续排序,如果有两个一级时,接下来是二级。
示例
查询各部门人员工资排行
1,跳跃排序
select deptno,
job,
ename,
sal,
rank() over(partition by deptno order by sal desc)
from emp;
# 结果
1 10 PRESIDENT KING 5000.00 1
2 10 MANAGER CLARK 2450.00 2
3 10 CLERK MILLER 1300.00 3
4 20 ANALYST SCOTT 3000.00 1
5 20 ANALYST FORD 3000.00 1
6 20 MANAGER JONES 2975.00 3
7 20 CLERK ADAMS 1100.00 4
8 20 CLERK SMITH 800.00 5
9 30 MANAGER BLAKE 2850.00 1
10 30 SALESMAN ALLEN 1600.00 2
11 30 SALESMAN TURNER 1500.00 3
12 30 SALESMAN MARTIN 1250.00 4
13 30 SALESMAN WARD 1250.00 4
14 30 CLERK JAMES 950.00 6
2,连续排序
select deptno,
job,
ename,
sal,
dense_rank() over(partition by deptno order by sal desc)
from emp;
# 结果
1 10 PRESIDENT KING 5000.00 1
2 10 MANAGER CLARK 2450.00 2
3 10 CLERK MILLER 1300.00 3
4 20 ANALYST SCOTT 3000.00 1
5 20 ANALYST FORD 3000.00 1
6 20 MANAGER JONES 2975.00 2
7 20 CLERK ADAMS 1100.00 3
8 20 CLERK SMITH 800.00 4
9 30 MANAGER BLAKE 2850.00 1
10 30 SALESMAN ALLEN 1600.00 2
11 30 SALESMAN TURNER 1500.00 3
12 30 SALESMAN MARTIN 1250.00 4
13 30 SALESMAN WARD 1250.00 4
14 30 CLERK JAMES 950.00 5
2,取最值
min()/max() over(partition by columns1 order by column2)
示例
查询部门人员工资及工资差距
select deptno,
job,
ename,
sal,
min(sal) over(partition by deptno order by sal asc) min,
max(sal) over(partition by deptno order by sal asc) max,
nvl(sal - min(sal) over(partition by deptno), 0) more,
nvl(max(sal) over(partition by deptno) - sal, 0) less
from emp;
# 结果
1 10 CLERK MILLER 1300.00 1300 5000 0 3700
2 10 MANAGER CLARK 2450.00 1300 5000 1150 2550
3 10 PRESIDENT KING 5000.00 1300 5000 3700 0
4 20 CLERK SMITH 800.00 800 3000 0 2200
5 20 CLERK ADAMS 1100.00 800 3000 300 1900
6 20 MANAGER JONES 2975.00 800 3000 2175 25
7 20 ANALYST SCOTT 3000.00 800 3000 2200 0
8 20 ANALYST FORD 3000.00 800 3000 2200 0
9 30 CLERK JAMES 950.00 950 2850 0 1900
10 30 SALESMAN WARD 1250.00 950 2850 300 1600
11 30 SALESMAN MARTIN 1250.00 950 2850 300 1600
12 30 SALESMAN TURNER 1500.00 950 2850 550 1350
13 30 SALESMAN ALLEN 1600.00 950 2850 650 1250
14 30 MANAGER BLAKE 2850.00 950 2850 1900 0
注意:这里使用两个排序,只有第一个排序asc生效
3,取前后记录
lead(列名,n,m)/lag(列名,n,m) over(partition by columns1 order by column2)
说明
lead(columns1 ,n,m): 当前记录后面第n行记录的columns1 的值,没有则显示m;如果不带参数n,m,则查找当前记录后面第一行的记录columns1 的值,没有则默认值为null。
lag(columns1 ,n,m): 当前记录前面第n行记录的columns1 的值,没有则显示m;如果不带参数n,m,则查找当前记录前面第一行的记录columns1 的值,没有则默认值为null
示例
查询部门人员工资及其与相邻的人员工资
select deptno,
job,
ename,
sal,
lead(sal, 1, 0) over(partition by deptno order by sal) lead,
lag(sal, 1, 0) over(partition by deptno order by sal) lag
from emp;
# 结果
1 10 CLERK MILLER 1300.00 2450 0
2 10 MANAGER CLARK 2450.00 5000 1300
3 10 PRESIDENT KING 5000.00 0 2450
4 20 CLERK SMITH 800.00 1100 0
5 20 CLERK ADAMS 1100.00 2975 800
6 20 MANAGER JONES 2975.00 3000 1100
7 20 ANALYST SCOTT 3000.00 3000 2975
8 20 ANALYST FORD 3000.00 0 3000
9 30 CLERK JAMES 950.00 1250 0
10 30 SALESMAN MARTIN 1250.00 1250 950
11 30 SALESMAN WARD 1250.00 1500 1250
12 30 SALESMAN TURNER 1500.00 1600 1250
13 30 SALESMAN ALLEN 1600.00 2850 1500
14 30 MANAGER BLAKE 2850.00 0 1600
4,取首尾记录
first_value/last_value() over(partition by columns1 order by column2 asc)
说明
显示分组数据的第一行和最后一行纪录的column1值,并按column2排序
注意:如果一条语句出现多个first_value/last_value()语句,请在第一个语句上写排序子句,否则会出现乱序。
示例
查询员工工资并显示本部门第一人和最后一人工资
select deptno,
job,
ename,
sal,
first_value(sal) over(partition by deptno order by sal asc) first,
last_value(sal) over(partition by deptno) last
from emp;
# 结果
1 10 CLERK MILLER 1300.00 1300 5000
2 10 MANAGER CLARK 2450.00 1300 5000
3 10 PRESIDENT KING 5000.00 1300 5000
4 20 CLERK SMITH 800.00 800 3000
5 20 CLERK ADAMS 1100.00 800 3000
6 20 MANAGER JONES 2975.00 800 3000
7 20 ANALYST SCOTT 3000.00 800 3000
8 20 ANALYST FORD 3000.00 800 3000
9 30 CLERK JAMES 950.00 950 2850
10 30 SALESMAN MARTIN 1250.00 950 2850
11 30 SALESMAN WARD 1250.00 950 2850
12 30 SALESMAN TURNER 1500.00 950 2850
13 30 SALESMAN ALLEN 1600.00 950 2850
14 30 MANAGER BLAKE 2850.00 950 2850
5,分页排序
row_number() over(partition by columns1 order by column2)
说明
按分组条件显示行数,和rank()/dense_rank()不同的是,按照排序结果递增显示。
示例
查询各部门员工的工资排名(按递增序列)
select deptno,
job,
ename,
sal,
row_number() over(partition by deptno order by sal) row_num
from emp;
# 结果
1 10 CLERK MILLER 1300.00 1
2 10 MANAGER CLARK 2450.00 2
3 10 PRESIDENT KING 5000.00 3
4 20 CLERK SMITH 800.00 1
5 20 CLERK ADAMS 1100.00 2
6 20 MANAGER JONES 2975.00 3
7 20 ANALYST SCOTT 3000.00 4
8 20 ANALYST FORD 3000.00 5
9 30 CLERK JAMES 950.00 1
10 30 SALESMAN MARTIN 1250.00 2
11 30 SALESMAN WARD 1250.00 3
12 30 SALESMAN TURNER 1500.00 4
13 30 SALESMAN ALLEN 1600.00 5
14 30 MANAGER BLAKE 2850.00 6
6,统计分析
sum/avg/count() over(partition by columns1 order by column2)
说明
分组显示统计信息
注意:有多个统计量时,仅有第一个order子句有效,多次使用order子句会导致统计量出现错误。
示例
查询各部门员工工资及部门相关统计信息
select deptno,
job,
ename,
sal,
sum(sal) over(partition by deptno order by sal) sum_sal,
avg(sal) over(partition by deptno) avg_sal,
count(0) over(partition by deptno) count_sal
from emp;
# 结果
1 10 CLERK MILLER 1300.00 1300 2916.66666666667 3
2 10 MANAGER CLARK 2450.00 3750 2916.66666666667 3
3 10 PRESIDENT KING 5000.00 8750 2916.66666666667 3
4 20 CLERK SMITH 800.00 800 2175 5
5 20 CLERK ADAMS 1100.00 1900 2175 5
6 20 MANAGER JONES 2975.00 4875 2175 5
7 20 ANALYST SCOTT 3000.00 10875 2175 5
8 20 ANALYST FORD 3000.00 10875 2175 5
9 30 CLERK JAMES 950.00 950 1566.66666666667 6
10 30 SALESMAN MARTIN 1250.00 3450 1566.66666666667 6
11 30 SALESMAN WARD 1250.00 3450 1566.66666666667 6
12 30 SALESMAN TURNER 1500.00 4950 1566.66666666667 6
13 30 SALESMAN ALLEN 1600.00 6550 1566.66666666667 6
14 30 MANAGER BLAKE 2850.00 9400 1566.66666666667 6
7,范围内求值
rows/range between … preceding and … following
说明
over的子句,用来控制统计量的取值范围,rows和range具有同样的效果。
preceding:当前记录之前
following:当前记录之后
unbounded:不受限制的,所有的
CURRENT ROW:当前记录
例如 :
between unbounded preceding and unbounded following 统计所有记录
between current row and unbounded following 当前行及其后的所有记录
between unbounded preceding and current row 当前行及其前的所有记录
between 1 preceding and 1 following 统计当前行及前后各一行的记录
示例
select deptno,
ename,
sal,
sum(sal) over(partition by deptno order by sal rows between 1 preceding and 1 following) sum
from emp;
# 结果
1 10 MILLER 1300.00 3750
2 10 CLARK 2450.00 8750
3 10 KING 5000.00 7450
4 20 SMITH 800.00 1900
5 20 ADAMS 1100.00 4875
6 20 JONES 2975.00 7075
7 20 SCOTT 3000.00 8975
8 20 FORD 3000.00 6000
9 30 JAMES 950.00 2200
10 30 MARTIN 1250.00 3450
11 30 WARD 1250.00 4000
12 30 TURNER 1500.00 4350
13 30 ALLEN 1600.00 5950
14 30 BLAKE 2850.00 4450