语法
摘自官方文档的语法:
FIRST and LAST are very similar functions.
由于first与last函数只有一个顺序的差别,本文将只讲解first情况下的。
文档中对语法的解释:
FIRST and LAST are very similar functions. Both are aggregate and analytic functions that operate on a set of values from a set of rows that rank as the FIRST or LAST with respect to a given sorting specification. If only one row ranks as FIRST or LAST, then the aggregate operates on the set with only one element.
解释:它会返回排序之后为first的部分。first的部分可以是一行,也可以是多行(这也是与first_value函数最大的区别)。
If you omit the OVER clause, then the FIRST and LAST functions are treated as aggregate functions. You can use these functions as analytic functions by specifying the OVER clause. The query_partition_clause is the only part of the OVER clause valid with these functions. If you include the OVER clause but omit the query_partition_clause, then the function is treated as an analytic function, but the window defined for analysis is the entire table.
解释:如果省略over,first将作为聚合函数。可以添加over从句使其成为分析函数。over从句中将只能使用分组从句(如果用排序将直接报错)。如果用了over并省略分组从句,函数将作用于全表(over从句的一般特点)。
先建一张测试表
create table nayi_first_180410 as
select cast('a' as varchar2(10)) name, 1 dept, 2500 salary, 100 bonus from dual
union all
select 'b' , 2, 3000, 500 from dual
union all
select 'c' , 2, 3000, 50 from dual
union all
select 'd' , 2, 5000, 600 from dual
union all
select 'e' , 3, 4000, 800 from dual
union all
select 'f' , 3, 4200, 800 from dual
;
select*from nayi_first_180410;
NAME | DEPT | SALARY | BONUS |
---|---|---|---|
a | 1 | 2500 | 100 |
b | 2 | 3000 | 500 |
c | 2 | 3000 | 50 |
d | 2 | 5000 | 600 |
e | 3 | 4000 | 800 |
f | 3 | 4200 | 800 |
例子
聚合函数
获取bouns最大的行中最大的salary
select max(t.salary) keep(dense_rank first order by t.bonus desc) max
from nayi_first_180410 t;
MAX |
---|
4200 |
分析函数
获取bouns最大的行中salary最大值
获取每组中bouns最大的行中salary最大值
对bouns最大行的salary求和
select t.dept,
t.salary,
t.bonus,
max(t.salary) keep(dense_rank first order by t.bonus desc) over() for_all,
max(t.salary) keep(dense_rank first order by t.bonus desc) over(partition by t.dept) for_dept,
sum(t.salary) keep(dense_rank first order by t.bonus desc) over() sum
from nayi_first_180410 t;
DEPT | SALARY | BONUS | FOR_ALL | FOR_DEPT | SUM |
---|---|---|---|---|---|
1 | 2500 | 100 | 4200 | 2500 | 8200 |
2 | 3000 | 500 | 4200 | 5000 | 8200 |
2 | 3000 | 50 | 4200 | 5000 | 8200 |
2 | 5000 | 600 | 4200 | 5000 | 8200 |
3 | 4000 | 800 | 4200 | 4200 | 8200 |
3 | 4200 | 800 | 4200 | 4200 | 8200 |
从sum列中可以很清晰的看出是按照dense_rank查出的,它可以是多行。
应用场景
When you need a value from the first or last row of a sorted group, but the needed value is not the sort key, the FIRST and LAST functions eliminate the need for self-joins or views and enable better performance.
解释:从排好序的组中获取first rank的行,并且所取的列不是排序用的列。在这种情况下,使用first函数可以替代自连接或是视图的方法,并且可以获得更好的性能。
使用示例
简单点说就是以B列做排序,并取B列最大的行所对应的A列。正如上一个例子所写的那样。
再扩展一下。以任意规则排序,并取first行所对应的指定列。比如这样
select t.dept,
t.salary,
t.bonus,
max(t.salary) keep(dense_rank first order by t.bonus + t.salary, mod(t.bonus, 2), t.bonus / t.salary nulls last) over() max
from nayi_first_180410 t;
这个排序并没有实际意义,我想说的是,它可以作用于任何合法的order语句,而不是限定于某个列。
效率解释
至于文档中所说的提升效率,我们可以通过改写sql的方法来观察一下。
--作为聚合函数求最大bonus对应的salary
select max(t.salary) keep(dense_rank first order by t.bonus desc) max
from nayi_first_180410 t;
--通过视图方法求最大bonus对应的salary
select max(t.salary)
from (select t.salary,
t.bonus,
dense_rank() over(order by t.bonus desc) dense_rank
from nayi_first_180410 t) t
where t.dense_rank = 1;
--通过自连接方法求最大bonus对应的salary
select max(t1.salary)
from nayi_first_180410 t1
where t1.bonus = (select max(v1.bonus) from nayi_first_180410 v1)
;
------------------------------------------------------------------
--作为分析函数求每个dept最大bonus行所对应的salary
select t.*, max(t.salary) keep(dense_rank first order by t.bonus desc) over(partition by t.dept) max
from nayi_first_180410 t;
--通过视图方法求每个dept最大bonus行所对应的salary
select t1.name,
t1.dept,
t1.salary,
t1.bonus,
max(decode(t1.dense_rank, 1, t1.salary, null)) over(partition by t1.dept) max
from (select t1.*,
dense_rank() over(partition by t1.dept order by t1.bonus desc) dense_rank
from nayi_first_180410 t1) t1;
--通过自连接方法求每个dept最大bonus行所对应的salary
select t1.name, t1.dept, t1.salary, t1.bonus, t2.salary max
from nayi_first_180410 t1,
(select t2.dept,
t2.bonus,
max(t2.salary) salary,
dense_rank() over(partition by t2.dept order by t2.bonus desc) dense_rank
from nayi_first_180410 t2
group by t2.dept, t2.bonus) t2
where t1.dept = t2.dept
and t2.dense_rank = 1
order by t1.name
;
前三个sql结果集
MAX |
---|
4200 |
后三个sql结果集
NAME | DEPT | SALARY | BONUS | MAX |
---|---|---|---|---|
a | 1 | 2500 | 100 | 2500 |
b | 2 | 3000 | 500 | 5000 |
c | 2 | 3000 | 50 | 5000 |
d | 2 | 5000 | 600 | 5000 |
e | 3 | 4000 | 800 | 4200 |
f | 3 | 4200 | 800 | 4200 |
这是两组完全等效的sql。
虽然有很多种其他的改写方法,但是执行计划都是大同小异。
很明显,使用first函数将只需要一次table access full与一次window sort。使用视图将在此基础上多一层view。使用自连接的方法将至少多一次table access full。因此使用first函数将更有效率优势。在代码量与可读性上也有一些优势。
可用范围
The aggregate_function argument is any one of the MIN, MAX, SUM, AVG, COUNT, VARIANCE, or STDDEV functions. It operates on values from the rows that rank either FIRST or LAST. If only one row ranks as FIRST or LAST, then the aggregate operates on a singleton (nonaggregate) set.
规定了可用范围。只有min, max, sum, avg, count, variance, stddev 这些聚合函数可以接keep使用。也就是说像 first_value,last_value,lead,lag,percent_rank这样的函数是不可用的。应用范围其实比较窄。
令人意外的是,它竟然可以与wm_concat连用。这也导致我对它的适用范围有点搞不懂了,求大神指点。
select wmsys.wm_concat(t.name) keep(dense_rank first order by t.bonus desc) wm_concat
from nayi_first_180410 t;
WM_CONCAT |
---|
e,f |
总结
- 以任意规则排序,并取first rank行所对应的指定列情况下使用。
- 在正确使用的情况下通常会有更好的效率。
- 与min, max, sum, avg, count, variance, stddev这些函数组合使用
注:文中全部引用来自官方文档
https://docs.oracle.com/cd/E11882_01/server.112/e41084/functions065.htm#SQLRF00641