Oracle分析函数总结(1) - 基本概念和语法

Analytic functions compute an aggregate value based on a group of rows. They differ from aggregate functions in that they return multiple rows for each group. The group of rows is called a window and is defined by the analytic_clause. For each row, a sliding
window of rows is defined. The window determines the range of rows used to perform the calculations for the current row. Window sizes can be based on either a physical number of rows or a logical interval such as time.

Analytic functions are the last set of operations performed in a query except for the final ORDER BY clause. All joins and all WHERE, GROUP BY, and HAVING clauses are completed before the analytic functions are processed. Therefore, analytic functions can appear
only in the select list or ORDER BY clause.


分析函数基于一个行集计算聚集值,和聚合函数不同的地方在于分析函数会返回多行。

(很多分析函数也可以用作聚合函数,只是两种用法在语法上有所差异)

行集被称为窗口,对于每一行,定义了一个滑动窗口。这个窗口决定了对于当前行,用来执行计算的行的范围。

窗口的大小可以是物理的(行号)也可以是逻辑的(取值范围)。


除了最后的order by子句,分析函数是一个查询中最后被执行的操作,

所有的join,where,group by,having语句都在分析函数之前执行,

因此分析函数只能出现在select语句或者order by子句中。

分析函数语法
analytic_function([arguments])

OVER(

[partition_clause]

[order_by_clause[windowing_clause]]

)

1)分析函数
有以下的分析函数,带*号的函数表示允许windowing子句。

AVG *

CORR *

COUNT *

COVAR_POP *

COVAR_SAMP *

CUME_DIST

DENSE_RANK

FIRST

FIRST_VALUE *

LAG

LAST

LAST_VALUE *

LEAD

LISTAGG

MAX *

MIN

NTH_VALUE


NTILE

PERCENT_RANK

PERCENTILE_CONT

PERCENTILE_DISC

RANK

RATIO_TO_REPORT

REGR_ (Linear Regression) Functions *

ROW_NUMBER

STDDEV *

STDDEV_POP *

STDDEV_SAMP *

SUM *

VAR_POP *

VAR_SAMP *

VARIANCE *

2)PARTITION子句
按照表达式分分组,如果省略此子句,则全部的结果集被看作是一个组。

3)ORDER BY子句
ORDER BY将添加一个默认的WINDOWING子句,这个窗口从分组的第一行开始,一直继续到当前行(值),

即RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW.

没有ORDER BY时,默认的窗口是组中的全部行。

4)WINDOWING子句
定义分析函数作用的行集合,集合中的数据可以看作是一个窗口。

要使用WINDOWING子句,必须使用ORDER BY子句。

有2个建立窗口的关键字:ROWS和RANGE。

ROWS表示一个物理窗口(行);RANGE表示一个逻辑窗口。

BETWEEN…AND: 指定一个范围,如果省略,只指定一个端点,它将作为起点,终点是当前行。

UNBOUNDED PRECEDING:表示窗口从当前分组的第一行开始。

UNBOUNDED FOLLOWING:表示窗口,结束于当前分组的最后一行。

CURRENT ROW:表示窗口开始/结束于当前行或者当前值。

value_expr PRECEDING/value_expr FOLLOWING:指定一个物理或者逻辑的偏移量。


下面通过几个例子来理解这些概念。emp表中有如下数据:

  1. tony@ORCL1> select empno,ename,sal,deptno
  2. 2 from emp order by deptno, sal;
  3. EMPNO ENAME SAL DEPTNO
  4. ---------- ---------- ---------- ----------
  5. 7934 MILLER 1300 10
  6. 7782 CLARK 2450 10
  7. 7839 KING 5000 10
  8. 7369 SMITH 800 20
  9. 7876 ADAMS 1100 20
  10. 7566 JONES 2975 20
  11. 7788 SCOTT 3000 20
  12. 7902 FORD 3000 20
  13. 7900 JAMES 950 30
  14. 7654 MARTIN 1250 30
  15. 7521 WARD 1250 30
  16. 7844 TURNER 1500 30
  17. 7499 ALLEN 1600 30
  18. 7698 BLAKE 2850 30
  19. 14 rows selected.


wmsys.wm_concat()函数可以用来进行字段合并,把多行的某一列数据合并成一行显示。
例如,下面利用这个函数把一个部门的所有人名显示在一行。

  1. tony@ORCL1> column emp_list format a40
  2. tony@ORCL1> select deptno, wm_concat(ename) emp_list
  3. 2 from emp group by deptno;
  4. DEPTNO EMP_LIST
  5. ---------- ----------------------------------------
  6. 10 CLARK,MILLER,KING
  7. 20 SMITH,FORD,ADAMS,SCOTT,JONES
  8. 30 ALLEN,JAMES,TURNER,BLAKE,MARTIN,WARD


上面的例子里wm_concat()作为聚合函数使用,下面看看它作为分析函数的结果。
可以看到和聚合函数用法不同的是分析函数为每一行都返回一个结果。

  1. tony@ORCL1> select ename,deptno,
  2. 2 wm_concat(ename) over (partition by deptno) emp_list
  3. 3 from emp;
  4. ENAME DEPTNO EMP_LIST
  5. ---------- ---------- ----------------------------------------
  6. CLARK 10 CLARK,KING,MILLER
  7. KING 10 CLARK,KING,MILLER
  8. MILLER 10 CLARK,KING,MILLER
  9. JONES 20 JONES,FORD,ADAMS,SMITH,SCOTT
  10. FORD 20 JONES,FORD,ADAMS,SMITH,SCOTT
  11. ADAMS 20 JONES,FORD,ADAMS,SMITH,SCOTT
  12. SMITH 20 JONES,FORD,ADAMS,SMITH,SCOTT
  13. SCOTT 20 JONES,FORD,ADAMS,SMITH,SCOTT
  14. WARD 30 WARD,TURNER,ALLEN,JAMES,BLAKE,MARTIN
  15. TURNER 30 WARD,TURNER,ALLEN,JAMES,BLAKE,MARTIN
  16. ALLEN 30 WARD,TURNER,ALLEN,JAMES,BLAKE,MARTIN
  17. JAMES 30 WARD,TURNER,ALLEN,JAMES,BLAKE,MARTIN
  18. BLAKE 30 WARD,TURNER,ALLEN,JAMES,BLAKE,MARTIN
  19. MARTIN 30 WARD,TURNER,ALLEN,JAMES,BLAKE,MARTIN
  20. 14 rows selected.


上面的例子没有加上order by子句,前面说过如果没有order by子句默认的窗口是分组中的全部行。
所以在每个部门分组中的所有人名都会显示。
下面看看如果加上order by子句会怎么样。
结果比较有趣,原因在于前面提到,order by子句会添加一个默认的窗口,
RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
即从分组的第一行开始,一直继续到当前行(值)。

  1. tony@ORCL1> select ename,deptno,
  2. 2 wm_concat(ename) over (partition by deptno order by ename) emp_list
  3. 3 from emp;
  4. ENAME DEPTNO EMP_LIST
  5. ---------- ---------- ----------------------------------------
  6. CLARK 10 CLARK
  7. KING 10 CLARK,KING
  8. MILLER 10 CLARK,KING,MILLER
  9. ADAMS 20 ADAMS
  10. FORD 20 ADAMS,FORD
  11. JONES 20 ADAMS,FORD,JONES
  12. SCOTT 20 ADAMS,FORD,JONES,SCOTT
  13. SMITH 20 ADAMS,FORD,JONES,SCOTT,SMITH
  14. ALLEN 30 ALLEN
  15. BLAKE 30 ALLEN,BLAKE
  16. JAMES 30 ALLEN,BLAKE,JAMES
  17. MARTIN 30 ALLEN,BLAKE,JAMES,MARTIN
  18. TURNER 30 ALLEN,BLAKE,JAMES,MARTIN,TURNER
  19. WARD 30 ALLEN,BLAKE,JAMES,MARTIN,TURNER,WARD
  20. 14 rows selected.


再来看几个例子。
没有partition子句和order by子句,窗口是分组中的全部行

  1. tony@ORCL1> column sal format $99999.99
  2. tony@ORCL1> column avgsal format $99999.99
  3. tony@ORCL1> column sumsal format $99999.99
  4. tony@ORCL1> select ename, sal, sum(sal) over() sumsal from emp;
  5. ENAME SAL SUMSAL
  6. ---------- ---------- ----------
  7. SMITH $800.00 $29025.00
  8. ALLEN $1600.00 $29025.00
  9. WARD $1250.00 $29025.00
  10. JONES $2975.00 $29025.00
  11. MARTIN $1250.00 $29025.00
  12. BLAKE $2850.00 $29025.00
  13. CLARK $2450.00 $29025.00
  14. SCOTT $3000.00 $29025.00
  15. KING $5000.00 $29025.00
  16. TURNER $1500.00 $29025.00
  17. ADAMS $1100.00 $29025.00
  18. JAMES $950.00 $29025.00
  19. FORD $3000.00 $29025.00
  20. MILLER $1300.00 $29025.00
  21. 14 rows selected.


有order by子句没有开窗子句,窗口是
RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW.
即分组中的第一行到当前行(值).

  1. tony@ORCL1> select ename, sal,
  2. 2 sum(sal) over(order by ename) sumsal
  3. 3 from emp;
  4. ENAME SAL SUMSAL
  5. ---------- ---------- ----------
  6. ADAMS $1100.00 $1100.00
  7. ALLEN $1600.00 $2700.00
  8. BLAKE $2850.00 $5550.00
  9. CLARK $2450.00 $8000.00
  10. FORD $3000.00 $11000.00
  11. JAMES $950.00 $11950.00
  12. JONES $2975.00 $14925.00
  13. KING $5000.00 $19925.00
  14. MARTIN $1250.00 $21175.00
  15. MILLER $1300.00 $22475.00
  16. SCOTT $3000.00 $25475.00
  17. SMITH $800.00 $26275.00
  18. TURNER $1500.00 $27775.00
  19. WARD $1250.00 $29025.00
  20. 14 rows selected.


如果order by指定的是salary,注意结果中相同salary值的行的结果。
原因在于默认的窗口子句RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
是逻辑窗口,即第一行到等于当前值的所有行。

  1. tony@ORCL1> select ename, sal,
  2. 2 sum(sal) over(order by sal) sumsal
  3. 3 from emp;
  4. ENAME SAL SUMSAL
  5. ---------- ---------- ----------
  6. SMITH $800.00 $800.00
  7. JAMES $950.00 $1750.00
  8. ADAMS $1100.00 $2850.00
  9. WARD $1250.00 $5350.00
  10. MARTIN $1250.00 $5350.00
  11. MILLER $1300.00 $6650.00
  12. TURNER $1500.00 $8150.00
  13. ALLEN $1600.00 $9750.00
  14. CLARK $2450.00 $12200.00
  15. BLAKE $2850.00 $15050.00
  16. JONES $2975.00 $18025.00
  17. SCOTT $3000.00 $24025.00
  18. FORD $3000.00 $24025.00
  19. KING $5000.00 $29025.00
  20. 14 rows selected.


分部门计算薪水总和的结果。

  1. tony@ORCL1> select ename, sal, deptno,
  2. 2 sum(sal) over(partition by deptno) sumsal
  3. 3 from emp;
  4. ENAME SAL DEPTNO SUMSAL
  5. ---------- ---------- ---------- ----------
  6. CLARK $2450.00 10 $8750.00
  7. KING $5000.00 10 $8750.00
  8. MILLER $1300.00 10 $8750.00
  9. JONES $2975.00 20 $10875.00
  10. FORD $3000.00 20 $10875.00
  11. ADAMS $1100.00 20 $10875.00
  12. SMITH $800.00 20 $10875.00
  13. SCOTT $3000.00 20 $10875.00
  14. WARD $1250.00 30 $9400.00
  15. TURNER $1500.00 30 $9400.00
  16. ALLEN $1600.00 30 $9400.00
  17. JAMES $950.00 30 $9400.00
  18. BLAKE $2850.00 30 $9400.00
  19. MARTIN $1250.00 30 $9400.00
  20. 14 rows selected.


分部门计算薪水总和的结果。因为有order by子句,
加上了一个默认的窗口子句RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW

  1. tony@ORCL1> select ename, sal, deptno,
  2. 2 sum(sal) over(partition by deptno order by ename) sumsal
  3. 3 from emp;
  4. ENAME SAL DEPTNO SUMSAL
  5. ---------- ---------- ---------- ----------
  6. CLARK $2450.00 10 $2450.00
  7. KING $5000.00 10 $7450.00
  8. MILLER $1300.00 10 $8750.00
  9. ADAMS $1100.00 20 $1100.00
  10. FORD $3000.00 20 $4100.00
  11. JONES $2975.00 20 $7075.00
  12. SCOTT $3000.00 20 $10075.00
  13. SMITH $800.00 20 $10875.00
  14. ALLEN $1600.00 30 $1600.00
  15. BLAKE $2850.00 30 $4450.00
  16. JAMES $950.00 30 $5400.00
  17. MARTIN $1250.00 30 $6650.00
  18. TURNER $1500.00 30 $8150.00
  19. WARD $1250.00 30 $9400.00
  20. 14 rows selected.


计算各部门的薪水平均值。

  1. tony@ORCL1> select ename, sal, deptno,
  2. 2 avg(sal) over(partition by deptno) avgsal
  3. 3 from emp;
  4. ENAME SAL DEPTNO AVGSAL
  5. ---------- ---------- ---------- ----------
  6. CLARK $2450.00 10 $2916.67
  7. KING $5000.00 10 $2916.67
  8. MILLER $1300.00 10 $2916.67
  9. JONES $2975.00 20 $2175.00
  10. FORD $3000.00 20 $2175.00
  11. ADAMS $1100.00 20 $2175.00
  12. SMITH $800.00 20 $2175.00
  13. SCOTT $3000.00 20 $2175.00
  14. WARD $1250.00 30 $1566.67
  15. TURNER $1500.00 30 $1566.67
  16. ALLEN $1600.00 30 $1566.67
  17. JAMES $950.00 30 $1566.67
  18. BLAKE $2850.00 30 $1566.67
  19. MARTIN $1250.00 30 $1566.67
  20. 14 rows selected.


下面再来看几个指定了开窗子句的例子。
按照名字排序,计算前面3行和本行的薪水总和。

  1. tony@ORCL1> select ename, sal,
  2. 2 sum(sal) over(order by ename rows 3 preceding) sumsal
  3. 3 from emp;
  4. ENAME SAL SUMSAL
  5. ---------- ---------- ----------
  6. ADAMS $1100.00 $1100.00
  7. ALLEN $1600.00 $2700.00
  8. BLAKE $2850.00 $5550.00
  9. CLARK $2450.00 $8000.00
  10. FORD $3000.00 $9900.00
  11. JAMES $950.00 $9250.00
  12. JONES $2975.00 $9375.00
  13. KING $5000.00 $11925.00
  14. MARTIN $1250.00 $10175.00
  15. MILLER $1300.00 $10525.00
  16. SCOTT $3000.00 $10550.00
  17. SMITH $800.00 $6350.00
  18. TURNER $1500.00 $6600.00
  19. WARD $1250.00 $6550.00
  20. 14 rows selected.


按照新水排序,计算>=(自己薪水-100) and <=自己薪水的所有人的薪水总和

  1. tony@ORCL1> select ename, sal,
  2. 2 sum(sal) over(order by sal range 100 preceding) sumsal
  3. 3 from emp;
  4. ENAME SAL SUMSAL
  5. ---------- ---------- ----------
  6. SMITH $800.00 $800.00
  7. JAMES $950.00 $950.00
  8. ADAMS $1100.00 $1100.00
  9. WARD $1250.00 $2500.00
  10. MARTIN $1250.00 $2500.00
  11. MILLER $1300.00 $3800.00
  12. TURNER $1500.00 $1500.00
  13. ALLEN $1600.00 $3100.00
  14. CLARK $2450.00 $2450.00
  15. BLAKE $2850.00 $2850.00
  16. JONES $2975.00 $2975.00
  17. SCOTT $3000.00 $8975.00
  18. FORD $3000.00 $8975.00
  19. KING $5000.00 $5000.00
  20. 14 rows selected.


按照新水排序,计算>=(自己薪水-100) and <=(自己薪水+200)的所有人的薪水总和

  1. tony@ORCL1> select ename, sal,
  2. 2 sum(sal) over(order by sal range
  3. 3 between 100 preceding and 200 following) sumsal
  4. 4 from emp;
  5. ENAME SAL SUMSAL
  6. ---------- ---------- ----------
  7. SMITH $800.00 $1750.00
  8. JAMES $950.00 $2050.00
  9. ADAMS $1100.00 $4900.00
  10. WARD $1250.00 $3800.00
  11. MARTIN $1250.00 $3800.00
  12. MILLER $1300.00 $5300.00
  13. TURNER $1500.00 $3100.00
  14. ALLEN $1600.00 $3100.00
  15. CLARK $2450.00 $2450.00
  16. BLAKE $2850.00 $11825.00
  17. JONES $2975.00 $8975.00
  18. SCOTT $3000.00 $8975.00
  19. FORD $3000.00 $8975.00
  20. KING $5000.00 $5000.00
  21. 14 rows selected.



 

发布了5 篇原创文章 · 获赞 0 · 访问量 1万+
展开阅读全文

没有更多推荐了,返回首页

©️2019 CSDN 皮肤主题: 大白 设计师: CSDN官方博客

分享到微信朋友圈

×

扫一扫,手机浏览