Oracle分析函数入门

Oracle 8i and 9i分析函数

 

 

Analytic functions compute an aggregate value based on a group of rows. They differ

from aggregate functions in that they return multiple rows for each group. The group

of rows is called a window and is defined by the analytic clause. For each row, a

"sliding" window of rows is defined. The window determines the range of rows used to

perform the calculations for the "current row". Window sizes can be based on either a

physical number of rows or a logical interval such as time.

 

分析函数是在一个记录行分组的基础上计算它们的总值。与集合函数不同,他们从每个分组返回多个记录。

这些被分组后的记录集合被称为窗口,即一个窗口对应一个分组,分组是通过分析子句定义的。

对于每记录行,定义了一个滑动窗口。

该窗口确定当前行计算的范围。窗口的大小可由各行的实际编号或由时间等逻辑间隔确定。

 

Analytic functions are the last set of operations performed in a query except for the

final ORDER BY clause. All joins and all WHERE, GROUP BY, and HAVING clauses

are completed before the analytic functions are processed. Therefore, analytic

functions can appear only in the select list or ORDER BY clause.

 

除了ORDER BY(按…排序)语句外,分析函数是一条查询被执行的操作。

所有合并、WHEREGROUP BYHAVING语句都是分析函数处理之前完成的。

因此,分析函数只出现在选择目录或ORDER BY(按…排序)语句中。

 

Query-Partition-Clause查询划分语句

 

The PARTITION BY clause logically breaks a single result set into N

groups, according to the criteria set by the partition expressions. The

words "partition" and "group" are used synonymously here. The

analytic functions are applied to each group independently, they are

reset for each group.

 

根据划分表达式设置的规则,PARTITIONBY(按…划分)将一个结果逻辑分成N个分组划分表达式。

在此划分分组用作同义词。分析函数独立应用于各个分组,并在应用时重置。

 

Order-By-Clause排序语句

 

The ORDER BY clause specifies how the data is sorted within each

group (partition). This will definitely affect the outcome of any analytic

function.

 

ORDER BY(按…排序)语句规定了每个分组(划分)的数据如何排序。

这必然影响分析函数的结果。

 

Windowing-Clause窗口生成语句

 

The windowing clause gives us a way to define a sliding or anchored

window of data, on which the analytic function will operate, within a

group. This clause can be used to have the analytic function compute

its value based on any arbitrary sliding or anchored window within a

group.

 

窗口生成语句用以定义滑动或固定数据窗口,分析函数在分组内进行分析。该语

句能够对分组中任意定义的滑动或固定窗口进行计算。

 

 

--累计计算

--本例中对某部门的工资进行逐行计算,每行包括之前所有行中工资的合计。

SELECT deptno, ename, sal,

       SUM(SAL)     OVER(PARTITION BY deptno ORDER BY ename) "Dept_Total",

       SUM(sal)     OVER(ORDER BY deptno, ename) "ALL_Total",

       ROW_NUMBER() OVER(PARTITION BY deptno ORDER BY sal) "Seq"

  FROM scott.emp

 ORDER BY deptno,sal

 

DEPTNO      ENAME                 SAL                   Dept_Total  ALL_Total   Seq

10                    MILLER      1300.00     8750                  8750                  1

10                    CLARK                 2450.00     2450                  2450                  2

10                    KING                  5000.00     7450                  7450                  3

20                    SMITH                 800.00      10875                 19625                 1

20                    ADAMS                 1100.00     1100                  9850                  2

20                    JONES                 2975.00     7075                  15825                 3

20                    FORD                  3000.00     4100                  12850                 4

20                    SCOTT                 3000.00     10075                 18825                 5

30                    JAMES                 950.00      5400                  25025                 1

30                    MARTIN      1250.00     6650                  26275                 2

30                    WARD                  1250.00     9400                  29025                 3

30                    TURNER      1500.00     8150                  27775                 4

30                    ALLEN                 1600.00     1600                  21225                 5

30                    BLAKE                 2850.00     4450                  24075                 6

 

以上的例子在去掉最后那个ORDER BY deptno,sal后,结果如下:

SELECT deptno, ename, sal,

       SUM(SAL)     OVER(PARTITION BY deptno ORDER BY ename) "Dept_Total",

       SUM(sal)     OVER(ORDER BY deptno, ename) "ALL_Total",

       ROW_NUMBER() OVER(PARTITION BY deptno ORDER BY sal) "Seq"

  FROM scott.emp

 

          DEPTNO         ENAME           SAL                 Dept_Total      ALL_Total      Seq

1              10                            CLARK 2450.00  2450                       2450                       2

2              10                            KING      5000.00  7450                        7450                        3

3              10                            MILLER               1300.00   8750                       8750                       1

4              20                            ADAMS 1100.00   1100                         9850                       2

5              20                            FORD    3000.00  4100                        12850                     5

6              20                            JONES  2975.00  7075                        15825                     3

7              20                            SCOTT  3000.00  10075                      18825                     4

8              20                            SMITH   800.00                    10875                      19625                     1

9              30                            ALLEN 1600.00   1600                        21225                     5

10            30                            BLAKE 2850.00  4450                        24075                     6

11             30                            JAMES  950.00                    5400                        25025                     1

12            30                            MARTIN               1250.00  6650                       26275                     2

13            30                            TURNER             1500.00   8150                        27775                     4

14            30                            WARD   1250.00  9400                        29025                     3

 

--这样的结果可能比较好了解,也说明了分析函数在开始窗口计算前,最后面的ORDER BY deptno,sal不起作用,

--直到分析函数结束后,ORDER BY deptno,sal才最后排序计算的结果。

 

--对于以下语句:

SUM(SAL)     OVER(PARTITION BY deptno ORDER BY ename)

 

其中的over可理解为窗口,partition by可理解为分组的范围,order by可理解为分组内记录的运算顺序;

如果没有指定partition by,则分组可视为所有的记录;如果指定了,则根据指定的字段分组为不同的小分组

分析函数独立应用于各个分组,并在应用时重置.

 

如果没有指定order by,则在分组内,分析函数在每条记录上都是对整个分组进行计算;

指定order by 后,分析函数就会在分组内,根据排序后的记录依次计算;

比如SUM(sal) OVER(),其实就相当于SUM(sal)

 

-----------------------------------------------

-----------------------------------------------

 

 

-- Top-N Queries前N条查询

-- 从多到少排列销售人员的工资,取前三行。

-- 如果该部门少于三人,则返回的记录少于三个

SELECT * FROM

(

  SELECT deptno, ename, sal,

         ROW_NUMBER() OVER(PARTITION BY deptno ORDER BY sal DESC) Top3

    FROM scott.emp

)

WHERE Top3 <= 3

 

DEPTNO      ENAME                 SAL                   TOP3

10                    KING                  5000.00     1

10                    CLARK                 2450.00     2

10                    MILLER                1300.00     3

20                    SCOTT                 3000.00     1

20                    FORD                  3000.00     2

20                    JONES                 2975.00     3

30                    BLAKE                 2850.00     1

30                    ALLEN                 1600.00     2

30                    TURNER                1500.00     3

 

 

-----------------------------------------------

-----------------------------------------------

 

--我需要工资为前三位的销售人员名字——

--即查找工资金额、排序、取最高的三项金额、给我领取这些工资的人员的名字。

--使用DENSE_RANK函数得出最高的三个工资金额。然后指定Dense rank至工资列,并将其按降序排列.

--DENSE_RANK函数计算排序后分组中各行的序数。序数为从1开始的连续整数。

--最大的序数就是查询所所返回唯一值的个数。如果出现并列,序数不跳计。具有相同值的列的序数相同:

 

SELECT * FROM

(

  SELECT deptno, ename, sal,

         DENSE_RANK() OVER (PARTITION BY deptno ORDER BY sal desc) TopN

    FROM scott.emp

)

WHERE TopN <= 3

ORDER BY deptno, sal DESC

 

DEPTNO      ENAME                 SAL                   TOPN

10                    KING                  5000.00     1

10                    CLARK                 2450.00     2

10                    MILLER      1300.00     3

20                    SCOTT                 3000.00     1  <--!

20                    FORD                  3000.00     1  <--!

20                    JONES                 2975.00     2

20                    ADAMS                 1100.00     3

30                    BLAKE                 2850.00     1

30                    ALLEN                 1600.00     2

30                    TURNER      1500.00     3

 

 

-----------------------------------------------

-----------------------------------------------

 

 

--Windows窗口

窗口语句用以定义滑动或固定数据窗口。在其上面运行组的分析函数。默

认窗口为固定窗口,从第一行开始到当前行。

 

可根据两种规则对窗口进行设置:数据值的范围或当前行指定区距的行。

分析函数中的ORDER BY会默认添加一条窗口语句:RANGE UNBOUNDED PRECEDING

即按照ORDER BY语句取得划分中的当前之前的所有行。

 

 

--以下例子为一个分组中的滑动窗口,计算该组中当前行与其前两行的SAL列的和。

--如我们需要计算当前员工的工资和其之前的两人工资的和。

--preceding:(时间或地点上)在先的, 在前的, 前面的

--sliding;滑行的,变化的

select deptno,empno,ename,sal,

       sum(sal) over(partition by deptno order by empno rows 2 preceding) "Slidnig"

  from scott.emp

 order by deptno,empno

 

DEPTNO      EMPNO ENAME       SAL         Slidnig

10           7782  CLARK        2450.00     2450

10           7839  KING         5000.00     7450

10           7934  MILLER      1300.00     8750

20           7369  SMITH        800.00      800    <--0 + 0 + 800

20          7566  JONES       2975.00     3775   <--0 + 800 + 2975

20           7788  SCOTT        3000.00     6775   <--800 + 2975 + 3000

20           7876  ADAMS        1100.00     7075   <--2975+ 3000 + 1100  

20           7902  FORD         3000.00     7100

30           7499  ALLEN        1600.00     1600

30           7521  WARD         1250.00     2850

30           7654  MARTIN      1250.00     4100

30           7698  BLAKE        2850.00     5350

30           7844  TURNER      1500.00     5600

30           7900  JAMES        950.00      5300

 

--对以上语句的分析:

--划分语句使得SUM (sal)在各部门内进行,并独立于其他组。当部门改变时,SUM (sal) 也被重置

--ORDER BY EMPNO语句通过EMPNO排列各部门的数据。这使得窗口语句:ROWS 2 PRECEDING获取该分组中当前行之前2行的数据以计算合计工资。

 

-----------------------------------------------

-----------------------------------------------

 

--Range Windows范围窗口

--范围窗口根据WHERE语句对行进行收集。例如之前5”将会生成一个滑动视窗,

--包括该分组中当前行之前的5个单位所有行。这些单位可以是数值或日期,使用数字或日期以外的其他数据类型表示的范围无效。

 

--计算当前雇佣日期之前100天内雇佣的员工的数量。范围窗口返回当前行

--雇佣日期100天之前并在这个范围内计算行数。

SELECT ename, hiredate, hiredate-100 hiredate_pre,

       COUNT(*) OVER (ORDER BY hiredate ASC RANGE 100 PRECEDING) cnt --该员工入职前100天内,入职的员工数(员工数中,包括该员工)

  FROM scott.emp

 ORDER BY hiredate ASC

 

ENAME       HIREDATE    HIREDATE_PRE      CNT

SMITH       1980-12-17  1980-9-8          1

ALLEN       1981-2-20   1980-11-12        2

WARD        1981-2-22   1980-11-14        3  <--入职时间范围[1980-11-14 ~ 1981-2-22],即WARDALLENSMITH

JONES       1981-4-2    1980-12-23        3

BLAKE       1981-5-1    1981-1-21         4

CLARK       1981-6-9    1981-3-1          3

TURNER      1981-9-8    1981-5-31         2

MARTIN      1981-9-28   1981-6-20         2

KING        1981-11-17  1981-8-9          3

JAMES       1981-12-3   1981-8-25         5

FORD        1981-12-3   1981-8-25         5

MILLER      1982-1-23   1981-10-15        4

SCOTT       1987-4-19   1987-1-9          1  <--SCOTT

ADAMS       1987-5-23   1987-2-12         2  <--SCOTTADAMS

 

 

-----------------------------------------------

-----------------------------------------------

 

 

--Compute average salary for defined range计算定义范围的平均工资

--计算每个员工雇佣之前100天内雇佣员工的平均工资。

select ename,hiredate,sal,

       avg(sal) over(order by hiredate ASC RANGE 100 PRECEDING) as "AVG_SAL1",

       to_char(avg(sal) over(order by hiredate ASC RANGE 100 PRECEDING) ,'999999.99') as "AVG_SAL2" --格式化显示内容

  from scott.emp

 order by hiredate ASC

 

ENAME       HIREDATE    SAL         AVG_SAL1                AVG_SAL2

SMITH        1980-12-17  800.00      800                        800.00

ALLEN        1981-2-20   1600.00     1200                       1200.00

WARD         1981-2-22   1250.00     1216.66666666667         1216.67

JONES        1981-4-2    2975.00     1941.66666666667         1941.67

BLAKE        1981-5-1    2850.00     2168.75                   2168.75

CLARK        1981-6-9    2450.00     2758.33333333333         2758.33

TURNER      1981-9-8    1500.00     1975                       1975.00

MARTIN      1981-9-28   1250.00     1375                       1375.00

KING         1981-11-17  5000.00     2583.33333333333         2583.33

JAMES        1981-12-3   950.00      2340                       2340.00

FORD         1981-12-3   3000.00     2340                       2340.00

MILLER      1982-1-23   1300.00     2562.5                    2562.50

SCOTT        1987-4-19   3000.00     3000                       3000.00

ADAMS        1987-5-23   1100.00     2050                       2050.00

 

 

-----------------------------------------------

-----------------------------------------------

 

 

--Row Windows行窗口

--行窗口为实际单位,是包括在窗口中实际行数。

--例如可以计算一给定记录的平均工资,该记录包括其之前或之后雇佣的员工(至多5名)

SELECT ename, hiredate, sal,

       AVG(sal)  OVER (ORDER BY hiredate ASC ROWS 5 PRECEDING) AvgAsc,

       COUNT(*)  OVER (ORDER BY hiredate ASC ROWS 5 PRECEDING) CntAsc,

       AVG(sal)  OVER (ORDER BY hiredate DESC ROWS 5 PRECEDING) AvgDes,

       COUNT(*)  OVER (ORDER BY hiredate DESC ROWS 5 PRECEDING) CntDes

  FROM scott.emp

 ORDER BY hiredate

 

ENAME       HIREDATE    SAL         AVGASC      CNTASC  AVGDES    CNTDES

SMITH        1980-12-17  800.00         800 1       1988 6

ALLEN        1981-2-20   1600.00       1200 2       2104 6

WARD         1981-2-22   1250.00       1217 3       2046 6

JONES        1981-4-2    2975.00       1656 4       2671 6

BLAKE        1981-5-1    2850.00       1895 5       2675 6

CLARK        1981-6-9    2450.00       1988 6       2358 6

TURNER      1981-9-8    1500.00       2104 6       2167 6

MARTIN      1981-9-28   1250.00       2046 6       2417 6

KING         1981-11-17  5000.00       2671 6       2392 6

JAMES        1981-12-3   950.00        2333 6       1588 4

FORD         1981-12-3   3000.00       2358 6       1870 5

MILLER      1982-1-23   1300.00       2167 6       1800 3

SCOTT        1987-4-19   3000.00       2417 6       2050 2

ADAMS        1987-5-23   1100.00       2392 6       1100 1

 

--分析以上语句:

--该窗口中包括6行,现有行及此行之前5行,其中之前ORDER BY语句定义。

--对于ROW(行)的划分,不受RANGE(范围)划分的限制——数据可以是任何类型,order by可包括许多列。

--注意,也要选择COUNT*),可以说明是多少行的平均值。从ALLEN记录可以清楚看到,他之前

--雇佣员工平均工资的计算使用了2个记录,他之后雇佣员工平均工资的计算使用了6个记录。

 

 

-----------------------------------------------

-----------------------------------------------

 

 

--Accessing Rows Around Your Current Row访问当前行前后的行

我们常常不仅想访问当前行,还想访问之前之后行中的数据。

例如,某份报告需要表明各部门的所有员工、员工雇佣日期、距上一雇佣的天数、距下一雇佣的天数。

 

直接编写SQL会比较困难,其执行性能必然存在问题。过去我常用的方法是select a select或编写PL/SQL函数,

从当前行得到数据,并找到之前以及之后行中的数据。这样可以达到目的,查询的开发与运行会带来很大开销。

 

SELECT deptno, ename, hiredate,

       LAG(hiredate,1,NULL)            OVER (PARTITION BY deptno ORDER BY hiredate, ename) last_hire,

       hiredate - LAG(hiredate,1,NULL) OVER (PARTITION BY deptno ORDER BY hiredate, ename) days_last,

       LEAD(hiredate,1,NULL)           OVER (PARTITION BY deptno ORDER BY hiredate, ename) next_hire,

       LEAD(hiredate,1,NULL)           OVER (PARTITION BY deptno ORDER BY hiredate, ename) - hiredate days_next

  FROM scott.emp

 ORDER BY deptno, hiredate

 

DEPTNO      ENAME       HIREDATE    LAST_HIRE   DAYS_LAST   NEXT_HIRE   DAYS_NEXT

10           CLARK        1981-6-9    1981-11-17  161

10           KING         1981-11-17  1981-6-9    161          1982-1-23   67

10           MILLER      1982-1-23   1981-11-17  67         

20           SMITH        1980-12-17  1981-4-2    106

20           JONES        1981-4-2    1980-12-17  106          1981-12-3   245

20           FORD         1981-12-3   1981-4-2    245          1987-4-19   1963

20           SCOTT        1987-4-19   1981-12-3   1963         1987-5-23   34

20           ADAMS        1987-5-23   1987-4-19   34         

30           ALLEN        1981-2-20   1981-2-22   2

30           WARD         1981-2-22   1981-2-20   2            1981-5-1    68

30           BLAKE        1981-5-1    1981-2-22   68           1981-9-8    130

30           TURNER      1981-9-8    1981-5-1    130          1981-9-28   20

30           MARTIN      1981-9-28   1981-9-8    20           1981-12-3   66

30           JAMES        1981-12-3   1981-9-28   66         

 

--分析以上语句:

--LEADLAG例程可看作对分组进行索引。使用这些函数可以访问任何一行。

--上面例子表明KING记录包括之前一行(上一雇佣)和下一行(下一)中的数据(加粗红色字体)。

--可以很容易的访问排序后的分组中当前记录之前或之后记录中的字段。

 

 

-----------------------------------------------

-----------------------------------------------

 

 

 

 

Links and Documents链接和参考文档

Further articles about Analytic Functions can be found in:

有关分析函数的更多文章请看:

Oracle9i SQL Reference Release 1 (9.0.1) Oracle9i SQL参考 

Oracle9i Database Concepts Release 1 (9.0.1) Oracle9i数据库概念 

Oracle9i Data Warehousing Guide Release 1 (9.0.1) Oracle9i数据仓储指南 

 

 

 

 

 

-----------------------------------------------------------------------

以下为原文:

-----------------------------------------------------------------------

Christina,valenwon翻译2004-11-28

Analytic Functions in Oracle 8i and 9i

Oracle 8i and 9i分析函数

Contents目录

Overview and Introduction概述与简介

How Analytic Functions Work分析函数原理

The Syntax句法

Calculate a running Total累计计算

Top-N Queries前N条查询

Example 1例1

Example 2例2

Windows窗口

Range Windows范围窗口

Compute average salary for defined range计算定义范围的平均工资

Row Windows行窗口

Accessing Rows Around Your Current Row访问当前行前后的行

LAG

LEAD

Determine the First Value / Last Value of a Group确定组的首值和末值

Crosstab or Pivot Queries交叉表或Pivot查询

Conclusion结论

Links and Documents链接和文档

Overview概述:

Analytic Functions, which have been available since Oracle 8.1.6, are designed to

address such problems as "Calculate a running total", "Find percentages within a

group", "Top-N queries", "Compute a moving average" and many more. Most of these

problems can be solved using standard PL/SQL, however the performance is often

not what it should be. Analytic Functions add extensions to the SQL language that not

only make these operations easier to code; they make them faster than could be

achieved with pure SQL or PL/SQL. These extensions are currently under review by

the ANSI SQL committee for inclusion in the SQL specification.

分析函数,最早是从ORACLE8.1.6开始出现的,它的设计目的是为了解决诸如累计

计算找出分组内百分比前-N条查询移动平均数计算”"等问题。其

实大部分的问题都可以用PL/SQL解决,但是它的性能并不能达到你所期望的效果。分

析函数是SQL言语的一种扩充,它并不是仅仅试代码变得更简单而已,它的速度比纯

粹的SQL或者PL/SQL更快。现在这些扩展已经被纳入了美国国家标准化组织SQL委员

会的SQL规范说明书中。

How Analytic Functions Work ? 分析函数的原理

Analytic functions compute an aggregate value based on a group of rows. They differ

from aggregate functions in that they return multiple rows for each group. The group

of rows is called a window and is defined by the analytic clause. For each row, a

"sliding" window of rows is defined. The window determines the range of rows used to

perform the calculations for the "current row". Window sizes can be based on either a

physical number of rows or a logical interval such as time. 分析函数是在一个记录行分

组的基础上计算它们的总值。与集合函数不同,他们返回各分组的多行记录。行的分

组被称窗口,并通过分析语句定义。对于每记录行,定义了一个滑动窗口。该窗

口确定当前行计算的范围。窗口的大小可由各行的实际编号或由时间等逻辑间隔

确定。

Analytic functions are the last set of operations performed in a query except for the

final ORDER BY clause. All joins and all WHERE, GROUP BY, and HAVING clauses

are completed before the analytic functions are processed. Therefore, analytic

functions can appear only in the select list or ORDER BY clause. 除了ORDER

BY(按…排序)语句外,分析函数是一条查询被执行的操作。所有合

并、WHEREGROUP BYHAVING语句都是分析函数处理之前完成的。因此,分析

函数只出现在选择目录或ORDER BY(按…排序)语句中。

The Syntax句法

The Syntax of analytic functions is rather straightforward in appearance分析函数的句

法非常简单。

Analytic-Function(<Argument>,<Argument>,...)

OVER (

<Query-Partition-Clause>

<Order-By-Clause>

<Windowing-Clause>

)

Analytic-Function分析函数的种类

Specify the name of an analytic function, Oracle actually provides many

analytic functions such as AVG, CORR, COVAR_POP, COVAR_SAMP, COUNT,

CUME_DIST, DENSE_RANK, FIRST, FIRST_VALUE, LAG, LAST, LAST_VALUE, LEAD,

MAX, MIN, NTILE, PERCENT_RANK, PERCENTILE_CONT, PERCENTILE_DISC, RANK,

RATIO_TO_REPORT, STDDEV, STDDEV_POP, STDDEV_SAMP, SUM, VAR_POP, VAR_SAMP,

VARIANCE.

分析函数的名称,ORACLE通常多个分析函数,包括:AVG, CORR, COVAR_POP,

COVAR_SAMP, COUNT, CUME_DIST, DENSE_RANK, FIRST, FIRST_VALUE, LAG, LAST,

LAST_VALUE, LEAD, MAX, MIN, NTILE, PERCENT_RANK, PERCENTILE_CONT,

PERCENTILE_DISC, RANK, RATIO_TO_REPORT, STDDEV, STDDEV_POP, STDDEV_SAMP,

SUM, VAR_POP, VAR_SAMP, VARIANCE.

Arguments参数

Analytic functions take 0 to 3 arguments. 分析函数通常有03个参数

Query-Partition-Clause查询划分语句

The PARTITION BY clause logically breaks a single result set into N

groups, according to the criteria set by the partition expressions. The

words "partition" and "group" are used synonymously here. The

analytic functions are applied to each group independently, they are

reset for each group. 根据划分表达式设置的规则,PARTITION

BY(按…划分)将一个结果逻辑分成N个分组划分表达式。在此

分组用作同义词。分析函数独立应用于各个分组,并在应

用时重置。

Order-By-Clause排序语句

The ORDER BY clause specifies how the data is sorted within each

group (partition). This will definitely affect the outcome of any analytic

function. ORDER BY(按…排序)语句规定了每个分组(划分)的数

据如何排序。这必然影响分析函数的结果。

Windowing-Clause窗口生成语句

The windowing clause gives us a way to define a sliding or anchored

window of data, on which the analytic function will operate, within a

group. This clause can be used to have the analytic function compute

its value based on any arbitrary sliding or anchored window within a

group. More information on windows can be found here. 窗口生成语

句用以定义滑动或固定数据窗口,分析函数在分组内进行分析。该语

句能够对分组中任意定义的滑动或固定窗口进行计算。点击此处了解

更多。

Example: Calculate a running Total例:累计计算

This example shows the cumulative salary within a departement row by row, with

each row including a summation of the prior rows salary. 本例中对某部门的工资进行

逐行计算,每行包括之前所有行中工资的合计。

set autotrace traceonly explain

break on deptno skip 1

column ename format A6

column deptno format 999

column sal format 99999

column seq format 999

SELECT ename "Ename", deptno "Deptno", sal "Sal",

SUM(sal)

OVER (ORDER BY deptno, ename) "Running Total",

SUM(SAL)

OVER (PARTITION BY deptno

ORDER BY ename) "Dept Total",

ROW_NUMBER()

OVER (PARTITION BY deptno

ORDER BY ENAME) "Seq"

FROM emp

ORDER BY deptno, ename

/

Ename Deptno Sal Running Total Dept Total Seq

------ ------ ------ ------------- ---------- ----

CLARK 10 2450 2450 2450 1

KING 5000 7450 7450 2

MILLER 1300 8750 8750 3

ADAMS 20 1100 9850 1100 1

FORD 3000 12850 4100 2

JONES 2975 15825 7075 3

SCOTT 3000 18825 10075 4

SMITH 800 19625 10875 5

ALLEN 30 1600 21225 1600 1

BLAKE 2850 24075 4450 2

JAMES 950 25025 5400 3

MARTIN 1250 26275 6650 4

TURNER 1500 27775 8150 5

WARD 1250 29025 9400 6

Execution Plan

---------------------------------------------------

0 SELECT STATEMENT Optimizer=CHOOSE

1 0 WINDOW (SORT)

2 1 TABLE ACCESS (FULL) OF 'EMP'

Statistics

---------------------------------------------------

0 recursive calls

0 db block gets

3 consistent gets

0 physical reads

0 redo size

1658 bytes sent via SQL*Net to client

503 bytes received via SQL*Net from client

2 SQL*Net roundtrips to/from client

1 sorts (memory)

0 sorts (disk)

14 rows processed

The example shows how to calculate a "Running Total" for the entire query. This is

done using the entire ordered result set, via SUM(sal) OVER (ORDER BY deptno,

ename).本例指出了如何计算整条查询的累计。即使用排序后的整个结果集合,通

SUM(sal) OVER (ORDER BY deptno, ename)函数得到?

Further, we were able to compute a running total within each department, a total

that would be reset at the beginning of the next department. The PARTITION BY

deptno in that SUM(sal) caused this to happen, a partitioning clause was specified in

the query in order to break the data up into groups.可以进一步计算各个部门的累计

值,该值在开始下一个部门计算时将被重置。由SUM(sal)中的PARTITION BY

deptno实现?该条查询中指定划分语句将数据进行分组。

The ROW_NUMBER() function is used to sequentially number the rows returned in

each group, according to our ordering criteria (a "Seq" column was added to in order

to display this position). 根据排序规则(增加了“Seq”列以显示该状

)?ROW_NUMBER()函数将每组返回的记录行进行顺序编号,

The execution plan shows, that the whole query is very well performed with only 3

consistent gets, this can never be accomplished with standard SQL or even PL/SQL.

执行计划显示,整条查询仅需3条一致get函数就可以很好的执行。这一点是标

SQL甚至PL/SQL不能都实现的。

Top-N Queries前N条查询

How can we get the Top-N records by some set of fields ?如何通过部分字段得到

前N条记录?

Prior to having access to these analytic functions, questions of this nature were

extremely difficult to answer.在未使用这些分析函数之前,很难对此类问题做出回答。

There are some problems with Top-N queries however; mostly in the way people

phrase them. It is something to be careful about when designing reports. Consider

this seemingly sensible request: 人们关于前N条查询的说法存在问题。在设计报告

时,应留意这一点。

I would like the top three paid sales reps by department???????????3????????

The problem with this question is that it is ambiguous. It is ambiguous because of

repeated values, there might be four people who all make the same salary, what

should we do then ?这句话的问题在于含混不清。因为存在重复的值,如果有四个人

领着同样的工资,该怎么处理?

Let's look at three examples, all use the well known table EMP.以下3个例子均使

用EMP表。

Example 1例1

Sort the sales people by salary from greatest to least. Give the first three rows. If

there are less then three people in a department, this will return less than three

records.从多到少排列销售人员的工资,取前三行。如果该部门少于三人,则返回的记

录少于三个。

set autotrace on explain

break on deptno skip 1

SELECT * FROM (

SELECT deptno, ename, sal, ROW_NUMBER()

OVER (

PARTITION BY deptno ORDER BY sal DESC

) Top3 FROM emp

)

WHERE Top3 <= 3

/

DEPTNO ENAME SAL TOP3

---------- ---------- ---------- ----------

10 KING 5000 1

CLARK 2450 2

MILLER 1300 3

20 SCOTT 3000 1

FORD 3000 2

JONES 2975 3

30 BLAKE 2850 1

ALLEN 1600 2

TURNER 1500 3

9 rows selected.

Execution Plan

--------------------------------------------

0 SELECT STATEMENT Optimizer=CHOOSE

1 0 VIEW

2 1 WINDOW (SORT)

3 2 TABLE ACCESS (FULL) OF 'EMP'

This query works by sorting each partition (or group, which is the deptno), in a

descending order, based on the salary column and then assigning a sequential row

number to each row in the group as it is processed. The use of a WHERE clause

after doing this to get just the first three rows in each partition.该查询根据工资列以降

序排列各个划分(或分组,属于该deptno),并在处理过程中为每行分配一个顺序

号。然后使用WHERE语句得到各划分的前三行。

Example 2例2

Give me the set of sales people who make the top 3 salaries - that is, find the set of

distinct salary amounts, sort them, take the largest three, and give me everyone who

makes one of those values.我需要工资为前三位的销售人员名字——即查找工资金

额、排序、取最高的三项金额、给我领取这些工资的人员的名字。

SELECT * FROM (

SELECT deptno, ename, sal,

DENSE_RANK()

OVER (

PARTITION BY deptno ORDER BY sal desc

) TopN FROM emp

)

WHERE TopN <= 3

ORDER BY deptno, sal DESC

/

DEPTNO ENAME SAL TOPN

---------- ---------- ---------- ----------

10 KING 5000 1

CLARK 2450 2

MILLER 1300 3

20 SCOTT 3000 1 <--- !

FORD 3000 1 <--- !

JONES 2975 2

ADAMS 1100 3

30 BLAKE 2850 1

ALLEN 1600 2

30 TURNER 1500 3

10 rows selected.

Execution Plan

--------------------------------------------

0 SELECT STATEMENT Optimizer=CHOOSE

1 0 VIEW

2 1 WINDOW (SORT PUSHED RANK)

3 2 TABLE ACCESS (FULL) OF 'EMP'

Here the DENSE_RANK function was used to get the top three salaries. We

assigned the dense rank to the salary column and sorted it in a descending order.

中,使用DENSE_RANK函数得出最高的三个工资金额。然后指定Dense rank至工资

列,并将其按降序排列。

The DENSE_RANK function computes the rank of a row in an ordered group of rows.

The ranks are consecutive integers beginning with 1. The largest rank value is the

number of unique values returned by the query. Rank values are not skipped in the

event of ties. Rows with equal values for the ranking criteria receive the same rank.

DENSE_RANK函数计算排序后分组中各行的序数。序数为从1开始的连续整数。最大

的序数就是查询所所返回唯一值的个数。如果出现并列,序数不跳计。具有相同值的

列的序数相同。

The DENSE_RANK function does not skip numbers and will assign the same number

to those rows with the same value. Hence, after the result set is built in the inline

view, we can simply select all of the rows with a dense rank of three or less, this gives

us everyone who makes the top three salaries by department number.

DENSE_RANK函数不跳计序数,并为相同值的列赋予相同的序数。结果集合在当前窗

口建立后,通过部门编号选择Dense rank为3 或3之前的行,就可以知道工资在该部门

前三位的名字。

Windows窗口

The windowing clause gives us a way to define a sliding or anchored window of data,

on which the analytic function will operate, within a group. The default window is an

anchored window that simply starts at the first row of a group an continues to the

current row. 窗口语句用以定义滑动或固定数据窗口。在其上面运行组的分析函数。默

认窗口为固定窗口,从第一行开始到当前行。

We can set up windows based on two criteria: RANGES of data values or ROWS

offset from the current row. It can be said, that the existance of an ORDER BY in

an analytic function will add a default window clause of RANGE UNBOUNDED

PRECEDING. That says to get all rows in our partition that came before us as

specified by the ORDER BY clause.可根据两种规则对窗口进行设置:数据值的

或当前行指定区距的。分析函数中的ORDER BY会默认添加一条窗口语

句:RANGE UNBOUNDED PRECEDING。即按照ORDER BY语句取得划分中的当前

之前的所有行。

Let's look at an example with a sliding window within a group and compute the sum of

the current row's SAL column plus the previous 2 rows in that group. If we need a

report that shows the sum of the current employee's salary with the preceding two

salaries within a departement, it would look like this.5。以下例子为一个分组中的滑动

窗口,计算该组中当前行与其前两行的SAL列的和。如我们需要计算当前员工的工资和

其之前的两人工资的和,如下例所示。

break on deptno skip 1

column ename format A6

column deptno format 999

column sal format 99999

SELECT deptno "Deptno", ename "Ename", sal "Sal",

SUM(SAL)

OVER (PARTITION BY deptno

ORDER BY ename

ROWS 2 PRECEDING) "Sliding Total"

FROM emp

ORDER BY deptno, ename

/

Deptno Ename Sal Sliding Total

------ ------ ------ -------------

10 CLARK 2450 2450

KING 5000 7450

MILLER 1300 8750

20 ADAMS 1100 1100

FORD 3000 4100

JONES 2975 7075 ^

SCOTT 3000 8975 |

SMITH 800 6775 /-- Sliding Window

30 ALLEN 1600 1600

BLAKE 2850 4450

JAMES 950 5400

MARTIN 1250 5050

TURNER 1500 3700

WARD 1250 4000

The partition clause makes the SUM (sal) be computed within each department,

independent of the other groups. Tthe SUM (sal) is ' reset ' as the department

changes. The ORDER BY ENAME clause sorts the data within each department by

ENAME; this allows the window clause: ROWS 2 PRECEDING, to access the 2

rows prior to the current row in a group in order to sum the salaries.划分语句

使SUM (sal)在各部门内进行,并独立于其他组。当部门改变时,SUM (sal) 也被

ORDER BY ENAME语句通过ENAME排列各部门的数据。这使得窗口语

句:ROWS 2 PRECEDING获取该分组中当前行之前两行的数据以计算合计工资。

For example, if you note the SLIDING TOTAL value for SMITH is 6 7 7 5, which is the

sum of 800, 3000, and 2975. That was simply SMITH's row plus the salary from the

preceding two rows in the window.例如,SMITHSLIDING TOTAL(滑动合计)6 7

7 58003000以及2975的和。即窗口中SMITH行及其之前两行工资的简单相加。

Range Windows范围窗口

Range windows collect rows together based on a WHERE clause. If I say ' range 5

preceding ' for example, this will generate a sliding window that has the set of all

preceding rows in the group such that they are within 5 units of the current row.

These units may either be numeric comparisons or date comparisons and it is not

valid to use RANGE with datatypes other than numbers and dates.范围窗口根

WHERE语句对行进行收集。例如之前5”将会生成一个滑动视窗,包括该分组中当

前行之前的5个单位所有行。这些单位可以是数值或日期,使用数字或日期以外的其他

数据类型表示的范围无效。

Example

Count the employees which where hired within the last 100 days preceding the own

hiredate. The range window goes back 100 days from the current row's hiredate and

then counts the rows within this range. The solution ist to use the following window

specification:计算当前雇佣日期之前100天内雇佣的员工的数量。范围窗口返回当前行

雇佣日期100天之前并在这个范围内计算行数。计算使用以下窗口规格:

COUNT(*) OVER (ORDER BY hiredate ASC RANGE 100 PRECEDING)

column ename heading "Name" format a8

column hiredate heading "Hired" format a10

column hiredate_pre heading "Hired-100" format a10

column cnt heading "Cnt" format 99

SELECT ename, hiredate, hiredate-100 hiredate_pre,

COUNT(*)

OVER (

ORDER BY hiredate ASC

RANGE 100 PRECEDING

) cnt

FROM emp

ORDER BY hiredate ASC

/

Name Hired Hired-100 Cnt

-------- ---------- ---------- ---

SMITH 17-DEC-80 08-SEP-80 1

ALLEN 20-FEB-81 12-NOV-80 2

WARD 22-FEB-81 14-NOV-80 3

JONES 02-APR-81 23-DEC-80 3

BLAKE 01-MAY-81 21-JAN-81 4

CLARK 09-JUN-81 01-MAR-81 3

TURNER 08-SEP-81 31-MAY-81 2

MARTIN 28-SEP-81 20-JUN-81 2

KING 17-NOV-81 09-AUG-81 3

JAMES 03-DEC-81 25-AUG-81 5

FORD 03-DEC-81 25-AUG-81 5

MILLER 23-JAN-82 15-OCT-81 4

SCOTT 09-DEC-82 31-AUG-82 1

ADAMS 12-JAN-83 04-OCT-82 2

We ordered the single partition by hiredate ASC. If we look for example at the row for

CLARK we can see that his hiredate was 09-JUN-81, and 100 days prior to that is the

date 01-MAR-81. If we look who was hired between 01-MAR-81 and 09-JUN-81, we

find JONES (hired: 02-APR-81) and BLAKE (hired: 01-MAY-81). This are 3 rows

including the current row, this is what we see in the column "Cnt" of CLARK's row.

据雇佣日期ASC对每个划分进行排序。例中CLARK行可看到其雇佣日期

198169日,100天之前是198131日,看看在这期间雇佣的员工,会发

JONES(雇佣日期:198142日)、BLAKE(雇佣日期:198151日),

3行,包括当前行,在CLARK“Cnt”列中。

Compute average salary for defined range计算定义范围的平均工资

As an example, compute the average salary of people hired within 100 days before

for each employee. The query looks like this:例如,计算每个员工雇佣之前100天内雇

佣员工的平均工资。查询如下:

column ename heading "Name" format a8

column hiredate heading "Hired" format a10

column hiredate_pre heading "Hired-100" format a10

column avg_sal heading "Avg-100" format 999999

SELECT ename, hiredate, sal,

AVG(sal)

OVER (

ORDER BY hiredate ASC

RANGE 100 PRECEDING

) avg_sal

FROM emp

ORDER BY hiredate ASC

/

Name Hired SAL Avg-100

-------- ---------- ---------- -------

SMITH 17-DEC-80 800 800

ALLEN 20-FEB-81 1600 1200

WARD 22-FEB-81 1250 1217

JONES 02-APR-81 2975 1942

BLAKE 01-MAY-81 2850 2169

CLARK 09-JUN-81 2450 2758

TURNER 08-SEP-81 1500 1975

MARTIN 28-SEP-81 1250 1375

KING 17-NOV-81 5000 2583

JAMES 03-DEC-81 950 2340

FORD 03-DEC-81 3000 2340

MILLER 23-JAN-82 1300 2563

SCOTT 09-DEC-82 3000 3000

ADAMS 12-JAN-83 1100 2050

Look at CLARK again, since we understand his range window within the group. We

can see that the average salary of 2758 is equal to (2975+2850+2450)/3. This is the

average of the salaries for CLARK and the rows preceding CLARK, those of JONES

and BLAKE. The data must be sorted in ascending order.再看看CLARK,我们已知

道他在本组中的范围窗口,可以看到平均工资2758由(2975+2850+2450/3得来,

CLARK行和其之前的JONESBLAKE行工资的平均数。数据必须按由小到大顺序排

列。

Row Windows行窗口

Row Windows are physical units; physical number of rows, to include in the window.

For example you can calculate the average salary of a given record with the (up to 5)

employees hired before them or after them as follows:行窗口为实际单位,是包括在窗

口中实际行数。例如可以计算一给定记录的平均工资,该记录包括其之前或之后雇佣

的员工(至多5名),具体如下:

set numformat 9999

SELECT ename, hiredate, sal,

AVG(sal)

OVER (ORDER BY hiredate ASC ROWS 5 PRECEDING) AvgAsc,

COUNT(*)

OVER (ORDER BY hiredate ASC ROWS 5 PRECEDING) CntAsc,

AVG(sal)

OVER (ORDER BY hiredate DESC ROWS 5 PRECEDING) AvgDes,

COUNT(*)

OVER (ORDER BY hiredate DESC ROWS 5 PRECEDING) CntDes

FROM emp

ORDER BY hiredate

/

ENAME HIREDATE SAL AVGASC CNTASC AVGDES CNTDES

---------- --------- ----- ------ ------ ------ ------

SMITH 17-DEC-80 800 800 1 1988 6

ALLEN 20-FEB-81 1600 1200 2 2104 6

WARD 22-FEB-81 1250 1217 3 2046 6

JONES 02-APR-81 2975 1656 4 2671 6

BLAKE 01-MAY-81 2850 1895 5 2675 6

CLARK 09-JUN-81 2450 1988 6 2358 6

TURNER 08-SEP-81 1500 2104 6 2167 6

MARTIN 28-SEP-81 1250 2046 6 2417 6

KING 17-NOV-81 5000 2671 6 2392 6

JAMES 03-DEC-81 950 2333 6 1588 4

FORD 03-DEC-81 3000 2358 6 1870 5

MILLER 23-JAN-82 1300 2167 6 1800 3

SCOTT 09-DEC-82 3000 2417 6 2050 2

ADAMS 12-JAN-83 1100 2392 6 1100 1

The window consist of up to 6 rows, the current row and five rows " in front of " this

row, where " in front of " is defined by the ORDER BY clause. With ROW partitions,

we do not have the limitation of RANGE partition - the data may be of any type and

the order by may include many columns. Notice, that we selected out a COUNT(*) as

well. This is useful just to demonstrate how many rows went into making up a given

average. We can see clearly that for ALLEN's record, the average salary computation

for people hired before him used only 2 records whereas the computation for salaries

of people hired after him used 6.该窗口中包括6行,现有行及此行之前5行,其中

之前ORDER BY语句定义。对于ROW(行)的划分,不受RANGE(范围)划分的

限制——数据可以是任何类型,order by可包括许多列。注意,也要选

COUNT*),可以说明是多少行的平均值。从ALLEN记录可以清楚看到,他之前

雇佣员工平均工资的计算使用了2个记录,他之后雇佣员工平均工资的计算使用

6个记录。

Accessing Rows Around Your Current Row访问当前行前后的行

Frequently you want to access data not only from the current row but the current row

" in front of " or " behind " them. For example, let's say you need a report that shows,

by department all of the employees; their hire date; how many days before was the

last hire; how many days after was the next hire.我们常常不仅想访问当前行,还想访

之前之后行中的数据。例如,某份报告需要表明各部门的所有员工、员工雇佣

日期、距上一雇佣的天数、距下一雇佣的天数。

Using straight SQL this query would be difficult to write. Not only that but its

performance would once again definitely be questionable. The approach I typically

took in the past was either to "select a select" or write a PL/SQL function that would

take some data from the current row and "find" the previous and next rows data. This

workedbut introduce large overhead into both the development of the query and the

run-time execution of the query.直接编写SQL会比较困难,其执行性能必然存在问

题。过去我常用的方法是“select a select”或编写PL/SQL函数,从当前行得到数据,并

找到之前以及之后行中的数据。这样可以达到目的,查询的开发与运行会带来很大开

销。

Using analytic functions, this is easy and efficient to do.使用分析函数,简单易行且有

效。

set echo on

column deptno format 99 heading Dep

column ename format a6 heading Ename

column hiredate heading Hired

column last_hire heading LastHired

column days_last heading DaysLast

column next_hire heading NextHire

column days_next heading NextDays

break on deptno skip 1

SELECT deptno, ename, hiredate,

LAG(hiredate,1,NULL)

OVER (PARTITION BY deptno

ORDER BY hiredate, ename) last_hire,

hiredate - LAG(hiredate,1,NULL)

OVER (PARTITION BY deptno

ORDER BY hiredate, ename) days_last,

LEAD(hiredate,1,NULL)

OVER (PARTITION BY deptno

ORDER BY hiredate, ename) next_hire,

LEAD(hiredate,1,NULL)

OVER (PARTITION BY deptno

ORDER BY hiredate, ename) - hiredate days_next

FROM emp

ORDER BY deptno, hiredate

/

Dep Ename Hired LastHired DaysLast NextHire NextDays

--- ------ --------- --------- -------- --------- --------

10 CLARK 09-JUN-81 17-NOV-81 161

KING 17-NOV-81 09-JUN-81 161 23-JAN-82 67

MILLER 23-JAN-82 17-NOV-81 67

20 SMITH 17-DEC-80 02-APR-81 106

JONES 02-APR-81 17-DEC-80 106 03-DEC-81 245

FORD 03-DEC-81 02-APR-81 245 09-DEC-82 371

SCOTT 09-DEC-82 03-DEC-81 371 12-JAN-83 34

ADAMS 12-JAN-83 09-DEC-82 34

30 ALLEN 20-FEB-81 22-FEB-81 2

WARD 22-FEB-81 20-FEB-81 2 01-MAY-81 68

BLAKE 01-MAY-81 22-FEB-81 68 08-SEP-81 130

TURNER 08-SEP-81 01-MAY-81 130 28-SEP-81 20

MARTIN 28-SEP-81 08-SEP-81 20 03-DEC-81 66

JAMES 03-DEC-81 28-SEP-81 66

The LEAD and LAG routines could be considered a way to "index into your

partitioned group ". Using these functions you can access any individual row. Notice

for example in the above printout, it shows that the record for KING includes the data

(in bold red font) from the prior row (LAST HIRE) and the next row (NEXT-HIRE). We

can access the fields in records preceding or following the current record in an

ordered partition easily. LEADLAG例程可看作对分组进行索引。使用这些函数可以

访问任何一行。上面例子表明KING记录包括之前一行(上一雇佣)和下一行(下一)

中的数据(加粗红色字体)。可以很容易的访问排序后的分组中当前记录之前或之后

记录中的字段。

LAG

LAG ( value_expr [, offset] [, default] )

OVER ( [query_partition_clause] order_by_clause )

LAG provides access to more than one row of a table at the same time without a self

join. Given a series of rows returned from a query and a position of the cursor, LAG

provides access to a row at a given physical offset prior to that position. LAG无须自

合并就可以获取同一个表格中的多行记录。知道来自查询中的多行以及光标的位

置,LEAD就可以进入位置指定区距之前的行。

If you do not specify offset, then its default is 1. The optional default value is returned

if the offset goes beyond the scope of the window. If you do not specify default, then

its default value is null.如果不指定??,默认值为1。如果区距超出窗口范围,则返

回可指定默认值。如不指定默认值,则默认值为null

The following example provides, for each person in the EMP table, the salary of the

employee hired just before:下例中,EMP表中的每个人之前雇佣员工的工资:

SELECT ename,hiredate,sal,

LAG(sal, 1, 0)

OVER (ORDER BY hiredate) AS PrevSal

FROM emp

WHERE job = 'CLERK';

Ename Hired SAL PREVSAL

------ --------- ----- -------

SMITH 17-DEC-80 800 0

JAMES 03-DEC-81 950 800

MILLER 23-JAN-82 1300 950

ADAMS 12-JAN-83 1100 1300

LEAD

LEAD ( value_expr [, offset] [, default] )

OVER ( [query_partition_clause] order_by_clause )

LEAD provides access to more than one row of a table at the same time without a

self join. Given a series of rows returned from a query and a position of the cursor,

LEAD provides access to a row at a given physical offset beyond that position.

LEAD不用自合并就可同时进入一个表格中的多行。知道来自查询中的多行以及光标的

位置,LEAD就可以进入位置指定区距之后的行。

If you do not specify offset, then its default is 1. The optional default value is returned

if the offset goes beyond the scope of the table. If you do not specify default, then its

default value is null.如不指定区距,则默认值为1。如区距超出窗口范围则返回任意默

认值。如不指定默认值,默认值为0

The following example provides, for each employee in the EMP table, the hire date of

the employee hired just after:下例,EMP表中的每个员工,他们之后雇佣员工的雇佣

日期:

SELECT ename, hiredate,

LEAD(hiredate, 1)

OVER (ORDER BY hiredate) AS NextHired

FROM emp WHERE deptno = 30;

Ename Hired NEXTHIRED

------ --------- ---------

ALLEN 20-FEB-81 22-FEB-81

WARD 22-FEB-81 01-MAY-81

BLAKE 01-MAY-81 08-SEP-81

TURNER 08-SEP-81 28-SEP-81

MARTIN 28-SEP-81 03-DEC-81

JAMES 03-DEC-81

Determine the First Value / Last Value of a Group确定组中的第一个值/最后一个值

The FIRST_VALUE and LAST_VALUE functions allow you to select the first and last

rows from a group. These rows are especially valuable because they are often used

as the baselines in calculations. FIRST_VALUELAST_VALUE函数可以选择一组中

的第一行和最后一行。这些行很有用,因为它们经常用作计算的基线。

Example

The following example selects, for each employee in each department, the name of

the employee with the lowest salary. 下例为每个部门的每名员工、最高工资员工的姓

名。

break on deptno skip 1

SELECT deptno, ename, sal,

FIRST_VALUE(ename)

OVER (PARTITION BY deptno

ORDER BY sal ASC) AS MIN_SAL_HAS

FROM emp

ORDER BY deptno, ename;

DEPTNO ENAME SAL MIN_SAL_HAS

---------- ---------- ---------- -----------

10 CLARK 2450 MILLER

KING 5000 MILLER

MILLER 1300 MILLER

20 ADAMS 1100 SMITH

FORD 3000 SMITH

JONES 2975 SMITH

SCOTT 3000 SMITH

SMITH 800 SMITH

30 ALLEN 1600 JAMES

BLAKE 2850 JAMES

JAMES 950 JAMES

MARTIN 1250 JAMES

TURNER 1500 JAMES

WARD 1250 JAMES

The following example selects, for each employee in each department, the name of

the employee with the highest salary.下例中为每个部门的每名员工、最高工资员工的

姓名。

SELECT deptno, ename, sal,

FIRST_VALUE(ename)

OVER (PARTITION BY deptno

ORDER BY sal DESC) AS MAX_SAL_HAS

FROM emp

ORDER BY deptno, ename;

DEPTNO ENAME SAL MAX_SAL_HAS

---------- ---------- ---------- -----_-----

10 CLARK 2450 KING

KING 5000 KING

MILLER 1300 KING

20 ADAMS 1100 FORD

FORD 3000 FORD

JONES 2975 FORD

SCOTT 3000 FORD

SMITH 800 FORD

30 ALLEN 1600 BLAKE

BLAKE 2850 BLAKE

JAMES 950 BLAKE

MARTIN 1250 BLAKE

TURNER 1500 BLAKE

WARD 1250 BLAKE

The following example selects, for each employee in department 30 the name of the

employee with the lowest salary using an inline view下例为第30个部门中的每名员

工、工资最低员工的姓名,使用内联视图。

SELECT deptno, ename, sal,

FIRST_VALUE(ename)

OVER (ORDER BY sal ASC) AS MIN_SAL_HAS

FROM (SELECT * FROM emp WHERE deptno = 30)

DEPTNO ENAME SAL MIN_SAL_HAS

---------- ---------- ---------- -----------

30 JAMES 950 JAMES

MARTIN 1250 JAMES

WARD 1250 JAMES

TURNER 1500 JAMES

ALLEN 1600 JAMES

BLAKE 2850 JAMES

Crosstab or Pivot Queries交叉表或Pivot查询

A crosstab query, sometimes known as a pivot query, groups your data in a slightly

different way from those we have seen hitherto. A crosstab query can be used to get

a result with three rows (one for each project), with each row having three columns

(the first listing the projects and then one column for each year) -- like this:交叉表查

询,或者pivot查询,用稍微不同的方法将这些数据分组。交叉表查询可根据三行(一

行代表一个项目),每行有三列(第一列列出项目,然后一列代表一年)得出结果

——如下:

Project 2001 2002

ID CHF CHF

-------------------------------

100 123.00 234.50

200 543.00 230.00

300 238.00 120.50

Example

Let's say you want to show the top 3 salary earners in each department as columns.

The query needs to return exactly 1 row per department and the row would have 4

columns. The DEPTNO, the name of the highest paid employee in the department,

the name of the next highest paid, and so on. Using analytic functions this almost

easy, without analytic functions this was virtually impossible.如果需要?????显示每个

部门中工资最多的3个人,查询为每个部门返回一行,一行有4列。DEPTNO,该部门

中工资最高的人,工资第二高人的名字,依次类推。使用分析函数很容易做到,不采

用有分析函数这将是不可能的。

SELECT deptno,

MAX(DECODE(seq,1,ename,null)) first,

MAX(DECODE(seq,2,ename,null)) second,

MAX(DECODE(seq,3,ename,null)) third

FROM (SELECT deptno, ename,

row_number()

OVER (PARTITION BY deptno

ORDER BY sal desc NULLS LAST) seq

FROM emp)

WHERE seq <= 3

GROUP BY deptno

/

DEPTNO FIRST SECOND THIRD

---------- ---------- ---------- ----------

10 KING CLARK MILLER

20 SCOTT FORD JONES

30 BLAKE ALLEN TURNER

Note the inner query, that assigned a sequence (RowNr) to each employee by

department number in order of salary.注意内查询,它按照工资高低根据部门为每名员

工赋予一个序列(RowNr)。

SELECT deptno, ename, sal,

row_number()

OVER (PARTITION BY deptno

ORDER BY sal desc NULLS LAST) RowNr

FROM emp;

DEPTNO ENAME SAL ROWNR

---------- ---------- ---------- ----------

10 KING 5000 1

10 CLARK 2450 2

10 MILLER 1300 3

20 SCOTT 3000 1

20 FORD 3000 2

20 JONES 2975 3

20 ADAMS 1100 4

20 SMITH 800 5

30 BLAKE 2850 1

30 ALLEN 1600 2

30 TURNER 1500 3

30 WARD 1250 4

30 MARTIN 1250 5

30 JAMES 950 6

The DECODE in the outer query keeps only rows with sequences 1, 2 or 3 and

assigns them to the correct "column". The GROUP BY gets rid of the redundant rows

and we are left with our collapsed result. It may be easier to understand if you see the

resultset without the aggregate function MAX grouped by deptno. 外查询

DECODE只保留含有序列123的记录行,并将它们分派到正确的GROUP

BY去掉多余行,只剩下压缩结果。如果没有根据deptno分组的合计函数MAX,结果会

更容易理解。

SELECT deptno,

DECODE(seq,1,ename,null) first,

DECODE(seq,2,ename,null) second,

DECODE(seq,3,ename,null) third

FROM (SELECT deptno, ename,

row_number()

OVER (PARTITION BY deptno

ORDER BY sal desc NULLS LAST) seq

FROM emp)

WHERE seq <= 3

/

DEPTNO FIRST SECOND THIRD

---------- ---------- ---------- ----------

10 KING

10 CLARK

10 MILLER

20 SCOTT

20 FORD

20 JONES

30 BLAKE

30 ALLEN

30 TURNER

The MAX aggregate function will be applied by the GROUP BY column DEPTNO. In

any given DEPTNO above only one row will have a non-null value for FIRST, the

remaining rows in that group will always be NULL. The MAX function will pick out the

non-null row and keep that for us. Hence, the group by and MAX will collapse our

resultset, removing the NULL values from it and giving us what we want.通过GROUP

BYDEPTNO(按DEPTNO分组列)应用MAX合计函数。在上面任意给出

DEPTNO中,只有一行将对FIRST有非null值,其余行将永远是NULLMAX函数将

选出非NULL值并保存。因此,group byMAX将会压缩结果集,去掉NULL值并给出

我们想要的值。

Conclusion结论

This new set of functionality holds some exiting possibilities. It opens up a whole new

way of looking at the data. It will remove a lot of procedural code and complex or

inefficient queries that would have taken a long tome to develop, to achieve the same

result.新的函数集包括了一些现有的可能性,它开辟处理数据的一种全新方法,减少过

程代码以及浪费大量时间开发的复杂或低效查询,实现相同的结果。

Links and Documents链接和参考文档

Further articles about Analytic Functions can be found in:有关分析函数的更多文章请

看:

Oracle9i SQL Reference Release 1 (9.0.1) Oracle9i SQL参考 版本19.0.1)。

Oracle9i Database Concepts Release 1 (9.0.1) Oracle9i数据库概念 版

19.0.1)。

Oracle9i Data Warehousing Guide Release 1 (9.0.1) Oracle9i数据仓储指南 版

19.0.1)。

评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值