Oracle Analytic Functions (Part 1)

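Analytic functions were introduced in Oracle 8.1.6 and give us a simple, efficient way to analyze data. Before they existed, we had to rely on self-joins, subqueries, inline views, or even complex stored procedures. Now a single simple SQL statement is often enough, and it usually executes considerably faster as well. The rest of this post describes the functions in detail.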
 
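This post covers the following functions:

1. The automatic subtotal functions rollup and cube

2. The ranking functions rank, dense_rank and row_number

3. lag and lead

4. Moving sums and moving averages with sum and avg

5. ratio_to_report, for percentages in reports

6. first and last, for picking the first or last value

Base data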
 
06:34:23 SQL> select * from t;
 
BILL_MONTH AREA_CODE NET_TYPE LOCAL_FARE
--------------- ---------- ---------- --------------
200405 5761 G 7393344.04
200405 5761 J 5667089.85
200405 5762 G 6315075.96
200405 5762 J 6328716.15
200405 5763 G 8861742.59
200405 5763 J 7788036.32
200405 5764 G 6028670.45
200405 5764 J 6459121.49
200405 5765 G 13156065.77
200405 5765 J 11901671.70
200406 5761 G 7614587.96
200406 5761 J 5704343.05
200406 5762 G 6556992.60
200406 5762 J 6238068.05
200406 5763 G 9130055.46
200406 5763 J 7990460.25
200406 5764 G 6387706.01
200406 5764 J 6907481.66
200406 5765 G 13562968.81
200406 5765 J 12495492.50
200407 5761 G 7987050.65
200407 5761 J 5723215.28
200407 5762 G 6833096.68
200407 5762 J 6391201.44
200407 5763 G 9410815.91
200407 5763 J 8076677.41
200407 5764 G 6456433.23
200407 5764 J 6987660.53
200407 5765 G 14000101.20
200407 5765 J 12301780.20
200408 5761 G 8085170.84
200408 5761 J 6050611.37
200408 5762 G 6854584.22
200408 5762 J 6521884.50
200408 5763 G 9468707.65
200408 5763 J 8460049.43
200408 5764 G 6587559.23
 
BILL_MONTH AREA_CODE NET_TYPE LOCAL_FARE
--------------- ---------- ---------- --------------
200408 5764 J 7342135.86
200408 5765 G 14450586.63
200408 5765 J 12680052.38
 
40 rows selected.
 
Elapsed: 00:00:00.00
 
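1. Using rollup

The following example uses plain SQL to compute the total for each area plus a grand total. (The literal ' 合计 ' in the statement means "Total".)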
06:41:36 SQL> set autot on
06:43:36 SQL> select area_code,sum(local_fare) local_fare
06:43:50 2 from t
06:43:51 3 group by area_code
06:43:57 4 union all
06:44:00 5 select ' 合计 ' area_code,sum(local_fare) local_fare
06:44:06 6 from t
06:44:08 7 /
 
AREA_CODE LOCAL_FARE
---------- --------------
5761 54225413.04
5762 52039619.60
5763 69186545.02
5764 53156768.46
5765 104548719.19
合计 333157065.31
 
6 rows selected.
 
Elapsed: 00:00:00.03
 
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=ALL_ROWS (Cost=7 Card=1310 Bytes=
24884)
 
1 0 UNION-ALL
2 1 SORT (GROUP BY) (Cost=5 Card=1309 Bytes=24871)
3 2 TABLE ACCESS (FULL) OF 'T' (Cost=2 Card=1309 Bytes=248
71)
 
4 1 SORT (AGGREGATE)
5 4 TABLE ACCESS (FULL) OF 'T' (Cost=2 Card=1309 Bytes=170
17)
 
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
6 consistent gets
0 physical reads
0 redo size
561 bytes sent via SQL*Net to client
503 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
1 sorts (memory)
0 sorts (disk)
6 rows processed
 
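The following example produces the same summary with rollup (strictly speaking an extension of GROUP BY rather than an analytic function):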
 
06:44:09 SQL> select nvl(area_code,' 合计 ') area_code,sum(local_fare) local_fare
06:45:26 2 from t
06:45:30 3 group by rollup(nvl(area_code,' 合计 '))
06:45:50 4 /
 
AREA_CODE LOCAL_FARE
---------- --------------
5761 54225413.04
5762 52039619.60
5763 69186545.02
5764 53156768.46
5765 104548719.19
333157065.31
 
6 rows selected.
 
Elapsed: 00:00:00.00
 
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=ALL_ROWS (Cost=5 Card=1309 Bytes=
24871)
 
1 0 SORT (GROUP BY ROLLUP) (Cost=5 Card=1309 Bytes=24871)
2 1 TABLE ACCESS (FULL) OF 'T' (Cost=2 Card=1309 Bytes=24871
)
 
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
4 consistent gets
0 physical reads
0 redo size
557 bytes sent via SQL*Net to client
503 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
1 sorts (memory)
0 sorts (disk)
6 rows processed
 
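As the examples show, the rollup version is a simpler statement and uses fewer resources: consistent gets drop from 6 to 4, because the table is scanned once instead of twice. On a large base table the saving grows accordingly. Note that the grand-total row is unlabeled in this output: nvl is applied to area_code inside the rollup expression, so the NULL that rollup generates for the total row is not replaced. The grouping function described later in this post handles that case.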
 
1. Using cube

To introduce cube, let us first look at another rollup example.
 
06:53:00 SQL> select area_code,bill_month,sum(local_fare) local_fare
06:53:37 2 from t
06:53:38 3 group by rollup(area_code,bill_month)
06:53:49 4 /
 
AREA_CODE BILL_MONTH LOCAL_FARE
---------- --------------- --------------
5761 200405 13060433.89
5761 200406 13318931.01
5761 200407 13710265.93
5761 200408 14135782.21
5761 54225413.04
5762 200405 12643792.11
5762 200406 12795060.65
5762 200407 13224298.12
5762 200408 13376468.72
5762 52039619.60
5763 200405 16649778.91
5763 200406 17120515.71
5763 200407 17487493.32
5763 200408 17928757.08
5763 69186545.02
5764 200405 12487791.94
5764 200406 13295187.67
5764 200407 13444093.76
5764 200408 13929695.09
5764 53156768.46
5765 200405 25057737.47
5765 200406 26058461.31
5765 200407 26301881.40
5765 200408 27130639.01
5765 104548719.19
333157065.31
 
26 rows selected.
 
Elapsed: 00:00:00.00
 
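With rollup(area_code,bill_month), the result contains a subtotal for each area_code and a grand total, but no subtotals for each bill_month across areas. The cube function is designed for exactly that.

Now let us look at the result of using cube. (The amounts in this and the later listings appear to be scaled down by a factor of 1000 compared with the earlier output.)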
 
06:58:02 SQL> select area_code,bill_month,sum(local_fare) local_fare
06:58:30 2 from t
06:58:32 3 group by cube(area_code,bill_month)
06:58:42 4 order by area_code,bill_month nulls last
06:58:57 5 /
 
AREA_CODE BILL_MONTH LOCAL_FARE
---------- --------------- --------------
5761 200405 13060.43
5761 200406 13318.93
5761 200407 13710.27
5761 200408 14135.78
5761 54225.41
5762 200405 12643.79
5762 200406 12795.06
5762 200407 13224.30
5762 200408 13376.47
5762 52039.62
5763 200405 16649.78
5763 200406 17120.52
5763 200407 17487.49
5763 200408 17928.76
5763 69186.54
5764 200405 12487.79
5764 200406 13295.19
5764 200407 13444.09
5764 200408 13929.69
5764 53156.77
5765 200405 25057.74
5765 200406 26058.46
5765 200407 26301.88
5765 200408 27130.64
5765 104548.72
200405 79899.53
200406 82588.15
200407 84168.03
200408 86501.34
333157.05
 
30 rows selected.
 
Elapsed: 00:00:00.01
 
As you can see, the cube output contains a few more summary rows than the rollup output: these are the subtotals that cube computes for each bill_month.
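
The difference between the two outputs comes down to which grouping sets each function generates. The sketch below is not Oracle code: it is a small Python emulation, with hypothetical helper names (rollup_sets, cube_sets, aggregate) and made-up toy data, meant only to make the rule visible. rollup aggregates over every leading prefix of its column list, cube over every subset.

```python
from itertools import combinations

def rollup_sets(cols):
    """Grouping sets of ROLLUP(cols): every leading prefix, longest first."""
    return [tuple(cols[:n]) for n in range(len(cols), -1, -1)]

def cube_sets(cols):
    """Grouping sets of CUBE(cols): every subset of the columns."""
    return [s for n in range(len(cols), -1, -1) for s in combinations(cols, n)]

def aggregate(rows, cols, grouping_sets, measure="fare"):
    """Sum the measure per grouping set; rolled-up columns show as None."""
    totals = {}
    for kept in grouping_sets:
        for row in rows:
            key = tuple(row[c] if c in kept else None for c in cols)
            totals[key] = totals.get(key, 0) + row[measure]
    return totals

# Hypothetical toy data: 2 areas x 2 months.
rows = [
    {"area_code": "5761", "bill_month": "200405", "fare": 10},
    {"area_code": "5761", "bill_month": "200406", "fare": 20},
    {"area_code": "5762", "bill_month": "200405", "fare": 30},
    {"area_code": "5762", "bill_month": "200406", "fare": 40},
]
cols = ["area_code", "bill_month"]
rollup_result = aggregate(rows, cols, rollup_sets(cols))
cube_result = aggregate(rows, cols, cube_sets(cols))

print(len(rollup_result), len(cube_result))   # 7 rows versus 9 rows
print(cube_result[(None, "200405")])          # per-month subtotal, cube only
```

The same counting explains the listings above: with 5 areas and 4 months, rollup yields 20 + 5 + 1 = 26 rows and cube yields 20 + 5 + 4 + 1 = 30 rows.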
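1. rollup and cube in more depth

In the results above, every summary row contains NULLs. How do we tell which column a given row was summarized over? This is where Oracle's grouping function comes in.

grouping(col) returns 1 if the current row is a summary row in which col has been rolled up (aggregated over all its values), and 0 otherwise.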
 
1 select decode(grouping(area_code),1,'all area',to_char(area_code)) area_code,
2 decode(grouping(bill_month),1,'all month',bill_month) bill_month,
3 sum(local_fare) local_fare
4 from t
5 group by cube(area_code,bill_month)
6* order by area_code,bill_month nulls last
07:07:29 SQL> /
 
AREA_CODE BILL_MONTH LOCAL_FARE
---------- --------------- --------------
5761 200405 13060.43
5761 200406 13318.93
5761 200407 13710.27
5761 200408 14135.78
5761 all month 54225.41
5762 200405 12643.79
5762 200406 12795.06
5762 200407 13224.30
5762 200408 13376.47
5762 all month 52039.62
5763 200405 16649.78
5763 200406 17120.52
5763 200407 17487.49
5763 200408 17928.76
5763 all month 69186.54
5764 200405 12487.79
5764 200406 13295.19
5764 200407 13444.09
5764 200408 13929.69
5764 all month 53156.77
5765 200405 25057.74
5765 200406 26058.46
5765 200407 26301.88
5765 200408 27130.64
5765 all month 104548.72
all area 200405 79899.53
all area 200406 82588.15
all area 200407 84168.03
all area 200408 86501.34
all area all month 333157.05
 
30 rows selected.
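
One reason to prefer grouping over nvl is that nvl cannot tell a NULL generated by cube from a NULL that is really in the data. The following Python sketch emulates the idea under stated assumptions: the helper names (cube_with_grouping, label) and the toy rows are hypothetical, and the flags tuple plays the role of grouping(col) for each column.

```python
from itertools import combinations

def cube_with_grouping(rows, cols, measure):
    """Emulate GROUP BY CUBE(cols); key = (values, grouping flags), value = sum."""
    result = {}
    for n in range(len(cols), -1, -1):
        for kept in combinations(cols, n):
            for row in rows:
                values = tuple(row[c] if c in kept else None for c in cols)
                flags = tuple(0 if c in kept else 1 for c in cols)  # GROUPING(c)
                key = (values, flags)
                result[key] = result.get(key, 0) + row[measure]
    return result

def label(value, flag, all_label):
    """Counterpart of DECODE(GROUPING(col), 1, all_label, col)."""
    return all_label if flag == 1 else value

# Hypothetical toy data; the second row has a genuine NULL area_code.
rows = [
    {"area_code": "5761", "bill_month": "200405", "fare": 10},
    {"area_code": None, "bill_month": "200405", "fare": 5},
]
report = {}
cubed = cube_with_grouping(rows, ["area_code", "bill_month"], "fare")
for (values, flags), total in cubed.items():
    area = label(values[0], flags[0], "all area")
    month = label(values[1], flags[1], "all month")
    report[(area, month)] = total

for key, total in report.items():
    print(key, total)
```

The row with the genuine NULL keeps its own line, (None, '200405') with 5, while the subtotal over all areas is labeled ('all area', '200405') with 15.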

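2. The rank functions

Having covered rollup and cube, we now turn to the rank family of functions.

Problem: rank the areas by their total fares over these months.

To bring out the differences between rank, dense_rank and row_number, we first modify the base data so that area 5763 has the same figures as area 5761.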
  1  update t t1 set local_fare = (
  2    select local_fare from t t2
  3     where t1.bill_month = t2.bill_month
  4     and t1.net_type = t2.net_type
  5     and t2.area_code = '5761'
  6* ) where area_code = '5763'
07:19:18 SQL> /

8 rows updated.

Elapsed: 00:00:00.01

First we use rank to compute each area's fare ranking.
07:35:52 SQL> select area_code,sum(local_fare) local_fare,
07:36:02   2    rank() over (order by sum(local_fare) desc) fare_rank
07:36:20   3  from t
07:36:21   4  group by area_code
07:36:25   5  /

AREA_CODE      LOCAL_FARE  FARE_RANK
---------- -------------- ----------
5765            104548.72          1
5761             54225.41          2
5763             54225.41          2
5764             53156.77          4
5762             52039.62          5

Elapsed: 00:00:00.01

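Notice the gap: 5761 and 5763 tie at rank 2, so rank 3 never appears and the next area gets rank 4.
Now let us look at the result of the dense_rank query.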


07:36:26 SQL> select area_code,sum(local_fare) local_fare,
07:39:16   2    dense_rank() over (order by sum(local_fare) desc ) fare_rank
07:39:39   3  from t
07:39:42   4  group by area_code
07:39:46   5  /

AREA_CODE      LOCAL_FARE  FARE_RANK
---------- -------------- ----------
5765            104548.72          1
5761             54225.41          2
5763             54225.41          2
5764             53156.77          3  <-- rank 3 appears here
5762             52039.62          4

Elapsed: 00:00:00.00


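In this example a rank 3 does appear, and that is the difference between rank and dense_rank:
when two rows tie, rank skips the positions that follow the tie, whereas dense_rank does not.
row_number differs even more: it assigns distinct numbers even to rows whose values are identical, which is very useful when we need exactly one row for each condition.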


  1  select area_code,sum(local_fare) local_fare,
  2     row_number() over (order by sum(local_fare) desc ) fare_rank
  3  from t
  4* group by area_code
07:44:50 SQL> /

AREA_CODE      LOCAL_FARE  FARE_RANK
---------- -------------- ----------
5765            104548.72          1
5761             54225.41          2
5763             54225.41          3
5764             53156.77          4
5762             52039.62          5

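With row_number, even though sum(local_fare) is identical for two areas, they still receive different numbers. We can use this property to remove duplicate records from a table.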
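
To see the three numbering rules side by side, here is a minimal Python sketch. It is an emulation, not Oracle code; the function name rankings is hypothetical, and the input is the list of area totals from the outputs above.

```python
def rankings(values):
    """Return (value, rank, dense_rank, row_number) tuples, highest value first."""
    ordered = sorted(values, reverse=True)
    out = []
    rank = dense = 0
    prev = object()  # sentinel that never equals a real value
    for row_number, value in enumerate(ordered, start=1):
        if value != prev:
            rank = row_number   # rank jumps to the current position after a tie
            dense += 1          # dense_rank simply advances by one
            prev = value
        out.append((value, rank, dense, row_number))
    return out

# Area totals after 5763 was made identical to 5761.
fares = [104548.72, 54225.41, 54225.41, 53156.77, 52039.62]
for row in rankings(fares):
    print(row)
```

The two tied values share rank 2 and dense_rank 2; the next value gets rank 4 but dense_rank 3, and row_number runs 1 to 5 without repeats.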

The examples in this part illustrate the basic usage of the three functions. The next part looks at some of their practical uses in more detail.




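2. The rank functions: practical uses

a. Fetch the n users who registered most recently (because rank is used, ties on create_date can return more than n rows)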
select user_id,tele_num,user_name,user_status,create_date
from (
   select user_id,tele_num,user_name,user_status,create_date,
      rank() over (order by create_date desc) add_rank
   from user_info
)
where add_rank <= :n;
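
The effect of ties on this top-n query can be sketched in Python. This is an illustration under assumptions: the function top_n, its ties flag and the user rows are hypothetical. With ties=True it behaves like filtering on rank() <= n, with ties=False like filtering on row_number() <= n.

```python
def top_n(rows, n, key, ties=True):
    """Newest n rows by key; ties=True keeps every row tied with the n-th one."""
    if n <= 0:
        return []
    ordered = sorted(rows, key=lambda r: r[key], reverse=True)
    if not ties or len(ordered) <= n:
        return ordered[:n]
    cutoff = ordered[n - 1][key]
    return [r for r in ordered if r[key] >= cutoff]

# Hypothetical users; u2 and u3 registered on the same day.
users = [
    {"user_id": "u1", "create_date": "2004-08-01"},
    {"user_id": "u2", "create_date": "2004-08-03"},
    {"user_id": "u3", "create_date": "2004-08-03"},
    {"user_id": "u4", "create_date": "2004-08-02"},
]
with_ties = [u["user_id"] for u in top_n(users, 1, "create_date")]
without_ties = [u["user_id"] for u in top_n(users, 1, "create_date", ties=False)]
print(with_ties, without_ties)
```

If exactly n rows are required, row_number is the safer choice; if all users tied with the n-th one should be reported, rank is.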

b. Delete duplicate records (duplicates are identified here by obj#)
create table t1 as select obj#,name from sys.obj$;
-- run the following insert several times to create duplicates
insert into t1 select * from t1;
delete from t1 where rowid in (
   select row_id from (
      select rowid row_id,row_number() over (partition by obj# order by rowid ) rn
      from t1
   ) where rn <> 1
);
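
The logic of this delete is: number the rows inside each group of duplicates, keep number 1 and remove the rest. A minimal Python sketch of that selection follows; the function name rows_to_delete and the integer rowid values are hypothetical stand-ins for Oracle rowids.

```python
from collections import defaultdict

def rows_to_delete(rows, key, order):
    """Return the `order` ids of rows whose row_number within their partition is > 1."""
    partitions = defaultdict(list)
    for row in rows:
        partitions[row[key]].append(row)
    doomed = []
    for part in partitions.values():
        part.sort(key=lambda r: r[order])         # order by rowid
        doomed.extend(r[order] for r in part[1:])  # everything after the first row
    return doomed

# Hypothetical table contents: obj 10 appears three times, obj 20 twice.
table = [
    {"rowid": 1, "obj": 10},
    {"rowid": 2, "obj": 20},
    {"rowid": 3, "obj": 10},
    {"rowid": 4, "obj": 10},
    {"rowid": 5, "obj": 20},
]
doomed = rows_to_delete(table, key="obj", order="rowid")
survivors = [r for r in table if r["rowid"] not in doomed]
print(sorted(doomed), survivors)
```

Exactly one row per obj value survives, the one with the lowest rowid.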

c. Rank each area's fare revenue within each month.
SQL> select bill_month,area_code,sum(local_fare) local_fare,
  2     rank() over (partition by bill_month order by sum(local_fare) desc) area_rank
  3  from t
  4  group by bill_month,area_code
  5  /

BILL_MONTH      AREA_CODE           LOCAL_FARE  AREA_RANK
--------------- --------------- -------------- ----------
200405          5765                  25057.74          1
200405          5761                  13060.43          2
200405          5763                  13060.43          2
200405          5762                  12643.79          4
200405          5764                  12487.79          5
200406          5765                  26058.46          1
200406          5761                  13318.93          2
200406          5763                  13318.93          2
200406          5764                  13295.19          4
200406          5762                  12795.06          5
200407          5765                  26301.88          1
200407          5761                  13710.27          2
200407          5763                  13710.27          2
200407          5764                  13444.09          4
200407          5762                  13224.30          5
200408          5765                  27130.64          1
200408          5761                  14135.78          2
200408          5763                  14135.78          2
200408          5764                  13929.69          4
200408          5762                  13376.47          5

20 rows selected.
SQL>


3. lag and lead

For each area and month, fetch the fare totals of the two preceding and the two following months.
  1  select area_code,bill_month, local_fare cur_local_fare,
  2     lag(local_fare,2,0) over (partition by area_code order by bill_month ) pre_local_fare,
  3     lag(local_fare,1,0) over (partition by area_code order by bill_month ) last_local_fare,
  4     lead(local_fare,1,0) over (partition by area_code order by bill_month ) next_local_fare,
  5     lead(local_fare,2,0) over (partition by area_code order by bill_month ) post_local_fare
  6  from (
  7     select area_code,bill_month,sum(local_fare) local_fare
  8     from t
  9     group by area_code,bill_month
10* )
SQL> /
AREA_CODE BILL_MONTH CUR_LOCAL_FARE PRE_LOCAL_FARE LAST_LOCAL_FARE NEXT_LOCAL_FARE POST_LOCAL_FARE
--------- ---------- -------------- -------------- --------------- --------------- ---------------
5761      200405          13060.433              0               0        13318.93       13710.265
5761      200406           13318.93              0       13060.433       13710.265       14135.781
5761      200407          13710.265      13060.433        13318.93       14135.781               0
5761      200408          14135.781       13318.93       13710.265               0               0
5762      200405          12643.791              0               0        12795.06       13224.297
5762      200406           12795.06              0       12643.791       13224.297       13376.468
5762      200407          13224.297      12643.791        12795.06       13376.468               0
5762      200408          13376.468       12795.06       13224.297               0               0
5763      200405          13060.433              0               0        13318.93       13710.265
5763      200406           13318.93              0       13060.433       13710.265       14135.781
5763      200407          13710.265      13060.433        13318.93       14135.781               0
5763      200408          14135.781       13318.93       13710.265               0               0
5764      200405          12487.791              0               0       13295.187       13444.093
5764      200406          13295.187              0       12487.791       13444.093       13929.694
5764      200407          13444.093      12487.791       13295.187       13929.694               0
5764      200408          13929.694      13295.187       13444.093               0               0
5765      200405          25057.736              0               0        26058.46       26301.881
5765      200406           26058.46              0       25057.736       26301.881       27130.638
5765      200407          26301.881      25057.736        26058.46       27130.638               0
5765      200408          27130.638       26058.46       26301.881               0               0
20 rows selected.

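With lag and lead we can show, on the same row, the data of the n-th preceding row or the n-th following row. The third argument (0 in this example) is the default that is returned when no such row exists.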
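
The offset and default arguments can be illustrated with a short Python sketch. The helper names lag and lead mirror the SQL functions but are plain hypothetical Python functions over a list that is already ordered by bill_month; the values are the 5761 figures from the output above.

```python
def lag(seq, i, offset=1, default=None):
    """Value `offset` rows before position i, or `default` if there is none."""
    j = i - offset
    return seq[j] if 0 <= j < len(seq) else default

def lead(seq, i, offset=1, default=None):
    """Value `offset` rows after position i, or `default` if there is none."""
    return lag(seq, i, -offset, default)

# Monthly totals of area 5761, ordered by bill_month (200405 to 200408).
fares = [13060.433, 13318.93, 13710.265, 14135.781]

for i, cur in enumerate(fares):
    print(cur, lag(fares, i, 2, 0), lag(fares, i, 1, 0),
          lead(fares, i, 1, 0), lead(fares, i, 2, 0))
```

The first printed line reproduces the 5761 / 200405 row: no earlier months exist, so both lag columns fall back to 0.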


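4. Moving calculations with sum, avg, max and min

Compute the sum, average, maximum and minimum of the fares over each window of three consecutive months.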
  1  select area_code,bill_month, local_fare,
  2     sum(local_fare)
  3             over (  partition by area_code
  4                     order by to_number(bill_month)
  5                     range between 1 preceding and 1 following ) "3month_sum",
  6     avg(local_fare)
  7             over (  partition by area_code
  8                     order by to_number(bill_month)
  9                     range between 1 preceding and 1 following ) "3month_avg",
10     max(local_fare)
11             over (  partition by area_code
12                     order by to_number(bill_month)
13                     range between 1 preceding and 1 following ) "3month_max",
14     min(local_fare)
15             over (  partition by area_code
16                     order by to_number(bill_month)
17                     range between 1 preceding and 1 following ) "3month_min"
18  from (
19     select area_code,bill_month,sum(local_fare) local_fare
20     from t
21     group by area_code,bill_month
22* )
SQL> /

AREA_CODE BILL_MONTH       LOCAL_FARE 3month_sum 3month_avg 3month_max 3month_min
--------- ---------- ---------------- ---------- ---------- ---------- ----------
5761      200405            13060.433  26379.363 13189.6815   13318.93  13060.433
5761      200406            13318.930  40089.628 13363.2093  13710.265  13060.433
5761      200407            13710.265  41164.976 13721.6587  14135.781   13318.93
-- for the 5761 / 200406 row:
--   40089.628  = 13060.433 + 13318.930 + 13710.265
--   13363.2093 = (13060.433 + 13318.930 + 13710.265) / 3
--   13710.265  = max(13060.433, 13318.930, 13710.265)
--   13060.433  = min(13060.433, 13318.930, 13710.265)
5761      200408            14135.781  27846.046  13923.023  14135.781  13710.265
5762      200405            12643.791  25438.851 12719.4255   12795.06  12643.791
5762      200406            12795.060  38663.148  12887.716  13224.297  12643.791
5762      200407            13224.297  39395.825 13131.9417  13376.468   12795.06
5762      200408            13376.468  26600.765 13300.3825  13376.468  13224.297
5763      200405            13060.433  26379.363 13189.6815   13318.93  13060.433
5763      200406            13318.930  40089.628 13363.2093  13710.265  13060.433
5763      200407            13710.265  41164.976 13721.6587  14135.781   13318.93
5763      200408            14135.781  27846.046  13923.023  14135.781  13710.265
5764      200405            12487.791  25782.978  12891.489  13295.187  12487.791
5764      200406            13295.187  39227.071 13075.6903  13444.093  12487.791
5764      200407            13444.093  40668.974 13556.3247  13929.694  13295.187
5764      200408            13929.694  27373.787 13686.8935  13929.694  13444.093
5765      200405            25057.736  51116.196  25558.098   26058.46  25057.736
5765      200406            26058.460  77418.077 25806.0257  26301.881  25057.736
5765      200407            26301.881  79490.979  26496.993  27130.638   26058.46
5765      200408            27130.638  53432.519 26716.2595  27130.638  26301.881

20 rows selected.
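
One point worth keeping in mind with the query above: a range window is defined on the value of the order by expression, not on row positions. With to_number(bill_month), 200412 and 200501 differ by 89, so a window of 1 preceding and 1 following would not treat them as neighbors, whereas a rows window would. The Python sketch below emulates both rules; the function names and the toy data are hypothetical, and it assumes one row per month, as produced by the grouped subquery.

```python
def window_by_range(points, lo, hi):
    """RANGE semantics: rows whose order-by key lies in [key - lo, key + hi]."""
    return {k: [v for k2, v in points if k - lo <= k2 <= k + hi]
            for k, _ in points}

def window_by_rows(points, lo, hi):
    """ROWS semantics: rows at positions [i - lo, i + hi] in order-by order."""
    ordered = sorted(points)
    return {k: [v for _, v in ordered[max(0, i - lo):i + hi + 1]]
            for i, (k, _) in enumerate(ordered)}

# Hypothetical months spanning a year boundary, keyed as to_number(bill_month).
points = [(200411, 1), (200412, 2), (200501, 4)]
by_range = window_by_range(points, 1, 1)
by_rows = window_by_rows(points, 1, 1)

print(sum(by_range[200412]), sum(by_rows[200412]))
```

For the sample data in this post all four months fall in the same year, so both window types give the same result.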

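5. ratio_to_report

ratio_to_report(expr) returns the ratio of expr to the sum of expr over the rows of the window partition. In the example it gives each area's share of the monthly total.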
  1  select bill_month,area_code,sum(local_fare) local_fare,
  2     ratio_to_report(sum(local_fare)) over
  3       ( partition by bill_month ) area_pct
  4  from t
  5* group by bill_month,area_code
SQL> break on bill_month skip 1
SQL> compute sum of local_fare on bill_month
SQL> compute sum of area_pct on bill_month
SQL> /

BILL_MONTH AREA_CODE       LOCAL_FARE   AREA_PCT
---------- --------- ---------------- ----------
200405     5761             13060.433 .171149279
           5762             12643.791 .165689431
           5763             13060.433 .171149279
           5764             12487.791 .163645143
           5765             25057.736 .328366866
**********           ---------------- ----------
sum                         76310.184          1

200406     5761             13318.930 .169050772
           5762             12795.060 .162401542
           5763             13318.930 .169050772
           5764             13295.187 .168749414
           5765             26058.460 .330747499
**********           ---------------- ----------
sum                         78786.567          1

200407     5761             13710.265 .170545197
           5762             13224.297 .164500127
           5763             13710.265 .170545197
           5764             13444.093 .167234221
           5765             26301.881 .327175257
**********           ---------------- ----------
sum                         80390.801          1

200408     5761             14135.781 .170911147
           5762             13376.468 .161730539
           5763             14135.781 .170911147
           5764             13929.694 .168419416
           5765             27130.638 .328027751
**********           ---------------- ----------
sum                         82708.362          1


20 rows selected.
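
The calculation behind area_pct is a plain division by the partition total. A minimal Python sketch, with a hypothetical helper that borrows the name ratio_to_report and the 200405 figures from the listing above:

```python
from collections import defaultdict

def ratio_to_report(rows, value, partition):
    """Each row's value divided by the total of its partition."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[partition]] += row[value]
    return [row[value] / totals[row[partition]] for row in rows]

# The 200405 rows from the listing above.
rows = [
    {"bill_month": "200405", "area_code": "5761", "fare": 13060.433},
    {"bill_month": "200405", "area_code": "5762", "fare": 12643.791},
    {"bill_month": "200405", "area_code": "5763", "fare": 13060.433},
    {"bill_month": "200405", "area_code": "5764", "fare": 12487.791},
    {"bill_month": "200405", "area_code": "5765", "fare": 25057.736},
]
ratios = ratio_to_report(rows, "fare", "bill_month")
for row, pct in zip(rows, ratios):
    print(row["area_code"], round(pct, 9))
```

The ratios within a partition always add up to 1, which is what the compute sum lines in the listing confirm.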



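6. first and last values

Find the areas with the highest and the lowest fares. The example uses first_value twice, once with descending and once with ascending order. Because the windows have no partition by bill_month, the result is the overall highest and lowest area, repeated on every row; add partition by bill_month to both windows to get the answer for each month.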
1  select bill_month,area_code,sum(local_fare) local_fare,
  2     first_value(area_code)
  3             over (order by sum(local_fare) desc
  4                     rows unbounded preceding) firstval,
  5     first_value(area_code)
  6             over (order by sum(local_fare) asc
  7                     rows unbounded preceding) lastval
  8  from t
  9  group by bill_month,area_code
10* order by bill_month
SQL> /

BILL_MONTH AREA_CODE       LOCAL_FARE FIRSTVAL        LASTVAL
---------- --------- ---------------- --------------- ---------------
200405     5764             12487.791 5765            5764
200405     5762             12643.791 5765            5764
200405     5761             13060.433 5765            5764
200405     5765             25057.736 5765            5764
200405     5763             13060.433 5765            5764
200406     5762             12795.060 5765            5764
200406     5763             13318.930 5765            5764
200406     5764             13295.187 5765            5764
200406     5765             26058.460 5765            5764
200406     5761             13318.930 5765            5764
200407     5762             13224.297 5765            5764
200407     5765             26301.881 5765            5764
200407     5761             13710.265 5765            5764
200407     5763             13710.265 5765            5764
200407     5764             13444.093 5765            5764
200408     5762             13376.468 5765            5764
200408     5764             13929.694 5765            5764
200408     5761             14135.781 5765            5764
200408     5765             27130.638 5765            5764
200408     5763             14135.781 5765            5764

20 rows selected.  
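
The effect of the missing partition clause can be shown with a small Python emulation. The helper below borrows the name first_value but is a hypothetical sketch, and the rows are a subset of the listing above (areas 5762, 5764 and 5765 for 200405 and 200406, chosen to avoid ties).

```python
def first_value(rows, value, order, partition=None, descending=False):
    """Per row: `value` of the first row of its partition when sorted by `order`."""
    def part(row):
        return row[partition] if partition else None
    first = {}
    for row in sorted(rows, key=lambda r: r[order], reverse=descending):
        first.setdefault(part(row), row[value])  # keep only the first row seen
    return [first[part(row)] for row in rows]

rows = [
    {"bill_month": "200405", "area_code": "5762", "fare": 12643.791},
    {"bill_month": "200405", "area_code": "5764", "fare": 12487.791},
    {"bill_month": "200405", "area_code": "5765", "fare": 25057.736},
    {"bill_month": "200406", "area_code": "5762", "fare": 12795.060},
    {"bill_month": "200406", "area_code": "5764", "fare": 13295.187},
    {"bill_month": "200406", "area_code": "5765", "fare": 26058.460},
]
overall_low = first_value(rows, "area_code", "fare")
monthly_low = first_value(rows, "area_code", "fare", partition="bill_month")
monthly_high = first_value(rows, "area_code", "fare",
                           partition="bill_month", descending=True)
print(overall_low)
print(monthly_low)
```

Without a partition every row reports 5764, as in the listing; with a partition on bill_month the lowest area is 5764 for 200405 but 5762 for 200406.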

 