OraDev05_p2.mp4 排名和相邻类分析函数
OraDev05_p3_1.mp4 统计类分析函数(上)
OraDev05_p3_2.mp4 统计类分析函数(下)
OraDev05_p4.mp4 分析函数语法及规范和习惯
OraDev05_p5.mp4 第三次课程作业讲解
SQL分析函数功能:排名、相邻、统计
******************************分析函数笔记大纲***********************
1、排名
row_number()
rank()
dense_rank()
2、相邻
LAG(sal,1,0)over(partition by deptno order by sal) ---提取排序后当前记录之前的记录 就是往前找
LEAD(sal,1,99999)over(partition by deptno order by sal) ---提取排序后当前记录之后的记录 就是往后找
(case when deptno=lag(deptno,1,423)over(partition by deptno order by ename) then null else deptno end) ---去重
3、统计
(1)SUM
求出每个部门按月的累计销售额?
(2)AVG
每个部门每种货物的销售额与该货物在公司各部门平均销售额之间的差值
(3)MAX/MIN
查出货物G03在销售当天及前十二天的最大销售额(按部门)
max(sale_cnt)over(partition by dept_id order by sale_date range interval '12' day preceding)
(4)FIRST_VALUE和LAST_VALUE
查出所有雇员的雇员职位、编号、姓名、薪水以及同职位上薪水最高和最低的雇员的雇员编号(薪水相同的,按雇员编号倒序排列)
①last_value(empno)over(partition by job order by sal, empno desc rows between unbounded preceding and unbounded following) lv
②max(empno)keep(dense_rank first order by sal,empno desc)over(partition by job)
补全缺失的部门编号
****************************************************************************
--------------------排名类分析函数------------------------------------
with t as (select rownum*10 cnt from dual connect by rownum<5 union all select rownum*40-10 cnt from dual connect by rownum<3)
select cnt,
row_number() over(order by cnt) rn, ----无并列
rank() over(order by cnt)rk, ----并列不连续
dense_rank() over(order by cnt) drk from t ---并列连续
/
CNT RN RK DRK
------- ---------- ---------- ----------
10 1 1 1
20 2 2 2
30 3 3 3
30 4 3 3
40 5 5 4
70 6 6 5
--------分组排名----------
select deptno,ename,sal, row_number()over(partition by deptno order by sal) rn from emp;
DEPTNO ENAME SAL RN
-------- ---------- ---------- ----------
10 MILLER 1518 1
10 CLARK 2668 2
10 KING 5218 3
20 Smith 1018 1
20 ADAMS 1318 2
20 JONES 3193 3
20 FORD 3218 4
20 SCOTT 10035 5
30 JAMES 1168 1
30 MARTIN 1468 2
30 WARD 1468 3
DEPTNO ENAME SAL RN
-------- ---------- ---------- ----------
30 TURNER 1718 4
30 ALLEN 1818 5
30 BLAKE 3068 6
排名分析函数要点
–排名分析函数不需要参数
– 排名分析函数里的order by子句是必须的
–partition by 和 order by后均可跟多列
–只在没有partition by 的情况下,rownum才能做到和row_number同样的事情
-----------------------相邻类分析函数--------------------------------------
LAG(sal,1,0)over(partition by deptno order by sal) ---提取排序后当前记录之前的记录 就是往前找
LEAD(sal,1,99999)over(partition by deptno order by sal) ---提取排序后当前记录之后的记录 就是往后找
select deptno, ename,
lag(ename,1,'AAA')over(partition by deptno order by ename) lower_name,
lead(ename,1,'ZZZ')over(partition by deptno order by ename) higher_name
from emp
/
select (case when deptno=lag(deptno,1,423)over(partition by deptno order by ename)
then null else deptno end) deptno,
ename,
lag(ename)over(partition by deptno order by ename) lower_name,
lead(ename)over(partition by deptno order by ename) higher_name
from emp
/
DEPTNO ENAME LOWER_NAME HIGHER_NAM
---------- ---------- ---------- ----------
10 CLARK KING
KING CLARK luyang
luyang KING
20 ADAMS FORD
FORD ADAMS JONES
JONES FORD SCOTT
SCOTT JONES SMITH
SMITH SCOTT
30 ALLEN BLAKE
BLAKE ALLEN JAMES
JAMES BLAKE MARTIN
DEPTNO ENAME LOWER_NAME HIGHER_NAM
---------- ---------- ---------- ----------
MARTIN JAMES TURNER
TURNER MARTIN WARD
WARD TURNER
14 rows selected.
相邻类分析函数的要点
–LAG/LEAD(v, n, dv)里的n表示位移,必须是0或正整数,dv是在没有取到对应值时的默认值。n默认是1,dv默认是null。
–相邻类分析函数后面order by子句是必须的
–partition by 和 order by后均可跟多列
-------------统计类的分析函数------------
用到的lw_sale表的数据在最后的附中;
LW_SALE表中字段:
dept_id 部门编号,
sale_date 销售日期,
goods_type 货物类别,
sale_cnt 销售额
1、SUM
①求出每个部门按月的累计销售额
with t as (select dept_id, to_char(sale_date,'YYYY-MM') sale_month, sum(sale_cnt) month_sale_cnt from lw_sales group by dept_id, to_char(sale_date,'YYYY-MM'))
select dept_id, sale_month, month_sale_cnt, sum(month_sale_cnt)over(partition by dept_id order by sale_month) cum_month_sale_cnt from t;
②求出每个部门按售出货物类别的累计销售额(按货物类别代码正序排列)以及每个货物按部门的累计销售额(按部门编号正序排列)
with t as (select dept_id, goods_type, sum(sale_cnt) goods_sale_cnt from lw_sales group by dept_id, goods_type)
select dept_id, goods_type, goods_sale_cnt,
sum(goods_sale_cnt)over(partition by dept_id order by goods_type)cum_gsc_goods,
sum(goods_sale_cnt)over(partition by goods_type order by dept_id) cum_gsc_dept ---先按good_type排序,再按dept_id排序
from t; ----分析函数总是最后一个的排序生效
2、AVG
①、查询出emp表中每个部门的平均工资
select deptno,ename,sal,avg(sal)over(partition by deptno) from emp;
DEPTNO ENAME SAL AVG(SAL)OVER(PARTITIONBYDEPTNO)
---------- ---------- ---------- -------------------------------
10 CLARK 2459 2925.66667
10 KING 5009 2925.66667
10 luyang 1309 2925.66667
20 JONES 2984 2310
20 FORD 3009 2310
20 ADAMS 1109 2310
20 SMITH 809 2310
20 SCOTT 3639 2310
30 WARD 1259 1575.66667
30 TURNER 1509 1575.66667
30 ALLEN 1609 1575.66667
30 JAMES 959 1575.66667
30 BLAKE 2859 1575.66667
30 MARTIN 1259 1575.66667
②求出每个部门每种货物的销售额与该货物在公司各部门平均销售额之间的差值
with t as (select dept_id, goods_type, sum(sale_cnt) goods_sale_cnt from lw_sales group by dept_id, goods_type)
select dept_id, goods_type, goods_sale_cnt,
round(AVG(goods_sale_cnt)over(partition by goods_type),2) avg_goods_sale_cnt,
goods_sale_cnt-round(AVG(goods_sale_cnt)over(partition by goods_type),2) dv_goods_sale_cnt
from t
3、MAX/MIN
*其实partition by 就是把数据分成一段一段的
①查询出emp表中每个部门的最高工资
select deptno,ename,sal,max(sal)over(partition by deptno) from emp;
DEPTNO ENAME SAL MAX(SAL)OVER(PARTITIONBYDEPTNO)
---------- ---------- ---------- -------------------------------
10 CLARK 2459 5009
10 KING 5009 5009
10 luyang 1309 5009
20 JONES 2984 3639
20 FORD 3009 3639
20 ADAMS 1109 3639
20 SMITH 809 3639
20 SCOTT 3639 3639
30 WARD 1259 2859
30 TURNER 1509 2859
30 ALLEN 1609 2859
30 JAMES 959 2859
30 BLAKE 2859 2859
30 MARTIN 1259 2859
②货物G01每月的最高和最低销售额对应的部门(如有多个部门按部门ID列出最小的一个,如某部门某月无销售额则不做统计)
with a as (select dept_id , to_char(sale_date,'YYYY-MM') sale_month,
sum(sale_cnt) goods_sale_cnt,
max(sum(sale_cnt))over(partition by to_char(sale_date,'YYYY-MM')) max_gsc,
min(sum(sale_cnt))over(partition by to_char(sale_date,'YYYY-MM')) min_gsc
from lw_sales where goods_type='G01' group by dept_id, to_char(sale_date,'YYYY-MM'))
select sale_month,
min(case when goods_sale_cnt=max_gsc then dept_id end) max_dept_id,
min(case when goods_sale_cnt=min_gsc then dept_id end) min_dept_id
from a where goods_sale_cnt=max_gsc or goods_sale_cnt=min_gsc
group by sale_month
SALE_MONTH MAX_DEPT_ID MIN_DEPT_ID
---------- ----------- -----------
2013-03 S01 S00
2013-04 S01 S00
③查出货物G03在销售当天及前十二天的最大销售额(按部门和整个公司分别求出)
select dept_id,sale_date,sale_cnt,
max(sale_cnt)over(partition by dept_id order by sale_date range interval '12' day preceding) max_dept_sale_cnt,
max(sale_cnt)over(order by sale_date range numtodsinterval(12, 'DAY') preceding) max_sale_cnt
from lw_sales where goods_type='G03'
DEPT_ID SALE_DATE SALE_CNT MAX_DEPT_SALE_CNT MAX_SALE_CNT
------- ----------- ----------- ----------------- ------------
S00 2013-3-6 800 800 800
S01 2013-3-17 600 600 800
S00 2013-3-21 200 200 600
S01 2013-4-3 200 200 200
S02 2013-4-5 800 800 800
S02 2013-4-22 300 300 300
4、FIRST_VALUE和LAST_VALUE
①查出所有雇员的雇员职位、编号、姓名、薪水以及同职位上薪水最高和最低的雇员的雇员编号(薪水相同的,按雇员编号倒序排列)
方法一:
select job, empno, ename, sal,
last_value(empno)over(partition by job order by sal, empno desc rows between unbounded preceding and unbounded following) lv,
first_value(empno)over(partition by job order by sal, empno desc rows between unbounded preceding and unbounded following) fv
from emp
JOB EMPNO ENAME SAL LV FV
--------- ----- ---------- --------- ---------- ----------
ANALYST 7902 FORD 3000.00 7788 7902
ANALYST 7788 SCOTT 3000.00 7788 7902
CLERK 7369 SMITH 800.00 7934 7369
CLERK 7900 JAMES 950.00 7934 7369
CLERK 7876 ADAMS 1100.00 7934 7369
CLERK 7934 MILLER 1300.00 7934 7369
MANAGER 7782 CLARK 2450.00 7566 7782
MANAGER 7698 BLAKE 2850.00 7566 7782
MANAGER 7566 JONES 2975.00 7566 7782
PRESIDENT 7839 KING 5000.00 7839 7839
SALESMAN 7654 MARTIN 1250.00 7499 7654
SALESMAN 7521 WARD 1250.00 7499 7654
SALESMAN 7844 TURNER 1500.00 7499 7654
SALESMAN 7499 ALLEN 1600.00 7499 7654
注:有了order by参数后,会对窗口进行缩减(第一行到当前行)
rows between unbounded preceding and unbounded following就是把这个窗口的所有行取出来
查出所有雇员的雇员职位、编号、姓名、薪水以及同职位上薪水最高和最低的雇员的雇员编号(薪水相同的,按雇员编号倒序排列)
方法二:
select job,empno,ename,sal
max(empno)keep(dense_rank first order by sal,empno desc)over(partition by job),
min(empno)keep(dense_rank last order by sal,empno desc)over(partition by job)
from emp
--keep(dense_rank first order by sal,empno desc) 选出按sal、emp倒序排列后的第一行数据(如有并列第一都选出)
AVG()keep()
SUM()keep()
MAX()keep()
keep(dense_rank first order by ....)参考:http://blog.csdn.net/wanghai__/article/details/5011051
②补全缺失的部门编号
with t as
(select (case when deptno=lag(deptno,1,423)over(partition by deptno order by ename)
then null else deptno end) deptno,
ename,
lag(ename)over(partition by deptno order by ename) lower_name,
lead(ename)over(partition by deptno order by ename) higher_name
from emp),
t1 as (select t.*,rownum rn from t)
select (case when deptno is not null
then deptno
else last_value(deptno ignore nulls)over(order by rn) end) deptno,
ename,lower_name,higher_name
from t1
统计分析函数特点
可以有partition by、order by和range/rows子句(此即windowing子句)
–以上子句都不是必须出现的,但若出现windowing子句则其前必须出现order by子句
–没有order by都就是分组统计,有order by就是分组累计
–FIRST/LAST这两个分析函数后面只能跟partition by子句
–FIRST_VALUE和LAST_VALUE用于获取一组有序的数据中的第一个和最后一个值,和MAX/MIN以及FIRST/LAST函数很像,但要注意区别
---------------------------------------------------------------------------
附:
create table lw_sales(dept_id varchar2(6), sale_date date, goods_type varchar2(4), sale_cnt number(10));
COMMENT ON TABLE LW_SALES IS 'lastwinner建立的销售数据测试表。'
/
SET DEFINE OFF;
Insert into LW_SALES
(DEPT_ID, SALE_DATE, GOODS_TYPE, SALE_CNT)
Values
('S00', TO_DATE('03/16/2013 00:00:00', 'MM/DD/YYYY HH24:MI:SS'), 'G00', 400);
Insert into LW_SALES
(DEPT_ID, SALE_DATE, GOODS_TYPE, SALE_CNT)
Values
('S02', TO_DATE('04/05/2013 00:00:00', 'MM/DD/YYYY HH24:MI:SS'), 'G00', 400);
Insert into LW_SALES
(DEPT_ID, SALE_DATE, GOODS_TYPE, SALE_CNT)
Values
('S01', TO_DATE('03/17/2013 00:00:00', 'MM/DD/YYYY HH24:MI:SS'), 'G00', 700);
Insert into LW_SALES
(DEPT_ID, SALE_DATE, GOODS_TYPE, SALE_CNT)
Values
('S01', TO_DATE('04/13/2013 00:00:00', 'MM/DD/YYYY HH24:MI:SS'), 'G00', 900);
Insert into LW_SALES
(DEPT_ID, SALE_DATE, GOODS_TYPE, SALE_CNT)
Values
('S00', TO_DATE('04/18/2013 00:00:00', 'MM/DD/YYYY HH24:MI:SS'), 'G01', 300);
Insert into LW_SALES
(DEPT_ID, SALE_DATE, GOODS_TYPE, SALE_CNT)
Values
('S00', TO_DATE('03/15/2013 00:00:00', 'MM/DD/YYYY HH24:MI:SS'), 'G01', 30);
Insert into LW_SALES
(DEPT_ID, SALE_DATE, GOODS_TYPE, SALE_CNT)
Values
('S01', TO_DATE('04/15/2013 00:00:00', 'MM/DD/YYYY HH24:MI:SS'), 'G01', 400);
Insert into LW_SALES
(DEPT_ID, SALE_DATE, GOODS_TYPE, SALE_CNT)
Values
('S02', TO_DATE('03/10/2013 00:00:00', 'MM/DD/YYYY HH24:MI:SS'), 'G01', 200);
Insert into LW_SALES
(DEPT_ID, SALE_DATE, GOODS_TYPE, SALE_CNT)
Values
('S02', TO_DATE('03/06/2013 00:00:00', 'MM/DD/YYYY HH24:MI:SS'), 'G01', 70);
Insert into LW_SALES
(DEPT_ID, SALE_DATE, GOODS_TYPE, SALE_CNT)
Values
('S01', TO_DATE('03/17/2013 00:00:00', 'MM/DD/YYYY HH24:MI:SS'), 'G01', 400);
Insert into LW_SALES
(DEPT_ID, SALE_DATE, GOODS_TYPE, SALE_CNT)
Values
('S01', TO_DATE('04/07/2013 00:00:00', 'MM/DD/YYYY HH24:MI:SS'), 'G02', 300);
Insert into LW_SALES
(DEPT_ID, SALE_DATE, GOODS_TYPE, SALE_CNT)
Values
('S01', TO_DATE('03/14/2013 00:00:00', 'MM/DD/YYYY HH24:MI:SS'), 'G02', 300);
Insert into LW_SALES
(DEPT_ID, SALE_DATE, GOODS_TYPE, SALE_CNT)
Values
('S02', TO_DATE('03/09/2013 00:00:00', 'MM/DD/YYYY HH24:MI:SS'), 'G02', 900);
Insert into LW_SALES
(DEPT_ID, SALE_DATE, GOODS_TYPE, SALE_CNT)
Values
('S01', TO_DATE('03/16/2013 00:00:00', 'MM/DD/YYYY HH24:MI:SS'), 'G02', 800);
Insert into LW_SALES
(DEPT_ID, SALE_DATE, GOODS_TYPE, SALE_CNT)
Values
('S02', TO_DATE('04/22/2013 00:00:00', 'MM/DD/YYYY HH24:MI:SS'), 'G03', 300);
Insert into LW_SALES
(DEPT_ID, SALE_DATE, GOODS_TYPE, SALE_CNT)
Values
('S00', TO_DATE('03/21/2013 00:00:00', 'MM/DD/YYYY HH24:MI:SS'), 'G03', 200);
Insert into LW_SALES
(DEPT_ID, SALE_DATE, GOODS_TYPE, SALE_CNT)
Values
('S01', TO_DATE('03/17/2013 00:00:00', 'MM/DD/YYYY HH24:MI:SS'), 'G03', 600);
Insert into LW_SALES
(DEPT_ID, SALE_DATE, GOODS_TYPE, SALE_CNT)
Values
('S02', TO_DATE('04/05/2013 00:00:00', 'MM/DD/YYYY HH24:MI:SS'), 'G03', 800);
Insert into LW_SALES
(DEPT_ID, SALE_DATE, GOODS_TYPE, SALE_CNT)
Values
('S01', TO_DATE('04/03/2013 00:00:00', 'MM/DD/YYYY HH24:MI:SS'), 'G03', 200);
Insert into LW_SALES
(DEPT_ID, SALE_DATE, GOODS_TYPE, SALE_CNT)
Values
('S00', TO_DATE('03/06/2013 00:00:00', 'MM/DD/YYYY HH24:MI:SS'), 'G03', 800);
Insert into LW_SALES
(DEPT_ID, SALE_DATE, GOODS_TYPE, SALE_CNT)
Values
('S01', TO_DATE('03/09/2013 00:00:00', 'MM/DD/YYYY HH24:MI:SS'), 'G04', 800);
Insert into LW_SALES
(DEPT_ID, SALE_DATE, GOODS_TYPE, SALE_CNT)
Values
('S01', TO_DATE('03/07/2013 00:00:00', 'MM/DD/YYYY HH24:MI:SS'), 'G04', 30);
Insert into LW_SALES
(DEPT_ID, SALE_DATE, GOODS_TYPE, SALE_CNT)
Values
('S01', TO_DATE('04/20/2013 00:00:00', 'MM/DD/YYYY HH24:MI:SS'), 'G04', 900);
Insert into LW_SALES
(DEPT_ID, SALE_DATE, GOODS_TYPE, SALE_CNT)
Values
('S01', TO_DATE('03/21/2013 00:00:00', 'MM/DD/YYYY HH24:MI:SS'), 'G04', 800);
Insert into LW_SALES
(DEPT_ID, SALE_DATE, GOODS_TYPE, SALE_CNT)
Values
('S00', TO_DATE('03/02/2013 00:00:00', 'MM/DD/YYYY HH24:MI:SS'), 'G04', 500);
Insert into LW_SALES
(DEPT_ID, SALE_DATE, GOODS_TYPE, SALE_CNT)
Values
('S00', TO_DATE('03/10/2013 00:00:00', 'MM/DD/YYYY HH24:MI:SS'), 'G04', 500);
Insert into LW_SALES
(DEPT_ID, SALE_DATE, GOODS_TYPE, SALE_CNT)
Values
('S02', TO_DATE('04/06/2013 00:00:00', 'MM/DD/YYYY HH24:MI:SS'), 'G04', 300);
Insert into LW_SALES
(DEPT_ID, SALE_DATE, GOODS_TYPE, SALE_CNT)
Values
('S00', TO_DATE('03/07/2013 00:00:00', 'MM/DD/YYYY HH24:MI:SS'), 'G05', 400);
Insert into LW_SALES
(DEPT_ID, SALE_DATE, GOODS_TYPE, SALE_CNT)
Values
('S02', TO_DATE('03/06/2013 00:00:00', 'MM/DD/YYYY HH24:MI:SS'), 'G05', 200);
Insert into LW_SALES
(DEPT_ID, SALE_DATE, GOODS_TYPE, SALE_CNT)
Values
('S00', TO_DATE('03/06/2013 00:00:00', 'MM/DD/YYYY HH24:MI:SS'), 'G05', 500);
COMMIT;