1.sum(column) over(partition by column order by column)
2.ratio_to_report(column) over(partition by column order by column)
为了更易理解上述函数在数据分析所能发挥的作用,下面将举例进行讲解:
create table T_salary(
F_depart varchar2(20),
F_EMP varchar2(20),
F_salary integer
);
truncate table t_salary;
--插入测试数据
insert into t_salary(f_depart, f_emp, f_salary)
select 'IT','zhangsan',10000 from dual
union all
select 'IT','lisi',2000 from dual
union all
select 'IT','yangba',2000 from dual
union all
select 'HR','wangwu',3000 from dual
union all
select 'HR','chenjiu',3000 from dual
union all
select 'HR','zhaoliu',10000 from dual;
commit;
--统计各部门F_SALARY总数
SELECT F_DEPART,F_EMP,F_SALARY,SUM(F_SALARY) OVER(PARTITION BY F_SALARY)
FROM T_salary
--统计各部门F_SALARY累计和
SELECT F_DEPART,F_EMP,F_SALARY,SUM(F_SALARY) OVER(PARTITION BY F_SALARY order by F_EMP)
FROM T_salary
--计算每个员员工占所在部门的工资比例
SELECT F_DEPART,F_EMP,F_SALARY,
RATIO_TO_REPORT(F_SALARY) OVER(PARTITION BY F_DEPART) RATIO_SALARY
--计算累计占比
select f_depart,f_emp,f_salary,
sum(f_salary) over(partition by f_depart order by f_emp/* RANGE unbounded preceding*/)/sum(f_salary) over(partition by f_depart) sum_salary
from t_salary;
--计算累计占比
select f_depart,f_emp,f_salary,
sum(f_salary) over(partition by f_depart order by f_emp RANGE unbounded preceding)/sum(f_salary) over(partition by f_depart) sum_salary
from t_salary;
--计算部门所占公司的工资比例
ORACLE OLAP函数:
语法:FUNCTION_NAME(,,...) OVER()
OLAP函数语法四个部分:
1.function函数本身,用于对窗口中的数据进行操作;
2.partition clause用于将结果集分区;
3.order by clause用于对分区中的数据进行排序;
4.windowing clause用于定义function在其上操作的行的集合,即function所影响的范围。
(1)通过如下实例来理解Order by的用法:
select f_depart,f_emp,f_salary,
sum(f_salary) over() sum_salary
from t_salary;
得到的结果集如下:
IT ZHANGSAN 10000 38000
IT LISI 2000 38000
HR WANGWU 3000 38000
HR ZHAOLIU 10000 38000
HR YANGTAO 5000 38000
IT YANGBO 8000 38000
2.下面的语句将是带有order by的查询语句
select f_depart,f_emp,f_salary,
sum(f_salary) over(order by f_emp) sum_salary
from t_salary;
得到的结果集如下:
IT LISI 2000 2000
HR WANGWU 3000 5000
IT YANGBO 8000 13000
HR YANGTAO 5000 18000
IT ZHANGSAN 10000 28000
HR ZHAOLIU 10000 38000
(二)用于排列的函数
select f_depart,f_emp,f_salary,
rank() over(partition by f_depart order by f_salary) rank1,
dense_rank () over(partition by f_depart order by f_salary) rank2,
row_number() over(partition by f_depart order by f_salary) rank2
from t_salary;
上面的例子使用了RANK,dense_rank以及row_number()函数,下面依次简单介绍一下这三个函数的用处。Rank函数即在给定的结果集中按照指定的顺序进行值得排序,而后者dense_rank和前者的区别仅在于前者对于相等的值排名是跳跃的,而后者不。Row_number也是提供的排序的功能。注意区分rownum关键字和row_number函数的区别。
Select * from dept where rownum<=3 order by deptno;
这个查询语句没问题,结果会达到你预期的效果。但是当你试图执行如下语句的时候就可能无法达到你想要的效果!select * from dept where rownum between 2 and 3 order by deptno.原因很简单,这里不再讲述。
(三)开窗语句“rows 2 preceding”
select f_depart,f_emp,f_salary,
sum(f_salary) over(order by f_emp rows 1 preceding) sum_salary
from t_salary;
结果将如下所示:
IT LISI 2000 2000
HR WANGWU 3000 5000
IT YANGBO 8000 11000
HR YANGTAO 5000 13000
IT ZHANGSAN 10000 15000
HR ZHAOLIU 10000 20000
由上可以看出通过使用”rows 1 preceding”,将会使求和窗口变为当前行及其前一行。
“rows n preceding”将当前行及其前面的n行作为一个窗口。
Range unbounded preceding会把当前之前的所有行都包含进来,但当包含partition by column 语句时,则只会将当前partition by语句包含的窗口包含进来。
(四)first_value,last_value分析函数,分别返回结果集中排在第一位和最后一位的两个分析函数。
select f_depart,f_emp,f_salary,
first_value(f_emp) over(partition by f_depart order by f_emp) first_name,
sum(f_salary) over(partition by f_depart order by f_emp) sum_salary
from t_salary;
上述函数执行结果如下:
HR WANGWU 3000 WANGWU 3000
HR YANGTAO 5000 WANGWU 8000
HR ZHAOLIU 10000 WANGWU 18000
IT LISI 2000 LISI 2000
IT YANGBO 8000 LISI 10000
IT ZHANGSAN 10000 LISI 20000