oracle 窗口函数需求总结

什么是窗口函数

        窗口函数也叫作OLAP(Online Analytical Processing,联机分析处理)函数,可以对数据库中的数据进行复杂分析。

窗口函数的通用语法如下:

<窗口函数>over(partition by<用于分组的列名>

order by <用于排序的列名>)

我们看一下这个语法里每部分表示什么。

(1)<窗口函数>的位置可以放两种函数:一种是专用窗口函数,比如用于排名的函数,比如rank()、dense_rank()、row_number();另一种是汇总函数,比如sum()、avg()、count()、max()、min()。

(2)<窗口函数>后面的over关键字括号里的内容有两部分:一个是partition by,表示按某列分组;另一个是order by,表示对分组后的结果按某列排序。

(3)因为窗口函数通常是对where或者group by子句处理后的结果进行操作的,所以窗口函数原则上只能写在select子句中。

窗口函数可以解决这几类经典问题:排名问题、Top N问题、前百分之N问题、累计问题、每组内比较问题、连续问题。这些问题在工作中你会经常遇到,比如,排名问题,对用户搜索关键字按搜索次数排名、对商品按销售量排名。

第一类:排名问题

1.学生成绩排名.

测试数据:

-- 创建表格
CREATE TABLE t_score (
    student_name VARCHAR2(50),
    course_name VARCHAR2(50),
    score NUMBER
);

-- 插入数据
INSERT INTO t_score (student_name, course_name, score) VALUES ('小明', '数学', 85);
INSERT INTO t_score (student_name, course_name, score) VALUES ('小明', '英语', 78);
INSERT INTO t_score (student_name, course_name, score) VALUES ('小明', '物理', 92);
INSERT INTO t_score (student_name, course_name, score) VALUES ('小红', '数学', 90);
INSERT INTO t_score (student_name, course_name, score) VALUES ('小红', '英语', 80);
INSERT INTO t_score (student_name, course_name, score) VALUES ('小李', '数学', 90);
INSERT INTO t_score (student_name, course_name, score) VALUES ('小李', '数学', 60);
INSERT INTO t_score (student_name, course_name, score) VALUES ('小李', '英语', 85);
INSERT INTO t_score (student_name, course_name, score) VALUES ('小李', '物理', 85);
commit;
-- 取得每名学生不同课程的成绩排名.

sql语句实现:

2.去除最大值、最小值后求平均值

测试数据:

-- 创建表格
CREATE TABLE t_salary_table (
    employee_id NUMBER,
    department_id NUMBER,
    salary NUMBER
);

-- 插入数据
INSERT INTO t_salary_table (employee_id, department_id, salary) VALUES (1, 1, 50000);
INSERT INTO t_salary_table (employee_id, department_id, salary) VALUES (2, 1, 52000);
INSERT INTO t_salary_table (employee_id, department_id, salary) VALUES (3, 1, 48000);
INSERT INTO t_salary_table (employee_id, department_id, salary) VALUES (4, 1, 51000);
INSERT INTO t_salary_table (employee_id, department_id, salary) VALUES (5, 1, 49000);
INSERT INTO t_salary_table (employee_id, department_id, salary) VALUES (6, 2, 60000);
INSERT INTO t_salary_table (employee_id, department_id, salary) VALUES (7, 2, 58000);
INSERT INTO t_salary_table (employee_id, department_id, salary) VALUES (8, 2, 62000);
INSERT INTO t_salary_table (employee_id, department_id, salary) VALUES (9, 2, 59000);
INSERT INTO t_salary_table (employee_id, department_id, salary) VALUES (10, 2, 61000);


-- 查询数据
SELECT * FROM t_salary_table;

-- 查询每个部门去除最高、最低薪水后的平均薪水

sql实现:

第二类:TOP N问题

工作中会经常遇到这样的业务问题:

         如何找到每个类别下用户最喜欢的商品?

         如何找到每个类别下用户点击最多的5个商品?

 这类问题其实就是非常经典的Top N问题,也就是在对数据分组后,取每组里的最大值、最小值,或者每组里最大的N行(Top N)数据

1.查询前三名的成绩

数据:

-- 删除表格
DROP TABLE t_score;

-- 创建表格
CREATE TABLE t_score (
    course_id NUMBER,
    student_id NUMBER,
    score NUMBER
);

-- 插入数据
INSERT INTO t_score VALUES (1, 1, 85);
INSERT INTO t_score VALUES (1, 2, 78);
INSERT INTO t_score VALUES (1, 3, 92);
INSERT INTO t_score VALUES (1, 4, 90);
INSERT INTO t_score VALUES (1, 5, 80);
INSERT INTO t_score VALUES (1, 6, 92);
INSERT INTO t_score VALUES (1, 7, 78);
INSERT INTO t_score VALUES (1, 8, 92);
INSERT INTO t_score VALUES (1, 9, 85);
INSERT INTO t_score VALUES (2, 1, 88);
INSERT INTO t_score VALUES (2, 2, 82);
INSERT INTO t_score VALUES (2, 3, 90);
INSERT INTO t_score VALUES (2, 4, 85);
INSERT INTO t_score VALUES (2, 5, 78);
INSERT INTO t_score VALUES (2, 6, 88);
INSERT INTO t_score VALUES (2, 7, 82);
INSERT INTO t_score VALUES (2, 8, 90);
INSERT INTO t_score VALUES (2, 9, 82);

COMMIT;

-- 查询数据
SELECT * FROM t_score;

# todo 题目要求: “成绩表”中记录了学生选修的课程号、学生的学号,以及对应课程的成绩。
为了对学生成绩进行考核,现需要查询每门课程前三名学生的成绩。
# todo 注意:如果出现同样的成绩,则视为同一个名次。

2: 查询排在前两名的工资

数据:

-- 删除表格
DROP TABLE t_employee;

-- 创建表格
CREATE TABLE t_employee (
    emp_id NUMBER,
    emp_name VARCHAR2(50),
    salary NUMBER,
    department_id NUMBER
);

-- 插入数据
INSERT INTO t_employee (emp_id, emp_name, salary, department_id)
SELECT 1, '小明', 50000, 1 FROM dual UNION ALL
SELECT 2, '小红', 52000, 1 FROM dual UNION ALL
SELECT 3, '小李', 48000, 1 FROM dual UNION ALL
SELECT 4, '小张', 60000, 1 FROM dual UNION ALL
SELECT 5, '小王', 58000, 1 FROM dual UNION ALL
SELECT 6, '小刚', 62000, 1 FROM dual UNION ALL
SELECT 7, '小丽', 45000, 2 FROM dual UNION ALL
SELECT 8, '小芳', 47000, 2 FROM dual UNION ALL
SELECT 9, '小晓', 49000, 2 FROM dual UNION ALL
SELECT 10, '小华', 52000, 2 FROM dual UNION ALL
SELECT 11, '小雷', 52000, 2 FROM dual;

COMMIT;

-- 查询数据
SELECT * FROM t_employee;
-- 现在要查找每个部门工资排在前两名的雇员信息,若雇员工资一样,则并列获取。

sql实现:

with t1 as (
    select t_employee.*, dense_rank() over (partition by department_id order by salary desc) dr
    from t_employee
)
select * from t1 where dr<=2

运行:

第三类:累计问题

查询员工的累计薪水

        汇总函数sum()用在窗口函数中,表示对数据进行累计求和

        “rows between <范围起始行> and <范围终止行>”用于指定移动窗口的范围,范围包含起始行和终止行。

        range between  连续月份

        其中,“范围起始行”和“范围终止行”使用特定关键字表示,常用的特定关键字如下。

       • n preceding:当前行的前n行。

       • n following:当前行的后n行。

       • current row:当前行。

       • unbounded preceding:第1行。

       • unbounded following:最后1行。

测试数据:

-- 579. 查询员工的累计薪水
drop table t_employee;
create table t_employee (id int, month int, salary int);
truncate table t_employee;

insert into t_employee values ('1', '1', '20');
insert into t_employee values ('2', '1', '20');
insert into t_employee values ('1', '2', '30');
insert into t_employee values ('2', '2', '30');
insert into t_employee values ('3', '2', '40');
insert into t_employee values ('1', '3', '40');
insert into t_employee values ('3', '3', '60');
insert into t_employee values ('1', '4', '60');
insert into t_employee values ('3', '4', '70');
insert into t_employee values ('1', '7', '90');
insert into t_employee values ('1', '8', '90');

commit;
select * from t_employee order by id, month desc;
-- 编写一个解决方案,在一个统一的表中计算出每个员工的 累计工资汇总 。
-- 员工的 累计工资汇总 可以计算如下:
-- 对于该员工工作的每个月,将 该月 和 前两个月 的工资 加 起来。
    --这是他们当月的 3 个月总工资和 
-- 返回按 id 升序排序 的结果表。如果 id 相等,请按 month 降序排序。
select
    t_employee.*,
    sum(salary)
    over (partition by id order by month desc range between current row and 2 following ) sum_s
from t_employee
order by id,month desc;

第四类:每组内比较问题

1.每组大于平均值

测试数据:


-- 创建表格
CREATE TABLE t_score (
    student_name VARCHAR2(20),
    course_name VARCHAR2(20),
    score NUMBER(3)
);

-- 插入数据
INSERT INTO t_score VALUES ('张三', '语文', 90);
INSERT INTO t_score VALUES ('李四', '语文', 81);
INSERT INTO t_score VALUES ('王朝', '语文', 79);
INSERT INTO t_score VALUES ('马汉', '语文', 88);
INSERT INTO t_score VALUES ('张三', '数学', 85);
INSERT INTO t_score VALUES ('李四', '数学', 86);
INSERT INTO t_score VALUES ('王朝', '数学', 92);
INSERT INTO t_score VALUES ('马汉', '数学', 83);
INSERT INTO t_score VALUES ('张三', '英语', 87);
INSERT INTO t_score VALUES ('李四', '英语', 98);
INSERT INTO t_score VALUES ('王朝', '英语', 93);
INSERT INTO t_score VALUES ('马汉', '英语', 95);

COMMIT;

-- 查询数据
SELECT * FROM t_score;

# todo “成绩表”,记录了每个学生各科的成绩。现在要查找单科成绩高于该科目平均成绩的学生名单。

实现:

-- t1求每个科目的平均值
with t1 as (   select t_score.*, avg(score) over ( partition by course_name) avg
from t_score)
-- 过滤单科成绩高于该科目平均成绩的学生名单
select  course_name,student_name from t1 where score>avg;

2.低于平均薪水的雇员

-- 创建员工表
CREATE TABLE t_employee (
    employeeID INT,
    departmentID INT,
    salary INT
);

-- 插入数据
INSERT INTO t_employee (employeeID, departmentID, salary) VALUES (10001, 1, 60117);
INSERT INTO t_employee (employeeID, departmentID, salary) VALUES (10002, 2, 92102);
INSERT INTO t_employee (employeeID, departmentID, salary) VALUES (10003, 2, 86074);
INSERT INTO t_employee (employeeID, departmentID, salary) VALUES (10004, 1, 66596);
INSERT INTO t_employee (employeeID, departmentID, salary) VALUES (10005, 1, 66961);
INSERT INTO t_employee (employeeID, departmentID, salary) VALUES (10006, 2, 81046);
INSERT INTO t_employee (employeeID, departmentID, salary) VALUES (10007, 2, 94333);
INSERT INTO t_employee (employeeID, departmentID, salary) VALUES (10008, 1, 75286);
INSERT INTO t_employee (employeeID, departmentID, salary) VALUES (10009, 2, 85994);
INSERT INTO t_employee (employeeID, departmentID, salary) VALUES (10010, 1, 76884);
-- 现在公司要找出每个部门低于平均薪水的雇员,然后进行培训来提高雇员工作效率,从而提高雇员薪水。

sql实现:

-- 现在公司要找出每个部门低于平均薪水的雇员,然后进行培训来提高雇员工作效率,从而提高雇员薪水。
with t1 as (
    select t_employee.* ,avg(salary) over ( partition by departmentID ) avg from t_employee
)
select * from t1 where salary < avg order by departmentID;

第五类:连续性问题

1.连续3次为球队得分的球员名单

-- 创建分数表,并为列名增加注释
drop table t_score;

CREATE TABLE t_score (
    team_name VARCHAR2(50),
    player_id INT,
    player_name VARCHAR2(50),
    score INT,
    score_time TIMESTAMP
);

COMMENT ON COLUMN t_score.team_name IS '球队名称';
COMMENT ON COLUMN t_score.player_id IS '球员ID';
COMMENT ON COLUMN t_score.player_name IS '球员姓名';
COMMENT ON COLUMN t_score.score IS '得分';
COMMENT ON COLUMN t_score.score_time IS '得分时间';


INSERT INTO t_score VALUES ('洛杉矶湖人队', 23, '勒布朗·詹姆斯', 3, TO_TIMESTAMP('2023-12-25 10:00:00', 'YYYY-MM-DD HH24:MI:SS'));
INSERT INTO t_score VALUES ('洛杉矶湖人队', 23, '勒布朗·詹姆斯', 3, TO_TIMESTAMP('2023-12-25 10:15:00', 'YYYY-MM-DD HH24:MI:SS'));
INSERT INTO t_score VALUES ('洛杉矶湖人队', 23, '勒布朗·詹姆斯', 1, TO_TIMESTAMP('2023-12-25 10:30:00', 'YYYY-MM-DD HH24:MI:SS'));
INSERT INTO t_score VALUES ('洛杉矶湖人队', 3, '安东尼·戴维斯', 2, TO_TIMESTAMP('2023-12-25 10:32:00', 'YYYY-MM-DD HH24:MI:SS'));
INSERT INTO t_score VALUES ('洛杉矶湖人队', 23, '勒布朗·詹姆斯', 3, TO_TIMESTAMP('2023-12-25 10:45:00', 'YYYY-MM-DD HH24:MI:SS'));
INSERT INTO t_score VALUES ('洛杉矶湖人队', 23, '勒布朗·詹姆斯', 3, TO_TIMESTAMP('2023-12-25 11:00:00', 'YYYY-MM-DD HH24:MI:SS'));
INSERT INTO t_score VALUES ('洛杉矶湖人队', 23, '勒布朗·詹姆斯', 2, TO_TIMESTAMP('2023-12-25 11:15:00', 'YYYY-MM-DD HH24:MI:SS'));
INSERT INTO t_score VALUES ('洛杉矶湖人队', 23, '勒布朗·詹姆斯', 2, TO_TIMESTAMP('2023-12-25 11:30:00', 'YYYY-MM-DD HH24:MI:SS'));
INSERT INTO t_score VALUES ('金州勇士队', 30, '斯蒂芬·库里', 1, TO_TIMESTAMP('2023-12-25 10:10:00', 'YYYY-MM-DD HH24:MI:SS'));
INSERT INTO t_score VALUES ('金州勇士队', 30, '斯蒂芬·库里', 1, TO_TIMESTAMP('2023-12-25 10:25:00', 'YYYY-MM-DD HH24:MI:SS'));
INSERT INTO t_score VALUES ('金州勇士队', 30, '斯蒂芬·库里', 1, TO_TIMESTAMP('2023-12-25 10:40:00', 'YYYY-MM-DD HH24:MI:SS'));
INSERT INTO t_score VALUES ('金州勇士队', 11, '克莱·汤普森', 2, TO_TIMESTAMP('2023-12-25 10:45:00', 'YYYY-MM-DD HH24:MI:SS'));
INSERT INTO t_score VALUES ('金州勇士队', 30, '斯蒂芬·库里', 2, TO_TIMESTAMP('2023-12-25 10:55:00', 'YYYY-MM-DD HH24:MI:SS'));
INSERT INTO t_score VALUES ('金州勇士队', 30, '斯蒂芬·库里', 2, TO_TIMESTAMP('2023-12-25 11:10:00', 'YYYY-MM-DD HH24:MI:SS'));
INSERT INTO t_score VALUES ('金州勇士队', 30, '斯蒂芬·库里', 3, TO_TIMESTAMP('2023-12-25 11:25:00', 'YYYY-MM-DD HH24:MI:SS'));
INSERT INTO t_score VALUES ('金州勇士队', 30, '斯蒂芬·库里', 3, TO_TIMESTAMP('2023-12-25 11:40:00', 'YYYY-MM-DD HH24:MI:SS'));
INSERT INTO t_score VALUES ('金州勇士队', 30, '斯蒂芬·库里', 3, TO_TIMESTAMP('2023-12-25 11:55:00', 'YYYY-MM-DD HH24:MI:SS'));

-- 查询数据
SELECT * FROM t_score;
-- 请你写一个SQL语句,统计出连续3次为球队得分的球员名单。

sql实现:

第一种方法:  lead() 适用小范围连续

with t1 as (   SELECT t_score.*,
       -- 向下拿一行
       lead(player_id,1) over ( partition by team_name order by score_time) rn1,
       -- 向下拿两行
       lead(player_id,1) over ( partition by team_name order by score_time) rn2
FROM t_score)
       -- 过滤连续三次的id
select distinct player_name,player_id,team_name from t1 
where player_id = rn1 and rn1 = rn2;

第二种方法:等差数列 通用

分析:

with t1 as(
select
       Cinema.*,
       row_number() over (partition by free order by seat_id) as rn1,
       seat_id - (row_number() over (partition by free order by seat_id)) as 差值
from Cinema),
t2 as (
select seat_id,count(差值) over (partition by 差值) as 计数 from t1 where free=1)
select seat_id from t2 where 计数>1 order by seat_id;

2.连续空余座位

方法1:

Create table Cinema
(
    seat_id number primary key,
    free    varchar(2)
);

insert into Cinema (seat_id, free)
values ('1', '1');
insert into Cinema (seat_id, free)
values ('2', '0');
insert into Cinema (seat_id, free)
values ('3', '1');
insert into Cinema (seat_id, free)
values ('4', '1');
insert into Cinema (seat_id, free)
values ('5', '1');
-- 查找连续2个及以上座位可用的座位号
select *
from cinema;
with t1 as (   select Cinema.*,
       lead(free,1) over ( order by seat_id) rn1,
       lag(free,1) over ( order by seat_id) rn2
from Cinema)
select seat_id from t1
where free = 1 and rn1 = 1
      -- 当前为1,rn1=null,上一行也为1 --当前座位也是空的
    or(free=1 and rn1 is null and rn2=1)

方法2:

with t1 as(
select
       Cinema.*,
       row_number() over (partition by free order by seat_id) as rn1,
       seat_id - (row_number() over (partition by free order by seat_id)) as 差值
from Cinema),
t2 as (
select seat_id,count(差值) over (partition by 差值) as 计数 from t1 where free=1)
select seat_id from t2 where 计数>1 order by seat_id;

3.连续出现的数字


Create table  Logs (id int, num int);
Truncate table Logs;
insert into Logs (id, num) values ('1', '1');
insert into Logs (id, num) values ('2', '1');
insert into Logs (id, num) values ('3', '1');
insert into Logs (id, num) values ('4', '2');
insert into Logs (id, num) values ('5', '1');
insert into Logs (id, num) values ('6', '2');
insert into Logs (id, num) values ('7', '2');

select * from logs;

# todo 需求: 找出所有至少连续出现三次的数字。
# todo 返回的结果表中的数据可以按 任意顺序 排列。

sql实现:

-- 方法1:
with t1 as (
select
       Logs.*,
       id - (row_number() over (partition by num order by id)) 差值
from Logs),
     t2 as (
select num,count(1) over (partition by 差值,num) as 计数 from t1)
select distinct num from t2 where 计数>=3;


方法2 lag lead
with t1 as (
select
       Logs.*,
       lag(num) over (order by id) lag1,
       lead(num) over (order by id) lead1
from Logs)
select distinct num from t1 where t1.num=lag1 and t1.num=lead1;

运行结果:

  • 20
    点赞
  • 25
    收藏
    觉得还不错? 一键收藏
  • 0
    评论
引用提到了Oracle中的窗口函数窗口函数是一种用于查询和计算的特殊函数。它可以和其他函数(如统计函数、排名函数、错行函数)联合使用,能够在一次查询中进行多步计算。窗口函数的常用属性是partition by和order by,其中partition by类似于group by,用于将结果分成不同的分区,而order by用于指定排序的顺序。 引用给出了一个使用窗口函数的例子,查询每个同学单科成绩和班级相应单科成绩的差值。使用窗口函数可以简化这个查询,代码如下: ```sql select t.*, (t.score-avg(t.score) over( partition by t.subject_id)) as gaps from test_student_score t ``` 这个查询使用了over函数和partition by子句来计算每个同学单科成绩和班级相应单科成绩的差值。 引用中展示了另一个常用的窗口函数示例,用于去重。这个查询使用了row_number()函数配合over函数和partition by子句来对数据进行编号,并筛选出每个分组中的第一行数据。 总结来说,Oracle中的窗口函数是一种强大的工具,可以简化复杂的查询和计算,常用的属性有partition by和order by。它可以和其他函数联合使用,可以满足各种数据分析和计算的需求。<span class="em">1</span><span class="em">2</span><span class="em">3</span> #### 引用[.reference_title] - *1* *2* [Oracle中的窗口函数](https://blog.csdn.net/wyqwilliam/article/details/82555712)[target="_blank" data-report-click={"spm":"1018.2226.3001.9630","extra":{"utm_source":"vip_chatgpt_common_search_pc_result","utm_medium":"distribute.pc_search_result.none-task-cask-2~all~insert_cask~default-1-null.142^v92^chatsearchT0_1"}}] [.reference_item style="max-width: 50%"] - *3* [oracle的分组排序函数(窗口函数)](https://blog.csdn.net/m0_46636892/article/details/122984409)[target="_blank" data-report-click={"spm":"1018.2226.3001.9630","extra":{"utm_source":"vip_chatgpt_common_search_pc_result","utm_medium":"distribute.pc_search_result.none-task-cask-2~all~insert_cask~default-1-null.142^v92^chatsearchT0_1"}}] [.reference_item style="max-width: 50%"] [ .reference_list ]
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值