0 问题描述
表:Employee
+--------------+---------+
| 列名 | 类型 |
+--------------+---------+
| id | int |
| name | varchar |
| salary | int |
| departmentId | int |
+--------------+---------+
id是此表的主键列。departmentId是Department表中ID的外键。
此表的每一行都表示员工的ID、姓名和工资。它还包含他们所在部门的ID。
表:Department
+-------------+---------+
| 列名 | 类型 |
+-------------+---------+
| id | int |
| name | varchar |
+-------------+---------+
id是此表的主键列。此表的每一行都表示一个部门的ID及其名称。
编写SQL查询以查找每个部门中薪资最高的员工。按任意顺序返回结果表。
查询结果格式如下例所示。
示例 1:
输入:
Employee 表:
+----+-------+--------+--------------+
| id | name | salary | departmentId |
+----+-------+--------+--------------+
| 1 | Joe | 70000 | 1 |
| 2 | Jim | 90000 | 1 |
| 3 | Henry | 80000 | 2 |
| 4 | Sam | 60000 | 2 |
| 5 | Max | 90000 | 1 |
+----+-------+--------+--------------+
Department 表:
+----+-------+
| id | name |
+----+-------+
| 1 | IT |
| 2 | Sales |
+----+-------+
输出:
+------------+----------+--------+
| Department | Employee | Salary |
+------------+----------+--------+
| IT | Jim | 90000 |
| Sales | Henry | 80000 |
| IT | Max | 90000 |
+------------+----------+--------+
1 数据准备
CREATE TABLE Department (
`id` int comment '',
`name` string comment ''
) COMMENT 'xxx'
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t';
insert overwrite table Department
values ('1', 'IT'),
('2', 'Sales');
CREATE TABLE Employee (
`id` int comment '',
`name` string comment '',
salary int,
departmentId int
) COMMENT 'xxx'
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t';
insert overwrite table Employee
values ('1', 'Joe', '70000', '1'),
('2', 'Jim', '90000', '1'),
('3', 'Henry', '80000', '2'),
('4', 'Sam', '60000', '2'),
('5', 'Max', '90000', '1');
2 思路分析
思路一:常规做法
-- MySQL中的语法
select
d.name as department,
e.name as employee,
e.salary as salary
from employee e
join department d on e.departmentId = d.id
where (e.departmentId, e.salary) in
(select departmentId, max(salary) from employee group by departmentId);
-- HiveSQL语法
select
d.name as department,
e.name as employee,
e.salary as salary
from employee e
join department d on e.departmentId = d.id
join (select departmentId, max(salary) m_salary from employee group by departmentId) e2
on e.departmentId = e2.departmentId and e.salary = e2.m_salary;
思路二:开窗函数
select
c.name Department,
b.name Employee,
b.salary
from (
select a.name,
a.salary,
a.departmentId,
rank() over (partition by a.departmentId order by a.salary desc) rk
from Employee a
) b
left join Department c on b.departmentId = c.id
where b.rk = 1;
3 总结
开窗函数简洁一些
参考文章:
https://zhuanlan.zhihu.com/p/615364465
https://blog.csdn.net/thenowaiting/article/details/139876583