LeetCode 184. Department Highest Salary (Python3)

恽劼恒

Published 2023-11-29 17:37:56

Article link: https://blog.csdn.net/yunjieheng/article/details/134695024

Problem:

Table: Employee

+--------------+---------+
| Column Name  | Type    |
+--------------+---------+
| id           | int     |
| name         | varchar |
| salary       | int     |
| departmentId | int     |
+--------------+---------+
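In SQL, id is the primary key of this table.
departmentId is a foreign key referencing id in the Department table (called a join key in Pandas).
Each row of this table contains an employee's id, name, and salary, as well as the id of their department.

Table: Department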

+-------------+---------+
| Column Name | Type    |
+-------------+---------+
| id          | int     |
| name        | varchar |
+-------------+---------+
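In SQL, id is the primary key column of this table.
Each row of this table contains a department's id and its name.

Find the employees who have the highest salary in each department.
Return the result table in any order.
The result format is shown in the following example.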

Source: LeetCode (力扣)
Link: LeetCode (力扣) official site

Example:

Example 1:

Input:

Employee table:
+----+-------+--------+--------------+
| id | name  | salary | departmentId |
+----+-------+--------+--------------+
| 1  | Joe   | 70000  | 1            |
| 2  | Jim   | 90000  | 1            |
| 3  | Henry | 80000  | 2            |
| 4  | Sam   | 60000  | 2            |
| 5  | Max   | 90000  | 1            |
+----+-------+--------+--------------+
Department table:
+----+-------+
| id | name  |
+----+-------+
| 1  | IT    |
| 2  | Sales |
+----+-------+

Output:

+------------+----------+--------+
| Department | Employee | Salary |
+------------+----------+--------+
| IT         | Jim      | 90000  |
| Sales      | Henry    | 80000  |
| IT         | Max      | 90000  |
+------------+----------+--------+

Explanation: Max and Jim both have the highest salary in the IT department, and Henry has the highest salary in the Sales department.

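Solution:

Left-join the Employee table to the Department table, then sort the merged table by department name and by salary in descending order, so that the first row of each department carries that department's highest salary. Because more than one employee may earn the highest salary, iterate over the merged table and collect every row whose salary equals the maximum for its department.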
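For comparison, the same result can be sketched without an explicit loop. This is not the approach used in this article's code, but a vectorized alternative based on `groupby(...).transform('max')`; the function name `highest_salary_vectorized` is hypothetical, and the column names are assumed to match the problem statement.

```python
import pandas as pd


def highest_salary_vectorized(employee: pd.DataFrame, department: pd.DataFrame) -> pd.DataFrame:
    # Left-join employees to their departments; the overlapping 'id' and 'name'
    # columns get the default '_x' (employee) and '_y' (department) suffixes.
    m = employee.merge(department, how='left', left_on='departmentId', right_on='id')
    # Broadcast each department's maximum salary back onto every row.
    top = m.groupby('departmentId')['salary'].transform('max')
    # Keep every row that ties with its department's maximum.
    result = m.loc[m['salary'] == top, ['name_y', 'name_x', 'salary']]
    return result.rename(columns={'name_y': 'Department', 'name_x': 'Employee', 'salary': 'Salary'})


# Data from Example 1.
employee = pd.DataFrame(
    [[1, 'Joe', 70000, 1], [2, 'Jim', 90000, 1], [3, 'Henry', 80000, 2],
     [4, 'Sam', 60000, 2], [5, 'Max', 90000, 1]],
    columns=['id', 'name', 'salary', 'departmentId'],
)
department = pd.DataFrame([[1, 'IT'], [2, 'Sales']], columns=['id', 'name'])

result = highest_salary_vectorized(employee, department)
print(result)
```

On the example data this keeps Jim, Henry, and Max. Comparing each salary with the per-department maximum handles ties naturally, which is the same concern the loop-based solution addresses by hand.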

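Key points:

1. DataFrame.itertuples(self, index=True, name='Pandas'): returns an iterator that yields a named tuple for each row of the DataFrame. The first element of the tuple is the row's index value, and the remaining elements are the row's values. index: whether to return the index as the first element of the tuple (default True); name: the name of the returned namedtuple. For example: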
import pandas as pd

data = [[1, 'Joe', 70000, 1], [2, 'Jim', 90000, 1], [3, 'Henry', 80000, 2], [4, 'Sam', 60000, 2], [5, 'Max', 90000, 1]]
employee = pd.DataFrame(data, columns=['id', 'name', 'salary', 'departmentId']).astype({'id': 'Int64', 'name': 'object', 'salary': 'Int64', 'departmentId': 'Int64'})
# Each row is printed as a namedtuple called 'Pandas' whose first field is Index.
for item in employee.itertuples():
    print(item)
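2. getattr(object, name[, default]): returns the value of an object's attribute. object: the object; name: the attribute name, given as a string; default: the value returned when the attribute does not exist (without it, a missing attribute raises AttributeError).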
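The two points above can be combined in a short sketch: rows produced by `itertuples` are namedtuples, so their fields can be read either as attributes or through `getattr`. The two-row table and the `bonus` field name are assumptions made only for illustration.

```python
import pandas as pd

employee = pd.DataFrame(
    [[1, 'Joe', 70000, 1], [2, 'Jim', 90000, 1]],
    columns=['id', 'name', 'salary', 'departmentId'],
)

# Each row is a namedtuple: 'Index' holds the row label, and the remaining
# fields are named after the columns.
first = next(employee.itertuples())
print(first._fields)  # ('Index', 'id', 'name', 'salary', 'departmentId')
print(first.name)     # plain attribute access

# getattr is equivalent to attribute access, but takes the field name as a
# string and can fall back to a default when the attribute does not exist.
same_name = getattr(first, 'name')
missing = getattr(first, 'bonus', None)  # 'bonus' is a hypothetical, nonexistent field

# index=False drops the 'Index' field; name sets the namedtuple's type name.
row = next(employee.itertuples(index=False, name='Emp'))
print(type(row).__name__, row._fields)
```

Since the column names used in this problem are valid Python identifiers, `item.salary` and `getattr(item, 'salary')` are interchangeable; `getattr` is mainly useful when the field name is only known as a string at run time.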

Code:

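import pandas as pd
from collections import defaultdict


def department_highest_salary(employee: pd.DataFrame, department: pd.DataFrame) -> pd.DataFrame:
    # Department name -> list of [salary, row index] for that department's top earners.
    dic = defaultdict(list)
    # Both tables have 'id' and 'name' columns, so after the merge the employee
    # name is 'name_x' and the department name is 'name_y'.
    m = pd.merge(employee, department, how='left', left_on='departmentId', right_on='id')[['name_y', 'name_x', 'salary']].sort_values(['name_y', 'salary'], ascending=[True, False], ignore_index=True)
    for item in m.itertuples():
        # Salaries are sorted in descending order within each department, so the
        # first row seen holds the maximum; later rows are kept only if they tie.
        if len(dic[getattr(item, 'name_y')]) == 0 or getattr(item, 'salary') == dic[getattr(item, 'name_y')][0][0]:
            dic[getattr(item, 'name_y')].append([getattr(item, 'salary'), getattr(item, 'Index')])
    rows = []
    # Collect the row indices stored for every department.
    for item in list(map(lambda x: list(zip(*x))[1], dic.values())):
        rows.extend(item)
    # Rename the columns to the headers shown in the expected output.
    return m.iloc[rows].rename(columns={'name_y': 'Department', 'name_x': 'Employee', 'salary': 'Salary'})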
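The `name_x` and `name_y` columns used in the code come from the default suffixes that pandas adds when both sides of a merge share a column name. The sketch below, using the data from Example 1, shows the merged column names and the sorted intermediate table that the loop iterates over; the variable name `sorted_m` is an assumption for illustration.

```python
import pandas as pd

employee = pd.DataFrame(
    [[1, 'Joe', 70000, 1], [2, 'Jim', 90000, 1], [3, 'Henry', 80000, 2],
     [4, 'Sam', 60000, 2], [5, 'Max', 90000, 1]],
    columns=['id', 'name', 'salary', 'departmentId'],
)
department = pd.DataFrame([[1, 'IT'], [2, 'Sales']], columns=['id', 'name'])

# 'id' and 'name' exist in both tables, so the employee columns get '_x'
# and the department columns get '_y'.
m = pd.merge(employee, department, how='left', left_on='departmentId', right_on='id')
print(list(m.columns))
# ['id_x', 'name_x', 'salary', 'departmentId', 'id_y', 'name_y']

# Sort by department name, then by salary descending; ignore_index=True
# renumbers the rows 0..n-1 so that 'Index' matches the positions used by iloc.
sorted_m = m[['name_y', 'name_x', 'salary']].sort_values(
    ['name_y', 'salary'], ascending=[True, False], ignore_index=True
)
print(sorted_m)
```

Because the index is reset after sorting, the `Index` values collected in the loop can be passed directly to `m.iloc`.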