Leetcode 185. 部门工资前三高的所有员工
题目链接:185.部门公司前三高的所有员工
为了方便自己调试,我们可以在本地mysql建库建表
DROP DATEBASE IF EXISTS leetcode185;
CREATE DATABASE leetcode185;
USE leetcode185;
DROP TABLE IF EXISTS Employee;
CREATE TABLE `Employee` (
`Id` int NOT NULL AUTO_INCREMENT,
`Name` varchar(10) NOT NULL,
`Salary` int NOT NULL,
`DepartmentId` int NOT NULL,
PRIMARY KEY (`Id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
DROP TABLE IF EXISTS Department;
CREATE TABLE `Department` (
`Id` int NOT NULL AUTO_INCREMENT,
`Name` varchar(10) NOT NULL,
PRIMARY KEY (`Id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
INSERT INTO `Employee`(`Name`, `Salary`, `DepartmentId`) VALUES
('Joe', 85000, 1),
('Henry', 80000, 2),
('Sam', 60000, 2),
('Max', 90000, 1),
('Janet', 69000, 1),
('Randy', 85000, 1),
('Will', 70000, 1);
INSERT INTO `Department`(`Id`, `Name`) VALUES (1, 'IT'), (2, 'Sales');
我们可以查看我们建好的表
mysql> USE leetcode185;
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A
Database changed
mysql> SHOW TABLES;
+-----------------------+
| Tables_in_leetcode185 |
+-----------------------+
| Department |
| Employee |
+-----------------------+
2 rows in set (0.00 sec)
mysql> SHOW COLUMNS FROM Employee;
+--------------+-------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+--------------+-------------+------+-----+---------+----------------+
| Id | int | NO | PRI | NULL | auto_increment |
| Name | varchar(10) | NO | | NULL | |
| Salary | int | NO | | NULL | |
| DepartmentId | int | NO | | NULL | |
+--------------+-------------+------+-----+---------+----------------+
4 rows in set (0.01 sec)
mysql> SHOW COLUMNS FROM Department;
+-------+-------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-------+-------------+------+-----+---------+----------------+
| Id | int | NO | PRI | NULL | auto_increment |
| Name | varchar(10) | NO | | NULL | |
+-------+-------------+------+-----+---------+----------------+
2 rows in set (0.00 sec)
mysql> SELECT * FROM Employee;
+----+-------+--------+--------------+
| Id | Name | Salary | DepartmentId |
+----+-------+--------+--------------+
| 1 | Joe | 85000 | 1 |
| 2 | Henry | 80000 | 2 |
| 3 | Sam | 60000 | 2 |
| 4 | Max | 90000 | 1 |
| 5 | Janet | 69000 | 1 |
| 6 | Randy | 85000 | 1 |
| 7 | Will | 70000 | 1 |
+----+-------+--------+--------------+
7 rows in set (0.00 sec)
mysql> SELECT * FROM Department;
+----+-------+
| Id | Name |
+----+-------+
| 1 | IT |
| 2 | Sales |
+----+-------+
2 rows in set (0.01 sec)
mysql>
没什么问题的话我们就可以开始思考。
首先,sql的底层实现肯定也是遍历表。链接查询就是笛卡尔积,说白了就是两层for循环。
我们可以用C来写个伪代码:
有这样一种思路:同一部门,所有人的工资放在一起,排序去重,找出前三大,然后再扫描所有人,看工资是否>=部门的第三大工资。
但是这样并不好实现。可能部门的工资去重之后没有3个。而且按照部门分组也不是很容易。
我们可以这样考虑,对于每个一人是否选择,就是看在同一部门中,比他工资高的人的工资去重之后是否小于3。
# Write your MySQL query statement below
SELECT d.Name AS 'Department', e.Employee, e.Salary
FROM Department d INNER JOIN (
SELECT NAME AS 'Employee', Salary, DepartmentId
FROM Employee e1
WHERE 3 > (
SELECT COUNT(DISTINCT Salary)
FROM Employee e2
WHERE e1.DepartmentId = e2.DepartmentId
AND e2.Salary > e1.Salary
)
) e
ON d.Id = e.DepartmentId;
我们来看看leetcode的输入格式
{"headers": {"Employee": ["Id", "Name", "Salary", "DepartmentId"], "Department": ["Id", "Name"]}, "rows": {"Employee": [[1, "Joe", 85000, 1], [2, "Henry", 80000, 2], [3, "Sam", 60000, 2], [4, "Max", 90000, 1], [5, "Janet", 69000, 1], [6, "Randy", 85000, 1], [7, "Will", 70000, 1]], "Department": [[1, "IT"], [2, "Sales"]]}}
显然是json格式。
我们可以将其进行格式化,便于观察
{
"headers": {
"Employee": ["Id", "Name", "Salary", "DepartmentId"],
"Department": ["Id", "Name"]
},
"rows": {
"Employee": [[1, "Joe", 85000, 1], [2, "Henry", 80000, 2], [3, "Sam", 60000, 2], [4, "Max", 90000, 1], [5, "Janet", 69000, 1], [6, "Randy", 85000, 1], [7, "Will", 70000, 1]],
"Department": [[1, "IT"], [2, "Sales"]]
}
};
完全格式化:
{
"headers": {
"Employee": ["Id", "Name", "Salary", "DepartmentId"],
"Department": ["Id", "Name"]
},
"rows": {
"Employee": [
[1, "Joe", 85000, 1],
[2, "Henry", 80000, 2],
[3, "Sam", 60000, 2],
[4, "Max", 90000, 1],
[5, "Janet", 69000, 1],
[6, "Randy", 85000, 1],
[7, "Will", 70000, 1]
],
"Department": [
[1, "IT"],
[2, "Sales"]
]
}
}
我们可以发现,这个json字符串给我们提供了如下的信息:该数据库中所有表的所有列名,以及所有表的所有行的数据。
headers 记录了所有的表头信息
rows 记录了所有要插入的数据
我们可以写出相应的文法G
这里我们规定<>
尖括号中间的标识符为非终结符,->
表示“生成”,“推导出”,ϵ
表示空串,reg
开头的表示正则表达式//
表示注释
<>
<db_test> -> {} | {<table_defination>, <table_data>}
<table_defination> -> ϵ | "headers": {<table_column_names>}
<table_column_names> -> ϵ | <table_column_names>, <column_name_kv>
<column_name_kv> -> <table_name>: [<column_name_list>]
<table_name> -> <property_name>
<property_name> -> <identifier_name> // 变量标识符
<identifier_name> -> "<identifier>"
<identifier> -> <alpha> | _ | <identifier><alp_num>
<alpha> -> reg[a-zA-Z] // 英文字母
<alp_num> -> reg[a-zA-Z0-9_] // 英文字母或数字或下划线
<column_name_list> -> [<column_names>]
<column_names> -> <column_name> | <column_names>, <column_name>
<column_name> -> <identifier_name>
<table_data> -> ϵ | "row": {<table_row_datas>}
<table_row_datas> -> ϵ | <table_row_datas>, <row_data_kv>
<row_data_kv> -> <table_name>: [<rows_list>]
<rows_list> -> <one_row> | <rows_list>, <one_row>
<one_row> ->[<column_values>]
<column_values> -> <value_type> | <column_values>, <value_type>
<value_type> -> <number> | <str> // json的值类型可以看作只有数字和字符串
<str> -> <identifier_name>
<number> -> [1-9] | <number>[0-9] //假设都是正整数
文法和C解析再说,显然解析json可以直接用python的json模块
#!/usr/bin/python3
import json;
#db_json = '{"headers": {"Employee": ["Id", "Name", "Salary", "DepartmentId"], "Department": ["Id", "Name"]}, "rows": {"Employee": [[1, "Joe", 85000, 1], [2, "Henry", 80000, 2], [3, "Sam", 60000, 2], [4, "Max", 90000, 1], [5, "Janet", 69000, 1], [6, "Randy", 85000, 1], [7, "Will", 70000, 1]], "Department": [[1, "IT"], [2, "Sales"]]}}'
def get_insert_sql_from_json(db_json):
db_obj = json.loads(db_json)
headers = db_obj.get("headers")
rows = db_obj.get("rows")
for tb_name in rows.keys():
print ('DELETE FROM ' + tb_name + ';')
column_names = headers.get(tb_name)
column_count = len(column_names)
if column_count == 0:
continue
all_rows = rows.get(tb_name)
line1 = 'INSERT INTO ' + tb_name + '(' + column_names[0]
for i in range(1, column_count):
line1 += ', ' + column_names[i]
line1 += ') VALUES'
print (line1)
row_count = len(all_rows)
for i in range(0, row_count):
one_row = all_rows[i]
if len(one_row) != column_count:
continue
line2 = '('
line2 += '"'+one_row[0]+'"' if type(one_row[0]) == type('') else str(one_row[0])
for j in range(1, column_count):
line2 += ', '
line2 += '"'+one_row[j]+'"' if type(one_row[j]) == type('') else str(one_row[j])
line2 += ');' if i == row_count-1 else '),'
print (' ',line2)
if __name__ == "__main__":
db_json = input()
get_insert_sql_from_json(db_json)
显然这是我们期望的输出
DELETE FROM Employee;
INSERT INTO Employee(Id, Name, Salary, DepartmentId) VALUES
(1, "Joe", 85000, 1),
(2, "Henry", 80000, 2),
(3, "Sam", 60000, 2),
(4, "Max", 90000, 1),
(5, "Janet", 69000, 1),
(6, "Randy", 85000, 1),
(7, "Will", 70000, 1);
DELETE FROM Department;
INSERT INTO Department(Id, Name) VALUES
(1, "IT"),
(2, "Sales");
可以直接粘贴到navicat中运行。
显然也可以用with open 文件输入输出,或者直接用终端pipe重定向。
至此,这道题引发的思考告一段落。
待续…