目录
1251. 平均售价
Table: Prices
+---------------+---------+
| Column Name | Type |
+---------------+---------+
| product_id | int |
| start_date | date |
| end_date | date |
| price | int |
+---------------+---------+
(product_id,start_date,end_date) 是 Prices 表的主键。
Prices 表的每一行表示的是某个产品在一段时期内的价格。
每个产品的对应时间段是不会重叠的,这也意味着同一个产品的价格时段不会出现交叉。
Table: UnitsSold
+---------------+---------+
| Column Name | Type |
+---------------+---------+
| product_id | int |
| purchase_date | date |
| units | int |
+---------------+---------+
UnitsSold 表没有主键,它可能包含重复项。
UnitsSold 表的每一行表示的是每种产品的出售日期,单位和产品 id。
编写SQL查询以查找每种产品的平均售价。
average_price 应该四舍五入到小数点后两位。
查询结果格式如下例所示:
Prices table:
+------------+------------+------------+--------+
| product_id | start_date | end_date | price |
+------------+------------+------------+--------+
| 1 | 2019-02-17 | 2019-02-28 | 5 |
| 1 | 2019-03-01 | 2019-03-22 | 20 |
| 2 | 2019-02-01 | 2019-02-20 | 15 |
| 2 | 2019-02-21 | 2019-03-31 | 30 |
+------------+------------+------------+--------+
UnitsSold table:
+------------+---------------+-------+
| product_id | purchase_date | units |
+------------+---------------+-------+
| 1 | 2019-02-25 | 100 |
| 1 | 2019-03-01 | 15 |
| 2 | 2019-02-10 | 200 |
| 2 | 2019-03-22 | 30 |
+------------+---------------+-------+
Result table:
+------------+---------------+
| product_id | average_price |
+------------+---------------+
| 1 | 6.96 |
| 2 | 16.96 |
+------------+---------------+
平均售价 = 产品总价 / 销售的产品数量。
产品 1 的平均售价 = ((100 * 5)+(15 * 20) )/ 115 = 6.96
产品 2 的平均售价 = ((200 * 15)+(30 * 30) )/ 230 = 16.96
SELECT a.product_id, round(SUM(a.units * b.price) / SUM(a.units), 2) AS average_price
FROM UnitsSold a
JOIN Prices b
ON (a.product_id = b.product_id
AND a.purchase_date >= b.start_date
AND a.purchase_date <= b.end_date)
GROUP BY product_id;
生成数据代码如下:
CREATE TABLE `prices` (
`product_id` INT ( 11 ) NOT NULL,
`start_date` date NOT NULL,
`end_date` date NOT NULL,
`price` INT ( 10 ) DEFAULT NULL,
PRIMARY KEY ( `product_id`, `start_date`, `end_date` ) USING BTREE
) ENGINE = INNODB DEFAULT CHARSET = utf8;
CREATE TABLE `unitssold` (
`product_id` int(11) DEFAULT NULL,
`purchase_date` date DEFAULT NULL,
`units` int(255) DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
INSERT INTO prices (product_id,start_date,end_date,price) VALUES (1,'2019-02-17','2019-02-28',5);
INSERT INTO prices (product_id,start_date,end_date,price) VALUES (1,'2019-03-01','2019-03-22',20);
INSERT INTO prices (product_id,start_date,end_date,price) VALUES (2,'2019-02-01','2019-02-20',15);
INSERT INTO prices (product_id,start_date,end_date,price) VALUES (2,'2019-02-21','2019-03-31',30);
INSERT INTO unitssold (product_id,purchase_date,units) VALUES (1,'2019-02-25',100);
INSERT INTO unitssold (product_id,purchase_date,units) VALUES (1,'2019-03-01',15);
INSERT INTO unitssold (product_id,purchase_date,units) VALUES (2,'2019-02-10',200);
INSERT INTO unitssold (product_id,purchase_date,units) VALUES (2,'2019-03-22',30);
571. 给定数字的频率查询中位数
Numbers 表保存数字的值及其频率。
+----------+-------------+
| Number | Frequency |
+----------+-------------|
| 0 | 7 |
| 1 | 1 |
| 2 | 3 |
| 3 | 1 |
+----------+-------------+
在此表中,数字为 0, 0, 0, 0, 0, 0, 0, 1, 2, 2, 2, 3,所以中位数是 (0 + 0) / 2 = 0。
+--------+
| median |
+--------|
| 0.0000 |
+--------+
请编写一个查询来查找所有数字的中位数并将结果命名为 median 。
解题思路:中位数顺序 = (总个数+1)/2(参考地址)
SELECT c.*,AVG(number) median
FROM (SELECT a.Number, a.Frequency, SUM(b.frequency) cumttl
FROM Numbers a,Numbers b
WHERE a.Number >= b.Number
GROUP BY a.Number
ORDER BY a.Number
) c
WHERE cumttl = FLOOR((SELECT (SUM(frequency)+1)/2 num FROM numbers))
OR cumttl = CEILING((SELECT (SUM(frequency)+1)/2 num FROM numbers));
生成数据代码如下:
CREATE TABLE `numbers` (
`Number` int(11) NOT NULL,
`Frequency` int(255) DEFAULT NULL,
PRIMARY KEY (`Number`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
INSERT INTO numbers (Number,Frequency)VALUES(0,7);
INSERT INTO numbers (Number,Frequency)VALUES(1,1);
INSERT INTO numbers (Number,Frequency)VALUES(2,3);
INSERT INTO numbers (Number,Frequency)VALUES(3,1);
1571. 仓库经理
表:Warehouse
+--------------+---------+
| Column Name | Type |
+--------------+---------+
| name | varchar |
| product_id | int |
| units | int |
+--------------+---------+
(name, product_id) 是该表主键.
该表的行包含了每个仓库的所有商品信息.
表: Products
+---------------+---------+
| Column Name | Type |
+---------------+---------+
| product_id | int |
| product_name | varchar |
| Width | int |
| Length | int |
| Height | int |
+---------------+---------+
product_id 是该表主键.
该表的行包含了每件商品以英尺为单位的尺寸(宽度, 长度和高度)信息.
写一个 SQL 查询来报告, 每个仓库的存货量是多少立方英尺.
仓库名
存货量
返回结果没有顺序要求.
查询结果如下例所示.
Warehouse 表:
+------------+--------------+-------------+
| name | product_id | units |
+------------+--------------+-------------+
| LCHouse1 | 1 | 1 |
| LCHouse1 | 2 | 10 |
| LCHouse1 | 3 | 5 |
| LCHouse2 | 1 | 2 |
| LCHouse2 | 2 | 2 |
| LCHouse3 | 4 | 1 |
+------------+--------------+-------------+
Products 表:
+------------+--------------+------------+----------+-----------+
| product_id | product_name | Width | Length | Height |
+------------+--------------+------------+----------+-----------+
| 1 | LC-TV | 5 | 50 | 40 |
| 2 | LC-KeyChain | 5 | 5 | 5 |
| 3 | LC-Phone | 2 | 10 | 10 |
| 4 | LC-T-Shirt | 4 | 10 | 20 |
+------------+--------------+------------+----------+-----------+
Result 表:
+----------------+------------+
| WAREHOUSE_NAME | VOLUME |
+----------------+------------+
| LCHouse1 | 12250 |
| LCHouse2 | 20250 |
| LCHouse3 | 800 |
+----------------+------------+
Id为1的商品(LC-TV)的存货量为 5x50x40 = 10000
Id为2的商品(LC-KeyChain)的存货量为 5x5x5 = 125
Id为3的商品(LC-Phone)的存货量为 2x10x10 = 200
Id为4的商品(LC-T-Shirt)的存货量为 4x10x20 = 800
仓库LCHouse1: 1个单位的LC-TV + 10个单位的LC-KeyChain + 5个单位的LC-Phone.
总存货量为: 1*10000 + 10*125 + 5*200 = 12250 立方英尺
仓库LCHouse2: 2个单位的LC-TV + 2个单位的LC-KeyChain.
总存货量为: 2*10000 + 2*125 = 20250 立方英尺
仓库LCHouse3: 1个单位的LC-T-Shirt.
总存货量为: 1*800 = 800 立方英尺.
(参考地址)
select name warehouse_name,sum(Width*Length*Height*units) volume
from warehouse w
left join Products p
on w.product_id= p.product_id
group by w.name;
生成数据代码如下:
CREATE TABLE `warehouse` (
`name` varchar(255) NOT NULL,
`product_id` int(11) NOT NULL,
`units` int(255) DEFAULT NULL,
PRIMARY KEY (`name`,`product_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
CREATE TABLE `products` (
`product_id` int(11) NOT NULL,
`product_name` varchar(255) DEFAULT NULL,
`Width` int(11) DEFAULT NULL,
`Length` int(11) DEFAULT NULL,
`Height` int(11) DEFAULT NULL,
PRIMARY KEY (`product_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
INSERT INTO Warehouse(name,product_id,units) VALUES('LCHouse1',1,1);
INSERT INTO Warehouse(name,product_id,units) VALUES('LCHouse1',2,10);
INSERT INTO Warehouse(name,product_id,units) VALUES('LCHouse1',3,5);
INSERT INTO Warehouse(name,product_id,units) VALUES('LCHouse2',1,2);
INSERT INTO Warehouse(name,product_id,units) VALUES('LCHouse2',2,2);
INSERT INTO Warehouse(name,product_id,units) VALUES('LCHouse3',4,1);
INSERT INTO Products(product_id,product_name,Width,Length,Height) VALUES(1,'LC-TV',5,50,40);
INSERT INTO Products(product_id,product_name,Width,Length,Height) VALUES(2,'LC-KeyChain',5,5,5);
INSERT INTO Products(product_id,product_name,Width,Length,Height) VALUES(3,'LC-Phone',2,10,10);
INSERT INTO Products(product_id,product_name,Width,Length,Height) VALUES(4,'LC-T-Shirt',4,10,20);
597.好友申请 I :总体通过率
在 Facebook 或者 Twitter 这样的社交应用中,人们经常会发好友申请也会收到其他人的好友申请。现在给如下两个表:
表: friend_request
| sender_id | send_to_id |request_date|
|-----------|------------|------------|
| 1 | 2 | 2016_06-01 |
| 1 | 3 | 2016_06-01 |
| 1 | 4 | 2016_06-01 |
| 2 | 3 | 2016_06-02 |
| 3 | 4 | 2016-06-09 |
表: request_accepted
| requester_id | accepter_id |accept_date |
|--------------|-------------|------------|
| 1 | 2 | 2016_06-03 |
| 1 | 3 | 2016-06-08 |
| 2 | 3 | 2016-06-08 |
| 3 | 4 | 2016-06-09 |
| 3 | 4 | 2016-06-10 |
写一个查询语句,求出好友申请的通过率,用 2 位小数表示。通过率由接受好友申请的数目除以申请总数。
对于上面的样例数据,你的查询语句应该返回如下结果。
|accept_rate|
|-----------|
| 0.80|
注意:
通过的好友申请不一定都在表 friend_request 中。你只需要统计总的被通过的申请数(不管它们在不在表 friend_request 中),并将它除以申请总数,得到通过率
一个好友申请发送者有可能会给接受者发几条好友申请,也有可能一个好友申请会被通过好几次。这种情况下,重复的好友申请只统计一次。
如果一个好友申请都没有,通过率为 0.00 。
解释: 总共有 5 个申请,其中 4 个是不重复且被通过的好友申请,所以成功率是 0.80 。
进阶:
你能写一个查询语句得到每个月的通过率吗?
你能求出每一天的累计通过率吗?
解题思路:
- 如果一个好友申请都没有,则会返回 NULL ,此时需要通过IFNULL()函数将其转为0;
- 发送和接受的好友申请都有可能存在重复条目,需要通过DISTINCT或group by去重;
- 通过ROUND()函数保留2位小数。
SELECT ROUND(IFNULL(
(SELECT COUNT(*) accept_count FROM
(SELECT * FROM request_accepted GROUP BY requester_id,accepter_id) a) /
(SELECT COUNT(*) FROM
(SELECT * FROM friend_request GROUP BY sender_id,send_to_id) r)
, 0)
, 2) AS accept_rate;
SELECT ROUND(IFNULL(
(SELECT COUNT(DISTINCT requester_id, accepter_id) FROM request_accepted) /
(SELECT COUNT(DISTINCT sender_id, send_to_id) FROM friend_request)
, 0)
, 2) AS accept_rate;
生成数据代码如下:
CREATE TABLE `friend_request` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`sender_id` int(11) DEFAULT NULL,
`send_to_id` int(11) DEFAULT NULL,
`request_date` date DEFAULT NULL,
PRIMARY KEY (`id`) USING BTREE
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
CREATE TABLE `request_accepted` (
`requester_id` int(11) NOT NULL,
`accepter_id` int(11) DEFAULT NULL,
`accept_date` date DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
INSERT INTO friend_request(sender_id, send_to_id,request_date) VALUES(1,2,'2016_06-01');
INSERT INTO friend_request(sender_id, send_to_id,request_date) VALUES(1,3,'2016_06-01');
INSERT INTO friend_request(sender_id, send_to_id,request_date) VALUES(1,4,'2016_06-01');
INSERT INTO friend_request(sender_id, send_to_id,request_date) VALUES(2,3,'2016_06-02');
INSERT INTO friend_request(sender_id, send_to_id,request_date) VALUES(3,4,'2016_06-09');
INSERT INTO request_accepted(requester_id, accepter_id,accept_date) VALUES(1,2,'2016_06-03');
INSERT INTO request_accepted(requester_id, accepter_id,accept_date) VALUES(1,3,'2016_06-08');
INSERT INTO request_accepted(requester_id, accepter_id,accept_date) VALUES(2,3,'2016_06-08');
INSERT INTO request_accepted(requester_id, accepter_id,accept_date) VALUES(3,4,'2016_06-09');
INSERT INTO request_accepted(requester_id, accepter_id,accept_date) VALUES(3,4,'2016_06-10');
1204. 最后一个能进入电梯的人
表: Queue
+-------------+---------+
| Column Name | Type |
+-------------+---------+
| person_id | int |
| person_name | varchar |
| weight | int |
| turn | int |
+-------------+---------+
person_id 是这个表的主键。
该表展示了所有等待电梯的人的信息。
表中 person_id 和 turn 列将包含从 1 到 n 的所有数字,其中 n 是表中的行数。
电梯最大载重量为 1000。
写一条 SQL 查询语句查找最后一个能进入电梯且不超过重量限制的 person_name 。题目确保队列中第一位的人可以进入电梯 。
查询结果如下所示 :
Queue 表
+-----------+-------------------+--------+------+
| person_id | person_name | weight | turn |
+-----------+-------------------+--------+------+
| 5 | George Washington | 250 | 1 |
| 3 | John Adams | 350 | 2 |
| 6 | Thomas Jefferson | 400 | 3 |
| 2 | Will Johnliams | 200 | 4 |
| 4 | Thomas Jefferson | 175 | 5 |
| 1 | James Elephant | 500 | 6 |
+-----------+-------------------+--------+------+
Result 表
+-------------------+
| person_name |
+-------------------+
| Thomas Jefferson |
+-------------------+
为了简化,Queue 表按 turn 列由小到大排序。
上例中 George Washington(id 5), John Adams(id 3) 和 Thomas Jefferson(id 6) 将可以进入电梯,因为他们的体重和为 250 + 350 + 400 = 1000。
Thomas Jefferson(id 6) 是最后一个体重合适并进入电梯的人。
生成数据代码如下:
CREATE TABLE `Queue` (
`person_id` int NOT NULL,
`person_name` varchar(255) DEFAULT NULL,
`weight` int DEFAULT NULL,
`turn` int DEFAULT NULL,
PRIMARY KEY (`person_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
INSERT INTO Queue(person_id,person_name,weight,turn) VALUES (5,'George Washington',250,1);
INSERT INTO Queue(person_id,person_name,weight,turn) VALUES (3,'John Adams',350,2);
INSERT INTO Queue(person_id,person_name,weight,turn) VALUES (6,'Thomas Jefferson',400,3);
INSERT INTO Queue(person_id,person_name,weight,turn) VALUES (2,'Will Johnliams ',200,4);
INSERT INTO Queue(person_id,person_name,weight,turn) VALUES (4,'Thomas Jefferson',175,5);
INSERT INTO Queue(person_id,person_name,weight,turn) VALUES (1,'James Elephant',500,6);
解法一:自连接
SELECT a.person_name
FROM Queue a, Queue b
WHERE a.turn >= b.turn
GROUP BY a.person_id
HAVING SUM(b.weight) <= 1000
ORDER BY a.turn DESC
LIMIT 1;
解法二:窗口函数
SELECT person_name FROM
(SELECT q.*,sum(weight) over (ORDER BY turn) as sum FROM Queue q) t
WHERE t.sum<=1000 ORDER BY t.sum desc LIMIT 1;
解法三:自定义变量
SELECT person_name FROM
(
SELECT q.*,(@sum := @sum + weight) as sum FROM Queue q,(SELECT @sum :=0) tmp ORDER BY turn
) t
WHERE t.sum<=1000 ORDER BY t.sum desc LIMIT 1;
解法四:子查询
SELECT person_name FROM
(
WITH Queue_order AS (SELECT * FROM Queue ORDER BY turn)
SELECT b.* ,
(SELECT sum(a.weight) FROM Queue_order a WHERE a.turn <= b.turn ) as sum
FROM Queue_order b
) t
WHERE t.sum<=1000 ORDER BY t.sum desc LIMIT 1;
1435. 制作会话柱状图 (简单)
+---------------------+---------+
| Column Name | Type |
+---------------------+---------+
| session_id | int |
| duration | int |
+---------------------+---------+
session_id 是该表主键
duration 是用户访问应用的时间, 以秒为单位
你想知道用户在你的 app 上的访问时长情况。因此决定统计访问时长区间分别为 "[0-5>", "[5-10>", "[10-15>" 和 "15 or more" (单位:分钟)的会话数量,并以此绘制柱状图。
写一个SQL查询来报告(访问时长区间,会话总数)。结果可用任何顺序呈现。
下方为查询的输出格式:
Sessions 表:
+-------------+---------------+
| session_id | duration |
+-------------+---------------+
| 1 | 30 |
| 2 | 199 |
| 3 | 299 |
| 4 | 580 |
| 5 | 1000 |
+-------------+---------------+
Result 表:
+--------------+--------------+
| bin | total |
+--------------+--------------+
| [0-5> | 3 |
| [5-10> | 1 |
| [10-15> | 0 |
| 15 or more | 1 |
+--------------+--------------+
对于 session_id 1,2 和 3 ,它们的访问时间大于等于 0 分钟且小于 5 分钟。
对于 session_id 4,它的访问时间大于等于 5 分钟且小于 10 分钟。
没有会话的访问时间大于等于 10 分钟且小于 15 分钟。
对于 session_id 5, 它的访问时间大于等于 15 分钟。
解法一:首先得到每个响应时间的对应的区间
SELECT session_id,
CASE WHEN duration <= 300 AND duration >= 0 THEN '[0-5>'
WHEN duration <= 600 AND duration > 300 THEN '[5-10>'
WHEN duration <=900 AND duration >600 THEN '[10-15>'
WHEN duration >900 THEN '15 or more'
END as bin,duration
FROM Sessions
需要构造所有的区间段然后进行左连接
SELECT t1.bin, ifnull(t2.total, 0) as total
FROM
(
select '[0-5>' as bin union
select '[5-10>' as bin union
select '[10-15>' as bin union
select '15 or more' as bin
) as t1
left join (
SELECT
t.bin,count(*) total
FROM
(
SELECT session_id,
CASE WHEN duration <= 300 AND duration >= 0 THEN '[0-5>'
WHEN duration <= 600 AND duration > 300 THEN '[5-10>'
WHEN duration <=900 AND duration >600 THEN '[10-15>'
WHEN duration >900 THEN '15 or more'
END as bin,duration
FROM Sessions
) t
GROUP BY t.bin
) as t2
on t1.bin = t2.bin;
解法二:
select '[0-5>' as bin, count(*) total from Sessions where duration/60 >= 0 and duration/60 < 5
union
select '[5-10>' as bin, count(*) total from Sessions where duration/60 >= 5 and duration/60 < 10
union
select '[10-15>' as bin, count(*) total from Sessions where duration/60 >= 10 and duration/60 < 15
union
select '15 or more' as bin, count(*) total from Sessions where duration/60 >= 15;