问题:
编写SQL查询以计算从 2020年1月至3月至2020年10月至12月 的每三个月窗口的 average_ride_distance 和 average_ride_duration 。将 average_ride_distance 和 average_ride_duration 四舍五入至 小数点后两位 。
通过将三个月的总 ride_distance 相加并除以 3 来计算 average_ride_distance 。average_ride_duration 的计算方法与此类似。
返回按 month 升序排列的结果表,其中 month 是起始月份的编号(一月为 1,二月为 2 …)。
Table: Drivers
+-------------+---------+
| Column Name | Type |
+-------------+---------+
| driver_id | int |
| join_date | date |
+-------------+---------+
driver_id是该表的主键。
该表的每一行均包含驾驶员的ID以及他们加入Hopper公司的日期。
Table: Rides
+--------------+---------+
| Column Name | Type |
+--------------+---------+
| ride_id | int |
| user_id | int |
| requested_at | date |
+--------------+---------+
ride_id是该表的主键。
该表的每一行均包含行程ID(ride_id),用户ID(user_id)以及该行程的日期(requested_at)。
该表中可能有一些不被接受的乘车请求。
Table: AcceptedRides
+---------------+---------+
| Column Name | Type |
+---------------+---------+
| ride_id | int |
| driver_id | int |
| ride_distance | int |
| ride_duration | int |
+---------------+---------+
ride_id是该表的主键。
该表的每一行都包含已接受的行程信息。
表中的行程信息都在“Rides”表中存在。
查询结果格式如下例所示。
示例1:
输入:
Drivers table:
+-----------+------------+
| driver_id | join_date |
+-----------+------------+
| 10 | 2019-12-10 |
| 8 | 2020-1-13 |
| 5 | 2020-2-16 |
| 7 | 2020-3-8 |
| 4 | 2020-5-17 |
| 1 | 2020-10-24 |
| 6 | 2021-1-5 |
+-----------+------------+
Rides table:
+---------+---------+--------------+
| ride_id | user_id | requested_at |
+---------+---------+--------------+
| 6 | 75 | 2019-12-9 |
| 1 | 54 | 2020-2-9 |
| 10 | 63 | 2020-3-4 |
| 19 | 39 | 2020-4-6 |
| 3 | 41 | 2020-6-3 |
| 13 | 52 | 2020-6-22 |
| 7 | 69 | 2020-7-16 |
| 17 | 70 | 2020-8-25 |
| 20 | 81 | 2020-11-2 |
| 5 | 57 | 2020-11-9 |
| 2 | 42 | 2020-12-9 |
| 11 | 68 | 2021-1-11 |
| 15 | 32 | 2021-1-17 |
| 12 | 11 | 2021-1-19 |
| 14 | 18 | 2021-1-27 |
+---------+---------+--------------+
AcceptedRides table:
+---------+-----------+---------------+---------------+
| ride_id | driver_id | ride_distance | ride_duration |
+---------+-----------+---------------+---------------+
| 10 | 10 | 63 | 38 |
| 13 | 10 | 73 | 96 |
| 7 | 8 | 100 | 28 |
| 17 | 7 | 119 | 68 |
| 20 | 1 | 121 | 92 |
| 5 | 7 | 42 | 101 |
| 2 | 4 | 6 | 38 |
| 11 | 8 | 37 | 43 |
| 15 | 8 | 108 | 82 |
| 12 | 8 | 38 | 34 |
| 14 | 1 | 90 | 74 |
+---------+-----------+---------------+---------------+
输出:
+-------+-----------------------+-----------------------+
| month | average_ride_distance | average_ride_duration |
+-------+-----------------------+-----------------------+
| 1 | 21.00 | 12.67 |
| 2 | 21.00 | 12.67 |
| 3 | 21.00 | 12.67 |
| 4 | 24.33 | 32.00 |
| 5 | 57.67 | 41.33 |
| 6 | 97.33 | 64.00 |
| 7 | 73.00 | 32.00 |
| 8 | 39.67 | 22.67 |
| 9 | 54.33 | 64.33 |
| 10 | 56.33 | 77.00 |
+-------+-----------------------+-----------------------+
解释:
到1月底-->平均骑行距离=(0+0+63)/3=21,平均骑行持续时间=(0+0+38)/3=12.67
到2月底-->平均骑行距离=(0+63+0)/3=21,平均骑行持续时间=(0+38+0)/3=12.67
到3月底-->平均骑行距离=(63+0+0)/3=21,平均骑行持续时间=(38+0+0)/3=12.67
到4月底-->平均骑行距离=(0+0+73)/3=24.33,平均骑行持续时间=(0+0+96)/3=32.00
到5月底-->平均骑行距离=(0+73+100)/3=57.67,平均骑行持续时间=(0+96+28)/3=41.33
到6月底-->平均骑行距离=(73+100+119)/3=97.33,平均骑行持续时间=(96+28+68)/3=64.00
到7月底-->平均骑行距离=(100+119+0)/3=73.00,平均骑行持续时间=(28+68+0)/3=32.00
到8月底-->平均骑行距离=(119+0+0)/3=39.67,平均骑行持续时间=(68+0+0)/3=22.67
9月底-->平均骑行距离=(0+0+163)/3=54.33,平均骑行持续时间=(0+0+193)/3=64.33
到10月底-->平均骑行距离=(0+163+6)/3=56.33,平均骑行持续时间=(0+193+38)/3=77.00
-- 运行环境 mysql8
drop table if EXISTS Drivers;
Create table If Not Exists Drivers (driver_id int, join_date date);
drop table if EXISTS Rides;
Create table If Not Exists Rides (ride_id int, user_id int, requested_at date);
drop table if EXISTS AcceptedRides;
Create table If Not Exists AcceptedRides (ride_id int, driver_id int, ride_distance int, ride_duration int);
insert into Drivers values (10,'2019-12-10');
insert into Drivers values (8,'2020-1-13');
insert into Drivers values (5,'2020-2-16');
insert into Drivers values (7,'2020-3-8');
insert into Drivers values (4,'2020-5-17');
insert into Drivers values (1,'2020-10-24');
insert into Drivers values (6,'2021-1-5');
insert into Rides values (6,75,'2019-12-9');
insert into Rides values (1,54,'2020-2-9');
insert into Rides values (10,63,'2020-3-4');
insert into Rides values (19,39,'2020-4-6');
insert into Rides values (3,41,'2020-6-3');
insert into Rides values (13,52,'2020-6-22');
insert into Rides values (7,69,'2020-7-16');
insert into Rides values (17,70,'2020-8-25');
insert into Rides values (20,81,'2020-11-2');
insert into Rides values (5,57,'2020-11-9');
insert into Rides values (2,42,'2020-12-9');
insert into Rides values (11,68,'2021-1-11');
insert into Rides values (15,32,'2021-1-17');
insert into Rides values (12,11,'2021-1-19');
insert into Rides values (14,18,'2021-1-27');
insert into AcceptedRides values (10,10,63,38);
insert into AcceptedRides values (13,10,73,96);
insert into AcceptedRides values (7,8,100,28);
insert into AcceptedRides values (17,7,119,68);
insert into AcceptedRides values (20,1,121,92);
insert into AcceptedRides values (5,7,42,101);
insert into AcceptedRides values (2,4,6,38);
insert into AcceptedRides values (11,8,37,43);
insert into AcceptedRides values (15,8,108,82);
insert into AcceptedRides values (12,8,38,34);
insert into AcceptedRides values (14,1,90,74);
select * from Drivers;
select * from Rides;
select * from AcceptedRides;
with recursive t3 as (select '2020-01-01' n
union all select date_add(n,interval 1 month) -- 得到 (2020年的 1到12月 但是 会多一个 2021年1yue 不要紧后面会滤掉)
from t3 where year(n) =2020),
t1 as (
select
sum(ride_distance) ride_distance,sum(ride_duration) ride_duration,month(requested_at) `month` -- 得到应有的每月骑行距离和骑行持续时间
from
(
select
a.ride_distance,a.ride_duration,r.requested_at
from
AcceptedRides a left join Rides r
using(ride_id)
)s1
where date_format(requested_at,'%Y') = '2020'
group by date_format(requested_at,'%Y-%m')
)
select
month,average_ride_distance,average_ride_duration
from
(
select
month ,round(sum(ride_distance) over(order by month rows between current row and 2 following ) /3,2) average_ride_distance,round(sum(ride_duration) over(order by month rows between current row and 2 following ) /3,2) average_ride_duration
from
(
select ride_distance,ride_duration,month `month` -- 上下拼接 取到完整的12个月
from
t1
union all
select
0 ride_distance,0 ride_duration, month(n) `month` -- 没有的月份骑行距离和骑行持续时间置零
from
t3
where month(n) not in (select month from t1 ) and year(n) = 2020
) s1
)ss1
where month not in(11,12) -- 最后筛选掉 11 12月的内容