目录
题目:
表: Matches
+-------------+------+
| Column Name | Type |
+-------------+------+
| player_id | int |
| match_day | date |
| result | enum |
+-------------+------+
(player_id, match_day) 是该表的主键(具有唯一值的列的组合)。
每一行包括了:参赛选手 id、 比赛时间、 比赛结果。
比赛结果(result)的枚举类型为 ('Win', 'Draw', 'Lose')。
要求:
选手的 连胜数 是指连续获胜的次数,且没有被平局或输球中断。
编写解决方案来计算每个参赛选手最多的连胜数。
结果可以以 任何顺序 返回。
准备工作:
1.创建数据库
Create table If Not Exists Matches (player_id int, match_day date, result ENUM('Win', 'Draw', 'Lose'));
Truncate table Matches;
insert into Matches (player_id, match_day, result) values ('1', '2022-01-17', 'Win');
insert into Matches (player_id, match_day, result) values ('1', '2022-01-18', 'Win');
insert into Matches (player_id, match_day, result) values ('1', '2022-01-25', 'Win');
insert into Matches (player_id, match_day, result) values ('1', '2022-01-31', 'Draw');
insert into Matches (player_id, match_day, result) values ('1', '2022-02-08', 'Win');
insert into Matches (player_id, match_day, result) values ('2', '2022-02-06', 'Lose');
insert into Matches (player_id, match_day, result) values ('2', '2022-02-08', 'Lose');
insert into Matches (player_id, match_day, result) values ('3', '2022-03-30', 'Win');
2.查看表数据:
3.分析题目:
在分析图中加了一些测试数据。会和后面的结果不同。
代码实现:
根据分析分步进行代码实现。
第一步和第二步:
第一步和第二步用到的都是相同的语法,所以可以将这两步的代码合并编写。
第一步通过对id的分区然后对日期进行排序,可以判断是否是连续的日期,进而判断是否连续赢还是输。
第二步:通过对id和result进行分组然后排名,确定那些是连续赢还是其他状态
select *,
row_number() over (partition by player_id order by match_day) rn1,
row_number() over (partition by player_id,result)rn2
from matches;
第三步:
通过对日期的排名和result的分组排名做差,通过做差然后对做差的值和id进行分组可以把连续的分在一组里面。
with t1 as ( select *,
row_number() over (partition by player_id order by match_day) rn1,
row_number() over (partition by player_id,result)rn2
from matches )
select *,(rn1-rn2)diff FROM t1;
第四步:
通过if函数if(result = 'win', 1, 0)当赢了比赛则返回1(赢了计数1),否则返回0(输了返回0)
with t1 as ( select *,
row_number() over (partition by player_id order by match_day) rn1,
row_number() over (partition by player_id,result)rn2
from matches )
select *,
(rn1-rn2)diff,
if(result='Win',1,0)result2
FROM t1;
第五步:
通过对id和做差(diff字段)获得的值进行分组,然后对判断返回出来的值(result2)进行求和
with t1 as ( select *,
row_number() over (partition by player_id order by match_day) rn1,
row_number() over (partition by player_id,result)rn2
from matches ),
t2 as ( select *,
(rn1-rn2)diff,
if(result='Win',1,0)result2
FROM t1 )
select player_id,
sum(result2)cnt
FROM t2 group by player_id,diff
;
第六步:
通过求和出来的值然后我们求最大的连胜所以利用max()函数,得出最终结果。
with t1 as (select *,
row_number() over (partition by player_id order by match_day) rn1,
row_number() over (partition by player_id,result) rn2
from matches),
t2 as (select *,
(rn1 - rn2) diff,
if(result = 'Win', 1, 0) result2
FROM t1),
t3 as (select player_id,
sum(result2) cnt
FROM t2
group by player_id, diff)
select player_id,
max(cnt)longest_streak
FROM t3
group by player_id
;
因为在分析中在id=3和id=4加了几组测试数据,分析的结果没有错误,创建数据库的和接下来的代码执行和最终结果都没有问题。
进阶:
通过上面的讲解我们知道怎么处理连续问题,那么我们通过这道题然后进阶一下
需求:如果我们想要计算最长的连续不输的次数(即获胜或者平局),你该如何调整?
首先我们仍然需要分析:
第一步:
仍然要对每个人的日期进行排序,这样做的意义在于日期连续,方便后面判断连续不输
row_number() over (partition by player_id order by match_day) as rn1
第二步:
需要对结果进行分组排名,那么我们现在结果变了那我们就需要先对结果进行处理通过
if()函数,让result="win"和"Draw"返回"Not_Lose"否则返回"Lose",我们可以先返回这个结果,然后分组,或者直接对if()进行分组。这么我们对if()进行分组
row_number() over (partition by player_id, if(result!='Lose', 'Not_Lose', 'Lose') order by match_day) as rn2
第三步:
我们会发现连续的不输的排名会和通过日期排名的一样,这时候我们让这两个的值进行做差,然后连续不输的会做差出相同的值。后面方便分组。
然后我们通过sum(if(result!='Lose', 1, 0))可以得到不输的积分
with t1 as (
select
player_id, match_day, result,
row_number() over (partition by player_id order by match_day) as rn1,
row_number() over (partition by player_id, if(result!='Lose', 'Not_Lose', 'Lose') order by match_day) as rn2
from matches
)
, t2 as (
select
player_id,
(rn1 - rn2) as diff,
sum(if(result!='Lose', 1, 0)) cnt
from t1
group by player_id, (rn1 - rn2)
)
select *from t2
到这里没完,我们可以想象一下,如果id=1的人前面连续不输5把,然后输一把,然后接着连续不输6把,那么在上图的第一二行中间应该加一个id=1 cnt=6。所以接下面我们需要对cnt求最大值。
第四步:
求每个人的最大连续不输的次数
with t1 as (
select
player_id, match_day, result,
row_number() over (partition by player_id order by match_day) as rn1,
row_number() over (partition by player_id, if(result!='Lose', 'Not_Lose', 'Lose') order by match_day) as rn2
from matches
)
, t2 as (
select
player_id,
(rn1 - rn2) as diff,
sum(if(result!='Lose', 1, 0)) cnt
from t1
group by player_id, (rn1 - rn2)
)
select
player_id,
max(cnt) as 最大不输的次数
from t2
group by player_id
;
总结:
对于连续问题,最方便的就是进行开窗然后排名做差解题。如果不会开窗函数,可以看我开窗函数的详细分析。最后,这道题非常的有水平,希望大家可以好好的琢磨。