问题:
X city built a new stadium, each day many people visit it and the stats are saved as these columns: id, visit_date, people
Please write a query to display the records which have 3 or more consecutive rows and the amount of people more than 100(inclusive).
For example, the table stadium
:
+------+------------+-----------+ | id | visit_date | people | +------+------------+-----------+ | 1 | 2017-01-01 | 10 | | 2 | 2017-01-02 | 109 | | 3 | 2017-01-03 | 150 | | 4 | 2017-01-04 | 99 | | 5 | 2017-01-05 | 145 | | 6 | 2017-01-06 | 1455 | | 7 | 2017-01-07 | 199 | | 8 | 2017-01-08 | 188 | +------+------------+-----------+
For the sample data above, the output is:
+------+------------+-----------+ | id | visit_date | people | +------+------------+-----------+ | 5 | 2017-01-05 | 145 | | 6 | 2017-01-06 | 1455 | | 7 | 2017-01-07 | 199 | | 8 | 2017-01-08 | 188 | +------+------------+-----------+
Note:
Each day only have one row record, and the dates are increasing with id increasing.
在解决这个问题前,我们先来看leetcode上相似的另一道题:
603. Consecutive Available Seats
Several friends at a cinema ticket office would like to reserve consecutive available seats.
Can you help to query all the consecutive available seats order by the seat_id using the following cinema
table?
| seat_id | free | |---------|------| | 1 | 1 | | 2 | 0 | | 3 | 1 | | 4 | 1 | | 5 | 1 |
Your query should return the following result for the sample case above.
| seat_id | |---------| | 3 | | 4 | | 5 |
这道题是在column2(free)满足条件=1的情况下找至少连续2次的seat_id。一般使用self join,因为最终的答案要包含所有的seat_id, 若使用join条件seat_id1-seat_id2 = 1则最初的id:1和最终的id:5很难取到,所以需要用abs(seat_id1-seat_id2) = 1,得到table 如下:
seat_id | free | seat_id | free |
1 | 1 | 2 | 0 |
2 | 0 | 1 | 1 |
2 | 0 | 3 | 1 |
3 | 1 | 2 | 0 |
3 | 1 | 4 | 1 |
4 | 1 | 3 | 1 |
4 | 1 | 5 | 1 |
5 | 1 | 4 | 1 |
去除free = 0的值即可:
seat_id | free | seat_id | free |
3 | 1 | 4 | 1 |
4 | 1 | 3 | 1 |
4 | 1 | 5 | 1 |
5 | 1 | 4 | 1 |
coding:
select distinct c1.seat_id
from cinema c1, cinema c2
where abs(c1.seat_id - c2.seat_id) = 1
and c1.free = 1
and c2.free = 1
order by c1.seat_id
Leetcode上另外一道相似的题,也是找至少相邻3次及其以上的值:
180. Consecutive Numbers
Write a SQL query to find all numbers that appear at least three times consecutively.
+----+-----+ | Id | Num | +----+-----+ | 1 | 1 | | 2 | 1 | | 3 | 1 | | 4 | 2 | | 5 | 1 | | 6 | 2 | | 7 | 2 | +----+-----+
For example, given the above Logs
table, 1
is the only number that appears consecutively for at least three times.
+-----------------+ | ConsecutiveNums | +-----------------+ | 1 | +-----------------+
3次及其以上,则self join 2次:
Id | Num | Id | Num | Id | Num |
1 | 1 | 2 | 1 | 3 | 1 |
coding:
select distinct l1.Num as ConsecutiveNums
from Logs l1, Logs l2, Logs l3
where l1.Id = l2.Id - 1
and l2.Id = l3.Id - 1
and l1.Num = l2.Num
and l2.Num = l3.Num
回到最初的问题:
所知的table stadium如下:
+------+------------+-----------+ | id | visit_date | people | +------+------------+-----------+ | 1 | 2017-01-01 | 10 | | 2 | 2017-01-02 | 109 | | 3 | 2017-01-03 | 150 | | 4 | 2017-01-04 | 99 | | 5 | 2017-01-05 | 145 | | 6 | 2017-01-06 | 1455 | | 7 | 2017-01-07 | 199 | | 8 | 2017-01-08 | 188 | +------+------------+-----------+
需要得到的结果如下:
+------+------------+-----------+ | id | visit_date | people | +------+------------+-----------+ | 5 | 2017-01-05 | 145 | | 6 | 2017-01-06 | 1455 | | 7 | 2017-01-07 | 199 | | 8 | 2017-01-08 | 188 | +------+------------+-----------+
这道题不仅需要找三次及其以上的id,同时要输出所有满足条件的id值。也就是说self join需要2次。
首先filter out people小于100的,得到table如下:
id | date | people |
2 | 2-Jan | 109 |
3 | 3-Jan | 150 |
5 | 5-Jan | 145 |
6 | 6-Jan | 1455 |
7 | 7-Jan | 199 |
8 | 8-Jan | 188 |
我们先来看一个简单的table,如下table正好符合我们需要的至少3次以上的连续性:
id | date | people |
2 | 2-Jan | 109 |
3 | 3-Jan | 150 |
4 | 4-Jan | 100 |
要想得到这个结果我们self join 2次希望得到的table如下:
id | date | people | id | date | people | id | date | people |
2 | 2-Jan | 109 | 3 | 3-Jan | 150 | 4 | 4-Jan | 100 |
3 | 3-Jan | 150 | 2 | 2-Jan | 109 | 4 | 3-Jan | 150 |
4 | 4-Jan | 100 | 3 | 3-Jan | 150 | 2 | 2-Jan | 109 |
即self join的条件为:
(s1.id + 1 = s2.id and s2.id + 1 = s3.id) OR (s1.id - 1 = s2.id and s1.id + 1 = s3.id) OR (s1.id -1 =s2.id and s2.id-1 = s3.id)
这个条件分别把连续3次的id值作为self join的第一个table:
最小值第一位时:s1.id + 1 = s2.id and s2.id + 1 = s3.id
中间值第一位时:s1.id - 1 = s2.id and s1.id + 1 = s3.id
最大值第一位时:s1.id -1 =s2.id and s2.id-1 = s3.id
把这个条件用到我们filter出来的table中:
id | date | people |
2 | 2-Jan | 109 |
3 | 3-Jan | 150 |
5 | 5-Jan | 145 |
6 | 6-Jan | 1455 |
7 | 7-Jan | 199 |
8 | 8-Jan | 188 |
得到如下结果:
id | date | people | id | date | people | id | date | people |
5 | 5-Jan | 145 | 6 | 6-Jan | 1455 | 7 | 7-Jan | 199 |
6 | 6-Jan | 1455 | 5 | 5-Jan | 145 | 7 | 7-Jan | 199 |
7 | 7-Jan | 199 | 6 | 6-Jan | 1455 | 5 | 5-Jan | 145 |
6 | 6-Jan | 1455 | 7 | 7-Jan | 199 | 8 | 8-Jan | 188 |
7 | 7-Jan | 199 | 6 | 6-Jan | 1455 | 8 | 8-Jan | 188 |
8 | 8-Jan | 188 | 7 | 7-Jan | 199 | 6 | 6-Jan | 1455 |
最终的coding如下:
select distinct s1.id, s1.visit_date, s1.people
from
stadium s1, stadium s2, stadium s3
where s1.people >= 100
and s2.people>= 100
and s3.people >= 100
and
((s1.id + 1=s2.id and s2.id+1 = s3.id)
or (s1.id - 1 = s2.id and s1.id + 1 = s3.id)
or (s1.id - 1 = s2.id and s2.id -1 = s3.id)
)
order by s1.id