Table Accounts
:
+---------------+---------+
| Column Name | Type |
+---------------+---------+
| id | int |
| name | varchar |
+---------------+---------+
the id is the primary key for this table.
This table contains the account id and the user name of each account.
Table Logins
:
+---------------+---------+
| Column Name | Type |
+---------------+---------+
| id | int |
| login_date | date |
+---------------+---------+
There is no primary key for this table, it may contain duplicates.
This table contains the account id of the user who logged in and the login date. A user may log in multiple times in the day.
Write an SQL query to find the id and the name of active users.
Active users are those who logged in to their accounts for 5 or more consecutive days.
Return the result table ordered by the id.
The query result format is in the following example:
Accounts table:
+----+----------+
| id | name |
+----+----------+
| 1 | Winston |
| 7 | Jonathan |
+----+----------+
Logins table:
+----+------------+
| id | login_date |
+----+------------+
| 7 | 2020-05-30 |
| 1 | 2020-05-30 |
| 7 | 2020-05-31 |
| 7 | 2020-06-01 |
| 7 | 2020-06-02 |
| 7 | 2020-06-02 |
| 7 | 2020-06-03 |
| 1 | 2020-06-07 |
| 7 | 2020-06-10 |
+----+------------+
Result table:
+----+----------+
| id | name |
+----+----------+
| 7 | Jonathan |
+----+----------+
User Winston with id = 1 logged in 2 times only in 2 different days, so, Winston is not an active user.
User Jonathan with id = 7 logged in 7 times in 6 different days, five of them were consecutive days, so, Jonathan is an active user.
Follow up question:
Can you write a general solution if the active users are those who logged in to their accounts for n
or more consecutive days?
解题思路:
对logins 表根据id和日期进行排序,并根据日期顺序添加序号rank,同一天多次登录的rank必须相同,可以使用rank_dense()。如果某个id连续5天登录,那么在同一id的情况下,一定会存在相差4天的rank,同时对应的登录日期也相差4天。
WITH tmp AS (SELECT DISTINCT id, login_date,
DENSE_RANK() OVER (PARTITION BY id ORDER BY login_date) AS rnk
FROM logins)
SELECT DISTINCT t1.id
FROM tmp t1 JOIN tmp t2
ON t2.rnk = t1.rnk + 4 AND t1.id = t2.id
WHERE DATEDIFF(t2.login_date, t1.login_date) = 4)
/*如果同一id对应的rank相差4天,同时登陆的日期也相差4天*/
找出了id,则可以知道name