解大厂SQL常见面试题:怎样用SQL找出连续登录天数大于7天的用户?

想进入大厂的小伙伴是不是经常被问到类似这样的题目:怎样用SQL找出连续登录天数大于7天的用户?

看得题目后一头雾水,不知道从哪里下手,最后垂头丧气,丢掉了offer。其实这个问题不难,关键点我已经列出来了,下面我使用GBase 8a来给大家讲解一下。

SQL都是标准SQL,其他数据库也是一样的。原理和步骤我都分解好了,反复看3遍,你一定可以通过。

目标:构造一个辅助列,假如某个用户连续登录,这个列就是同一个值

做的上面的目标,就可以方便的得到正确答案。大厂的offer就在向你招手!来吧,走起。

假设有3个用户,id分别是1001,1002,1003;分别登录了13,15和3天。

第一步 建表,插入数据

CREATE TABLE login_log ( id integer, login_date date );
insert into login_log values(1001,'2021-01-01');
insert into login_log values(1001,'2021-01-02');
insert into login_log values(1001,'2021-01-04');
insert into login_log values(1001,'2021-01-05');
insert into login_log values(1001,'2021-01-06');
insert into login_log values(1001,'2021-01-07');
insert into login_log values(1001,'2021-01-08');
insert into login_log values(1001,'2021-01-09');
insert into login_log values(1001,'2021-01-10');
insert into login_log values(1001,'2021-01-11');
insert into login_log values(1001,'2021-01-13');
insert into login_log values(1001,'2021-01-15');
insert into login_log values(1001,'2021-01-16');
insert into login_log values(1002,'2021-01-01');
insert into login_log values(1002,'2021-01-02');
insert into login_log values(1002,'2021-01-03');
insert into login_log values(1002,'2021-01-04');
insert into login_log values(1002,'2021-01-05');
insert into login_log values(1002,'2021-01-06');
insert into login_log values(1002,'2021-01-07');
insert into login_log values(1002,'2021-01-08');
insert into login_log values(1002,'2021-01-09');
insert into login_log values(1002,'2021-01-10');
insert into login_log values(1002,'2021-01-11');
insert into login_log values(1002,'2021-01-12');
insert into login_log values(1002,'2021-01-13');
insert into login_log values(1002,'2021-01-16');
insert into login_log values(1002,'2021-01-17');
insert into login_log values(1003,'2021-01-4');
insert into login_log values(1003,'2021-01-5');
insert into login_log values(1003,'2021-01-6'); 

第二步 编写SQL如下:
 

select id ,t,count(t) from ( select id,ADDDATE(login_date,INTERVAL -rn DAY) t FROM ( select id,login_date, row_number() over (partition by id order by login_date) rn from login_log ) q1 ) q group by id,t having count(t) > 7 ;

执行后就得到了正确答案,如下所示,是不是很简单?

gbase> select id ,t,count(t) from (
    -> SELECT id,ADDDATE(login_date,INTERVAL -rn DAY) t FROM (
    ->   select id,login_date,row_number() 
    ->   over(partition by id order by login_date) rn from login_log 
    ->   ) q1 
    -> ) q 
    -> group by id,t  having count(t) > 7
    -> ;
+------+------------+----------+
| id   | t          | count(t) |
+------+------------+----------+
| 1001 | 2021-01-01 |        8 |
| 1002 | 2020-12-31 |       13 |
+------+------------+----------+
2 rows in set (Elapsed: 00:00:00.14)

第三步 如果不理解,我来详细分解一下:

1 使用row_number函数添加辅助列,此时同一id的如果连续登录, login_date 和 row_number 就会同步增长,他们的差值 t列 一定相同
请注意下面的rn列是对于相同的id是连续增长的。请仔细观察: id是1001这些数据对应的t列的值,连续登录的记录t值是相同的。

gbase> select                                                                                                               
    ->     id,login_date,
    ->     row_number() over(partition by id order by login_date) rn,
    ->     adddate(login_date,INTERVAL - row_number() over(partition by id order by login_date) DAY) t
    -> from login_log ;
+------+------------+----+------------+
| id   | login_date | rn | t          |
+------+------------+----+------------+
| 1001 | 2021-01-01 |  1 | 2020-12-31 |
| 1001 | 2021-01-02 |  2 | 2020-12-31 |
| 1001 | 2021-01-04 |  3 | 2021-01-01 |
| 1001 | 2021-01-05 |  4 | 2021-01-01 |
| 1001 | 2021-01-06 |  5 | 2021-01-01 |
| 1001 | 2021-01-07 |  6 | 2021-01-01 |
| 1001 | 2021-01-08 |  7 | 2021-01-01 |
| 1001 | 2021-01-09 |  8 | 2021-01-01 |
| 1001 | 2021-01-10 |  9 | 2021-01-01 |
| 1001 | 2021-01-11 | 10 | 2021-01-01 |
| 1001 | 2021-01-13 | 11 | 2021-01-02 |
| 1001 | 2021-01-15 | 12 | 2021-01-03 |
| 1001 | 2021-01-16 | 13 | 2021-01-03 |
| 1003 | 2021-01-04 |  1 | 2021-01-03 |
| 1003 | 2021-01-05 |  2 | 2021-01-03 |
| 1003 | 2021-01-06 |  3 | 2021-01-03 |
| 1002 | 2021-01-01 |  1 | 2020-12-31 |
| 1002 | 2021-01-02 |  2 | 2020-12-31 |
| 1002 | 2021-01-03 |  3 | 2020-12-31 |
| 1002 | 2021-01-04 |  4 | 2020-12-31 |
| 1002 | 2021-01-05 |  5 | 2020-12-31 |
| 1002 | 2021-01-06 |  6 | 2020-12-31 |
| 1002 | 2021-01-07 |  7 | 2020-12-31 |
| 1002 | 2021-01-08 |  8 | 2020-12-31 |
| 1002 | 2021-01-09 |  9 | 2020-12-31 |
| 1002 | 2021-01-10 | 10 | 2020-12-31 |
| 1002 | 2021-01-11 | 11 | 2020-12-31 |
| 1002 | 2021-01-12 | 12 | 2020-12-31 |
| 1002 | 2021-01-13 | 13 | 2020-12-31 |
| 1002 | 2021-01-16 | 14 | 2021-01-02 |
| 1002 | 2021-01-17 | 15 | 2021-01-02 |
+------+------------+----+------------+
31 rows in set (Elapsed: 00:00:00.02)

2 执行下面的sql,获得关键的结果。

 

SELECT id,login_date,ADDDATE(login_date,INTERVAL -rn DAY) t FROM ( select id,login_date, row_number() over(partition by id order by login_date) rn from login_log ) q1 ;

注意下面的结果,对于单个id,login_date列连续记录对应的t列,即ADDDATE(login_date,INTERVAL -rn DAY)是相同的
一定要看懂上面这句话!!!反复的看!!!
如果还不能理解,请仔细看下面id为1001的数据。

考虑对t列进行count 使用having过滤,此时回到第二步 就得到了答案。

gbase> SELECT id,login_date,ADDDATE(login_date,INTERVAL -rn DAY) t FROM 
    -> (
    ->   select 
    ->     id,login_date,
    ->     row_number() over(partition by id order by login_date) rn 
    ->   from login_log 
    -> ) q1 ;
+------+------------+------------+
| id   | login_date | t          |
+------+------------+------------+
| 1001 | 2021-01-01 | 2020-12-31 |
| 1001 | 2021-01-02 | 2020-12-31 |
| 1001 | 2021-01-04 | 2021-01-01 |
| 1001 | 2021-01-05 | 2021-01-01 |
| 1001 | 2021-01-06 | 2021-01-01 |
| 1001 | 2021-01-07 | 2021-01-01 |
| 1001 | 2021-01-08 | 2021-01-01 |
| 1001 | 2021-01-09 | 2021-01-01 |
| 1001 | 2021-01-10 | 2021-01-01 |
| 1001 | 2021-01-11 | 2021-01-01 |
| 1001 | 2021-01-13 | 2021-01-02 |
| 1001 | 2021-01-15 | 2021-01-03 |
| 1001 | 2021-01-16 | 2021-01-03 |
| 1002 | 2021-01-01 | 2020-12-31 |
| 1002 | 2021-01-02 | 2020-12-31 |
| 1002 | 2021-01-03 | 2020-12-31 |
| 1002 | 2021-01-04 | 2020-12-31 |
| 1002 | 2021-01-05 | 2020-12-31 |
| 1002 | 2021-01-06 | 2020-12-31 |
| 1002 | 2021-01-07 | 2020-12-31 |
| 1002 | 2021-01-08 | 2020-12-31 |
| 1002 | 2021-01-09 | 2020-12-31 |
| 1002 | 2021-01-10 | 2020-12-31 |
| 1002 | 2021-01-11 | 2020-12-31 |
| 1002 | 2021-01-12 | 2020-12-31 |
| 1002 | 2021-01-13 | 2020-12-31 |
| 1002 | 2021-01-16 | 2021-01-02 |
| 1002 | 2021-01-17 | 2021-01-02 |
| 1003 | 2021-01-04 | 2021-01-03 |
| 1003 | 2021-01-05 | 2021-01-03 |
| 1003 | 2021-01-06 | 2021-01-03 |
+------+------------+------------+
31 rows in set (Elapsed: 00:00:00.12)
  • 0
    点赞
  • 3
    收藏
    觉得还不错? 一键收藏
  • 0
    评论
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值