判断日期是否连续

hql统计连续登录三天及以上的用户

数据提供

 用户ID、登入日期
 user01,2018-02-28
 user01,2018-03-01
 user01,2018-03-02
 user01,2018-03-04
 user01,2018-03-05
 user01,2018-03-06
 user01,2018-03-07
 user02,2018-03-01
 user02,2018-03-02
 user02,2018-03-03
 user02,2018-03-06

输出字段

user_id | times | start_date | end_date

创建表

 create table lxdl(
 user_id string comment '用户ID'
 ,login_date date comment '登入日期'
 )
 row format delimited fields terminated by ',';

加载数据

load data local inpath '/opt/module/datas/dsjsfc/user.txt' into table lxdl;

验证数据

select * from lxdl;
+---------------+------------------+--+
| lxdl.user_id  | lxdl.login_date  |
+---------------+------------------+--+
|  user01       | 2018-02-28       |
|  user01       | 2018-03-01       |
|  user01       | 2018-03-02       |
|  user01       | 2018-03-04       |
|  user01       | 2018-03-05       |
|  user01       | 2018-03-06       |
|  user01       | 2018-03-07       |
|  user02       | 2018-03-01       |
|  user02       | 2018-03-02       |
|  user02       | 2018-03-03       |
|  user02       | 2018-03-06       |
+---------------+------------------+--+

思路

1.根据user_id分区,日期排序

select 
	user_id,login_date,
	row_number() over(partition by user_id order by login_date) as rn 
from 
	lxdl;

输出

+-------------+-------------+-------+--+
| user_id     |	login_date	|	rn  |  
+-------------+-------------+-------+--+
| user01	  |	2018-02-28	|	1	|
| user01	  |	2018-03-01	|	2   |
| user01	  |	2018-03-02	|	3	|
| user01	  |	2018-03-04	|	4	|
| user01	  |	2018-03-05	|	5	|
| user01	  |	2018-03-06	|	6	|
| user01	  |	2018-03-07	|	7	|
| user02	  |	2018-03-01	|	1	|
| user02	  |	2018-03-02	|	2	|
| user02	  |	2018-03-03	|	3	|
| user02	  |	2018-03-06	|	4	|
+-------------+-------------+-------+--+

2.用日期与rn想减,得到的差值如果相等,则说明日期是连续的

select 
	user_id,login_date,
	date_sub(login_date,rn) as date_diff 
from 
	(
    select 
		user_id,login_date,
		row_number() over(partition by user_id order by login_date) as rn 
	from 
		lxdl
	) t1;

输出

+-------------+-------------+-------------+--+
| user_id     |	login_date	|  date_diff  |  
+-------------+-------------+-------------+--+
| user01	  |	2018-02-28	| 2018-02-27  |
| user01	  |	2018-03-01	| 2018-02-27  |   
| user01	  |	2018-03-02	| 2018-02-27  |
| user01	  |	2018-03-04	| 2018-02-28  |
| user01	  |	2018-03-05	| 2018-02-28  |
| user01	  |	2018-03-06	| 2018-02-28  |
| user01	  |	2018-03-07	| 2018-02-28  |
| user02	  |	2018-03-01	| 2018-02-28  |
| user02	  |	2018-03-02	| 2018-02-28  |
| user02	  |	2018-03-03	| 2018-02-28  |
| user02	  |	2018-03-06	| 2018-03-02  |
+-------------+-------------+-------------+--+

3.根据user_id和date_diff分组,最小登录日期即为此次连续登录的开始日期start_date,最大登录日期即为结束日期end_date,登录次数即为分组后的count(1)

select 
	user_id,count(1) as times,
	min(login_date) as start_date,
	max(login_date) as end_date
from
	(
        select 
        	user_id,login_date,
        	date_sub(login_date,rn) as date_diff 
        from 
        (
            select 
				user_id,login_date,
				row_number() over(partition by user_id order by login_date) as rn 
			from 
				lxdl) 
        t1) t2 
group by user_id,date_diff 
having times >=3;

输出

+-----------+-----------+----------------+--------------+--+
|	user_id	|	times	|	start_date	 |	end_date   	|
+-----------+-----------+----------------+--------------+--+	
| 	user01	|	3		|	2018-02-28	 |	2018-03-02	|
| 	user01	|	4		|	2018-03-04	 |	2018-03-07	|	
| 	user02	|	3		|	2018-03-01	 |	2018-03-03	|
+-----------+-----------+----------------+--------------+--+
  • 6
    点赞
  • 17
    收藏
    觉得还不错? 一键收藏
  • 2
    评论

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论 2
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值