工作中清洗日志数据的时候有时候会有这样的场景,要看用户连续登陆的情况,或者商家连续有单,用户持续下单的数据,这时候可以写个UDF,如果嫌麻烦的可以参考以下sql,可以计算出用户连续登陆的天数,diff = 1就是连续登陆2天的,大于1就是连续登陆多天以上的。
--假定数据形式是:dt(登录日期-yyyymmdd类型) uid(用户id)
select uid
from (
SELECT uid
,num
,MAX(dt) - MIN(dt) diff
FROM (
SELECT a.uid
,a.dt
,dt - rn as num
FROM (
SELECT uid
,dt
,row_number () over (PARTITION BY uid ORDER BY dt) rn
FROM table_name
GROUP BY uid
,dt
) a1
) a2
GROUP BY uid
,num
) t3
where diff = 1
group by uid
;