1. LAG and LEAD
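LAG(col, n, default) returns the value of col from the row n rows before the current row within the window.
The first argument is the column name; the second is the offset n (optional, defaults to 1); the third is the default value, returned when no row exists n rows back in the window (if omitted, the result is NULL).
LEAD is the mirror image of LAG: LEAD(col, n, default) returns the value of col from the row n rows after the current row within the window.
The first argument is the column name; the second is the offset n (optional, defaults to 1); the third is the default value, returned when no row exists n rows ahead in the window (if omitted, the result is NULL).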
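To make the argument semantics concrete, here is a minimal sketch that runs LAG and LEAD on a three-row toy table using Python's built-in sqlite3 module. The table `t` and its columns are illustrative, not from the original, and the sketch assumes the bundled SQLite is version 3.25 or newer (the first release with window functions).

```python
import sqlite3

# Toy data: table and column names are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, val TEXT)")
conn.executemany("INSERT INTO t VALUES (?, ?)", [(1, "a"), (2, "b"), (3, "c")])

rows = conn.execute(
    """
    SELECT id,
           val,
           LAG(val)             OVER (ORDER BY id) AS prev_val,   -- n = 1, default = NULL
           LAG(val, 2, 'none')  OVER (ORDER BY id) AS prev2_val,  -- explicit n and default
           LEAD(val, 1, 'none') OVER (ORDER BY id) AS next_val
    FROM t
    ORDER BY id
    """
).fetchall()

for row in rows:
    print(row)
# (1, 'a', None, 'none', 'b')
# (2, 'b', 'a', 'none', 'c')
# (3, 'c', 'b', 'a', 'none')
```

The default value only appears where the offset runs off the edge of the window: the first row has no predecessor, and the last row has no successor.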
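Example: for one user, list each session's creation time next to the creation time of that user's next session, as the basis for computing the gap between them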
select session_id, user_id, session_create_time,
LEAD(session_create_time,1) over (order by session_create_time asc) as next_row
from dwb.dwb_pulsar_c_inappropriate_hour
where user_id = '1105835577';
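Compute the interval in days between each session and the next one (the query uses LEAD, so the gap looks forward; the two-argument datediff(end, start) form is the Hive/MySQL style, while ClickHouse's dateDiff takes a unit as its first argument)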
select session_id, user_id, session_create_time,
LEAD(session_create_time,1) over (order by session_create_time asc) as next_row,
datediff(LEAD(session_create_time, 1) over(order by session_create_time asc),session_create_time) as diff_days
from dwb.dwb_pulsar_c_inappropriate_hour
where user_id = '1105835577';
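The queries above rely on the WHERE filter to keep a single user's rows in the window. As a hedged sketch of the same idea across many users, the example below adds PARTITION BY user_id so that LEAD never reaches into another user's sessions. It uses sqlite3 with a hypothetical `sessions` table and made-up timestamps; since SQLite has no datediff, the day difference is approximated with julianday() on the date part, which is an assumption of this sketch rather than the original's function.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table mirroring the columns used in the queries above.
conn.execute(
    "CREATE TABLE sessions (session_id TEXT, user_id TEXT, session_create_time TEXT)"
)
conn.executemany(
    "INSERT INTO sessions VALUES (?, ?, ?)",
    [
        ("s1", "u1", "2023-01-01 10:00:00"),
        ("s2", "u1", "2023-01-04 10:00:00"),
        ("s3", "u1", "2023-01-05 10:00:00"),
        ("s4", "u2", "2023-01-02 09:00:00"),
    ],
)

gaps = conn.execute(
    """
    SELECT user_id, session_id, next_time,
           -- whole days between the two calendar dates (stand-in for datediff)
           CAST(julianday(date(next_time)) - julianday(date(session_create_time))
                AS INTEGER) AS diff_days
    FROM (
        SELECT user_id, session_id, session_create_time,
               LEAD(session_create_time) OVER (
                   PARTITION BY user_id ORDER BY session_create_time
               ) AS next_time
        FROM sessions
    )
    ORDER BY user_id, session_create_time
    """
).fetchall()

for row in gaps:
    print(row)
```

The last session of each user has no successor, so both next_time and diff_days come back as NULL (None in Python).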
2. Ranking with ROW_NUMBER()
SELECT
user_id, session_id, session_create_time,
ROW_NUMBER() OVER(PARTITION BY user_id ORDER BY session_create_time DESC) AS rank
FROM dwb.dwb_pulsar_c_inappropriate_hour
ORDER BY user_id;
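After ranking, keep the first row of each user_id group, which under the descending order is the row with the latest session_create_time
Ways to implement this
Method 1: ROW_NUMBER() OVER(PARTITION BY ...)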
SELECT t.user_id, t.session_id, t.session_create_time
FROM(
SELECT
user_id, session_id, session_create_time,
ROW_NUMBER() OVER(PARTITION BY user_id ORDER BY session_create_time DESC) AS rank
FROM dwb.dwb_pulsar_c_inappropriate_hour
)t
WHERE t.rank = 1 ORDER BY t.user_id;
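The ROW_NUMBER() approach can be checked on toy data. The sketch below uses sqlite3 with a hypothetical `sessions` table and invented rows, and names the row-number column `rn` purely to avoid confusion with the RANK() function.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table and rows for illustration only.
conn.execute(
    "CREATE TABLE sessions (session_id TEXT, user_id TEXT, session_create_time TEXT)"
)
conn.executemany(
    "INSERT INTO sessions VALUES (?, ?, ?)",
    [
        ("s1", "u1", "2023-01-01 10:00:00"),
        ("s2", "u1", "2023-01-04 10:00:00"),
        ("s3", "u2", "2023-01-02 09:00:00"),
        ("s4", "u2", "2023-01-03 09:00:00"),
    ],
)

latest = conn.execute(
    """
    SELECT user_id, session_id, session_create_time
    FROM (
        SELECT user_id, session_id, session_create_time,
               ROW_NUMBER() OVER (
                   PARTITION BY user_id ORDER BY session_create_time DESC
               ) AS rn
        FROM sessions
    )
    WHERE rn = 1          -- newest session per user
    ORDER BY user_id
    """
).fetchall()

print(latest)
# [('u1', 's2', '2023-01-04 10:00:00'), ('u2', 's4', '2023-01-03 09:00:00')]
```

Because ROW_NUMBER() assigns distinct numbers even to tied rows, this form returns exactly one row per user.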
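Method 2: rowNumberInAllBlocks() with LIMIT BY (ClickHouse)
LIMIT 1 BY keeps the first row per user_id in result order, so the query needs an ORDER BY for that row to be the latest one
SELECT t.user_id, t.session_id, t.session_create_time
FROM(
SELECT
user_id, session_id, session_create_time,
rowNumberInAllBlocks() AS rank -- global row number across blocks, not per user_id
FROM dwb.dwb_pulsar_c_inappropriate_hour
)t
ORDER BY t.user_id, t.session_create_time DESC -- newest row first within each user
LIMIT 1 BY t.user_id;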
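Since LIMIT BY is specific to ClickHouse, the following is only a plain-Python emulation of its documented behaviour (keep the first n rows for each distinct key, in the order rows arrive). The helper `limit_by` and the sample rows are hypothetical; the point is to show that which row survives depends entirely on the ordering applied beforehand.

```python
from operator import itemgetter


def limit_by(rows, key, n=1):
    """Keep at most n rows per key, in the order the rows arrive."""
    seen = {}
    kept = []
    for row in rows:
        k = key(row)
        seen[k] = seen.get(k, 0) + 1
        if seen[k] <= n:
            kept.append(row)
    return kept


# (user_id, session_id, session_create_time); invented sample rows.
rows = [
    ("u1", "s1", "2023-01-01 10:00:00"),
    ("u1", "s2", "2023-01-04 10:00:00"),
    ("u2", "s3", "2023-01-02 09:00:00"),
    ("u2", "s4", "2023-01-03 09:00:00"),
]

# Without sorting: whichever row of each user happens to come first is kept.
unsorted_pick = limit_by(rows, key=itemgetter(0))

# Sort newest first, then by user (stable sort keeps the time order per user).
ordered = sorted(rows, key=itemgetter(2), reverse=True)
ordered.sort(key=itemgetter(0))
sorted_pick = limit_by(ordered, key=itemgetter(0))

print(unsorted_pick)  # oldest sessions here, only because of input order
print(sorted_pick)    # latest session per user
```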
Method 3: max() function
SELECT a.user_id, a.session_id, a.session_create_time -- session_id exists only in a, not in the aggregate t
FROM
dwb.dwb_pulsar_c_inappropriate_hour a
inner join ( -- inner join drops rows that are not the latest for their user
SELECT
user_id, max(session_create_time) AS session_create_time
FROM dwb.dwb_pulsar_c_inappropriate_hour
GROUP BY user_id
)t
on a.user_id = t.user_id
and a.session_create_time = t.session_create_time;
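One behavioural difference of the max() join is worth seeing on data: when two sessions of the same user share the latest timestamp, the join returns both, whereas a ROW_NUMBER() filter would return one. The sketch below demonstrates this with sqlite3; the `sessions` table and its rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table; s2 and s3 deliberately tie on the latest timestamp.
conn.execute(
    "CREATE TABLE sessions (session_id TEXT, user_id TEXT, session_create_time TEXT)"
)
conn.executemany(
    "INSERT INTO sessions VALUES (?, ?, ?)",
    [
        ("s1", "u1", "2023-01-01 10:00:00"),
        ("s2", "u1", "2023-01-04 10:00:00"),
        ("s3", "u1", "2023-01-04 10:00:00"),
        ("s4", "u2", "2023-01-02 09:00:00"),
    ],
)

joined = conn.execute(
    """
    SELECT a.user_id, a.session_id, a.session_create_time
    FROM sessions a
    INNER JOIN (
        SELECT user_id, MAX(session_create_time) AS session_create_time
        FROM sessions
        GROUP BY user_id
    ) t
      ON a.user_id = t.user_id
     AND a.session_create_time = t.session_create_time
    ORDER BY a.user_id, a.session_id
    """
).fetchall()

for row in joined:
    print(row)
# ('u1', 's2', '2023-01-04 10:00:00')
# ('u1', 's3', '2023-01-04 10:00:00')
# ('u2', 's4', '2023-01-02 09:00:00')
```

If exactly one row per user is required, a tie-breaker column is needed, or one of the row-number based methods may be a better fit.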