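Users log in to a video site, but on some days they do not watch a movie. To analyze which movie genres a user likes, each login record needs to be filled with the genre of the movie the user most recently watched within the last 30 days.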
User ID | Login date | Movie genre watched |
---|---|---|
1 | 2023-01-05 | Romance |
1 | 2023-01-17 | |
1 | 2023-01-29 | Horror |
1 | 2023-02-15 | Sci-fi |
1 | 2023-03-05 | |
2 | 2023-01-07 | Biography |
2 | 2023-02-25 | Documentary |
2 | 2023-02-26 | |
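Key point: use a window function ordered by timestamp, and limit the time range with range between. The offset 2505600 is in seconds and equals 29 days (29 × 86400), so together with the current day the window covers 30 days. Concatenate the date with the movie genre: if the genre is NULL, concat returns NULL, and max ignores NULL values. Because every concatenated string starts with the date, the maximum within the window belongs to the latest date that has a genre. Stripping the 10-character date prefix with substr then leaves the genre the user watched most recently.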
select
    id,
    dt,
    dt_timestamp,
    type,
    -- drop the 10-character date prefix (yyyy-MM-dd), keeping only the genre
    substr(last_30d_type, 11) as last_30d_type
from (
    select
        id,
        dt,
        unix_timestamp(dt, 'yyyy-MM-dd') as dt_timestamp,
        type,
        -- concat returns NULL when type is NULL and max skips NULLs, so the
        -- maximum is the latest date in the window that has a genre;
        -- 2505600 seconds = 29 days, i.e. 30 days including the current one
        max(concat(dt, type)) over (
            partition by id
            order by unix_timestamp(dt, 'yyyy-MM-dd')
            range between 2505600 preceding and current row
        ) as last_30d_type
    from (
        -- sample data from the table above
        select 1 as id, '2023-01-05' as dt, 'Romance' as type
        union all select 1 as id, '2023-01-17' as dt, null as type
        union all select 1 as id, '2023-01-29' as dt, 'Horror' as type
        union all select 1 as id, '2023-02-15' as dt, 'Sci-fi' as type
        union all select 1 as id, '2023-03-05' as dt, null as type
        union all select 2 as id, '2023-01-07' as dt, 'Biography' as type
        union all select 2 as id, '2023-02-25' as dt, 'Documentary' as type
        union all select 2 as id, '2023-02-26' as dt, null as type
    ) a
) b
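To sanity-check what the query should return, the same window logic can be sketched in plain Python. This is only an illustrative cross-check, not part of the original solution: the function name `last_30d_type`, the constant `WINDOW_SECONDS`, and the English genre labels are assumptions made for the example, and the code emulates the concat/max/substr steps by hand rather than running Hive.

```python
from datetime import date

# Same offset as in the query: 29 days expressed in seconds (2505600).
WINDOW_SECONDS = 29 * 86400

# Sample data from the table: (user id, login date, genre or None).
rows = [
    (1, "2023-01-05", "Romance"),
    (1, "2023-01-17", None),
    (1, "2023-01-29", "Horror"),
    (1, "2023-02-15", "Sci-fi"),
    (1, "2023-03-05", None),
    (2, "2023-01-07", "Biography"),
    (2, "2023-02-25", "Documentary"),
    (2, "2023-02-26", None),
]


def last_30d_type(rows):
    """Return (id, dt, type, last_30d_type) for every input row."""
    result = []
    for uid, dt, genre in rows:
        current = date.fromisoformat(dt)
        # Emulate max(concat(dt, type)) over the range frame: rows of the
        # same user, 0 to 29 days back, skipping NULL genres as max would.
        candidates = [
            other_dt + other_genre
            for other_uid, other_dt, other_genre in rows
            if other_uid == uid
            and other_genre is not None
            and 0 <= (current - date.fromisoformat(other_dt)).days * 86400 <= WINDOW_SECONDS
        ]
        best = max(candidates) if candidates else None
        # Emulate substr(x, 11): drop the 10-character date prefix.
        result.append((uid, dt, genre, best[10:] if best else None))
    return result


for row in last_30d_type(rows):
    print(row)
```

For the sample data, the two rows of user 1 without a genre pick up Romance (2023-01-17) and Sci-fi (2023-03-05), and the last row of user 2 picks up Documentary. A login with no watched movie in the preceding 29 days would stay empty.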