构建历史活跃快照表
现有当日活跃表和历史活跃登录表
数据如下:
当日活跃表:td_ac
> select * from td_ac;
OK
uid dt
u1 2022-04-01
u2 2022-04-01
**历史活跃登录表:ld_ac **
hive> select * from ld_ac;
OK
uid start_time end_time
u1 2022-03-01 2022-03-28
u3 2022-03-02 2022-03-31
要求输出:
uid 首次登录时间 末次登录时间
u1 2022-03-01 2022-04-01
u2 2022-04-01 2022-04-01
u3 2022-03-02 2022-03-31
实现思路:
使用 full join 全表关联
三种可能:
1、当天活跃 历史活跃
有 有 ⇒ 忠诚用户
有 没有 ⇒ 新增用户
没有 有 ⇒ //摇摆用户
hql代码实现:
select
case when a.uid is not null then a.uid else b.uid end as uid,
case
//当天有、历史也有这个用户
when a.uid is not null and b.uid is not null then b.start_time
//新增用户
when b.uid is null then a.dt
//当天没来、历史有的用户
when a.uid is null then b.start_time
end as start_time, //首次登录时间
case
when a.uid is not null and b.uid is not null then a.dt
when b.uid is null then a.dt
when a.uid is null then b.end_time
end as end_time //末次登录时间
from
td_ac a
full join
ld_ac b
on a.uid = b.uid;
输出结果
OK
u1 2022-03-01 2022-04-01
u2 2022-04-01 2022-04-01
u3 2022-03-02 2022-03-31