第12章 需求六:流失用户数
流失用户:最近7天未登录我们称之为流失用户
12.1 DWS层
使用日活明细表dws_uv_detail_day作为DWS层数据
12.2 ADS层
1)建表语句
hive (gmall)>
drop table if exists ads_wastage_count;
create external table ads_wastage_count(
dt
string COMMENT ‘统计日期’,
wastage_count
bigint COMMENT ‘流失设备数’
)
row format delimited fields terminated by ‘\t’
location ‘/warehouse/gmall/ads/ads_wastage_count’;
2)导入2019-02-20数据
hive (gmall)>
insert into table ads_wastage_count
select
‘2019-02-20’,
count(*)
from
(
select mid_id
from dws_uv_detail_day
group by mid_id
having max(dt)<=date_add(‘2019-02-20’,-7)
)t1;
12.3 编写脚本
1)在hadoop102的/home/atguigu/bin目录下创建脚本
[atguigu@hadoop102 bin]$ vim ads_wastage_log.sh
在脚本中编写如下内容
#!/bin/bash
if [ -n “$1” ];then
do_date=$1
else
do_date=date -d "-1 day" +%F
fi
hive=/opt/module/hive/bin/hive
APP=gmall
echo “-----------导入日期$do_date-----------”
sql="
insert into table "
A
P
P
"
.
a
d
s
w
a
s
t
a
g
e
c
o
u
n
t
s
e
l
e
c
t
′
APP".ads_wastage_count select '
APP".adswastagecountselect′do_date’,
count(*)
from
(
select mid_id
from "
A
P
P
"
.
d
w
s
u
v
d
e
t
a
i
l
d
a
y
g
r
o
u
p
b
y
m
i
d
i
d
h
a
v
i
n
g
m
a
x
(
d
t
)
<
=
d
a
t
e
a
d
d
(
′
APP".dws_uv_detail_day group by mid_id having max(dt)<=date_add('
APP".dwsuvdetaildaygroupbymididhavingmax(dt)<=dateadd(′do_date’,-7)
)t1;
"
h
i
v
e
−
e
"
hive -e "
hive−e"sql"
2)增加脚本执行权限
[atguigu@hadoop102 bin]$ chmod 777 ads_wastage_log.sh
3)脚本使用
[atguigu@hadoop102 module]$ ads_wastage_log.sh 2019-02-20
4)查询结果
hive (gmall)> select * from ads_wastage_count;
5)脚本执行时间
企业开发中一般在每日凌晨30分~1点