第10章 需求四:沉默用户数
沉默用户:指的是只在安装当天启动过,且启动时间是在一周前
10.1 DWS层
使用日活明细表dws_uv_detail_day作为DWS层数据
10.2 ADS层
1)建表语句
hive (gmall)>
drop table if exists ads_slient_count;
create external table ads_slient_count(
dt
string COMMENT ‘统计日期’,
slient_count
bigint COMMENT ‘沉默设备数’
)
row format delimited fields terminated by ‘\t’
location ‘/warehouse/gmall/ads/ads_slient_count’;
2)导入2019-02-20数据
hive (gmall)>
insert into table ads_slient_count
select
‘2019-02-20’ dt,
count() slient_count
from
(
select mid_id
from dws_uv_detail_day
where dt<=‘2019-02-20’
group by mid_id
having count()=1 and min(dt)<date_add(‘2019-02-20’,-7)
) t1;
3)查询导入数据
hive (gmall)> select * from ads_slient_count;
10.3 编写脚本
1)在hadoop102的/home/atguigu/bin目录下创建脚本
[atguigu@hadoop102 bin]$ vim ads_slient_log.sh
在脚本中编写如下内容
#!/bin/bash
hive=/opt/module/hive/bin/hive
APP=gmall
if [ -n “$1” ];then
do_date=$1
else
do_date=date -d "-1 day" +%F
fi
echo “-----------导入日期$do_date-----------”
sql="
insert into table “
A
P
P
"
.
a
d
s
s
l
i
e
n
t
c
o
u
n
t
s
e
l
e
c
t
′
APP".ads_slient_count select '
APP".adsslientcountselect′do_date’ dt,
count() slient_count
from
(
select
mid_id
from "
A
P
P
"
.
d
w
s
u
v
d
e
t
a
i
l
d
a
y
w
h
e
r
e
d
t
<
=
′
APP".dws_uv_detail_day where dt<='
APP".dwsuvdetaildaywheredt<=′do_date’
group by mid_id
having count()=1 and min(dt)<=date_add(’$do_date’,-7)
)t1;”
h
i
v
e
−
e
"
hive -e "
hive−e"sql"
2)增加脚本执行权限
[atguigu@hadoop102 bin]$ chmod 777 ads_slient_log.sh
3)脚本使用
[atguigu@hadoop102 module]$ ads_slient_log.sh 2019-02-20
4)查询结果
hive (gmall)> select * from ads_slient_count;
5)脚本执行时间
企业开发中一般在每日凌晨30分~1点