(6)在comm数据库下创建一个名为dim_area的外部表,如果表已存在,则先删除;另外,要求指定表的存储路径为HDFS的/behavior/dim/dim_area目录,字段分隔符为
“\t”;字段类型如下表所示;
表8 字段类型表
答:
drop table if exists comm.dim_area;
create external table comm.dim_area(city string,province string,area string) row format delimited fields terminated by ‘\t’ location ‘/behavior/dim/dim_area’;
(7)使用load data子句将本地/root/eduhq/data目录下的“dim_date_2023.txt”和“dim_area.txt”文件分别加载到外部表dim_date和dim_area中;
答:
load data local inpath ‘/root/eduhq/data/dim_date_2023.txt’ into table comm.dim_date;
load data local inpath ‘/root/eduhq/data/dim_area.txt’ into table comm.dim_area;
(8)分别查看外部表dim_date和dim_area的前3行数据;
答:
select * from comm.dim_date limit 3;
select * from comm.dim_area limit 3;
(9)分别统计外部表dim_date和dim_area数据总行数;
答:
select count() from comm.dim_date;
select count() from comm.dim_area;
(10)统计不同省份用户访问量;将统计结果导出到本地文件系统的/root/eduhq/result/ads_user_pro目录下