3.实现数据仓库

1. 先登录hive数据库

hive

2.创建weblog数据库

create database weblog;

use weblog;

2.1创建表

create table ods_weblog_origin(valid string,remote_addr string,remote_user string,time_local string,request string,status string,body_bytes_sent string,http_referer string,http_user_agent string) partitioned by (datestr string) row format delimited fields terminated by ‘\001’;

3、导入数据

load data inpath ‘/output/’ overwrite into table ods_weblog_origin partition(datestr=‘20130918’);

查看是否导入成功

展示前5条数据

select * from ods_weblog_origin limit 5;

4.1创建 _明细表

create table ods_weblog_detail(valid string,remote_addr string,remote_user string,time_local string,daystr string,timestr string,month string,day string,hour string,request string,status string,body_bytes_sent string,http_referer string,ref_host string,ref_path string,ref_query string,ref_query_id string,http_user_agent string) partitioned by (datestr string);

4.2创建解析url的_临时中间表

create table t_ods_tmp_referurl as SELECT a.,b. FROM ods_weblog_origin a LATERAL VIEW parse_url_tuple(regexp_replace(http_referer, “”",""),‘HOST’, ‘PATH’,‘QUERY’, ‘QUERY:id’) b as host, path, query, query_id;

4.3创建解析时间的_临时中间表

create table t_ods_tmp_detail as select b.*,substring(time_local,0,10) as daystr,substring(time_local,12) as tmstr,substring(time_local,6,2) as month,substring(time_local,9,2) as day,substring(time_local,11,3) as hour from t_ods_tmp_referurl b;

5.设置动态分区

//开启分区功能

set hive.exec.dynamic.partition=true;

//允许所有分区字段都可以使用动态分区

set hive.exec.dynamic.partition.mode=nonstrict;

6.向ods_weblog_detail_表(4.1)中加载数据

insert overwrite table ods_weblog_detail partition(datestr) select distinct otd.valid,otd.remote_addr,otd.remote_user,otd.time_local,otd.daystr,otd.tmstr,otd.month,otd.day,otd.hour,otr.request,otr.status,otr.body_bytes_sent,otr.http_referer,otr.host,otr.path,otr.query,otr.query_id,otr.http_user_agent,otd.daystr from t_ods_tmp_detail as otd,t_ods_tmp_referurl as otr where otd.remote_addr=otr.remote_addr and otd.time_local=otr.time_local and otd.body_bytes_sent=otd.body_bytes_sent and otd.request=otr.request;

  • 2
    点赞
  • 3
    收藏
    觉得还不错? 一键收藏
  • 0
    评论
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值