Create a Hive table that is bucketed by remoteIp and partitioned by requestmethod
hive> create table partition_cluster_accsslog
> ( remoteIp string,
> loginRemoteName string,
> authrizedName string,
> responseCode int,
> contentBytes int,
> handleTime int,
> timestamps bigint,
> requesturl string,
> requestprotocol string,
> refer string,
> browsername string)
> partitioned by (requestmethod string)
> clustered by (remoteIp) sorted by (handleTime) into 2 buckets
> row format delimited fields terminated by '\t'
> LOCATION '/bike/log/partition_cluster_accesslog';
OK
Time taken: 0.332 seconds
hive> set hive.enforce.bucketing;
hive.enforce.bucketing=true
hive> insert into partition_cluster_accsslog partition( requestmethod='GET')
> select
> remoteip,loginremotename,authrizedname,responseCode,contentBytes,handleTime,timestamps,requesturl,requestprotocol,refer,browsername
> from accesslog where requestmethod='GET';
WARNING: Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases.
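
Once the insert finishes, the partition and bucket layout can be checked from the same Hive CLI session. The statements below are a minimal sketch rather than part of the original session: they assume the table name and LOCATION from the create statement above, and that the partition directory follows Hive's default `requestmethod=GET` naming under that location. With two buckets and bucketing enforced, the directory listing would normally show two data files, though the exact file names depend on the Hive version and execution engine.

```sql
-- List the partitions; requestmethod=GET should appear after the insert.
show partitions partition_cluster_accsslog;

-- Show table metadata, including the bucket count, bucket column and sort column.
describe formatted partition_cluster_accsslog;

-- Read only the first of the two buckets inside the GET partition.
select remoteIp, handleTime
from partition_cluster_accsslog tablesample(bucket 1 out of 2 on remoteIp)
where requestmethod = 'GET'
limit 10;

-- Inspect the partition directory (assumed default layout under the table LOCATION).
dfs -ls /bike/log/partition_cluster_accesslog/requestmethod=GET;
```

Sampling on the same column the table is clustered by (remoteIp) lets Hive read a single bucket file instead of scanning the whole partition.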
