hive分区以及动态分区

hive静态分区

(hive分区主要是为了提高检索效率,内部表和外部表都可以创建分区)
在这里插入图片描述

hive静态分区操作

静态分区操作数据

1,小明1,lol-book-move,beijing:shangxuetang-shanghai:pudong
2,小明2,lol-book-move,beijing:shangxuetang-shanghai:pudong
3,小明3,lol-book-move,beijing:shangxuetang-shanghai:pudong
4,小明4,lol-book,beijing:shangxuetang-shanghai:pudong
5,小明5,lol-book-move,beijing:shangxuetang-shanghai:pudong
6,小明6,lol-book-move,beijing:shangxuetang-shanghai:pudong
7,小明7,lol-book-move,beijing:shangxuetang-shanghai:pudong

Hive 在创建表时添加分区partition
在表定义时指定对应的partition字段
a、不分区建表

create table psn(
id int,
name string,
likes array<string>,
address map<string,string>
)
ROW FORMAT delimited 
fields terminated by ','            //hdfs存放的数据字段之间的分隔符(记得是存储在hdfs里的,不是select* from 查询显示的分割)
COLLECTION ITEMS TERMINATED BY '-'  //hdfs存放的数据集合之间的分割符
map keys terminated by ':' ;        //hdfs存放的数据map之间的分割符
location '/root/data/'              //hdfs集群存放的数据

b、单分区建表语句:

--单分区建表
create  table psn5(
id int,
name string,
likes array<string>,
addres map<string,string>
)
partitioned by (age int)
row format delimited
fields terminated by ','
collection items terminated by '-'
map keys terminated by ':'

--添加数据
load data local inpath '/root/data/data' into table psn5 partition (age =10);

c、 双分区建表语句:

--多分区建表
create  table psn6(
id int,
name string,
likes array<string>,
addres map<string,string>
)
partitioned by (age int,sex string)
row format delimited
fields terminated by ','
collection items terminated by '-'
map keys terminated by ':'

--添加数据
load data local inpath '/root/data/data' into table psn6 partition (age=10,sex='body');
--增加分区
alter table psn6 add partition (age =30,sex='girl');
--多分区查询
select * from psn6 where  age=30 and  sex='girl';

(表已创建,在此基础上添加分区):

ALTER TABLE table_name ADD [IF NOT EXISTS] PARTITION partition_spec  [LOCATION 'location1'] partition_spec [LOCATION 'location2'] ...;
partition_spec:
: (partition_column = partition_col_value, partition_column = partition_col_value, ...)
例:
ALTER TABLE day_table ADD PARTITION (dt='2008-08-08', hour='08')

Hive删除分区语法:

ALTER TABLE table_name DROP partition_spec, partition_spec,...
partition_spec:
: (partition_column = partition_col_value, partition_column = partition_col_value, ...)
用户可以用 ALTER TABLE DROP PARTITION 来删除分区。
内部表中、对应分区的元数据和数据将被一并删除。
例:
ALTER TABLE day_hour_table DROP PARTITION (dt='2008-08-08', hour='09');

hive动态分区操作

动态分区操作数据

1,小明1,12,man,book-move,beijing:shangxuetang-shanghai:pudong
2,小明2,13,man,book-move,beijing:shangxuetang-shanghai:pudong
3,小明3,12,boy,book-move,beijing:shangxuetang-shanghai:pudong
4,小明4,12,man,book,beijing:shangxuetang-shanghai:pudong
5,小明5,13,boy,book-move,beijing:shangxuetang-shanghai:pudong
6,小明6,12,boy,book-move,beijing:shangxuetang-shanghai:pudong
7,小明7,13,man,book-move,beijing:shangxuetang-shanghai:pudong

普通建表

create table psn7(
id int,
name string ,
age int,
sex string,
likes array<string>,
addres map<string,string>
)
row format delimited 
fields terminated by ','
collection items terminated by '-'
map keys terminated by ':'

建分区表

create table psn_part(
id int,
name string ,
likes array<string>,
addres map<string,string>
)
partitioned by (age int,sex string)
row format delimited 
fields terminated by ','
collection items terminated by '-'
map keys terminated by ':'

插入数据

使用模式
from 
insert
select 

动态分区语句
from psn7
insert overwrite  table psn_part partition (age,sex)
select id,name,likes,addres,age,sex

在插入数据的时候,会报错如下:

FAILED: SemanticException [Error 10096]: Dynamic partition strict mode requires at least one static partition column. To turn this off set hive.exec
.dynamic.partition.mode=nonstrict

解决方案:

set hive.exec.dynamic.partition.mode=nonstrict;

参考文档链接 https://hugh-wangp.iteye.com/blog/1612268
https://www.cnblogs.com/luogankun/p/4111145.html

评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值