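Hive partition tests:
Creating Hive partitions: static, dynamic, and mixed. In a mixed partition spec, the static partition columns must come before the dynamic ones.
Dropping partitions over a date range
Inside partition(), multiple conditions are separated by commas rather than AND, for example to cover a date range:
-- Drop partitions in bulk
alter table tmp_test.tmptable drop partition(dt>='20191101',dt<='20191130');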
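A slightly more defensive version of the range drop can be sketched as follows. It assumes dt is a string partition column of tmp_test.tmptable, as the statement above suggests; the preview and confirmation steps are additions for illustration, not part of the original test.

```sql
-- List the existing partitions first, so it is clear what the range will hit
show partitions tmp_test.tmptable;

-- IF EXISTS keeps the statement from failing when no partition matches the range;
-- the comma between the two conditions acts as AND
alter table tmp_test.tmptable drop if exists partition (dt>='20191101', dt<='20191130');

-- List the partitions again to confirm that the November 2019 partitions are gone
show partitions tmp_test.tmptable;
```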
Writing data into partitions
Create a partitioned test table
use tmp_test;
drop table if exists tmp_test.cs_partitionTable_20190813;
CREATE EXTERNAL TABLE IF NOT EXISTS tmp_test.cs_partitionTable_20190813
(id int
,name string
)
partitioned by (pid1 int,pid2 int)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\001'
STORED AS INPUTFORMAT
'org.apache.hadoop.mapred.TextInputFormat'
OUTPUTFORMAT
'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
location '/data/test/tmp/cs_partitionTable_20190813';
Source table holding the data to be loaded
use tmp_test;
drop table if exists tmp_test.cs_partitiondTable_insertdata_20190813;
CREATE EXTERNAL TABLE IF NOT EXISTS tmp_test.cs_partitiondTable_insertdata_20190813
(id int
,name string
,pid1 int
,pid2 int
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\001'
STORED AS INPUTFORMAT
'org.apache.hadoop.mapred.TextInputFormat'
OUTPUTFORMAT
'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
location '/data/test/tmp/cs_partitiondTable_insertdata_20190813';
Insert the data:
insert overwrite table tmp_test.cs_partitiondTable_insertdata_20190813
select 1 as id,'香港' as name,1 as pid1,1 as pid2
union all
select 2 as id,'台湾' as name,2 as pid1,2 as pid2
union all
select 3 as id,'澳门' as name,3 as pid1,3 as pid2
union all
select 4 as id,'大陆' as name,2 as pid1,4 as pid2;
- Static partitions
Insert statement:
from tmp_test.cs_partitiondTable_insertdata_20190813
insert overwrite table tmp_test.cs_partitionTable_20190813
partition (pid1 = 1,pid2 = 1)
select id,name where pid1 = 1 and pid2 = 1
insert overwrite table tmp_test.cs_partitionTable_20190813
partition (pid1 = 2,pid2 = 2)
select id,name where pid1 = 2 and pid2 = 2 ;
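- Dynamic partitions
Dynamic partitioning is controlled by configuration parameters and, depending on the Hive version, may not be enabled by default (hive.exec.dynamic.partition has defaulted to true since Hive 0.9.0). Under strict mode, which is the default in most releases, at least one partition column must be static, so an insert in which every partition column is dynamic needs hive.exec.dynamic.partition.mode=nonstrict.
Insert statement: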
set hive.exec.dynamic.partition=true;
set hive.exec.dynamic.partition.mode=nostrict;
set hive.exec.max.dynamic.partitions.pernode=1000;
insert overwrite table tmp_test.cs_partitionTable_20190813
partition (pid1,pid2)
select id,name,pid1,pid2
from tmp_test.cs_partitiondTable_insertdata_20190813;
Result:
show partitions tmp_test.cs_partitionTable_20190813;
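Beyond listing the partitions, the loaded data can be checked per partition. The queries below are a minimal sketch, assuming the dynamic insert above has completed. With the four sample rows, one would expect four partitions (pid1=1/pid2=1, pid1=2/pid2=2, pid1=2/pid2=4, pid1=3/pid2=3) holding one row each.

```sql
-- Row count per partition; partition columns can be queried like ordinary columns
select pid1, pid2, count(*) as cnt
from tmp_test.cs_partitionTable_20190813
group by pid1, pid2;

-- Read a single partition; filtering on the partition columns lets Hive
-- skip the directories of all other partitions
select id, name
from tmp_test.cs_partitionTable_20190813
where pid1 = 2 and pid2 = 4;
```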
- Mixed partitions
Building on the dynamic case, put a static partition column in front. Insert statement:
set hive.exec.dynamic.partition=true;
set hive.exec.dynamic.partition.mode=nostrict;
set hive.exec.max.dynamic.partitions.pernode=1000;
insert overwrite table tmp_test.cs_partitionTable_20190813
partition (pid1=1,pid2)
-- pid1 is fixed by the partition clause, so only the dynamic column pid2 is selected
select id,name,pid2
from tmp_test.cs_partitiondTable_insertdata_20190813
where pid1 = 1;
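Because a mixed insert already has a static leading column, it should also be accepted in strict mode. The sketch below illustrates this and the ordering rule, using the tmp_test tables created earlier. The choice of pid1=2 is only for illustration, and the rejected form is left commented out.

```sql
set hive.exec.dynamic.partition=true;
set hive.exec.dynamic.partition.mode=strict;

-- Accepted in strict mode: pid1 is static, pid2 is dynamic.
-- The dynamic column goes last in the select list; the static one is not selected.
insert overwrite table tmp_test.cs_partitionTable_20190813
partition (pid1=2, pid2)
select id, name, pid2
from tmp_test.cs_partitiondTable_insertdata_20190813
where pid1 = 2;

-- Not accepted: a dynamic column (pid1) may not come before a static one (pid2=2)
-- insert overwrite table tmp_test.cs_partitionTable_20190813
-- partition (pid1, pid2=2)
-- select id, name, pid1
-- from tmp_test.cs_partitiondTable_insertdata_20190813
-- where pid2 = 2;
```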
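In summary, static, dynamic, and mixed partition inserts are written in essentially the same way; the main difference is that the dynamic partition parameters have to be set.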