1. Loading data into a single partition usually looks like this:
insert into table target_table partition (store_day=20200303)
select column1,column2 from source_table where store_day=20200303;
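With many partitions, however, loading them one at a time is tedious. Each partition runs as its own job, so many jobs have to be executed, which is very inefficient.
2. A batch load can be written like this: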
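To make the repetition concrete, here is a minimal sketch of the one-statement-per-partition approach for two further days. The dates 20200304 and 20200305 are hypothetical and only continue the example above.

```sql
-- Static partition loads: one statement (and one job) per partition.
-- The two dates below are made up for illustration.
insert into table target_table partition (store_day=20200304)
select column1,column2 from source_table where store_day=20200304;

insert into table target_table partition (store_day=20200305)
select column1,column2 from source_table where store_day=20200305;
```

Every additional day needs another copy of the statement with both literals changed.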
set hive.exec.dynamic.partition.mode=nonstrict;
set hive.exec.dynamic.partition=true;
insert into table target_table partition (store_day)
select column1,column2,store_day from source_table where store_day >= 20190101 distribute by store_day;
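A range such as store_day >= 20190101 can produce several hundred daily partitions in one statement, and Hive caps how many dynamic partitions a single statement may create. The sketch below shows the two settings that control those caps. The values are illustrative assumptions, not recommendations, and the defaults may differ between Hive versions, so check them in your environment first.

```sql
-- Optional: raise the dynamic partition limits before a large batch load.
-- The numbers are assumed values for illustration only.
set hive.exec.max.dynamic.partitions=2000;          -- total partitions one statement may create
set hive.exec.max.dynamic.partitions.pernode=500;   -- partitions one mapper/reducer may create
```

If the job fails with an error about too many dynamic partitions, these limits are the first thing to look at.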
3. Differences between the two
set hive.exec.dynamic.partition.mode=nonstrict;
set hive.exec.dynamic.partition=true;
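Both of these settings are required. The select list must also include the partition column, placed as the last column, and the statement ends with distribute by on that column.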
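After a batch load, it can help to confirm that the expected partitions exist. The following is a rough sketch, assuming target_table is partitioned by an integer store_day column. The DDL and its column types are hypothetical, since the original table definition is not shown.

```sql
-- Hypothetical table definition; the column types are assumptions.
create table if not exists target_table (
  column1 string,
  column2 string
)
partitioned by (store_day int);

-- List the partitions created by the dynamic partition insert.
show partitions target_table;

-- Row counts per partition, for comparison with source_table.
select store_day, count(*) as row_count
from target_table
where store_day >= 20190101
group by store_day;
```

Running the same count query against source_table gives a quick check that no partition was dropped or only partially loaded.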