1)不显示的给出分区名,根据列的取值自动建立对应分区(多少种取值,多少种分区),所以需要限制最大分区数:
SET hive.exec.dynamic.partition=true;
SET hive.exec.max.dynamic.partitions.pernode=1000;
SET hive.exec.max.dynamic.partitions=3000;
2)分区全部使用动态分区,还要设置为nonstrict模式,否则无法执行
set hive.exec.dynamic.partition.mode=nonstrict;
3)动态分区按位置来对应,跟名称无关。所以查询select中必须在最后字段按动态分区顺序给出对应字段
insert overwrite table tb_pmp_raw_log_analysis_count partition (day='2016-05-17', media,type)
select advertiser_id,ad_plan_id,crt_id,hour,ad_place_id,city_bidrequest,device,network,os,category,channel,ad_type,rtb_type,price,count(1) as cnt,media, 1as type
from (
select
split(all,'\\\\|~\\\\|')[41] as advertiser_id,
split(all,'\\\\|~\\\\|')[10] as ad_plan_id,
split(all,'\\\\|~\\\\|')[11] as crt_id,
substr(split(all,'\\\\|~\\\\|')[0],12,2) ashour,
ca