The problem we ran into
In the hive shell we ran
insert overwrite table table_p partition(time) select id,time from table_info;
to load data with dynamic partitioning, and the job failed during the reduce phase with:
Caused by: org.apache.hadoop.hive.ql.metadata.HiveFatalException: [Error 20004]: Fatal error occurred when node tried to create too many dynamic partitions. The maximum number of dynamic partitions is controlled by hive.exec.max.dynamic.partitions and hive.exec.max.dynamic.partitions.pernode. Maximum was set to: 100
    at org.apache.hadoop.hive.ql.exec.FileSinkOperator.getDynOutPaths(FileSinkOperator.java:897)
    at org.apache.hadoop.hive.ql.exec.FileSinkOperator.processOp(FileSinkOperator.java:677)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:815)
    at org.apache.hadoop.hive.ql.exec.ExtractOperator.processOp(ExtractOperator.java:45)
    at org.apache.hadoop.hive.ql.exec.mr.ExecReducer.reduce(ExecReducer.java:244)
    ... 7 more
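For context, a minimal sketch of what the two tables might look like; the DDL below is an assumption, only the table and column names come from the statement above:

-- hypothetical schemas inferred from the insert statement
create table table_info (id bigint, time string);
create table table_p (id bigint) partitioned by (time string);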
Cause
The failure comes from the default limits in the hive shell:

hive.exec.max.dynamic.partitions.pernode (default 100): the maximum number of dynamic partitions a single mapper/reducer node may create
hive.exec.max.dynamic.partitions (default 1000): the maximum number of dynamic partitions one statement may create in total
hive.exec.max.created.files (default 100000): the maximum number of HDFS files the whole job may create

(There is no parameter named hive.exec.max.dynamic.partitions.pernode.Maximum; that reading comes from the error text, where "Maximum was set to: 100" simply quotes the pernode limit.)

With the per-node limit at 100, a single reducer may create at most 100 partitions in one run, which is nowhere near enough when the partition column has many distinct values.
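Before raising the limits blindly, it helps to know how many partitions the insert will actually create, so the new values can be sized with headroom; time is the partition column from the statement above:

select count(distinct time) from table_info;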
Solution
Check the current values
set hive.exec.max.dynamic.partitions.pernode;
set hive.exec.max.dynamic.partitions;
set hive.exec.max.created.files;
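In the hive shell, set <name>; with no value prints the current setting, so before any changes the first command above would show something like:

hive> set hive.exec.max.dynamic.partitions.pernode;
hive.exec.max.dynamic.partitions.pernode=100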
Set the parameters
set hive.exec.dynamic.partition=true;
set hive.exec.dynamic.partition.mode=nonstrict;
set hive.exec.max.dynamic.partitions.pernode=100000;
set hive.exec.max.dynamic.partitions=100000;
set hive.exec.max.created.files=100000;
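A quick sanity check that the new values took effect in this session:

hive> set hive.exec.max.dynamic.partitions.pernode;
hive.exec.max.dynamic.partitions.pernode=100000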
Once the parameters are set, rerun the insert right away in the same session:
insert overwrite table table_p partition(time) select id,time from table_info;
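If the job still hits the per-node limit, a common mitigation (a standard technique, not from the original run) is to add distribute by on the partition column, so all rows of a given partition are routed to the same reducer and each reducer creates far fewer partitions:

insert overwrite table table_p partition(time)
select id,time from table_info
distribute by time;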
Note
When a long-running insert overwrite fails in the reduce phase with "hive.exec.max.dynamic.partitions.pernode. Maximum was set to: 100", opening another hive shell and running the set commands there does nothing for the running job.
A set command is session-scoped: it only takes effect in the hive shell it was typed in. So you must stop the job, set the parameters in your shell, and then immediately rerun the insert overwrite from that same shell.
You can see this yourself: after running set in one shell, a second hive shell still reports the default values.
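To make the higher limits survive across sessions, set them at the configuration level instead (the values below are illustrative). They can be passed when launching the shell:

hive --hiveconf hive.exec.max.dynamic.partitions.pernode=100000 --hiveconf hive.exec.max.dynamic.partitions=100000

or made permanent in hive-site.xml:

<property>
  <name>hive.exec.max.dynamic.partitions.pernode</name>
  <value>100000</value>
</property>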