A Hive query kept failing with the same error on every run.
Error output:
Task ID:
task_1689196987662_0011_m_000000
URL:
http://hadoop01:8088/taskdetails.jsp?jobid=job_1689196987662_0011&tipid=task_1689196987662_0011_m_000000
-----
Diagnostic Messages for this Task:
Error: Java heap space
2023-07-13T05:56:13,115 INFO [HiveServer2-Background-Pool: Thread-854] impl.YarnClientImpl: Killed application application_1689196987662_0011
2023-07-13T05:56:13,117 INFO [HiveServer2-Background-Pool: Thread-854] reexec.ReOptimizePlugin: ReOptimization: retryPossible: false
2023-07-13T05:56:13,117 ERROR [HiveServer2-Background-Pool: Thread-854] ql.Driver: FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
2023-07-13T05:56:13,117 INFO [HiveServer2-Background-Pool: Thread-854] ql.Driver: MapReduce Jobs Launched:
2023-07-13T05:56:13,117 WARN [HiveServer2-Background-Pool: Thread-854] mapreduce.Counters: Group FileSystemCounters is deprecated. Use org.apache.hadoop.mapreduce.FileSystemCounter instead
2023-07-13T05:56:13,117 INFO [HiveServer2-Background-Pool: Thread-854] ql.Driver: Stage-Stage-1: Map: 1 Reduce: 1 HDFS Read: 0 HDFS Write: 0 FAIL
2023-07-13T05:56:13,117 INFO [HiveServer2-Background-Pool: Thread-854] ql.Driver: Total MapReduce CPU Time Spent: 0 msec
2023-07-13T05:56:13,117 INFO [HiveServer2-Background-Pool: Thread-854] ql.Driver: Completed executing command(queryId=root_20230713055552_9ec6aa7e-e658-423b-bcf2-137bcf1a7a18); Time taken: 20.295 seconds
2023-07-13T05:56:13,117 INFO [HiveServer2-Background-Pool: Thread-854] ql.Driver: Concurrency mode is disabled, not creating a lock manager
2023-07-13T05:56:13,118 ERROR [HiveServer2-Background-Pool: Thread-854] operation.Operation: Error running hive query:
org.apache.hive.service.cli.HiveSQLException: Error while processing statement: FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
at org.apache.hive.service.cli.operation.Operation.toSQLException(Operation.java:335) ~[hive-service-3.1.2.jar:3.1.2]
at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQL....
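Diagnosis: the map task's JVM was not given enough memory, which is what the "Error: Java heap space" diagnostic message means.
Solutions:
Method 1: Local mode
Before running the query, enable Hive's automatic local mode for the session: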
set hive.exec.mode.local.auto=true;
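Automatic local mode only applies to small jobs. Here is a sketch of the related session settings, written to an init file so they can be reused. The file name hive-local-mode.hql is hypothetical, and the two threshold values are what I understand to be Hive's defaults (128 MB of input, 4 input files), so check them against your version:

```shell
# Write the session settings to an init file (hypothetical name).
# Line 1: let Hive decide per query whether to run locally.
# Line 2: local mode is only chosen when total input is below this many bytes.
# Line 3: ... and when the number of input files is below this count.
cat > hive-local-mode.hql <<'EOF'
set hive.exec.mode.local.auto=true;
set hive.exec.mode.local.auto.inputbytes.max=134217728;
set hive.exec.mode.local.auto.input.files.max=4;
EOF
```

The file could then be loaded at session start, for example with Beeline's -i option.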
Method 2: YARN resource allocation
Edit yarn-site.xml to change how much memory YARN allocates to containers:
vim hadoop-3.1.3/etc/hadoop/yarn-site.xml
<property>
<name>yarn.scheduler.maximum-allocation-mb</name>
<value>4096</value>
</property>
<property>
<name>yarn.scheduler.minimum-allocation-mb</name>
<value>4096</value>
</property>
<property>
<name>yarn.nodemanager.vmem-pmem-ratio</name>
<value>2.1</value>
</property>
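<!-- Heap size of each map/reduce task JVM. This is a MapReduce property, normally kept in mapred-site.xml; mapreduce.map.java.opts and mapreduce.reduce.java.opts override it when set. -->
<property>
<name>mapred.child.java.opts</name>
<value>-Xmx1024m</value>
</property>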
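When sizing these values, the task heap (-Xmx) has to fit inside the YARN container, and the virtual memory ceiling is the container size multiplied by yarn.nodemanager.vmem-pmem-ratio. Here is a rough sketch of that arithmetic, assuming a 4096 MB container as configured above. The 80% heap share is a common rule of thumb rather than something from this setup, and the variable names are only illustrative:

```shell
#!/bin/sh
container_mb=4096     # container size, matching the allocation values above
heap_percent=80       # assumed rule of thumb: about 80% of the container for heap
vmem_ratio_x10=21     # yarn.nodemanager.vmem-pmem-ratio of 2.1, scaled by 10 for integer math

# Integer arithmetic truncates fractions, which errs on the safe (smaller) side.
heap_mb=$(( container_mb * heap_percent / 100 ))
vmem_limit_mb=$(( container_mb * vmem_ratio_x10 / 10 ))

echo "largest sensible task heap: -Xmx${heap_mb}m"
echo "virtual memory limit: ${vmem_limit_mb} MB"
```

With -Xmx1024m the heap is well below that ceiling, so there is room to raise it if tasks still run out of heap.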
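Method 3: Hive heap allocation
Opening hive-env.sh shows that the heap allocated to Hive defaults to 256 MB. This is the real root cause of the problem, so the value should be raised in this file.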
vim apache-hive-3.1.3-bin/conf/hive-env.sh
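As a sketch of the change, assuming the heap is controlled through the HADOOP_HEAPSIZE variable (the stock hive-env.sh.template mentions it in a commented-out line) and that 1024 MB suits your workload. Both assumptions should be checked against your installation:

```shell
# hive-env.sh: heap size, in MB, for the JVM started by the hive script.
# 1024 is an illustrative value, not one taken from the original setup.
export HADOOP_HEAPSIZE=1024
```

HiveServer2 has to be restarted to pick up the new value, since the file is only read at startup.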
Virtual memory limit
Add the following property to yarn-site.xml to turn off the NodeManager's virtual memory check:
<property>
<name>yarn.nodemanager.vmem-check-enabled</name>
<value>false</value>
</property>