Out of Memory
Client-side out of memory
The client can run out of memory while executing a SQL statement that looks very simple:
hive> select count(1) from test_tb_1_1;
Query ID = hdfs_20180802104347_615d0836-cf41-475d-9bec-c62a1f408b21
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks determined at compile time: 1
In order to change the average load for a reducer (in bytes):
set hive.exec.reducers.bytes.per.reducer=
In order to limit the maximum number of reducers:
set hive.exec.reducers.max=
In order to set a constant number of reducers:
set mapreduce.job.reduces=
FAILED: Execution Error, return code -101 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask. Java heap space
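Cause:
This statement scans every partition of the whole table. If the table has many partitions and a large volume of data, the client may fail with an out-of-memory error.
Note: a client-side out-of-memory error can be recognized from the log the client prints. The exception is raised before any application id for the job has been printed (an id of the form application_<timestamp>_<sequence>), and no information about the job can be found on the ResourceManager either.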
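The check described in the note can be sketched as a small shell helper. This is a minimal sketch, assuming the client console output has been saved to a file; the function names reached_yarn and classify_failure are hypothetical and are not part of Hive or Hadoop.

```shell
#!/bin/sh
# Hypothetical helpers (not part of Hive): inspect a saved copy of the
# client console output passed as $1.

# Succeeds if the log contains an application id, i.e. the job was
# actually submitted to the ResourceManager.
reached_yarn() {
    grep -Eq 'application_[0-9]+_[0-9]+' "$1"
}

# Prints a rough classification of the failure.
classify_failure() {
    if reached_yarn "$1"; then
        # An application id was printed, so the client got past submission.
        echo "submitted"
    elif grep -q 'Java heap space' "$1"; then
        # Heap error with no application id: the client itself ran out of memory.
        echo "client-oom"
    else
        echo "unknown"
    fi
}
```

For example, running classify_failure on a file holding the output above should print client-oom, since the log ends with "Java heap space" and never mentions an application id.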
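Because this is the client process, its JVM options must be fixed when hive starts and cannot be changed afterwards. Set the environment variable before running the hive command: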
export HIVE_CLIENT_OPTS="-Xmx1536m -XX:MaxDirectMemorySize=512m"
The default is 1.5 GB. Raise it as needed, for example:
export HIVE_CLIENT_OPTS="-Xmx2g -XX:MaxDirectMemorySize=512m"
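Since the value has to be in the environment before the client starts, one option is to set it for a single invocation only. The following is a sketch under assumptions: hive is on the PATH, HIVE_CLIENT_OPTS is the variable this deployment reads (as in the export lines above), and the helper names hive_client_opts and run_hive_with_heap are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: build the option string for a given heap size.
# Defaults to 1536m (1.5 GB); direct memory stays at the 512m used above.
hive_client_opts() {
    heap="${1:-1536m}"
    printf '%s' "-Xmx${heap} -XX:MaxDirectMemorySize=512m"
}

# Hypothetical wrapper: start hive with the given client heap size.
# The first argument is the heap size; the rest are passed to hive.
run_hive_with_heap() {
    heap="$1"
    shift
    # The assignment prefix puts the variable in the environment of this
    # hive invocation.
    HIVE_CLIENT_OPTS="$(hive_client_opts "$heap")" hive "$@"
}
```

Usage might look like: run_hive_with_heap 2g -e "select count(1) from test_tb_1_1;"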
ApplicationMaster-side out of memory
For demonstration, first reduce the AM memory to 512 MB. The client then prints the following error output:
set yarn.app.mapreduce.am.resource.mb=512;
select count(1) from (select num, count(1) from default.test_tb_1_1 where part>180 and part<190 group by num) a;
Query ID = hdfs_20180802155500_744b90db-8a64-460f-a5cc-1428ae61931b
Total jobs = 2
Launching Job 1 out of 2
Number of reduce tasks not specified. Estimated from input data size: 849
In order to change the average load for a reducer (in bytes):
set hive.exec.reducers.bytes.per.reducer=
In order to limit the maximum number of reducers:
set hive.exec.reducers.max=
In order to set a constant number of reducers:
set mapreduce.job.reduces=
Starting Job = job_1533068988045_266336, Tracking URL = http://hadoop-jrsy-rm02.pekdc1.jdfin.local:8088/proxy/application_1533068988045_266336/
Kill Command = /soft/home/hadoop/bin/hadoop job -kill job_1533068988045_266336
Hadoop job information for Stage-1: number of mappers: 0; number of reducers: 0
2018-08-02 15:55:48,910 Stage-1 map = 0%, reduce = 0%
Ended Job = job_1533068988045_266336 with errors
Error during job, obtaining debugging information...
FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
MapReduce Jobs Launched:
Stage-Stage-1: HDFS Read: 0 HDFS Write: 0 FAIL
Total MapReduce CPU Time Spent: 0 msec
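The output contains an application id, so the job was submitted to the ResourceManager and the failure was not caused by the client. However, no container failure is reported: the job shows 0 mappers and 0 reducers and fails right after 0%. A first guess is therefore that the ApplicationMaster ran out of memory.
To confirm this, either of two approaches works:
1) Check the application's run information in the ResourceManager UI, which shows the following error: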
Current usage: 348.2 MB of 512 MB physical memory used; 5.2 GB of 1.5 GB virtual memory used. Killing container.
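This shows that the container was killed for exceeding its memory limit (virtual memory usage of 5.2 GB against a 1.5 GB limit). The diagnostics also report:
Application application_1533068988045_266336 failed 2 times due to AM Container for appattempt_1533068988045_266336_000002 exited with exitCode: -103
This confirms that the ApplicationMaster is the component that ran out of memory.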
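Reading the exit code out of the diagnostics message can be sketched as follows. The helper names am_exit_code and explain_exit_code are hypothetical. The two mapped codes follow YARN's ContainerExitStatus constants (-103 for a virtual memory limit kill, -104 for a physical memory limit kill); any other code is passed through unchanged.

```shell
#!/bin/sh
# Hypothetical helper: extract the numeric exit code from a YARN
# diagnostics message passed as $1. Prints nothing if no code is found.
am_exit_code() {
    printf '%s\n' "$1" |
        sed -n 's/.*exited with *exitCode: *\(-\{0,1\}[0-9][0-9]*\).*/\1/p'
}

# Hypothetical helper: describe the memory-related exit codes.
explain_exit_code() {
    case "$1" in
        -103) echo "killed: virtual memory limit exceeded" ;;
        -104) echo "killed: physical memory limit exceeded" ;;
        *)    echo "other exit code: $1" ;;
    esac
}
```

Applied to the diagnostics above, am_exit_code yields -103, which matches the virtual memory figures in the "Killing container" line.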