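References
ERROR: is running beyond physical memory limits.
Tuning YARN
Yarn下Mapreduce的内存参数理解 (Understanding MapReduce memory parameters under YARN)
Yarn下Mapreduce的内存参数理解&xml参数配置 (Understanding MapReduce memory parameters under YARN, with XML parameter configuration)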
Background
I ran into the following problems while running jobs with Hadoop's streaming jar.
Problem 1:
18/10/13 19:40:56 INFO input.FileInputFormat: Total input files to process : 701930
18/10/13 20:04:22 INFO retry.RetryInvocationHandler: java.io.IOException: com.google.protobuf.ServiceException: java.lang.OutOfMemoryError: GC overhead limit exceeded, while invoking ClientNamenodeProtocolTranslatorPB.getBlockLocations over 2.master.mz/192.168.10.224:8020. Trying to failover immediately.
18/10/13 20:05:04 INFO mapreduce.JobSubmitter: Cleaning up the staging area /user/admonitor/.staging/job_1539157945372_30633
Exception in thread "main" java.lang.OutOfMemoryError: GC overhead limit exceeded
at java.lang.String.substring(String.java:1933)
at java.util.Formatter.parse(Formatter.java:2567)
at java.util.Formatter.format(Formatter.java:2501)
at java.util.Formatter.format(Formatter.java:2455)
at java.lang.String.format(String.java:2940)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:471)
at org.apache.hadoop.fs.Path.getFileSystem(Path.java:361)
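The client JVM ran out of memory (GC overhead limit exceeded). The job reads a very large number of small files (701,930 inputs), and the error is raised during job submission, while the client is calling getBlockLocations.
Fix: raise the client heap through HADOOP_CLIENT_OPTS. Options set in this variable apply to several client commands, such as fs, dfs, fsck, and distcp.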
HADOOP_CLIENT_OPTS="-Xmx8192M" hadoop jar $stream_jar ...
Problem 2:
Container [pid=100823,containerID=container_e39_1539157945372_36692_01_000527] is running 108359680B beyond the 'PHYSICAL' memory limit. Current usage: 1.1 GB of 1 GB physical memory used; 3.1 GB of 2.1 GB virtual memory used. Killing container.
The container exceeded both its physical and its virtual memory limits.
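The figures in that message can be checked by hand. The sketch below reproduces them with shell integer arithmetic. The input values are copied from the log line, and the 2.1 ratio is assumed to be the default yarn.nodemanager.vmem-pmem-ratio rather than a value read from this cluster's configuration.

```shell
# Values copied from the container log line above.
limit_mb=1024          # physical limit: "1 GB physical memory"
over_bytes=108359680   # "running 108359680B beyond the 'PHYSICAL' memory limit"
ratio_tenths=21        # assumed vmem-pmem ratio of 2.1, scaled by 10 for integer math

# Overshoot in whole MB (integer division truncates).
over_mb=$(( over_bytes / 1024 / 1024 ))
# Physical memory in use when the container was killed.
used_mb=$(( limit_mb + over_mb ))
# Virtual memory ceiling = physical limit x vmem-pmem ratio.
vmem_limit_mb=$(( limit_mb * ratio_tenths / 10 ))

echo "overshoot:     ${over_mb} MB"
echo "physical used: ${used_mb} MB of ${limit_mb} MB"
echo "virtual limit: ${vmem_limit_mb} MB"
```

Roughly 1127 MB corresponds to the "1.1 GB of 1 GB" in the log, and 2150 MB is the 2.1 GB virtual ceiling, which the container had also passed at 3.1 GB.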
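Fix: give the tasks more memory. First determine whether the map phase or the reduce phase ran out of memory, then raise the matching settings.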
-Dmapreduce.map.memory.mb=8192 \
-Dmapreduce.map.java.opts=-Xmx7168M \
-Dmapreduce.reduce.memory.mb=4096 \
-Dmapreduce.reduce.java.opts=-Xmx3072M \
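One way to tell which side failed is the attempt ID of the failed task, which carries `_m_` for map attempts and `_r_` for reduce attempts. A minimal sketch follows. `task_type` is a helper name made up here, and the attempt IDs are placeholders, not values taken from the job above.

```shell
# Print "map" or "reduce" for a MapReduce task attempt ID.
task_type() {
  case "$1" in
    attempt_*_m_*) echo "map" ;;
    attempt_*_r_*) echo "reduce" ;;
    *)             echo "unknown" ;;
  esac
}

# Placeholder attempt IDs, for illustration only.
task_type "attempt_0000000000000_0001_m_000000_0"
task_type "attempt_0000000000000_0001_r_000000_0"
```

If the failing attempts are all map attempts, only the two map settings need to grow. Raising the reduce settings as well just reserves memory the job does not use.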
Memory parameter configuration under YARN
Parameter descriptions
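Name | Default | Description |
---|---|---|
yarn.nodemanager.resource.memory-mb | 8192 MB | Amount of physical memory, in MB, that can be allocated for containers. If set to -1 and yarn.nodemanager.resource.detect-hardware-capabilities is true, it is automatically calculated (in case of Windows and Linux). In other cases, the default is 8192 MB. |
yarn.nodemanager.vmem-pmem-ratio | 2.1 | Ratio of virtual memory to physical memory allowed for a container. By default a task may use up to 2.1 times its physical memory allocation in virtual memory. |
yarn.scheduler.minimum-allocation-mb | 1024 MB | Minimum memory that a single container can request. |
yarn.scheduler.maximum-allocation-mb | 8192 MB | Maximum memory that a single container can request. |
mapreduce.map.memory.mb | | Container size for each map task. |
mapreduce.reduce.memory.mb | | Container size for each reduce task. |
mapreduce.map.java.opts | | JVM options for the JVM launched inside the map container. The heap (-Xmx) must be smaller than mapreduce.map.memory.mb and is usually set to 0.75 times that value. |
mapreduce.reduce.java.opts | | JVM options for the JVM launched inside the reduce container. The heap (-Xmx) must be smaller than mapreduce.reduce.memory.mb and is usually set to 0.75 times that value. |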
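The 0.75 rule of thumb can be applied mechanically. The sketch below derives the heap sizes from the container sizes and prints the four -D options. `heap_mb` and `mr_memory_flags` are helper names invented for this example, and the 75% factor is the rule of thumb from the table, not a value Hadoop enforces.

```shell
# Heap size in MB for a container of $1 MB, at 75% (integer arithmetic).
heap_mb() {
  echo $(( $1 * 3 / 4 ))
}

# Print the four -D options for the given map and reduce container sizes (MB).
mr_memory_flags() {
  map_mb=$1
  reduce_mb=$2
  printf '%s\n' "-Dmapreduce.map.memory.mb=${map_mb} -Dmapreduce.map.java.opts=-Xmx$(heap_mb "$map_mb")M -Dmapreduce.reduce.memory.mb=${reduce_mb} -Dmapreduce.reduce.java.opts=-Xmx$(heap_mb "$reduce_mb")M"
}

# Example: 8 GB map containers and 4 GB reduce containers.
mr_memory_flags 8192 4096
```

For the reduce side this gives the same -Xmx3072M used earlier. For the map side it gives -Xmx6144M, somewhat below the -Xmx7168M used earlier, which corresponds to a factor of 0.875.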