Running a Spark job on YARN produced the following errors:
scala> 18/11/21 16:20:11 ERROR cluster.YarnClientSchedulerBackend: Yarn application has already exited with state FINISHED!
18/11/21 16:20:11 ERROR client.TransportClient: Failed to send RPC 5346982634168622865 to /192.168.88.155:58312: java.nio.channels.ClosedChannelException
java.nio.channels.ClosedChannelException
at io.netty.channel.AbstractChannel$AbstractUnsafe.write(...)(Unknown Source)
18/11/21 16:20:11 ERROR cluster.YarnSchedulerBackend$YarnSchedulerEndpoint: Sending RequestExecutors(0,0,Map(),Set()) to AM was unsuccessful
java.io.IOException: Failed to send RPC 5346982634168622865 to /192.168.88.155:58312: java.nio.channels.ClosedChannelException
at org.apache.spark.network.client.TransportClient.lambda$sendRpc$2(TransportClient.java:237)
1. Since this is Spark on YARN, first check the ResourceManager log:
2018-11-21 16:20:12,048 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: Null container completed...
2018-11-21 16:20:14,714 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: Null container completed...
This indicates that something went wrong with a container, but the specific cause is not yet known.
2. Check the NodeManager log:
2018-11-21 16:19:49,777 WARN org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl: Container [pid=7203,containerID=container_1542787555988_0001_01_000001] is running beyond virtual memory limits. Current usage: 173.8 MB of 1 GB physical memory used; 2.3 GB of 2.1 GB virtual memory used. Killing container.
The log makes the cause clear: the container used 2.3 GB of virtual memory, which exceeds its 2.1 GB limit.
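To pull the numbers out of such a line programmatically, the sketch below parses the "Current usage" part of the NodeManager message shown above. The regular expression and the helper name `parse_usage` are illustrative assumptions based only on this one log line; other Hadoop versions may word the message differently.

```python
import re

# The NodeManager log message from above (timestamp and class name omitted).
LOG_LINE = (
    "Container [pid=7203,containerID=container_1542787555988_0001_01_000001] "
    "is running beyond virtual memory limits. Current usage: 173.8 MB of 1 GB "
    "physical memory used; 2.3 GB of 2.1 GB virtual memory used. Killing container."
)

# Multipliers to convert the units that appear in the message into megabytes.
UNIT_TO_MB = {"MB": 1.0, "GB": 1024.0}

# Hypothetical pattern, written to match the wording of the line above.
PATTERN = re.compile(
    r"(?P<p_used>[\d.]+) (?P<p_used_u>MB|GB) of "
    r"(?P<p_lim>[\d.]+) (?P<p_lim_u>MB|GB) physical memory used; "
    r"(?P<v_used>[\d.]+) (?P<v_used_u>MB|GB) of "
    r"(?P<v_lim>[\d.]+) (?P<v_lim_u>MB|GB) virtual memory used"
)


def parse_usage(line):
    """Return used/limit figures in MB, or None if the line does not match."""
    m = PATTERN.search(line)
    if m is None:
        return None

    def to_mb(value_group, unit_group):
        return float(m.group(value_group)) * UNIT_TO_MB[m.group(unit_group)]

    return {
        "physical_used_mb": to_mb("p_used", "p_used_u"),
        "physical_limit_mb": to_mb("p_lim", "p_lim_u"),
        "virtual_used_mb": to_mb("v_used", "v_used_u"),
        "virtual_limit_mb": to_mb("v_lim", "v_lim_u"),
    }


usage = parse_usage(LOG_LINE)
print(usage)
print("over virtual limit:", usage["virtual_used_mb"] > usage["virtual_limit_mb"])
```

Physical memory is well within its limit here; only the virtual memory check fails, which is what points at the vmem settings rather than at the job's actual memory footprint.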
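The virtual memory limit of a container is its allocated physical memory multiplied by yarn.nodemanager.vmem-pmem-ratio. Here the container was allocated the minimum size (the log shows a 1 GB physical limit), so the limit works out to:
virtual memory limit = yarn.scheduler.minimum-allocation-mb * yarn.nodemanager.vmem-pmem-ratio
Both parameters are configured in yarn-site.xml. If a container's virtual memory usage exceeds the value given by this formula, the NodeManager kills the container.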
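In my case yarn.scheduler.minimum-allocation-mb was not set, so it defaulted to 1 GB (1024 MB), and yarn.nodemanager.vmem-pmem-ratio was not set either, so it defaulted to 2.1. That is why the log reports 173.8 MB of 1 GB physical memory used and 2.3 GB of 2.1 GB virtual memory used.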
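The arithmetic can be checked with a few lines of Python. This is a minimal sketch that only mirrors the formula in the text; the function names are made up for illustration and are not part of YARN.

```python
# Default values quoted in the text (neither property was set in yarn-site.xml).
MIN_ALLOCATION_MB = 1024   # yarn.scheduler.minimum-allocation-mb
VMEM_PMEM_RATIO = 2.1      # yarn.nodemanager.vmem-pmem-ratio


def vmem_limit_mb(container_mb, ratio):
    """Virtual memory limit in MB for a container of the given physical size."""
    return container_mb * ratio


def exceeds_vmem_limit(used_vmem_mb, container_mb, ratio):
    """True if the usage is above the limit, i.e. the container would be killed."""
    return used_vmem_mb > vmem_limit_mb(container_mb, ratio)


limit = vmem_limit_mb(MIN_ALLOCATION_MB, VMEM_PMEM_RATIO)
used = 2.3 * 1024  # the 2.3 GB reported in the NodeManager log, in MB

print(f"limit = {limit:.1f} MB, used = {used:.1f} MB")
print("container would be killed:",
      exceeds_vmem_limit(used, MIN_ALLOCATION_MB, VMEM_PMEM_RATIO))
```

With the defaults the limit is about 2150 MB, which the log rounds to 2.1 GB, and the reported 2.3 GB of usage is above it.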
I then changed the following settings in yarn-site.xml:
<property>
<name>yarn.scheduler.maximum-allocation-mb</name>
<value>9000</value>
<description>Maximum memory available to each task; the default is 8192 MB</description>
</property>
<property>
<name>yarn.scheduler.minimum-allocation-mb</name>
<value>3072</value>
<description>Minimum memory available to each task</description>
</property>
<property>
<name>yarn.nodemanager.vmem-pmem-ratio</name>
<value>3</value>
<description>Ratio of virtual memory to physical memory</description>
</property>
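As a sanity check on the new values, the sketch below reads the three properties from an XML string and recomputes the limit with the same formula. The `<configuration>` wrapper and the helper `read_properties` are assumptions added for illustration, and the calculation assumes the container is again allocated exactly the minimum size.

```python
import xml.etree.ElementTree as ET

# The settings from above, wrapped in a <configuration> root for parsing.
YARN_SITE = """
<configuration>
  <property>
    <name>yarn.scheduler.maximum-allocation-mb</name>
    <value>9000</value>
  </property>
  <property>
    <name>yarn.scheduler.minimum-allocation-mb</name>
    <value>3072</value>
  </property>
  <property>
    <name>yarn.nodemanager.vmem-pmem-ratio</name>
    <value>3</value>
  </property>
</configuration>
"""


def read_properties(xml_text):
    """Collect name/value pairs from a Hadoop-style configuration document."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.findall("property"):
        name = prop.findtext("name")
        value = prop.findtext("value")
        if name is not None and value is not None:
            props[name.strip()] = value.strip()
    return props


props = read_properties(YARN_SITE)
min_mb = int(props["yarn.scheduler.minimum-allocation-mb"])
ratio = float(props["yarn.nodemanager.vmem-pmem-ratio"])

# Same formula as before: minimum allocation times the vmem/pmem ratio.
new_limit_mb = min_mb * ratio
used_mb = 2.3 * 1024  # usage reported in the original NodeManager log

print(f"new virtual memory limit: {new_limit_mb:.0f} MB")
print("usage from the log fits under the new limit:", used_mb < new_limit_mb)
```

Under these assumptions the limit rises from roughly 2.1 GB to 9 GB (3072 MB times 3), comfortably above the 2.3 GB the container was using when it was killed.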