Analyzing a Failed Spark Job

A Spark job failed. When it was re-run, the Spark UI showed a large number of failed tasks. After the run ended, the YARN UI showed the following error log:

User class threw exception: org.apache.spark.SparkException: Job aborted due to stage failure: 
ShuffleMapStage 1 (javaRDD at SumDeliveryIndexFactory.java:628) has failed the maximum allowable 
number of times: 4. Most recent failure reason: org.apache.spark.shuffle.FetchFailedException: 
Failed to connect to xxxx/10.136.22.22:34192

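The error message "Failed to connect to xxxx/10.136.22.22:34192" shows that the connection to 10.136.22.22 failed. Logging in to host 10.136.22.22 and checking the NodeManager log turned up this error: "running beyond physical memory limits. Killing container". In other words, the container used more physical memory than its allocated size, so YARN killed it. Once the executor in that container was gone, other tasks could no longer fetch its shuffle output, which is what surfaced as the FetchFailedException.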

Fix: add a parameter to spark-submit that raises the executor memory overhead, here spark.yarn.executor.memoryOverhead=4G.
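To see why raising the overhead helps, it can be useful to work out how large a container Spark asks YARN for. The sketch below assumes the documented Spark-on-YARN default, where the overhead is the larger of 384 MiB and 10% of the executor memory, and it ignores YARN's rounding of requests to its allocation increment. The 8 GiB executor size and the helper names `container_request_mib` and `overhead_conf` are hypothetical, chosen only for illustration; they are not taken from the original job.

```python
# Sketch: estimate the YARN container size requested for one Spark executor.
# Assumption: default overhead = max(384 MiB, 10% of executor memory).
# YARN's rounding to its allocation increment is deliberately ignored here.
MIN_OVERHEAD_MIB = 384
OVERHEAD_FACTOR = 0.10


def container_request_mib(executor_memory_mib, overhead_mib=None):
    """Return executor heap plus off-heap overhead, in MiB."""
    if overhead_mib is None:
        # No explicit setting: fall back to the assumed default rule.
        overhead_mib = max(MIN_OVERHEAD_MIB,
                           int(executor_memory_mib * OVERHEAD_FACTOR))
    return executor_memory_mib + overhead_mib


def overhead_conf(overhead_mib):
    """Build the spark-submit arguments that set the overhead (value in MiB)."""
    return ["--conf", f"spark.yarn.executor.memoryOverhead={overhead_mib}"]


# Hypothetical executor with an 8 GiB heap.
print(container_request_mib(8192))        # default overhead
print(container_request_mib(8192, 4096))  # overhead raised to 4 GiB
print(" ".join(overhead_conf(4096)))
```

The physical memory limit that the NodeManager enforces applies to the whole container, so native and off-heap allocations have to fit inside the overhead portion. Note that from Spark 2.3 onward the property is named spark.executor.memoryOverhead, with the older name kept as a deprecated alias.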

Error log:

2017-11-14 11:33:07,273 INFO 
org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl: Memory 
usage of ProcessTree 236569 for container-id container_e31_1510205192678_147416_02_000024: 
40.7 GB of 40 GB physical memory used; 41.9 GB of 84 GB virtual memory used
2017-11-14 11:33:07,273 WARN 
org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl: Process 
tree for container: container_e31_1510205192678_147416_02_000024 has processes older than 1 
iteration running over the configured limit. Limit=42949672960, current usage = 43653300224
2017-11-14 11:33:07,274 WARN 
org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl: 
Container [pid=236569,containerID=container_e31_1510205192678_147416_02_000024] is running beyond 
physical memory limits. Current usage: 40.7 GB of 40 GB physical memory used; 41.9 GB of 84 GB 
virtual memory used. Killing container.    
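The WARN line above reports the limit and the current usage in bytes, which makes it easy to check how far over the limit the container actually was. A minimal sketch, assuming the log line has the "Limit=..., current usage = ..." shape shown here; the function name `parse_overshoot` is hypothetical:

```python
import re

# The ContainersMonitorImpl warning from the log above, joined onto one line.
LOG_LINE = (
    "Process tree for container: container_e31_1510205192678_147416_02_000024 "
    "has processes older than 1 iteration running over the configured limit. "
    "Limit=42949672960, current usage = 43653300224"
)

# Both values are reported in bytes.
PATTERN = re.compile(r"Limit=(\d+), current usage = (\d+)")


def parse_overshoot(line):
    """Return (limit, usage, usage - limit) in bytes, or None if no match."""
    match = PATTERN.search(line)
    if match is None:
        return None
    limit, usage = int(match.group(1)), int(match.group(2))
    return limit, usage, usage - limit


limit, usage, over = parse_overshoot(LOG_LINE)
print(f"limit:   {limit / 1024**3:.1f} GiB")
print(f"usage:   {usage / 1024**3:.1f} GiB")
print(f"over by: {over / 1024**2:.0f} MiB")
```

For this log the limit is exactly 40 GiB and the container exceeded it by roughly 671 MiB, so the overshoot was small relative to the container size but still enough for the NodeManager to kill it.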

Reference blog posts:

1. Memory allocation for Spark executors on YARN
	http://blog.csdn.net/hammertank/article/details/48346285

2. Resolving the YARN "is running beyond physical memory limits" problem
	http://blog.csdn.net/oaimm/article/details/25298691

3. A brief introduction to YARN and its memory configuration
	http://blog.chinaunix.net/uid-28311809-id-4383551.html

 

Reposted from: https://my.oschina.net/sniperLi/blog/1574280
