A Spark job intermittently fails with the following error:
===
19/11/03 07:40:27 ERROR YarnClusterScheduler: Lost executor 28 on ip-10-19-201-115.ec2.internal: Container killed by YARN for exceeding memory limits. 24.0 GB of 24 GB physical memory used. Consider boosting spark.yarn.executor.memoryOverhead or disabling yarn.nodemanager.vmem-check-enabled because of YARN-4714.
19/11/03 07:40:27 WARN TaskSetManager: Lost task 1124.0 in stage 45.0 (TID 36556, ip-10-19-201-115.ec2.internal, executor 28): ExecutorLostFailure (executor 28 exited caused by one of the running tasks) Reason: Container killed by YARN for exceeding memory limits. 24.0 GB of 24 GB physical memory used. Consider boosting spark.yarn.executor.memoryOverhead or disabling yarn.nodemanager.vmem-check-enabled because of YARN-4714.
===
Solution: the AWS engineers suggested the following options:
a.) Increase memory overhead
b.) Reduce the number of executor cores
c.) Increase the number of partitions
d.) Increase driver and executor memory
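To see why option a.) matters, it helps to know how YARN sizes an executor container: it reserves the JVM heap (`--executor-memory`) plus the off-heap overhead (`spark.executor.memoryOverhead`, which Spark documents as defaulting to max(384 MB, 10% of executor memory) when unset). A minimal sketch of that arithmetic, using the numbers from the failing job:

```python
def executor_container_mb(executor_memory_mb, memory_overhead_mb=None):
    """Total memory YARN reserves for one executor container, in MB.

    If the overhead is not set explicitly, Spark's documented default is
    max(384 MB, 10% of executor memory).
    """
    if memory_overhead_mb is None:
        memory_overhead_mb = max(384, int(executor_memory_mb * 0.10))
    return executor_memory_mb + memory_overhead_mb

# The failing job: 20 GB heap + 4096 MB explicit overhead = 24 GB,
# exactly the limit YARN reported ("24.0 GB of 24 GB physical memory used").
print(executor_container_mb(20 * 1024, 4096))  # 24576 MB = 24 GB
```

When a task's off-heap usage (shuffle buffers, netty, Python workers, etc.) pushes past that overhead allowance, YARN kills the whole container, which is exactly the error above.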
Based on these suggestions, we adjusted our spark-submit command as follows:
Before:
spark-submit --deploy-mode cluster --master yarn --driver-memory 10G --executor-memory 20G --conf spark.executor.memoryOverhead=4096 --conf spark.hadoop.mapreduce.fileoutputcommitter.algorithm.version=2 --conf spark.network.timeout=300s --conf spark.executor.heartbeatInterval=100s --conf spark.driver.maxResultSize=3G --executor-cores 14 --conf spark.default.parallelism=800 ........
After:
spark-submit --deploy-mode cluster --master yarn --driver-memory 10G --executor-memory 19G --conf spark.executor.memoryOverhead=5120 --conf spark.hadoop.mapreduce.fileoutputcommitter.algorithm.version=2 --conf spark.network.timeout=300s --conf spark.executor.heartbeatInterval=100s --conf spark.driver.maxResultSize=3G --executor-cores 14 --conf spark.default.parallelism=800 ........
Changed parameters:
--executor-memory 19G --conf spark.executor.memoryOverhead=5120
With these two parameters adjusted, the job should hopefully be fine. Since the failure is intermittent, we need to watch it for a few days; if the error recurs, we will reduce the number of executor cores and increase the number of partitions.
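A quick sanity check on the change above, plus the fallback plan: the adjustment keeps the container at the same 24 GB total but shifts 1 GB from heap into off-heap overhead; cutting cores would further raise the overhead headroom available to each concurrently running task. This is an illustrative sketch only (the 10-core figure is a hypothetical fallback, not a tested setting):

```python
def overhead_per_task_mb(overhead_mb, cores):
    # All tasks running concurrently on an executor share its off-heap
    # overhead, so fewer cores means more overhead headroom per task.
    return overhead_mb / cores

before = 20 * 1024 + 4096   # 24576 MB: 20 GB heap + 4 GB overhead
after  = 19 * 1024 + 5120   # 24576 MB: same container size, 1 GB moved
                            # from heap into off-heap overhead

print(before == after)                 # container still fits the 24 GB limit
print(overhead_per_task_mb(5120, 14))  # current: ~366 MB per concurrent task
print(overhead_per_task_mb(5120, 10))  # hypothetical fewer-core fallback
```

Increasing partitions works on the other axis: smaller partitions mean each task buffers less data at once, so per-task memory pressure drops even with the same core count.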