Spark worker runs out of memory and the job fails with "Likely due to containers exceeding thresholds, or network issues"

Latest recommended article published 2024-05-24 08:42:36

春_

Category: Bugs I have run into. Tags: spark, memory

Article link: https://blog.csdn.net/weixin_43736084/article/details/124589893

Error:

Lost executor 33 on xx.xx.xx.152: Remote RPC client disassociated. Likely due to containers exceeding thresholds, or network issues. Check driver logs for WARN messages.

Cause:

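One Spark node did not have enough available memory, which caused the whole job to fail. The error message above can be found in the execution log.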
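As a rough sketch of how to locate the message, the helper below greps a log file for "Lost executor" lines and prints them with line numbers. The function name `find_lost_executors` is made up for illustration, and the log path is whatever file your `spark-submit` output was redirected to.

```shell
# Print every "Lost executor" line in a log file, with its line number.
# find_lost_executors is a hypothetical helper name, not part of Spark.
find_lost_executors() {
  log_path=$1
  grep -n "Lost executor" "$log_path"
}

# Example call, assuming the job output was appended to log/$log_file:
# find_lost_executors "log/$log_file"
```

If nothing is printed, the executor loss may have been logged elsewhere, for example in the worker's own logs.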

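In my case the most likely explanation is that, after several jobs had been submitted, their combined memory usage exceeded the memory available to Spark. One of the jobs requests 45g on its own, while Spark was configured to hand out at most 80G, so once other jobs were running alongside it the total went over the limit and the job failed.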
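The arithmetic can be checked with a small sketch like the one below. The 80G limit and the 45g request come from this post; the second 45g request is only an assumed example of another job running at the same time, and `check_budget` is a hypothetical helper name.

```shell
# Sum the memory requested by concurrent jobs (in GB) and compare it
# with the amount the worker is allowed to hand out.
# Usage: check_budget <limit_gb> <request_gb>...
check_budget() {
  limit_gb=$1
  shift
  total_gb=0
  for req_gb in "$@"; do
    total_gb=$((total_gb + req_gb))
  done
  if [ "$total_gb" -gt "$limit_gb" ]; then
    echo "over budget: requested ${total_gb}G > available ${limit_gb}G"
  else
    echo "fits: requested ${total_gb}G <= available ${limit_gb}G"
  fi
}

# One 45g job alone fits into 80G.
check_budget 80 45
# Two such jobs together do not (the second 45 is an assumed value).
check_budget 80 45 45
```

This only models the requested executor memory; off-heap settings and JVM overhead can push the real footprint higher.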

Solution:

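Increase the SPARK_WORKER_MEMORY parameter in spark-env.sh. Keep the server's physical memory in mind when doing so: on my machine 156g was still available and Spark was only using 80, so I simply raised the value.
If the server does not have memory to spare, lower the executor-memory requested at submit time instead, so that the jobs fit into what is available.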
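A minimal sketch of choosing a new value is shown below. The 156g figure is the one from this post; the 36g reserve for the operating system and other processes is purely an assumption, and `suggest_worker_mem` is a hypothetical helper name. It prints the line you would place in conf/spark-env.sh on the worker node.

```shell
# Print a SPARK_WORKER_MEMORY setting that leaves some headroom.
# Usage: suggest_worker_mem <usable_gb> <reserve_gb>
suggest_worker_mem() {
  usable_gb=$1    # memory on the node that Spark could use, in GB
  reserve_gb=$2   # headroom for the OS and other processes (assumed)
  if [ "$reserve_gb" -ge "$usable_gb" ]; then
    echo "reserve must be smaller than usable memory" >&2
    return 1
  fi
  echo "export SPARK_WORKER_MEMORY=$((usable_gb - reserve_gb))g"
}

# With the 156g from this post and an assumed 36g reserve:
suggest_worker_mem 156 36
# prints: export SPARK_WORKER_MEMORY=120g
```

The worker reads spark-env.sh when it starts, so it will most likely need a restart before the new value takes effect.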

spark-submit \
  --master spark://xx.xx.xx.xx:7077 \
  --class $main \
  --deploy-mode client \
  --driver-memory $driver_mem \
  --executor-memory $exec_mem \
  --executor-cores $exec_cores \
  --total-executor-cores $total_core \
  --conf spark.driver.maxResultSize=0 \
  --conf spark.memory.fraction=0.7 \
  --conf spark.memory.storageFraction=$storageFraction \
  --conf spark.memory.offHeap.enabled=true \
  --conf spark.memory.offHeap.size=5g \
  --conf spark.executor.memoryOverhead=5G \
  --conf spark.speculation=true \
  --conf spark.network.timeout=3000 \
  --conf spark.executor.extraJavaOptions="-XX:+UseG1GC -XX:-TieredCompilation -XX:G1HeapRegionSize=16m -XX:InitiatingHeapOccupancyPercent=55 -XX:SoftRefLRUPolicyMSPerMB=0 -XX:-UseCompressedClassPointers -XX:MetaspaceSize=256m -XX:MaxMetaspaceSize=256m -XX:ReservedCodeCacheSize=512m -XX:+UseCodeCacheFlushing -XX:ParallelGCThreads=20 -XX:ConcGCThreads=20 -Xms20g -XX:+PrintGCDetails -XX:+PrintGCTimeStamps" \
  --conf spark.driver.extraJavaOptions="-XX:+UseG1GC" \
  --jars $jars \
  xxx-1.0.jar $date1 $max $date2 >> log/$log_file
