Spark runtime memory exceptions and parameter tuning

Latest recommended article published on 2024-04-28 07:59:10

dub_lys

Tags: spark, memory

Article link: https://blog.csdn.net/dub_lys/article/details/75442406

Main exception message: org.apache.spark.shuffle.MetadataFetchFailedException: Missing an output location for shuffle 0

Fix: increase executor memory, reduce the number of executors, and increase executor parallelism.

Main exception message: ExecutorLostFailure (executor 3 exited caused by one of the running tasks) Reason: Container killed by YARN for exceeding memory limits. 61.0 GB of 61 GB physical memory used

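Fix: remove the RDD caching operations, increase spark.storage.memoryFraction for this job, and increase spark.yarn.executor.memoryOverhead for this job. (Both names are from older Spark releases: since Spark 1.6, spark.storage.memoryFraction is only read in legacy memory mode, and since Spark 2.3 the overhead setting is called spark.executor.memoryOverhead.)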
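To see why raising the overhead helps with the "Container killed by YARN" error, here is a minimal sketch of how the container request is sized. It assumes the documented default for the overhead setting, max(384 MB, 10% of executor memory); the function names and the 4 GB heap are made up for illustration, and any rounding YARN applies to container sizes is not modeled.

```python
# Sketch: memory YARN is asked to grant for one executor container.
# Assumption: the overhead defaults to max(384 MB, 10% of executor memory),
# as documented for spark.yarn.executor.memoryOverhead.

MIN_OVERHEAD_MB = 384
OVERHEAD_FACTOR = 0.10


def default_overhead_mb(executor_memory_mb):
    """Off-heap allowance added on top of the executor heap."""
    return max(MIN_OVERHEAD_MB, int(executor_memory_mb * OVERHEAD_FACTOR))


def container_request_mb(executor_memory_mb, overhead_mb=None):
    """Executor heap plus overhead; YARN kills the container above this."""
    if overhead_mb is None:
        overhead_mb = default_overhead_mb(executor_memory_mb)
    return executor_memory_mb + overhead_mb


heap_mb = 4 * 1024  # hypothetical --executor-memory 4G
print(default_overhead_mb(heap_mb))         # 409
print(container_request_mb(heap_mb))        # 4505
print(container_request_mb(heap_mb, 1024))  # 5120
```

With an explicit 1024 MB overhead the same heap gets a larger container, which leaves more room for off-heap usage before YARN steps in.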

The following explains the parameters of one submission:

spark-shell --master yarn --driver-memory 5G   --executor-cores 1 --queue test --num-executors 13
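Take this job of mine running on YARN as an example:
The driver gets 5G; nothing more to say there.
--executor-cores 1 means each executor uses 1 CPU core.
--num-executors 13 means the cluster starts 13 executors for this job.
So the job ends up occupying 14 cores, because the driver also takes one.
As for compute memory: yarn-site.xml, the YARN configuration file, sets each container to 5G of memory, so my compute memory is the number of executors times the memory per container, that is 13 * 5 = 65G.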
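The arithmetic above can be written down as a small sketch. The function name is hypothetical, and the 5G per container is taken from the author's yarn-site.xml setup rather than from a Spark default.

```python
# Sketch: cores and compute memory occupied by the example submission.
# Assumption: every executor runs in a 5 GB container, as in the
# author's cluster configuration.

def yarn_footprint(num_executors, executor_cores, container_gb, driver_cores=1):
    """Return (total cores, executor compute memory in GB)."""
    total_cores = num_executors * executor_cores + driver_cores
    compute_memory_gb = num_executors * container_gb
    return total_cores, compute_memory_gb


cores, memory_gb = yarn_footprint(num_executors=13, executor_cores=1, container_gb=5)
print(cores, memory_gb)  # 14 65
```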

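Each node can start one or more executors.
    Each executor consists of several cores, and each core of an executor can run only one task at a time.
    The result of each task is one partition of the target RDD.
Task parallelism = number of executors * cores per executor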
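As a rough illustration of the formula, the sketch below computes the number of task slots and, because each core runs one task at a time, how many rounds of tasks a stage needs. The helper names and the 100-partition stage are hypothetical.

```python
import math

# Sketch: task slots and the number of scheduling rounds for a stage.

def task_slots(num_executors, cores_per_executor):
    """Tasks that can run at the same moment."""
    return num_executors * cores_per_executor


def waves(num_partitions, slots):
    """Rounds needed when every slot runs one task at a time."""
    return math.ceil(num_partitions / slots)


slots = task_slots(num_executors=13, cores_per_executor=1)
print(slots)              # 13
print(waves(100, slots))  # 8
```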
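As for the number of partitions:
    In the data-reading stage, for example sc.textFile, the number of initial tasks equals the number of InputSplits the input files are divided into.
    In the map stage the number of partitions stays unchanged.
    In the reduce stage, aggregating an RDD triggers a shuffle, and the partition count of the resulting RDD depends on the operation: repartition, for example, produces the specified number of partitions, and some other operators let you configure it.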
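The rules above can be traced with a toy model. This is plain Python, not Spark code: the function, the step names, and the 8 input splits are all assumptions chosen for illustration.

```python
# Sketch: follow the partition count through a pipeline.
# A step with target None keeps the count (map-side operation);
# a step with an integer target sets it (shuffle with explicit count).

def trace_partitions(initial_splits, steps):
    count = initial_splits
    history = [("read", count)]
    for name, target in steps:
        if target is not None:
            count = target
        history.append((name, count))
    return history


pipeline = [("map", None), ("repartition", 20)]
for name, count in trace_partitions(8, pipeline):
    print(name, count)
# read 8
# map 8
# repartition 20
```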