Spark Task Submission: Resource Allocation

spark-defaults.conf 

spark.master=yarn       
# spark.submit.deployMode=cluster
spark.driver.memory=1G
spark.executor.memory=1G
spark.testing.reservedMemory=0     
# removes the 1.5 * 300MB minimum submit-memory check; can be set to 0 when cluster resources are tight
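
For reference, the 1.5 * 300 MB limit mentioned above comes from Spark's UnifiedMemoryManager: if the usable heap is smaller than 1.5 times the reserved system memory (300 MB by default, overridable via spark.testing.reservedMemory), the job fails immediately. Below is a minimal sketch of that check, paraphrased rather than copied from the Spark source; the object and method names are made up for illustration:

object ReservedMemoryCheck {
  // 300 MB that Spark reserves for internal objects (RESERVED_SYSTEM_MEMORY_BYTES in Spark 2.x)
  val ReservedSystemMemoryBytes: Long = 300L * 1024 * 1024

  // Mimics the UnifiedMemoryManager sanity check: the heap must be at least
  // 1.5x the reserved memory, unless reservedMemory is forced to 0.
  def validateHeap(systemMemoryBytes: Long,
                   reservedMemoryBytes: Long = ReservedSystemMemoryBytes): Unit = {
    val minSystemMemory = (reservedMemoryBytes * 1.5).ceil.toLong
    require(systemMemoryBytes >= minSystemMemory,
      s"System memory $systemMemoryBytes must be at least $minSystemMemory bytes")
  }

  def main(args: Array[String]): Unit = {
    // A 128 MB driver heap fails the default check (it needs at least 450 MB) ...
    // validateHeap(128L * 1024 * 1024)
    // ... but passes once the reserved memory is set to 0, as in spark-defaults.conf above.
    validateHeap(128L * 1024 * 1024, reservedMemoryBytes = 0L)
    println("128 MB heap accepted with spark.testing.reservedMemory=0")
  }
}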

yarn-site.xml 

<property>
    <name>yarn.scheduler.increment-allocation-mb</name>
    <value>128</value>
    <description>内存规整化单位,默认是1024,这意味着,如果一个Container请求资源是1.5GB,则将被调度器规整化为ceiling(1.5 GB / 1GB) * 1G=2GB</description>
</property>

<property>
    <name>yarn.scheduler.minimum-allocation-mb</name>
    <value>128</value>
    <description>The minimum amount of physical memory, in MiB, that can be requested for a container</description>
</property>
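
To make the effect of these two settings concrete, here is a small sketch of the normalization arithmetic described above (not YARN code, just the ceiling formula with the minimum allocation applied first):

object YarnMemoryNormalization {
  // Round a request up to the minimum allocation, then up to the next multiple of the increment.
  def normalize(requestedMb: Long, minAllocationMb: Long, incrementMb: Long): Long = {
    val atLeastMin = math.max(requestedMb, minAllocationMb)
    math.ceil(atLeastMin.toDouble / incrementMb).toLong * incrementMb
  }

  def main(args: Array[String]): Unit = {
    // Default scheduler settings: a 1.5 GB (1536 MB) request is rounded up to 2 GB.
    println(normalize(1536, minAllocationMb = 1024, incrementMb = 1024)) // 2048
    // With the 128 MB values above, the same request stays at 1536 MB.
    println(normalize(1536, minAllocationMb = 128, incrementMb = 128))   // 1536
  }
}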

 

package org.apache.spark.deploy.yarn

object YarnSparkHadoopUtil {
  // Additional memory overhead
  // 10% was arrived at experimentally. In the interest of minimizing memory waste while covering
  // the common cases. Memory overhead tends to grow with container size.
  val MEMORY_OVERHEAD_FACTOR = 0.10
  val MEMORY_OVERHEAD_MIN = 384L
  // If resources are tight, 384 can be lowered to 128 or 256 and Spark recompiled
}

The resulting defaults are:

spark.yarn.executor.memoryOverhead = max(executorMemory * 0.1, 384m)
spark.yarn.driver.memoryOverhead = max(driverMemory * 0.1, 384m)

While the driver and executors are running, the memory they actually use can exceed executor-memory, so an extra slice of memory is reserved for each container. memoryOverhead represents this extra slice, and it is computed as shown in the code below.

package org.apache.spark.deploy.yarn


private[spark] class Client( /* constructor parameters elided */ ) {

  private val isClusterMode = sparkConf.get("spark.submit.deployMode", "client") == "cluster"

  // AM related configurations
  private val amMemory = if (isClusterMode) {
    sparkConf.get(DRIVER_MEMORY).toInt
  } else {
    sparkConf.get(AM_MEMORY).toInt
  }
  
  private val amMemoryOverhead = {
    val amMemoryOverheadEntry = if (isClusterMode) DRIVER_MEMORY_OVERHEAD else AM_MEMORY_OVERHEAD
    sparkConf.get(amMemoryOverheadEntry).getOrElse(
      math.max((MEMORY_OVERHEAD_FACTOR * amMemory).toLong, MEMORY_OVERHEAD_MIN)).toInt
  }
 
  // Executor related configurations
  private val executorMemory = sparkConf.get(EXECUTOR_MEMORY)
  private val executorMemoryOverhead = sparkConf.get(EXECUTOR_MEMORY_OVERHEAD).getOrElse(
    math.max((MEMORY_OVERHEAD_FACTOR * executorMemory).toLong, MEMORY_OVERHEAD_MIN)).toInt
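
Pulled out of the Client class, the default rule is simply max(0.1 * memory, 384 MB). The following standalone sketch (a hypothetical demo object, reusing the same constants as YarnSparkHadoopUtil) shows when the 384 MB floor applies and when the 10% rule takes over:

object MemoryOverheadDemo {
  val MEMORY_OVERHEAD_FACTOR = 0.10
  val MEMORY_OVERHEAD_MIN = 384L

  // Overhead in MB for a given driver/executor memory in MB (when not overridden explicitly).
  def overheadMb(memoryMb: Long): Long =
    math.max((MEMORY_OVERHEAD_FACTOR * memoryMb).toLong, MEMORY_OVERHEAD_MIN)

  def main(args: Array[String]): Unit = {
    println(overheadMb(128))  // 384  -> the 384 MB floor wins for small containers
    println(overheadMb(8192)) // 819  -> the 10% rule takes over above 3840 MB
  }
}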

Task submission script

#!/bin/bash
source /etc/profile
driverMem=128M
# increasing driverCores leaves memory usage unchanged
driverCores=1
# increasing numExecutor multiplies memory by the number of executors
numExecutor=1
# increasing executorCores leaves memory usage unchanged
executorCores=1
executorMem=128M
nohup spark-submit --master yarn --driver-memory $driverMem --driver-cores $driverCores --num-executors $numExecutor --executor-cores $executorCores --executor-memory $executorMem --class className jarName.jar >> logName.log 2>&1 &

# 19/11/22 16:55:37 INFO yarn.Client: Will allocate AM container, with 512 MB memory including 384 MB overhead

The resources actually occupied are:

driver overhead   = max(128M * 0.1, 384M) = 384M
AM container      = driver memory + overhead = 128M + 384M = 512M
executor overhead = max(128M * 0.1, 384M) = 384M
executor container (on the NodeManager) = executor memory + overhead = 128M + 384M = 512M
total = 512M + 512M = 1024M
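
The same numbers can be reproduced end to end by chaining the overhead rule with YARN's normalization. This is only a sanity-check sketch under the 128 MB minimum/increment configured above, not actual Spark or YARN code:

object TotalClusterMemory {
  // Overhead rule from YarnSparkHadoopUtil: max(10% of memory, 384 MB).
  def overheadMb(memoryMb: Long): Long = math.max((0.10 * memoryMb).toLong, 384L)

  // Container size after adding overhead and applying YARN's 128 MB minimum/increment.
  def containerMb(memoryMb: Long, minMb: Long = 128, stepMb: Long = 128): Long = {
    val requested = memoryMb + overheadMb(memoryMb)
    math.ceil(math.max(requested, minMb).toDouble / stepMb).toLong * stepMb
  }

  def main(args: Array[String]): Unit = {
    val am        = containerMb(128)     // driver 128 MB + 384 MB overhead = 512 MB
    val executors = 1 * containerMb(128) // one executor container of 512 MB
    println(s"AM=$am MB, executors=$executors MB, total=${am + executors} MB") // total = 1024 MB
  }
}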

 

Summary:

In YARN, set the minimum allocation to 128 MB and the allocation increment to 128 MB.

Submit in cluster mode with reservedMemory set to 0.

Without modifying the 384 MB floor in the source code, and without specifying an overhead size explicitly,

submitting a Spark job requires at least 2 CPUs and 1024 MB of memory.
