Driver Startup in Cluster Mode, a Thorough Source-Level Analysis of the Two Resource Scheduling Strategies, and a Summary of Resource Scheduling Internals

1. The resource scheduling method: schedule()

private def schedule(): Unit = {
  // Only an ALIVE Master may schedule resources for applications
  if (state != RecoveryState.ALIVE) { return }
  // Drivers take strict precedence over executors: Drivers are launched before Executors
  // Shuffling the Workers registered with the Master helps achieve fair scheduling
  val shuffledWorkers = Random.shuffle(workers) // Randomization helps balance drivers
  for (worker <- shuffledWorkers if worker.state == WorkerState.ALIVE) { // the Worker must be ALIVE
    for (driver <- waitingDrivers) {
      // Check whether the Worker satisfies the Driver's core and memory requirements
      if (worker.memoryFree >= driver.desc.mem && worker.coresFree >= driver.desc.cores) {
        launchDriver(worker, driver) // launch the Driver
        waitingDrivers -= driver
      }
    }
  }
  startExecutorsOnWorkers() // then launch the Executors on the Workers
}
Note: when SparkSubmit runs the Driver in cluster mode, the Driver is first added to the waitingDrivers queue. In client mode the Driver never joins waitingDrivers, because it starts directly in the submitting process the moment the application is submitted.
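For context, here is a condensed sketch of how a cluster-mode Driver ends up in waitingDrivers: the Master's handler for RequestSubmitDriver (in receiveAndReply) registers the Driver and immediately calls schedule(). This is paraphrased from memory, so treat field names and reply messages as approximate and version-dependent:

// Paraphrased sketch of the Master's RequestSubmitDriver handler (abridged;
// exact fields and reply wording vary across Spark versions)
case RequestSubmitDriver(description) =>
  if (state != RecoveryState.ALIVE) {
    // A STANDBY Master refuses submissions
    context.reply(SubmitDriverResponse(self, success = false, None,
      "Master is not alive")) // approximate wording
  } else {
    val driver = createDriver(description) // build a DriverInfo with a fresh id
    persistenceEngine.addDriver(driver)    // persist for Master fail-over recovery
    waitingDrivers += driver               // queued; schedule() will pick it up
    drivers.add(driver)
    schedule()                             // the method analyzed above
    context.reply(SubmitDriverResponse(self, success = true, Some(driver.id),
      s"Driver successfully submitted as ${driver.id}")) // approximate wording
  }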

2. Source of the launchDriver method:
The Master sends a LaunchDriver message to the remote Worker, instructing it to start the Driver.
private def launchDriver(worker: WorkerInfo, driver: DriverInfo) {
  logInfo("Launching driver " + driver.id + " on worker " + worker.id)
  worker.addDriver(driver)
  driver.worker = Some(worker)
  worker.endpoint.send(LaunchDriver(driver.id, driver.desc))
  driver.state = DriverState.RUNNING
}
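On the other side of that message, the Worker's receive handler creates a DriverRunner, which forks and supervises the actual driver JVM. A condensed sketch, paraphrased from Worker.scala (constructor arguments are deliberately omitted, and details vary by Spark version):

// Paraphrased sketch of the Worker's LaunchDriver handler (abridged)
case LaunchDriver(driverId, driverDesc) =>
  logInfo(s"Asked to launch driver $driverId")
  // DriverRunner manages the lifecycle of the forked driver process;
  // its constructor arguments are omitted here for brevity
  val driver = new DriverRunner(/* conf, driverId, workDir, driverDesc, ... */)
  drivers(driverId) = driver
  driver.start()
  // Account for the resources the Driver itself consumes on this Worker
  coresUsed += driverDesc.cores
  memoryUsed += driverDesc.mem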
3. Launching the Executors
By default, Spark allocates Executors to applications in FIFO order: all submitted applications wait in a scheduling queue, first in first out, and an application's resource request is served only after the requests of the applications ahead of it have been satisfied. Source evidence: the startExecutorsOnWorkers method of the Master class.

private def startExecutorsOnWorkers(): Unit = {
  // Right now this is a very simple FIFO scheduler. We keep trying to fit in the first app
  // in the queue, then the second app, etc.
  for (app <- waitingApps if app.coresLeft > 0) {
    val coresPerExecutor: Option[Int] = app.desc.coresPerExecutor // cores per Executor
    // Filter out workers that don't have enough resources to launch an executor.
    // Before Executors are actually assigned, a Worker must be ALIVE and must satisfy
    // the application's per-Executor memory and core requirements; the qualifying
    // Workers are then sorted by free cores in descending order to produce usableWorkers.
    val usableWorkers = workers.toArray.filter(_.state == WorkerState.ALIVE)
      .filter(worker => worker.memoryFree >= app.desc.memoryPerExecutorMB &&
        worker.coresFree >= coresPerExecutor.getOrElse(1))
      .sortBy(_.coresFree).reverse
    val assignedCores = scheduleExecutorsOnWorkers(app, usableWorkers, spreadOutApps)

    // Now that we've decided how many cores to allocate on each worker, let's allocate them
    for (pos <- 0 until usableWorkers.length if assignedCores(pos) > 0) {
      allocateWorkerResourceToExecutors(
        app, assignedCores(pos), coresPerExecutor, usableWorkers(pos))
    }
  }
}
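The allocateWorkerResourceToExecutors call at the end turns each worker's assigned core count into concrete Executor launches. Paraphrased from the same Master class (minor details differ across Spark versions):

// Paraphrased sketch of Master.allocateWorkerResourceToExecutors
private def allocateWorkerResourceToExecutors(
    app: ApplicationInfo,
    assignedCores: Int,
    coresPerExecutor: Option[Int],
    worker: WorkerInfo): Unit = {
  // If coresPerExecutor is set, split the assigned cores evenly into executors of
  // that size; otherwise launch a single executor that grabs all the assigned cores
  val numExecutors = coresPerExecutor.map { assignedCores / _ }.getOrElse(1)
  val coresToAssign = coresPerExecutor.getOrElse(assignedCores)
  for (i <- 1 to numExecutors) {
    val exec = app.addExecutor(worker, coresToAssign)
    launchExecutor(worker, exec) // sends LaunchExecutor to the Worker
    app.state = ApplicationState.RUNNING
  }
}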
4. scheduleExecutorsOnWorkers: the concrete allocation plan for each Executor
private val spreadOutApps = conf.getBoolean("spark.deploy.spreadOut", true)
There are two strategies for allocating Executors to an application. The first spreads Executors across as many of the cluster's Workers as possible; this tends to bring potentially better data locality as a side effect, but its real aim is to maximize use of the cluster's aggregate resources so the application runs with more concurrency. The second packs Executors onto as few nodes as possible. (A toy simulation after the source below illustrates the difference.)
private def scheduleExecutorsOnWorkers(
    app: ApplicationInfo,
    usableWorkers: Array[WorkerInfo],
    spreadOutApps: Boolean): Array[Int] = {
  val coresPerExecutor = app.desc.coresPerExecutor
  val minCoresPerExecutor = coresPerExecutor.getOrElse(1)
  val oneExecutorPerWorker = coresPerExecutor.isEmpty
  val memoryPerExecutor = app.desc.memoryPerExecutorMB
  val numUsable = usableWorkers.length
  val assignedCores = new Array[Int](numUsable) // Number of cores to give to each worker
  val assignedExecutors = new Array[Int](numUsable) // Number of new executors on each worker
  var coresToAssign = math.min(app.coresLeft, usableWorkers.map(_.coresFree).sum)

  def canLaunchExecutor(pos: Int): Boolean = {
    val keepScheduling = coresToAssign >= minCoresPerExecutor
    val enoughCores = usableWorkers(pos).coresFree - assignedCores(pos) >= minCoresPerExecutor

    // If we allow multiple executors per worker, then we can always launch new executors.
    // Otherwise, if there is already an executor on this worker, just give it more cores.
    // When coresPerExecutor is not configured, all of a Worker's assigned cores go into
    // a single Executor on that Worker; when it is configured, each Executor gets exactly
    // that many cores and one Worker may host multiple Executors.
    val launchingNewExecutor = !oneExecutorPerWorker || assignedExecutors(pos) == 0
    if (launchingNewExecutor) {
      val assignedMemory = assignedExecutors(pos) * memoryPerExecutor
      val enoughMemory = usableWorkers(pos).memoryFree - assignedMemory >= memoryPerExecutor
      val underLimit = assignedExecutors.sum + app.executors.size < app.executorLimit
      keepScheduling && enoughCores && enoughMemory && underLimit
    } else {
      // We're adding cores to an existing executor, so no need
      // to check memory and executor limits
      keepScheduling && enoughCores
    }
  }

  // Keep launching executors until no more workers can accommodate any
  // more executors, or if we have reached this application's limits
  var freeWorkers = (0 until numUsable).filter(canLaunchExecutor)
  while (freeWorkers.nonEmpty) {
    freeWorkers.foreach { pos =>
      var keepScheduling = true
      while (keepScheduling && canLaunchExecutor(pos)) {
        coresToAssign -= minCoresPerExecutor
        assignedCores(pos) += minCoresPerExecutor

        // If we are launching one executor per worker, then every iteration assigns 1 core
        // to the executor. Otherwise, every iteration assigns cores to a new executor.
        if (oneExecutorPerWorker) {
          assignedExecutors(pos) = 1
        } else {
          assignedExecutors(pos) += 1
        }

        // Spreading out an application means spreading out its executors across as
        // many workers as possible. If we are not spreading out, then we should keep
        // scheduling executors on this worker until we use all of its resources.
        // Otherwise, just move on to the next worker.
        if (spreadOutApps) {
          keepScheduling = false
        }
      }
    }
    freeWorkers = freeWorkers.filter(canLaunchExecutor)
  }
  assignedCores
}
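To make the effect of spark.deploy.spreadOut concrete, here is a self-contained toy simulation of the assignment loop above. The worker capacities, the demand of 12 cores, the 2 cores per executor, and the names SpreadOutDemo/assign are all made up for illustration; only the loop structure mirrors the real scheduler:

// Toy re-implementation of the core-assignment loop, runnable standalone
object SpreadOutDemo {
  def assign(freeCores: Array[Int], coresNeeded: Int,
             minCores: Int, spreadOut: Boolean): Array[Int] = {
    val assigned = new Array[Int](freeCores.length)
    var toAssign = coresNeeded
    // A worker is a candidate while demand remains and it still has room
    def canLaunch(p: Int) = toAssign >= minCores && freeCores(p) - assigned(p) >= minCores
    var candidates = freeCores.indices.filter(canLaunch)
    while (candidates.nonEmpty) {
      candidates.foreach { pos =>
        var keepScheduling = true
        while (keepScheduling && canLaunch(pos)) {
          toAssign -= minCores
          assigned(pos) += minCores
          if (spreadOut) keepScheduling = false // round-robin: move to the next worker
        }
      }
      candidates = candidates.filter(canLaunch)
    }
    assigned
  }

  def main(args: Array[String]): Unit = {
    val free = Array(8, 8, 8) // three workers with 8 free cores each
    println(assign(free, 12, 2, spreadOut = true).mkString(","))  // prints 4,4,4
    println(assign(free, 12, 2, spreadOut = false).mkString(",")) // prints 8,4,0
  }
}

With spreadOut = true the 12 requested cores are distributed round-robin as 4,4,4 across the three workers; with spreadOut = false the first worker is filled to capacity before moving on, giving 8,4,0.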
