# Spark 2.2 Source Code: The Master Active/Standby Failover Mechanism

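Note: in standalone mode, Spark can be configured for master high availability (HA). When the active master node goes down, a standby master is promoted to active.
There are two failover mechanisms:
(1) File-system-based: after the active master dies, you switch to a standby master manually.
(2) ZooKeeper-based: the master is switched automatically.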
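
The two mechanisms are selected through configuration. As a minimal sketch, the following models that choice with a plain `Map` instead of a real `SparkConf`. The property names (`spark.deploy.recoveryMode`, `spark.deploy.zookeeper.url`, `spark.deploy.zookeeper.dir`, `spark.deploy.recoveryDirectory`) are the ones documented for standalone HA; the `Recovery` types, the `choose` helper, and the host names are hypothetical and exist only for illustration.

```scala
object RecoveryModeSketch {
  // Hypothetical stand-ins for the persistence engine / leader election pair
  // that a real Master would create for each mode.
  sealed trait Recovery
  final case class ZooKeeperRecovery(url: String, dir: String) extends Recovery
  final case class FileSystemRecovery(directory: String) extends Recovery
  case object NoRecovery extends Recovery

  // Picks a recovery strategy from a plain key/value configuration.
  def choose(conf: Map[String, String]): Recovery =
    conf.getOrElse("spark.deploy.recoveryMode", "NONE") match {
      case "ZOOKEEPER" =>
        ZooKeeperRecovery(
          conf.getOrElse("spark.deploy.zookeeper.url", ""),
          conf.getOrElse("spark.deploy.zookeeper.dir", "/spark"))
      case "FILESYSTEM" =>
        FileSystemRecovery(conf.getOrElse("spark.deploy.recoveryDirectory", ""))
      case _ =>
        // Any other value: no state is persisted, so nothing can be recovered.
        NoRecovery
    }

  def main(args: Array[String]): Unit = {
    val conf = Map(
      "spark.deploy.recoveryMode" -> "ZOOKEEPER",
      "spark.deploy.zookeeper.url" -> "zk1:2181,zk2:2181") // placeholder hosts
    println(choose(conf))
  }
}
```

In a real deployment these properties are usually passed to the master daemons as JVM system properties rather than built in code.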

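Overall flow:
1: After the Master receives the ElectedLeader message, it uses the persistence engine to read storedApps, storedDrivers, and storedWorkers. If any of these three collections is non-empty, recovery is needed and the logic below continues.
2: beginRecovery is called to start the recovery. It registers the stored data, resets the state of each app and worker to UNKNOWN, and sends a message to each app's driver and to each worker.
3: When the Master receives the replies, it marks the corresponding apps and workers as available again.
4: Finally, completeRecovery is called to finish restoring the data for the failover.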
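
The flow above amounts to a small state machine on the Master. The sketch below is an illustration under simplifying assumptions, not Spark's code: the value names mirror `RecoveryState`, while `onElectedLeader` and `onCompleteRecovery` are hypothetical helpers that only return the states the Master would pass through.

```scala
object RecoveryStateSketch {
  // Same value names as Spark's RecoveryState; everything else is illustrative.
  object RecoveryState extends Enumeration {
    val STANDBY, ALIVE, RECOVERING, COMPLETING_RECOVERY = Value
  }
  import RecoveryState._

  // State chosen when this master wins the election: recover only if
  // the persistence engine returned at least one app, driver, or worker.
  def onElectedLeader(persistedApps: Int, persistedDrivers: Int,
      persistedWorkers: Int): RecoveryState.Value =
    if (persistedApps == 0 && persistedDrivers == 0 && persistedWorkers == 0) ALIVE
    else RECOVERING

  // States visited by completeRecovery; it is a no-op unless we are RECOVERING.
  def onCompleteRecovery(state: RecoveryState.Value): List[RecoveryState.Value] =
    if (state != RECOVERING) Nil else List(COMPLETING_RECOVERY, ALIVE)

  def main(args: Array[String]): Unit = {
    val elected = onElectedLeader(persistedApps = 1, persistedDrivers = 0, persistedWorkers = 2)
    println((STANDBY :: elected :: onCompleteRecovery(elected)).mkString(" -> "))
  }
}
```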

**1. case ElectedLeader** *classPath: org.apache.spark.deploy.master.Master*: the entry point of the recovery

    // This master has been elected leader
    case ElectedLeader =>
      // Read the three persisted sequences through the persistence engine.
      // If any of them contains data, recovery is needed; otherwise it is not.
      val (storedApps, storedDrivers, storedWorkers) = persistenceEngine.readPersistedData(rpcEnv)
      state = if (storedApps.isEmpty && storedDrivers.isEmpty && storedWorkers.isEmpty) {
        RecoveryState.ALIVE
      } else {
        RecoveryState.RECOVERING
      }
      logInfo("I have been elected leader! New state: " + state)
      if (state == RecoveryState.RECOVERING) {
        // Start the recovery
        beginRecovery(storedApps, storedDrivers, storedWorkers)
        recoveryCompletionTask = forwardMessageThread.schedule(new Runnable {
          override def run(): Unit = Utils.tryLogNonFatalError {
            // After WORKER_TIMEOUT_MS, send CompleteRecovery to ourselves;
            // it is handled by the case CompleteRecovery branch below
            self.send(CompleteRecovery)
          }
        }, WORKER_TIMEOUT_MS, TimeUnit.MILLISECONDS)
      }
    // Finish the recovery
    case CompleteRecovery => completeRecovery()
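
The scheduled task acts as a deadline: even if some apps or workers never reply, CompleteRecovery is still delivered once the timeout elapses. The sketch below shows only that scheduling pattern with `java.util.concurrent`. The `scheduleCompletion` helper and the short timeout are hypothetical; in Spark 2.2 the real delay appears to come from `spark.worker.timeout`, which is an assumption worth checking against the source you are reading.

```scala
import java.util.concurrent.{CountDownLatch, Executors, TimeUnit}

object RecoveryTimeoutSketch {
  // Schedules a one-shot "CompleteRecovery" after timeoutMs and
  // reports whether it fired within a generous waiting window.
  def scheduleCompletion(timeoutMs: Long): Boolean = {
    val scheduler = Executors.newSingleThreadScheduledExecutor()
    val fired = new CountDownLatch(1)
    try {
      scheduler.schedule(new Runnable {
        // Stands in for self.send(CompleteRecovery) in the Master.
        override def run(): Unit = fired.countDown()
      }, timeoutMs, TimeUnit.MILLISECONDS)
      // Wait a little longer than the timeout so the task has a chance to run.
      fired.await(timeoutMs + 1000L, TimeUnit.MILLISECONDS)
    } finally {
      scheduler.shutdownNow()
    }
  }

  def main(args: Array[String]): Unit =
    println(s"CompleteRecovery fired: ${scheduleCompletion(100L)}")
}
```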

**2. beginRecovery** *classPath: org.apache.spark.deploy.master.Master*

    private def beginRecovery(storedApps: Seq[ApplicationInfo], storedDrivers: Seq[DriverInfo],
        storedWorkers: Seq[WorkerInfo]) {
      // Register each app in memory, set its state to UNKNOWN, and notify the
      // app's driver that the master has changed.
      // The driver (StandaloneAppClient) replies with MasterChangeAcknowledged.
      for (app <- storedApps) {
        logInfo("Trying to recover app: " + app.id)
        try {
          registerApplication(app)
          app.state = ApplicationState.UNKNOWN
          app.driver.send(MasterChanged(self, masterWebUiUrl))
        } catch {
          case e: Exception => logInfo("App " + app.id + " had exception on reconnect")
        }
      }
      // Add each driver to the in-memory cache (question: what if this driver cannot be reached?)
      for (driver <- storedDrivers) {
        // Here we just read in the list of drivers. Any drivers associated with now-lost workers
        // will be re-launched when we detect that the worker is missing.
        drivers += driver
      }
      // Register each worker in memory, set its state to UNKNOWN, and notify the
      // worker that the master has changed.
      // The worker replies with WorkerSchedulerStateResponse, which carries the
      // executors and drivers currently running on that worker.
      for (worker <- storedWorkers) {
        logInfo("Trying to recover worker: " + worker.id)
        try {
          registerWorker(worker)
          worker.state = WorkerState.UNKNOWN
          worker.endpoint.send(MasterChanged(self, masterWebUiUrl))
        } catch {
          case e: Exception => logInfo("Worker " + worker.id + " had exception on reconnect")
        }
      }
    }
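
To make the two replies concrete, here is a toy model of the handshake from the receiving side. It is a sketch under heavy simplification: the message names match the ones used above, but the fields are reduced to strings and numbers (the real messages carry RPC endpoint references and richer descriptions), and `WorkerSide` and `AppClientSide` are hypothetical classes.

```scala
object MasterChangedHandshakeSketch {
  // Simplified message shapes, for illustration only.
  final case class MasterChanged(masterUrl: String, masterWebUiUrl: String)
  final case class MasterChangeAcknowledged(appId: String)
  final case class ExecutorDescription(appId: String, execId: Int, cores: Int)
  final case class WorkerSchedulerStateResponse(
      workerId: String, executors: List[ExecutorDescription], driverIds: Seq[String])

  // Hypothetical worker-side state: it answers with everything it is running.
  final case class WorkerSide(
      id: String, executors: List[ExecutorDescription], driverIds: Seq[String]) {
    def onMasterChanged(msg: MasterChanged): WorkerSchedulerStateResponse =
      WorkerSchedulerStateResponse(id, executors, driverIds)
  }

  // Hypothetical application-side client: it only acknowledges the new master.
  final case class AppClientSide(appId: String) {
    def onMasterChanged(msg: MasterChanged): MasterChangeAcknowledged =
      MasterChangeAcknowledged(appId)
  }

  def main(args: Array[String]): Unit = {
    val change = MasterChanged("spark://new-master:7077", "http://new-master:8080")
    val worker = WorkerSide("worker-1", List(ExecutorDescription("app-1", 0, 2)), Seq("driver-1"))
    println(worker.onMasterChanged(change))
    println(AppClientSide("app-1").onMasterChanged(change))
  }
}
```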

**3. case MasterChangeAcknowledged and case WorkerSchedulerStateResponse** *classPath: org.apache.spark.deploy.master.Master*: handling the replies from drivers and workers

    // Once the driver has replied, set the app's state to WAITING
    case MasterChangeAcknowledged(appId) =>
      idToApp.get(appId) match {
        case Some(app) =>
          logInfo("Application has been re-registered: " + appId)
          app.state = ApplicationState.WAITING
        case None =>
          logWarning("Master change ack from unknown app: " + appId)
      }
      // canCompleteRecovery checks that no cached worker or app is still UNKNOWN;
      // if none is, the recovery can be completed right away
      if (canCompleteRecovery) { completeRecovery() }

    // A worker has replied: reset its state to ALIVE
    case WorkerSchedulerStateResponse(workerId, executors, driverIds) =>
      idToWorker.get(workerId) match {
        case Some(worker) =>
          logInfo("Worker has been re-registered: " + workerId)
          worker.state = WorkerState.ALIVE
          // From the executors the worker just reported, keep only those that
          // belong to an app being recovered
          val validExecutors = executors.filter(exec => idToApp.get(exec.appId).isDefined)
          // For each valid executor, link the app, the executor, and the worker in memory
          for (exec <- validExecutors) {
            val app = idToApp.get(exec.appId).get
            val execInfo = app.addExecutor(worker, exec.cores, Some(exec.execId))
            worker.addExecutor(execInfo)
            execInfo.copyState(exec)
          }
          // driverIds holds the IDs of the drivers this worker reports as running.
          // Find the matching recovered drivers and bind them to the worker.
          for (driverId <- driverIds) {
            drivers.find(_.id == driverId).foreach { driver =>
              driver.worker = Some(worker)
              driver.state = DriverState.RUNNING
              worker.drivers(driverId) = driver
            }
          }
        case None =>
          logWarning("Scheduler state from unknown worker: " + workerId)
      }
      // Same check as above
      if (canCompleteRecovery) { completeRecovery() }
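
Both handlers end with the same early-completion check, so recovery can finish before the timeout as soon as the last reply arrives. A minimal sketch of that check, assuming a simplified `Component` type in place of the Master's worker and app records (the `State` objects and `Component` are hypothetical):

```scala
object CanCompleteRecoverySketch {
  sealed trait State
  case object Unknown extends State // no reply received yet
  case object Alive extends State   // a worker that replied
  case object Waiting extends State // an app whose driver replied

  // Hypothetical simplified record for a worker or an app.
  final case class Component(id: String, state: State)

  // Recovery may complete once nothing is left in the Unknown state.
  def canCompleteRecovery(workers: Seq[Component], apps: Seq[Component]): Boolean =
    workers.count(_.state == Unknown) == 0 && apps.count(_.state == Unknown) == 0

  def main(args: Array[String]): Unit = {
    val workers = Seq(Component("worker-1", Alive), Component("worker-2", Unknown))
    val apps = Seq(Component("app-1", Waiting))
    println(canCompleteRecovery(workers, apps))         // worker-2 has not replied yet
    println(canCompleteRecovery(workers.take(1), apps)) // everyone has replied
  }
}
```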

**4. completeRecovery()** *classPath: org.apache.spark.deploy.master.Master*: finishes the recovery by removing the components whose state is still UNKNOWN and rescheduling the drivers that no worker claimed

    private def completeRecovery() {
      // Ensure "only-once" recovery semantics using a short synchronization period:
      // if the state is not RECOVERING, do nothing.
      if (state != RecoveryState.RECOVERING) { return }
      state = RecoveryState.COMPLETING_RECOVERY

      // Kill off any workers and apps that didn't respond to us,
      // i.e. those still UNKNOWN after the previous step.
      workers.filter(_.state == WorkerState.UNKNOWN).foreach(removeWorker)
      apps.filter(_.state == ApplicationState.UNKNOWN).foreach(finishApplication)

      // Reschedule drivers which were not claimed by any workers
      drivers.filter(_.worker.isEmpty).foreach { d =>
        logWarning(s"Driver ${d.id} was not found after master recovery")
        if (d.desc.supervise) {
          logWarning(s"Re-launching ${d.id}")
          relaunchDriver(d)
        } else {
          removeDriver(d.id, DriverState.ERROR, None)
          logWarning(s"Did not re-launch ${d.id} because it was not supervised")
        }
      }
      // Mark this Master as ALIVE and resume scheduling
      state = RecoveryState.ALIVE
      schedule()
      logInfo("Recovery complete - resuming operations!")
    }
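
The treatment of drivers in completeRecovery comes down to a three-way decision. The sketch below restates it as a pure function; `Driver`, `Action`, and `decide` are hypothetical names, and the worker is reduced to an optional ID.

```scala
object UnclaimedDriverSketch {
  // Hypothetical, trimmed-down view of a recovered driver.
  final case class Driver(id: String, worker: Option[String], supervise: Boolean)

  sealed trait Action
  case object Keep extends Action     // some worker reported it as running
  case object Relaunch extends Action // unclaimed, but submitted with supervise
  case object Remove extends Action   // unclaimed and not supervised

  def decide(d: Driver): Action =
    if (d.worker.isDefined) Keep
    else if (d.supervise) Relaunch
    else Remove

  def main(args: Array[String]): Unit = {
    val drivers = Seq(
      Driver("driver-1", Some("worker-1"), supervise = false),
      Driver("driver-2", None, supervise = true),
      Driver("driver-3", None, supervise = false))
    drivers.foreach(d => println(s"${d.id}: ${decide(d)}"))
  }
}
```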