# Spark 2.2 Source Code: The Master Failover (HA) Mechanism
Note: in standalone mode, Spark can be configured with master HA: when the active master node goes down, a standby master is promoted to active.

There are two failover mechanisms:

(1) Filesystem-based — after the active master dies, you manually switch over to a standby master node.
(2) ZooKeeper-based — the master switch happens automatically. The rough flow:

1. On receiving the ElectedLeader message, the Master uses the persistence engine to read storedApps, storedDrivers, and storedWorkers; if any of these three collections is non-empty, recovery is needed and the logic below continues.
2. beginRecovery is called to start recovery: the stored data is re-registered, app and worker states are reset to UNKNOWN, and a message is sent to each app's driver and to each worker.
3. When the Master receives the replies, it marks the corresponding apps and workers as available again.
4. Finally, completeRecovery is called to finish the failover data recovery.
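The decision in step 1 can be sketched as a tiny self-contained model (illustrative only; the enum values mirror Spark's `RecoveryState`, but the types here are simplified stand-ins, not Spark's actual classes):

```scala
// Minimal sketch of the post-election decision: recovery is needed iff any
// persisted collection is non-empty. Illustrative model, not the real Master.
object RecoveryFlowSketch {
  object RecoveryState extends Enumeration {
    val ALIVE, RECOVERING, COMPLETING_RECOVERY = Value
  }

  // Step 1: if all three persisted collections are empty, go straight to ALIVE;
  // otherwise enter RECOVERING and run the recovery steps described above.
  def stateAfterElection(storedApps: Seq[String],
                         storedDrivers: Seq[String],
                         storedWorkers: Seq[String]): RecoveryState.Value =
    if (storedApps.isEmpty && storedDrivers.isEmpty && storedWorkers.isEmpty)
      RecoveryState.ALIVE
    else
      RecoveryState.RECOVERING

  def main(args: Array[String]): Unit = {
    assert(stateAfterElection(Nil, Nil, Nil) == RecoveryState.ALIVE)
    assert(stateAfterElection(Seq("app-1"), Nil, Nil) == RecoveryState.RECOVERING)
    println("ok")
  }
}
```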
**1. case ElectedLeader *classPath: org.apache.spark.deploy.master.Master***: the entry point into recovery
```scala
// elected as leader
case ElectedLeader =>
  // Use the persistence engine to read the three persisted collections;
  // if any of them has data, recovery is needed, otherwise not
  val (storedApps, storedDrivers, storedWorkers) = persistenceEngine.readPersistedData(rpcEnv)
  state = if (storedApps.isEmpty && storedDrivers.isEmpty && storedWorkers.isEmpty) {
    RecoveryState.ALIVE
  } else {
    RecoveryState.RECOVERING
  }
  logInfo("I have been elected leader! New state: " + state)
  if (state == RecoveryState.RECOVERING) {
    // start recovery
    beginRecovery(storedApps, storedDrivers, storedWorkers)
    recoveryCompletionTask = forwardMessageThread.schedule(new Runnable {
      override def run(): Unit = Utils.tryLogNonFatalError {
        // Send CompleteRecovery to self; handled by the case below
        self.send(CompleteRecovery)
      }
    }, WORKER_TIMEOUT_MS, TimeUnit.MILLISECONDS)
  }

// finish recovery
case CompleteRecovery => completeRecovery()
```
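The scheduled `CompleteRecovery` send is a plain timeout pattern: after `WORKER_TIMEOUT_MS` the Master tells itself to finish recovery even if some components never replied. A minimal stand-alone sketch of that pattern, using a `ScheduledExecutorService` directly and a queue as a stand-in for the Master's RPC inbox (timings here are illustrative, not Spark's):

```scala
import java.util.concurrent.{Executors, LinkedBlockingQueue, TimeUnit}

object RecoveryTimeoutSketch {
  // Schedule a CompleteRecovery self-message after delayMs, then block on the
  // inbox until it arrives -- the same shape as
  // forwardMessageThread.schedule(..., WORKER_TIMEOUT_MS, MILLISECONDS).
  def awaitCompleteRecovery(delayMs: Long): String = {
    val inbox = new LinkedBlockingQueue[String]() // stands in for the RPC inbox
    val scheduler = Executors.newSingleThreadScheduledExecutor()
    scheduler.schedule(new Runnable {
      override def run(): Unit = inbox.put("CompleteRecovery")
    }, delayMs, TimeUnit.MILLISECONDS)
    val msg = inbox.take() // blocks until the timeout fires
    scheduler.shutdown()
    msg
  }

  def main(args: Array[String]): Unit =
    println(awaitCompleteRecovery(50)) // prints CompleteRecovery
}
```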
**2. beginRecovery *classPath: org.apache.spark.deploy.master.Master***: re-register the persisted apps, drivers, and workers
```scala
private def beginRecovery(storedApps: Seq[ApplicationInfo], storedDrivers: Seq[DriverInfo],
    storedWorkers: Seq[WorkerInfo]) {
  // Register each app in memory first, set its state to UNKNOWN, and send a
  // MasterChanged message to the app's driver; the driver (StandaloneAppClient)
  // replies with MasterChangeAcknowledged
  for (app <- storedApps) {
    logInfo("Trying to recover app: " + app.id)
    try {
      registerApplication(app)
      app.state = ApplicationState.UNKNOWN
      app.driver.send(MasterChanged(self, masterWebUiUrl))
    } catch {
      case e: Exception => logInfo("App " + app.id + " had exception on reconnect")
    }
  }

  // Add the drivers to the in-memory cache (drivers whose worker is unreachable
  // are handled later, as the comment below explains)
  for (driver <- storedDrivers) {
    // Here we just read in the list of drivers. Any drivers associated with now-lost workers
    // will be re-launched when we detect that the worker is missing.
    drivers += driver
  }

  // Register each worker in memory first, set its state to UNKNOWN, and send a
  // MasterChanged message; the worker replies with WorkerSchedulerStateResponse,
  // carrying the executors and drivers it currently hosts
  for (worker <- storedWorkers) {
    logInfo("Trying to recover worker: " + worker.id)
    try {
      registerWorker(worker)
      worker.state = WorkerState.UNKNOWN
      worker.endpoint.send(MasterChanged(self, masterWebUiUrl))
    } catch {
      case e: Exception => logInfo("Worker " + worker.id + " had exception on reconnect")
    }
  }
}
```
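Note the per-item try/catch: one app or worker that fails to re-register must not abort recovery of the rest. A minimal sketch of that design choice (`register` here is a hypothetical stand-in for `registerApplication`/`registerWorker`, not a Spark API):

```scala
// Sketch: recover a list of items, logging and skipping failures instead of
// aborting, mirroring beginRecovery's per-item try/catch.
object RecoverEachSketch {
  def recoverAll(ids: Seq[String], register: String => Unit): Seq[String] =
    ids.flatMap { id =>
      try {
        register(id)
        Some(id) // recovered
      } catch {
        case e: Exception =>
          println(s"Item $id had exception on reconnect")
          None // skipped, but recovery of the others continues
      }
    }

  def main(args: Array[String]): Unit = {
    val ok = recoverAll(Seq("w1", "w2", "w3"),
      id => if (id == "w2") throw new RuntimeException("unreachable"))
    assert(ok == Seq("w1", "w3")) // w2 failed; w1 and w3 still recovered
    println(ok.mkString(","))
  }
}
```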
**3. case MasterChangeAcknowledged / case WorkerSchedulerStateResponse *classPath: org.apache.spark.deploy.master.Master***: handle the replies from drivers and workers
```scala
// After the driver replies, change the app state to WAITING
case MasterChangeAcknowledged(appId) =>
  idToApp.get(appId) match {
    case Some(app) =>
      logInfo("Application has been re-registered: " + appId)
      app.state = ApplicationState.WAITING
    case None =>
      logWarning("Master change ack from unknown app: " + appId)
  }
  // If no cached workers or apps are still in the UNKNOWN state,
  // recovery can be completed right away
  if (canCompleteRecovery) { completeRecovery() }

// After the worker replies, reset its state to ALIVE
case WorkerSchedulerStateResponse(workerId, executors, driverIds) =>
  idToWorker.get(workerId) match {
    case Some(worker) =>
      logInfo("Worker has been re-registered: " + workerId)
      worker.state = WorkerState.ALIVE
      // From the executors the worker sent over, keep only those that belong
      // to an app being recovered
      val validExecutors = executors.filter(exec => idToApp.get(exec.appId).isDefined)
      // Re-associate app, executor, and worker in memory
      for (exec <- validExecutors) {
        val app = idToApp.get(exec.appId).get
        val execInfo = app.addExecutor(worker, exec.cores, Some(exec.execId))
        worker.addExecutor(execInfo)
        execInfo.copyState(exec)
      }
      // driverIds holds the stored-driver ids; find the drivers that need
      // recovery and bind each one to this worker
      for (driverId <- driverIds) {
        drivers.find(_.id == driverId).foreach { driver =>
          driver.worker = Some(worker)
          driver.state = DriverState.RUNNING
          worker.drivers(driverId) = driver
        }
      }
    case None =>
      logWarning("Scheduler state from unknown worker: " + workerId)
  }
  // Same check as above
  if (canCompleteRecovery) { completeRecovery() }
```
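`canCompleteRecovery` (defined in the same class) simply checks that no registered worker or app is still UNKNOWN, which lets the Master finish recovery early instead of waiting out the full `WORKER_TIMEOUT_MS`. A minimal model of that predicate (simplified state types, not Spark's):

```scala
// Minimal model of canCompleteRecovery: recovery may finish early once no
// registered worker or app remains in the UNKNOWN state.
object CanCompleteSketch {
  sealed trait State
  case object Unknown extends State
  case object Alive extends State

  def canCompleteRecovery(workerStates: Seq[State], appStates: Seq[State]): Boolean =
    workerStates.count(_ == Unknown) == 0 && appStates.count(_ == Unknown) == 0

  def main(args: Array[String]): Unit = {
    // one worker has not replied yet -> cannot complete
    assert(!canCompleteRecovery(Seq(Alive, Unknown), Seq(Alive)))
    // everyone replied -> complete without waiting for the timeout
    assert(canCompleteRecovery(Seq(Alive, Alive), Seq(Alive)))
    println("ok")
  }
}
```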
**4. completeRecovery() *classPath: org.apache.spark.deploy.master.Master***: finish failover by removing components still in the UNKNOWN state and rescheduling drivers not claimed by any worker
```scala
private def completeRecovery() {
  // Ensure "only-once" recovery semantics using a short synchronization period:
  // if the state is no longer RECOVERING, do nothing
  if (state != RecoveryState.RECOVERING) { return }
  state = RecoveryState.COMPLETING_RECOVERY

  // Kill off any workers and apps that didn't respond to us,
  // i.e. those still in the UNKNOWN state after the previous step
  workers.filter(_.state == WorkerState.UNKNOWN).foreach(removeWorker)
  apps.filter(_.state == ApplicationState.UNKNOWN).foreach(finishApplication)

  // Reschedule drivers which were not claimed by any workers
  drivers.filter(_.worker.isEmpty).foreach { d =>
    logWarning(s"Driver ${d.id} was not found after master recovery")
    if (d.desc.supervise) {
      logWarning(s"Re-launching ${d.id}")
      relaunchDriver(d)
    } else {
      removeDriver(d.id, DriverState.ERROR, None)
      logWarning(s"Did not re-launch ${d.id} because it was not supervised")
    }
  }

  // Mark this Master as ALIVE and resume scheduling
  state = RecoveryState.ALIVE
  schedule()
  logInfo("Recovery complete - resuming operations!")
}
```
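The unclaimed-driver branch boils down to one rule: relaunch the driver if it was submitted with `--supervise`, otherwise remove it with an error state. A tiny sketch of that decision (simplified types, not Spark's):

```scala
// Sketch of completeRecovery's handling of drivers not claimed by any worker:
// supervised drivers are relaunched, all others are removed with an error.
object UnclaimedDriverSketch {
  case class Driver(id: String, supervise: Boolean, hasWorker: Boolean)

  sealed trait Action
  case class Relaunch(id: String) extends Action
  case class Remove(id: String) extends Action // analogous to DriverState.ERROR

  def handleUnclaimed(drivers: Seq[Driver]): Seq[Action] =
    drivers.filterNot(_.hasWorker).map { d =>
      if (d.supervise) Relaunch(d.id) else Remove(d.id)
    }

  def main(args: Array[String]): Unit = {
    val actions = handleUnclaimed(Seq(
      Driver("d1", supervise = true,  hasWorker = false),
      Driver("d2", supervise = false, hasWorker = false),
      Driver("d3", supervise = true,  hasWorker = true))) // claimed, untouched
    assert(actions == Seq(Relaunch("d1"), Remove("d2")))
    println("ok")
  }
}
```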