In fact, two Masters can be configured: Spark's native standalone mode supports Master failover. That is, when the Active Master node goes down, the Standby Master can be switched over to become the new Active Master.
Master failover can be based on one of two mechanisms: the file system or ZooKeeper. With the file-system-based mechanism, after the Active Master dies we have to switch over to the Standby Master manually, whereas the ZooKeeper-based mechanism switches Masters automatically.
So the Master failover mechanism discussed here refers to the operations performed when, after the Active Master dies, control is switched over to the Standby Master.
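For reference, here is the usual way to enable each mode. These are the standard spark.deploy.* properties, passed to the Master daemon through SPARK_DAEMON_JAVA_OPTS; the ZooKeeper address and the directories below are placeholder values:

# ZooKeeper-based HA: several Masters, automatic failover
export SPARK_DAEMON_JAVA_OPTS="-Dspark.deploy.recoveryMode=ZOOKEEPER \
  -Dspark.deploy.zookeeper.url=zk1:2181,zk2:2181,zk3:2181 \
  -Dspark.deploy.zookeeper.dir=/spark"

# File-system-based recovery: state survives a Master restart, but the switch is manual
export SPARK_DAEMON_JAVA_OPTS="-Dspark.deploy.recoveryMode=FILESYSTEM \
  -Dspark.deploy.recoveryDirectory=/var/spark/recovery"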
Flow diagram
Figure: 主备切换机制原理剖析.png (analysis of the Master failover mechanism)
Source code walkthrough
The available persistence engines are ZooKeeperPersistenceEngine and FileSystemPersistenceEngine.
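Both implement the same PersistenceEngine contract. A simplified sketch of it (paraphrased; the real definition lives in org.apache.spark.deploy.master, and only the members relevant to the code below are shown):

// Simplified sketch of the PersistenceEngine contract (paraphrased).
// Registrations and removals are written through to the backing store, so that
// a newly elected Master can rebuild its state via readPersistedData().
trait PersistenceEngine {
  def addApplication(app: ApplicationInfo): Unit
  def removeApplication(app: ApplicationInfo): Unit
  def addWorker(worker: WorkerInfo): Unit
  def removeWorker(worker: WorkerInfo): Unit
  def addDriver(driver: DriverInfo): Unit
  def removeDriver(driver: DriverInfo): Unit
  // Returns (storedApps, storedDrivers, storedWorkers)
  def readPersistedData(): (Seq[ApplicationInfo], Seq[DriverInfo], Seq[WorkerInfo])
  def close(): Unit = {}
}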
The persistence engine is created inside the preStart() method:
// Create the persistence engine and the leader election agent
val (persistenceEngine_, leaderElectionAgent_) = RECOVERY_MODE match {
  // ZooKeeper-based persistence engine
  case "ZOOKEEPER" =>
    logInfo("Persisting recovery state to ZooKeeper")
    val zkFactory =
      new ZooKeeperRecoveryModeFactory(conf, SerializationExtension(context.system))
    (zkFactory.createPersistenceEngine(), zkFactory.createLeaderElectionAgent(this))
  // Local file-system-based persistence engine
  case "FILESYSTEM" =>
    val fsFactory =
      new FileSystemRecoveryModeFactory(conf, SerializationExtension(context.system))
    (fsFactory.createPersistenceEngine(), fsFactory.createLeaderElectionAgent(this))
  // Custom persistence engine, instantiated via reflection
  case "CUSTOM" =>
    val clazz = Class.forName(conf.get("spark.deploy.recoveryMode.factory"))
    val factory = clazz.getConstructor(conf.getClass, Serialization.getClass)
      .newInstance(conf, SerializationExtension(context.system))
      .asInstanceOf[StandaloneRecoveryModeFactory]
    (factory.createPersistenceEngine(), factory.createLeaderElectionAgent(this))
  // No recovery mode configured: no persistence, this Master is immediately the leader
  case _ =>
    (new BlackHolePersistenceEngine(), new MonarchyLeaderAgent(this))
}
persistenceEngine = persistenceEngine_
leaderElectionAgent = leaderElectionAgent_
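RECOVERY_MODE itself is simply read from the configuration. It defaults to NONE, which is why an unconfigured Master falls into the BlackHolePersistenceEngine branch above:

val RECOVERY_MODE = conf.get("spark.deploy.recoveryMode", "NONE")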
When elected leader, the Master uses the persistence engine to read the persisted storedApps, storedDrivers, and storedWorkers. If any of storedApps, storedDrivers, or storedWorkers is non-empty, the persisted Application, Driver, and Worker information is re-registered into the Master's internal in-memory cache structures:
case ElectedLeader => {
  // Read the persisted data (apps, drivers, workers) from the persistence engine
  val (storedApps, storedDrivers, storedWorkers) = persistenceEngine.readPersistedData()
  state = if (storedApps.isEmpty && storedDrivers.isEmpty && storedWorkers.isEmpty) {
    // If apps, drivers, and workers are all empty, set RecoveryState to ALIVE
    RecoveryState.ALIVE
  } else {
    // If any of them is non-empty, set it to RECOVERING
    RecoveryState.RECOVERING
  }
  logInfo("I have been elected leader! New state: " + state)
  // If the state is RECOVERING, recovery is needed
  if (state == RecoveryState.RECOVERING) {
    // Re-register storedApps, storedDrivers, and storedWorkers into the Master's
    // internal in-memory cache structures
    beginRecovery(storedApps, storedDrivers, storedWorkers)
    // Fall back to completing recovery after WORKER_TIMEOUT, even if some
    // workers or apps never respond
    recoveryCompletionTask = context.system.scheduler.scheduleOnce(WORKER_TIMEOUT millis, self,
      CompleteRecovery)
  }
}
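Note that the scheduled CompleteRecovery message is only the timeout path. Recovery can also finish early, once every recovered worker and application has acknowledged the new Master; roughly, paraphrased from the same Master source:

// Fires when the WORKER_TIMEOUT fallback above triggers
case CompleteRecovery => completeRecovery()

// Checked after each MasterChanged acknowledgement: once nothing is left in
// the UNKNOWN state, completeRecovery() can run without waiting for the timeout
def canCompleteRecovery =
  workers.count(_.state == WorkerState.UNKNOWN) == 0 &&
  apps.count(_.state == ApplicationState.UNKNOWN) == 0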
Let's take a closer look at the beginRecovery() method called above.
// Begin recovery
def beginRecovery(storedApps: Seq[ApplicationInfo], storedDrivers: Seq[DriverInfo],
    storedWorkers: Seq[WorkerInfo]) {
  for (app <- storedApps) {
    logInfo("Trying to recover app: " + app.id)
    try {
      // Re-register the application
      registerApplication(app)
      // Set the application state to UNKNOWN
      app.state = ApplicationState.UNKNOWN
      // Send a MasterChanged message to the driver
      app.driver ! MasterChanged(masterUrl, masterWebUiUrl)
    } catch {
      case e: Exception => logInfo("App " + app.id + " had exception on reconnect")
    }
  }
  // Put storedDrivers back into the in-memory cache
  for (driver <- storedDrivers) {
    // Here we just read in the list of drivers. Any drivers associated with now-lost workers
    // will be re-launched when we detect that the worker is missing.
    drivers += driver
  }
  // Put storedWorkers back into the in-memory cache
  for (worker <- storedWorkers) {
    logInfo("Trying to recover worker: " + worker.id)
    try {
      // Re-register the worker
      registerWorker(worker)
      // Set the worker state to UNKNOWN
      worker.state = WorkerState.UNKNOWN
      // Send a MasterChanged message to the worker
      worker.actor ! MasterChanged(masterUrl, masterWebUiUrl)
    } catch {
      case e: Exception => logInfo("Worker " + worker.id + " had exception on reconnect")
    }
  }
}
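For context, on the receiving side the Worker reacts to MasterChanged by pointing itself at the new Master and reporting back everything it is running (a sketch, paraphrased from the Worker.scala of the same era):

case MasterChanged(masterUrl, masterWebUiUrl) =>
  logInfo("Master has changed, new master is at " + masterUrl)
  // Switch this Worker over to the newly elected Master
  changeMaster(masterUrl, masterWebUiUrl)
  // Report back every executor and driver running on this Worker, so the new
  // Master can rebuild its scheduling state and flip the Worker from UNKNOWN to ALIVE
  val execs = executors.values
    .map(e => new ExecutorDescription(e.appId, e.execId, e.cores, e.state))
  sender ! WorkerSchedulerStateResponse(workerId, execs.toList, drivers.keys.toSeq)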
Now let's look at the registerApplication() and registerWorker() methods in detail.
// Register an application
def registerApplication(app: ApplicationInfo): Unit = {
  val appAddress = app.driver.path.address
  if (addressToApp.contains(appAddress)) {
    logInfo("Attempted to re-register application at same address: " + appAddress)
    return
  }
  // Register the app's metrics source with Spark's metrics system
  applicationMetricsSystem.registerSource(app.appSource)
  // Put the app into the in-memory cache structures
  apps += app
  idToApp(app.id) = app
  actorToApp(app.driver) = app
  addressToApp(appAddress) = app
  // Add it to the queue of apps waiting to be scheduled
  waitingApps += app
}
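Note that registerApplication() only touches the in-memory structures. During normal (non-recovery) registration, it is the RegisterApplication message handler that also writes the app through to the persistence engine, and that write is exactly what readPersistedData() recovers after a failover (a sketch, paraphrased):

case RegisterApplication(description) => {
  if (state == RecoveryState.STANDBY) {
    // A Standby Master ignores registrations and sends no response
  } else {
    logInfo("Registering app " + description.name)
    val app = createApplication(description, sender)
    // Same in-memory registration as during recovery
    registerApplication(app)
    // Persist the app so a newly elected Master can recover it
    persistenceEngine.addApplication(app)
    sender ! RegisteredApplication(app.id, masterUrl)
    schedule()
  }
}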
def registerWorker(worker: WorkerInfo): Boolean = {
  // There may be one or more refs to dead workers on this same node (w/ different ID's),
  // remove them.
  workers.filter { w =>
    (w.host == worker.host && w.port == worker.port) && (w.state == WorkerState.DEAD)
  }.foreach { w =>
    workers -= w
  }
  val workerAddress = worker.actor.path.address
  if (addressToWorker.contains(workerAddress)) {
    val oldWorker = addressToWorker(workerAddress)
    if (oldWorker.state == WorkerState.UNKNOWN) {
      // A worker registering from UNKNOWN implies that the worker was restarted during recovery.
      // The old worker must thus be dead, so we will remove it and accept the new worker.
      removeWorker(oldWorker)
    } else {
      logInfo("Attempted to re-register worker at same address: " + workerAddress)
      return false
    }
  }
  // Save the WorkerInfo into the workers HashSet
  workers += worker
  // Save the worker's id into the idToWorker HashMap
  idToWorker(worker.id) = worker
  // Save the worker endpoint's address into the addressToWorker HashMap
  addressToWorker(workerAddress) = worker
  true
}
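The UNKNOWN states set in beginRecovery() are cleared as the drivers and workers answer MasterChanged; each acknowledgement also checks whether recovery can finish early via canCompleteRecovery (a sketch, paraphrased from the same Master receive loop):

// An application's driver acknowledges the new Master
case MasterChangeAcknowledged(appId) => {
  idToApp.get(appId) match {
    case Some(app) =>
      logInfo("Application has been re-registered: " + appId)
      app.state = ApplicationState.WAITING
    case None =>
      logWarning("Master change ack from unknown app: " + appId)
  }
  if (canCompleteRecovery) { completeRecovery() }
}

// A worker reports back its executors and drivers
case WorkerSchedulerStateResponse(workerId, executors, driverIds) => {
  idToWorker.get(workerId) match {
    case Some(worker) =>
      logInfo("Worker has been re-registered: " + workerId)
      worker.state = WorkerState.ALIVE
      // ... re-attach the reported executors and drivers to this worker ...
    case None =>
      logWarning("Scheduler state from unknown worker: " + workerId)
  }
  if (canCompleteRecovery) { completeRecovery() }
}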
Finally, let's look at the completeRecovery() method.
// Complete the recovery
def completeRecovery() {
  // Ensure "only-once" recovery semantics using a short synchronization period.
  synchronized {
    if (state != RecoveryState.RECOVERING) { return }
    // Mark the state as COMPLETING_RECOVERY, so a second trigger
    // (timeout vs. last acknowledgement) becomes a no-op
    state = RecoveryState.COMPLETING_RECOVERY
  }
  // Kill off any workers and apps that didn't respond to us: anything whose
  // WorkerState or ApplicationState is still UNKNOWN never acknowledged MasterChanged.
  // Removal means: 1. dropping it from the in-memory cache structures;
  // 2. removing it from the memory of related components; 3. removing it from
  // persistent storage.
  // Remove unresponsive workers
  workers.filter(_.state == WorkerState.UNKNOWN).foreach(removeWorker)
  // Remove unresponsive applications
  apps.filter(_.state == ApplicationState.UNKNOWN).foreach(finishApplication)
  // Reschedule drivers which were not claimed by any workers
  drivers.filter(_.worker.isEmpty).foreach { d =>
    logWarning(s"Driver ${d.id} was not found after master recovery")
    if (d.desc.supervise) {
      logWarning(s"Re-launching ${d.id}")
      // Supervised drivers are relaunched
      relaunchDriver(d)
    } else {
      // Unsupervised drivers are simply removed
      removeDriver(d.id, DriverState.ERROR, None)
      logWarning(s"Did not re-launch ${d.id} because it was not supervised")
    }
  }
  // Switch the state to ALIVE: recovery is complete
  state = RecoveryState.ALIVE
  // Re-run scheduling now that recovery is finished
  schedule()
  logInfo("Recovery complete - resuming operations!")
}
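This is also where the comment in beginRecovery() pays off: removeWorker() is what notifies drivers of lost executors and re-launches supervised drivers that were running on a lost worker, and it performs all three removal steps listed above (a sketch, paraphrased):

def removeWorker(worker: WorkerInfo) {
  logInfo("Removing worker " + worker.id + " on " + worker.host + ":" + worker.port)
  // 1. Remove the worker from the in-memory cache structures
  worker.setState(WorkerState.DEAD)
  idToWorker -= worker.id
  addressToWorker -= worker.actor.path.address
  // 2. Remove it from related components: tell each affected application's
  // driver that its executor on this worker is lost
  for (exec <- worker.executors.values) {
    exec.application.driver ! ExecutorUpdated(
      exec.id, ExecutorState.LOST, Some("worker lost"), None)
    exec.application.removeExecutor(exec)
  }
  // Drivers on the lost worker: relaunch if supervised, otherwise drop with an error
  for (driver <- worker.drivers.values) {
    if (driver.desc.supervise) {
      relaunchDriver(driver)
    } else {
      removeDriver(driver.id, DriverState.ERROR, None)
    }
  }
  // 3. Remove it from persistent storage
  persistenceEngine.removeWorker(worker)
}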