- org.apache.spark.deploy.master.Master
- Let's start with Master's companion object, which is the entry point of the Java process (launched by start-master.sh):
private[deploy] object Master extends Logging {
  val SYSTEM_NAME = "sparkMaster"
  val ENDPOINT_NAME = "Master"
  def main(argStrings: Array[String]) {
    Thread.setDefaultUncaughtExceptionHandler(new SparkUncaughtExceptionHandler(
      exitOnUncaughtException = false))
    Utils.initDaemon(log)
    val conf = new SparkConf
    val args = new MasterArguments(argStrings, conf)
    val (rpcEnv, _, _) = startRpcEnvAndEndpoint(args.host, args.port, args.webUiPort, conf)
    rpcEnv.awaitTermination()
  }
  def startRpcEnvAndEndpoint(
      host: String,
      port: Int,
      webUiPort: Int,
      conf: SparkConf): (RpcEnv, Int, Option[Int]) = {
    val securityMgr = new SecurityManager(conf)
    val rpcEnv = RpcEnv.create(SYSTEM_NAME, host, port, conf, securityMgr)
    val masterEndpoint = rpcEnv.setupEndpoint(ENDPOINT_NAME,
      new Master(rpcEnv, rpcEnv.address, webUiPort, securityMgr, conf))
    val portsResponse = masterEndpoint.askSync[BoundPortsResponse](BoundPortsRequest)
    (rpcEnv, portsResponse.webUIPort, portsResponse.restPort)
  }
}
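- This code is fairly simple. When the shell runs start-master.sh, a Java process is started. The arguments passed in are parsed by MasterArguments; the most important ones are host, port, and webUiPort.
- Next, startRpcEnvAndEndpoint(…) is called to create the NettyRpcEnv and the Master, and to register the Master with the RpcEnv.
- The NettyRpcEnv is created by calling create(…) on NettyRpcEnvFactory.
- The Master is instantiated directly with new; at this point the constructor of this RpcEndpoint runs.
- Registering the Master calls setupEndpoint(…), which in turn calls the dispatcher's registerRpcEndpoint(…) method:
- It creates an EndpointData for the Master, which contains an Inbox. When the Inbox is instantiated, it also puts an OnStart message into its queue.
- It puts the EndpointData into the receivers queue, from which the MessageLoop later takes it.
- So we can see that when the Master is instantiated, its constructor runs first. Then, when it is registered with the RpcEnv, the first message, OnStart, is placed in its Inbox. The MessageLoop then takes out this OnStart message and processes it by calling the onStart method of the Master endpoint. In other words, the first part of the Master's lifecycle is: constructor -> onStart -> …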
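The ordering described above (constructor, then OnStart, then ordinary messages) can be illustrated with a minimal, self-contained sketch. Everything here (LifecycleSketch, Endpoint, Inbox, RpcMessage) is a hypothetical, heavily simplified stand-in written for this note, not Spark's actual classes; it only assumes, as the text says, that the inbox enqueues OnStart when it is created.

```scala
import scala.collection.mutable

// Hypothetical, simplified stand-ins for Spark's RpcEndpoint and Inbox.
object LifecycleSketch {
  sealed trait InboxMessage
  case object OnStart extends InboxMessage
  final case class RpcMessage(content: Any) extends InboxMessage

  trait Endpoint {
    def onStart(): Unit
    def receive(msg: Any): Unit
  }

  // The inbox enqueues OnStart as soon as it is constructed,
  // so OnStart is always the first message the endpoint sees.
  final class Inbox(val endpoint: Endpoint) {
    private val messages = mutable.Queue[InboxMessage](OnStart)

    def post(msg: InboxMessage): Unit = messages.enqueue(msg)

    // Drain the queue, dispatching each message to the endpoint.
    def process(): Unit = {
      while (messages.nonEmpty) {
        messages.dequeue() match {
          case OnStart         => endpoint.onStart()
          case RpcMessage(msg) => endpoint.receive(msg)
        }
      }
    }
  }

  // Records the order of lifecycle events for a toy endpoint.
  def run(): List[String] = {
    val events = mutable.ListBuffer[String]()
    val endpoint = new Endpoint {
      events += "constructor" // runs while the endpoint is being constructed
      def onStart(): Unit = { events += "onStart"; () }
      def receive(msg: Any): Unit = { events += s"receive($msg)"; () }
    }
    val inbox = new Inbox(endpoint) // plays the role of registration
    inbox.post(RpcMessage("hello"))
    inbox.process()                 // what a message loop would do
    events.toList
  }
}
```

Calling LifecycleSketch.run() returns the events in the order constructor, onStart, receive(hello), which mirrors the lifecycle summarized above.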
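- When a Worker starts, it has to register with the Master. Let's look at this part of the code in detail.
- The registerWithMaster() method, called from the Worker's onStart(), is as follows: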
private def registerWithMaster() {
  registrationRetryTimer match {
    case None =>
      registered = false
      registerMasterFutures = tryRegisterAllMasters()
      connectionAttemptCount = 0
      registrationRetryTimer = Some(forwordMessageScheduler.scheduleAtFixedRate(
        new Runnable {
          override def run(): Unit = Utils.tryLogNonFatalError {
            Option(self).foreach(_.send(ReregisterWithMaster))
          }
        },
        INITIAL_REGISTRATION_RETRY_INTERVAL_SECONDS,
        INITIAL_REGISTRATION_RETRY_INTERVAL_SECONDS,
        TimeUnit.SECONDS))
    case Some(_) =>
      logInfo("Not spawning another attempt to register with the master, since there is an" +
        " attempt scheduled already.")
  }
}
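The key idea in registerWithMaster() is that an Option-typed timer field guards against scheduling more than one retry task. Below is a minimal sketch of that guard, assuming a plain java.util.concurrent scheduler; RetryTimerSketch, startRetrying, attempts, and stop are hypothetical names for illustration, and the counter stands in for sending ReregisterWithMaster.

```scala
import java.util.concurrent.{Executors, ScheduledFuture, ThreadFactory, TimeUnit}
import java.util.concurrent.atomic.AtomicInteger

// Hypothetical sketch of the "only one retry timer" guard.
object RetryTimerSketch {
  // Daemon thread so a forgotten timer does not keep the JVM alive.
  private val scheduler = Executors.newSingleThreadScheduledExecutor(new ThreadFactory {
    override def newThread(r: Runnable): Thread = {
      val t = new Thread(r, "retry-timer")
      t.setDaemon(true)
      t
    }
  })

  private var retryTimer: Option[ScheduledFuture[_]] = None

  // Counts how many times the periodic task has fired.
  val attempts = new AtomicInteger(0)

  // Returns true if a new timer was started, false if one already existed.
  def startRetrying(intervalMillis: Long): Boolean = synchronized {
    retryTimer match {
      case None =>
        retryTimer = Some(scheduler.scheduleAtFixedRate(
          new Runnable {
            override def run(): Unit = { attempts.incrementAndGet(); () }
          },
          intervalMillis, intervalMillis, TimeUnit.MILLISECONDS))
        true
      case Some(_) =>
        false // a retry is already scheduled; do not spawn another
    }
  }

  // Cancels the timer and shuts the scheduler down (the sketch is single-use).
  def stop(): Unit = synchronized {
    retryTimer.foreach(_.cancel(true))
    retryTimer = None
    scheduler.shutdownNow()
  }
}
```

A second call to startRetrying while a timer is active returns false, which corresponds to the case Some(_) branch that only logs a message.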
- Next, let's look at the code of tryRegisterAllMasters():
private def tryRegisterAllMasters(): Array[JFuture[_]] = {
  masterRpcAddresses.map { masterAddress =>
    registerMasterThreadPool.submit(new Runnable {
      override def run(): Unit = {
        try {
          logInfo("Connecting to master " + masterAddress + "...")
          val masterEndpoint = rpcEnv.setupEndpointRef(masterAddress, Master.ENDPOINT_NAME)
          sendRegisterMessageToMaster(masterEndpoint)
        } catch {
          case ie: InterruptedException => // Cancelled
          case NonFatal(e) => logWarning(s"Failed to connect to master $masterAddress", e)
        }
      }
    })
  }
}
- Now look at the sendRegisterMessageToMaster(…) method:
private def sendRegisterMessageToMaster(masterEndpoint: RpcEndpointRef): Unit = {
  masterEndpoint.send(RegisterWorker(
    workerId,
    host,
    port,
    self,
    cores,
    memory,
    workerWebUiUrl,
    masterEndpoint.address))
}
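- This is where the Worker actually sends the RegisterWorker message to the Master to register itself.
- Quick navigation tip: Ctrl+left-click RegisterWorker to jump to case class RegisterWorker. Ctrl+left-click RegisterWorker again, and IDEA lists the places where it is used. The list includes:
- Worker.scala <- masterEndpoint.send(RegisterWorker( : this is where the Worker sends the message
- Master.scala <- case RegisterWorker( : this is where the Master receives the message
- Using this trick, we can jump quickly between the code of different RpcEndpoints, which makes it easier to follow how they interact. (For the concrete implementation of message passing, see RpcEndpoint and RpcEnv.)
- We have now arrived at the Master's receive method. The code is as follows: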
override def receive: PartialFunction[Any, Unit] = {
  case RegisterWorker(
      id, workerHost, workerPort, workerRef, cores, memory, workerWebUiUrl, masterAddress) =>
    logInfo("Registering worker %s:%d with %d cores, %s RAM".format(
      workerHost, workerPort, cores, Utils.megabytesToString(memory)))
    if (state == RecoveryState.STANDBY) {
      workerRef.send(MasterInStandby)
    } else if (idToWorker.contains(id)) {
      workerRef.send(RegisterWorkerFailed("Duplicate worker ID"))
    } else {
      val worker = new WorkerInfo(id, workerHost, workerPort, cores, memory,
        workerRef, workerWebUiUrl)
      if (registerWorker(worker)) {
        persistenceEngine.addWorker(worker)
        workerRef.send(RegisteredWorker(self, masterWebUiUrl, masterAddress))
        schedule()
      } else {
        val workerAddress = worker.endpoint.address
        logWarning("Worker registration failed. Attempted to re-register worker at same " +
          "address: " + workerAddress)
        workerRef.send(RegisterWorkerFailed("Attempted to re-register worker at same address: "
          + workerAddress))
      }
    }
}
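- When the Master receives the message, it checks whether this node is in the STANDBY state and whether the Worker ID is already registered. If both checks pass, it calls registerWorker(…) to add the worker to this node and, on success, replies to the Worker with a RegisteredWorker message.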
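The decision order in that handler (standby check, duplicate ID check, then the attempt to register) can be restated as a small pure-logic sketch. RegistrationSketch and ToyMaster are hypothetical names, and the address check is a simplification assumed here: the real registerWorker(…) applies more detailed rules than a plain "address already in use" test.

```scala
import scala.collection.mutable

// Hypothetical, simplified model of the Master's registration decision.
object RegistrationSketch {
  sealed trait Reply
  case object MasterInStandby extends Reply
  final case class RegisterWorkerFailed(reason: String) extends Reply
  case object RegisteredWorker extends Reply

  // Toy master state: a standby flag plus a map of worker id -> address.
  final class ToyMaster(var standby: Boolean = false) {
    private val idToAddress = mutable.Map[String, String]()

    def register(id: String, address: String): Reply = {
      if (standby) {
        MasterInStandby
      } else if (idToAddress.contains(id)) {
        RegisterWorkerFailed("Duplicate worker ID")
      } else if (idToAddress.values.exists(_ == address)) {
        // Simplified stand-in for registerWorker(...) returning false.
        RegisterWorkerFailed("Attempted to re-register worker at same address: " + address)
      } else {
        idToAddress(id) = address
        RegisteredWorker
      }
    }
  }
}
```

Returning the reply as a value, instead of sending it through an endpoint reference, keeps the branch order easy to check in isolation.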
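- Following the RegisteredWorker message, we arrive at the Worker's receiving side. In the Worker, receive is called first, the message matches RegisterWorkerResponse, and handleRegisterResponse(…) is then called. The code is as follows: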
private def handleRegisterResponse(msg: RegisterWorkerResponse): Unit = synchronized {
  msg match {
    case RegisteredWorker(masterRef, masterWebUiUrl, masterAddress) =>
      if (preferConfiguredMasterAddress) {
        logInfo("Successfully registered with master " + masterAddress.toSparkURL)
      } else {
        logInfo("Successfully registered with master " + masterRef.address.toSparkURL)
      }
      registered = true
      changeMaster(masterRef, masterWebUiUrl, masterAddress)
      forwordMessageScheduler.scheduleAtFixedRate(new Runnable {
        override def run(): Unit = Utils.tryLogNonFatalError {
          self.send(SendHeartbeat)
        }
      }, 0, HEARTBEAT_MILLIS, TimeUnit.MILLISECONDS)
      if (CLEANUP_ENABLED) {
        logInfo(
          s"Worker cleanup enabled; old application directories will be deleted in: $workDir")
        forwordMessageScheduler.scheduleAtFixedRate(new Runnable {
          override def run(): Unit = Utils.tryLogNonFatalError {
            self.send(WorkDirCleanup)
          }
        }, CLEANUP_INTERVAL_MILLIS, CLEANUP_INTERVAL_MILLIS, TimeUnit.MILLISECONDS)
      }
      val execs = executors.values.map { e =>
        new ExecutorDescription(e.appId, e.execId, e.cores, e.state)
      }
      masterRef.send(WorkerLatestState(workerId, execs.toList, drivers.keys.toSeq))
    case RegisterWorkerFailed(message) =>
      if (!registered) {
        logError("Worker registration failed: " + message)
        System.exit(1)
      }
    case MasterInStandby =>
      // Ignore. Master not yet ready.
  }
}
- Finally, the Master receives the WorkerLatestState message. The code is as follows:
override def receive: PartialFunction[Any, Unit] = {
  case WorkerLatestState(workerId, executors, driverIds) =>
    idToWorker.get(workerId) match {
      case Some(worker) =>
        for (exec <- executors) {
          val executorMatches = worker.executors.exists {
            case (_, e) => e.application.id == exec.appId && e.id == exec.execId
          }
          if (!executorMatches) {
            worker.endpoint.send(KillExecutor(masterUrl, exec.appId, exec.execId))
          }
        }
        for (driverId <- driverIds) {
          val driverMatches = worker.drivers.exists { case (id, _) => id == driverId }
          if (!driverMatches) {
            worker.endpoint.send(KillDriver(driverId))
          }
        }
      case None =>
        logWarning("Worker state from unknown worker: " + workerId)
    }
}
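The WorkerLatestState handler above is essentially a reconciliation step: anything the Worker reports that the Master does not know about gets a kill command. A minimal sketch of that idea follows, assuming simplified types; ReconcileSketch, ExecutorId, and reconcile are hypothetical names, and the commands are returned as values instead of being sent to an endpoint.

```scala
// Hypothetical sketch: compute kill commands for executors and drivers
// that a worker reports but the master has no record of.
object ReconcileSketch {
  final case class ExecutorId(appId: String, execId: Int)

  sealed trait Command
  final case class KillExecutor(appId: String, execId: Int) extends Command
  final case class KillDriver(driverId: String) extends Command

  def reconcile(
      knownExecutors: Set[ExecutorId],
      knownDrivers: Set[String],
      reportedExecutors: Seq[ExecutorId],
      reportedDrivers: Seq[String]): Seq[Command] = {
    // Executors reported by the worker that the master does not track.
    val execKills: Seq[Command] = reportedExecutors
      .filterNot(e => knownExecutors.contains(e))
      .map(e => KillExecutor(e.appId, e.execId))
    // Drivers reported by the worker that the master does not track.
    val driverKills: Seq[Command] = reportedDrivers
      .filterNot(d => knownDrivers.contains(d))
      .map(d => KillDriver(d))
    execKills ++ driverKills
  }
}
```

For example, if the master knows executor (app-1, 0) and driver-1, and the worker reports an extra executor (app-2, 3) and an extra driver-9, reconcile returns one KillExecutor and one KillDriver for the two unknown entries.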
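- At this point, the communication flow for registering a Worker with the Master is complete. ^_^
- From here on, the cluster keeps running in the following pattern: each Worker periodically sends a heartbeat to the Master, while the Master periodically checks the Workers' heartbeats on its own node and removes any Worker that has timed out.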
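The heartbeat pattern can be sketched as a map from worker ID to the time of its last heartbeat, plus a periodic sweep. This is a simplified illustration under stated assumptions, not the Master's actual bookkeeping: HeartbeatSketch, ToyMaster, heartbeat, and removeTimedOut are hypothetical names, and time is passed in explicitly so the behaviour is deterministic.

```scala
import scala.collection.mutable

// Hypothetical sketch of heartbeat tracking with timeout-based removal.
object HeartbeatSketch {
  final class ToyMaster(timeoutMillis: Long) {
    private val lastHeartbeat = mutable.Map[String, Long]()

    // Called whenever a worker's heartbeat arrives.
    def heartbeat(workerId: String, nowMillis: Long): Unit =
      lastHeartbeat(workerId) = nowMillis

    // Removes and returns the workers whose last heartbeat is older than the timeout.
    def removeTimedOut(nowMillis: Long): Set[String] = {
      val dead = lastHeartbeat.collect {
        case (id, ts) if nowMillis - ts > timeoutMillis => id
      }.toSet
      dead.foreach(id => lastHeartbeat.remove(id))
      dead
    }

    def aliveWorkers: Set[String] = lastHeartbeat.keySet.toSet
  }
}
```

With a 1000 ms timeout, a worker last seen at time 0 is removed by a sweep at time 1500, while a worker last seen at time 900 stays registered.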
- The communication flow for registering a Worker with the Master is illustrated in the following diagram: