Background: when the cluster starts, the Master and the Workers are launched.
When a user submits an application:
1. First a SparkContext is created (new SparkContext), which in turn creates the DAGScheduler, TaskSchedulerImpl, and SparkDeploySchedulerBackend, and then starts the TaskSchedulerImpl.
2. When TaskSchedulerImpl starts, it starts its SchedulerBackend; in standalone mode that means starting SparkDeploySchedulerBackend.
3. When SparkDeploySchedulerBackend starts, it first calls start on its parent class CoarseGrainedSchedulerBackend. That start method sets up the inner class DriverEndpoint, which is an RpcEndpoint; its onStart method schedules a periodic task that runs Option(self).foreach(_.send(ReviveOffers)), sending ReviveOffers to the endpoint itself. ReviveOffers is an empty object; receiving it triggers makeOffers, which "Make[s] fake resource offers on all executors".
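Step 3's pattern, an endpoint periodically sending an empty marker message to itself and reacting to it in its receive loop, can be sketched with a self-contained toy. These are not Spark's actual classes; ToyDriverEndpoint and its fields are invented for illustration:

```scala
import java.util.concurrent.{Executors, LinkedBlockingQueue, TimeUnit}

// Marker message with no payload, like Spark's ReviveOffers case object.
case object ReviveOffers

class ToyDriverEndpoint {
  private val inbox = new LinkedBlockingQueue[Any]()
  private val reviveThread = Executors.newSingleThreadScheduledExecutor()
  var offersMade = 0

  def send(msg: Any): Unit = inbox.put(msg)

  // Mirrors the periodic Option(self).foreach(_.send(ReviveOffers)) in onStart.
  def onStart(intervalMs: Long): Unit =
    reviveThread.scheduleAtFixedRate(
      () => Option(this).foreach(_.send(ReviveOffers)),
      0, intervalMs, TimeUnit.MILLISECONDS)

  // Stand-in for the receive loop: ReviveOffers triggers "makeOffers".
  def processOne(): Unit = inbox.take() match {
    case ReviveOffers => offersMade += 1
    case _            =>
  }

  def stop(): Unit = reviveThread.shutdownNow()
}
```

Calling onStart and then processOne once consumes the first self-sent ReviveOffers and bumps offersMade, standing in for one round of resource offers.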
4. After SparkDeploySchedulerBackend's start method has started the DriverEndpoint via its parent CoarseGrainedSchedulerBackend's start, it creates an AppClient and starts it. AppClient's start sets up the inner class ClientEndpoint(rpcEnv), another RPC endpoint; once instantiated, its onStart method runs automatically and registers the application with the Master: masterRef.send(RegisterApplication(appDescription, self)). (In fact the message is sent to all Masters; once a connection to one Master succeeds, the attempts to the others are cancelled.) Note that appDescription carries the application's details, including the command, and that self here is the ClientEndpoint itself.
5. The Master is itself a ThreadSafeRpcEndpoint. On receiving RegisterApplication(description, driver) from the ClientEndpoint, it calls createApplication(description, driver) and registerApplication(app) to create and register the Application, replies that registration succeeded with driver.send(RegisteredApplication(app.id, self)) (note: this driver is actually the ClientEndpoint!), and then calls schedule().
6. On receiving RegisteredApplication(appId_, masterRef), the ClientEndpoint sets master = Some(masterRef) and calls listener.connected(appId.get) (the latter effectively calls the concrete AppClientListener implementation, SparkDeploySchedulerBackend.connected(appId.get)). At this point the ClientEndpoint holds the registered Application's ID and the Master's address.
7. After registering the Application, the Master calls schedule(), which calls startExecutorsOnWorkers(). That method calls scheduleExecutorsOnWorkers(app, usableWorkers, spreadOutApps) and allocateWorkerResourceToExecutors(app, assignedCores(pos), coresPerExecutor, usableWorkers(pos)); the latter calls launchExecutor(worker, exec). Look closely at launchExecutor(worker: WorkerInfo, exec: ExecutorDesc): it sends the following two messages:
worker.endpoint.send(LaunchExecutor(masterUrl, exec.application.id, exec.id, exec.application.desc, exec.cores, exec.memory))
exec.application.driver.send(ExecutorAdded(exec.id, worker.id, worker.hostPort, exec.cores, exec.memory))
8. The first message above tells the Worker to launch an executor. The Worker is itself an RPC endpoint; on receiving LaunchExecutor(masterUrl, appId, execId, appDesc, cores_, memory_), it creates and starts an ExecutorRunner (which "Manages the execution of one executor process"), and then reports the executor's state change to the Master: sendToMaster(ExecutorStateChanged(appId, execId, manager.state, None, None)).
9. The second message above tells the ClientEndpoint that it has been granted an executor. On receiving ExecutorAdded(id: Int, workerId: String, hostPort: String, cores: Int, memory: Int), the ClientEndpoint calls listener.executorAdded(fullId, workerId, hostPort, cores, memory), which effectively calls the concrete AppClientListener implementation SparkDeploySchedulerBackend.executorAdded(fullId: String, workerId: String, hostPort: String, cores: Int, memory: Int). At this point the executor is registered.
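The message flow in steps 4-9 boils down to case-class messages that each endpoint pattern-matches in its receive method. A trimmed-down sketch of that mechanic (the case-class names mirror Spark's, but the field types are simplified here for illustration):

```scala
// Simplified stand-ins for the RPC messages exchanged in steps 4-9.
case class RegisterApplication(appDescription: String, driver: String)
case class RegisteredApplication(appId: String, masterRef: String)
case class LaunchExecutor(masterUrl: String, appId: String, execId: Int,
                          appDesc: String, cores: Int, memory: Int)
case class ExecutorAdded(id: Int, workerId: String, hostPort: String,
                         cores: Int, memory: Int)

// A receive-style handler, as each endpoint (Master, Worker, ClientEndpoint)
// implements: one partial function matching on the message type.
def describe(msg: Any): String = msg match {
  case RegisterApplication(_, drv) =>
    s"Master: register app from $drv"
  case RegisteredApplication(id, m) =>
    s"ClientEndpoint: app $id registered at $m"
  case LaunchExecutor(_, appId, execId, _, c, mem) =>
    s"Worker: launch executor $execId for $appId ($c cores, ${mem}MB)"
  case ExecutorAdded(id, wId, _, _, _) =>
    s"ClientEndpoint: executor $id added on $wId"
  case _ =>
    "unhandled"
}
```

Each real endpoint's receive works the same way: the sender constructs a case class, the RPC layer delivers it, and the receiver's match picks the handler.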
10. Now let's look closely at how the Worker launches the executor. The Worker creates an ExecutorRunner and calls its start() method, which calls fetchAndRunExecutor(). That method contains the following code:
val builder = CommandUtils.buildProcessBuilder(appDesc.command, new SecurityManager(conf), memory, sparkHome.getAbsolutePath, substituteVariables)
process = builder.start()
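What builder.start() amounts to can be sketched with plain java.lang.ProcessBuilder. This is a simplified stand-in for what CommandUtils.buildProcessBuilder ultimately produces, not its actual implementation; the main class and arguments passed in below are placeholders:

```scala
import java.io.File
import scala.jdk.CollectionConverters._

// Build a ProcessBuilder whose command line is "java -cp <cp> <mainClass> <args>",
// which is the essence of how the Worker forks a new executor JVM.
def buildProcessBuilder(mainClass: String, args: Seq[String],
                        workingDir: File): ProcessBuilder = {
  val javaBin =
    new File(new File(System.getProperty("java.home"), "bin"), "java").getPath
  val cmd =
    Seq(javaBin, "-cp", System.getProperty("java.class.path"), mainClass) ++ args
  val builder = new ProcessBuilder(cmd.asJava)
  builder.directory(workingDir) // executor's working directory
  builder                       // the Worker then calls builder.start()
}
```

Calling start() on the returned builder forks a new OS process running the given main class, which is exactly the step where the executor JVM comes to life.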
This is the key point where the new process is built and started! Everything about the process to launch lives in this builder. Let's see what it contains and where that information comes from.
The ExecutorRunner's appDesc.command comes from the appDesc in the case-class message LaunchExecutor(masterUrl, appId, execId, appDesc, cores_, memory_) that the Worker received from the Master. The Master's appDesc in turn comes from the appDescription in the case-class message RegisterApplication(appDescription: ApplicationDescription, driver: RpcEndpointRef) that it received from the ClientEndpoint. And the ClientEndpoint's appDescription comes from the appDesc passed in by SparkDeploySchedulerBackend when the AppClient was instantiated. That appDesc contains the command:
val appDesc = new ApplicationDescription(sc.appName, maxCores, sc.executorMemory, command, appUIAddress, sc.eventLogDir, sc.eventLogCodec, coresPerExecutor, initialExecutorLimit)
val command = Command("org.apache.spark.executor.CoarseGrainedExecutorBackend", args, sc.executorEnvs, classPathEntries ++ testingClassPath, libraryPathEntries, javaOpts)
We can see that command contains the name of the class to launch, CoarseGrainedExecutorBackend, as well as the args, whose contents are:
val args = Seq(
"--driver-url", driverUrl,
"--executor-id", "{{EXECUTOR_ID}}",
"--hostname", "{{HOSTNAME}}",
"--cores", "{{CORES}}",
"--app-id", "{{APP_ID}}",
"--worker-url", "{{WORKER_URL}}")
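Before the process is launched, the Worker replaces the {{...}} placeholders with concrete values (via the substituteVariables function passed to buildProcessBuilder). A minimal sketch of that substitution, with made-up values:

```scala
// Replace every {{KEY}} placeholder in an argument with its concrete value,
// as the Worker does before building the executor's command line.
def substituteVariables(arg: String, values: Map[String, String]): String =
  values.foldLeft(arg) { case (s, (k, v)) => s.replace(s"{{$k}}", v) }

val filled = Seq("--executor-id", "{{EXECUTOR_ID}}", "--cores", "{{CORES}}")
  .map(substituteVariables(_, Map("EXECUTOR_ID" -> "0", "CORES" -> "4")))
// filled == Seq("--executor-id", "0", "--cores", "4")
```

This is why the args template above can be built on the driver side while the executor id, hostname, and core count are only known on the Worker.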
The driverUrl here is built as follows:
// The endpoint for executors to talk to us
val driverUrl = RpcEndpointAddress(
sc.conf.get("spark.driver.host"),
sc.conf.get("spark.driver.port").toInt,
CoarseGrainedSchedulerBackend.ENDPOINT_NAME).toString
We can see that the endpoint name behind driverUrl is CoarseGrainedSchedulerBackend.ENDPOINT_NAME, whose value is "CoarseGrainedScheduler". So everything falls into place: the driverUrl handed to the CoarseGrainedExecutorBackend process at launch, i.e. the peer it will talk to from then on, is set right here by SparkDeploySchedulerBackend, and its endpoint name is "CoarseGrainedScheduler". Since driverEndpoint is registered in the rpcEnv under the name "CoarseGrainedScheduler" while clientEndpoint is registered under "AppClient", the peer that CoarseGrainedExecutorBackend talks to is driverEndpoint, not clientEndpoint!
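For reference, RpcEndpointAddress.toString in Spark's netty RPC layer produces a URL of the form spark://&lt;endpointName&gt;@&lt;host&gt;:&lt;port&gt;. A minimal stand-in (the host and port below are invented examples; the driver's actual port is assigned at runtime):

```scala
// Toy version of RpcEndpointAddress: just enough to show the URL shape.
case class RpcEndpointAddress(host: String, port: Int, name: String) {
  override def toString: String = s"spark://$name@$host:$port"
}

val driverUrl =
  RpcEndpointAddress("192.168.1.10", 56789, "CoarseGrainedScheduler").toString
// driverUrl == "spark://CoarseGrainedScheduler@192.168.1.10:56789"
```

The endpoint name embedded in this URL is what lets the executor resolve the DriverEndpoint (and not some other endpoint) inside the driver's rpcEnv.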
Note: clientEndpoint is registered in the rpcEnv under the endpoint name "AppClient", as this source shows:
def start() {
// Just launch an rpcEndpoint; it will call back into the listener.
endpoint.set(rpcEnv.setupEndpoint("AppClient", new ClientEndpoint(rpcEnv)))
}
Note: driverEndpoint is registered in the rpcEnv under the endpoint name "CoarseGrainedScheduler", as this source shows:
driverEndpoint = rpcEnv.setupEndpoint(ENDPOINT_NAME, createDriverEndpoint(properties))
The ENDPOINT_NAME here comes from the following source:
private[spark] object CoarseGrainedSchedulerBackend {
val ENDPOINT_NAME = "CoarseGrainedScheduler"
}
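The effect of setupEndpoint's name argument can be sketched with a toy registry (ToyRpcEnv is invented for illustration; Spark's real RpcEnv does far more, including network dispatch):

```scala
import scala.collection.mutable

// Toy registry: setupEndpoint binds an endpoint under a name so that peers
// can later address it by that name, which is all this walkthrough relies on.
class ToyRpcEnv {
  private val endpoints = mutable.Map[String, Any]()
  def setupEndpoint(name: String, endpoint: Any): Any = {
    endpoints(name) = endpoint
    endpoint
  }
  def lookup(name: String): Option[Any] = endpoints.get(name)
}

val ENDPOINT_NAME = "CoarseGrainedScheduler"
val env = new ToyRpcEnv
env.setupEndpoint(ENDPOINT_NAME, "driverEndpoint")
env.setupEndpoint("AppClient", "clientEndpoint")
```

An executor resolving "CoarseGrainedScheduler" in this registry reaches the driverEndpoint, never the clientEndpoint registered as "AppClient", which is the point of the note above.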