After the SparkContext is created, the workers allocate executors. The process is illustrated in the figure below:

As the figure shows, an executor goes through quite a few steps before it is created.
- SparkContext has a function named createTaskScheduler(), which creates the taskScheduler and the corresponding backend according to the type of the master URL. Its main code is as follows:
```scala
private def createTaskScheduler(
    sc: SparkContext,
    master: String): (SchedulerBackend, TaskScheduler) = {
  master match {
    case "local" =>
      val scheduler = new TaskSchedulerImpl(sc, MAX_LOCAL_TASK_FAILURES, isLocal = true)
      val backend = new LocalBackend(sc.getConf, scheduler, 1)
      scheduler.initialize(backend)
      (backend, scheduler)
    case LOCAL_N_REGEX(threads) => ......
    case LOCAL_N_FAILURES_REGEX(threads, maxFailures) => ......
    case SPARK_REGEX(sparkUrl) => ......
    case LOCAL_CLUSTER_REGEX(numSlaves, coresPerSlave, memoryPerSlave) => ......
    case "yarn-standalone" | "yarn-cluster" => ......
    case "yarn-client" => ......
    case MESOS_REGEX(mesosUrl) => ......
    case SIMR_REGEX(simrUrl) => ......
    case zkUrl if zkUrl.startsWith("zk://") => ......
    case _ => ......
  }
}
```
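The dispatch above is ordinary Scala pattern matching on the master URL string. A toy sketch of the same idea (the regex names mirror Spark's, but this simplified matcher and its return values are illustrative only):

```scala
// Minimal sketch of dispatching on a master URL string, as createTaskScheduler does.
object MasterUrlDispatch {
  // Simplified versions of the regexes Spark matches against.
  val LOCAL_N_REGEX = """local\[([0-9]+|\*)\]""".r
  val SPARK_REGEX = """spark://(.*)""".r

  def schedulerKind(master: String): String = master match {
    case "local"                => "local, single thread"
    case LOCAL_N_REGEX(threads) => s"local, $threads threads"
    case SPARK_REGEX(sparkUrl)  => s"standalone cluster at $sparkUrl"
    case _                      => "unsupported master URL"
  }

  def main(args: Array[String]): Unit = {
    println(schedulerKind("local"))            // local, single thread
    println(schedulerKind("local[4]"))         // local, 4 threads
    println(schedulerKind("spark://h:7077"))   // standalone cluster at h:7077
  }
}
```

Each case returns a different (backend, scheduler) pair in the real code; the first matching case wins, which is why the literal "local" is checked before the regex patterns.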
SparkContext calls this function to create the taskScheduler, and then starts it:
```scala
val (sched, ts) = SparkContext.createTaskScheduler(this, master)
_taskScheduler = ts
_dagScheduler = new DAGScheduler(this)
// start TaskScheduler after taskScheduler sets DAGScheduler reference in DAGScheduler's constructor
_taskScheduler.start()
```
Note that the DAGScheduler reference must be set before the taskScheduler starts. This is done in DAGScheduler.scala with the following line:

```scala
taskScheduler.setDAGScheduler(this)
```
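The ordering constraint can be sketched with two toy classes (hypothetical names, not Spark's actual types): the DAGScheduler constructor registers the back-reference, and only then is start() safe to call.

```scala
// Toy sketch of the wiring order: DAGScheduler's constructor sets itself
// on the task scheduler before the task scheduler is started.
class ToyTaskScheduler {
  var dagScheduler: ToyDAGScheduler = _
  def setDAGScheduler(dag: ToyDAGScheduler): Unit = { dagScheduler = dag }
  def start(): Unit =
    require(dagScheduler != null, "DAGScheduler must be set before start()")
}

class ToyDAGScheduler(taskScheduler: ToyTaskScheduler) {
  // Mirrors the line in DAGScheduler.scala: the constructor registers the back-reference.
  taskScheduler.setDAGScheduler(this)
}

object WiringOrder {
  def main(args: Array[String]): Unit = {
    val ts = new ToyTaskScheduler
    val dag = new ToyDAGScheduler(ts) // sets the back-reference
    ts.start()                        // now safe to start
    println(ts.dagScheduler eq dag)
  }
}
```

Calling ts.start() before constructing the ToyDAGScheduler would fail the require check, which is exactly the mistake the ordering in SparkContext avoids.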
- TaskScheduler has a start() method, which directly calls backend.start(). The core code:
```scala
override def start() {
  backend.start()
  ......
}
```
Where does this backend come from? Looking back at the createTaskScheduler function in SparkContext, the backend is handed to the scheduler by the line scheduler.initialize(backend).
- Next, let's see what the backend's start() function does (note: the backend may differ across deploy modes; the following code is from SparkDeploySchedulerBackend). Its main code:
```scala
val command = Command("org.apache.spark.executor.CoarseGrainedExecutorBackend",
  args, sc.executorEnvs, classPathEntries ++ testingClassPath, libraryPathEntries, javaOpts)
val appUIAddress = sc.ui.map(_.appUIAddress).getOrElse("")
val coresPerExecutor = conf.getOption("spark.executor.cores").map(_.toInt)
val appDesc = new ApplicationDescription(sc.appName, maxCores, sc.executorMemory,
  command, appUIAddress, sc.eventLogDir, sc.eventLogCodec, coresPerExecutor)
client = new AppClient(sc.env.rpcEnv, masters, appDesc, this, conf)
client.start()
```
First, the command variable is defined and initialized, naming "org.apache.spark.executor.CoarseGrainedExecutorBackend" as the class to launch; it is CoarseGrainedExecutorBackend that later creates the executor. Creating appDesc requires the command variable, and creating the AppClient in turn requires appDesc. Finally, the AppClient is started.
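The nesting of these values can be sketched with toy case classes (hypothetical, simplified stand-ins for Spark's Command, ApplicationDescription and AppClient):

```scala
// Toy sketch: the command names the backend class to launch, the app
// description wraps the command, and the client carries the description.
case class ToyCommand(mainClass: String, arguments: Seq[String])
case class ToyAppDescription(name: String, maxCores: Option[Int],
                             memoryPerExecutorMB: Int, command: ToyCommand)
class ToyAppClient(val appDesc: ToyAppDescription) {
  def start(): String =
    s"registering app '${appDesc.name}' that will launch ${appDesc.command.mainClass}"
}

object Composition {
  def main(args: Array[String]): Unit = {
    val command = ToyCommand(
      "org.apache.spark.executor.CoarseGrainedExecutorBackend",
      Seq("--executor-id", "0"))
    val appDesc = ToyAppDescription("demo", maxCores = Some(4),
      memoryPerExecutorMB = 1024, command)
    val client = new ToyAppClient(appDesc)
    println(client.start())
  }
}
```

The point of the layering is that the Master never runs the executor class directly: it only forwards the description, and whichever worker receives it can reconstruct the launch command.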
- What is this AppClient for? It registers the Application with the Master. See the following key code (the source is in AppClient.scala):
```scala
private def tryRegisterAllMasters() = {
  ...
  masterRef.send(RegisterApplication(appDescription, self))
  ...
}

override def receive: PartialFunction[Any, Unit] = {
  case RegisteredApplication(appId_, masterRef) => { ... }
  ...
}
```
This function sends the Application registration message to the Master. The receive function in Master.scala handles that message, so it is actually the Master that creates the Application. Once creation succeeds, the Master sends a confirmation back to the AppClient, which is handled by the receive function in AppClient.scala. The code that creates the application is as follows:
```scala
override def receive: PartialFunction[Any, Unit] = {
  ...
  case RegisterApplication(description, driver) => { // message sent by the AppClient
    ...
    val app = createApplication(description, driver)
    registerApplication(app) // register the newly created application
    driver.send(RegisteredApplication(app.id, self)) // reply to the AppClient; handled by its receive()
    ...
  }
}
```
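The request/acknowledgement exchange can be sketched at the message level (the message names mirror Spark's, but the plumbing below is simplified to direct method calls rather than real RPC):

```scala
// Toy sketch of the AppClient <-> Master handshake:
// client sends RegisterApplication, master creates and registers the app,
// then replies with RegisteredApplication carrying the new app id.
sealed trait Msg
case class RegisterApplication(appName: String) extends Msg
case class RegisteredApplication(appId: String) extends Msg

class ToyMaster {
  private var nextId = 0
  def receive(msg: Msg): Msg = msg match {
    case RegisterApplication(name) =>
      nextId += 1
      val appId = s"app-$nextId-$name"  // stands in for createApplication + registerApplication
      RegisteredApplication(appId)      // stands in for driver.send(...)
    case other => other
  }
}

class RegClient(master: ToyMaster) {
  def register(appName: String): String =
    master.receive(RegisterApplication(appName)) match {
      case RegisteredApplication(appId) => appId
      case _ => sys.error("unexpected reply")
    }
}

object Handshake {
  def main(args: Array[String]): Unit = {
    val id = new RegClient(new ToyMaster).register("demo")
    println(id) // app-1-demo
  }
}
```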
The Master then sends the worker a request to create an executorRunner:
```scala
private def launchExecutor(...) {
  worker.endpoint.send(LaunchExecutor(...))
}
```
- After receiving the message, the Worker creates the executorRunner:
```scala
case LaunchExecutor(...) => {
  val manager = new ExecutorRunner(...)
}
```
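This Master-to-Worker step can be sketched as follows (hypothetical simplified types; the real LaunchExecutor message and ExecutorRunner carry many more fields):

```scala
// Toy sketch: on LaunchExecutor, the worker creates a runner object that
// will later fork the executor process, and tracks it by "appId/execId".
case class LaunchExecutor(appId: String, execId: Int, cores: Int, memoryMB: Int)

class ToyExecutorRunner(val appId: String, val execId: Int,
                        val cores: Int, val memoryMB: Int) {
  def fullId: String = s"$appId/$execId"
}

class ToyWorker {
  var runners = Map.empty[String, ToyExecutorRunner]
  def receive(msg: LaunchExecutor): Unit = msg match {
    case LaunchExecutor(appId, execId, cores, mem) =>
      val runner = new ToyExecutorRunner(appId, execId, cores, mem)
      runners += (runner.fullId -> runner) // keep track of running executors
  }
}

object LaunchDemo {
  def main(args: Array[String]): Unit = {
    val w = new ToyWorker
    w.receive(LaunchExecutor("app-1", 0, cores = 2, memoryMB = 1024))
    println(w.runners.keys.mkString(",")) // app-1/0
  }
}
```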
- The executorRunner then runs the executor according to the description in the ApplicationDescription:
```scala
...
val builder = CommandUtils.buildProcessBuilder(appDesc.command, ...)
val command = builder.command()
...
```
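Roughly speaking, buildProcessBuilder turns the command description into a java command line that launches the class named in appDesc.command. A rough sketch (the flags and classpath placeholder below are illustrative, not Spark's exact options):

```scala
// Toy sketch of assembling a launch command line from a command description.
case class SimpleCommand(mainClass: String, arguments: Seq[String],
                         javaOpts: Seq[String])

object BuildCommand {
  def commandLine(cmd: SimpleCommand, memoryMB: Int): Seq[String] =
    Seq("java", s"-Xmx${memoryMB}M") ++ cmd.javaOpts ++
      Seq("-cp", "<classpath>", cmd.mainClass) ++ cmd.arguments

  def main(args: Array[String]): Unit = {
    val cmd = SimpleCommand(
      "org.apache.spark.executor.CoarseGrainedExecutorBackend",
      Seq("--executor-id", "0"),
      Seq("-XX:+UseG1GC"))
    println(commandLine(cmd, 1024).mkString(" "))
  }
}
```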
The launched CoarseGrainedExecutorBackend then registers itself with the driver:

```scala
...
ref.ask[RegisterExecutorResponse](
  RegisterExecutor(executorId, self, hostPort, cores, extractLogUrls))
...
```
On receiving the driver's RegisteredExecutor reply, it finally creates the Executor:

```scala
override def receive: PartialFunction[Any, Unit] = {
  case RegisteredExecutor(hostname) =>
    executor = new Executor(executorId, hostname, env, userClassPath, isLocal = false)
  ...
}
```