The previous article covered how the Master allocates executor resources in Spark. This one walks through how Spark executors are created and started. The overall flow (the original sequence diagram, in words) is: SparkDeploySchedulerBackend.start creates an AppClient; the AppClient registers the application with the Master; the Master allocates resources and sends LaunchExecutor to the chosen workers; each worker starts an ExecutorRunner, which launches a CoarseGrainedExecutorBackend process; finally that backend registers itself with the driver. The entry point is SparkDeploySchedulerBackend.start:
override def start() {
  // Call CoarseGrainedSchedulerBackend.start, which copies the config into its
  // properties member and creates the driver endpoint
  super.start()

  // The endpoint for executors to talk to us
  val driverUrl = rpcEnv.uriOf(SparkEnv.driverActorSystemName,
    RpcAddress(sc.conf.get("spark.driver.host"), sc.conf.get("spark.driver.port").toInt),
    CoarseGrainedSchedulerBackend.ENDPOINT_NAME)
  val args = Seq(
    "--driver-url", driverUrl,
    "--executor-id", "{{EXECUTOR_ID}}",
    "--hostname", "{{HOSTNAME}}",
    "--cores", "{{CORES}}",
    "--app-id", "{{APP_ID}}",
    "--worker-url", "{{WORKER_URL}}")
  val extraJavaOpts = sc.conf.getOption("spark.executor.extraJavaOptions")
    .map(Utils.splitCommandString).getOrElse(Seq.empty)
  val classPathEntries = sc.conf.getOption("spark.executor.extraClassPath")
    .map(_.split(java.io.File.pathSeparator).toSeq).getOrElse(Nil)
  val libraryPathEntries = sc.conf.getOption("spark.executor.extraLibraryPath")
    .map(_.split(java.io.File.pathSeparator).toSeq).getOrElse(Nil)

  // When testing, expose the parent class path to the child. This is processed by
  // compute-classpath.{cmd,sh} and makes all needed jars available to child processes
  // when the assembly is built with the "*-provided" profiles enabled.
  val testingClassPath =
    if (sys.props.contains("spark.testing")) {
      sys.props("java.class.path").split(java.io.File.pathSeparator).toSeq
    } else {
      Nil
    }

  // Start executors with a few necessary configs for registering with the scheduler
  val sparkJavaOpts = Utils.sparkJavaOpts(conf, SparkConf.isExecutorStartupConf)
  val javaOpts = sparkJavaOpts ++ extraJavaOpts
  val command = Command("org.apache.spark.executor.CoarseGrainedExecutorBackend",
    args, sc.executorEnvs, classPathEntries ++ testingClassPath, libraryPathEntries, javaOpts)
  val appUIAddress = sc.ui.map(_.appUIAddress).getOrElse("")
  val coresPerExecutor = conf.getOption("spark.executor.cores").map(_.toInt)
  // Create the ApplicationDescription, which describes the application: executor
  // memory, cores per executor, and the app's maximum total core count
  val appDesc = new ApplicationDescription(sc.appName, maxCores, sc.executorMemory,
    command, appUIAddress, sc.eventLogDir, sc.eventLogCodec, coresPerExecutor)
  // Create and start the AppClient
  client = new AppClient(sc.env.rpcEnv, masters, appDesc, this, conf)
  client.start()
  waitForRegistration()
}
client.start() in turn calls AppClient.start, which just registers a client endpoint with the RpcEnv:

def start() {
  // Just launch an actor; it will call back into the listener.
  endpoint = rpcEnv.setupEndpoint("AppClient", new ClientEndpoint(rpcEnv))
}
The actual type of AppClient's endpoint is ClientEndpoint. Its onStart method runs after the ClientEndpoint is created, before it receives or sends any messages. The relevant code:
override def onStart(): Unit = {
  try {
    registerWithMaster(1) // send the application registration message to the Master
  } catch {
    case e: Exception =>
      logWarning("Failed to connect to master", e)
      markDisconnected()
      stop()
  }
}
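registerWithMaster itself is not shown in the excerpt. Roughly, it asks every configured Master to register the application and retries on a timer, giving up after a bounded number of attempts. A simplified paraphrase follows (not the verbatim source; the exact shape differs across Spark versions):

private def registerWithMaster(nthRetry: Int) {
  // Send RegisterApplication to every configured master address (asynchronously).
  registerMasterFutures = tryRegisterAllMasters()
  // After a registration timeout, either clean up (already registered),
  // give up (too many retries), or cancel the outstanding attempts and retry.
  registrationRetryTimer = registrationRetryThread.schedule(new Runnable {
    override def run(): Unit = Utils.tryOrExit {
      if (registered) {
        registerMasterFutures.foreach(_.cancel(true))
      } else if (nthRetry >= REGISTRATION_RETRIES) {
        markDead("All masters are unresponsive! Giving up.")
      } else {
        registerMasterFutures.foreach(_.cancel(true))
        registerWithMaster(nthRetry + 1)
      }
    }
  }, REGISTRATION_TIMEOUT_SECONDS, TimeUnit.SECONDS)
}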
The Spark Master and Workers were already started when sbin/start-all.sh was run; one of the Master's jobs is to serve application registration requests.
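On the Master side, the RegisterApplication message is handled in Master.receive. A condensed sketch, based on the Spark 1.4.x source (recovery and logging details trimmed):

case RegisterApplication(description) => {
  if (state == RecoveryState.STANDBY) {
    // A standby master ignores registrations; the client will retry another master.
  } else {
    logInfo("Registering app " + description.name)
    val app = createApplication(description, sender)
    registerApplication(app)                       // track the app in waitingApps
    persistenceEngine.addApplication(app)          // persist for master recovery
    sender ! RegisteredApplication(app.id, masterUrl)
    schedule()                                     // eventually calls startExecutorsOnWorkers
  }
}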
After receiving RegisterApplication, the Master allocates resources to the individual executors. For how that allocation works, see the previous article in this series on executor resource allocation on workers in Spark standalone mode. The Master then launches the executors. The relevant code:
The startExecutorsOnWorkers method of the Master class:
private def startExecutorsOnWorkers(): Unit = {
  // Right now this is a very simple FIFO scheduler. We keep trying to fit in the first app
  // in the queue, then the second app, etc.
  if (spreadOutApps) {
    // Try to spread out each app among all the workers, until it has all its cores
    for (app <- waitingApps if app.coresLeft > 0) {
      val usableWorkers = workers.toArray
        .filter(_.state == WorkerState.ALIVE) // workers that are alive
        // whose free memory can hold one executor
        .filter(worker => worker.memoryFree >= app.desc.memoryPerExecutorMB &&
          // and whose free cores cover one executor's core requirement
          worker.coresFree >= app.desc.coresPerExecutor.getOrElse(1))
        .sortBy(_.coresFree).reverse
      val numUsable = usableWorkers.length
      val assigned = new Array[Int](numUsable) // Number of cores to give on each node
      var toAssign = math.min(app.coresLeft, usableWorkers.map(_.coresFree).sum)
      var pos = 0
      while (toAssign > 0) {
        // If this worker still has a free core, give it one of the app's cores,
        // round-robining over as many workers as possible
        if (usableWorkers(pos).coresFree - assigned(pos) > 0) {
          toAssign -= 1
          assigned(pos) += 1
        }
        pos = (pos + 1) % numUsable // move on to the next worker
      }
      // Now that we've decided how many cores to give on each node, let's actually give them
      for (pos <- 0 until numUsable if assigned(pos) > 0) {
        // Allocate executors on the chosen worker; one worker may host several executors
        allocateWorkerResourceToExecutors(app, assigned(pos), usableWorkers(pos))
      }
    }
  } else {
    // Pack each app into as few workers as possible until we've assigned all its cores
    for (worker <- workers if worker.coresFree > 0 && worker.state == WorkerState.ALIVE) {
      for (app <- waitingApps if app.coresLeft > 0) {
        allocateWorkerResourceToExecutors(app, app.coresLeft, worker)
      }
    }
  }
}
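A small worked example of the spread-out branch: with three usable workers offering 8, 6 and 4 free cores and an app that still needs 15 cores, the round-robin loop lands at 6/5/4 cores respectively. A minimal self-contained sketch of just that loop, with hypothetical numbers (not Spark code):

object SpreadOutExample extends App {
  val coresFree = Array(8, 6, 4)             // free cores per usable worker
  val assigned = new Array[Int](coresFree.length)
  var toAssign = math.min(15, coresFree.sum) // the app still needs 15 cores
  var pos = 0
  while (toAssign > 0) {
    if (coresFree(pos) - assigned(pos) > 0) { // this worker still has a spare core
      toAssign -= 1
      assigned(pos) += 1
    }
    pos = (pos + 1) % coresFree.length        // next worker
  }
  println(assigned.mkString(", "))            // prints: 6, 5, 4
}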
The allocateWorkerResourceToExecutors method of the Master class:
private def allocateWorkerResourceToExecutors(
    app: ApplicationInfo,
    coresToAllocate: Int,
    worker: WorkerInfo): Unit = {
  val memoryPerExecutor = app.desc.memoryPerExecutorMB
  // If the cores per executor were not explicitly set, give all the cores
  // assigned on this worker to a single executor
  val coresPerExecutor = app.desc.coresPerExecutor.getOrElse(coresToAllocate)
  var coresLeft = coresToAllocate
  // Each executor gets exactly the explicitly requested number of cores
  while (coresLeft >= coresPerExecutor && worker.memoryFree >= memoryPerExecutor) {
    val exec = app.addExecutor(worker, coresPerExecutor)
    coresLeft -= coresPerExecutor
    launchExecutor(worker, exec) // send the LaunchExecutor message to the worker
    app.state = ApplicationState.RUNNING
  }
}
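Continuing the example: with spark.executor.cores=5 and 6 cores assigned on a worker, the loop launches one 5-core executor and strands 1 core; with spark.executor.cores unset, all 6 cores go to a single executor. A tiny sketch of that chunking rule, with hypothetical values and the memory check omitted:

object ExecutorChunkingExample extends App {
  def executorsLaunched(coresToAllocate: Int, coresPerExecutor: Option[Int]): Int = {
    val perExec = coresPerExecutor.getOrElse(coresToAllocate)
    coresToAllocate / perExec // how many executors fit, memory permitting
  }
  println(executorsLaunched(6, Some(5))) // 1 executor, 1 core left unused
  println(executorsLaunched(6, None))    // 1 executor with all 6 cores
  println(executorsLaunched(6, Some(2))) // 3 executors of 2 cores each
}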
In this method the Master sends the LaunchExecutor message to the worker. When the worker receives LaunchExecutor, it creates an ExecutorRunner object and then calls ExecutorRunner.start. The code:
case LaunchExecutor(masterUrl, appId, execId, appDesc, cores_, memory_) =>
  if (masterUrl != activeMasterUrl) {
    logWarning("Invalid Master (" + masterUrl + ") attempted to launch executor.")
  } else {
    try {
      logInfo("Asked to launch executor %s/%d for %s".format(appId, execId, appDesc.name))

      // Create the executor's working directory (where its logs will go)
      val executorDir = new File(workDir, appId + "/" + execId)
      if (!executorDir.mkdirs()) {
        throw new IOException("Failed to create directory " + executorDir)
      }

      // Create local dirs for the executor. These are passed to the executor via the
      // SPARK_EXECUTOR_DIRS environment variable, and deleted by the Worker when the
      // application finishes.
      val appLocalDirs = appDirectories.get(appId).getOrElse {
        Utils.getOrCreateLocalRootDirs(conf).map { dir =>
          Utils.createDirectory(dir, namePrefix = "executor").getAbsolutePath()
        }.toSeq
      }
      appDirectories(appId) = appLocalDirs
      val manager = new ExecutorRunner(
        appId,
        execId,
        appDesc.copy(command = Worker.maybeUpdateSSLSettings(appDesc.command, conf)),
        cores_,
        memory_,
        self,
        workerId,
        host,
        webUi.boundPort,
        publicAddress,
        sparkHome,
        executorDir,
        workerUri,
        conf,
        appLocalDirs, ExecutorState.LOADING)
      executors(appId + "/" + execId) = manager
      manager.start() // start the ExecutorRunner
      coresUsed += cores_
      memoryUsed += memory_
      sendToMaster(ExecutorStateChanged(appId, execId, manager.state, None, None))
    } catch {
      case e: Exception => {
        logError(s"Failed to launch executor $appId/$execId for ${appDesc.name}.", e)
        if (executors.contains(appId + "/" + execId)) {
          executors(appId + "/" + execId).kill()
          executors -= appId + "/" + execId
        }
        sendToMaster(ExecutorStateChanged(appId, execId, ExecutorState.FAILED,
          Some(e.toString), None))
      }
    }
  }
ExecutorRunner.start eventually reaches ExecutorRunner.fetchAndRunExecutor, which is where the Executor process is actually launched. The code (excerpted):
private def fetchAndRunExecutor() {
  try {
    // Launch the process
    val builder = CommandUtils.buildProcessBuilder(appDesc.command, new SecurityManager(conf),
      memory, sparkHome.getAbsolutePath, substituteVariables)
    val command = builder.command()
    logInfo("Launch command: " + command.mkString("\"", "\" \"", "\""))

    builder.directory(executorDir)
    builder.environment.put("SPARK_EXECUTOR_DIRS", appLocalDirs.mkString(File.pathSeparator))
    // In case we are running this from within the Spark Shell, avoid creating a "scala"
    // parent process for the executor command
    builder.environment.put("SPARK_LAUNCH_WITH_SCALA", "0")

    // Add webUI log urls
    val baseUrl =
      s"http://$publicAddress:$webUiPort/logPage/?appId=$appId&executorId=$execId&logType="
    builder.environment.put("SPARK_LOG_URL_STDERR", s"${baseUrl}stderr")
    builder.environment.put("SPARK_LOG_URL_STDOUT", s"${baseUrl}stdout")

    process = builder.start() // start the Executor process
    val header = "Spark Executor Command: %s\n%s\n\n".format(
      command.mkString("\"", "\" \"", "\""), "=" * 40)

    // Redirect its stdout and stderr to files
    val stdout = new File(executorDir, "stdout")
    stdoutAppender = FileAppender(process.getInputStream, stdout, conf)
    // ... (the rest of the method redirects stderr the same way, waits for the
    // process to exit, and reports ExecutorStateChanged back to the worker)
  } catch {
    // ... (kills the process and reports a FAILED/KILLED state)
  }
}
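The substituteVariables function passed to buildProcessBuilder is what fills in the {{EXECUTOR_ID}}-style placeholders that SparkDeploySchedulerBackend put into the command template at the start of this article. In the Spark 1.4.x source it is essentially the following (quoted from memory, so treat it as a sketch):

// Replaces variables in a command argument passed to us by the master
private[worker] def substituteVariables(argument: String): String = argument match {
  case "{{WORKER_URL}}"  => workerUrl
  case "{{EXECUTOR_ID}}" => execId.toString
  case "{{HOSTNAME}}"    => host
  case "{{CORES}}"       => cores.toString
  case "{{APP_ID}}"      => appId
  case other             => other // pass everything else through unchanged
}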
My spark application submit command was:

./spark-submit --class spark_security.login_users.Sockpuppet --driver-memory 3g --executor-memory 3g --executor-cores 5 --total-executor-cores 15 --name Logintest --master spark://ddos12:7077 --driver-java-options "-Xdebug -Xrunjdwp:transport=dt_socket,server=y,suspend=y,address=8888" --conf "spark.executor.extraJavaOptions=-Xdebug -Xrunjdwp:transport=dt_socket,address=9999,server=y,suspend=n" --conf spark.ui.port=4048 /home/wangbaogang/nocache_onewin.jar hdfs://ddos12:9000/prop/logindealer.properties

The Executor launch command is recorded in the worker's log. For the submit command above it was (excerpted; the leading classpath entries and the tail are truncated):
…/spark-1.4.1-bin-hadoop2.6/lib/spark-assembly-1.4.1-hadoop2.6.0.jar:/export/servers/spark-1.4.1-bin-hadoop2.6/lib/datanucleus-core-3.2.10.jar:/export/servers/spark-1.4.1-bin-hadoop2.6/lib/datanucleus-rdbms-3.2.9.jar:/export/servers/spark-1.4.1-bin-hadoop2.6/lib/datanucleus-api-jdo-3.2.6.jar:/export/servers/hadoop2.6.0/etc/hadoop/" "-Xms3072M" "-Xmx3072M" "-Dspark.ui.port=4048" "-Dspark.driver.port=55048" "-Xdebug" "-Xrunjdwp:transport=dt_socket,address=9999,server=y,suspend=n" "-XX:MaxPermSize=256m" "org.apache.spark.executor.CoarseGrainedExecutorBackend" "--driver-url" "akka.tcp://sparkDriver@192.168.185.12:55048/user/CoarseGrainedScheduler" "--executor-id" "1" "--hostname" "192…

Note that the -Xms/-Xmx values and the spark.executor.extraJavaOptions debug flags from the submit command show up directly in the executor JVM arguments. The main class is org.apache.spark.executor.CoarseGrainedExecutorBackend; its run method does the executor-side setup:
private def run(
    driverUrl: String,
    executorId: String,
    hostname: String,
    cores: Int,
    appId: String,
    workerUrl: Option[String],
    userClassPath: Seq[URL]) {

  SignalLogger.register(log)

  SparkHadoopUtil.get.runAsSparkUser { () =>
    // Debug code
    Utils.checkHost(hostname)

    // Bootstrap to fetch the driver's Spark properties.
    val executorConf = new SparkConf
    val port = executorConf.getInt("spark.executor.port", 0)
    val fetcher = RpcEnv.create(
      "driverPropsFetcher",
      hostname,
      port,
      executorConf,
      new SecurityManager(executorConf))
    // Look up the driver endpoint and request its properties
    val driver = fetcher.setupEndpointRefByURI(driverUrl)
    val props = driver.askWithRetry[Seq[(String, String)]](RetrieveSparkProps) ++
      Seq[(String, String)](("spark.app.id", appId))
    fetcher.shutdown()

    // Create SparkEnv using properties we fetched from the driver.
    val driverConf = new SparkConf()
    for ((key, value) <- props) {
      // this is required for SSL in standalone mode
      if (SparkConf.isExecutorStartupConf(key)) {
        driverConf.setIfMissing(key, value)
      } else {
        driverConf.set(key, value)
      }
    }
    if (driverConf.contains("spark.yarn.credentials.file")) {
      logInfo("Will periodically update credentials from: " +
        driverConf.get("spark.yarn.credentials.file"))
      SparkHadoopUtil.get.startExecutorDelegationTokenRenewer(driverConf)
    }

    // Create the executor-side SparkEnv
    val env = SparkEnv.createExecutorEnv(
      driverConf, executorId, hostname, port, cores, isLocal = false)

    // SparkEnv sets spark.driver.port so it shouldn't be 0 anymore.
    val boundPort = env.conf.getInt("spark.executor.port", 0)
    assert(boundPort != 0)

    // Start the CoarseGrainedExecutorBackend endpoint, which talks to the
    // driver's endpoint
    val sparkHostPort = hostname + ":" + boundPort
    env.rpcEnv.setupEndpoint("Executor", new CoarseGrainedExecutorBackend(
      env.rpcEnv, driverUrl, executorId, sparkHostPort, cores, userClassPath, env))
    workerUrl.foreach { url =>
      env.rpcEnv.setupEndpoint("WorkerWatcher", new WorkerWatcher(env.rpcEnv, url))
    }
    env.rpcEnv.awaitTermination()
    SparkHadoopUtil.get.stopExecutorDelegationTokenRenewer()
  }
}
This method sets up the executor's initial state: it fetches the driver's properties, creates the executor-side SparkEnv, and registers the executor endpoint used to communicate with the driver. On the driver side, CoarseGrainedSchedulerBackend's endpoint handles the registration in receiveAndReply:
override def receiveAndReply(context: RpcCallContext): PartialFunction[Any, Unit] = {
  // Handle the RegisterExecutor message and reply that registration is complete
  case RegisterExecutor(executorId, executorRef, hostPort, cores, logUrls) =>
    Utils.checkHostPort(hostPort, "Host port expected " + hostPort)
    if (executorDataMap.contains(executorId)) {
      context.reply(RegisterExecutorFailed("Duplicate executor ID: " + executorId))
    } else {
      logInfo("Registered executor: " + executorRef + " with ID " + executorId)
      context.reply(RegisteredExecutor) // reply with the RegisteredExecutor message
      addressToExecutorId(executorRef.address) = executorId
      totalCoreCount.addAndGet(cores)
      totalRegisteredExecutors.addAndGet(1)
      val (host, _) = Utils.parseHostPort(hostPort)
      val data = new ExecutorData(executorRef, executorRef.address, host, cores, cores, logUrls)
      // This must be synchronized because variables mutated
      // in this block are read when requesting executors
      CoarseGrainedSchedulerBackend.this.synchronized {
        executorDataMap.put(executorId, data)
        if (numPendingExecutors > 0) {
          numPendingExecutors -= 1
          logDebug(s"Decremented number of pending executors ($numPendingExecutors left)")
        }
      }
      listenerBus.post(
        SparkListenerExecutorAdded(System.currentTimeMillis(), executorId, data))
      makeOffers() // offer this application's resources so tasks can be scheduled
    }
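makeOffers hands the registered executors' free cores to the TaskScheduler as WorkerOffers; in the Spark 1.4.x source it is roughly the following (quoted from memory):

// Make fake resource offers on all executors
private def makeOffers() {
  launchTasks(scheduler.resourceOffers(executorDataMap.map { case (id, executorData) =>
    new WorkerOffer(id, executorData.executorHost, executorData.freeCores)
  }.toSeq))
}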
On the executor side, the registration message is sent from CoarseGrainedExecutorBackend.onStart:

// onStart runs before CoarseGrainedExecutorBackend processes any message,
// right after the endpoint object is created and registered
override def onStart() {
  logInfo("Connecting to driver: " + driverUrl)
  rpcEnv.asyncSetupEndpointRefByURI(driverUrl).flatMap { ref =>
    // This is a very fast action so we can use "ThreadUtils.sameThread"
    driver = Some(ref)
    // Send the RegisterExecutor message and wait for the reply
    ref.ask[RegisteredExecutor.type](
      RegisterExecutor(executorId, self, hostPort, cores, extractLogUrls))
  }(ThreadUtils.sameThread).onComplete {
    // This is a very fast action so we can use "ThreadUtils.sameThread"
    case Success(msg) => Utils.tryLogNonFatalError {
      // msg must be RegisteredExecutor; forward it to ourselves so that it is
      // handled in this class's receive method
      Option(self).foreach(_.send(msg))
    }
    case Failure(e) => {
      logError(s"Cannot register with driver: $driverUrl", e)
      System.exit(1)
    }
  }(ThreadUtils.sameThread)
}
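Once receive gets the RegisteredExecutor message, the backend finally instantiates the Executor object that will actually run tasks. In the Spark 1.4.x source this case looks roughly like:

override def receive: PartialFunction[Any, Unit] = {
  case RegisteredExecutor =>
    logInfo("Successfully registered with driver")
    val (hostname, _) = Utils.parseHostPort(hostPort)
    // Create the Executor, which owns the thread pool that runs tasks
    executor = new Executor(executorId, hostname, env, userClassPath, isLocal = false)
  // ... other cases (RegisterExecutorFailed, LaunchTask, KillTask, ...)
}

At this point the executor is fully started and ready to receive LaunchTask messages from the driver.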