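Preparation before the Executor runs a Task:
- Before looking at how the `Executor` runs a `Task`, we first need to look at one important class: `CoarseGrainedExecutorBackend`.
- Its `onStart` method is called when this process is created.
- It is the coarse-grained `ExecutorBackend` process.
- It sends the `Executor` registration request to the `Driver`.
- It is a communication process: it exchanges messages with the `Driver` in both directions.
- It is the name of the process that hosts the `Executor`. The `Executor` is the object that actually handles each `Task`, and it does so through a thread pool.
- It receives the `Executor` registration response returned by the `Driver`, and then creates the `Executor`.
- It receives the `LaunchTask` message sent from the `Driver` side (where the `TaskScheduler` assigns tasks), which starts the launch and computation of the `Task`.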
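How the Executor runs a Task, in principle:
- When `CoarseGrainedExecutorBackend` receives the `RegisteredExecutor` message from the `Driver`, it creates the `Executor`.
- When it later receives a `LaunchTask` message from the `Driver`, it starts running the `Task`: it first deserializes the `TaskDescription` carried in the message, then calls `launchTask` to hand the `Task` over to the `Executor`.
- `launchTask` creates a `TaskRunner`, which implements the `Runnable` interface. The `TaskRunner` is recorded in an in-memory cache of running tasks and submitted to the thread pool through the pool's `execute` method, which starts the `Task`.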
Source code analysis of how the Executor runs a Task:

The `onStart` method of `CoarseGrainedExecutorBackend`: this method runs when the `CoarseGrainedExecutorBackend` is created and started, and it registers the `Executor` with the `Driver`.

    override def onStart() {
      logInfo("Connecting to driver: " + driverUrl)
      rpcEnv.asyncSetupEndpointRefByURI(driverUrl).flatMap { ref =>
        // This is a very fast action so we can use "ThreadUtils.sameThread"
        driver = Some(ref)
        // Send the Executor registration request to the Driver
        ref.ask[RegisterExecutorResponse](
          RegisterExecutor(executorId, self, hostPort, cores, extractLogUrls))
      }(ThreadUtils.sameThread).onComplete {
        // This is a very fast action so we can use "ThreadUtils.sameThread"
        case Success(msg) => Utils.tryLogNonFatalError {
          Option(self).foreach(_.send(msg)) // msg must be RegisterExecutorResponse
        }
        case Failure(e) => {
          logError(s"Cannot register with driver: $driverUrl", e)
          System.exit(1)
        }
      }(ThreadUtils.sameThread)
    }
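The asynchronous shape of `onStart` (resolve the driver, ask it to register, then forward the reply to yourself) can be sketched with plain `scala.concurrent` futures. This is only an illustration under stated assumptions: `RegistrationSketch`, `DriverRef`, `lookupDriver`, `Registered`, and `RegisterFailed` are hypothetical stand-ins and are not Spark's RPC API.

```scala
import scala.concurrent.{ExecutionContext, Future}
import scala.util.{Failure, Success}

object RegistrationSketch {
  // Hypothetical reply messages (stand-ins for RegisterExecutorResponse).
  sealed trait RegisterResponse
  final case class Registered(hostname: String) extends RegisterResponse
  final case class RegisterFailed(reason: String) extends RegisterResponse

  // Hypothetical driver reference: `ask` returns the reply asynchronously.
  trait DriverRef {
    def ask(executorId: String, cores: Int): Future[RegisterResponse]
  }

  // Hypothetical lookup that plays the role of resolving the driver endpoint by URL.
  def lookupDriver(url: String)(implicit ec: ExecutionContext): Future[DriverRef] =
    Future {
      new DriverRef {
        def ask(executorId: String, cores: Int): Future[RegisterResponse] =
          Future.successful(
            if (cores > 0) Registered("host-" + executorId)
            else RegisterFailed("no cores offered"))
      }
    }

  // Resolve the driver, ask it to register, then hand the reply to `deliver`,
  // which plays the role of the backend sending the message to itself.
  def register(url: String, executorId: String, cores: Int)(deliver: RegisterResponse => Unit)(
      implicit ec: ExecutionContext): Future[RegisterResponse] =
    lookupDriver(url)
      .flatMap(ref => ref.ask(executorId, cores))
      .andThen {
        case Success(msg) => deliver(msg)
        case Failure(e) => Console.err.println(s"Cannot register with driver: $url ($e)")
      }
}
```

Unlike the real method, the sketch only logs on failure instead of calling `System.exit(1)`, so that it can be run safely in a test or a REPL.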
The `receive` method of `CoarseGrainedExecutorBackend`: this method handles every kind of message the backend can receive.

    override def receive: PartialFunction[Any, Unit] = {
      // The Driver reports that registration succeeded, so the Executor object is created.
      case RegisteredExecutor(hostname) =>
        logInfo("Successfully registered with driver")
        executor = new Executor(executorId, hostname, env, userClassPath, isLocal = false)

      // The Driver reports that registration failed, so the process exits.
      case RegisterExecutorFailed(message) =>
        logError("Slave registration failed: " + message)
        System.exit(1)

      // The Driver sends LaunchTask, which asks the Executor to start running a Task.
      case LaunchTask(data) =>
        if (executor == null) {
          logError("Received LaunchTask command but executor was null")
          System.exit(1)
        } else {
          // First deserialize the TaskDescription that was sent over
          val taskDesc = ser.deserialize[TaskDescription](data.value)
          logInfo("Got assigned task " + taskDesc.taskId)
          // Call the executor's launchTask method to start running the Task.
          // this: the ExecutorBackend; taskId: the ID of the task;
          // attemptNumber: how many times this task has been attempted;
          // taskDesc.name: the name of the task;
          // taskDesc.serializedTask: the serialized bytes of the task
          executor.launchTask(this, taskId = taskDesc.taskId, attemptNumber = taskDesc.attemptNumber,
            taskDesc.name, taskDesc.serializedTask)
        }

      case KillTask(taskId, _, interruptThread) =>
        if (executor == null) {
          logError("Received KillTask command but executor was null")
          System.exit(1)
        } else {
          executor.killTask(taskId, interruptThread)
        }

      case StopExecutor =>
        logInfo("Driver commanded a shutdown")
        // Cannot shutdown here because an ack may need to be sent back to the caller. So send
        // a message to self to actually do the shutdown.
        self.send(Shutdown)

      case Shutdown =>
        executor.stop()
        stop()
        rpcEnv.shutdown()
    }
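The message dispatch in `receive` is an ordinary Scala `PartialFunction` over case classes. A minimal sketch, assuming hypothetical message types and a `Backend` class that writes to a log instead of exiting the process (none of these names are Spark's own classes):

```scala
import scala.collection.mutable.ArrayBuffer

object ReceiveSketch {
  // Hypothetical messages that mirror the shape of the ones handled above.
  sealed trait Message
  final case class RegisteredExecutor(hostname: String) extends Message
  final case class RegisterExecutorFailed(reason: String) extends Message
  final case class LaunchTask(taskId: Long) extends Message

  // Hypothetical backend: it records what it would do rather than doing it.
  final class Backend {
    private var executorHost: Option[String] = None
    val log: ArrayBuffer[String] = ArrayBuffer.empty[String]

    val receive: PartialFunction[Any, Unit] = {
      case RegisteredExecutor(host) =>
        // Registration succeeded: "create" the executor.
        executorHost = Some(host)
        log += s"executor created on $host"
      case RegisterExecutorFailed(reason) =>
        log += s"registration failed: $reason"
      case LaunchTask(id) =>
        // A task can only be launched once the executor exists.
        executorHost match {
          case Some(host) => log += s"task $id launched on $host"
          case None => log += s"task $id rejected: executor was null"
        }
    }
  }
}
```

Because `receive` is partial, a message it has no case for is simply not defined for it, which is what lets the caller decide how to treat unknown messages.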
The `launchTask` method of `Executor`: this method creates one `TaskRunner` per `Task`, puts the `TaskRunner` into an in-memory cache, and then submits it to the thread pool to be run by a worker thread.

    def launchTask(
        context: ExecutorBackend,
        taskId: Long,
        attemptNumber: Int,
        taskName: String,
        serializedTask: ByteBuffer): Unit = {
      // Create one TaskRunner per Task; TaskRunner implements Java's Runnable interface
      val tr = new TaskRunner(context, taskId = taskId, attemptNumber = attemptNumber, taskName,
        serializedTask)
      // Record the TaskRunner in the in-memory cache of running tasks
      runningTasks.put(taskId, tr)
      // The Executor holds a Java thread pool. The Task, wrapped in its TaskRunner,
      // is submitted to the pool and runs as soon as a worker thread is available
      threadPool.execute(tr)
    }
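The same "register the runner, then submit it to a pool" pattern can be reproduced with `java.util.concurrent` alone. In this sketch `MiniExecutor`, `finished`, and `awaitAll` are hypothetical names, and the fixed-size pool is an assumption chosen for the example, not a statement about which pool Spark configures.

```scala
import java.util.concurrent.{ConcurrentHashMap, Executors, TimeUnit}

object LaunchSketch {
  // Hypothetical miniature executor: one Runnable per task, tracked while it runs.
  final class MiniExecutor(threads: Int) {
    private val threadPool = Executors.newFixedThreadPool(threads)
    val runningTasks = new ConcurrentHashMap[Long, Runnable]()
    val finished = new ConcurrentHashMap[Long, String]()

    def launchTask(taskId: Long, body: () => String): Unit = {
      val runner = new Runnable {
        override def run(): Unit = {
          try {
            finished.put(taskId, body())
          } finally {
            // Always untrack the task, whether the body succeeded or threw.
            runningTasks.remove(taskId)
          }
        }
      }
      // Track first, then submit, so the runner can never finish before it is tracked.
      runningTasks.put(taskId, runner)
      threadPool.execute(runner)
    }

    // Stop accepting work and wait for submitted tasks to finish.
    def awaitAll(): Boolean = {
      threadPool.shutdown()
      threadPool.awaitTermination(10, TimeUnit.SECONDS)
    }
  }
}
```

Putting the runner into the map before calling `execute` matters: the reverse order would allow a fast task to remove itself before it was ever inserted, leaving a stale entry behind.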
`TaskRunner` implements the `Runnable` interface, and all the code that runs a `Task` lives in its `run` method. Every incoming `Task` gets its own `TaskRunner` object, and each `TaskRunner` is submitted to the thread pool, where a worker thread runs it. Below is the source of the `run` method with comments.

    override def run(): Unit = {
      // Allocate a memory manager for this Task
      val taskMemoryManager = new TaskMemoryManager(env.memoryManager, taskId)
      // Record when deserialization starts
      val deserializeStartTime = System.currentTimeMillis()
      Thread.currentThread.setContextClassLoader(replClassLoader)
      // Create a serializer instance, used to deserialize the Task data
      val ser = env.closureSerializer.newInstance()
      logInfo(s"Running $taskName (TID $taskId)")
      // Report the Task's current state to the Driver
      execBackend.statusUpdate(taskId, TaskState.RUNNING, EMPTY_BYTE_BUFFER)
      var taskStart: Long = 0
      startGCTime = computeTotalGcTime()

      try {
        // Split the serialized data into dependency files, dependency jars, and the task bytes
        val (taskFiles, taskJars, taskBytes) = Task.deserializeWithDependencies(serializedTask)
        // Fetch over the network the files, resources, and jars the Task depends on,
        // for example Hadoop configuration files
        updateDependencies(taskFiles, taskJars)
        // Deserialize the Task itself. The class loader is passed in so that the
        // Task's classes can be loaded dynamically and instantiated via reflection
        task = ser.deserialize[Task[Any]](taskBytes, Thread.currentThread.getContextClassLoader)
        task.setTaskMemoryManager(taskMemoryManager)

        // If the task was killed before deserialization finished, quit right away;
        // otherwise continue running the Task
        if (killed) {
          throw new TaskKilledException
        }

        logDebug("Task " + taskId + "'s epoch is " + task.epoch)
        env.mapOutputTracker.updateEpoch(task.epoch)

        // Record the Task start time
        taskStart = System.currentTimeMillis()
        var threwException = true
        // value: for a ShuffleMapTask this is a MapStatus. The shuffle output is written
        // to the corresponding shuffle files, and the MapStatus wraps the location of
        // those files and the size of the output. It is serialized further down and
        // handed to the Executor's CoarseGrainedExecutorBackend
        val (value, accumUpdates) = try {
          // The core call that actually runs the Task; it is covered in the source further below
          val res = task.run(
            taskAttemptId = taskId,
            attemptNumber = attemptNumber,
            metricsSystem = env.metricsSystem)
          threwException = false
          res
        } finally {
          // Release the Task's memory whether it succeeded or failed
          val freedMemory = taskMemoryManager.cleanUpAllAllocatedMemory()
          // Check for a memory leak; if one is found, throw an exception or log an error
          if (freedMemory > 0) {
            val errMsg = s"Managed memory leak detected; size = $freedMemory bytes, TID = $taskId"
            if (conf.getBoolean("spark.unsafe.exceptionOnMemoryLeak", false) && !threwException) {
              throw new SparkException(errMsg)
            } else {
              logError(errMsg)
            }
          }
        }
        // Record the Task finish time
        val taskFinish = System.currentTimeMillis()

        // If the task has been killed, let's fail it.
        if (task.killed) {
          throw new TaskKilledException
        }

        // Create a serializer instance for the Task result
        val resultSer = env.serializer.newInstance()
        // Record when result serialization starts
        val beforeSerialization = System.currentTimeMillis()
        // Serialize the Task result, because it will be returned to the Driver
        val valueBytes = resultSer.serialize(value)
        // Record when result serialization ends
        val afterSerialization = System.currentTimeMillis()

        // Set the Task's runtime metrics; these are shown in the Spark UI
        for (m <- task.metrics) {
          m.setExecutorDeserializeTime(
            (taskStart - deserializeStartTime) + task.executorDeserializeTime)
          m.setExecutorRunTime((taskFinish - taskStart) - task.executorDeserializeTime)
          m.setJvmGCTime(computeTotalGcTime() - startGCTime)
          m.setResultSerializationTime(afterSerialization - beforeSerialization)
          m.updateAccumulators()
        }

        // A TaskResult that holds the Task result together with the accumulator updates
        val directResult = new DirectTaskResult(valueBytes, accumUpdates, task.metrics.orNull)
        // Serialize the TaskResult
        val serializedDirectResult = ser.serialize(directResult)
        // Size of the serialized TaskResult
        val resultSize = serializedDirectResult.limit

        // directSend = sending directly back to the driver
        val serializedResult: ByteBuffer = {
          // If the serialized result is larger than the maximum result size
          // (configurable, 1 GB by default), drop it and send back only its size
          if (maxResultSize > 0 && resultSize > maxResultSize) {
            logWarning(s"Finished $taskName (TID $taskId). Result is larger than maxResultSize " +
              s"(${Utils.bytesToString(resultSize)} > ${Utils.bytesToString(maxResultSize)}), " +
              s"dropping it.")
            ser.serialize(new IndirectTaskResult[Any](TaskResultBlockId(taskId), resultSize))
          // If the serialized result exceeds the frame-size threshold but not the maximum
          // result size, it is not sent to the Driver directly. It is stored in the
          // BlockManager and the Driver fetches it from there
          } else if (resultSize >= akkaFrameSize - AkkaUtils.reservedSizeBytes) {
            val blockId = TaskResultBlockId(taskId)
            env.blockManager.putBytes(
              blockId, serializedDirectResult, StorageLevel.MEMORY_AND_DISK_SER)
            logInfo(
              s"Finished $taskName (TID $taskId). $resultSize bytes result sent via BlockManager)")
            ser.serialize(new IndirectTaskResult[Any](blockId, resultSize))
          // If it is below the threshold, return it to the Driver directly
          } else {
            logInfo(s"Finished $taskName (TID $taskId). $resultSize bytes result sent to driver")
            serializedDirectResult
          }
        }

        // Report the Task's result and final state. The call goes to the ExecutorBackend
        // that hosts this Executor (the CoarseGrainedExecutorBackend), which passes it
        // on to the Driver
        execBackend.statusUpdate(taskId, TaskState.FINISHED, serializedResult)

      // Exception handling follows: different failures are reported
      // to the Driver in different ways
      } catch {
        case ffe: FetchFailedException =>
          val reason = ffe.toTaskEndReason
          execBackend.statusUpdate(taskId, TaskState.FAILED, ser.serialize(reason))

        case _: TaskKilledException | _: InterruptedException if task.killed =>
          logInfo(s"Executor killed $taskName (TID $taskId)")
          execBackend.statusUpdate(taskId, TaskState.KILLED, ser.serialize(TaskKilled))

        case cDE: CommitDeniedException =>
          val reason = cDE.toTaskEndReason
          execBackend.statusUpdate(taskId, TaskState.FAILED, ser.serialize(reason))

        case t: Throwable =>
          // Attempt to exit cleanly by informing the driver of our failure.
          // If anything goes wrong (or this was a fatal exception), we will delegate to
          // the default uncaught exception handler, which will terminate the Executor.
          logError(s"Exception in $taskName (TID $taskId)", t)

          val metrics: Option[TaskMetrics] = Option(task).flatMap { task =>
            task.metrics.map { m =>
              m.setExecutorRunTime(System.currentTimeMillis() - taskStart)
              m.setJvmGCTime(computeTotalGcTime() - startGCTime)
              m.updateAccumulators()
              m
            }
          }
          val serializedTaskEndReason = {
            try {
              ser.serialize(new ExceptionFailure(t, metrics))
            } catch {
              case _: NotSerializableException =>
                // t is not serializable so just send the stacktrace
                ser.serialize(new ExceptionFailure(t, metrics, false))
            }
          }
          execBackend.statusUpdate(taskId, TaskState.FAILED, serializedTaskEndReason)

          // Don't forcibly exit unless the exception was inherently fatal, to avoid
          // stopping other tasks unnecessarily.
          if (Utils.isFatalError(t)) {
            SparkUncaughtExceptionHandler.uncaughtException(t)
          }

      } finally {
        // Once the Task is done, remove it from the runningTasks map
        runningTasks.remove(taskId)
      }
    }
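The three-way decision on how a result travels back (dropped, via the block manager, or directly) depends only on a few sizes, so it can be isolated as a pure function. A small sketch, where `ResultRouting`, `Route`, and the parameter names are hypothetical and the sizes used in the usage note are example values rather than Spark's configured defaults:

```scala
object ResultRouting {
  sealed trait Route
  case object Dropped extends Route         // too large: only the size is reported
  case object ViaBlockManager extends Route // stored locally; the driver fetches it
  case object Direct extends Route          // small enough to ride in the status update

  // Mirrors the order of the checks in `run`: the hard limit first, then the
  // frame-size threshold, then the direct path. A non-positive maxResultSize
  // disables the hard limit.
  def route(resultSize: Long, maxResultSize: Long, frameSize: Long, reservedBytes: Long): Route =
    if (maxResultSize > 0 && resultSize > maxResultSize) Dropped
    else if (resultSize >= frameSize - reservedBytes) ViaBlockManager
    else Direct
}
```

For instance, with a 1 GiB limit, a 128 MiB frame, and 200 KiB reserved, `ResultRouting.route(1024L, 1L << 30, 128L << 20, 200L * 1024)` yields `Direct`, while a result of exactly 128 MiB is routed through the block manager.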
The `updateDependencies` method of `Executor`: this method fetches over the network the files, resources, and jars that the `Task` depends on, for example Hadoop configuration files.

    private def updateDependencies(newFiles: HashMap[String, Long], newJars: HashMap[String, Long]) {
      // Build the Hadoop configuration (lazily)
      lazy val hadoopConf = SparkHadoopUtil.get.newConfiguration(conf)
      // Synchronized block: the CoarseGrainedExecutorBackend process runs several threads,
      // each running a different Task. Those threads share the same resources, which would
      // be a thread-safety problem, so the updates are guarded by a synchronized block
      synchronized {
        // Iterate over the files to fetch
        for ((name, timestamp) <- newFiles if currentFiles.getOrElse(name, -1L) < timestamp) {
          logInfo("Fetching " + name + " with timestamp " + timestamp)
          // Fetch file with useCache mode, close cache for local mode.
          // Utils.fetchFile pulls the dependency file over the network
          Utils.fetchFile(name, new File(SparkFiles.getRootDirectory()), conf,
            env.securityManager, hadoopConf, timestamp, useCache = !isLocal)
          currentFiles(name) = timestamp
        }
        // Iterate over the jars to fetch
        for ((name, timestamp) <- newJars) {
          val localName = name.split("/").last
          val currentTimeStamp = currentJars.get(name)
            .orElse(currentJars.get(localName))
            .getOrElse(-1L)
          // Fetch only if the local copy is missing or older than the requested timestamp
          if (currentTimeStamp < timestamp) {
            logInfo("Fetching " + name + " with timestamp " + timestamp)
            // Utils.fetchFile pulls the jar over the network
            Utils.fetchFile(name, new File(SparkFiles.getRootDirectory()), conf,
              env.securityManager, hadoopConf, timestamp, useCache = !isLocal)
            currentJars(name) = timestamp
            // Add it to our class loader
            val url = new File(SparkFiles.getRootDirectory(), localName).toURI.toURL
            if (!urlClassLoader.getURLs().contains(url)) {
              logInfo("Adding " + url + " to class loader")
              urlClassLoader.addURL(url)
            }
          }
        }
      }
    }
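The heart of `updateDependencies` is a timestamp comparison under a lock: fetch a dependency only when the requested timestamp is newer than the one already recorded. A minimal sketch of that rule, assuming a hypothetical `Tracker` class whose `fetch` callback stands in for the real download:

```scala
import scala.collection.mutable

object DependencySketch {
  // Hypothetical tracker: `fetch` is whatever actually downloads a dependency.
  final class Tracker(fetch: String => Unit) {
    private val current = mutable.HashMap.empty[String, Long]

    // Fetches every dependency whose wanted timestamp is newer than the recorded
    // one and returns the fetched names in alphabetical order.
    // Synchronized because several task threads may call it at the same time.
    def update(wanted: Map[String, Long]): Seq[String] = synchronized {
      val stale = wanted.toSeq.sortBy(_._1).filter {
        case (name, timestamp) => current.getOrElse(name, -1L) < timestamp
      }
      stale.foreach { case (name, timestamp) =>
        fetch(name)
        current(name) = timestamp
      }
      stale.map(_._1)
    }
  }
}
```

Starting the recorded timestamp at `-1L` for unknown names means a dependency that has never been seen always counts as stale, which is the same trick the original code uses with `getOrElse(name, -1L)`.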
The `run` method of `Task`: this is the last piece of preparation before the `Task` actually runs.

    final def run(
        taskAttemptId: Long,
        attemptNumber: Int,
        metricsSystem: MetricsSystem)
      : (T, AccumulatorUpdates) = {
      // Create the TaskContext, the execution context that wraps the data a Task needs.
      // stageId: the Stage it belongs to; partitionId: the partition it processes;
      // attemptNumber: how many times it has been attempted;
      // taskMemoryManager: the memory manager it uses; metricsSystem: the metrics system;
      // internalAccumulators: the internal accumulators
      context = new TaskContextImpl(
        stageId,
        partitionId,
        taskAttemptId,
        attemptNumber,
        taskMemoryManager,
        metricsSystem,
        internalAccumulators,
        runningLocally = false)
      TaskContext.setTaskContext(context)
      context.taskMetrics.setHostname(Utils.localHostName())
      context.taskMetrics.setAccumulatorsUpdater(context.collectInternalAccumulators)
      taskThread = Thread.currentThread()
      if (_killed) {
        kill(interruptThread = false)
      }
      try {
        // Call runTask. runTask is an abstract method, so its logic is supplied by subclasses.
        // Task has two subclasses, ShuffleMapTask and ResultTask; to see what a Task
        // actually does, read the runTask implementation in those two classes
        (runTask(context), context.collectAccumulators())
      } finally {
        context.markTaskCompleted()
        try {
          Utils.tryLogNonFatalError {
            // Release memory used by this thread for unrolling blocks
            SparkEnv.get.blockManager.memoryStore.releaseUnrollMemoryForThisTask()
            // Notify any tasks waiting for execution memory to be freed to wake up and try to
            // acquire memory again. This makes impossible the scenario where a task sleeps forever
            // because there are no other tasks left to notify it. Since this is safe to do but may
            // not be strictly necessary, we should revisit whether we can remove this in the future.
            val memoryManager = SparkEnv.get.memoryManager
            memoryManager.synchronized { memoryManager.notifyAll() }
          }
        } finally {
          TaskContext.unset()
        }
      }
    }
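`Task.run` is a template method: the final `run` sets up a per-thread context, delegates to the abstract `runTask`, and always cleans up afterwards. The structure can be sketched as follows, where `MiniTask`, `SumTask`, `Context`, and `CurrentContext` are hypothetical names invented for the illustration:

```scala
object TaskTemplateSketch {
  final case class Context(stageId: Int, partitionId: Int, attemptNumber: Int)

  // Hypothetical thread-local holder for the context of the task on this thread.
  object CurrentContext {
    private val holder = new ThreadLocal[Context]
    def get: Option[Context] = Option(holder.get)
    def set(c: Context): Unit = holder.set(c)
    def unset(): Unit = holder.remove()
  }

  abstract class MiniTask[T](stageId: Int, partitionId: Int) {
    // Template method: fixed setup and cleanup around the subclass-specific work.
    final def run(attemptNumber: Int): T = {
      val context = Context(stageId, partitionId, attemptNumber)
      CurrentContext.set(context)
      try {
        runTask(context)
      } finally {
        // Always clear the thread-local, even if runTask throws.
        CurrentContext.unset()
      }
    }

    // The actual computation, supplied by each concrete task type.
    protected def runTask(context: Context): T
  }

  // A toy concrete task that sums the data of its partition.
  final class SumTask(stageId: Int, partitionId: Int, data: Seq[Int])
      extends MiniTask[Int](stageId, partitionId) {
    protected def runTask(context: Context): Int = data.sum
  }
}
```

Marking `run` as `final` keeps subclasses from skipping the setup and cleanup, which is the reason the two concrete task types only ever override `runTask`.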
Spark Core (15): How the Executor Runs a Task, Principles and Source Code Analysis (Part 1)