
Lesson 37: Inside Task Execution and Result Handling



Task execution and result handling: principle flow and source-code walkthrough.

In Standalone mode, the CoarseGrainedSchedulerBackend on the Driver sends a LaunchTask message (from its launchTasks method) to CoarseGrainedExecutorBackend; after receiving the LaunchTask message, CoarseGrainedExecutorBackend calls executor.launchTask.

The receive method of CoarseGrainedExecutorBackend pattern-matches on the incoming LaunchTask message:

override def receive: PartialFunction[Any, Unit] = {
  ...
  case LaunchTask(data) =>
    if (executor == null) {
      exitExecutor(1, "Received LaunchTask command but executor was null")
    } else {
      val taskDesc = ser.deserialize[TaskDescription](data.value)
      logInfo("Got assigned task " + taskDesc.taskId)
      executor.launchTask(this, taskId = taskDesc.taskId, attemptNumber = taskDesc.attemptNumber,
        taskDesc.name, taskDesc.serializedTask)
    }

 

(1) The LaunchTask case first checks whether the executor exists; if it is null, the executor process exits. Otherwise the TaskDescription is deserialized:

val taskDesc = ser.deserialize[TaskDescription](data.value)

(2) The Executor executes the Task through launchTask, which is passed the taskId, the attempt number, the task name, and the serialized task itself:

executor.launchTask(this, taskId = taskDesc.taskId, attemptNumber = taskDesc.attemptNumber, taskDesc.name, taskDesc.serializedTask)

 

Entering the launchTask method of Executor.scala: it creates a new TaskRunner from the taskId, attempt number, task name and serialized task, puts it into the runningTasks data structure, and executes the TaskRunner on the threadPool. The launchTask source code:

def launchTask(
    context: ExecutorBackend,
    taskId: Long,
    attemptNumber: Int,
    taskName: String,
    serializedTask: ByteBuffer): Unit = {
  val tr = new TaskRunner(context, taskId = taskId, attemptNumber = attemptNumber, taskName,
    serializedTask)
  runningTasks.put(taskId, tr)
  threadPool.execute(tr)
}

 

TaskRunner itself implements the Runnable interface:

class TaskRunner(
    execBackend: ExecutorBackend,
    val taskId: Long,
    val attemptNumber: Int,
    taskName: String,
    serializedTask: ByteBuffer)
  extends Runnable {
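
The mechanism here is the plain Runnable-plus-thread-pool pattern: Executor hands each TaskRunner to its thread pool via threadPool.execute(tr). Below is a minimal, self-contained sketch of that general pattern only; it is not Spark's actual pool construction (Spark builds a daemon cached thread pool through its ThreadUtils helper), and the object name RunnableSketch is just an illustrative placeholder.

import java.util.concurrent.Executors

object RunnableSketch {
  def main(args: Array[String]): Unit = {
    // A small fixed pool stands in for Executor's task thread pool.
    val pool = Executors.newFixedThreadPool(2)
    val work: Runnable = new Runnable {
      // The body plays the role of TaskRunner.run.
      override def run(): Unit = println(s"running on ${Thread.currentThread.getName}")
    }
    pool.execute(work)   // same call shape as threadPool.execute(tr) above
    pool.shutdown()
  }
}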

 

Now look at TaskRunner's run method. TaskMemoryManager manages the task's memory, deserializeStartTime records when deserialization starts, setContextClassLoader installs the class loader that will load the concrete classes, and ser is the closure serializer:

override def run(): Unit = {
  threadId = Thread.currentThread.getId
  Thread.currentThread.setName(threadName)
  val threadMXBean = ManagementFactory.getThreadMXBean
  val taskMemoryManager = new TaskMemoryManager(env.memoryManager, taskId)
  val deserializeStartTime = System.currentTimeMillis()
  val deserializeStartCpuTime = if (threadMXBean.isCurrentThreadCpuTimeSupported) {
    threadMXBean.getCurrentThreadCpuTime
  } else 0L
  Thread.currentThread.setContextClassLoader(replClassLoader)
  val ser = env.closureSerializer.newInstance()
  logInfo(s"Running $taskName (TID $taskId)")
  execBackend.statusUpdate(taskId, TaskState.RUNNING, EMPTY_BYTE_BUFFER)
  ...
  val (taskFiles, taskJars, taskProps, taskBytes) =
    Task.deserializeWithDependencies(serializedTask)
  ...
  val value = try {
    val res = task.run(
      taskAttemptId = taskId,
      attemptNumber = attemptNumber,
      metricsSystem = env.metricsSystem)
    threwException = false
    res
  ...
  val valueBytes = resultSer.serialize(value)
  ...

 

Then execBackend.statusUpdate is called. statusUpdate is a method of the ExecutorBackend trait; through it the ExecutorBackend reports its state to the Driver.

private[spark] trait ExecutorBackend {
  def statusUpdate(taskId: Long, state: TaskState, data: ByteBuffer): Unit
}

(3) In TaskRunner's run method, the TaskRunner executes the concrete Task on the thread pool. The run method first calls statusUpdate to tell the Driver that the task is in the RUNNING state:

execBackend.statusUpdate(taskId, TaskState.RUNNING, EMPTY_BYTE_BUFFER)

Here EMPTY_BYTE_BUFFER carries no payload:

private val EMPTY_BYTE_BUFFER = ByteBuffer.wrap(new Array[Byte](0))

Next, Task.deserializeWithDependencies(serializedTask) deserializes the task and yields a tuple containing taskFiles, taskJars, taskProps and taskBytes.

(4) The Executor runs the concrete Task via the TaskRunner on the thread pool. Inside, the TaskRunner first does some preparation work: it deserializes the Task's dependencies.

val (taskFiles, taskJars, taskProps, taskBytes) =
  Task.deserializeWithDependencies(serializedTask)

 

Then it fetches the required files, JARs and so on over the network:

updateDependencies(taskFiles, taskJars)

Looking at the updateDependencies method: it downloads the set of files and JARs newly received from the SparkContext that the task needs, and loads the new JARs into the class loader. The updateDependencies source code:

private def updateDependencies(newFiles: HashMap[String, Long], newJars: HashMap[String, Long]) {
  lazy val hadoopConf = SparkHadoopUtil.get.newConfiguration(conf)
  synchronized {
    // Fetch missing dependencies
    for ((name, timestamp) <- newFiles if currentFiles.getOrElse(name, -1L) < timestamp) {
      logInfo("Fetching " + name + " with timestamp " + timestamp)
      // Fetch file with useCache mode, close cache for local mode.
      Utils.fetchFile(name, new File(SparkFiles.getRootDirectory()), conf,
        env.securityManager, hadoopConf, timestamp, useCache = !isLocal)
      currentFiles(name) = timestamp
    }
    for ((name, timestamp) <- newJars) {
      val localName = name.split("/").last
      val currentTimeStamp = currentJars.get(name)
        .orElse(currentJars.get(localName))
        .getOrElse(-1L)
      if (currentTimeStamp < timestamp) {
        logInfo("Fetching " + name + " with timestamp " + timestamp)
        // Fetch file with useCache mode, close cache for local mode.
        Utils.fetchFile(name, new File(SparkFiles.getRootDirectory()), conf,
          env.securityManager, hadoopConf, timestamp, useCache = !isLocal)
        currentJars(name) = timestamp
        // Add it to our class loader
        val url = new File(SparkFiles.getRootDirectory(), localName).toURI.toURL
        if (!urlClassLoader.getURLs().contains(url)) {
          logInfo("Adding " + url + " to class loader")
          urlClassLoader.addURL(url)
        }
      }
    }
  }
}

 

In Executor's updateDependencies, the downloads happen while the Executor is running concrete tasks, and the download code sits inside a synchronized block: the Executor runs tasks in threads, and different task threads within the same Stage share these dependencies, so when multiple threads in the ExecutorBackend process touch this shared state, the operations must be protected by a lock.

Utils.fetchFile, called from updateDependencies, downloads a file or directory to the target directory and supports several ways of fetching based on the URL: HTTP, Hadoop-compatible file systems, and files on a standard file system. Fetching directories is only supported from Hadoop-compatible file systems. If useCache is set to true, it first tries to fetch the file into a local cache that is shared by executors of the same application; useCache is mainly used for executors, not for local mode. If the target file already exists with contents different from the requested file, a SparkException is thrown.

def fetchFile(
    url: String,
    targetDir: File,
    conf: SparkConf,
    securityMgr: SecurityManager,
    hadoopConf: Configuration,
    timestamp: Long,
    useCache: Boolean) {
  ...
  doFetchFile(url, localDir, cachedFileName, conf, securityMgr, hadoopConf)
  ...

 

The doFetchFile method covers downloads over the spark, http | https | ftp, and file protocols:

private def doFetchFile(
    url: String,
    targetDir: File,
    filename: String,
    conf: SparkConf,
    securityMgr: SecurityManager,
    hadoopConf: Configuration) {
  val targetFile = new File(targetDir, filename)
  val uri = new URI(url)
  val fileOverwrite = conf.getBoolean("spark.files.overwrite", defaultValue = false)
  Option(uri.getScheme).getOrElse("file") match {
    case "spark" =>
      ...
      downloadFile(url, is, targetFile, fileOverwrite)
    case "http" | "https" | "ftp" =>
      ...
      downloadFile(url, in, targetFile, fileOverwrite)
    case "file" =>
      ...
      copyFile(url, sourceFile, targetFile, fileOverwrite)
    case _ =>
      val fs = getHadoopFileSystem(uri, hadoopConf)
      val path = new Path(uri)
      fetchHcfsFile(path, targetDir, fs, conf, hadoopConf, fileOverwrite,
        filename = Some(filename))
  }
}
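
On the user-facing side, the files and JARs tracked here normally come from SparkContext.addFile / addJar calls on the driver (or the equivalent --files / --jars submit options). A minimal sketch of that usage, assuming a local path /tmp/lookup.txt exists; the path, app name and object name are illustrative placeholders only:

import org.apache.spark.{SparkConf, SparkContext, SparkFiles}

object AddFileSketch {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("add-file-sketch").setMaster("local[2]"))
    // Register a side file with the driver; executors pull it down through
    // updateDependencies / Utils.fetchFile before running a task that needs it.
    sc.addFile("/tmp/lookup.txt")   // hypothetical path, assumed to exist
    val resolved = sc.parallelize(1 to 4, 2).map { i =>
      // On the executor side, resolve the locally fetched copy of the file.
      s"$i -> ${SparkFiles.get("lookup.txt")}"
    }
    resolved.collect().foreach(println)
    sc.stop()
  }
}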

 

(5) Back in TaskRunner's run method: once all dependency JARs have been downloaded, the Task itself is deserialized:

task = ser.deserialize[Task[Any]](taskBytes, Thread.currentThread.getContextClassLoader)

Before the concrete Task's business logic is executed, four rounds of deserialization take place:

a)      deserialization of the TaskDescription;

b)      deserialization of the Task's dependencies;

c)      deserialization of the Task itself;

d)      deserialization of the RDD.

 

(6) Back in TaskRunner's run method, the deserialized Task's run method is called to execute the task and obtain its result:

val value = try {
  val res = task.run(
    taskAttemptId = taskId,
    attemptNumber = attemptNumber,
    metricsSystem = env.metricsSystem)
  threwException = false
  res

When Task.run is invoked it leads to a call of Task's abstract runTask method, and inside runTask the RDD's iterator() method is called. That call is exactly where the Partition corresponding to the current Task gets computed: internally it iterates over the Partition's elements and hands them to our user-defined function for processing!

Entering task.run: the run method in turn calls runTask:

final def run(
    taskAttemptId: Long,
    attemptNumber: Int,
    metricsSystem: MetricsSystem): T = {
  SparkEnv.get.blockManager.registerTask(taskAttemptId)
  context = new TaskContextImpl(
  ...
  TaskContext.setTaskContext(context)
  ...
  try {
    runTask(context)
  ...

 

The runTask method in Task.scala is abstract, with no concrete implementation here:

def runTask(context: TaskContext): T

There are two kinds of Task: ResultTask and ShuffleMapTask; the abstract runTask method is implemented by these subclasses. Let's look first at ShuffleMapTask's runTask, which at run time calls the RDD's iterator and computes the partition:

override def runTask(context: TaskContext): MapStatus = {
  ...
  val ser = SparkEnv.get.closureSerializer.newInstance()
  val (rdd, dep) = ser.deserialize[(RDD[_], ShuffleDependency[_, _, _])](
  ...
    val manager = SparkEnv.get.shuffleManager
    writer = manager.getWriter[Any, Any](dep.shuffleHandle, partitionId, context)
    writer.write(rdd.iterator(partition, context).asInstanceOf[Iterator[_ <: Product2[Any, Any]]])
    writer.stop(success = true).get
  ...

After computing its Partition, a ShuffleMapTask uses the ShuffleWriter obtained from the shuffleManager to write the computed data to files, in whatever way the concrete shuffleManager implementation dictates. When the write completes, the resulting MapStatus is sent to the DAGScheduler, and the MapOutputTracker on the Driver records the registered map output.
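
To make the two task types concrete: a job containing a shuffle followed by an action runs the stage before the shuffle as ShuffleMapTasks and the final stage as ResultTasks. A minimal, self-contained sketch (the object name and app name are illustrative placeholders):

import org.apache.spark.{SparkConf, SparkContext}

object TwoTaskKindsSketch {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("two-task-kinds").setMaster("local[2]"))
    val words = sc.parallelize(Seq("a", "b", "a", "c"), 2)
    // reduceByKey introduces a shuffle: the stage before it runs as ShuffleMapTasks,
    // whose map output is registered with the MapOutputTracker.
    val counts = words.map(w => (w, 1)).reduceByKey(_ + _)
    // collect() is the action: the final stage runs as ResultTasks, which read the
    // shuffle output and apply the result function to each partition.
    counts.collect().foreach(println)
    sc.stop()
  }
}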

Similarly, ResultTask's runTask also calls the RDD's iterator and computes its partition. The MapOutputTracker makes the ShuffleMapTask output available to the ResultTask, which reads the shuffle output of the preceding Stage and produces the final result of the whole Job:

override def runTask(context: TaskContext): U = {
  ...
  val ser = SparkEnv.get.closureSerializer.newInstance()
  val (rdd, func) = ser.deserialize[(RDD[T], (TaskContext, Iterator[T]) => U)](
  ...
  func(context, rdd.iterator(partition, context))
}

When the runTask methods of ResultTask and ShuffleMapTask actually execute, they call the RDD's iterator to compute the partition. The iterator method in RDD.scala:

final def iterator(split: Partition, context: TaskContext): Iterator[T] = {
  if (storageLevel != StorageLevel.NONE) {
    getOrCompute(split, context)
  } else {
    computeOrReadCheckpoint(split, context)
  }
}
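
Which branch iterator takes depends on whether the RDD has been persisted: with a storage level other than NONE it goes through getOrCompute (reading or populating the cache), otherwise it computes the partition or reads it from a checkpoint. A small sketch of the user-visible effect; the object name and app name are illustrative placeholders:

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.storage.StorageLevel

object IteratorPathSketch {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("iterator-path").setMaster("local[2]"))
    val nums = sc.parallelize(1 to 1000, 4).map(_ * 2)
    // storageLevel is NONE here, so iterator() goes through computeOrReadCheckpoint.
    nums.count()
    // After persist(), the storage level is MEMORY_ONLY, so subsequent actions go
    // through getOrCompute and can read cached blocks instead of recomputing.
    nums.persist(StorageLevel.MEMORY_ONLY)
    nums.count()   // computes and caches the partitions
    nums.count()   // served from the cache
    sc.stop()
  }
}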

 

 

(7) Back in TaskRunner's run method: the execution result is serialized, and depending on its size a different way of returning it to the Driver is chosen:

- The result of task.run is assigned to value.

- resultSer.serialize(value) serializes that result into valueBytes.

- maxResultSize > 0 && resultSize > maxResultSize checks the size of the serialized result and handles it accordingly; the total result a task may return to the driver is capped at 1 GB by default.

If the result is very large and exceeds maxResultSize (1 GB by default), a log message reports that the task result is over the size limit, and only ser.serialize(new IndirectTaskResult[Any](TaskResultBlockId(taskId), resultSize)) is returned, without the actual data.

If the result is below maxResultSize but larger than maxDirectResultSize, it is put into the blockManager and ser.serialize(new IndirectTaskResult[Any](blockId, resultSize)) is returned. As the maxDirectResultSize code further below shows, this threshold is the smaller of spark.task.maxDirectResultSize (1 MB by default) and the RPC max message size (128 MB by default).

If the result is smaller than maxDirectResultSize, serializedDirectResult is returned directly.

The relevant part of TaskRunner's run method:

override def run(): Unit = {
  ...
  val value = try {
    val res = task.run(
      taskAttemptId = taskId,
      attemptNumber = attemptNumber,
      metricsSystem = env.metricsSystem)
    threwException = false
    res
  ...
  val valueBytes = resultSer.serialize(value)
  ...
  val directResult = new DirectTaskResult(valueBytes, accumUpdates)
  val serializedDirectResult = ser.serialize(directResult)
  val resultSize = serializedDirectResult.limit
  ...

  val serializedResult: ByteBuffer = {
    if (maxResultSize > 0 && resultSize > maxResultSize) {
      ...
      ser.serialize(new IndirectTaskResult[Any](TaskResultBlockId(taskId), resultSize))
    } else if (resultSize > maxDirectResultSize) {
      val blockId = TaskResultBlockId(taskId)
      env.blockManager.putBytes(
        blockId,
        new ChunkedByteBuffer(serializedDirectResult.duplicate()),
        StorageLevel.MEMORY_AND_DISK_SER)
      ...
      ser.serialize(new IndirectTaskResult[Any](blockId, resultSize))
    } else {
      ...
      serializedDirectResult
    }
  }

 

Here maxResultSize defaults to 1 GB; the total result a task returns may be at most 1 GB:

Executor.scala:
// Limit of bytes for total size of results (default is 1GB)
private val maxResultSize = Utils.getMaxResultSize(conf)
...
Utils.scala:
// Limit of bytes for total size of results (default is 1GB)
def getMaxResultSize(conf: SparkConf): Long = {
  memoryStringToMb(conf.get("spark.driver.maxResultSize", "1g")).toLong << 20
}

 

maxDirectResultSize in Executor.scala is the minimum of spark.task.maxDirectResultSize (configurable, 1 MB by default) and RpcUtils.maxMessageSizeBytes, where spark.rpc.message.maxSize defaults to 128 MB:

private val maxDirectResultSize = Math.min(
  conf.getSizeAsBytes("spark.task.maxDirectResultSize", 1L << 20),
  RpcUtils.maxMessageSizeBytes(conf))
...
def maxMessageSizeBytes(conf: SparkConf): Int = {
  val maxSizeInMB = conf.getInt("spark.rpc.message.maxSize", 128)
  if (maxSizeInMB > MAX_MESSAGE_SIZE_IN_MB) {
    throw new IllegalArgumentException(
      s"spark.rpc.message.maxSize should not be greater than $MAX_MESSAGE_SIZE_IN_MB MB")
  }
  maxSizeInMB * 1024 * 1024
}
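
These thresholds are tunable. A hedged sketch of how an application might set them before creating the SparkContext; the values shown are only examples, and the object name and app name are illustrative placeholders:

import org.apache.spark.{SparkConf, SparkContext}

object ResultSizeConfigSketch {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("result-size-config")
      .setMaster("local[2]")
      // Cap on the total serialized result size a job may return to the driver.
      .set("spark.driver.maxResultSize", "2g")
      // Results up to this size are sent directly in the status update instead of
      // going through the BlockManager (still bounded by the RPC message size).
      .set("spark.task.maxDirectResultSize", "4m")
      .set("spark.rpc.message.maxSize", "128")   // in MB
    val sc = new SparkContext(conf)
    // collect() brings each task's result back to the driver along the path
    // described above (direct vs. BlockManager-backed indirect results).
    println(sc.parallelize(1 to 10, 2).collect().mkString(","))
    sc.stop()
  }
}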

 

One more note: the Driver sends the serialized task to the Executor, and the size of the serialized task is limited. In Spark 1.6, the limit checked in CoarseGrainedSchedulerBackend's launchTasks method is akkaFrameSize - AkkaUtils.reservedSizeBytes, where akkaFrameSize is 128 MB and reservedSizeBytes is 200 KB (200 * 1024 bytes):

private def launchTasks(tasks: Seq[Seq[TaskDescription]]) {
  ...
  if (serializedTask.limit >= akkaFrameSize - AkkaUtils.reservedSizeBytes) {
...
private val akkaFrameSize = AkkaUtils.maxFrameSizeBytes(conf)
...
def maxFrameSizeBytes(conf: SparkConf): Int = {
  val frameSizeInMB = conf.getInt("spark.akka.frameSize", 128)
  if (frameSizeInMB > AKKA_MAX_FRAME_SIZE_IN_MB) {
    throw new IllegalArgumentException(
      s"spark.akka.frameSize should not be greater than $AKKA_MAX_FRAME_SIZE_IN_MB MB")
  }
  frameSizeInMB * 1024 * 1024
}
...
val reservedSizeBytes = 200 * 1024
...

 

In Spark 2.1, the limit checked in CoarseGrainedSchedulerBackend's launchTasks method is maxRpcMessageSize, which is 128 MB by default:

private def launchTasks(tasks: Seq[Seq[TaskDescription]]) {
  ...
  if (serializedTask.limit >= maxRpcMessageSize) {
...

private val maxRpcMessageSize = RpcUtils.maxMessageSizeBytes(conf)

def maxMessageSizeBytes(conf: SparkConf): Int = {
  val maxSizeInMB = conf.getInt("spark.rpc.message.maxSize", 128)
  if (maxSizeInMB > MAX_MESSAGE_SIZE_IN_MB) {
    throw new IllegalArgumentException(
      s"spark.rpc.message.maxSize should not be greater than $MAX_MESSAGE_SIZE_IN_MB MB")
  }
  maxSizeInMB * 1024 * 1024
}

 

 

Back in TaskRunner's run method: execBackend.statusUpdate(taskId, TaskState.FINISHED, serializedResult) sends a message to the Driver carrying the taskId, TaskState.FINISHED and the serialized result.

The statusUpdate method:

override def statusUpdate(taskId: Long, state: TaskState, data: ByteBuffer) {
  val msg = StatusUpdate(executorId, taskId, state, data)
  driver match {
    case Some(driverRef) => driverRef.send(msg)
    case None => logWarning(s"Drop $msg because has not yet connected to driver")
  }
}

 

(8) CoarseGrainedExecutorBackend sends StatusUpdate to the DriverEndpoint to deliver the execution result. The DriverEndpoint passes the result to TaskSchedulerImpl for processing, which hands it to TaskResultGetter; there, dedicated threads handle the success and failure cases separately, and finally the DAGScheduler is told how the task ended.

 

 

The receive method of DriverEndpoint in CoarseGrainedSchedulerBackend.scala:

override def receive: PartialFunction[Any, Unit] = {
  case StatusUpdate(executorId, taskId, state, data) =>
    scheduler.statusUpdate(taskId, state, data.value)
    if (TaskState.isFinished(state)) {
      executorDataMap.get(executorId) match {
        case Some(executorInfo) =>
          executorInfo.freeCores += scheduler.CPUS_PER_TASK
          makeOffers(executorId)
        case None =>
          // Ignoring the update since we don't know about the executor.
          logWarning(s"Ignored task status update ($taskId state $state) " +
            s"from unknown executor with ID $executorId")
      }
    }

 

In DriverEndpoint's receive method, the StatusUpdate case calls scheduler.statusUpdate, then releases the task's cores and triggers another round of resource offers via makeOffers(executorId).

In TaskSchedulerImpl's statusUpdate:

- If the state is TaskState.LOST, the reason is recorded and the executor is removed.

- If TaskState.isFinished, the task is removed from the TaskSet's running tasks; for TaskState.FINISHED, taskResultGetter.enqueueSuccessfulTask handles it.

- For TaskState.FAILED, TaskState.KILLED or TaskState.LOST, taskResultGetter.enqueueFailedTask handles it.

The TaskSchedulerImpl.statusUpdate source code:

def statusUpdate(tid: Long, state: TaskState, serializedData: ByteBuffer) {
  var failedExecutor: Option[String] = None
  var reason: Option[ExecutorLossReason] = None
  synchronized {
    try {
      taskIdToTaskSetManager.get(tid) match {
        case Some(taskSet) =>
          if (state == TaskState.LOST) {
            // TaskState.LOST is only used by the deprecated Mesos fine-grained scheduling mode,
            // where each executor corresponds to a single task, so mark the executor as failed.
            val execId = taskIdToExecutorId.getOrElse(tid, throw new IllegalStateException(
              "taskIdToTaskSetManager.contains(tid) <=> taskIdToExecutorId.contains(tid)"))
            if (executorIdToRunningTaskIds.contains(execId)) {
              reason = Some(
                SlaveLost(s"Task $tid was lost, so marking the executor as lost as well."))
              removeExecutor(execId, reason.get)
              failedExecutor = Some(execId)
            }
          }
          if (TaskState.isFinished(state)) {
            cleanupTaskState(tid)
            taskSet.removeRunningTask(tid)
            if (state == TaskState.FINISHED) {
              taskResultGetter.enqueueSuccessfulTask(taskSet, tid, serializedData)
            } else if (Set(TaskState.FAILED, TaskState.KILLED, TaskState.LOST).contains(state)) {
              taskResultGetter.enqueueFailedTask(taskSet, tid, state, serializedData)
            }
          }
        case None =>
          logError(
            ("Ignoring update with state %s for TID %s because its task set is gone (this is " +
              "likely the result of receiving duplicate task finished status updates) or its " +
              "executor has been marked as failed.")
              .format(state, tid))
      }
    } catch {
      case e: Exception => logError("Exception in statusUpdate", e)
    }
  }
  // Update the DAGScheduler without holding a lock on this, since that can deadlock
  if (failedExecutor.isDefined) {
    assert(reason.isDefined)
    dagScheduler.executorLost(failedExecutor.get, reason.get)
    backend.reviveOffers()
  }
}

 

Here taskResultGetter is an instance of TaskResultGetter:

private[spark] var taskResultGetter = new TaskResultGetter(sc.env, this)

 

The TaskResultGetter.scala source code:

private[spark] class TaskResultGetter(sparkEnv: SparkEnv, scheduler: TaskSchedulerImpl)
  extends Logging {

  private val THREADS = sparkEnv.conf.getInt("spark.resultGetter.threads", 4)

  // Exposed for testing.
  protected val getTaskResultExecutor: ExecutorService =
    ThreadUtils.newDaemonFixedThreadPool(THREADS, "task-result-getter")
  ...
  def enqueueSuccessfulTask(
      taskSetManager: TaskSetManager,
      tid: Long,
      serializedData: ByteBuffer): Unit = {
    getTaskResultExecutor.execute(new Runnable {
      override def run(): Unit = Utils.logUncaughtExceptions {
        try {
          val (result, size) = serializer.get().deserialize[TaskResult[_]](serializedData) match {
            case directResult: DirectTaskResult[_] =>
              if (!taskSetManager.canFetchMoreResults(serializedData.limit())) {
                return
              }
              // deserialize "value" without holding any lock so that it won't block other threads.
              // We should call it here, so that when it's called again in
              // "TaskSetManager.handleSuccessfulTask", it does not need to deserialize the value.
              directResult.value(taskResultSerializer.get())
              (directResult, serializedData.limit())
            case IndirectTaskResult(blockId, size) =>
              if (!taskSetManager.canFetchMoreResults(size)) {
                // dropped by executor if size is larger than maxResultSize
                sparkEnv.blockManager.master.removeBlock(blockId)
                return
              }
              logDebug("Fetching indirect task result for TID %s".format(tid))
              scheduler.handleTaskGettingResult(taskSetManager, tid)
              val serializedTaskResult = sparkEnv.blockManager.getRemoteBytes(blockId)
              if (!serializedTaskResult.isDefined) {
                /* We won't be able to get the task result if the machine that ran the task failed
                 * between when the task ended and when we tried to fetch the result, or if the
                 * block manager had to flush the result. */
                scheduler.handleFailedTask(
                  taskSetManager, tid, TaskState.FINISHED, TaskResultLost)
                return
              }
              val deserializedResult = serializer.get().deserialize[DirectTaskResult[_]](
                serializedTaskResult.get.toByteBuffer)
              // force deserialization of referenced value
              deserializedResult.value(taskResultSerializer.get())
              sparkEnv.blockManager.master.removeBlock(blockId)
              (deserializedResult, size)
          }

          // Set the task result size in the accumulator updates received from the executors.
          // We need to do this here on the driver because if we did this on the executors then
          // we would have to serialize the result again after updating the size.
          result.accumUpdates = result.accumUpdates.map { a =>
            if (a.name == Some(InternalAccumulator.RESULT_SIZE)) {
              val acc = a.asInstanceOf[LongAccumulator]
              assert(acc.sum == 0L, "task result size should not have been set on the executors")
              acc.setValue(size.toLong)
              acc
            } else {
              a
            }
          }

          scheduler.handleSuccessfulTask(taskSetManager, tid, result)
        } catch {
          case cnf: ClassNotFoundException =>
            val loader = Thread.currentThread.getContextClassLoader
            taskSetManager.abort("ClassNotFound with classloader: " + loader)
          // Matching NonFatal so we don't catch the ControlThrowable from the "return" above.
          case NonFatal(ex) =>
            logError("Exception while getting task result", ex)
            taskSetManager.abort("Exception while getting task result: %s".format(ex))
        }
      }
    })
  }

 

In TaskResultGetter's enqueueSuccessfulTask, a successful task is handled on a newly scheduled thread: the result is deserialized first and then processed according to its type, DirectTaskResult or IndirectTaskResult.

If it is a DirectTaskResult, the value is obtained directly and returned.

If it is an IndirectTaskResult, the bytes are fetched remotely via blockManager.getRemoteBytes and deserialized afterwards.

Finally scheduler.handleSuccessfulTask is called.

The TaskSchedulerImpl.handleSuccessfulTask source code:

def handleSuccessfulTask(
    taskSetManager: TaskSetManager,
    tid: Long,
    taskResult: DirectTaskResult[_]): Unit = synchronized {
  taskSetManager.handleSuccessfulTask(tid, taskResult)
}

TaskSchedulerImpl also has the corresponding handling for failed tasks:

def handleFailedTask(
    taskSetManager: TaskSetManager,
    tid: Long,
    taskState: TaskState,
    reason: TaskFailedReason): Unit = synchronized {
  taskSetManager.handleFailedTask(tid, taskState, reason)
  if (!taskSetManager.isZombie && taskState != TaskState.KILLED) {
    // Need to revive offers again now that the task set manager state has been updated to
    // reflect failed tasks that need to be re-run.
    backend.reviveOffers()
  }
}

 

TaskSchedulerImpl.handleSuccessfulTask delegates to TaskSetManager's handleSuccessfulTask, which tells the DAGScheduler how the task ended and kills the other running attempts of the same task (one attempt has already succeeded, so the remaining attempts no longer need to run).

The TaskSetManager.handleSuccessfulTask source code:

def handleSuccessfulTask(tid: Long, result: DirectTaskResult[_]): Unit = {
  val info = taskInfos(tid)
  val index = info.index
  info.markFinished(TaskState.FINISHED)
  removeRunningTask(tid)
  ...
  sched.dagScheduler.taskEnded(tasks(index), Success, result.value(), result.accumUpdates, info)
  ...
  for (attemptInfo <- taskAttempts(index) if attemptInfo.running) {
    logInfo(s"Killing attempt ${attemptInfo.attemptNumber} for task ${attemptInfo.id} " +
      s"in stage ${taskSet.id} (TID ${attemptInfo.taskId}) on ${attemptInfo.host} " +
      s"as the attempt ${info.attemptNumber} succeeded on ${info.host}")
    sched.backend.killTask(attemptInfo.taskId, attemptInfo.executorId, true)
  }
  ...
  maybeFinishTaskSet()
}
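
Multiple attempts of the same task typically come from retries after failures or from speculative execution. A hedged configuration sketch for enabling speculation (the values and the object name are illustrative only; tune them for your workload):

import org.apache.spark.{SparkConf, SparkContext}

object SpeculationConfigSketch {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("speculation-config")
      .setMaster("local[2]")
      // Launch speculative copies of straggler tasks; when one attempt succeeds,
      // TaskSetManager.handleSuccessfulTask kills the remaining attempts as shown above.
      .set("spark.speculation", "true")
      .set("spark.speculation.multiplier", "1.5")  // how much slower than the median counts as a straggler
      .set("spark.speculation.quantile", "0.75")   // fraction of tasks that must finish before speculating
    val sc = new SparkContext(conf)
    println(sc.parallelize(1 to 100, 8).map(_ + 1).sum())
    sc.stop()
  }
}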

 

TaskSetManager's handleSuccessfulTask calls maybeFinishTaskSet, whose source code is:

private def maybeFinishTaskSet() {
  if (isZombie && runningTasks == 0) {
    sched.taskSetFinished(this)
  }
}

 

To summarize:

As the task execution and result handling flow shows: the task is sent over from the Driver, dispatched by CoarseGrainedSchedulerBackend; CoarseGrainedExecutorBackend receives it and hands it to the Executor, which executes the Task through launchTask. Inside, the TaskRunner does a lot of preparation work: deserializing the Task's dependencies, fetching the required files and JARs over the network, deserializing the Task itself, and so on. Then the Task's runTask is invoked; there are two kinds of runTask, ShuffleMapTask and ResultTask, and both loop over the partition's data through the iterator() method according to the business logic. A ShuffleMapTask reports its MapStatus to the MapOutputTracker; a ResultTask obtains that information from the MapOutputTracker and produces the final result of the Job.

