Spark Source Code: Task Submission Flow (3) - ApplicationMaster


1. Overview

As analyzed in [spark源码-任务提交流程之YarnClusterApplication], while YarnClusterApplication runs, the task submission flow wraps the AM context parameters and then submits the application to the ResourceManager (RM); the amClass parameter is set as part of that context.

In yarn-cluster mode, amClass is org.apache.spark.deploy.yarn.ApplicationMaster, so the AM launch command is essentially bin/java org.apache.spark.deploy.yarn.ApplicationMaster. After the application is submitted to the RM, the RM picks a NodeManager (NM) node and launches the AM on it.

The following sections analyze the AM startup.

2. main: the entry point

Full class path: org.apache.spark.deploy.yarn.ApplicationMaster.

After ApplicationMaster starts, it does three things:

1. Parse the arguments;

2. Instantiate the AM;

3. Call run.

object ApplicationMaster extends Logging {
  def main(args: Array[String]): Unit = {
    SignalUtils.registerLogger(log)
    // Parse the arguments
    val amArgs = new ApplicationMasterArguments(args)
    // Instantiate the AM
    master = new ApplicationMaster(amArgs)
    // Run the AM and exit with its return code
    System.exit(master.run())
  }
}

2.1. Parsing and wrapping the AM arguments

The arguments define the application entry point, jars, user arguments, properties file, and so on.

The companion object also defines the default number of executors as 2.

class ApplicationMasterArguments(val args: Array[String]) {
  // The application jar, plus any jars included via options; set by the --jar argument
  var userJar: String = null
  // The application's entry point (main class)
  var userClass: String = null
  var primaryPyFile: String = null
  var primaryRFile: String = null
  // Application arguments
  var userArgs: Seq[String] = Nil
  // File containing extra properties
  var propertiesFile: String = null

  parseArgs(args.toList)

  private def parseArgs(inputArgs: List[String]): Unit = {
    val userArgsBuffer = new ArrayBuffer[String]()

    var args = inputArgs

    while (!args.isEmpty) {
      // --num-workers, --worker-memory, and --worker-cores are deprecated since 1.0,
      // the properties with executor in their names are preferred.
      args match {
        case ("--jar") :: value :: tail =>
          userJar = value
          args = tail

        case ("--class") :: value :: tail =>
          userClass = value
          args = tail

        case ("--primary-py-file") :: value :: tail =>
          primaryPyFile = value
          args = tail

        case ("--primary-r-file") :: value :: tail =>
          primaryRFile = value
          args = tail

        case ("--arg") :: value :: tail =>
          userArgsBuffer += value
          args = tail

        case ("--properties-file") :: value :: tail =>
          propertiesFile = value
          args = tail

        case _ =>
          printUsageAndExit(1, args)
      }
    }

    if (primaryPyFile != null && primaryRFile != null) {
      // scalastyle:off println
      System.err.println("Cannot have primary-py-file and primary-r-file at the same time")
      // scalastyle:on println
      System.exit(-1)
    }

    userArgs = userArgsBuffer.toList
  }

  def printUsageAndExit(exitCode: Int, unknownParam: Any = null) {
  	//..............
  }
}
object ApplicationMasterArguments {
  val DEFAULT_NUMBER_EXECUTORS = 2
}
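
As a quick illustration (not from the Spark source; the class name, jar path, user argument and properties file below are made-up placeholders), this is roughly the kind of argument list the AM receives in yarn-cluster mode and how parseArgs fills in the fields:

val amArgs = new ApplicationMasterArguments(Array(
  "--class", "com.example.WordCount",          // hypothetical user main class
  "--jar", "hdfs:///apps/wordcount.jar",       // hypothetical application jar
  "--arg", "hdfs:///input/words.txt",          // forwarded to the user class's main method
  "--properties-file", "spark_conf.properties" // hypothetical properties file
))
assert(amArgs.userClass == "com.example.WordCount")
assert(amArgs.userArgs == List("hdfs:///input/words.txt"))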

2.2. Instantiating the AM

Instantiating the AM performs the following steps:

Instantiate sparkConf and load the parameters from the properties file into it;

Copy the sparkConf parameters into system properties;

Instantiate securityMgr;

Instantiate the RM client;

Load the list of localized files set up by the client.

private[spark] class ApplicationMaster(args: ApplicationMasterArguments) extends Logging {

  // TODO: Currently, task to container is computed once (TaskSetManager) - which need not be
  // optimal as more containers are available. Might need to handle this better.

  private val isClusterMode = args.userClass != null
  // Instantiate the Spark config
  private val sparkConf = new SparkConf()
  if (args.propertiesFile != null) {
    // Cache the parameters from the configured properties file into sparkConf (backed by a HashMap)
    Utils.getPropertiesFromFile(args.propertiesFile).foreach { case (k, v) =>
      sparkConf.set(k, v)
    }
  }

  // Instantiate securityMgr from sparkConf
  private val securityMgr = new SecurityManager(sparkConf)

  private var metricsSystem: Option[MetricsSystem] = None

  // Copy the sparkConf parameters into system properties
  sparkConf.getAll.foreach { case (k, v) =>
    sys.props(k) = v
  }

  // Build the YarnConfiguration from sparkConf
  private val yarnConf = new YarnConfiguration(SparkHadoopUtil.newConfiguration(sparkConf))

  // Instantiate the user class loader
  private val userClassLoader = {
    val classpath = Client.getUserClasspath(sparkConf)
    val urls = classpath.map { entry =>
      new URL("file:" + new File(entry.getPath()).getAbsolutePath())
    }

    if (isClusterMode) {
      if (Client.isUserClassPathFirst(sparkConf, isDriver = true)) {
        new ChildFirstURLClassLoader(urls, Utils.getContextOrSparkClassLoader)
      } else {
        new MutableURLClassLoader(urls, Utils.getContextOrSparkClassLoader)
      }
    } else {
      new MutableURLClassLoader(urls, Utils.getContextOrSparkClassLoader)
    }
  }

  // Delegation token renewer (only when a keytab is configured)
  private val credentialRenewer: Option[AMCredentialRenewer] = sparkConf.get(KEYTAB).map { _ =>
    new AMCredentialRenewer(sparkConf, yarnConf)
  }

  // Use the UGI user as the user that runs this ApplicationMaster
  private val ugi = credentialRenewer match {
    case Some(cr) =>
      // Set the context class loader so that the token renewer has access to jars distributed
      // by the user.
      val currentLoader = Thread.currentThread().getContextClassLoader()
      Thread.currentThread().setContextClassLoader(userClassLoader)
      try {
        cr.start()
      } finally {
        Thread.currentThread().setContextClassLoader(currentLoader)
      }

    case _ =>
      SparkHadoopUtil.get.createSparkUser()
  }

  // Instantiate the RM client
  private val client = doAsUser { new YarnRMClient() }

  // Defaults to twice the number of executors (twice the maximum executor count if dynamic
  // allocation is enabled), with a minimum of 3
  private val maxNumExecutorFailures = {
    val effectiveNumExecutors =
      if (Utils.isDynamicAllocationEnabled(sparkConf)) {
        sparkConf.get(DYN_ALLOCATION_MAX_EXECUTORS)
      } else {
        sparkConf.get(EXECUTOR_INSTANCES).getOrElse(0)
      }
    // By default, effectiveNumExecutors is Int.MaxValue if dynamic allocation is enabled. We need
    // avoid the integer overflow here.
    val defaultMaxNumExecutorFailures = math.max(3,
      if (effectiveNumExecutors > Int.MaxValue / 2) Int.MaxValue else (2 * effectiveNumExecutors))

    sparkConf.get(MAX_EXECUTOR_FAILURES).getOrElse(defaultMaxNumExecutorFailures)
  }

  @volatile private var exitCode = 0
  @volatile private var unregistered = false
  @volatile private var finished = false
  @volatile private var finalStatus = getDefaultFinalStatus
  @volatile private var finalMsg: String = ""
  @volatile private var userClassThread: Thread = _

  @volatile private var reporterThread: Thread = _
  @volatile private var allocator: YarnAllocator = _

  // A flag to check whether user has initialized spark context
  @volatile private var registered = false

  // Lock for controlling the allocator (heartbeat) thread.
  private val allocatorLock = new Object()

  // Heartbeat interval
  private val heartbeatInterval = {
    // Ensure that progress is sent before YarnConfiguration.RM_AM_EXPIRY_INTERVAL_MS elapses.
    val expiryInterval = yarnConf.getInt(YarnConfiguration.RM_AM_EXPIRY_INTERVAL_MS, 120000)
    math.max(0, math.min(expiryInterval / 2, sparkConf.get(RM_HEARTBEAT_INTERVAL)))
  }

  // Initial wait interval before allocator polling, to allow a faster ramp-up while executors are being requested
  private val initialAllocationInterval = math.min(heartbeatInterval,
    sparkConf.get(INITIAL_HEARTBEAT_INTERVAL))

  // Next wait interval before the allocator polls again
  private var nextAllocationInterval = initialAllocationInterval

  private var rpcEnv: RpcEnv = null

  // In cluster mode, used to tell the AM when the user's SparkContext has been initialized.
  private val sparkContextPromise = Promise[SparkContext]()

  // Load the list of localized files set up by the client. It is used when launching executors, and is loaded here so that these settings don't pollute the Web UI's environment page in cluster mode
  private val localResources = doAsUser {
    logInfo("Preparing Local resources")
    val resources = HashMap[String, LocalResource]()

    def setupDistributedCache(
        file: String,
        rtype: LocalResourceType,
        timestamp: String,
        size: String,
        vis: String): Unit = {
      val uri = new URI(file)
      val amJarRsrc = Records.newRecord(classOf[LocalResource])
      amJarRsrc.setType(rtype)
      amJarRsrc.setVisibility(LocalResourceVisibility.valueOf(vis))
      amJarRsrc.setResource(ConverterUtils.getYarnUrlFromURI(uri))
      amJarRsrc.setTimestamp(timestamp.toLong)
      amJarRsrc.setSize(size.toLong)

      val fileName = Option(uri.getFragment()).getOrElse(new Path(uri).getName())
      resources(fileName) = amJarRsrc
    }

    val distFiles = sparkConf.get(CACHED_FILES)
    val fileSizes = sparkConf.get(CACHED_FILES_SIZES)
    val timeStamps = sparkConf.get(CACHED_FILES_TIMESTAMPS)
    val visibilities = sparkConf.get(CACHED_FILES_VISIBILITIES)
    val resTypes = sparkConf.get(CACHED_FILES_TYPES)

    for (i <- 0 to distFiles.size - 1) {
      val resType = LocalResourceType.valueOf(resTypes(i))
      setupDistributedCache(distFiles(i), resType, timeStamps(i).toString, fileSizes(i).toString,
      visibilities(i))
    }

    // Distribute the conf archive to executors.
    sparkConf.get(CACHED_CONF_ARCHIVE).foreach { path =>
      val uri = new URI(path)
      val fs = FileSystem.get(uri, yarnConf)
      val status = fs.getFileStatus(new Path(uri))
      // SPARK-16080: Make sure to use the correct name for the destination when distributing the
      // conf archive to executors.
      val destUri = new URI(uri.getScheme(), uri.getRawSchemeSpecificPart(),
        Client.LOCALIZED_CONF_DIR)
      setupDistributedCache(destUri.toString(), LocalResourceType.ARCHIVE,
        status.getModificationTime().toString, status.getLen.toString,
        LocalResourceVisibility.PRIVATE.name())
    }

    // Clean up the configuration so it doesn't show up in the Web UI (since it's really noisy).
    CACHE_CONFIGS.foreach { e =>
      sparkConf.remove(e)
      sys.props.remove(e.key)
    }

    resources.toMap
  }
}
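
As a worked example of the maxNumExecutorFailures default above (the numbers are invented for illustration):

// Illustrative only: static allocation with spark.executor.instances = 10
val effectiveNumExecutors = 10
val defaultMaxNumExecutorFailures = math.max(3,
  if (effectiveNumExecutors > Int.MaxValue / 2) Int.MaxValue else 2 * effectiveNumExecutors)
// defaultMaxNumExecutorFailures == 20; with dynamic allocation and the default (unbounded)
// maximum executor count, effectiveNumExecutors is Int.MaxValue and the overflow guard keeps
// the result at Int.MaxValue.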

2.3. Executing the AM's run method

For cluster mode, run sets system properties, builds the Spark caller context, and then calls runDriver.

private[spark] class ApplicationMaster(args: ApplicationMasterArguments) extends Logging {
	final def run(): Int = {
    doAsUser {
      // Execute the actual implementation, runImpl
      runImpl()
    }
    exitCode
  }

  private def runImpl(): Unit = {
    try {
      val appAttemptId = client.getAttemptId()

      var attemptID: Option[String] = None

      // Set system properties for cluster mode
      if (isClusterMode) {
        // Set the web ui port to be ephemeral for yarn so we don't conflict with
        // other spark processes running on the same box
        System.setProperty("spark.ui.port", "0")

        // Set the master and deploy mode property to match the requested mode.
        System.setProperty("spark.master", "yarn")
        System.setProperty("spark.submit.deployMode", "cluster")

        // Set this internal configuration if it is running on cluster mode, this
        // configuration will be checked in SparkContext to avoid misuse of yarn cluster mode.
        System.setProperty("spark.yarn.app.id", appAttemptId.getApplicationId().toString())

        attemptID = Option(appAttemptId.getAttemptId.toString)
      }

      // Set up the Spark caller context on HDFS and YARN; the context is built from the arguments passed in
      new CallerContext(
        "APPMASTER", sparkConf.get(APP_CALLER_CONTEXT),
        Option(appAttemptId.getApplicationId.toString), attemptID).setCurrentContext()

      logInfo("ApplicationAttemptId: " + appAttemptId)

      // This shutdown hook should run *after* the SparkContext is shut down.
      val priority = ShutdownHookManager.SPARK_CONTEXT_SHUTDOWN_PRIORITY - 1
      ShutdownHookManager.addShutdownHook(priority) { () =>
        val maxAppAttempts = client.getMaxRegAttempts(sparkConf, yarnConf)
        val isLastAttempt = client.getAttemptId().getAttemptId() >= maxAppAttempts

        if (!finished) {
          // The default state of ApplicationMaster is failed if it is invoked by shut down hook.
          // This behavior is different compared to 1.x version.
          // If user application is exited ahead of time by calling System.exit(N), here mark
          // this application as failed with EXIT_EARLY. For a good shutdown, user shouldn't call
          // System.exit(0) to terminate the application.
          finish(finalStatus,
            ApplicationMaster.EXIT_EARLY,
            "Shutdown hook called before final status was reported.")
        }

        if (!unregistered) {
          // we only want to unregister if we don't want the RM to retry
          if (finalStatus == FinalApplicationStatus.SUCCEEDED || isLastAttempt) {
            unregister(finalStatus, finalMsg)
            cleanupStagingDir()
          }
        }
      }

      if (isClusterMode) {
        // yarn-cluster mode: run the driver
        runDriver()
      } else {
        runExecutorLauncher()
      }
    } catch {
      case e: Exception =>
        // catch everything else if not specifically handled
        logError("Uncaught exception: ", e)
        finish(FinalApplicationStatus.FAILED,
          ApplicationMaster.EXIT_UNCAUGHT_EXCEPTION,
          "Uncaught exception: " + StringUtils.stringifyException(e))
    } finally {
      try {
        metricsSystem.foreach { ms =>
          ms.report()
          ms.stop()
        }
      } catch {
        case e: Exception =>
          logWarning("Exception during stopping of the metric system: ", e)
      }
    }
  }
  
  private def doAsUser[T](fn: => T): T = {
    ugi.doAs(new PrivilegedExceptionAction[T]() {
      override def run: T = fn
    })
  }
}

2.3.1. runDriver

private[spark] class ApplicationMaster(args: ApplicationMasterArguments) extends Logging {
  private def runDriver(): Unit = {
    addAmIpFilter(None)
    // Start the user application and return the thread, i.e. start the driver thread
    userClassThread = startUserApplication()

    // This a bit hacky, but we need to wait until the spark.driver.port property has
    // been set by the Thread executing the user class.
    logInfo("Waiting for spark context initialization...")
    val totalWaitTime = sparkConf.get(AM_MAX_WAIT_TIME)
    try {
      val sc = ThreadUtils.awaitResult(sparkContextPromise.future,
        Duration(totalWaitTime, TimeUnit.MILLISECONDS))
      if (sc != null) {
        rpcEnv = sc.env.rpcEnv

        val userConf = sc.getConf
        val host = userConf.get("spark.driver.host")
        val port = userConf.get("spark.driver.port").toInt
        // Register the AM with the RM
        registerAM(host, port, userConf, sc.ui.map(_.webUrl))

        val driverRef = rpcEnv.setupEndpointRef(
          RpcAddress(host, port),
          YarnSchedulerBackend.ENDPOINT_NAME)
        // Request resources
        createAllocator(driverRef, userConf)
      } else {
        // Sanity check; should never happen in normal operation, since sc should only be null
        // if the user app did not create a SparkContext.
        throw new IllegalStateException("User did not initialize spark context!")
      }
      resumeDriver()
      userClassThread.join()
    } catch {
      case e: SparkException if e.getCause().isInstanceOf[TimeoutException] =>
        logError(
          s"SparkContext did not initialize after waiting for $totalWaitTime ms. " +
           "Please check earlier log output for errors. Failing the application.")
        finish(FinalApplicationStatus.FAILED,
          ApplicationMaster.EXIT_SC_NOT_INITED,
          "Timed out waiting for SparkContext.")
    } finally {
      resumeDriver()
    }
  }
}
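
runDriver blocks on sparkContextPromise, so something must complete it. In cluster mode this happens from inside the user's SparkContext: the YARN cluster scheduler's post-start hook calls back into the AM, roughly as sketched below (a simplified sketch of the corresponding ApplicationMaster helpers, not verbatim source):

  // Called (indirectly) during SparkContext initialization in cluster mode
  private def sparkContextInitialized(sc: SparkContext) = {
    sparkContextPromise.synchronized {
      // Wake up runDriver's awaitResult with the now-initialized SparkContext
      sparkContextPromise.success(sc)
      // Park the driver thread until runDriver has registered the AM and created the allocator
      sparkContextPromise.wait()
    }
  }

  private def resumeDriver(): Unit = {
    sparkContextPromise.synchronized {
      // Let the user class continue past SparkContext construction
      sparkContextPromise.notify()
    }
  }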
2.3.1.1. startUserApplication: starting the driver thread

"Driver" is simply the name of the thread that executes the user class code;

The user class is determined by the class path given via the spark-submit --class argument;

The user class's main method is obtained via reflection;

A thread is created to execute that main method;

The driver thread is a child thread of the AM process.

private[spark] class ApplicationMaster(args: ApplicationMasterArguments) extends Logging {
	private def startUserApplication(): Thread = {
    logInfo("Starting the user application in a separate Thread")

    // Resolve the user arguments
    var userArgs = args.userArgs
    if (args.primaryPyFile != null && args.primaryPyFile.endsWith(".py")) {
      // When running pyspark, the app is run using PythonRunner. The second argument is the list
      // of files to add to PYTHONPATH, which Client.scala already handles, so it's empty.
      userArgs = Seq(args.primaryPyFile, "") ++ userArgs
    }
    if (args.primaryRFile != null && args.primaryRFile.endsWith(".R")) {
      // TODO(davies): add R dependencies here
    }

    // Get the user class's main method via reflection
    val mainMethod = userClassLoader.loadClass(args.userClass)
      .getMethod("main", classOf[Array[String]])

    // Create a new thread
    val userThread = new Thread {
      override def run() {
        try {
          if (!Modifier.isStatic(mainMethod.getModifiers)) {
            logError(s"Could not find static main method in object ${args.userClass}")
            finish(FinalApplicationStatus.FAILED, ApplicationMaster.EXIT_EXCEPTION_USER_CLASS)
          } else {
            // Invoke the main method via reflection
            mainMethod.invoke(null, userArgs.toArray)
            finish(FinalApplicationStatus.SUCCEEDED, ApplicationMaster.EXIT_SUCCESS)
            logDebug("Done running user class")
          }
        } catch {
          case e: InvocationTargetException =>
            e.getCause match {
              case _: InterruptedException =>
                // Reporter thread can interrupt to stop user class
              case SparkUserAppException(exitCode) =>
                val msg = s"User application exited with status $exitCode"
                logError(msg)
                finish(FinalApplicationStatus.FAILED, exitCode, msg)
              case cause: Throwable =>
                logError("User class threw exception: " + cause, cause)
                finish(FinalApplicationStatus.FAILED,
                  ApplicationMaster.EXIT_EXCEPTION_USER_CLASS,
                  "User class threw exception: " + StringUtils.stringifyException(cause))
            }
            sparkContextPromise.tryFailure(e.getCause())
        } finally {
          // Notify the thread waiting for the SparkContext, in case the application did not
          // instantiate one. This will do nothing when the user code instantiates a SparkContext
          // (with the correct master), or when the user code throws an exception (due to the
          // tryFailure above).
          sparkContextPromise.trySuccess(null)
        }
      }
    }
    userThread.setContextClassLoader(userClassLoader)
    // Name the thread "Driver": the driver is a child thread of the AM process that executes the user class's main method
    userThread.setName("Driver")
    // Start the thread: its run method invokes the main method defined by the user class
    userThread.start()
    // Return the driver thread
    userThread
  }
}
2.3.1.2. Registering the AM with the RM

The AM registers itself with the RM over RPC.

Registration is complete once the AM's host, port and externally exposed tracking web URL have been sent to the RM and the RM's response has been received.

private[spark] class ApplicationMaster(args: ApplicationMasterArguments) extends Logging {
	private def registerAM(
      host: String,     // the driver's host
      port: Int,        // the driver's port
      _sparkConf: SparkConf,
      uiAddress: Option[String]): Unit = {
    val appId = client.getAttemptId().getApplicationId().toString()
    val attemptId = client.getAttemptId().getAttemptId().toString()
    val historyAddress = ApplicationMaster
      .getHistoryServerAddress(_sparkConf, yarnConf, appId, attemptId)

    // Call YarnRMClient's register method to perform the registration
    client.register(host, port, yarnConf, _sparkConf, uiAddress, historyAddress)
    registered = true
  }
}

private[spark] class YarnRMClient extends Logging {
  def register(
      driverHost: String,
      driverPort: Int,
      conf: YarnConfiguration,
      sparkConf: SparkConf,
      uiAddress: Option[String],
      uiHistoryAddress: String): Unit = {
    // Build and start the AMRMClient used for AM-RM communication
    amClient = AMRMClient.createAMRMClient()
    amClient.init(conf)
    amClient.start()
    this.uiHistoryAddress = uiHistoryAddress

    val trackingUrl = uiAddress.getOrElse {
      if (sparkConf.get(ALLOW_HISTORY_SERVER_TRACKING_URL)) uiHistoryAddress else ""
    }

    logInfo("Registering the ApplicationMaster")
    synchronized {
      // Register the AM
      amClient.registerApplicationMaster(driverHost, driverPort, trackingUrl)
      registered = true
    }
  }
}

// AMRMClientImpl is one implementation of AMRMClient
public class AMRMClientImpl<T extends ContainerRequest> extends AMRMClient<T> {
  // Register the AM: bind the AM's host, port and appTrackingUrl to the AMRMClient, preparing the registration environment
  public RegisterApplicationMasterResponse registerApplicationMaster(String appHostName, int appHostPort, String appTrackingUrl) throws YarnException, IOException {
        this.appHostName = appHostName;
        this.appHostPort = appHostPort;
        this.appTrackingUrl = appTrackingUrl;
        Preconditions.checkArgument(appHostName != null, "The host name should not be null");
        Preconditions.checkArgument(appHostPort >= -1, "Port number of the host should be any integers larger than or equal to -1");
        // Once everything is prepared, call the actual registration logic
        return this.registerApplicationMaster();
    }

    // The actual AM registration logic
    private RegisterApplicationMasterResponse registerApplicationMaster() throws YarnException, IOException {
        // Wrap the registration info into a request
        RegisterApplicationMasterRequest request = RegisterApplicationMasterRequest.newInstance(this.appHostName, this.appHostPort, this.appTrackingUrl);
        // Register via RPC and wrap the response
        RegisterApplicationMasterResponse response = this.rmClient.registerApplicationMaster(request);
        synchronized(this) {
            this.lastResponseId = 0;
            if (!response.getNMTokensFromPreviousAttempts().isEmpty()) {
                this.populateNMTokens(response.getNMTokensFromPreviousAttempts());
            }

            return response;
        }
    }
}
2.3.1.2.1. RegisterApplicationMasterRequest: wrapping the registration request

It mainly wraps the host, port and tracking web URL.

public abstract class RegisterApplicationMasterRequest {
  public static RegisterApplicationMasterRequest newInstance(String host, int port, String trackingUrl) {
        RegisterApplicationMasterRequest request = (RegisterApplicationMasterRequest)Records.newRecord(RegisterApplicationMasterRequest.class);
        // Host of the node where the ApplicationMaster is running
        request.setHost(host);
        // RPC port the ApplicationMaster exposes for this run
        request.setRpcPort(port);
        // Tracking web URL exposed by the ApplicationMaster; users can check the application's status through it
        request.setTrackingUrl(trackingUrl);
        return request;
    }
}
2.3.1.2.2. RegisterApplicationMasterResponse: the registration response

The important pieces of the response are the maximum resources a single container may request and the application access control lists.

public class RegisterApplicationMasterResponsePBImpl extends RegisterApplicationMasterResponse {
    Builder builder = null;
    boolean viaProto = false;
    // Maximum resources that a single requested container may occupy
    private Resource maximumResourceCapability;
    // Application access control lists
    private Map<ApplicationAccessType, String> applicationACLS = null;
    private List<Container> containersFromPreviousAttempts = null;
    private List<NMToken> nmTokens = null;
    private EnumSet<SchedulerResourceTypes> schedulerResourceTypes = null;
}
2.3.1.3. The AM requests resources from the RM
private[spark] class ApplicationMaster(args: ApplicationMasterArguments) extends Logging {
	private def createAllocator(driverRef: RpcEndpointRef, _sparkConf: SparkConf): Unit = {
    val appId = client.getAttemptId().getApplicationId().toString()
    val driverUrl = RpcEndpointAddress(driverRef.address.host, driverRef.address.port,
      CoarseGrainedSchedulerBackend.ENDPOINT_NAME).toString

    // Before we initialize the allocator, let's log the information about how executors will
    // be run up front, to avoid printing this out for every single executor being launched.
    // Use placeholders for information that changes such as executor IDs.
    logInfo {
      val executorMemory = _sparkConf.get(EXECUTOR_MEMORY).toInt
      val executorCores = _sparkConf.get(EXECUTOR_CORES)
      val dummyRunner = new ExecutorRunnable(None, yarnConf, _sparkConf, driverUrl, "<executorId>",
        "<hostname>", executorMemory, executorCores, appId, securityMgr, localResources)
      dummyRunner.launchContextDebugInfo()
    }

    // Create the allocator
    allocator = client.createAllocator(
      yarnConf,
      _sparkConf,
      driverUrl,
      driverRef,
      securityMgr,
      localResources)

    credentialRenewer.foreach(_.setDriverRef(driverRef))

    // Initialize the AM endpoint *after* the allocator has been initialized. This ensures
    // that when the driver sends an initial executor request (e.g. after an AM restart),
    // the allocator is ready to service requests.
    rpcEnv.setupEndpoint("YarnAM", new AMEndpoint(rpcEnv, driverRef))

    // The allocator requests and allocates resources
    allocator.allocateResources()
    val ms = MetricsSystem.createMetricsSystem("applicationMaster", sparkConf, securityMgr)
    val prefix = _sparkConf.get(YARN_METRICS_NAMESPACE).getOrElse(appId)
    ms.registerSource(new ApplicationMasterSource(prefix, allocator))
    // do not register static sources in this case as per SPARK-25277
    ms.start(false)
    metricsSystem = Some(ms)
    reporterThread = launchReporterThread()
  }
}
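
launchReporterThread is not expanded in this post. Roughly speaking (a simplified sketch, not verbatim source), it starts a daemon thread that keeps the RM heartbeat alive by calling allocator.allocateResources() in a loop, fails the application when too many executors have died, and sleeps between iterations according to the intervals computed in the constructor:

  // Simplified sketch of the reporter loop
  private def launchReporterThread(): Thread = {
    val t = new Thread {
      override def run(): Unit = {
        while (!finished) {
          if (allocator.getNumExecutorsFailed >= maxNumExecutorFailures) {
            finish(FinalApplicationStatus.FAILED,
              ApplicationMaster.EXIT_MAX_EXECUTOR_FAILURES,
              s"Max number of executor failures ($maxNumExecutorFailures) reached")
          } else {
            // Heartbeat with the RM and pick up newly allocated containers
            allocator.allocateResources()
          }
          allocatorLock.synchronized {
            // Sleep for the heartbeat/allocation interval; a new resource request from the
            // driver can notify this lock to wake the loop up earlier
            allocatorLock.wait(heartbeatInterval)
          }
        }
      }
    }
    t.setDaemon(true)
    t.setName("Reporter")
    t.start()
    t
  }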
2.3.1.3.1. createAllocator: creating the allocator
private[spark] class YarnRMClient extends Logging {
  def createAllocator(
      conf: YarnConfiguration,
      sparkConf: SparkConf,
      driverUrl: String,
      driverRef: RpcEndpointRef,
      securityMgr: SecurityManager,
      localResources: Map[String, LocalResource]): YarnAllocator = {
    require(registered, "Must register AM before creating allocator.")
    // Instantiate a YarnAllocator, which requests container resources from the RM
    new YarnAllocator(driverUrl, driverRef, conf, sparkConf, amClient, getAttemptId(), securityMgr,
      localResources, new SparkRackResolver())
  }
}
2.3.1.3.2. allocateResources: the allocator requests and allocates resources
private[yarn] class YarnAllocator(
    driverUrl: String,
    driverRef: RpcEndpointRef,
    conf: YarnConfiguration,
    sparkConf: SparkConf,
    amClient: AMRMClient[ContainerRequest],
    appAttemptId: ApplicationAttemptId,
    securityMgr: SecurityManager,
    localResources: Map[String, LocalResource],
    resolver: SparkRackResolver,
    clock: Clock = new SystemClock)
  extends Logging {
    
    def allocateResources(): Unit = synchronized {
    // Update the container request list: based on the number of currently running executors and
    // the total number of executors required, synchronize the number of containers requested
    // from the ResourceManager.
    updateResourceRequests()

    val progressIndicator = 0.1f
    // Request containers from the ResourceManager and get the allocation response
    val allocateResponse = amClient.allocate(progressIndicator)
    // Get the list of allocated containers
    val allocatedContainers = allocateResponse.getAllocatedContainers()
    // Track blacklisted nodes
    allocatorBlacklistTracker.setNumClusterNodes(allocateResponse.getNumClusterNodes)

    // If any containers were allocated, process them
    if (allocatedContainers.size > 0) {
      logDebug(("Allocated containers: %d. Current executor count: %d. " +
        "Launching executor count: %d. Cluster resources: %s.")
        .format(
          allocatedContainers.size,
          runningExecutors.size,
          numExecutorsStarting.get,
          allocateResponse.getAvailableResources))
      // Process the containers received from the ResourceManager and launch executors in them
      handleAllocatedContainers(allocatedContainers.asScala)
    }

    // Get the list of completed containers (which may include failed ones)
    val completedContainers = allocateResponse.getCompletedContainersStatuses()
    if (completedContainers.size > 0) {
      logDebug("Completed %d containers".format(completedContainers.size))
      // Process the completed containers
      processCompletedContainers(completedContainers.asScala)
      logDebug("Finished processing %d completed containers. Current running executor count: %d."
        .format(completedContainers.size, runningExecutors.size))
    }
  }
}
2.3.1.3.2.1. updateResourceRequests: updating the container requests

Based on the number of currently running executors and the total number of executors required, this method synchronizes the number of containers requested from the ResourceManager.

Using the per-host task counts, the pending container requests are split into three groups:

requests whose locality still matches (localRequests);

requests whose locality no longer matches (staleRequests);

requests with no locality preference (anyHostRequests).

The two groups other than the locality-matching one are cancelled and re-issued; locality is then recomputed according to the container placement strategy so as to maximize data-local task execution.

In short, the method cares about container-request locality: it raises it wherever possible and cancels requests whose locality can no longer be satisfied.

When the number of existing executors exceeds the number needed, the surplus container requests are removed from the pending requests.

private[yarn] class YarnAllocator(
    driverUrl: String,
    driverRef: RpcEndpointRef,
    conf: YarnConfiguration,
    sparkConf: SparkConf,
    amClient: AMRMClient[ContainerRequest],
    appAttemptId: ApplicationAttemptId,
    securityMgr: SecurityManager,
    localResources: Map[String, LocalResource],
    resolver: SparkRackResolver,
    clock: Clock = new SystemClock)
  extends Logging {
    
  def updateResourceRequests(): Unit = {
    // The sequence of pending container requests
    val pendingAllocate = getPendingAllocate
    val numPendingAllocate = pendingAllocate.size
    // Compute the number of missing executors
    val missing = targetNumExecutors - numPendingAllocate -
      numExecutorsStarting.get - runningExecutors.size
    logDebug(s"Updating resource requests, target: $targetNumExecutors, " +
      s"pending: $numPendingAllocate, running: ${runningExecutors.size}, " +
      s"executorsStarting: ${numExecutorsStarting.get}")

    // Split the pending container requests into three groups:
    //   localRequests:   requests whose locality still matches
    //   staleRequests:   requests whose locality no longer matches
    //   anyHostRequests: requests with no locality preference
    val (localRequests, staleRequests, anyHostRequests) = splitPendingAllocationsByLocality(
      hostToLocalTaskCounts, pendingAllocate)

    if (missing > 0) {
      logInfo(s"Will request $missing executor container(s), each with " +
        s"${resource.getVirtualCores} core(s) and " +
        s"${resource.getMemory} MB memory (including $memoryOverhead MB of overhead)")

      // Cancel the requests whose locality no longer matches
      staleRequests.foreach { stale =>
        amClient.removeContainerRequest(stale)
      }
      val cancelledContainers = staleRequests.size
      if (cancelledContainers > 0) {
        logInfo(s"Canceled $cancelledContainers container request(s) (locality no longer needed)")
      }

      // Compute the number of available container slots
      val availableContainers = missing + cancelledContainers

      // Compute the number of potential containers: include the any-host requests so that locality can be improved as much as possible
      val potentialContainers = availableContainers + anyHostRequests.size

      // Recompute each container's node locality and rack locality (other nodes in the same rack)
      val containerLocalityPreferences = containerPlacementStrategy.localityOfRequestedContainers(
        potentialContainers, numLocalityAwareTasks, hostToLocalTaskCounts,
          allocatedHostToContainersMap, localRequests)

      // Re-create the container requests according to the computed locality
      val newLocalityRequests = new mutable.ArrayBuffer[ContainerRequest]
      containerLocalityPreferences.foreach {
        case ContainerLocalityPreferences(nodes, racks) if nodes != null =>
          newLocalityRequests += createContainerRequest(resource, nodes, racks)
        case _ =>
      }

      // The currently available container slots can satisfy all of the new container requests
      if (availableContainers >= newLocalityRequests.size) {
        for (i <- 0 until (availableContainers - newLocalityRequests.size)) {
          newLocalityRequests += createContainerRequest(resource, null, null)
        }
      } else {
        // The available slots cannot satisfy all of the new container requests; the unsatisfied
        // requests would place containers on nodes in other racks, so they are cancelled here in
        // order to obtain better locality
        val numToCancel = newLocalityRequests.size - availableContainers
        anyHostRequests.slice(0, numToCancel).foreach { nonLocal =>
          amClient.removeContainerRequest(nonLocal)
        }
        if (numToCancel > 0) {
          logInfo(s"Canceled $numToCancel unlocalized container requests to resubmit with locality")
        }
      }

      // Re-add the container requests, i.e. hand them over to the RM
      newLocalityRequests.foreach { request =>
        amClient.addContainerRequest(request)
      }

      if (log.isInfoEnabled()) {
        val (localized, anyHost) = newLocalityRequests.partition(_.getNodes() != null)
        if (anyHost.nonEmpty) {
          logInfo(s"Submitted ${anyHost.size} unlocalized container requests.")
        }
        localized.foreach { request =>
          logInfo(s"Submitted container request for host ${hostStr(request)}.")
        }
      }
    } else if (numPendingAllocate > 0 && missing < 0) {
      // The pending + starting + running executor count exceeds the number of executors needed,
      // so cancel the surplus pending requests
      val numToCancel = math.min(numPendingAllocate, -missing)
      logInfo(s"Canceling requests for $numToCancel executor container(s) to have a new desired " +
        s"total $targetNumExecutors executors.")
      // cancel pending allocate requests by taking locality preference into account
      val cancelRequests = (staleRequests ++ anyHostRequests ++ localRequests).take(numToCancel)
      cancelRequests.foreach(amClient.removeContainerRequest)
    }
  }
}
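
A quick worked example of the bookkeeping above (all numbers invented):

// targetNumExecutors = 10, pending = 4, starting = 2, running = 3
val missing = 10 - 4 - 2 - 3   // = 1, so one more locality-aware container request is issued
// If running were 8 instead, missing would be -4 and numToCancel = math.min(4, 4) = 4 pending
// requests would be removed, taking stale requests first, then any-host, then local ones.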
2.3.1.3.2.2. handleAllocatedContainers: handling the allocated containers

Containers are matched and selected at three levels: same node, same rack, and any host (other racks);

Executors are then launched in the selected containers.

private[yarn] class YarnAllocator(
    driverUrl: String,
    driverRef: RpcEndpointRef,
    conf: YarnConfiguration,
    sparkConf: SparkConf,
    amClient: AMRMClient[ContainerRequest],
    appAttemptId: ApplicationAttemptId,
    securityMgr: SecurityManager,
    localResources: Map[String, LocalResource],
    resolver: SparkRackResolver,
    clock: Clock = new SystemClock)
  extends Logging {
  
  def handleAllocatedContainers(allocatedContainers: Seq[Container]): Unit = {
    // The list of containers to use
    val containersToUse = new ArrayBuffer[Container](allocatedContainers.size)

    // Match containers against the pending requests by host to select usable containers
    val remainingAfterHostMatches = new ArrayBuffer[Container]
    for (allocatedContainer <- allocatedContainers) {
      matchContainerToRequest(allocatedContainer, allocatedContainer.getNodeId.getHost,
        containersToUse, remainingAfterHostMatches)
    }

    // Match the remaining containers by rack to select usable containers; done in a separate thread
    val remainingAfterRackMatches = new ArrayBuffer[Container]
    if (remainingAfterHostMatches.nonEmpty) {
      var exception: Option[Throwable] = None
      val thread = new Thread("spark-rack-resolver") {
        override def run(): Unit = {
          try {
            for (allocatedContainer <- remainingAfterHostMatches) {
              val rack = resolver.resolve(conf, allocatedContainer.getNodeId.getHost)
              matchContainerToRequest(allocatedContainer, rack, containersToUse,
                remainingAfterRackMatches)
            }
          } catch {
            case e: Throwable =>
              exception = Some(e)
          }
        }
      }
      thread.setDaemon(true)
      thread.start()

      try {
        thread.join()
      } catch {
        case e: InterruptedException =>
          thread.interrupt()
          throw e
      }

      if (exception.isDefined) {
        throw exception.get
      }
    }

    // Match the containers that matched neither node nor rack against any-host requests
    val remainingAfterOffRackMatches = new ArrayBuffer[Container]
    for (allocatedContainer <- remainingAfterRackMatches) {
      matchContainerToRequest(allocatedContainer, ANY_HOST, containersToUse,
        remainingAfterOffRackMatches)
    }

    // Containers still unmatched after the node, rack and any-host passes are released internally
    if (!remainingAfterOffRackMatches.isEmpty) {
      logDebug(s"Releasing ${remainingAfterOffRackMatches.size} unneeded containers that were " +
        s"allocated to us")
      for (container <- remainingAfterOffRackMatches) {
        internalReleaseContainer(container)
      }
    }
    // Launch executors in the matched containers
    runAllocatedContainers(containersToUse)

    logInfo("Received %d containers from YARN, launching executors on %d of them."
      .format(allocatedContainers.size, containersToUse.size))
  }
}
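
runAllocatedContainers is not expanded in this post. In outline (a simplified sketch based on the ExecutorRunnable usage shown earlier, not verbatim source; executorMemory and executorCores stand for the per-executor settings held by YarnAllocator), each matched container is handed to a launcher thread pool, and an ExecutorRunnable builds the container launch context and asks that container's NodeManager to start a CoarseGrainedExecutorBackend process:

  // Simplified sketch of launching executors in the matched containers
  private def runAllocatedContainers(containersToUse: ArrayBuffer[Container]): Unit = {
    for (container <- containersToUse) {
      executorIdCounter += 1
      val executorId = executorIdCounter.toString
      numExecutorsStarting.incrementAndGet()
      launcherPool.execute(new Runnable {
        override def run(): Unit = {
          // Builds the ContainerLaunchContext and asks the NodeManager to start the
          // CoarseGrainedExecutorBackend JVM inside the allocated container
          new ExecutorRunnable(Some(container), conf, sparkConf, driverUrl, executorId,
            container.getNodeId.getHost, executorMemory, executorCores,
            appAttemptId.getApplicationId.toString, securityMgr, localResources).run()
        }
      })
    }
  }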

3. Execution flow

(Execution flow diagram from the original post; the image is no longer available.)

4. Summary

The AM does three things:

1. Create and start the driver thread, which executes the main method defined by the user class;

2. Register the AM with the RM;

3. Request resources from the RM, select usable containers according to the matching rules, and launch executors in the allocated containers.

The driver is a thread inside the AM process that executes the user class's main method; it is started right after it is defined, and its run method invokes the main method defined by the user class.

When the AM registers with the RM, it passes the AM's host, port and tracking web URL to the RM; the RM responds with the maximum resources a single container may request and the application access control lists.

Resource acquisition works as follows: the AM creates a resource allocator, which requests resources from the RM, filters the allocated containers, and launches executors inside them.

5. References

spark源码-任务提交流程之YarnClusterApplication

Spark内核之YARN Cluster模式源码详解(Submit详解)

yarn2.7源码分析之ApplicationMaster与ResourceManager.ApplicationMasterService的通信

Yarn的ApplicationMaster介绍

2,spark源码分析-ApplicationMaster启动

spark源码跟踪(十一)ApplicationMaster中的关键线程

Spark源码——Spark on YARN Container资源申请分配、Executor的启动

Yarn源码剖析(四)-- AM的注册与资源调度申请Container及启动

Spark源码——Spark on YARN Executor执行Task的过程
