In production, the client submits Spark programs via the spark-submit script.
spark.version: 2.4.0
scala.version: 2.12
Source code walkthrough:
spark-submit:
--main() # entry point org.apache.spark.deploy.SparkSubmit.main() (InProcessSparkSubmit is the in-process variant used by the launcher library)
(new SparkSubmit()).doSubmit(args)
doSubmit() pattern-matches on the action and dispatches to submit()
# doSubmit first parses the arguments
-> parseArguments(); then, inside submit(), prepareSubmitEnvironment() computes the important childMainClass, which in yarn cluster mode is org.apache.spark.deploy.yarn.YarnClusterApplication
# submit() finally calls its nested method doRunMain(), which in turn calls runMain()
runMain() -> mainClass = Utils.classForName(childMainClass) # crosses over into the yarn module
-> app.start(); for a plain main class this ends in mainMethod.invoke(), a reflective call
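To make the dispatch concrete, here is a minimal, self-contained sketch of the same pattern; AppLike and ReflectiveApp are illustrative stand-ins for Spark's SparkApplication and JavaMainApplication, not Spark's actual classes:

trait AppLike {
  def start(args: Array[String]): Unit
}

// adapter for a plain main class: start() reflectively invokes the static main method,
// which is exactly the mainMethod.invoke() step above
class ReflectiveApp(mainClass: Class[_]) extends AppLike {
  override def start(args: Array[String]): Unit = {
    val mainMethod = mainClass.getMethod("main", classOf[Array[String]])
    mainMethod.invoke(null, args) // static method, so the receiver is null
  }
}

def runMainSketch(childMainClass: String, args: Array[String]): Unit = {
  val mainClass = Class.forName(childMainClass)
  val app: AppLike =
    if (classOf[AppLike].isAssignableFrom(mainClass)) {
      mainClass.getDeclaredConstructor().newInstance().asInstanceOf[AppLike]
    } else {
      new ReflectiveApp(mainClass)
    }
  app.start(args)
}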
yarn: org.apache.spark.deploy.yarn.YarnClusterApplication
overrides SparkApplication's start() method
class YarnClusterApplication extends SparkApplication
# starts the YARN client; note that yarnClient does not run inside YARN at this point, it lives in the same client process as spark-submit
start()->new Client(new ClientArguments(args), conf).run()
new ClientArguments(args) # simply parses the arguments
new Client() ->
prepares the YARN client
val yarnClient = YarnClient.createYarnClient ->YarnClient client = new YarnClientImpl()
plus the cluster configuration (the Hadoop/YARN conf)
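For orientation, the bare Hadoop YarnClient lifecycle that Client wraps looks roughly like this (plain Hadoop API, not Spark's code; the sketch stops at createApplication):

import org.apache.hadoop.yarn.client.api.YarnClient
import org.apache.hadoop.yarn.conf.YarnConfiguration

val yarnConf = new YarnConfiguration()          // cluster configuration (RM address, etc.)
val yarnClient = YarnClient.createYarnClient()  // backed by a YarnClientImpl
yarnClient.init(yarnConf)
yarnClient.start()                              // connects to the ResourceManager
val newApp = yarnClient.createApplication()     // asks the RM for a new application id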
/**
* Submit an application to the ResourceManager.
* If set spark.yarn.submit.waitAppCompletion to true, it will stay alive
* reporting the application's status until the application has exited for any reason.
* Otherwise, the client process will exit after submission.
* If the application finishes with a failed, killed, or undefined status,
* throw an appropriate SparkException.
*/
run() -> submitApplication() [connects the launcher backend, initializes the YARN client, starts the YARN client]
// Get a new application from our RM (this only obtains an application id; the AM container is launched on one of the NodeManagers later)
val newApp = yarnClient.createApplication()
// Set up the appropriate contexts to launch our AM
// what is submitted is essentially a /bin/java ... command, so the AM is a separate JVM process, which is why jps can show it
val containerContext = createContainerLaunchContext(newAppResponse)
createContainerLaunchContext
val amClass =
  if (isClusterMode) {
    Utils.classForName("org.apache.spark.deploy.yarn.ApplicationMaster").getName
  } else {
    // `xcall jps` during a spark-shell session shows ExecutorLauncher: spark-shell can only run in client mode, never in cluster mode
    Utils.classForName("org.apache.spark.deploy.yarn.ExecutorLauncher").getName
  }
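In essence the AM launch context carries a plain java command line; a hedged sketch with illustrative paths and placeholders (Spark's real command line is longer and built from the conf):

import scala.collection.JavaConverters._
import org.apache.hadoop.yarn.api.records.ContainerLaunchContext
import org.apache.hadoop.yarn.util.Records

val amClass = "org.apache.spark.deploy.yarn.ApplicationMaster" // cluster mode
val amContainer = Records.newRecord(classOf[ContainerLaunchContext])
amContainer.setCommands(Seq(
  "{{JAVA_HOME}}/bin/java",  // the /bin/java here is why the AM is its own JVM, visible in jps
  "-server",
  amClass,
  "--properties-file", "__spark_conf__/__spark_conf__.properties", // illustrative path
  "1><LOG_DIR>/stdout",
  "2><LOG_DIR>/stderr"
).asJava)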
org.apache.spark.deploy.yarn.ApplicationMaster
ApplicationMaster.main() -> val amArgs = new ApplicationMasterArguments(args) # parses the arguments
master = new ApplicationMaster(amArgs)
starts the ApplicationMaster
System.exit(master.run())
run() -> runImpl() -> runDriver() (cluster mode) / runExecutorLauncher() (client mode) ->
userClassThread = startUserApplication() -> userThread.setName("Driver"); userThread.start() # so the driver here is a thread, not a separate process
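A simplified, self-contained sketch of what startUserApplication amounts to (Spark's version additionally wires up error handling and coordination with the SparkContext):

def startUserApplicationSketch(userClass: String, userArgs: Array[String]): Thread = {
  val mainMethod = Class.forName(userClass).getMethod("main", classOf[Array[String]])
  val userThread = new Thread {
    override def run(): Unit = mainMethod.invoke(null, userArgs) // user code runs here
  }
  userThread.setName("Driver") // hence the driver is just a named thread in the AM JVM
  userThread.start()
  userThread
}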
# resource allocation
createAllocator->allocator.allocateResources()
handleAllocatedContainers(allocatedContainers.asScala)
runAllocatedContainers(containersToUse)
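Underneath allocateResources() is the standard AMRMClient protocol; a rough sketch using the plain Hadoop API (resource sizes are made-up values):

import scala.collection.JavaConverters._
import org.apache.hadoop.yarn.api.records.{Priority, Resource}
import org.apache.hadoop.yarn.client.api.AMRMClient
import org.apache.hadoop.yarn.client.api.AMRMClient.ContainerRequest
import org.apache.hadoop.yarn.conf.YarnConfiguration

val amClient = AMRMClient.createAMRMClient[ContainerRequest]()
amClient.init(new YarnConfiguration())
amClient.start()
amClient.registerApplicationMaster("", 0, "") // register this AM with the RM
amClient.addContainerRequest(new ContainerRequest(
  Resource.newInstance(2048, 1), // 2 GB, 1 vcore: illustrative executor sizing
  null, null, Priority.newInstance(1)))
val granted = amClient.allocate(0.1f).getAllocatedContainers.asScala
// granted plays the role of allocatedContainers in handleAllocatedContainers above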
ExecutorRunnable.run()->
# asks one of the NodeManagers to create the container
nmClient = NMClient.createNMClient()
nmClient.init(conf)
nmClient.start()
startContainer()->prepareCommand()->org.apache.spark.executor.CoarseGrainedExecutorBackend
-> main() -> run() -> env.rpcEnv.setupEndpoint("Executor", new CoarseGrainedExecutorBackend(...))
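The NMClient steps above reduce to the standard Hadoop container-launch call; a simplified sketch (plain Hadoop API, with the launch context assumed to carry the CoarseGrainedExecutorBackend command):

import org.apache.hadoop.yarn.api.records.{Container, ContainerLaunchContext}
import org.apache.hadoop.yarn.client.api.NMClient
import org.apache.hadoop.yarn.conf.YarnConfiguration

def launchExecutor(container: Container, ctx: ContainerLaunchContext): Unit = {
  val nmClient = NMClient.createNMClient()
  nmClient.init(new YarnConfiguration())
  nmClient.start()
  nmClient.startContainer(container, ctx) // the NodeManager forks the executor JVM
}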
/**
* An end point for the RPC that defines what functions to trigger given a message.
*
* It is guaranteed that `onStart`, `receive` and `onStop` will be called in sequence.
*
* The life-cycle of an endpoint is:
*
* {@code constructor -> onStart -> receive* -> onStop}
*
* Note: `receive` can be called concurrently. If you want `receive` to be thread-safe, please use
* [[ThreadSafeRpcEndpoint]]
*
* If any error is thrown from one of [[RpcEndpoint]] methods except `onError`, `onError` will be
* invoked with the cause. If `onError` throws an error, [[RpcEnv]] will ignore it.
*/
# the executor first registers with the driver, receives the driver's registration-success response, and finally launches tasks
-> onStart() -> receive: [RegisteredExecutor, RegisterExecutorFailed, LaunchTask]
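A self-contained toy endpoint to illustrate the quoted lifecycle (constructor -> onStart -> receive* -> onStop); the trait and messages mirror Spark's RpcEndpoint and CoarseGrainedExecutorBackend in shape only, this is not Spark's implementation:

trait ToyEndpoint {
  def onStart(): Unit = {}
  def receive: PartialFunction[Any, Unit]
  def onStop(): Unit = {}
}

case object RegisteredExecutor
case object RegisterExecutorFailed
case class LaunchTask(taskId: Long)

class ToyExecutorBackend extends ToyEndpoint {
  override def onStart(): Unit =
    println("onStart: send RegisterExecutor to the driver") // registration happens first
  override def receive: PartialFunction[Any, Unit] = {
    case RegisteredExecutor     => println("driver accepted: create the Executor")
    case RegisterExecutorFailed => println("driver rejected: exit the process")
    case LaunchTask(id)         => println(s"run task $id on the executor thread pool")
  }
}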