private def tryRegisterAllMasters(): Array[JFuture[_]] = {
  // With HA there may be several Masters, so the message is sent to every one of them
  for (masterAddress <- masterRpcAddresses) yield {
    // Submit a registration task to the thread pool; the task returns immediately
    // once it sees that the registration flag `registered` is true
    registerMasterThreadPool.submit(new Runnable {
      override def run(): Unit = try {
        if (registered.get) {
          return
        }
        logInfo("Connecting to master " + masterAddress.toSparkURL + "...")
        // Obtain a reference to the Master endpoint and send the application registration message
        val masterRef = rpcEnv.setupEndpointRef(masterAddress, Master.ENDPOINT_NAME)
        masterRef.send(RegisterApplication(appDescription, self))
      } catch {
        case ie: InterruptedException => // Cancelled
        case NonFatal(e) => logWarning(s"Failed to connect to master $masterAddress", e)
      }
    })
  }
}
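The pattern above, one registration task per Master with a shared flag that lets late tasks exit early, can be sketched without Spark. This is a minimal illustration under stated assumptions, not Spark's implementation: `RegistrationSketch`, `registerWithAll`, and the `send` callback are hypothetical names, and `send` stands in for `masterRef.send(RegisterApplication(...))`.

```scala
import java.util.concurrent.{Executors, TimeUnit}
import java.util.concurrent.atomic.AtomicBoolean

object RegistrationSketch {
  // Submits one registration task per master address and waits for all of them.
  // `send` is a hypothetical stand-in for sending RegisterApplication to a master.
  def registerWithAll(masters: Seq[String], registered: AtomicBoolean)(send: String => Unit): Unit = {
    val pool = Executors.newFixedThreadPool(math.max(1, masters.size))
    try {
      val futures = masters.map { master =>
        pool.submit(new Runnable {
          override def run(): Unit = {
            // Skip the send if a successful registration has already been recorded
            if (!registered.get()) send(master)
          }
        })
      }
      futures.foreach(_.get(5, TimeUnit.SECONDS))
    } finally {
      pool.shutdown()
    }
  }
}
```

If the flag is already true when the tasks run, no Master is contacted; otherwise every address in `masters` receives one message.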
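When the Master receives the application registration message, its registerApplication method records the application's information and adds the application to the list of applications waiting to run. Once registration is complete, the Master sends RegisteredApplication back to the ClientEndpoint and calls startExecutorsOnWorkers to run the application, which notifies Workers to launch Executors.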
private def startExecutorsOnWorkers(): Unit = {
  // Right now this is a very simple FIFO scheduler. We keep trying to fit in the first app
  // in the queue, then the second app, etc.
  // (Applications that registered first are run first.)
  for (app <- waitingApps) {
    val coresPerExecutor = app.desc.coresPerExecutor.getOrElse(1)
    // If the cores left is less than the coresPerExecutor, the cores left will not be allocated
    if (app.coresLeft >= coresPerExecutor) {
      // Filter out workers that don't have enough resources to launch an executor
      val usableWorkers = workers.toArray.filter(_.state == WorkerState.ALIVE)
        .filter(worker => worker.memoryFree >= app.desc.memoryPerExecutorMB &&
          worker.coresFree >= coresPerExecutor)
        .sortBy(_.coresFree).reverse
      // Decide which Workers to use and how many cores each one contributes. There are two
      // strategies: spread the application over as many Workers as possible, or pack it
      // onto as few Workers as possible
      val assignedCores = scheduleExecutorsOnWorkers(app, usableWorkers, spreadOutApps)
      // Now that we've decided how many cores to allocate on each worker, let's allocate them
      // (this notifies each selected Worker to launch Executors)
      for (pos <- 0 until usableWorkers.length if assignedCores(pos) > 0) {
        allocateWorkerResourceToExecutors(
          app, assignedCores(pos), app.desc.coresPerExecutor, usableWorkers(pos))
      }
    }
  }
}
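The two allocation strategies mentioned in the comments, spreading over many Workers versus packing onto few, can be illustrated with a simplified sketch. It is an assumption-laden reduction of the idea, not the body of `scheduleExecutorsOnWorkers`: it ignores memory and per-application executor limits, and `CoreAllocationSketch` and `assignCores` are hypothetical names.

```scala
object CoreAllocationSketch {
  // Returns the number of cores assigned on each worker, in the same order as `coresFree`.
  def assignCores(
      coresFree: Array[Int],
      coresToAssign: Int,
      coresPerExecutor: Int,
      spreadOut: Boolean): Array[Int] = {
    val assigned = Array.fill(coresFree.length)(0)
    var remaining = math.min(coresToAssign, coresFree.sum)

    // A worker can take one more executor if both the app and the worker have enough cores left
    def canLaunch(pos: Int): Boolean =
      remaining >= coresPerExecutor && coresFree(pos) - assigned(pos) >= coresPerExecutor

    var free = coresFree.indices.filter(canLaunch)
    while (free.nonEmpty) {
      free.foreach { pos =>
        var keepGoing = true
        while (keepGoing && canLaunch(pos)) {
          assigned(pos) += coresPerExecutor
          remaining -= coresPerExecutor
          // When spreading out, give this worker one executor and move on to the next worker
          if (spreadOut) keepGoing = false
        }
      }
      free = free.filter(canLaunch)
    }
    assigned
  }
}
```

With three workers that each have 4 free cores, 6 cores to assign, and 2 cores per executor, spreading out yields 2, 2, 2, while packing yields 4, 2, 0.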
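(2) When the application's ClientEndpoint receives the RegisteredApplication message from the Master, it sets the registration flag registered to true. Once the threads registering with the Masters observe this change, application registration is complete.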
override def receive: PartialFunction[Any, Unit] = {
  // Once the registration threads observe this state change, application registration is complete
  case RegisteredApplication(appId_, masterRef) =>
    // FIXME How to handle the following cases?
    // 1. A master receives multiple registrations and sends back multiple
    // RegisteredApplications due to an unstable network.
    // 2. Receive multiple RegisteredApplication from different masters because the master is
    // changing.
    appId.set(appId_)
    registered.set(true)
    master = Some(masterRef)
    listener.connected(appId.get)
  …
}
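(3) When the Master's startExecutorsOnWorkers method allocates resources to run the application, it calls allocateWorkerResourceToExecutors, which has the Worker launch the Executor. The Worker handles the resulting LaunchExecutor message in its receive method: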
override def receive: PartialFunction[Any, Unit] = synchronized {
  …
  case LaunchExecutor(masterUrl, appId, execId, appDesc, cores_, memory_) =>
    …
        // Create the Executor's working directory
        val executorDir = new File(workDir, appId + "/" + execId)
        if (!executorDir.mkdirs()) {
          throw new IOException("Failed to create directory " + executorDir)
        }
        // Create the Executor's local directories on the Worker. They are passed to the
        // Executor through the SPARK_EXECUTOR_DIRS environment variable and are deleted
        // by the Worker when the application finishes
        val appLocalDirs = appDirectories.getOrElse(appId, {
          val localRootDirs = Utils.getOrCreateLocalRootDirs(conf)
          val dirs = localRootDirs.flatMap { dir =>
            try {
              val appDir = Utils.createDirectory(dir, namePrefix = "executor")
              Utils.chmod700(appDir)
              Some(appDir.getAbsolutePath())
            } catch {
              case e: IOException =>
                logWarning(s"${e.getMessage}. Ignoring this directory.")
                None
            }
          }.toSeq
          if (dirs.isEmpty) {
            throw new IOException("No subfolder can be created in " +
              s"${localRootDirs.mkString(",")}.")
          }
          dirs
        })
        appDirectories(appId) = appLocalDirs
        // ExecutorRunner launches the CoarseGrainedExecutorBackend process using the command
        // carried in the application description; that command is built in the start method
        // of SparkDeploySchedulerBackend (named StandaloneSchedulerBackend since Spark 2.0)
        val manager = new ExecutorRunner(
          appId,
          execId,
          appDesc.copy(command = Worker.maybeUpdateSSLSettings(appDesc.command, conf)),
          cores_,
          memory_,
          self,
          workerId,
          host,
          webUi.boundPort,
          publicAddress,
          sparkHome,
          executorDir,
          workerUri,
          conf,
          appLocalDirs, ExecutorState.RUNNING)
        executors(appId + "/" + execId) = manager
        manager.start()
        coresUsed += cores_
        memoryUsed += memory_
        // Notify the Master that the Executor's state has changed to ExecutorState.RUNNING
        sendToMaster(ExecutorStateChanged(appId, execId, manager.state, None, None))
      } catch {
        case e: Exception =>
          logError(s"Failed to launch executor $appId/$execId for ${appDesc.name}.", e)
          if (executors.contains(appId + "/" + execId)) {
            executors(appId + "/" + execId).kill()
            executors -= appId + "/" + execId
          }
          sendToMaster(ExecutorStateChanged(appId, execId, ExecutorState.FAILED,
            Some(e.toString), None))
      }
    }
  …
}
The Executor process itself is created by ExecutorRunner's fetchAndRunExecutor method.
private def fetchAndRunExecutor() {
  try {
    // Launch the process
    val subsOpts = appDesc.command.javaOpts.map {
      Utils.substituteAppNExecIds(_, appId, execId.toString)
    }
    val subsCommand = appDesc.command.copy(javaOpts = subsOpts)
    // Create the process builder from the application information and environment configuration
    val builder = CommandUtils.buildProcessBuilder(subsCommand, new SecurityManager(conf),
      memory, sparkHome.getAbsolutePath, substituteVariables)
    val command = builder.command()
    val formattedCommand = command.asScala.mkString("\"", "\" \"", "\"")
    logInfo(s"Launch command: $formattedCommand")
    // Set the working directory on the builder
    builder.directory(executorDir)
    builder.environment.put("SPARK_EXECUTOR_DIRS", appLocalDirs.mkString(File.pathSeparator))
    // In case we are running this from within the Spark Shell, avoid creating a "scala"
    // parent process for the executor command
    builder.environment.put("SPARK_LAUNCH_WITH_SCALA", "0")
    // Add webUI log urls
    val baseUrl =
      if (conf.getBoolean("spark.ui.reverseProxy", false)) {
        s"/proxy/$workerId/logPage/?appId=$appId&executorId=$execId&logType="
      } else {
        s"http://$publicAddress:$webUiPort/logPage/?appId=$appId&executorId=$execId&logType="
      }
    builder.environment.put("SPARK_LOG_URL_STDERR", s"${baseUrl}stderr")
    builder.environment.put("SPARK_LOG_URL_STDOUT", s"${baseUrl}stdout")
    // Start the process, which creates the CoarseGrainedExecutorBackend instance
    process = builder.start()
    val header = "Spark Executor Command: %s\n%s\n\n".format(
      formattedCommand, "=" * 40)
    // Redirect its stdout and stderr to files
    val stdout = new File(executorDir, "stdout")
    stdoutAppender = FileAppender(process.getInputStream, stdout, conf)
    val stderr = new File(executorDir, "stderr")
    Files.write(header, stderr, StandardCharsets.UTF_8)
    stderrAppender = FileAppender(process.getErrorStream, stderr, conf)
    // Wait for it to exit; executor may exit with code 0 (when driver instructs it to shutdown)
    // or with nonzero exit code. When it exits, report the exit status to the Worker
    val exitCode = process.waitFor()
    state = ExecutorState.EXITED
    val message = "Command exited with code " + exitCode
    worker.send(ExecutorStateChanged(appId, execId, state, Some(message), Some(exitCode)))
  } catch {
    case interrupted: InterruptedException =>
      logInfo("Runner thread for executor " + fullId + " interrupted")
      state = ExecutorState.KILLED
      killProcess(None)
    case e: Exception =>
      logError("Error running executor", e)
      state = ExecutorState.FAILED
      killProcess(Some(e.toString))
  }
}
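The core of this step is plain `java.lang.ProcessBuilder` usage: set a working directory, pass directories through an environment variable, start the child process, and wait for its exit code. The sketch below shows just that part. `ProcessLaunchSketch`, `prepare`, and `runAndWait` are hypothetical names, and the command is supplied by the caller rather than built by `CommandUtils`.

```scala
import java.io.File

object ProcessLaunchSketch {
  // Builds a ProcessBuilder with a working directory and the local dirs exported
  // through the SPARK_EXECUTOR_DIRS environment variable.
  def prepare(command: Seq[String], workDir: File, localDirs: Seq[String]): ProcessBuilder = {
    val builder = new ProcessBuilder(command: _*)
    builder.directory(workDir)
    builder.environment().put("SPARK_EXECUTOR_DIRS", localDirs.mkString(File.pathSeparator))
    builder
  }

  // Starts the child process, lets it share this process's stdout/stderr,
  // blocks until it exits, and returns its exit code.
  def runAndWait(builder: ProcessBuilder): Int = {
    val process = builder.inheritIO().start()
    process.waitFor()
  }
}
```

Unlike the Worker code, which redirects the child's output to `stdout` and `stderr` files through `FileAppender`, this sketch simply inherits the parent's streams to stay short.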
(4) The Master receives the ExecutorStateChanged message sent by the Worker.
override def receive: PartialFunction[Any, Unit] = {
  …
  case ExecutorStateChanged(appId, execId, state, message, exitStatus) =>
    val execOption = idToApp.get(appId).flatMap(app => app.executors.get(execId))
    execOption match {
      case Some(exec) =>
        val appInfo = idToApp(appId)
        val oldState = exec.state
        exec.state = state
        if (state == ExecutorState.RUNNING) {
          assert(oldState == ExecutorState.LAUNCHING,
            s"executor $execId state transfer from $oldState to RUNNING is illegal")
          appInfo.resetRetryCount()
        }
        // Send an ExecutorUpdated message to the Driver
        exec.application.driver.send(ExecutorUpdated(execId, state, message, exitStatus, false))
        if (ExecutorState.isFinished(state)) {
          // Remove this executor from the worker and app
          logInfo(s"Removing executor ${exec.fullId} because it is $state")
          // If an application has already finished, preserve its
          // state to display its information properly on the UI
          if (!appInfo.isFinished) {
            appInfo.removeExecutor(exec)
          }
          exec.worker.removeExecutor(exec)
          val normalExit = exitStatus == Some(0)
          // Only retry certain number of times so we don't go into an infinite loop.
          // Important note: this code path is not exercised by tests, so be very careful when
          // changing this `if` condition.
          if (!normalExit
              && appInfo.incrementRetryCount() >= MAX_EXECUTOR_RETRIES
              && MAX_EXECUTOR_RETRIES >= 0) { // < 0 disables this application-killing path
            val execs = appInfo.executors.values
            if (!execs.exists(_.state == ExecutorState.RUNNING)) {
              logError(s"Application ${appInfo.desc.name} with ID ${appInfo.id} failed " +
                s"${appInfo.retryCount} times; removing it")
              removeApplication(appInfo, ApplicationState.FAILED)
            }
          }
        }
        schedule()
      case None =>
        logWarning(s"Got status update for unknown executor $appId/$execId")
    }
  …
}
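The retry bookkeeping in this handler is easy to misread, so here is a small sketch of the same decision in isolation. It assumes a simplified state object: `RetrySketch`, `AppRetryState`, and `onExecutorFinished` are hypothetical names, and the `maxRetries` constructor parameter plays the role of `MAX_EXECUTOR_RETRIES`.

```scala
object RetrySketch {
  // Hypothetical, simplified stand-in for the Master's per-application bookkeeping.
  final class AppRetryState(maxRetries: Int) {
    private var retryCount = 0

    // Called when an executor reaches RUNNING: earlier failures no longer count.
    def resetRetryCount(): Unit = retryCount = 0

    // Called when an executor finishes; returns true if the application should be removed.
    def onExecutorFinished(exitStatus: Option[Int], anyExecutorRunning: Boolean): Boolean = {
      val normalExit = exitStatus.contains(0)
      if (normalExit) {
        // A clean exit does not count as a failed attempt
        false
      } else {
        retryCount += 1
        // A negative limit disables removal, mirroring the `MAX_EXECUTOR_RETRIES >= 0` check
        maxRetries >= 0 && retryCount >= maxRetries && !anyExecutorRunning
      }
    }
  }
}
```

With a limit of 3, the third consecutive abnormal exit returns true as long as no executor of the application is still running; a call to `resetRetryCount()` in between starts the count over.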
(5) The Executor is registered at the DriverEndpoint. (In the onStart method of the CoarseGrainedExecutorBackend launched in step (3), a RegisterExecutor message is sent to the DriverEndpoint.)
override def receiveAndReply(context: RpcCallContext): PartialFunction[Any, Unit] = {
  case RegisterExecutor(executorId, executorRef, hostname, cores, logUrls) =>
    if (executorDataMap.contains(executorId)) {
      executorRef.send(RegisterExecutorFailed("Duplicate executor ID: " + executorId))
      context.reply(true)
    }
    …
      // Record the executor's ID and the number of cores it uses
      addressToExecutorId(executorAddress) = executorId
      totalCoreCount.addAndGet(cores)
      totalRegisteredExecutors.addAndGet(1)
      val data = new ExecutorData(executorRef, executorAddress, hostname,
        cores, cores, logUrls)
      // This must be synchronized because variables mutated
      // in this block are read when requesting executors
      // (the map below associates each executor ID with its details)
      CoarseGrainedSchedulerBackend.this.synchronized {
        executorDataMap.put(executorId, data)
        if (currentExecutorIdCounter < executorId.toInt) {
          currentExecutorIdCounter = executorId.toInt
        }
        if (numPendingExecutors > 0) {
          numPendingExecutors -= 1
          logDebug(s"Decremented number of pending executors ($numPendingExecutors left)")
        }
      }
      // Tell the executor that registration has completed
      executorRef.send(RegisteredExecutor)
      // Note: some tests expect the reply to come after we put the executor in the map
      context.reply(true)
      listenerBus.post(
        SparkListenerExecutorAdded(System.currentTimeMillis(), executorId, data))
      // Allocate resources to tasks and send LaunchTask messages to run them
      makeOffers()
    }
  …
}
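Stripped of RPC details, the registration step is a guarded insert into a map plus a running core total. The following sketch illustrates that shape only; `ExecutorRegistrySketch`, `Registry`, and `register` are hypothetical names, and returning `Either` replaces the `RegisterExecutorFailed` and `RegisteredExecutor` replies.

```scala
import scala.collection.mutable

object ExecutorRegistrySketch {
  final case class ExecutorData(host: String, totalCores: Int)

  final class Registry {
    private val executorDataMap = mutable.HashMap.empty[String, ExecutorData]
    private var totalCoreCount = 0

    // Rejects a duplicate ID; otherwise records the executor and adds its cores to the total.
    def register(id: String, host: String, cores: Int): Either[String, Unit] = synchronized {
      if (executorDataMap.contains(id)) {
        Left("Duplicate executor ID: " + id)
      } else {
        executorDataMap.put(id, ExecutorData(host, cores))
        totalCoreCount += cores
        Right(())
      }
    }

    def totalCores: Int = synchronized(totalCoreCount)
  }
}
```

Registering the same ID twice leaves the core total unchanged after the second attempt, which is the property the duplicate check protects.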
(6) When CoarseGrainedExecutorBackend receives the RegisteredExecutor message confirming that registration succeeded, it instantiates an Executor object inside the CoarseGrainedExecutorBackend process.
override def receive: PartialFunction[Any, Unit] = {
  case RegisteredExecutor =>
    logInfo("Successfully registered with driver")
    try {
      // Create the Executor from the environment settings; in Spark, the Executor is
      // the component that actually runs tasks
      executor = new Executor(executorId, hostname, env, userClassPath, isLocal = false)
    } catch {
      case NonFatal(e) =>
        exitExecutor(1, "Unable to create executor due to " + e.getMessage, e)
    }
  …
}
The instantiated Executor periodically sends heartbeats to the Driver and waits for the Driver to dispatch tasks.
private val heartbeater = ThreadUtils.newDaemonSingleThreadScheduledExecutor("driver-heartbeater")

private def startDriverHeartbeater(): Unit = {
  // Heartbeat interval
  val intervalMs = HEARTBEAT_INTERVAL_MS
  // Wait a random interval so the heartbeats don't end up in sync
  val initialDelay = intervalMs + (math.random * intervalMs).asInstanceOf[Int]
  val heartbeatTask = new Runnable() {
    override def run(): Unit = Utils.logUncaughtExceptions(reportHeartBeat())
  }
  // Send heartbeats to the Driver at a fixed rate
  heartbeater.scheduleAtFixedRate(heartbeatTask, initialDelay, intervalMs, TimeUnit.MILLISECONDS)
}
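The heartbeat scheduling can be reproduced with the JDK's `ScheduledExecutorService` alone. The sketch below assumes nothing from Spark: `HeartbeatSketch`, `initialDelayMs`, `start`, and the `beat` callback are hypothetical names, with `beat` standing in for `reportHeartBeat()`.

```scala
import java.util.concurrent.{ScheduledExecutorService, ScheduledFuture, TimeUnit}
import scala.util.Random
import scala.util.control.NonFatal

object HeartbeatSketch {
  // First delay = one full interval plus a random fraction (0.0 to 1.0) of an interval,
  // so that executors started at the same moment do not all report at once.
  def initialDelayMs(intervalMs: Long, fraction: Double): Long =
    intervalMs + (fraction * intervalMs).toLong

  def start(scheduler: ScheduledExecutorService, intervalMs: Long)(beat: () => Unit): ScheduledFuture[_] = {
    val task = new Runnable {
      override def run(): Unit =
        // scheduleAtFixedRate suppresses later runs once a run throws, so log and continue
        try beat() catch { case NonFatal(e) => System.err.println(s"heartbeat failed: $e") }
    }
    scheduler.scheduleAtFixedRate(
      task, initialDelayMs(intervalMs, Random.nextDouble()), intervalMs, TimeUnit.MILLISECONDS)
  }
}
```

Catching non-fatal exceptions inside the task plays the same role as `Utils.logUncaughtExceptions` in the excerpt: one failed heartbeat should not silently stop all later ones.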
(7) After the Executor inside CoarseGrainedExecutorBackend has started, the backend receives LaunchTask messages sent by the DriverEndpoint. The task itself is executed through the Executor's launchTask method.
override def receive: PartialFunction[Any, Unit] = {
  …
  case LaunchTask(data) =>
    if (executor == null) {
      // The Executor was not created successfully: log the error and exit
      exitExecutor(1, "Received LaunchTask command but executor was null")
    } else {
      val taskDesc = TaskDescription.decode(data.value)
      logInfo("Got assigned task " + taskDesc.taskId)
      // Hand the task to a TaskRunner for execution
      executor.launchTask(this, taskDesc)
    }
  …
}
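The Executor's launchTask method creates a TaskRunner (a Runnable executed on a thread, not a separate process) and submits it to threadPool, so that all tasks are scheduled by the Executor through one thread pool.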
def launchTask(context: ExecutorBackend, taskDescription: TaskDescription): Unit = {
  val tr = new TaskRunner(context, taskDescription)
  runningTasks.put(taskDescription.taskId, tr)
  threadPool.execute(tr)
}
(8) When a TaskRunner finishes executing its task, a StatusUpdate message reporting the state change is sent to the DriverEndpoint through the ExecutorBackend.
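Steps (7) and (8) together amount to: wrap the task in a Runnable, track it in a concurrent map, run it on a thread pool, and report state changes through a callback. The sketch below shows that flow in miniature. `TaskLaunchSketch`, `MiniExecutor`, and the `statusUpdate` callback are hypothetical names, and the state strings are plain text rather than Spark's `TaskState` values.

```scala
import java.util.concurrent.{ConcurrentHashMap, CountDownLatch, Executors, TimeUnit}
import scala.util.control.NonFatal

object TaskLaunchSketch {
  // `statusUpdate` is a hypothetical stand-in for the backend sending StatusUpdate to the driver.
  final class MiniExecutor(statusUpdate: (Long, String) => Unit) {
    private val threadPool = Executors.newCachedThreadPool()
    val runningTasks = new ConcurrentHashMap[Long, Runnable]()

    def launchTask(taskId: Long)(body: () => Unit): Unit = {
      val tr = new Runnable {
        override def run(): Unit = {
          statusUpdate(taskId, "RUNNING")
          val finalState =
            try { body(); "FINISHED" } catch { case NonFatal(_) => "FAILED" }
          // Stop tracking the task, then report its final state
          runningTasks.remove(taskId)
          statusUpdate(taskId, finalState)
        }
      }
      runningTasks.put(taskId, tr)
      threadPool.execute(tr)
    }

    def shutdown(): Unit = threadPool.shutdown()
  }
}
```

Each task reports "RUNNING" first and then exactly one final state, and the map only ever contains tasks that have been launched but have not yet finished.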
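Summary
Large companies like these usually hold several rounds of interviews, so take the time to research the company's background and culture beforehand. As the saying goes, "know yourself and know your enemy, and you will never be defeated." Do not go into an interview unprepared. Many people also ask how to negotiate salary with HR.
One suggestion: if your target salary is 30K, you can ask HR for 33-35K instead of revealing your bottom line right away. Do not put it that bluntly, though. For example, if your previous salary was 25K, tell HR what you earned before, ask what they can offer you, and say that you are hoping for a raise of about 20%.
Finally, a word about recruiting platforms: before you send your résumé to a company, check what the company is really like. Look it up on Baidu first so you do not get burned. Every platform has some ill-intentioned advertisers waiting for you to take the bait, so do not fall for it.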