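Earlier posts analyzed how the Spark HistoryServer builds its web UI and how it parses data in the background. This post covers how a web request is handled on the server side, and the caching strategy HistoryServer uses to speed up queries.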
Binding routes
When HistoryServer is instantiated, it binds the routes that start with /api/v1/:
attachHandler(ApiRootResource.getServletHandler(this))
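ApiRootResource defines the routing rules; requesting a route by URL returns the corresponding query result.
Apart from /api/v1/applications and /api/v1/applications/{appid}, which are answered directly from memory, almost all other routes obtain their results through the cache layer.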
Query cache
def withSparkUI[T](appId: String, attemptId: Option[String])(f: SparkUI => T): T = {
  val appKey = attemptId.map(appId + "/" + _).getOrElse(appId)
  getSparkUI(appKey) match {
    case Some(ui) =>
      f(ui)
    case None => throw new NotFoundException("no such app: " + appId)
  }
}
def getSparkUI(appKey: String): Option[SparkUI] = {
  appCache.getSparkUI(appKey)
}
private val appCache = new ApplicationCache(this, retainedApplications, new SystemClock())
private[history] class ApplicationCache(
    val operations: ApplicationCacheOperations,
    val retainedApplications: Int,
    val clock: Clock) extends Logging {
  private val appLoader = new CacheLoader[CacheKey, CacheEntry] {
    /** the cache key doesn't match a cached entry, or the entry is out-of-date, so load it. */
    override def load(key: CacheKey): CacheEntry = {
      loadApplicationEntry(key.appId, key.attemptId)
    }
  }
  ………………………
  protected val appCache: LoadingCache[CacheKey, CacheEntry] = {
    CacheBuilder.newBuilder()
      .maximumSize(retainedApplications)
      .removalListener(removalListener)
      .build(appLoader)
  }
  ………………………
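ApplicationCache holds a member named appCache, a LoadingCache from Google's Guava library, which provides the actual caching. The result of the first query is cached; when a client sends the same URL request again, the cached result is returned directly, saving time and resources.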
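To make the caching behaviour concrete, here is a minimal sketch of a size-bounded loading cache. It is not Guava's implementation or API: TinyLoadingCache, load and onRemoval are hypothetical names, with load standing in for CacheLoader.load and onRemoval for the removal listener. It only assumes that the least recently used entry is the one dropped when the size limit is exceeded.

```scala
import java.util.{LinkedHashMap => JLinkedHashMap, Map => JMap}

object LoadingCacheSketch {
  /**
   * Hypothetical stand-in for a size-bounded loading cache (NOT Guava's API).
   * `load` plays the role of CacheLoader.load, `onRemoval` of the removal listener.
   */
  final class TinyLoadingCache[K, V](maxSize: Int, load: K => V, onRemoval: (K, V) => Unit) {
    // accessOrder = true keeps entries ordered from least to most recently used
    private val entries = new JLinkedHashMap[K, V](16, 0.75f, true) {
      override def removeEldestEntry(eldest: JMap.Entry[K, V]): Boolean = {
        val evict = size() > maxSize
        if (evict) onRemoval(eldest.getKey, eldest.getValue)
        evict
      }
    }

    /** Returns the cached value, loading and caching it first on a miss. */
    def get(key: K): V = synchronized {
      if (entries.containsKey(key)) entries.get(key)
      else {
        val value = load(key)
        entries.put(key, value)
        value
      }
    }

    /** Returns the cached value without ever triggering a load. */
    def getIfPresent(key: K): Option[V] = synchronized {
      if (entries.containsKey(key)) Some(entries.get(key)) else None
    }
  }
}
```

With maxSize playing the role of retainedApplications, a second get for the same key does not call load again, and loading one application too many evicts the least recently used one and reports it to onRemoval.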
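A query goes through the withSparkUI interface provided by UIRoot (which actually calls getSparkUI on the appCache instance) and eventually reaches the lookupAndUpdate function, shown below: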
private def lookupAndUpdate(appId: String, attemptId: Option[String]): (CacheEntry, Boolean) = {
  metrics.lookupCount.inc()
  val cacheKey = CacheKey(appId, attemptId)
  var entry = appCache.getIfPresent(cacheKey)
  var updated = false
  if (entry == null) {
    // no entry, so fetch without any post-fetch probes for out-of-dateness
    // this will trigger a callback to loadApplicationEntry()
    entry = appCache.get(cacheKey)
  } else if (!entry.completed) {
    val now = clock.getTimeMillis()
    log.debug(s"Probing at time $now for updated application $cacheKey -> $entry")
    metrics.updateProbeCount.inc()
    updated = time(metrics.updateProbeTimer) {
      entry.updateProbe()
    }
    if (updated) {
      logDebug(s"refreshing $cacheKey")
      metrics.updateTriggeredCount.inc()
      appCache.refresh(cacheKey)
      // and repeat the lookup
      entry = appCache.get(cacheKey)
    } else {
      // update the probe timestamp to the current time
      entry.probeTime = now
    }
  }
  (entry, updated)
}
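This function first calls appCache.getIfPresent(cacheKey). If the cache holds an entry, it checks whether the application has completed. A completed application's entry is returned as is. For an incomplete application, the entry is probed for updates: if the data has changed, the entry is refreshed and read again before being returned; if not, only its probe time is updated and the existing entry is returned. If the cache holds no entry, appCache.get(cacheKey) loads the value and returns it.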
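The three branches described above can be sketched on a plain mutable map. This is a simplified illustration, not Spark's code: Entry, its ui and updateProbe fields, and the load parameter are hypothetical, and the metrics and probe timestamps are left out.

```scala
import scala.collection.mutable

object LookupSketch {
  /** Hypothetical simplified cache entry; `updateProbe` reports whether newer data exists. */
  final case class Entry(ui: String, completed: Boolean, updateProbe: () => Boolean)

  /** Mirrors the branches of lookupAndUpdate: miss, completed hit, incomplete hit. */
  def lookupAndUpdate(
      cache: mutable.Map[String, Entry],
      key: String,
      load: String => Entry): (Entry, Boolean) = {
    cache.get(key) match {
      case None =>
        // miss: load and cache the entry, no probe needed
        val entry = load(key)
        cache(key) = entry
        (entry, false)
      case Some(entry) if entry.completed =>
        // completed application: the cached entry can no longer change
        (entry, false)
      case Some(entry) =>
        // incomplete application: probe, and reload only when the data changed
        if (entry.updateProbe()) {
          val refreshed = load(key)
          cache(key) = refreshed
          (refreshed, true)
        } else {
          (entry, false)
        }
    }
  }
}
```

The second element of the returned pair tells the caller whether the entry was reloaded, matching the updated flag in the original function.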
Loading the value means calling the load function of the CacheLoader (the appLoader instance), which in turn calls loadApplicationEntry:
def loadApplicationEntry(appId: String, attemptId: Option[String]): CacheEntry = {
  logDebug(s"Loading application Entry $appId/$attemptId")
  metrics.loadCount.inc()
  time(metrics.loadTimer) {
    operations.getAppUI(appId, attemptId) match {
loadApplicationEntry calls the getAppUI method of operations, which in this case is the HistoryServer itself:
override def getAppUI(appId: String, attemptId: Option[String]): Option[LoadedAppUI] = {
  provider.getAppUI(appId, attemptId)
}
There is one more level of delegation here: the provider instance is actually an FsHistoryProvider. The relevant code is:
override def getAppUI(appId: String, attemptId: Option[String]): Option[LoadedAppUI] = {
  try {
    applications.get(appId).flatMap { appInfo =>
      appInfo.attempts.find(_.attemptId == attemptId).flatMap { attempt =>
        val replayBus = new ReplayListenerBus()
        val ui = {
          val conf = this.conf.clone()
          val appSecManager = new SecurityManager(conf)
          SparkUI.createHistoryUI(conf, replayBus, appSecManager, appInfo.name,
            HistoryServer.getAttemptURI(appId, attempt.attemptId), attempt.startTime)
          // Do not call ui.bind() to avoid creating a new server for each application
        }
        val fileStatus = fs.getFileStatus(new Path(logDir, attempt.logPath))
        val appListener = replay(fileStatus, isApplicationCompleted(fileStatus), replayBus)
        if (appListener.appId.isDefined) {
          val uiAclsEnabled = conf.getBoolean("spark.history.ui.acls.enable", false)
          ui.getSecurityManager.setAcls(uiAclsEnabled)
          // make sure to set admin acls before view acls so they are properly picked up
          ui.getSecurityManager.setAdminAcls(appListener.adminAcls.getOrElse(""))
          ui.getSecurityManager.setViewAcls(attempt.sparkUser,
            appListener.viewAcls.getOrElse(""))
          ui.getSecurityManager.setAdminAclsGroups(appListener.adminAclsGroups.getOrElse(""))
          ui.getSecurityManager.setViewAclsGroups(appListener.viewAclsGroups.getOrElse(""))
          Some(LoadedAppUI(ui, updateProbe(appId, attemptId, attempt.fileSize)))
        } else {
          None
        }
      }
    }
  } catch {
    case e: FileNotFoundException => None
  }
}
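In this function, the call to SparkUI.createHistoryUI builds the web pages and registers each module's listeners (EnvironmentListener, StorageStatusListener, ExecutorsListener, StorageListener, RDDOperationGraphListener and so on) on the SparkListenerBus, which here is the ReplayListenerBus created just before.
The replay function is then called. It parses the event log file entry by entry and notifies every listener registered on the bus, so that each listener collects the data its module needs for display. Finally, the constructed UI is returned and used to render the front-end web pages.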
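The register-then-replay pattern can be illustrated with a toy bus. This is only a sketch under simplifying assumptions: ReplayBus, Listener, AppListener, ToyExecutorsListener and the "Type:value" line format are all hypothetical, whereas real Spark event logs are JSON and are handled by ReplayListenerBus.

```scala
import scala.collection.mutable.ArrayBuffer

object ReplaySketch {
  /** Hypothetical event types standing in for parsed event-log entries. */
  sealed trait Event
  final case class AppStart(appId: String) extends Event
  final case class ExecutorAdded(executorId: String) extends Event

  trait Listener { def onEvent(event: Event): Unit }

  /** Keeps only the application id, ignoring every other event. */
  final class AppListener extends Listener {
    var appId: Option[String] = None
    def onEvent(event: Event): Unit = event match {
      case AppStart(id) => appId = Some(id)
      case _            => ()
    }
  }

  /** Keeps only executor ids, the data an executors page would need. */
  final class ToyExecutorsListener extends Listener {
    val executors: ArrayBuffer[String] = ArrayBuffer.empty[String]
    def onEvent(event: Event): Unit = event match {
      case ExecutorAdded(id) => executors += id
      case _                 => ()
    }
  }

  /** Parses toy "Type:value" lines (an assumed format); unknown lines yield None. */
  def parse(line: String): Option[Event] = line.split(":", 2) match {
    case Array("AppStart", id)      => Some(AppStart(id))
    case Array("ExecutorAdded", id) => Some(ExecutorAdded(id))
    case _                          => None
  }

  final class ReplayBus {
    private val listeners = ArrayBuffer.empty[Listener]

    def addListener(listener: Listener): Unit = { listeners += listener; () }

    /** Parses each line and delivers the resulting event to every registered listener. */
    def replay(lines: Iterator[String]): Unit =
      lines.flatMap(parse).foreach(event => listeners.foreach(_.onEvent(event)))
  }
}
```

Every listener sees every event and keeps only what its own page needs, which is why the UI must be created, and its listeners registered, before replay is called.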
At this point, the flow from URL request to returned data is complete.