文章目录
ListenerBus
private[spark] trait ListenerBus[L <: AnyRef, E] extends Logging {
private[this] val listenersPlusTimers = new CopyOnWriteArrayList[(L, Option[Timer])]
private[spark] def listeners = listenersPlusTimers.asScala.map(_._1).asJava
protected def getTimer(listener: L): Option[Timer] = None
final def addListener(listener: L): Unit = {
listenersPlusTimers.add((listener, getTimer(listener)))
}
final def removeListener(listener: L): Unit = {
listenersPlusTimers.asScala.find(_._1 eq listener).foreach { listenerAndTimer =>
listenersPlusTimers.remove(listenerAndTimer)
}
}
def removeListenerOnError(listener: L): Unit = {
removeListener(listener)
}
def postToAll(event: E): Unit = {
val iter = listenersPlusTimers.iterator
while (iter.hasNext) {
//迭代监听器,获取listener,和timer.
val listenerAndMaybeTimer = iter.next()
val listener = listenerAndMaybeTimer._1
val maybeTimer = listenerAndMaybeTimer._2
val maybeTimerContext = if (maybeTimer.isDefined) {
maybeTimer.get.time()
} else {
null
}
try {
//将事件投递给监听器
doPostEvent(listener, event)
if (Thread.interrupted()) {
throw new InterruptedException()
}
} catch {
case ie: InterruptedException =>
logError(s"Interrupted while posting to ${Utils.getFormattedClassName(listener)}. " +
s"Removing that listener.", ie)
removeListenerOnError(listener)
case NonFatal(e) if !isIgnorableException(e) =>
logError(s"Listener ${Utils.getFormattedClassName(listener)} threw an exception", e)
} finally {
if (maybeTimerContext != null) {
maybeTimerContext.stop()
}
}
}
}
protected def doPostEvent(listener: L, event: E): Unit
protected def isIgnorableException(e: Throwable): Boolean = false
private[spark] def findListenersByClass[T <: L : ClassTag](): Seq[T] = {
val c = implicitly[ClassTag[T]].runtimeClass
listeners.asScala.filter(_.getClass == c).map(_.asInstanceOf[T]).toSeq
}
}
ListenerBus是个泛型特质,其泛型参数为 [L <: AnyRef, E],其中L是代表监听器的泛型参数,可以看到ListenerBus支持任何类型的监听器,E是代表事件的泛型参数。ListenerBus中各个成员的作用如下:
listeners:用于维护所有注册的监听器,其数据结构为CopyOnWriteArrayList[L];
addListener:向listeners中添加监听器的方法,由于listeners采用CopyOnWriteArrayList来实现,所以addListener方法是线程安全的;
removeListener:从listeners中移除监听器的方法,由于listeners采用CopyOnWriteArrayList来实现,所以removeListener方法是线程安全的;
postToAll:此方法的作用是将事件投递给所有的监听器。虽然CopyOnWriteArrayList本身是线程的安全的,但是由于postToAll方法内部引入了“先检查后执行”的逻辑,因而postToAll方法不是线程安全的,所以所有对postToAll方法的调用应当保证在同一个线程中;
doPostEvent:用于将事件投递给指定的监听器,此方法只提供了接口定义,具体实现需要子类提供;
findListenersByClass:查找与指定类型相同的监听器列表。
ListenerBus的类继承体系
[外链图片转存失败,源站可能有防盗链机制,建议将图片保存下来直接上传(img-lAtneg05-1652773297111)(C:\Users\hz21066838\AppData\Roaming\Typora\typora-user-images\image-20220516194802404.png)]
SparkListenerBus:用于将SparkListenerEvent类型的事件投递到SparkListenerInterface类型的监听器;
StreamingQueryListenerBus:用于将StreamingQueryListener.Event类型的事件投递到StreamingQueryListener类型的监听器,此外还会将StreamingQueryListener.Event类型的事件交给SparkListenerBus;
StreamingListenerBus:用于将StreamingListenerEvent类型的事件投递到StreamingListener类型的监听器,此外还会将StreamingListenerEvent类型的事件交给SparkListenerBus。
SparkListenerBus也有两种实现:
AsyncEventQueue:采用异步线程将SparkListenerEvent类型的事件投递到SparkListener类型的监听器;
ReplayListenerBus:用于从序列化的事件数据中重播事件。
SparkListenerBus
private[spark] trait SparkListenerBus
extends ListenerBus[SparkListenerInterface, SparkListenerEvent] {
protected override def doPostEvent(
listener: SparkListenerInterface,
event: SparkListenerEvent): Unit = {
event match {
case stageSubmitted: SparkListenerStageSubmitted =>
listener.onStageSubmitted(stageSubmitted)
case stageCompleted: SparkListenerStageCompleted =>
listener.onStageCompleted(stageCompleted)
case jobStart: SparkListenerJobStart =>
listener.onJobStart(jobStart)
case jobEnd: SparkListenerJobEnd =>
listener.onJobEnd(jobEnd)
case taskStart: SparkListenerTaskStart =>
listener.onTaskStart(taskStart)
case taskGettingResult: SparkListenerTaskGettingResult =>
listener.onTaskGettingResult(taskGettingResult)
case taskEnd: SparkListenerTaskEnd =>
listener.onTaskEnd(taskEnd)
case environmentUpdate: SparkListenerEnvironmentUpdate =>
listener.onEnvironmentUpdate(environmentUpdate)
case blockManagerAdded: SparkListenerBlockManagerAdded =>
listener.onBlockManagerAdded(blockManagerAdded)
case blockManagerRemoved: SparkListenerBlockManagerRemoved =>
listener.onBlockManagerRemoved(blockManagerRemoved)
case unpersistRDD: SparkListenerUnpersistRDD =>
listener.onUnpersistRDD(unpersistRDD)
case applicationStart: SparkListenerApplicationStart =>
listener.onApplicationStart(applicationStart)
case applicationEnd: SparkListenerApplicationEnd =>
listener.onApplicationEnd(applicationEnd)
case metricsUpdate: SparkListenerExecutorMetricsUpdate =>
listener.onExecutorMetricsUpdate(metricsUpdate)
case executorAdded: SparkListenerExecutorAdded =>
listener.onExecutorAdded(executorAdded)
case executorRemoved: SparkListenerExecutorRemoved =>
listener.onExecutorRemoved(executorRemoved)
case executorBlacklistedForStage: SparkListenerExecutorBlacklistedForStage =>
listener.onExecutorBlacklistedForStage(executorBlacklistedForStage)
case nodeBlacklistedForStage: SparkListenerNodeBlacklistedForStage =>
listener.onNodeBlacklistedForStage(nodeBlacklistedForStage)
case executorBlacklisted: SparkListenerExecutorBlacklisted =>
listener.onExecutorBlacklisted(executorBlacklisted)
case executorUnblacklisted: SparkListenerExecutorUnblacklisted =>
listener.onExecutorUnblacklisted(executorUnblacklisted)
case nodeBlacklisted: SparkListenerNodeBlacklisted =>
listener.onNodeBlacklisted(nodeBlacklisted)
case nodeUnblacklisted: SparkListenerNodeUnblacklisted =>
listener.onNodeUnblacklisted(nodeUnblacklisted)
case blockUpdated: SparkListenerBlockUpdated =>
listener.onBlockUpdated(blockUpdated)
case speculativeTaskSubmitted: SparkListenerSpeculativeTaskSubmitted =>
listener.onSpeculativeTaskSubmitted(speculativeTaskSubmitted)
case _ => listener.onOtherEvent(event)
}
}
}
我们看到SparkListenerBus已经实现了ListenerBus的doPostEvent方法,通过对SparkListenerEvent事件的匹配,执行SparkListenerInterface监听器的相应方法。
SparkListenerInterface
private[spark] trait SparkListenerInterface {
/**
* Called when a stage completes successfully or fails, with information on the completed stage.
*/
def onStageCompleted(stageCompleted: SparkListenerStageCompleted): Unit
SparkListenerInterface定义了针对不同事件的处理方式
@DeveloperApi
abstract class SparkListener extends SparkListenerInterface {
override def onStageCompleted(stageCompleted: SparkListenerStageCompleted): Unit = { }
override def onStageSubmitted(stageSubmitted: SparkListenerStageSubmitted): Unit = { }
SparkListener
SparkListener是继承了SparkListenerInterface的抽象类
SparkListenerEvent
trait SparkListenerEvent {
/* Whether output this event to the event log */
protected[spark] def logEvent: Boolean = true
}
@DeveloperApi
case class SparkListenerStageSubmitted(stageInfo: StageInfo, properties: Properties = null)
extends SparkListenerEvent
@DeveloperApi
case class SparkListenerStageCompleted(stageInfo: StageInfo) extends SparkListenerEvent
SparkListenerEvent也是一个特质,下面有很多的样例类,用来区分不同的事件
我们知道当有事件需要通知监听器的时候,可以调用ListenerBus的postToAll方法,postToAll方法遍历所有监听器并调用SparkListenerBus实现的doPostEvent方法,doPostEvent方法对事件类型进行匹配后调用监听器的不同方法。
AsyncEventQueue
private class AsyncEventQueue(
val name: String,
conf: SparkConf,
metrics: LiveListenerBusMetrics,
bus: LiveListenerBus)
extends SparkListenerBus
with Logging {
import AsyncEventQueue._
//定义一个阻塞队列,存放事件
private val eventQueue = new LinkedBlockingQueue[SparkListenerEvent](
conf.get(LISTENER_BUS_EVENT_QUEUE_CAPACITY))
// Keep the event count separately, so that waitUntilEmpty() can be implemented properly;
// this allows that method to return only when the events in the queue have been fully
// processed (instead of just dequeued).
private val eventCount = new AtomicLong()
//使用AtomicLong类型对删除的事件进行计数,每当日志打印了droppedEventsCounter后,会将droppedEventsCounter重置为0;
private val droppedEventsCounter = new AtomicLong(0L)
//用于记录最后一次日志打印droppedEventsCounter的时间戳;
@volatile private var lastReportTimestamp = 0L
//AtomicBoolean类型的变量,用于标记是否由于eventQueue已满,导致新的事件被删除;
private val logDroppedEvent = new AtomicBoolean(false)
private var sc: SparkContext = null
private val started = new AtomicBoolean(false)
private val stopped = new AtomicBoolean(false)
private val droppedEvents = metrics.metricRegistry.counter(s"queue.$name.numDroppedEvents")
private val processingTime = metrics.metricRegistry.timer(s"queue.$name.listenerProcessingTime")
// Remove the queue size gauge first, in case it was created by a previous incarnation of
// this queue that was removed from the listener bus.
metrics.metricRegistry.remove(s"queue.$name.size")
metrics.metricRegistry.register(s"queue.$name.size", new Gauge[Int] {
override def getValue: Int = eventQueue.size()
})
private val dispatchThread = new Thread(s"spark-listener-group-$name") {
setDaemon(true)
override def run(): Unit = Utils.tryOrStopSparkContext(sc) {
dispatch()
}
}
//分发任务,
private def dispatch(): Unit = LiveListenerBus.withinListenerThread.withValue(true) {
//获取一个事件处理
var next: SparkListenerEvent = eventQueue.take()
//循环获取事件并处理
while (next != POISON_PILL) {
val ctx = processingTime.time()
try {
super.postToAll(next)
} finally {
ctx.stop()
}
//事件数信号量减1
eventCount.decrementAndGet()
next = eventQueue.take()
}
//事件数信号量减1
eventCount.decrementAndGet()
}
override protected def getTimer(listener: SparkListenerInterface): Option[Timer] = {
metrics.getTimerForListenerClass(listener.getClass.asSubclass(classOf[SparkListenerInterface]))
}
private[scheduler] def start(sc: SparkContext): Unit = {
//等待分发停止再启动任务
if (started.compareAndSet(false, true)) {
this.sc = sc
//启动任务分发进程,此进程会循环获取队列中的event
dispatchThread.start()
} else {
throw new IllegalStateException(s"$name already started!")
}
}
private[scheduler] def stop(): Unit = {
if (!started.get()) {
throw new IllegalStateException(s"Attempted to stop $name that has not yet started!")
}
if (stopped.compareAndSet(false, true)) {
eventCount.incrementAndGet()
eventQueue.put(POISON_PILL)
}
// this thread might be trying to stop itself as part of error handling -- we can't join
// in that case.
if (Thread.currentThread() != dispatchThread) {
dispatchThread.join()
}
}
def post(event: SparkListenerEvent): Unit = {
//判断LiveListenerBus是否已经处于停止状态;
if (stopped.get()) {
return
}
eventCount.incrementAndGet()
//向eventQueue中添加事件。如果添加成功,则释放信号量进而催化listenerThread能够有效工作。
if (eventQueue.offer(event)) {
return
}
//这里表明任务添加失败,所有信号量-1
eventCount.decrementAndGet()
//如果eventQueue已满造成添加失败,则移除事件,并对删除事件计数器droppedEventsCounter进行自增;
droppedEvents.inc()
droppedEventsCounter.incrementAndGet()
if (logDroppedEvent.compareAndSet(false, true)) {
// Only log the following message once to avoid duplicated annoying logs.
logError(s"Dropping event from queue $name. " +
"This likely means one of the listeners is too slow and cannot keep up with " +
"the rate at which tasks are being started by the scheduler.")
}
logTrace(s"Dropping event $event")
val droppedCount = droppedEventsCounter.get
if (droppedCount > 0) {
// Don't log too frequently
//如果有事件被删除,并且当前系统时间距离上一次打印droppedEventsCounter超过了60秒则将droppedEventsCounter打印到日志。
if (System.currentTimeMillis() - lastReportTimestamp >= 60 * 1000) {
if (droppedEventsCounter.compareAndSet(droppedCount, 0)) {
val prevLastReportTimestamp = lastReportTimestamp
lastReportTimestamp = System.currentTimeMillis()
val previous = new java.util.Date(prevLastReportTimestamp)
logWarning(s"Dropped $droppedCount events from $name since $previous.")
}
}
}
}
def waitUntilEmpty(deadline: Long): Boolean = {
while (eventCount.get() != 0) {
if (System.currentTimeMillis > deadline) {
return false
}
Thread.sleep(10)
}
true
}
override def removeListenerOnError(listener: SparkListenerInterface): Unit = {
bus.removeListener(listener)
}
}
private object AsyncEventQueue {
val POISON_PILL = new SparkListenerEvent() { }
}
核心是put和dispatch方法,详情看注释。其实所有事件当被获取时,会调用超类的potsAll方法。然后被所有监听的listener处理
LiveListenerBus的工作流程图
[外链图片转存失败,源站可能有防盗链机制,建议将图片保存下来直接上传(img-zPCIxUHV-1652773297113)(C:\Users\hz21066838\AppData\Roaming\Typora\typora-user-images\image-20220516211442763.png)]