Following 1-SparkContext, let's first take a look at JobProgressListener.
The package that JobProgressListener's Scala source file lives in:
package org.apache.spark.ui.jobs
And the class comment at the top of the file:
/**
* :: DeveloperApi ::
* Tracks task-level information to be displayed in the UI.
*
* All access to the data structures in this class must be synchronized on the
* class, since the UI thread and the EventBus loop may otherwise be reading and
* updating the internal data structures concurrently.
*/
@DeveloperApi
class JobProgressListener(conf: SparkConf) extends SparkListener with Logging {
......
From this we can see that the class mainly serves the UI. Its rough flow can be inferred: when the ListenerBus receives a Job-related message or event (down to the task level), it calls the relevant JobProgressListener APIs to update job state (covering Job, stage, task, ExecutorMetrics, BlockManager, and Application).
The interface it provides is as follows (copied from the source; see also the official API docs):
override def onJobStart(jobStart: SparkListenerJobStart): Unit
override def onJobEnd(jobEnd: SparkListenerJobEnd): Unit
override def onStageCompleted(stageCompleted: SparkListenerStageCompleted): Unit
override def onStageSubmitted(stageSubmitted: SparkListenerStageSubmitted): Unit
override def onTaskStart(taskStart: SparkListenerTaskStart): Unit
override def onTaskGettingResult(taskGettingResult: SparkListenerTaskGettingResult)
override def onTaskEnd(taskEnd: SparkListenerTaskEnd): Unit
def updateAggregateMetrics(
stageData: StageUIData,
execId: String,
taskMetrics: TaskMetrics,
oldMetrics: Option[TaskMetricsUIData])
override def onExecutorMetricsUpdate(executorMetricsUpdate: SparkListenerExecutorMetricsUpdate)
override def onEnvironmentUpdate(environmentUpdate: SparkListenerEnvironmentUpdate)
override def onBlockManagerAdded(blockManagerAdded: SparkListenerBlockManagerAdded)
override def onBlockManagerRemoved(blockManagerRemoved: SparkListenerBlockManagerRemoved)
override def onApplicationStart(appStarted: SparkListenerApplicationStart)
override def onApplicationEnd(appEnded: SparkListenerApplicationEnd)
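The callbacks above can be modeled as a plain observer. Below is a minimal Java sketch (the class and method shapes are simplified stand-ins, not Spark's actual API) showing why the class comment demands synchronizing on the listener: the event-bus loop writes the internal maps while the UI thread reads them.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified stand-in for JobProgressListener: the event-bus
// thread calls onJobStart/onJobEnd while a UI thread calls activeJobCount.
class MiniJobProgressListener {
    private final Map<Integer, String> activeJobs = new HashMap<>();

    // Every access is synchronized on `this`, mirroring the class comment in
    // JobProgressListener: otherwise the UI thread and the event loop could
    // read and update the internal data structures concurrently.
    public synchronized void onJobStart(int jobId, String description) {
        activeJobs.put(jobId, description);
    }

    public synchronized void onJobEnd(int jobId) {
        activeJobs.remove(jobId);
    }

    public synchronized int activeJobCount() {
        return activeJobs.size();
    }

    public static void main(String[] args) {
        MiniJobProgressListener l = new MiniJobProgressListener();
        l.onJobStart(0, "count at <console>:1");
        l.onJobStart(1, "collect at <console>:2");
        l.onJobEnd(0);
        System.out.println(l.activeJobCount()); // prints 1
    }
}
```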
Some questions from this Scala source file that I want to look into along the way:
- The @DeveloperApi annotation in Java
package org.apache.spark.annotation;
import java.lang.annotation.*;
/**
* A lower-level, unstable API intended for developers.
*
* Developer API's might change or be removed in minor versions of Spark.
*
* NOTE: If there exists a Scaladoc comment that immediately precedes this annotation, the first
* line of the comment must be ":: DeveloperApi ::" with no trailing blank line. This is because
* of the known issue that Scaladoc displays only either the annotation or the comment, whichever
* comes first.
*/
@Retention(RetentionPolicy.RUNTIME)
@Target({ElementType.TYPE, ElementType.FIELD, ElementType.METHOD, ElementType.PARAMETER,
ElementType.CONSTRUCTOR, ElementType.LOCAL_VARIABLE, ElementType.PACKAGE})
public @interface DeveloperApi {}
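As a quick illustration of how such an annotation works (the name MyDeveloperApi below is made up for this sketch, modeled on the definition above), a runtime-retained annotation can be declared, applied, and then detected reflectively:

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

// Hypothetical annotation modeled on @DeveloperApi above.
@Retention(RetentionPolicy.RUNTIME)
@Target({ElementType.TYPE, ElementType.METHOD})
@interface MyDeveloperApi {}

// A class marked as an unstable, developer-facing API.
@MyDeveloperApi
class UnstableFeature {}

class AnnotationDemo {
    public static void main(String[] args) {
        // RUNTIME retention means the annotation survives into the loaded
        // class and is visible through the reflection API.
        boolean marked = UnstableFeature.class.isAnnotationPresent(MyDeveloperApi.class);
        System.out.println(marked); // prints true
    }
}
```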
There are three retention policy types. A RUNTIME annotation is recorded in the class file by the compiler and retained by the VM at run time, so the annotated target can be read via reflection:
package java.lang.annotation;
/**
* Annotation retention policy. The constants of this enumerated type
* describe the various policies for retaining annotations. They are used
* in conjunction with the {@link Retention} meta-annotation type to specify
* how long annotations are to be retained.
 *
 * @author  Joshua Bloch
* @since 1.5
*/
public enum RetentionPolicy {
/**
* Annotations are to be discarded by the compiler.
*/
SOURCE,
/**
* Annotations are to be recorded in the class file by the compiler
* but need not be retained by the VM at run time. This is the default
* behavior.
*/
CLASS,
/**
* Annotations are to be recorded in the class file by the compiler and
* retained by the VM at run time, so they may be read reflectively.
*
* @see java.lang.reflect.AnnotatedElement
*/
RUNTIME
}
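The difference between CLASS and RUNTIME retention can be demonstrated directly (both annotation names below are made up for this sketch): a CLASS-retained annotation is written into the .class file but dropped by the VM, so reflection cannot see it, while a RUNTIME-retained one remains visible.

```java
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;

// Two hypothetical annotations that differ only in their retention policy.
@Retention(RetentionPolicy.CLASS)
@interface ClassRetained {}

@Retention(RetentionPolicy.RUNTIME)
@interface RuntimeRetained {}

@ClassRetained
@RuntimeRetained
class RetentionDemo {
    public static void main(String[] args) {
        // CLASS-retained: recorded in the class file, but not loaded by the
        // VM at run time, so reflection reports it as absent.
        System.out.println(RetentionDemo.class.isAnnotationPresent(ClassRetained.class));   // prints false
        // RUNTIME-retained: kept by the VM, so reflection can read it.
        System.out.println(RetentionDemo.class.isAnnotationPresent(RuntimeRetained.class)); // prints true
    }
}
```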
- A point of confusion, noted for now
In the drop function of IterableLike.scala (scalaVersion := "2.10.3"), does it.next inside the while loop need to be assigned back to it? Insights from anyone who knows are welcome.
override /*TraversableLike*/ def drop(n: Int): Repr = {
val b = newBuilder
val lo = math.max(0, n)
b.sizeHint(this, -lo)
var i = 0
val it = iterator
while (i < n && it.hasNext) {
it.next
i += 1
}
(b ++= it).result
}
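For what it's worth, no reassignment is needed: an iterator is a mutable cursor, and next advances it in place as a side effect, so the loop body discards the returned element on purpose. The same pattern in Java (DropDemo and its drop helper are made up for this sketch, mirroring the Scala code above):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

class DropDemo {
    // Mirrors IterableLike.drop: consume n elements from the iterator, then
    // collect whatever remains. next() mutates the iterator itself, so there
    // is nothing to assign back to `it`.
    static <T> List<T> drop(Iterator<T> it, int n) {
        int i = 0;
        while (i < n && it.hasNext()) {
            it.next(); // advances the cursor as a side effect; result ignored
            i++;
        }
        List<T> rest = new ArrayList<>();
        while (it.hasNext()) {
            rest.add(it.next());
        }
        return rest;
    }

    public static void main(String[] args) {
        List<Integer> result = drop(Arrays.asList(1, 2, 3, 4, 5).iterator(), 2);
        System.out.println(result); // prints [3, 4, 5]
    }
}
```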