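SparkContext initialization proceeds through the following steps:
1 Create the Spark execution environment SparkEnv
1.2 What is SparkEnv?
SparkEnv is Spark's execution environment object. It holds many objects that an Executor needs at run time, so a SparkEnv exists in every process that has to create an Executor. Which processes are those?
- In local mode, the Driver process creates the Executor.
- In local-cluster mode or Standalone mode, the CoarseGrainedExecutorBackend process on each Worker also creates an Executor.
In summary, SparkEnv exists in the Driver process or in a CoarseGrainedExecutorBackend process.
1.3 How SparkEnv is constructed
SparkEnv is created mainly through SparkEnv.createDriverEnv.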
private[spark] def createDriverEnv(
    conf: SparkConf,
    isLocal: Boolean,
    listenerBus: LiveListenerBus,
    numCores: Int,
    mockOutputCommitCoordinator: Option[OutputCommitCoordinator] = None): SparkEnv = {...}
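Here conf is a copy of SparkConf, isLocal indicates whether Spark runs in single-machine (local) mode, and listenerBus maintains the handling of the various events using the listener pattern. SparkEnv.createDriverEnv ultimately calls SparkEnv.create to build the SparkEnv.
Inside SparkEnv.create, the construction steps are as follows:
- Create the security manager SecurityManager;
// step 1: create the security manager
val securityManager = new SecurityManager(conf, ioEncryptionKey)
if (isDriver) {
  securityManager.initializeAuth()
}
- Create the Netty-based RPC environment RpcEnv (before Spark 1.6, this layer was an Akka ActorSystem);
Netty and Akka are left for later study and are not covered here.
val systemName = if (isDriver) driverSystemName else executorSystemName
val rpcEnv = RpcEnv.create(systemName, bindAddress, advertiseAddress, port.getOrElse(-1), conf,
  securityManager, numUsableCores, !isDriver)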
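2 Create the RDD cleaner metadataCleaner
3 Create and initialize the Spark UI
4 Hadoop-related configuration and Executor environment variables
5 Create the task scheduler TaskScheduler
TaskScheduler is responsible for submitting tasks and asking the cluster manager to schedule them.
For details, see the author's other article, "TaskScheduler in Detail, with a Source Code Walkthrough".
5.1 createTaskScheduler
The TaskScheduler is created by SparkContext.createTaskScheduler, shown below. The method matches the master setting against the supported deployment modes. Each built-in mode creates instances of two classes, TaskSchedulerImpl and SchedulerBackend; the TaskSchedulerImpl class is the same in every mode, while the SchedulerBackend implementation differs.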
/**
 * Create a task scheduler based on a given master URL.
 * Return a 2-tuple of the scheduler backend and the task scheduler.
 */
private def createTaskScheduler(
    sc: SparkContext,
    master: String,
    deployMode: String): (SchedulerBackend, TaskScheduler) = {
  import SparkMasterRegex._

  // When running locally, don't try to re-execute tasks on failure.
  val MAX_LOCAL_TASK_FAILURES = 1

  master match {
    case "local" =>
      val scheduler = new TaskSchedulerImpl(sc, MAX_LOCAL_TASK_FAILURES, isLocal = true)
      val backend = new LocalSchedulerBackend(sc.getConf, scheduler, 1)
      scheduler.initialize(backend)
      (backend, scheduler)

    case LOCAL_N_REGEX(threads) =>
      def localCpuCount: Int = Runtime.getRuntime.availableProcessors()
      // local[*] estimates the number of cores on the machine; local[N] uses exactly N threads.
      val threadCount = if (threads == "*") localCpuCount else threads.toInt
      if (threadCount <= 0) {
        throw new SparkException(s"Asked to run locally with $threadCount threads")
      }
      val scheduler = new TaskSchedulerImpl(sc, MAX_LOCAL_TASK_FAILURES, isLocal = true)
      val backend = new LocalSchedulerBackend(sc.getConf, scheduler, threadCount)
      scheduler.initialize(backend)
      (backend, scheduler)

    case LOCAL_N_FAILURES_REGEX(threads, maxFailures) =>
      def localCpuCount: Int = Runtime.getRuntime.availableProcessors()
      // local[*, M] means the number of cores on the computer with M failures
      // local[N, M] means exactly N threads with M failures
      val threadCount = if (threads == "*") localCpuCount else threads.toInt
      val scheduler = new TaskSchedulerImpl(sc, maxFailures.toInt, isLocal = true)
      val backend = new LocalSchedulerBackend(sc.getConf, scheduler, threadCount)
      scheduler.initialize(backend)
      (backend, scheduler)

    case SPARK_REGEX(sparkUrl) =>
      val scheduler = new TaskSchedulerImpl(sc)
      val masterUrls = sparkUrl.split(",").map("spark://" + _)
      val backend = new StandaloneSchedulerBackend(scheduler, sc, masterUrls)
      scheduler.initialize(backend)
      (backend, scheduler)

    case LOCAL_CLUSTER_REGEX(numSlaves, coresPerSlave, memoryPerSlave) =>
      // Check to make sure memory requested <= memoryPerSlave. Otherwise Spark will just hang.
      val memoryPerSlaveInt = memoryPerSlave.toInt
      if (sc.executorMemory > memoryPerSlaveInt) {
        throw new SparkException(
          "Asked to launch cluster with %d MB RAM / worker but requested %d MB/worker".format(
            memoryPerSlaveInt, sc.executorMemory))
      }
      val scheduler = new TaskSchedulerImpl(sc)
      val localCluster = new LocalSparkCluster(
        numSlaves.toInt, coresPerSlave.toInt, memoryPerSlaveInt, sc.conf)
      val masterUrls = localCluster.start()
      val backend = new StandaloneSchedulerBackend(scheduler, sc, masterUrls)
      scheduler.initialize(backend)
      backend.shutdownCallback = (backend: StandaloneSchedulerBackend) => {
        localCluster.stop()
      }
      (backend, scheduler)

    case masterUrl =>
      val cm = getClusterManager(masterUrl) match {
        case Some(clusterMgr) => clusterMgr
        case None => throw new SparkException("Could not parse Master URL: '" + master + "'")
      }
      try {
        val scheduler = cm.createTaskScheduler(sc, masterUrl)
        val backend = cm.createSchedulerBackend(sc, masterUrl, scheduler)
        cm.initialize(scheduler, backend)
        (backend, scheduler)
      } catch {
        case se: SparkException => throw se
        case NonFatal(e) =>
          throw new SparkException("External scheduler cannot be instantiated", e)
      }
  }
}
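To make the master URL matching easier to follow, here is a small standalone Java sketch of the same dispatch idea. It is not Spark code: the class MasterUrlSketch, the helper describe, and the regular expressions are assumptions written for illustration (the real patterns live in SparkMasterRegex), the local-cluster form is left out for brevity, and the sketch rejects non-positive thread counts in both local[...] forms.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class MasterUrlSketch {
    // Assumed patterns, modeled on the URL forms matched by createTaskScheduler.
    private static final Pattern LOCAL_N = Pattern.compile("local\\[([0-9]+|\\*)\\]");
    private static final Pattern LOCAL_N_FAILURES =
        Pattern.compile("local\\[([0-9]+|\\*)\\s*,\\s*([0-9]+)\\]");
    private static final Pattern SPARK = Pattern.compile("spark://(.*)");

    // "*" means one thread per available core; otherwise the literal count is used.
    private static int threadCount(String threads) {
        int n = threads.equals("*")
            ? Runtime.getRuntime().availableProcessors()
            : Integer.parseInt(threads);
        if (n <= 0) {
            throw new IllegalArgumentException("Asked to run locally with " + n + " threads");
        }
        return n;
    }

    /** Hypothetical helper: describes which kind of backend a master URL selects. */
    static String describe(String master) {
        if (master.equals("local")) {
            return "local backend: threads=1, maxFailures=1";
        }
        Matcher m = LOCAL_N.matcher(master);
        if (m.matches()) {
            return "local backend: threads=" + threadCount(m.group(1)) + ", maxFailures=1";
        }
        m = LOCAL_N_FAILURES.matcher(master);
        if (m.matches()) {
            return "local backend: threads=" + threadCount(m.group(1))
                + ", maxFailures=" + Integer.parseInt(m.group(2));
        }
        m = SPARK.matcher(master);
        if (m.matches()) {
            // A comma-separated list names several standalone masters.
            return "standalone backend: masters=" + m.group(1).split(",").length;
        }
        // Anything else is handed to an external cluster manager.
        return "external cluster manager";
    }

    public static void main(String[] args) {
        String[] samples = {"local", "local[4]", "local[2, 3]", "spark://h1:7077,h2:7077", "yarn"};
        for (String master : samples) {
            System.out.println(master + " -> " + describe(master));
        }
    }
}
```

The order of the checks matters in the same way as the order of the case clauses: the most specific forms are tried first, and the catch-all comes last.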
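5.2 TaskSchedulerImpl
The source of TaskSchedulerImpl is as follows:
private[spark] class TaskSchedulerImpl(
    val sc: SparkContext,
    val maxTaskFailures: Int,
    isLocal: Boolean = false)
  extends TaskScheduler with Logging {
  ...
}
TaskSchedulerImpl is constructed as follows:
- Read configuration from SparkConf, including the number of CPUs allocated to each task and the scheduling mode (there are two scheduling modes, FAIR and FIFO; the default is FIFO, and it can be changed through the spark.scheduler.mode property). The source is: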
val conf = sc.conf
// How often to check for speculative tasks
val SPECULATION_INTERVAL_MS = conf.getTimeAsMs("spark.speculation.interval", "100ms")
// Duplicate copies of a task will only be launched if the original copy has been running for
// at least this amount of time. This is to avoid the overhead of launching speculative copies
// of tasks that are very short.
val MIN_TIME_TO_SPECULATION = 100
private val speculationScheduler =
ThreadUtils.newDaemonSingleThreadScheduledExecutor("task-scheduler-speculation")
// Threshold above which we warn user initial TaskSet may be starved
val STARVATION_TIMEOUT_MS = conf.getTimeAsMs("spark.starvation.timeout", "15s")
// CPUs to request per task
val CPUS_PER_TASK = conf.getInt("spark.task.cpus", 1)
// default scheduler is FIFO
private val schedulingModeConf = conf.get(SCHEDULER_MODE_PROPERTY, SchedulingMode.FIFO.toString)
private lazy val barrierSyncTimeout = conf.get(config.BARRIER_SYNC_TIMEOUT)
- Create the TaskResultGetter, which uses a thread pool to process the task results sent back by the Executors on the Workers.
// Source: excerpt from TaskSchedulerImpl
// This is a var so that we can reset it for testing purposes.
private[spark] var taskResultGetter = new TaskResultGetter(sc.env, this)
The thread pool is created by Executors.newFixedThreadPool with 4 threads by default; thread names begin with task-result-getter, and the thread factory defaults to Executors.defaultThreadFactory. Each point is explained in turn, based on the two source excerpts further below:
- 4 threads by default
The THREADS constant in TaskResultGetter reads the property "spark.resultGetter.threads" and falls back to 4. THREADS is then passed to ThreadUtils.newDaemonFixedThreadPool, from there to Executors.newFixedThreadPool, and finally to the ThreadPoolExecutor constructor as the corePoolSize argument, where the statement this.corePoolSize = corePoolSize; sets the default number of threads in the pool to 4.
- Thread names begin with task-result-getter
The English comment on newDaemonFixedThreadPool states this clearly: "Thread names are formatted as prefix-ID, where ID is a unique, sequentially assigned integer." The prefix is the string "task-result-getter" that TaskResultGetter passes in through ThreadUtils.newDaemonFixedThreadPool(THREADS, "task-result-getter"), and ID is a unique, sequentially assigned integer.
- The thread factory defaults to Executors.defaultThreadFactory
This is not visible directly in the Spark source, but it can be found by following the call chain:
TaskResultGetter
->
protected val getTaskResultExecutor: ExecutorService = ThreadUtils.newDaemonFixedThreadPool(THREADS, "task-result-getter")
->
def newDaemonFixedThreadPool(nThreads: Int, prefix: String): ThreadPoolExecutor = {
  // the thread factory that is used
  val threadFactory = namedThreadFactory(prefix)
  ...
}
->
def namedThreadFactory(prefix: String): ThreadFactory = {
  new ThreadFactoryBuilder().setDaemon(true).setNameFormat(prefix + "-%d").build()
}
->
public ThreadFactory build() {return build(this);}
->
private static ThreadFactory build(ThreadFactoryBuilder builder) {
  ...
  final ThreadFactory backingThreadFactory = builder.backingThreadFactory != null ? builder.backingThreadFactory : Executors.defaultThreadFactory();
  Thread thread = backingThreadFactory.newThread(runnable);
  ...
}
->
// Source: Executors.java
static class DefaultThreadFactory implements ThreadFactory {...}
Traced to the end, the original thread factory is backingThreadFactory. It is the object returned by Executors.defaultThreadFactory(), which is an instance of the nested class Executors.DefaultThreadFactory.
// Source: excerpt from TaskResultGetter
/**
 * Runs a thread pool that deserializes and remotely fetches (if necessary) task results.
 */
private[spark] class TaskResultGetter(sparkEnv: SparkEnv, scheduler: TaskSchedulerImpl)
  extends Logging {

  // Creates 4 threads by default.
  private val THREADS = sparkEnv.conf.getInt("spark.resultGetter.threads", 4)

  // Exposed for testing.
  protected val getTaskResultExecutor: ExecutorService =
    ThreadUtils.newDaemonFixedThreadPool(THREADS, "task-result-getter")
  ...
}
// Source: the newDaemonFixedThreadPool method, from ThreadUtils.scala
/**
 * Wrapper over newFixedThreadPool. Thread names are formatted as prefix-ID, where ID is a
 * unique, sequentially assigned integer.
 */
def newDaemonFixedThreadPool(nThreads: Int, prefix: String): ThreadPoolExecutor = {
  val threadFactory = namedThreadFactory(prefix)
  Executors.newFixedThreadPool(nThreads, threadFactory).asInstanceOf[ThreadPoolExecutor]
}
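The three properties discussed above (pool size, thread naming, default backing factory) can be reproduced with the JDK alone. The following Java sketch is an approximation under stated assumptions: NamedDaemonPoolSketch and namedDaemonFactory are hypothetical names, and the hand-written factory stands in for the ThreadFactoryBuilder used by Spark, so it should be read as an illustration of the idea and not as Spark's implementation.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.atomic.AtomicLong;

public class NamedDaemonPoolSketch {

    /** Wraps the JDK default factory, then renames each thread and marks it as a daemon. */
    static ThreadFactory namedDaemonFactory(String prefix) {
        ThreadFactory backing = Executors.defaultThreadFactory();
        AtomicLong counter = new AtomicLong(0);
        return runnable -> {
            Thread thread = backing.newThread(runnable);
            thread.setName(prefix + "-" + counter.getAndIncrement()); // prefix-ID
            thread.setDaemon(true);
            return thread;
        };
    }

    /** Fixed-size pool whose threads come from the naming factory above. */
    static ThreadPoolExecutor newDaemonFixedThreadPool(int nThreads, String prefix) {
        return (ThreadPoolExecutor)
            Executors.newFixedThreadPool(nThreads, namedDaemonFactory(prefix));
    }

    public static void main(String[] args) throws Exception {
        ThreadPoolExecutor pool = newDaemonFixedThreadPool(4, "task-result-getter");
        Future<String> workerName = pool.submit(() -> Thread.currentThread().getName());
        System.out.println("worker thread: " + workerName.get());
        System.out.println("core pool size: " + pool.getCorePoolSize());
        pool.shutdown();
    }
}
```

Because Executors.newFixedThreadPool passes the same value as both corePoolSize and maximumPoolSize, the pool never grows beyond the configured number of threads.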
// Source: ThreadPoolExecutor
/**
* Creates a new {@code ThreadPoolExecutor} with the given initial
* parameters.
*
* @param corePoolSize the number of threads to keep in the pool, even
* if they are idle, unless {@code allowCoreThreadTimeOut} is set
* @param maximumPoolSize the maximum number of threads to allow in the
* pool
* @param keepAliveTime when the number of threads is greater than
* the core, this is the maximum time that excess idle threads
* will wait for new tasks before terminating.
* @param unit the time unit for the {@code keepAliveTime} argument
* @param workQueue the queue to use for holding tasks before they are
* executed. This queue will hold only the {@code Runnable}
* tasks submitted by the {@code execute} method.
* @param threadFactory the factory to use when the executor
* creates a new thread
* @param handler the handler to use when execution is blocked
* because the thread bounds and queue capacities are reached
* @throws IllegalArgumentException if one of the following holds:<br>
* {@code corePoolSize < 0}<br>
* {@code keepAliveTime < 0}<br>
* {@code maximumPoolSize <= 0}<br>
* {@code maximumPoolSize < corePoolSize}
* @throws NullPointerException if {@code workQueue}
* or {@code threadFactory} or {@code handler} is null
*/
public ThreadPoolExecutor(int corePoolSize,
                          int maximumPoolSize,
                          long keepAliveTime,
                          TimeUnit unit,
                          BlockingQueue<Runnable> workQueue,
                          ThreadFactory threadFactory,
                          RejectedExecutionHandler handler) {
    if (corePoolSize < 0 ||
        maximumPoolSize <= 0 ||
        maximumPoolSize < corePoolSize ||
        keepAliveTime < 0)
        throw new IllegalArgumentException();
    if (workQueue == null || threadFactory == null || handler == null)
        throw new NullPointerException();
    this.acc = System.getSecurityManager() == null ?
            null :
            AccessController.getContext();
    this.corePoolSize = corePoolSize;
    this.maximumPoolSize = maximumPoolSize;
    this.workQueue = workQueue;
    this.keepAliveTime = unit.toNanos(keepAliveTime);
    this.threadFactory = threadFactory;
    this.handler = handler;
}
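The argument checks at the top of this constructor are easy to try out. The Java sketch below is only an illustration; PoolArgsSketch and rejects are hypothetical names, and it uses the five-argument ThreadPoolExecutor constructor, which supplies a default thread factory and rejection handler before delegating to the one shown above.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class PoolArgsSketch {

    /** Returns true when the constructor rejects the given pool sizes. */
    static boolean rejects(int corePoolSize, int maximumPoolSize) {
        try {
            ThreadPoolExecutor pool = new ThreadPoolExecutor(
                corePoolSize, maximumPoolSize, 0L, TimeUnit.MILLISECONDS,
                new LinkedBlockingQueue<Runnable>());
            pool.shutdown(); // nothing was submitted, so this returns immediately
            return false;
        } catch (IllegalArgumentException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("core=4, max=4 rejected: " + rejects(4, 4));
        System.out.println("core=4, max=2 rejected: " + rejects(4, 2));
        System.out.println("core=-1, max=4 rejected: " + rejects(-1, 4));
        System.out.println("core=0, max=0 rejected: " + rejects(0, 0));
    }
}
```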
5.3 SchedulerBackend
See the author's other article, "SchedulerBackend in Detail, with a Source Code Walkthrough".
5.4 The initialize method
This method binds the SchedulerBackend to the TaskScheduler.
As the source below shows, Spark supports only two task scheduling modes: FIFO (first in, first out) and FAIR (fair scheduling).
// Source: TaskSchedulerImpl.scala
def initialize(backend: SchedulerBackend) {
  this.backend = backend
  schedulableBuilder = {
    schedulingMode match {
      case SchedulingMode.FIFO =>
        new FIFOSchedulableBuilder(rootPool)
      case SchedulingMode.FAIR =>
        new FairSchedulableBuilder(rootPool, conf)
      case _ =>
        throw new IllegalArgumentException(s"Unsupported $SCHEDULER_MODE_PROPERTY: " +
          s"$schedulingMode")
    }
  }
  schedulableBuilder.buildPools()
}
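The mode selection in initialize can be mirrored in a few lines of plain Java. This is a sketch under assumptions, not Spark code: SchedulingModeSketch, parseMode, and builderFor are hypothetical names, the builders are represented only by their class names as strings, and upper-casing the configured value before parsing is an assumption about how the setting is normalized.

```java
import java.util.Locale;

public class SchedulingModeSketch {

    // Stand-in for Spark's SchedulingMode enumeration.
    enum SchedulingMode { FAIR, FIFO, NONE }

    /** Parses the configured value; unknown names raise IllegalArgumentException. */
    static SchedulingMode parseMode(String configured) {
        return SchedulingMode.valueOf(configured.toUpperCase(Locale.ROOT));
    }

    /** Picks the builder name for a mode; only FIFO and FAIR are supported. */
    static String builderFor(SchedulingMode mode) {
        switch (mode) {
            case FIFO:
                return "FIFOSchedulableBuilder";
            case FAIR:
                return "FairSchedulableBuilder";
            default:
                throw new IllegalArgumentException("Unsupported spark.scheduler.mode: " + mode);
        }
    }

    public static void main(String[] args) {
        System.out.println(builderFor(parseMode("fifo")));
        System.out.println(builderFor(parseMode("FAIR")));
    }
}
```

A value that parses but is neither FIFO nor FAIR falls through to the default branch, which corresponds to the case _ clause in the Scala source.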
6 Create and start DAGScheduler
The first DAGScheduler instance is created during SparkContext initialization.
// Source: SparkContext.scala
_dagScheduler = new DAGScheduler(this)
Execution then enters the DAGScheduler class, where the statement eventProcessLoop.start() is eventually executed:
private[spark] class DAGScheduler(
    private[scheduler] val sc: SparkContext,
    private[scheduler] val taskScheduler: TaskScheduler,
    listenerBus: LiveListenerBus,
    mapOutputTracker: MapOutputTrackerMaster,
    blockManagerMaster: BlockManagerMaster,
    env: SparkEnv,
    clock: Clock = new SystemClock())
  extends Logging {
  ...
  eventProcessLoop.start()
}
The next section explains eventProcessLoop.
6.1 DAGSchedulerEventProcessLoop
eventProcessLoop is an instance of the DAGSchedulerEventProcessLoop class shown below.
// Source: DAGScheduler.scala
private[spark] val eventProcessLoop = new DAGSchedulerEventProcessLoop(this)
As noted at the start of section 6, creating the DAGScheduler instance calls start() on eventProcessLoop. The DAGSchedulerEventProcessLoop source below does not define a start() method, so the search continues in its parent class, EventLoop.
private[scheduler] class DAGSchedulerEventProcessLoop(dagScheduler: DAGScheduler)
  extends EventLoop[DAGSchedulerEvent]("dag-scheduler-event-loop") with Logging {
  ...
  /**
   * The main event loop of the DAG scheduler.
   */
  override def onReceive(event: DAGSchedulerEvent): Unit = {
    val timerContext = timer.time()
    try {
      doOnReceive(event)
    } finally {
      timerContext.stop()
    }
  }

  private def doOnReceive(event: DAGSchedulerEvent): Unit = event match {
    case JobSubmitted(jobId, rdd, func, partitions, callSite, listener, properties) =>
      dagScheduler.handleJobSubmitted(jobId, rdd, func, partitions, callSite, listener, properties)
    case MapStageSubmitted(jobId, dependency, callSite, listener, properties) =>
      dagScheduler.handleMapStageSubmitted(jobId, dependency, callSite, listener, properties)
    case StageCancelled(stageId, reason) =>
      dagScheduler.handleStageCancellation(stageId, reason)
    case JobCancelled(jobId, reason) =>
      dagScheduler.handleJobCancellation(jobId, reason)
    ...
  }
  ...
}
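The EventLoop source is shown below; it defines the start() method. In that method:
- The first if statement means that the remaining statements run only when the stopped flag is false; otherwise an IllegalStateException is thrown.
- The body of onStart() is empty, so it does nothing.
- The eventThread.start() statement starts eventThread, the thread defined below. In its run method, the loop keeps waiting for events for as long as stopped is false. Inside the while loop, onReceive is called to handle each event. onReceive is only declared in EventLoop and is implemented in DAGSchedulerEventProcessLoop (the source shown earlier), where it delegates to doOnReceive, whose case clauses handle the corresponding events. These events are posted after an action operator's runJob method runs; the author has not yet worked through that path in detail and will add it later.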
/**
* An event loop to receive events from the caller and process all events in the event thread. It
* will start an exclusive event thread to process all events.
*
* Note: The event queue will grow indefinitely. So subclasses should make sure `onReceive` can
* handle events in time to avoid the potential OOM.
*/
private[spark] abstract class EventLoop[E](name: String) extends Logging {

  private val eventQueue: BlockingQueue[E] = new LinkedBlockingDeque[E]()

  private val stopped = new AtomicBoolean(false)

  // Exposed for testing.
  private[spark] val eventThread = new Thread(name) {
    setDaemon(true)

    override def run(): Unit = {
      try {
        while (!stopped.get) {
          val event = eventQueue.take()
          try {
            onReceive(event)
          } catch {
            case NonFatal(e) =>
              try {
                onError(e)
              } catch {
                case NonFatal(e) => logError("Unexpected error in " + name, e)
              }
          }
        }
      } catch {
        case ie: InterruptedException => // exit even if eventQueue is not empty
        case NonFatal(e) => logError("Unexpected error in " + name, e)
      }
    }
  }

  def start(): Unit = {
    if (stopped.get) {
      throw new IllegalStateException(name + " has already been stopped")
    }
    // Call onStart before starting the event thread to make sure it happens before onReceive
    onStart()
    eventThread.start()
  }

  /**
   * Invoked when `start()` is called but before the event thread starts.
   */
  protected def onStart(): Unit = {}

  /**
   * Invoked in the event thread when polling events from the event queue.
   */
  protected def onReceive(event: E): Unit
  ...
}
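The event loop pattern above (a blocking queue drained by one daemon thread, with a template method for handling events) can be sketched in plain Java as follows. This is a simplified illustration, not a port of Spark's class: EventLoopSketch, runDemo, and startAfterStopFails are hypothetical names, post uses offer on an unbounded queue, error handling is reduced to printing, and string events stand in for DAGSchedulerEvent.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingDeque;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

public class EventLoopSketch {

    /** Minimal event loop: one daemon thread drains a queue and calls onReceive. */
    abstract static class EventLoop<E> {
        private final BlockingQueue<E> eventQueue = new LinkedBlockingDeque<>();
        private final AtomicBoolean stopped = new AtomicBoolean(false);
        private final Thread eventThread;

        EventLoop(String name) {
            eventThread = new Thread(() -> {
                try {
                    while (!stopped.get()) {
                        E event = eventQueue.take(); // blocks until an event arrives
                        try {
                            onReceive(event);
                        } catch (RuntimeException e) {
                            System.err.println("Unexpected error in " + name + ": " + e);
                        }
                    }
                } catch (InterruptedException ie) {
                    // exit even if the queue is not empty
                }
            }, name);
            eventThread.setDaemon(true);
        }

        void start() {
            if (stopped.get()) {
                throw new IllegalStateException("event loop has already been stopped");
            }
            onStart(); // runs before the event thread starts
            eventThread.start();
        }

        void post(E event) {
            eventQueue.offer(event); // the queue is unbounded, so this always succeeds
        }

        void stop() {
            if (stopped.compareAndSet(false, true)) {
                eventThread.interrupt(); // wakes the thread if it is blocked in take()
                try {
                    eventThread.join(1000);
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt();
                }
            }
        }

        protected void onStart() {}

        protected abstract void onReceive(E event);
    }

    /** Posts the given events and returns what the handler recorded, in order. */
    static List<String> runDemo(String... events) {
        List<String> handled = Collections.synchronizedList(new ArrayList<>());
        CountDownLatch done = new CountDownLatch(events.length);
        EventLoop<String> loop = new EventLoop<String>("demo-event-loop") {
            @Override
            protected void onReceive(String event) {
                handled.add("handled " + event);
                done.countDown();
            }
        };
        loop.start();
        for (String event : events) {
            loop.post(event);
        }
        try {
            done.await(5, TimeUnit.SECONDS);
        } catch (InterruptedException ie) {
            Thread.currentThread().interrupt();
        }
        loop.stop();
        return new ArrayList<>(handled);
    }

    /** Shows that start() is refused once the loop has been stopped. */
    static boolean startAfterStopFails() {
        EventLoop<String> loop = new EventLoop<String>("stopped-loop") {
            @Override
            protected void onReceive(String event) {}
        };
        loop.stop();
        try {
            loop.start();
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(runDemo("JobSubmitted", "StageCancelled"));
        System.out.println("start after stop fails: " + startAfterStopFails());
    }
}
```

Since a single thread consumes the queue, events are handled strictly in the order they were posted, which is the property the DAG scheduler relies on when it funnels all of its events through one loop.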