spark-core_10: Source-code analysis of org.apache.spark.deploy.master.Master (part 2) -- how the Master RpcEndpoint initializes the Master


Continuing from the previous article.

/**
  * The lifecycle of an RpcEndpoint is: onStart -> receive (or receiveAndReply)* -> onStop.
  * This Master RpcEndpoint is thread safe.
  */
private[deploy] class Master(
    override val rpcEnv: RpcEnv,
    address: RpcAddress,
    webUiPort: Int,
    val securityMgr: SecurityManager,
    val conf: SparkConf)
  extends ThreadSafeRpcEndpoint with Logging with LeaderElectable {
  // Single-threaded daemon scheduled thread pool; the string is the thread name. As the name suggests it forwards messages; from the source below it is used for the Worker-timeout (death) check and for leader election
  private val forwardMessageThread =
    ThreadUtils.newDaemonSingleThreadScheduledExecutor("master-forward-message-thread")
  // This pool also has only one thread; unlike the one above it is not a scheduled executor. It is the thread that rebuilds the UI (both pools are simple wrappers built with Google's ThreadFactory)
  private val rebuildUIThread =
    ThreadUtils.newDaemonSingleThreadExecutor("master-rebuild-ui-thread")
  private val rebuildUIContext = ExecutionContext.fromExecutor(rebuildUIThread)

  // Returns an appropriate (subclass of) Configuration; creating it may initialize some Hadoop subsystems.
  // The implementation is simple: every SparkConf key of the form spark.hadoop.foo=bar is copied into
  // hadoopConf with the spark.hadoop. prefix stripped, i.e. as foo=bar
  private val hadoopConf = SparkHadoopUtil.get.newConfiguration(conf)

  private def createDateFormat = new SimpleDateFormat("yyyyMMddHHmmss") // For application IDs

  private val WORKER_TIMEOUT_MS = conf.getLong("spark.worker.timeout", 60) * 1000
  private val RETAINED_APPLICATIONS = conf.getInt("spark.deploy.retainedApplications", 200)
  private val RETAINED_DRIVERS = conf.getInt("spark.deploy.retainedDrivers", 200)
  private val REAPER_ITERATIONS = conf.getInt("spark.dead.worker.persistence", 15)
  /**
    * export SPARK_DAEMON_JAVA_OPTS="-Dsun.io.serialization.extendedDebugInfo=true
    * -Dspark.deploy.recoveryMode=ZOOKEEPER -Dspark.deploy.zookeeper.url=luyl153:2181,luyl154:2181,luyl155:2181
    * -Dspark.deploy.zookeeper.dir=/spark"
    * These SPARK_DAEMON_JAVA_OPTS properties are read from the environment when `new SparkConf` runs in the
    * main method; in HA mode RECOVERY_MODE therefore resolves to ZOOKEEPER
    */
  private val RECOVERY_MODE = conf.get("spark.deploy.recoveryMode", "NONE")

  val workers = new HashSet[WorkerInfo]
  val idToApp = new HashMap[String, ApplicationInfo]
  val waitingApps = new ArrayBuffer[ApplicationInfo]
  val apps = new HashSet[ApplicationInfo]

  private val idToWorker = new HashMap[String, WorkerInfo]
  private val addressToWorker = new HashMap[RpcAddress, WorkerInfo]

  private val endpointToApp = new HashMap[RpcEndpointRef, ApplicationInfo]
  private val addressToApp = new HashMap[RpcAddress, ApplicationInfo]
  private val completedApps = new ArrayBuffer[ApplicationInfo]
  private var nextAppNumber = 0
  // Using ConcurrentHashMap so that master-rebuild-ui-thread can add a UI after asyncRebuildUI
  private val appIdToUI = new ConcurrentHashMap[String, SparkUI]

  private val drivers = new HashSet[DriverInfo] // DriverInfo is used by the StandaloneRestServer, i.e. cluster deploy mode
  private val completedDrivers = new ArrayBuffer[DriverInfo]
  // Drivers currently spooled for scheduling
  private val waitingDrivers = new ArrayBuffer[DriverInfo]
  private var nextDriverNumber = 0

  Utils.checkHost(address.host, "Expected hostname")

/** The instance name can be master, worker, executor, driver, or applications. A MetricsSystem is created
  * for the instance and the default metrics properties are added, so that the final value of
  * MetricsConfig.propertyCategories is:
  * (applications,{sink.servlet.class=org.apache.spark.metrics.sink.MetricsServlet, sink.servlet.path=/metrics/applications/json})
  * (master,{sink.servlet.class=org.apache.spark.metrics.sink.MetricsServlet, sink.servlet.path=/metrics/master/json})
  * (*,{sink.servlet.class=org.apache.spark.metrics.sink.MetricsServlet, sink.servlet.path=/metrics/json})
  */


 
  private val masterMetricsSystem = MetricsSystem.createMetricsSystem("master", conf, securityMgr)

  private val applicationMetricsSystem = MetricsSystem.createMetricsSystem("applications", conf, securityMgr)

1. What does the MetricsSystem class actually do in the code?

==> First it is instantiated:

def createMetricsSystem(
    instance: String, conf: SparkConf, securityMgr: SecurityManager): MetricsSystem = {
  new MetricsSystem(instance, conf, securityMgr)
}

2. Now look at the class comment of MetricsSystem (translated below). In short: Spark's built-in sources such as MasterSource and WorkerSource (plus the worker, executor, ... sources) collect the state of Spark components and sink it to the configured servlets.

* An "instance" specifies the role that uses the metrics system. In Spark, roles such as master, worker,
  * executor, client driver are monitored by the MetricsSystem; the instances master, worker, executor,
  * driver and applications are already implemented.
  *
  * A "source" specifies where the metrics data is collected from. There are two kinds of source:
  * 1. Spark internal sources, such as MasterSource and WorkerSource, which collect the state of a Spark
  *    component; their instances are added to the MetricsSystem right after it is created.
  * 2. Common sources, such as JvmSource, which collect low-level state and can be configured and loaded
  *    through reflection from the configuration.
  *
  * A "sink" specifies where the metrics data is output to. Several sinks can coexist and metrics are
  * flushed to all of them. The metrics configuration format is:
  * [instance].[sink|source].[name].[options] = xxxx
  * instance can be "master", "worker", "executor", "driver" or "applications", meaning only the specified
  *   instance has the property; the wildcard "*" can be used instead, meaning all instances have it.
  * The second field can only be sink or source.
  * name specifies the name of the sink or source and can be customized.
  * options specifies the properties of the source or sink.
 */
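To make the [instance].[sink|source].[name].[options] format concrete, here is a small sketch of configuring an extra console sink for the master instance through SparkConf. This is an assumption-level example, not taken from the Master source: the spark.metrics.conf. prefix is stripped by MetricsConfig.initialize() (shown further below), and org.apache.spark.metrics.sink.ConsoleSink with its period/unit options is assumed to be available as in stock Spark.

import org.apache.spark.SparkConf

object MetricsConfExample {
  // Once the "spark.metrics.conf." prefix is removed, the keys follow [instance].[sink].[name].[options],
  // e.g. master.sink.console.class = org.apache.spark.metrics.sink.ConsoleSink
  val conf: SparkConf = new SparkConf()
    .set("spark.metrics.conf.master.sink.console.class", "org.apache.spark.metrics.sink.ConsoleSink")
    .set("spark.metrics.conf.master.sink.console.period", "20")
    .set("spark.metrics.conf.master.sink.console.unit", "seconds")
}

With that configuration background, back to the MetricsSystem class itself: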

private[spark] class MetricsSystem private (
    val instance: String,
    conf: SparkConf,
    securityMgr: SecurityManager)
  extends Logging {

  private[this] val metricsConfig = new MetricsConfig(conf)

  private val sinks = new mutable.ArrayBuffer[Sink]
  private val sources = new mutable.ArrayBuffer[Source]
  private val registry = new MetricRegistry()

  private var running: Boolean = false

  // Treat MetricsServlet as a special sink as it should be exposed to add handlers to web ui
  // (i.e. it is exposed so that its handlers can be attached to the web UI)
  private var metricsServlet: Option[MetricsServlet] = None

  /**
   * Get any UI handlers used by this metrics system; can only be called after start().
   */
  def getServletHandlers: Array[ServletContextHandler] = {
    require(running, "Can only call getServletHandlers on a running MetricsSystem")
    metricsServlet.map(_.getHandlers(conf)).getOrElse(Array())
  }

  metricsConfig.initialize()

3. First look at new MetricsConfig(conf).initialize(). As the source below shows, it ultimately builds propertyCategories: HashMap[String, Properties], keyed by instance (master, applications and the wildcard *), where each value is a Properties holding the sink servlet class name and the servlet URL path.

private[spark] class MetricsConfig(conf: SparkConf) extends Logging {

  private val DEFAULT_PREFIX = "*"
  private val INSTANCE_REGEX = "^(\\*|[a-zA-Z]+)\\.(.+)".r
  private val DEFAULT_METRICS_CONF_FILENAME = "metrics.properties"

  /** The final contents of MetricsConfig.properties are:
    * ("*.sink.servlet.class", "org.apache.spark.metrics.sink.MetricsServlet")
    * ("*.sink.servlet.path", "/metrics/json")
    * ("master.sink.servlet.path", "/metrics/master/json")
    * ("applications.sink.servlet.path", "/metrics/applications/json")
    */
  private[metrics] val properties = new Properties()

  /** The final contents of MetricsConfig.propertyCategories are:
    * (applications,{sink.servlet.class=org.apache.spark.metrics.sink.MetricsServlet, sink.servlet.path=/metrics/applications/json})
    * (master,{sink.servlet.class=org.apache.spark.metrics.sink.MetricsServlet, sink.servlet.path=/metrics/master/json})
    * (*,{sink.servlet.class=org.apache.spark.metrics.sink.MetricsServlet, sink.servlet.path=/metrics/json})
    */
  private[metrics] var propertyCategories: mutable.HashMap[String, Properties] = null

  private def setDefaultProperties(prop: Properties) {
    prop.setProperty("*.sink.servlet.class", "org.apache.spark.metrics.sink.MetricsServlet")
    prop.setProperty("*.sink.servlet.path", "/metrics/json")
    prop.setProperty("master.sink.servlet.path", "/metrics/master/json")
    prop.setProperty("applications.sink.servlet.path", "/metrics/applications/json")
  }

  def initialize() {
    // Add default properties in case there's no properties file
    // (loads the default keys and values into `properties`)
    setDefaultProperties(properties)

    // spark.metrics.conf is unset by default, so nothing gets loaded here
    loadPropertiesFromFile(conf.getOption("spark.metrics.conf"))

    // Also look for the properties in provided Spark configuration
    val prefix = "spark.metrics.conf."
    conf.getAll.foreach {
      case (k, v) if k.startsWith(prefix) =>
        properties.setProperty(k.substring(prefix.length()), v)
      case _ =>
    }

    /** At this point subProperties returns a HashMap[String, Properties] that looks like:
      * (applications,{sink.servlet.path=/metrics/applications/json})
      * (master,{sink.servlet.path=/metrics/master/json})
      * (*,{sink.servlet.class=org.apache.spark.metrics.sink.MetricsServlet, sink.servlet.path=/metrics/json})
      */
    propertyCategories = subProperties(properties, INSTANCE_REGEX)

    // DEFAULT_PREFIX: "*"
    if (propertyCategories.contains(DEFAULT_PREFIX)) {
      // i.e. (*,{sink.servlet.class=org.apache.spark.metrics.sink.MetricsServlet, sink.servlet.path=/metrics/json})
      val defaultProperty = propertyCategories(DEFAULT_PREFIX).asScala
      for ((inst, prop) <- propertyCategories if (inst != DEFAULT_PREFIX);
          (k, v) <- defaultProperty if (prop.get(k) == null)) {
        prop.put(k, v)
      }
      /** The loop above copies the missing defaults (here
        * sink.servlet.class=org.apache.spark.metrics.sink.MetricsServlet) into the Properties of
        * master and applications, so propertyCategories finally becomes:
        * (applications,{sink.servlet.class=org.apache.spark.metrics.sink.MetricsServlet, sink.servlet.path=/metrics/applications/json})
        * (master,{sink.servlet.class=org.apache.spark.metrics.sink.MetricsServlet, sink.servlet.path=/metrics/master/json})
        * (*,{sink.servlet.class=org.apache.spark.metrics.sink.MetricsServlet, sink.servlet.path=/metrics/json})
        */
    }
  }

  def subProperties(prop: Properties, regex: Regex): mutable.HashMap[String, Properties] = {
    val subProperties = new mutable.HashMap[String, Properties]
    prop.asScala.foreach { kv =>
      if (regex.findPrefixOf(kv._1.toString).isDefined) {
        val regex(prefix, suffix) = kv._1.toString
        subProperties.getOrElseUpdate(prefix, new Properties).setProperty(suffix, kv._2.toString)
      }
    }
    subProperties
  }
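A note on the `val regex(prefix, suffix) = kv._1.toString` line: a Regex with capture groups acts as an extractor in Scala, binding group 1 (the instance) and group 2 (the remaining property name). Below is a standalone sketch with made-up values, outside of Spark, that mimics what subProperties does with INSTANCE_REGEX:

import java.util.Properties
import scala.collection.JavaConverters._

object SubPropertiesDemo extends App {
  val instanceRegex = "^(\\*|[a-zA-Z]+)\\.(.+)".r

  val props = new Properties()
  props.setProperty("master.sink.servlet.path", "/metrics/master/json")
  props.setProperty("*.sink.servlet.class", "org.apache.spark.metrics.sink.MetricsServlet")

  props.asScala.foreach { case (key, value) =>
    // The extractor binds group 1 to `prefix` (the instance) and group 2 to `suffix` (the property name).
    // (The real subProperties first guards with regex.findPrefixOf so non-matching keys are skipped.)
    val instanceRegex(prefix, suffix) = key
    println(s"instance=$prefix, property=$suffix, value=$value")
    // e.g. instance=master, property=sink.servlet.path, value=/metrics/master/json
  }
}

Continuing with MetricsConfig: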

  // For instance "master" this returns
  // {sink.servlet.class=org.apache.spark.metrics.sink.MetricsServlet, sink.servlet.path=/metrics/master/json}
  def getInstance(inst: String): Properties = {
    // propertyCategories is the HashMap[String, Properties] built by subProperties above
    propertyCategories.get(inst) match {
      case Some(s) => s
      case None => propertyCategories.getOrElse(DEFAULT_PREFIX, new Properties)
    }
  }

  /**
   * Loads configuration from a config file. If no config file is provided, try to get file
   * in class path.
   * By default spark.metrics.conf is unset, so path is None, and there is no metrics.properties
   * file on the current classpath either, so nothing is loaded.
   */
  private[this] def loadPropertiesFromFile(path: Option[String]): Unit = {
    var is: InputStream = null
    try {
      is = path match {
        case Some(f) => new FileInputStream(f)
        case None => Utils.getSparkClassLoader.getResourceAsStream(DEFAULT_METRICS_CONF_FILENAME)
      }
      if (is != null) {
        properties.load(is)
      }
    } catch {
      case e: Exception =>
        val file = path.getOrElse(DEFAULT_METRICS_CONF_FILENAME)
        logError(s"Error loading configuration file $file", e)
    } finally {
      if (is != null) {
        is.close()
      }
    }
  }
}
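Putting the pieces together, here is a rough usage sketch. Assumptions: MetricsConfig is private[spark], so the demo pretends to live in the org.apache.spark.metrics package, and the spark.metrics.conf.master.sink.console.period key is only an illustrative extra option:

package org.apache.spark.metrics  // MetricsConfig is private[spark], so the demo must sit inside the package

import org.apache.spark.SparkConf

object MetricsConfigDemo {
  def main(args: Array[String]): Unit = {
    // One extra option passed through SparkConf with the "spark.metrics.conf." prefix
    val conf = new SparkConf(false)
      .set("spark.metrics.conf.master.sink.console.period", "20")

    val metricsConfig = new MetricsConfig(conf)
    metricsConfig.initialize()

    // getInstance("master") returns the master Properties with the "*" defaults merged in, e.g.
    // {sink.servlet.class=org.apache.spark.metrics.sink.MetricsServlet,
    //  sink.servlet.path=/metrics/master/json, sink.console.period=20}
    println(metricsConfig.getInstance("master"))
  }
}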

===> Back from MetricsSystem to Master

 

//Spark internal sources such as MasterSource and WorkerSource collect the state of a Spark component; they are added to the MetricsSystem right after it has been created
  private val masterSource = new MasterSource(this)
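For reference, a Source is just a name plus a Dropwizard MetricRegistry. Below is a simplified sketch of what a source like MasterSource looks like (not the exact Spark code; the class name and the single gauge are illustrative):

package org.apache.spark.deploy.master  // Master and Source are package-private, so the sketch lives here

import com.codahale.metrics.{Gauge, MetricRegistry}
import org.apache.spark.metrics.source.Source

// Illustrative source: exposes one gauge with the number of registered workers
private[spark] class SimpleMasterSource(master: Master) extends Source {
  override val sourceName = "master"
  override val metricRegistry = new MetricRegistry()

  metricRegistry.register(MetricRegistry.name("workers"), new Gauge[Int] {
    override def getValue: Int = master.workers.size
  })
}

Continuing with the Master fields: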

  // After onStart, webUi will be set
  private var webUi: MasterWebUI = null

  private val masterPublicAddress = {
    // SPARK_PUBLIC_DNS is not set, so this falls back to address.host, i.e. the current node luyl152 (spark://luyl152:7077)
    val envVar = conf.getenv("SPARK_PUBLIC_DNS")
    if (envVar != null) envVar else address.host
  }
  // masterUrl is spark://luyl152:7077
  private val masterUrl = address.toSparkURL
  // masterWebUiUrl is set in onStart() below, to http://luyl152:8080
  private var masterWebUiUrl: String = _

  private var state = RecoveryState.STANDBY

  private var persistenceEngine: PersistenceEngine = _

  private var leaderElectionAgent: LeaderElectionAgent = _

  private var recoveryCompletionTask: ScheduledFuture[_] = _

  private var checkForWorkerTimeOutTask: ScheduledFuture[_] = _

  // As a temporary workaround before better ways of configuring memory, we allow users to set
  // a flag that will perform round-robin scheduling across the nodes (spreading out each app
  // among all the nodes) instead of trying to consolidate each app onto a small # of nodes.
  private val spreadOutApps = conf.getBoolean("spark.deploy.spreadOut", true)

  // Default maxCores for applications that don't specify it (i.e. pass Int.MaxValue)
  // defaultCores defaults to Int.MaxValue; a non-positive value raises an error
  private val defaultCores = conf.getInt("spark.deploy.defaultCores", Int.MaxValue)
  if (defaultCores < 1) {
    throw new SparkException("spark.deploy.defaultCores must be positive")
  }

  // Alternative application submission gateway that is stable across Spark versions
  // spark.master.rest.enabled defaults to true, so the REST server is started; it is mainly used in cluster deploy mode
  private val restServerEnabled = conf.getBoolean("spark.master.rest.enabled", true)
  private var restServer: Option[StandaloneRestServer] = None
  private var restServerBoundPort: Option[Int] = None
  /** RpcEndpoint lifecycle: onStart (if defined) runs first -> then the message handlers such as
    * receive and receiveAndReply -> finally onStop.
    * onStart builds the web UI, starts the REST server, periodically checks whether Workers have
    * timed out, and performs the Master HA related setup.
    */

  override def onStart(): Unit = {
    logInfo("Starting Spark master at " + masterUrl)
    logInfo(s"Running Spark version ${org.apache.spark.SPARK_VERSION}")
    webUi = new MasterWebUI(this, webUiPort)

Next, let's analyze how MasterWebUI(MasterRpcEndPoint, 8080) initializes the web pages.

