spark-core_28: Executor initialization: env.blockManager.initialize(conf.getAppId) - blockTransferService.init() analysis


(See spark-core_25: Master notifies the Worker to launch the CoarseGrainedExecutorBackend process, and CoarseGrainedExecutorBackend initialization source analysis)

// SparkContext also calls _env.blockManager.initialize(_applicationId) during its own
// initialization; the process there is essentially the same.

private[spark] class BlockManager(
    executorId: String,       // "driver" on the driver; a numeric string in a CoarseGrainedExecutorBackend
    rpcEnv: RpcEnv,
    val master: BlockManagerMaster, // lives on the driver and manages the executors' BlockManagers;
                                    // it holds a reference to BlockManagerMasterEndpoint, and an executor
                                    // interacts with the driver by obtaining that reference and sending
                                    // messages to the endpoint
    defaultSerializer: Serializer,
    val conf: SparkConf,
    memoryManager: MemoryManager,       // UnifiedMemoryManager by default
    mapOutputTracker: MapOutputTracker, // on an executor: MapOutputTrackerWorker, which fetches
                                        // map-output info from the driver's MapOutputTrackerMaster;
                                        // on the driver: MapOutputTrackerMaster, which tracks map
                                        // output in a TimeStampedHashMap
    shuffleManager: ShuffleManager,
    blockTransferService: BlockTransferService,
    securityManager: SecurityManager,
    numUsableCores: Int)
  extends BlockDataManager with Logging {

...code not relevant to the flow is omitted; follow along in the source as we go
   
/**
   * Initializes the BlockManager with the given appId. This is not performed in the constructor as
   * the appId may not be known at BlockManager instantiation time (in particular for the driver,
   * where it is only learned after registration with the TaskScheduler).
   *
   * This method initializes the BlockTransferService and ShuffleClient, registers with the
   * BlockManagerMaster, starts the BlockManagerWorker endpoint, and registers with a local shuffle
   * service if configured.
   *
   * This method is called during SparkContext or Executor initialization:
   * _env.blockManager.initialize(_applicationId). What it does:
   * 1. Initializes the BlockManager with the given appId (for the driver in particular, this can
   *    only happen after registration with the TaskScheduler).
   * 2. blockTransferService.init(this) creates a Netty server.
   * 3. Builds a BlockManagerId("driver", driver host, Netty server port), the unique identifier
   *    of each BlockManager.
   * 3.1. Instantiates a BlockManagerSlaveEndpoint, which receives commands from the master,
   *      e.g. removing a block from a slave's BlockManager.
   * 4. master.registerBlockManager(): creates a BlockManagerInfo and puts it into the
   *    HashMap[BlockManagerId, BlockManagerInfo] held by BlockManagerMasterEndpoint's
   *    blockManagerInfo member. BlockManagerInfo tracks each BlockManagerId (the BlockManager's
   *    unique identifier) together with its BlockManagerSlaveEndpoint (used for driver-slave
   *    communication).
   *
   * The appId parameter looks like: app-20180404172558-0000
   */

 
  def initialize(appId: String): Unit = {
    // blockTransferService came in from SparkEnv.create as a NettyBlockTransferService,
    // the block transfer service.
    /** NettyBlockTransferService.init(this) does the following:
      1. Creates the RPC server NettyBlockRpcServer, which serves every request to open or
         upload any block registered with the BlockManager; each chunk transfer corresponds
         to one shuffle fetch.
      2. Builds the TransportContext: the context for creating the TransportServer (the Netty
         server) and the TransportClientFactory (which creates TransportClients), and for
         wiring the Netty channel pipeline via TransportChannelHandler.
      3. Creates the TransportClientFactory: it creates TransportClients via createClient,
         maintains a connection pool to other hosts, returns the same TransportClient for the
         same remote host, and shares a single worker thread pool across all TransportClients.
      4. Creates the Netty server (TransportServer), including the codec and the inbound
         handlers installed on it (all of the classes above exist to serve this Netty server).
      */
    blockTransferService.init(this)
    ...
  }
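For orientation, here is a minimal sketch of what the elided part of initialize goes on to do, condensed from the numbered steps in the doc comment above (member names such as maxMemory and slaveEndpoint follow the Spark 1.x source; treat the exact signatures as assumptions, not the verbatim method body):

  // Sketch only: the shape of the remaining initialize steps.
  def initialize(appId: String): Unit = {
    blockTransferService.init(this)      // step 2: stands up the Netty server
    val blockManagerId = BlockManagerId( // step 3: this BlockManager's unique identifier
      executorId, blockTransferService.hostName, blockTransferService.port)
    // step 4: register with the driver; slaveEndpoint is the BlockManagerSlaveEndpoint from
    // step 3.1, through which the driver can send commands back (e.g. remove a block)
    master.registerBlockManager(blockManagerId, maxMemory, slaveEndpoint)
  }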

1. How NettyBlockTransferService bootstraps the Netty server: NettyBlockTransferService.init() does the four things described in the comment above.

/**
 * A BlockTransferService that uses Netty to fetch a set of blocks at a time.
 * It is instantiated by SparkEnv.create().
 * blockTransferService defaults to NettyBlockTransferService, which provides both the server
 * side and the client side for fetching sets of blocks from remote nodes.
 * numCores: in local mode this is the number of CPU threads on the driver's node; in cluster
 * mode it is 0.
 * When the SparkEnv is created by a CoarseGrainedExecutorBackend, numCores is determined by
 * "spark.executor.cores" in SparkConf (set to 1 in my setup, so it is 1); if unset, only one
 * CoarseGrainedExecutorBackend is launched and it gets all of the worker's available cores.
 */
class NettyBlockTransferService(conf: SparkConf, securityManager: SecurityManager, numCores: Int)
  extends BlockTransferService {

  // The transfer service is initialized against a BlockDataManager (a parent type of BlockManager),
  // through which local blocks can be fetched (getBlockData) and stored (putBlockData).
  // This method is called by BlockManager.initialize().
  override def init(blockDataManager: BlockDataManager): Unit = {
    /** conf.getAppId: app-20180508234845-0000
      * serializer: JavaSerializer()
      * blockDataManager: the BlockManager instance
      */
    val rpcHandler = new NettyBlockRpcServer(conf.getAppId, serializer, blockDataManager)
    ...
  }

2. Initializing NettyBlockRpcServer: it serves every request to open or upload any block registered with the BlockManager; each chunk transfer corresponds to one shuffle fetch.

class NettyBlockRpcServer(
    appId: String,            // app-20180508234845-0000
    serializer: Serializer,   // JavaSerializer
    blockManager: BlockDataManager) // the BlockManager instance
  extends RpcHandler with Logging {

  // The StreamManager allows registering an Iterator<ManagedBuffer>; the TransportClient then
  // fetches the individual chunks, and each registered buffer is one chunk.
  private val streamManager = new OneForOneStreamManager()

  // openBlocks and uploadBlock open and upload blocks registered with the BlockManager.
  override def receive(
      client: TransportClient,
      rpcMessage: ByteBuffer,
      responseContext: RpcResponseCallback): Unit = {
    val message = BlockTransferMessage.Decoder.fromByteBuffer(rpcMessage)
    logTrace(s"Received request: $message")

    message match {
      case openBlocks: OpenBlocks =>
        val blocks: Seq[ManagedBuffer] =
          ...

      case uploadBlock: UploadBlock =>
        // StorageLevel is serialized as bytes using our JavaSerializer.
        ...
    }
  }

  override def getStreamManager(): StreamManager = streamManager
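receive above follows a plain decode-then-dispatch pattern: the raw bytes are decoded into a typed message, which is then pattern-matched. A self-contained sketch of the same pattern, with hypothetical message types standing in for Spark's OpenBlocks/UploadBlock:

// Hypothetical messages mirroring the OpenBlocks / UploadBlock dispatch above.
sealed trait TransferMessage
case class Open(blockIds: Seq[String]) extends TransferMessage
case class Upload(blockId: String, payload: Array[Byte]) extends TransferMessage

def handle(message: TransferMessage): String = message match {
  case Open(ids)        => s"register a stream over ${ids.size} blocks, reply with a stream id"
  case Upload(id, data) => s"store block $id (${data.length} bytes) via the BlockDataManager"
}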

Next, a look at OneForOneStreamManager, the object returned by getStreamManager().

/**
 * StreamManager which allows registration of an Iterator<ManagedBuffer>, which are individually
 * fetched as chunks by the client. Each registered buffer is one chunk.
 */

public class OneForOneStreamManager extends StreamManager {
  private final Logger logger = LoggerFactory.getLogger(OneForOneStreamManager.class);

  private final AtomicLong nextStreamId;
  private final ConcurrentHashMap<Long, StreamState> streams;

  /** State of a single stream. */
  private static class StreamState {
    ...
  }

  // Called from BlockManager.initialize ==> NettyBlockTransferService.init() ==>
  // new NettyBlockRpcServer.
  // Sets nextStreamId (an AtomicLong) to a value below Integer.MAX_VALUE * 1000,
  // and streams to a new ConcurrentHashMap<Long, StreamState>().
  public OneForOneStreamManager() {
    // For debugging purposes, start with a random stream id to help identifying different streams.
    // This does not need to be globally unique, only unique to this class.
    nextStreamId = new AtomicLong((long) new Random().nextInt(Integer.MAX_VALUE) * 1000);
    streams = new ConcurrentHashMap<Long, StreamState>();
  }

  ...
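Note the cast in the constructor: nextInt(Integer.MAX_VALUE) is widened to long before the multiplication by 1000, so the random starting id can exceed Integer.MAX_VALUE without overflowing an int. A standalone sketch of the same id-allocation pattern (a hypothetical class, not Spark code):

import java.util.Random
import java.util.concurrent.ConcurrentHashMap
import java.util.concurrent.atomic.AtomicLong

// Per-instance random starting point, then monotonically increasing ids.
class StreamIdAllocator[S] {
  private val nextStreamId =
    new AtomicLong(new Random().nextInt(Integer.MAX_VALUE).toLong * 1000L)
  private val streams = new ConcurrentHashMap[java.lang.Long, S]()

  def register(state: S): Long = {
    val id = nextStreamId.getAndIncrement() // unique within this instance only
    streams.put(id, state)
    id
  }

  def lookup(id: Long): Option[S] = Option(streams.get(id))
}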

3. Back to the NettyBlockTransferService.init method

override def init(blockDataManager: BlockDataManager): Unit = {
  /** conf.getAppId: app-20180508234845-0000
    * serializer: JavaSerializer()
    * blockDataManager: the BlockManager instance
    * NettyBlockRpcServer: serves every request to open or upload any block registered with the
    * BlockManager; each chunk transfer corresponds to one shuffle fetch.
    */
  val rpcHandler = new NettyBlockRpcServer(conf.getAppId, serializer, blockDataManager)
  var serverBootstrap: Option[TransportServerBootstrap] = None
  var clientBootstrap: Option[TransportClientBootstrap] = None
  if (authEnabled) { // false by default: authentication is off
    serverBootstrap = Some(new SaslServerBootstrap(transportConf, securityManager))
    clientBootstrap = Some(new SaslClientBootstrap(transportConf, conf.getAppId, securityManager,
      securityManager.isSaslEncryptionEnabled()))
  }
  // TransportContext: the context for creating the TransportServer (the Netty server) and the
  // TransportClientFactory (which creates TransportClients), and for wiring the Netty channel
  // pipeline via TransportChannelHandler.
  // Instantiating it assigns the members: conf (a TransportConf tied to SparkConf through its
  // ConfigProvider), rpcHandler (the NettyBlockRpcServer), closeIdleConnections (false), plus
  // concrete instances of the outbound encoder and inbound decoder.

  transportContext = new TransportContext(transportConf, rpcHandler)
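For reference, authEnabled above comes from Spark's authentication setting. A minimal SparkConf that would take the SASL branch, shown as an assumption about a non-YARN deployment (these are standard Spark properties):

import org.apache.spark.SparkConf

// With these set, SecurityManager reports authentication enabled, so the
// SaslServerBootstrap / SaslClientBootstrap instances above get created.
val conf = new SparkConf()
  .set("spark.authenticate", "true")
  .set("spark.authenticate.secret", "my-shared-secret") // shared secret outside YARN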

==> First, a look at NettyBlockTransferService's transportConf member

// numCores: in local mode this is the number of CPU threads on the driver's node; in cluster mode it is 0.
/** SparkTransportConf.fromSparkConf(): sets spark.shuffle.io.serverThreads and
  * spark.shuffle.io.clientThreads in SparkConf to numCores threads, for Netty's server and
  * client; if SparkConf provides no value, the result is capped at 8.
  * fromSparkConf returns a TransportConf whose config keys are derived from the second (module)
  * parameter, e.g. SPARK_NETWORK_IO_MODE_KEY -> spark.shuffle.io.mode,
  * SPARK_NETWORK_IO_SERVERTHREADS_KEY -> spark.shuffle.io.serverThreads.
  */
private val transportConf = SparkTransportConf.fromSparkConf(conf, "shuffle", numCores)

===> Per the source, SparkTransportConf ties the TransportConf to the SparkConf through a ConfigProvider subclass (a member of TransportConf) and derives the config keys from the module parameter;
==> it also sets spark.shuffle.io.serverThreads and spark.shuffle.io.clientThreads to the value of the numUsableCores argument.

/**
 * Provides a utility for transforming from a SparkConf inside a Spark JVM (e.g., Executor,
 * Driver, or a standalone shuffle service) into a TransportConf with details on our environment
 * like the number of cores that are allocated to this JVM.
 */
object SparkTransportConf {
  /**
   * Spark defaults to 8 Netty threads. In practice, only 2-4 cores are needed per 10 Gb/s of
   * throughput, and each core needs roughly 32 MB of off-heap memory up front. The value can be
   * overridden manually via the serverThreads and clientThreads settings.
   */
  private val MAX_DEFAULT_NETTY_THREADS = 8

 
  /**
   * Utility for creating a [[TransportConf]] from a [[SparkConf]].
   * @param _conf the [[SparkConf]]
   * @param module the module name, e.g. "shuffle"
   * @param numUsableCores if nonzero, this will restrict the server and client threads to only
   *   use the given number of cores, rather than all of the machine's cores.
   *   This restriction will only occur if these properties are not already set.
   *
   * Creates a TransportConf from the SparkConf.
   * numUsableCores: in local mode this is the number of CPU threads on the driver's node; in
   * cluster mode it is 0. When the SparkEnv is created by a CoarseGrainedExecutorBackend, it is
   * determined by "spark.executor.cores" in SparkConf (1 in my setup); if unset, only one
   * CoarseGrainedExecutorBackend is launched and it gets all of the worker's available cores.
   */

 
  def fromSparkConf(_conf: SparkConf, module: String, numUsableCores: Int = 0): TransportConf = {
    val conf = _conf.clone

    // Specify thread configuration based on our JVM's allocation of cores (rather than
    // necessarily assuming we have all the machine's cores).
    // NB: Only set if serverThreads/clientThreads not already set.
    // defaultNumThreads: the thread-pool size for Netty's client and server; if numUsableCores
    // is 0, it returns at most 8.
    val numThreads = defaultNumThreads(numUsableCores)
    // With spark.executor.cores set to 1 here, numThreads is 1, so spark.shuffle.io.serverThreads
    // and spark.shuffle.io.clientThreads both end up as 1.
    conf.setIfMissing(s"spark.$module.io.serverThreads", numThreads.toString)
    conf.setIfMissing(s"spark.$module.io.clientThreads", numThreads.toString)

    // ConfigProvider is abstract; implementing its get() is what allows the TransportConf to be
    // instantiated. The module value determines the config keys, e.g.
    // SPARK_NETWORK_IO_MODE_KEY -> spark.shuffle.io.mode,
    // SPARK_NETWORK_IO_SERVERTHREADS_KEY -> spark.shuffle.io.serverThreads.
    new TransportConf(module, new ConfigProvider {
      override def get(name: String): String = conf.get(name)
    })
  }

  /**
   * Returns the default number of threads for both the Netty client and server thread pools.
   * If numUsableCores is 0, we will use the Runtime to get an approximate number of available
   * cores.
   */
  private def defaultNumThreads(numUsableCores: Int): Int = {
    val availableCores =
      if (numUsableCores > 0) numUsableCores else Runtime.getRuntime.availableProcessors()
    math.min(availableCores, MAX_DEFAULT_NETTY_THREADS)
  }
}
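A quick standalone reproduction of the thread-count logic makes the capping behavior concrete (same arithmetic as defaultNumThreads above):

object NettyThreadCountDemo {
  private val MAX_DEFAULT_NETTY_THREADS = 8

  def defaultNumThreads(numUsableCores: Int): Int = {
    val availableCores =
      if (numUsableCores > 0) numUsableCores
      else Runtime.getRuntime.availableProcessors()
    math.min(availableCores, MAX_DEFAULT_NETTY_THREADS)
  }

  def main(args: Array[String]): Unit = {
    println(defaultNumThreads(1))  // 1: the spark.executor.cores=1 case from this article
    println(defaultNumThreads(16)) // 8: capped at MAX_DEFAULT_NETTY_THREADS
    println(defaultNumThreads(0))  // min(availableProcessors, 8): machine-dependent
  }
}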

===> Back in the NettyBlockTransferService.init method, a look at new TransportContext(transportConf, rpcHandler)

override def init(blockDataManager: BlockDataManager): Unit = {
  ...
  // TransportContext: the context for creating the TransportServer (the Netty server) and the
  // TransportClientFactory (which creates TransportClients), and for wiring the Netty channel
  // pipeline via TransportChannelHandler.
  // Instantiating it assigns the members: conf (a TransportConf tied to SparkConf through its
  // ConfigProvider), rpcHandler (the NettyBlockRpcServer), closeIdleConnections (false), plus
  // concrete instances of the outbound encoder and inbound decoder.

  transportContext = new TransportContext(transportConf, rpcHandler)

===> Initializing TransportContext

/**
 * TransportContext contains the context to create a {TransportServer} and a
 * {TransportClientFactory}, and to set up the Netty channel pipeline via
 * {TransportChannelHandler}.
 *
 * The TransportClient provides two communication protocols: control-plane RPCs and data-plane
 * "chunk fetching". The handling of RPCs is performed outside the scope of TransportContext
 * (i.e., by a user-provided handler), which is responsible for setting up the streams that can
 * then be streamed through the data plane in chunks using zero-copy IO.
 *
 * The TransportServer and TransportClientFactory both create a TransportChannelHandler for each
 * channel. As each TransportChannelHandler contains a TransportClient, this enables server
 * processes to send messages back to the client on an existing channel.
 */

public class TransportContext {
  private final Logger logger = LoggerFactory.getLogger(TransportContext.class);

  // See the constructor below for how these members are set.
  private final TransportConf conf;
  private final RpcHandler rpcHandler;
  private final boolean closeIdleConnections;

  private final MessageEncoder encoder;
  private final MessageDecoder decoder;

  public TransportContext(TransportConf conf, RpcHandler rpcHandler) {
    this(conf, rpcHandler, false);
  }

  /**
   * @param conf the TransportConf whose config keys were derived from the module parameter of
   *   fromSparkConf, e.g. SPARK_NETWORK_IO_MODE_KEY -> spark.shuffle.io.mode,
   *   SPARK_NETWORK_IO_SERVERTHREADS_KEY -> spark.shuffle.io.serverThreads
   * @param rpcHandler the NettyBlockRpcServer: serves every request to open or upload any block
   *   registered with the BlockManager; each chunk transfer corresponds to one shuffle fetch
   * @param closeIdleConnections false when going through the two-argument constructor above
   */
  public TransportContext(
      TransportConf conf,
      RpcHandler rpcHandler,
      boolean closeIdleConnections) {
    this.conf = conf;             // TransportConf, tied to SparkConf through its ConfigProvider
    this.rpcHandler = rpcHandler; // NettyBlockRpcServer
    // a MessageToMessageEncoder, for outbound events
    this.encoder = new MessageEncoder();
    // a MessageToMessageDecoder, for inbound events
    this.decoder = new MessageDecoder();
    // false by default
    this.closeIdleConnections = closeIdleConnections;
  }
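What TransportContext ultimately does with the encoder and decoder is standard Netty pipeline wiring: one outbound encoder, one inbound decoder, then a business handler per channel. A minimal generic sketch of that pattern (plain Netty, with String codecs standing in for MessageEncoder/MessageDecoder; not Spark's actual handler chain):

import io.netty.bootstrap.ServerBootstrap
import io.netty.channel.ChannelInitializer
import io.netty.channel.nio.NioEventLoopGroup
import io.netty.channel.socket.SocketChannel
import io.netty.channel.socket.nio.NioServerSocketChannel
import io.netty.handler.codec.string.{StringDecoder, StringEncoder}

object PipelineWiringSketch {
  def main(args: Array[String]): Unit = {
    val group = new NioEventLoopGroup()
    try {
      new ServerBootstrap()
        .group(group)
        .channel(classOf[NioServerSocketChannel])
        .childHandler(new ChannelInitializer[SocketChannel] {
          override def initChannel(ch: SocketChannel): Unit = {
            ch.pipeline()
              .addLast("encoder", new StringEncoder()) // outbound, like MessageEncoder
              .addLast("decoder", new StringDecoder()) // inbound, like MessageDecoder
            // Spark appends a TransportChannelHandler here that dispatches to the RpcHandler.
          }
        })
        .bind(0).sync().channel().closeFuture().sync()
    } finally {
      group.shutdownGracefully()
    }
  }
}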

4. Back in NettyBlockTransferService.init once more: TransportContext.createClientFactory() creates the TransportClientFactory

override def init(blockDataManager: BlockDataManager): Unit = {
  /** conf.getAppId: app-20180508234845-0000
    * serializer: JavaSerializer()
    * blockDataManager: the BlockManager instance
    * NettyBlockRpcServer: serves every request to open or upload any block registered with the
    * BlockManager; each chunk transfer corresponds to one shuffle fetch.
    */
  val rpcHandler = new NettyBlockRpcServer(conf.getAppId, serializer, blockDataManager)
  var serverBootstrap: Option[TransportServerBootstrap] = None
  var clientBootstrap: Option[TransportClientBootstrap] = None
  if (authEnabled) { // false by default: authentication is off
    serverBootstrap = Some(new SaslServerBootstrap(transportConf, securityManager))
    clientBootstrap = Some(new SaslClientBootstrap(transportConf, conf.getAppId, securityManager,
      securityManager.isSaslEncryptionEnabled()))
  }
  // TransportContext: the context for creating the TransportServer (the Netty server) and the
  // TransportClientFactory, wiring the Netty channel pipeline via TransportChannelHandler,
  // as described above.
  transportContext = new TransportContext(transportConf, rpcHandler)

  /** clientBootstrap is empty because SSL is not enabled.
    *
    * TransportClientFactory: this factory creates TransportClients via createClient. It
    * maintains a connection pool to other hosts and returns the same TransportClient for the
    * same remote host. It also shares a single worker thread pool across all TransportClients.
    *
    * createClientFactory initializes a ClientFactory which runs the given
    * TransportClientBootstraps prior to returning a new client. Bootstraps are executed
    * synchronously and must run successfully in order to create a client.
    * The TransportClientFactory instance gets these members:
    * context: the TransportContext,
    * conf: the TransportConf, tied to SparkConf through its ConfigProvider,
    * plus Netty's NioSocketChannel.class, a NioEventLoopGroup, and a pooled ByteBuf allocator
    * (PooledByteBufAllocator).
    */
  clientFactory = transportContext.createClientFactory(clientBootstrap.toSeq.asJava)

==> TransportContext.createClientFactory simply instantiates the TransportClientFactory

/**
 * Initializes a ClientFactory which runs the given TransportClientBootstraps prior to returning
 * a new Client. Bootstraps will be executed synchronously, and must run successfully in order
 * to create a Client.
 */
public TransportClientFactory createClientFactory(List<TransportClientBootstrap> bootstraps) {
  return new TransportClientFactory(this, bootstraps);
}

===> The constructor stores the TransportContext and the TransportConf; it resolves the IOMode enum (NIO by default) to NioSocketChannel.class, creates Netty's EventLoopGroup for that IOMode, and creates a pooled ByteBuf allocator (PooledByteBufAllocator), assigning all of these to member fields.

public TransportClientFactory(
    TransportContext context,
    List<TransportClientBootstrap> clientBootstraps) {
  // ensure the TransportContext is non-null
  this.context = Preconditions.checkNotNull(context);
  // the TransportConf whose config keys were derived from the module parameter of
  // SparkTransportConf.fromSparkConf, e.g. SPARK_NETWORK_IO_MODE_KEY -> spark.shuffle.io.mode,
  // SPARK_NETWORK_IO_SERVERTHREADS_KEY -> spark.shuffle.io.serverThreads
  this.conf = context.getConf();
  // an empty ArrayList
  this.clientBootstraps = Lists.newArrayList(Preconditions.checkNotNull(clientBootstraps));
  // a ConcurrentHashMap
  this.connectionPool = new ConcurrentHashMap<SocketAddress, ClientPool>();
  // numConnectionsPerPeer: spark.shuffle.io.numConnectionsPerPeer is not set in SparkConf,
  // so the default of 1 is used
  this.numConnectionsPerPeer = conf.numConnectionsPerPeer();
  this.rand = new Random();
  // conf.ioMode(): reads spark.shuffle.io.mode and yields one of the NIO/EPOLL enum values;
  // here it returns the string "NIO", which becomes the NIO enum value
  IOMode ioMode = IOMode.valueOf(conf.ioMode());
  // for NIO this returns NioSocketChannel.class
  this.socketChannelClass = NettyUtils.getClientChannelClass(ioMode);
  // TODO: Make thread pool name configurable.
  /**
   * conf.clientThreads() corresponds to spark.shuffle.io.clientThreads, which was set during
   * NettyBlockTransferService initialization ==> SparkTransportConf.fromSparkConf.
   * The ConfigProvider's get implementation is SparkConf.get(SPARK_NETWORK_IO_CLIENTTHREADS_KEY).
   * spark.shuffle.io.clientThreads matches the CoarseGrainedExecutorBackend's core count,
   * which is 1 in my setup.
   *
   * NettyUtils.createEventLoop: creates Netty's EventLoopGroup based on the IOMode enum.
   */
  this.workerGroup = NettyUtils.createEventLoop(ioMode, conf.clientThreads(), "shuffle-client");
  /**
   * conf.preferDirectBufs(): looks up spark.shuffle.io.preferDirectBufs; SparkConf does not
   * set this key, so it returns true.
   * conf.clientThreads() is 1 here.
   * NettyUtils.createPooledByteBufAllocator(): creates a pooled ByteBuf allocator
   * (PooledByteBufAllocator).
   */
  this.pooledAllocator = NettyUtils.createPooledByteBufAllocator(
      conf.preferDirectBufs(), false /* allowCache */, conf.clientThreads());
}
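The "same remote host, same TransportClient" behavior comes from connectionPool plus numConnectionsPerPeer above: up to N clients are cached per remote address and one is picked at random per request. A simplified sketch of that pooling pattern (hypothetical types; the real createClient also handles concurrency and dead connections):

import java.net.{InetSocketAddress, SocketAddress}
import java.util.Random
import java.util.concurrent.ConcurrentHashMap

// Stand-in for TransportClient; the real factory stores live Netty channels.
final case class Client(address: SocketAddress)

class ClientPoolSketch(numConnectionsPerPeer: Int) {
  private final class ClientPool {
    val clients = new Array[Client](numConnectionsPerPeer)
  }
  private val connectionPool = new ConcurrentHashMap[SocketAddress, ClientPool]()
  private val rand = new Random()

  def createClient(host: String, port: Int): Client = {
    val address = new InetSocketAddress(host, port)
    connectionPool.putIfAbsent(address, new ClientPool)
    val pool = connectionPool.get(address)
    val idx = rand.nextInt(numConnectionsPerPeer) // pick one slot at random
    if (pool.clients(idx) == null) {
      pool.clients(idx) = Client(address)         // real code dials a Netty connection here
    }
    pool.clients(idx)                             // same host -> same pooled client
  }
}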

 

Back in NettyBlockTransferService.init, the next step is creating the Netty server.

(See spark-core_29: Executor initialization env.blockManager.initialize(conf.getAppId) - NettyBlockTransferService.init() - NettyServer creation source analysis)

