Spark 2.x Source Code Analysis: A First Look at RPC

Article link: https://blog.csdn.net/hopeatme/article/details/70195327

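Since Spark 2.0, the master and workers no longer communicate over Akka at all; the RPC layer is implemented on Netty instead. Netty's wide adoption alone makes it a credible choice for the job. On to the topic.

Using the Master code as an example, this article explains how RpcEnv, RpcEndpoint, RpcEndpointRef, NettyRpcEnv, and NettyRpcEndpointRef relate to one another.

First, the class diagram of the org.apache.spark.rpc package:

[Figure: class diagram of the org.apache.spark.rpc package]

As the diagram shows, the Spark Master and Worker both extend RpcEndpoint. In other words, master and worker talk to each other over RPC and are the main participants in it. Before an RpcEndpoint can be used, an RpcEnv has to be initialized. RpcEnv is the environment in which RPC messages are exchanged, and an RpcEndpoint is registered with it under a String name. Every RpcEndpoint has exactly one RpcEndpointRef, which is the handle a remote party uses to send messages to the endpoint and receive its replies.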
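
The relationship described above can be sketched with a toy, in-process model. This is only an illustration: ToyRpcEnv, ToyEndpoint, and ToyEndpointRef are hypothetical names rather than Spark classes, and there is no networking or threading involved.

```scala
import scala.collection.mutable

object ToyRpcModel {
  // Hypothetical stand-in for RpcEndpoint: something that handles messages.
  trait ToyEndpoint {
    def receive(message: Any): Any
  }

  // Hypothetical stand-in for RpcEndpointRef: a handle bound to one endpoint name.
  final class ToyEndpointRef(val name: String, env: ToyRpcEnv) {
    def ask(message: Any): Any = env.deliver(name, message)
  }

  // Hypothetical stand-in for RpcEnv: endpoints are registered under a String name.
  final class ToyRpcEnv {
    private val endpoints = mutable.Map.empty[String, ToyEndpoint]

    def setupEndpoint(name: String, endpoint: ToyEndpoint): ToyEndpointRef = {
      require(!endpoints.contains(name), s"There is already an endpoint called $name")
      endpoints(name) = endpoint
      new ToyEndpointRef(name, this)
    }

    def deliver(name: String, message: Any): Any = {
      val endpoint = endpoints.getOrElse(
        name, throw new NoSuchElementException(s"Could not find $name"))
      endpoint.receive(message)
    }
  }

  // Registers an echo endpoint and talks to it only through its ref.
  def demo(): Any = {
    val env = new ToyRpcEnv
    val ref = env.setupEndpoint("echo", new ToyEndpoint {
      def receive(message: Any): Any = s"echo: $message"
    })
    ref.ask("ping")
  }
}
```

The caller never touches the endpoint object directly; it only holds the ref, which is the property the real RpcEndpointRef provides across machines.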

The sections below use the Master code to walk through RpcEnv, RpcEndpoint, and RpcEndpointRef.

1.1 RpcEnv

The entry point where the Master creates its RpcEnv:

val rpcEnv = RpcEnv.create(SYSTEM_NAME, host, port, conf, securityMgr)

Next, an RpcEnvConfig instance is built, and the NettyRpcEnvFactory factory class creates the RpcEnv from that configuration:

val config = RpcEnvConfig(conf, name, bindAddress, advertiseAddress, port, securityManager,
  clientMode)
new NettyRpcEnvFactory().create(config)

Next, a serializer instance is created. Because RpcEnv sends objects over the network, it needs one consistent way to serialize and deserialize them:

val javaSerializerInstance =
  new JavaSerializer(sparkConf).newInstance().asInstanceOf[JavaSerializerInstance]

val nettyEnv =
  new NettyRpcEnv(sparkConf, javaSerializerInstance, config.advertiseAddress,
    config.securityManager)

NettyRpcEnv in turn creates a dispatcher:

private val dispatcher: Dispatcher = new Dispatcher(this)

The Dispatcher maintains:

1 the message inbox of each registered RpcEndpoint, keyed by endpoint name (endpoints)

2 the mapping from each RpcEndpoint to its NettyRpcEndpointRef (endpointRefs)

3 the queue of inboxes waiting to be processed (receivers)

private val endpoints: ConcurrentMap[String, EndpointData] =
  new ConcurrentHashMap[String, EndpointData]
private val endpointRefs: ConcurrentMap[RpcEndpoint, RpcEndpointRef] =
  new ConcurrentHashMap[RpcEndpoint, RpcEndpointRef]

// Track the receivers whose inboxes may contain messages.
private val receivers = new LinkedBlockingQueue[EndpointData]

The Dispatcher creates a thread pool that receives and dispatches messages. Unless spark.rpc.netty.dispatcher.numThreads is set, the pool has max(2, number of available processors) threads, so there are always at least two:

/** Thread pool used for dispatching messages. */
private val threadpool: ThreadPoolExecutor = {
  val numThreads = nettyEnv.conf.getInt("spark.rpc.netty.dispatcher.numThreads",
    math.max(2, Runtime.getRuntime.availableProcessors()))
  val pool = ThreadUtils.newDaemonFixedThreadPool(numThreads, "dispatcher-event-loop")
  for (i <- 0 until numThreads) {
    pool.execute(new MessageLoop)
  }
  pool
}

/** Message loop used for dispatching messages. */
private class MessageLoop extends Runnable {
  override def run(): Unit = {
    try {
      while (true) {
        try {
          val data = receivers.take()
          if (data == PoisonPill) {
            // Put PoisonPill back so that other MessageLoops can see it.
            receivers.offer(PoisonPill)
            return
          }
          data.inbox.process(Dispatcher.this)
        } catch {
          case NonFatal(e) => logError(e.getMessage, e)
        }
      }
    } catch {
      case ie: InterruptedException => // exit
    }
  }
}

If receivers is empty, the thread blocks at val data = receivers.take(). As soon as receivers has an element, a thread wakes up and takes it. It first checks whether the element is the PoisonPill (the shutdown marker); if it is not, it calls the process method of that element's inbox.
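
The blocking take and the put-it-back handling of the poison pill can be tried out in isolation. The following is a minimal sketch, not Spark code: PoisonPillDemo, Poison, and process are hypothetical names, and plain strings stand in for EndpointData.

```scala
import java.util.concurrent.{CopyOnWriteArrayList, Executors, LinkedBlockingQueue, TimeUnit}

object PoisonPillDemo {
  // Hypothetical sentinel; compared by reference, so no real item can equal it.
  private val Poison = new String("poison")

  // Starts numThreads workers that drain the queue until they see the sentinel.
  // Returns the set of processed items (order across workers is not deterministic).
  def process(items: Seq[String], numThreads: Int): Set[String] = {
    val queue = new LinkedBlockingQueue[String]()
    val processed = new CopyOnWriteArrayList[String]()
    val pool = Executors.newFixedThreadPool(numThreads)

    for (_ <- 0 until numThreads) {
      pool.execute(new Runnable {
        override def run(): Unit = {
          var running = true
          while (running) {
            val item = queue.take() // blocks while the queue is empty
            if (item eq Poison) {
              queue.offer(Poison) // put it back so the other workers also stop
              running = false
            } else {
              processed.add(item)
            }
          }
        }
      })
    }

    items.foreach(item => queue.offer(item))
    queue.offer(Poison) // enqueued last, so every real item is taken before it
    pool.shutdown()
    pool.awaitTermination(10, TimeUnit.SECONDS)
    processed.toArray(new Array[String](0)).toSet
  }
}
```

A single sentinel is enough to stop any number of workers, because each worker that sees it re-offers it before exiting.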

Next, a sentinel EndpointData object is created:

/** A poison endpoint that indicates MessageLoop should exit its message loop. */
private val PoisonPill = new EndpointData(null, null, null)

private class EndpointData(
    val name: String,
    val endpoint: RpcEndpoint,
    val ref: NettyRpcEndpointRef) {
  val inbox = new Inbox(ref, endpoint)
}

When an Inbox is constructed, it adds a default OnStart message to its message queue:

inbox.synchronized {
  messages.add(OnStart)
}
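
Seeding the queue in the constructor guarantees that OnStart is handled before any message posted later. A small sketch of that ordering follows; ToyInbox, InboxMsg, and UserMsg are hypothetical names, and the OnStart object here is a local stand-in for Spark's message of the same name.

```scala
import scala.collection.mutable

object OnStartDemo {
  sealed trait InboxMsg
  case object OnStart extends InboxMsg
  final case class UserMsg(content: String) extends InboxMsg

  final class ToyInbox {
    // Seeded at construction, so OnStart is always the first message in the queue.
    private val messages = mutable.Queue[InboxMsg](OnStart)

    def post(m: InboxMsg): Unit = synchronized { messages.enqueue(m) }

    // Removes and returns every pending message in FIFO order.
    def drain(): List[InboxMsg] = synchronized { messages.dequeueAll(_ => true).toList }
  }

  def demo(): List[InboxMsg] = {
    val inbox = new ToyInbox
    inbox.post(UserMsg("hello"))
    inbox.drain()
  }
}
```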

Back in NettyRpcEnv, a stream manager is created:

private val streamManager = new NettyStreamManager(this)

streamManager is used to serve jars, files, and directories over the network.

Then the service that listens on the port is started:

Utils.startServiceOnPort(config.port, startNettyRpcEnv, sparkConf, config.name)._1

val startNettyRpcEnv: Int => (NettyRpcEnv, Int) = { actualPort =>
  nettyEnv.startServer(config.bindAddress, actualPort)
  (nettyEnv, nettyEnv.address.port)
}

After the server has started, an RpcEndpoint is registered with the dispatcher:

server = transportContext.createServer(bindAddress, port, bootstraps)
dispatcher.registerRpcEndpoint(
  RpcEndpointVerifier.NAME, new RpcEndpointVerifier(this, dispatcher))

Name: endpoint-verifier

Object: RpcEndpointVerifier

registerRpcEndpoint builds the address and the reference:

val addr = RpcEndpointAddress(nettyEnv.address, name)
val endpointRef = new NettyRpcEndpointRef(nettyEnv.conf, addr, nettyEnv)

At this point endpointRef has a value such as:

NettyRpcEndpointRef(spark://endpoint-verifier@blackeye.com:7077)

if (endpoints.putIfAbsent(name, new EndpointData(name, endpoint, endpointRef)) != null) {
  throw new IllegalArgumentException(s"There is already an RpcEndpoint called $name")
}
val data = endpoints.get(name)
endpointRefs.put(data.endpoint, data.ref)
receivers.offer(data) // for the OnStart message

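Once receivers.offer(data) has run, receivers contains an element, so one of the threads wakes up and processes the OnStart message:

case OnStart =>
  endpoint.onStart()
  if (!endpoint.isInstanceOf[ThreadSafeRpcEndpoint]) {
    inbox.synchronized {
      if (!stopped) {
        enableConcurrent = true
      }
    }
  }

When the endpoint is the Master (registered in 1.2), this in turn calls the Master's onStart method: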

override def onStart(): Unit = {
  logInfo("Starting Spark master at " + masterUrl)
  logInfo(s"Running Spark version ${org.apache.spark.SPARK_VERSION}")
  webUi = new MasterWebUI(this, webUiPort)
  webUi.bind()
  masterWebUiUrl = "http://" + masterPublicAddress + ":" + webUi.boundPort
  if (reverseProxy) {
    masterWebUiUrl = conf.get("spark.ui.reverseProxyUrl", masterWebUiUrl)
    logInfo(s"Spark Master is acting as a reverse proxy. Master, Workers and " +
      s"Applications UIs are available at $masterWebUiUrl")
  }
  checkForWorkerTimeOutTask = forwardMessageThread.scheduleAtFixedRate(new Runnable {
    override def run(): Unit = Utils.tryLogNonFatalError {
      self.send(CheckForWorkerTimeOut)
    }
  }, 0, WORKER_TIMEOUT_MS, TimeUnit.MILLISECONDS)

  if (restServerEnabled) {
    val port = conf.getInt("spark.master.rest.port", 6066)
    restServer = Some(new StandaloneRestServer(address.host, port, conf, self, masterUrl))
  }
  restServerBoundPort = restServer.map(_.start())

  masterMetricsSystem.registerSource(masterSource)
  masterMetricsSystem.start()
  applicationMetricsSystem.start()
  // Attach the master and app metrics servlet handler to the web ui after the metrics systems are
  // started.
  masterMetricsSystem.getServletHandlers.foreach(webUi.attachHandler)
  applicationMetricsSystem.getServletHandlers.foreach(webUi.attachHandler)

  val serializer = new JavaSerializer(conf)
  val (persistenceEngine_, leaderElectionAgent_) = RECOVERY_MODE match {
    case "ZOOKEEPER" =>
      logInfo("Persisting recovery state to ZooKeeper")
      val zkFactory =
        new ZooKeeperRecoveryModeFactory(conf, serializer)
      (zkFactory.createPersistenceEngine(), zkFactory.createLeaderElectionAgent(this))
    case "FILESYSTEM" =>
      val fsFactory =
        new FileSystemRecoveryModeFactory(conf, serializer)
      (fsFactory.createPersistenceEngine(), fsFactory.createLeaderElectionAgent(this))
    case "CUSTOM" =>
      val clazz = Utils.classForName(conf.get("spark.deploy.recoveryMode.factory"))
      val factory = clazz.getConstructor(classOf[SparkConf], classOf[Serializer])
        .newInstance(conf, serializer)
        .asInstanceOf[StandaloneRecoveryModeFactory]
      (factory.createPersistenceEngine(), factory.createLeaderElectionAgent(this))
    case _ =>
      (new BlackHolePersistenceEngine(), new MonarchyLeaderAgent(this))
  }
  persistenceEngine = persistenceEngine_
  leaderElectionAgent = leaderElectionAgent_
}

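webUi.bind() binds the web UI port (8080 by default) and starts the web service.

A CheckForWorkerTimeOut message is sent to the master at a fixed interval of WORKER_TIMEOUT_MS for it to process.

If restServerEnabled is set, restServer.map(_.start()) starts the REST server, which listens on port 6066 by default.

RECOVERY_MODE defaults to "NONE" and can be set to "ZOOKEEPER", "FILESYSTEM", or "CUSTOM":

private val RECOVERY_MODE = conf.get("spark.deploy.recoveryMode", "NONE")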

1.2 RpcEndpoint

The rpcEnv created in 1.1 is now used to set up the RpcEndpoint:

val masterEndpoint = rpcEnv.setupEndpoint(ENDPOINT_NAME,
  new Master(rpcEnv, rpcEnv.address, webUiPort, securityMgr, conf))

Name: Master

Object: Master

Setting up the endpoint:

override def setupEndpoint(name: String, endpoint: RpcEndpoint): RpcEndpointRef = {
  dispatcher.registerRpcEndpoint(name, endpoint)
}

def registerRpcEndpoint(name: String, endpoint: RpcEndpoint): NettyRpcEndpointRef = {
  val addr = RpcEndpointAddress(nettyEnv.address, name)
  val endpointRef = new NettyRpcEndpointRef(nettyEnv.conf, addr, nettyEnv)
  synchronized {
    if (stopped) {
      throw new IllegalStateException("RpcEnv has been stopped")
    }
    if (endpoints.putIfAbsent(name, new EndpointData(name, endpoint, endpointRef)) != null) {
      throw new IllegalArgumentException(s"There is already an RpcEndpoint called $name")
    }
    val data = endpoints.get(name)
    endpointRefs.put(data.endpoint, data.ref)
    receivers.offer(data) // for the OnStart message
  }
  endpointRef
}

A NettyRpcEndpointRef instance is constructed and registered in endpoints and endpointRefs, the EndpointData is put on receivers, and the NettyRpcEndpointRef instance is returned to the caller.

The returned reference is then used to ask the Master for its bound ports:

val portsResponse = masterEndpoint.askWithRetry[BoundPortsResponse](BoundPortsRequest)

The message is sent through the NettyRpcEndpointRef:

def askWithRetry[T: ClassTag](message: Any, timeout: RpcTimeout): T = {
  // TODO: Consider removing multiple attempts
  var attempts = 0
  var lastException: Exception = null
  while (attempts < maxRetries) {
    attempts += 1
    try {
      val future = ask[T](message, timeout)
      val result = timeout.awaitResult(future)
      if (result == null) {
        throw new SparkException("RpcEndpoint returned null")
      }
      return result
    } catch {
      case ie: InterruptedException => throw ie
      case e: Exception =>
        lastException = e
        logWarning(s"Error sending message [message = $message] in $attempts attempts", e)
    }

    if (attempts < maxRetries) {
      Thread.sleep(retryWaitMs)
    }
  }

  throw new SparkException(
    s"Error sending message [message = $message]", lastException)
}

The message is sent and the caller waits for the result. If an attempt fails and the retry limit has not been reached, the thread sleeps for retryWaitMs and then tries again.
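
The same bounded-retry pattern can be written as a small standalone helper. This is a sketch under assumptions: withRetry and RetryDemo are hypothetical names, and a plain RuntimeException replaces SparkException so the code has no Spark dependency.

```scala
object RetryDemo {
  // Hypothetical helper: runs op up to maxRetries times,
  // sleeping retryWaitMs between failed attempts.
  def withRetry[T](maxRetries: Int, retryWaitMs: Long)(op: () => T): T = {
    var attempts = 0
    var lastException: Exception = null
    while (attempts < maxRetries) {
      attempts += 1
      try {
        return op()
      } catch {
        case ie: InterruptedException => throw ie // never swallow interruption
        case e: Exception => lastException = e
      }
      if (attempts < maxRetries) {
        Thread.sleep(retryWaitMs)
      }
    }
    throw new RuntimeException(s"Failed after $attempts attempts", lastException)
  }

  // The operation fails twice and succeeds on the third call.
  def demo(): (String, Int) = {
    var calls = 0
    val result = withRetry(maxRetries = 3, retryWaitMs = 10L) { () =>
      calls += 1
      if (calls < 3) throw new IllegalStateException(s"attempt $calls failed")
      "ok"
    }
    (result, calls)
  }
}
```

Rethrowing InterruptedException instead of retrying keeps the helper responsive to thread shutdown, which is the same choice askWithRetry makes.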

override def ask[T: ClassTag](message: Any, timeout: RpcTimeout): Future[T] = {
  nettyEnv.ask(RequestMessage(nettyEnv.address, this, message), timeout)
}

private[netty] def ask[T: ClassTag](message: RequestMessage, timeout: RpcTimeout): Future[T] = {
  val promise = Promise[Any]()
  val remoteAddr = message.receiver.address

  def onFailure(e: Throwable): Unit = {
    if (!promise.tryFailure(e)) {
      logWarning(s"Ignored failure: $e")
    }
  }

  def onSuccess(reply: Any): Unit = reply match {
    case RpcFailure(e) => onFailure(e)
    case rpcReply =>
      if (!promise.trySuccess(rpcReply)) {
        logWarning(s"Ignored message: $reply")
      }
  }

  try {
    if (remoteAddr == address) {
      val p = Promise[Any]()
      p.future.onComplete {
        case Success(response) => onSuccess(response)
        case Failure(e) => onFailure(e)
      }(ThreadUtils.sameThread)
      dispatcher.postLocalMessage(message, p)
    } else {
      val rpcMessage = RpcOutboxMessage(serialize(message),
        onFailure,
        (client, response) => onSuccess(deserialize[Any](client, response)))
      postToOutbox(message.receiver, rpcMessage)
      promise.future.onFailure {
        case _: TimeoutException => rpcMessage.onTimeout()
        case _ =>
      }(ThreadUtils.sameThread)
    }

    val timeoutCancelable = timeoutScheduler.schedule(new Runnable {
      override def run(): Unit = {
        onFailure(new TimeoutException(s"Cannot receive any reply in ${timeout.duration}"))
      }
    }, timeout.duration.toNanos, TimeUnit.NANOSECONDS)
    promise.future.onComplete { v =>
      timeoutCancelable.cancel(true)
    }(ThreadUtils.sameThread)
  } catch {
    case NonFatal(e) =>
      onFailure(e)
  }
  promise.future.mapTo[T].recover(timeout.addMessageIfTimeout)(ThreadUtils.sameThread)
}

If remoteAddr == address, the message is delivered locally through the dispatcher; otherwise it is handed to an Outbox to be sent to the remote address.
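
Both paths in the method above race a reply against a scheduled timeout, and trySuccess/tryFailure ensure that only the first completion counts. The sketch below isolates that idea; AskTimeoutDemo and its ask helper are hypothetical, and the global execution context stands in for ThreadUtils.sameThread.

```scala
import java.util.concurrent.{Executors, ThreadFactory, TimeoutException}
import scala.concurrent.{Await, ExecutionContext, Future, Promise}
import scala.concurrent.duration._
import scala.util.Try

object AskTimeoutDemo {
  // Daemon scheduler thread, so it never keeps the JVM alive.
  private val scheduler = Executors.newSingleThreadScheduledExecutor(new ThreadFactory {
    override def newThread(r: Runnable): Thread = {
      val t = new Thread(r, "ask-timeout")
      t.setDaemon(true)
      t
    }
  })

  // Hypothetical helper: `send` receives the promise and may complete it with a reply.
  // If nothing completes it within `timeout`, the future fails with TimeoutException.
  def ask(timeout: FiniteDuration)(send: Promise[Any] => Unit): Future[Any] = {
    val promise = Promise[Any]()
    val timeoutTask = scheduler.schedule(new Runnable {
      override def run(): Unit = {
        // tryFailure is a no-op if a reply already completed the promise.
        promise.tryFailure(new TimeoutException(s"Cannot receive any reply in $timeout"))
      }
    }, timeout.toNanos, NANOSECONDS)
    // Once the promise is completed either way, the timeout task is no longer needed.
    promise.future.onComplete(_ => timeoutTask.cancel(true))(ExecutionContext.global)
    send(promise)
    promise.future
  }

  def demo(): (Try[Any], Try[Any]) = {
    val answered = ask(1.second)(p => p.trySuccess("pong"))
    val silent = ask(50.millis)(_ => ()) // nobody ever replies
    (Try(Await.result(answered, 2.seconds)), Try(Await.result(silent, 2.seconds)))
  }
}
```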

def postLocalMessage(message: RequestMessage, p: Promise[Any]): Unit = {
  val rpcCallContext =
    new LocalNettyRpcCallContext(message.senderAddress, p)
  val rpcMessage = RpcMessage(message.senderAddress, message.content, rpcCallContext)
  postMessage(message.receiver.name, rpcMessage, (e) => p.tryFailure(e))
}

An RpcCallContext and an RpcMessage are created, and then the message is posted:

private def postMessage(
    endpointName: String,
    message: InboxMessage,
    callbackIfStopped: (Exception) => Unit): Unit = {
  val error = synchronized {
    val data = endpoints.get(endpointName)
    if (stopped) {
      Some(new RpcEnvStoppedException())
    } else if (data == null) {
      Some(new SparkException(s"Could not find $endpointName."))
    } else {
      data.inbox.post(message)
      receivers.offer(data)
      None
    }
  }
  // We don't need to call `onStop` in the `synchronized` block
  error.foreach(callbackIfStopped)
}

postMessage first looks up the EndpointData by the name of the target RpcEndpoint. It then puts the message into that endpoint's inbox and offers the EndpointData to receivers, the queue of endpoints that have messages waiting to be processed.
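
The two-step hand-off (post to the inbox, then signal through receivers) can be modelled without threads. The sketch below is an assumption-laden toy: MiniDispatcher and MiniInbox are hypothetical names, errors are returned as an Option rather than passed to a callback, and processOne plays the role of one iteration of a message loop.

```scala
import java.util.concurrent.{ConcurrentHashMap, LinkedBlockingQueue}
import scala.collection.mutable

object MiniDispatcherDemo {
  // Hypothetical inbox: a per-endpoint queue of pending messages.
  final class MiniInbox(val name: String) {
    private val messages = mutable.Queue.empty[Any]
    def post(message: Any): Unit = synchronized { messages.enqueue(message) }
    def poll(): Option[Any] = synchronized {
      if (messages.isEmpty) None else Some(messages.dequeue())
    }
  }

  final class MiniDispatcher {
    private val endpoints = new ConcurrentHashMap[String, MiniInbox]()
    // Inboxes that may contain messages, in the order they were signalled.
    private val receivers = new LinkedBlockingQueue[MiniInbox]()

    def register(name: String): Unit = {
      if (endpoints.putIfAbsent(name, new MiniInbox(name)) != null) {
        throw new IllegalArgumentException(s"There is already an endpoint called $name")
      }
    }

    // Returns an error instead of throwing when the endpoint is unknown.
    def postMessage(name: String, message: Any): Option[Exception] = {
      val inbox = endpoints.get(name)
      if (inbox == null) {
        Some(new IllegalStateException(s"Could not find $name."))
      } else {
        inbox.post(message)    // 1. put the message in the endpoint's inbox
        receivers.offer(inbox) // 2. signal that this inbox has work
        None
      }
    }

    // One iteration of a message loop: take a signalled inbox, handle one message.
    def processOne(): Option[(String, Any)] = {
      val inbox = receivers.poll() // non-blocking here; the real loop uses take()
      if (inbox == null) None else inbox.poll().map(m => (inbox.name, m))
    }
  }

  def demo(): (Option[(String, Any)], Boolean) = {
    val dispatcher = new MiniDispatcher
    dispatcher.register("Master")
    dispatcher.postMessage("Master", "BoundPortsRequest")
    val missing = dispatcher.postMessage("Nobody", "hello")
    (dispatcher.processOne(), missing.isDefined)
  }
}
```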

A Dispatcher thread then picks the message up and processes it:

def process(dispatcher: Dispatcher): Unit = {
  var message: InboxMessage = null
  inbox.synchronized {
    if (!enableConcurrent && numActiveThreads != 0) {
      return
    }
    message = messages.poll()
    if (message != null) {
      numActiveThreads += 1
    } else {
      return
    }
  }
  while (true) {
    safelyCall(endpoint) {
      message match {
        case RpcMessage(_sender, content, context) =>
          try {
            endpoint.receiveAndReply(context).applyOrElse[Any, Unit](content, { msg =>
              throw new SparkException(s"Unsupported message $message from ${_sender}")
            })
          } catch {
            case NonFatal(e) =>
              context.sendFailure(e)
              // Throw the exception -- this exception will be caught by the safelyCall function.
              // The endpoint's onError function will be called.
              throw e
          }
        // ...

Here the endpoint is the Master, so the Master's receiveAndReply is invoked:

override def receiveAndReply(context: RpcCallContext): PartialFunction[Any, Unit] = {
  // other cases omitted
  case BoundPortsRequest =>
    context.reply(BoundPortsResponse(address.port, webUi.boundPort, restServerBoundPort))
}

private[netty] abstract class NettyRpcCallContext(override val senderAddress: RpcAddress)
  extends RpcCallContext with Logging {

  protected def send(message: Any): Unit

  override def reply(response: Any): Unit = {
    send(response)
  }

  override def sendFailure(e: Throwable): Unit = {
    send(RpcFailure(e))
  }

}

// In LocalNettyRpcCallContext, sending simply completes the caller's promise:
override protected def send(message: Any): Unit = {
  p.success(message)
}
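
Put together, the local reply path is: the endpoint's partial function calls context.reply, reply calls send, and send completes the promise the caller is waiting on. A self-contained sketch of that chain follows; ToyCallContext, LocalToyCallContext, and ToyFailure are hypothetical counterparts of the Spark classes, and the ("rpc", 7077) reply is made-up sample data.

```scala
import scala.concurrent.Promise
import scala.util.Try

object LocalReplyDemo {
  // Hypothetical counterpart of RpcFailure.
  final case class ToyFailure(e: Throwable)

  // Hypothetical counterpart of NettyRpcCallContext: reply and sendFailure share one send().
  abstract class ToyCallContext {
    protected def send(message: Any): Unit
    def reply(response: Any): Unit = send(response)
    def sendFailure(e: Throwable): Unit = send(ToyFailure(e))
  }

  // Local variant: "sending" just completes the caller's promise.
  final class LocalToyCallContext(p: Promise[Any]) extends ToyCallContext {
    override protected def send(message: Any): Unit = p.success(message)
  }

  // Endpoint-side handler, shaped like receiveAndReply: a PartialFunction over messages.
  def receiveAndReply(context: ToyCallContext): PartialFunction[Any, Unit] = {
    case "BoundPortsRequest" => context.reply(("rpc", 7077))
  }

  // Sends one message and returns the state of the caller's promise afterwards.
  def handle(message: Any): Option[Try[Any]] = {
    val p = Promise[Any]()
    val context = new LocalToyCallContext(p)
    receiveAndReply(context).applyOrElse[Any, Unit](message, { msg =>
      context.sendFailure(new IllegalArgumentException(s"Unsupported message $msg"))
    })
    p.future.value
  }
}
```

Because a failure is also delivered through send, the caller always receives exactly one completion and can tell the two cases apart by matching on the failure wrapper, as onSuccess does with RpcFailure.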

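1.3 Summary

After the analysis above, you should have a picture of how the Spark Master starts up, along with a first look at how Spark RPC works underneath.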