After Spark 2.0, the master and worker no longer communicate via Akka at all; the RPC layer is implemented on top of Netty instead. Setting everything else aside, Netty's popularity and maturity alone make it well suited to carry this responsibility. Now to the main topic.
Using the master code as the example, this article explains the relationships among RpcEnv, RpcEndpoint, RpcEndpointRef, NettyRpcEnv, and NettyRpcEndpointRef.
First, the class diagram of the org.apache.spark.rpc package:
As the class diagram shows, the Spark Master and Worker both extend RpcEndpoint; that is, the master and worker communicate over the RPC protocol and are the subjects of RPC. Before an RpcEndpoint can be used, an RpcEnv must be initialized. RpcEnv is the environment in which RPC messages are exchanged: an RpcEndpoint is registered with the RpcEnv under a String name, and each RpcEndpoint has a one-to-one RpcEndpointRef, the handle through which remote peers address and reply to that RpcEndpoint.
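To make these relationships concrete, here is a minimal sketch assuming the Spark 2.x APIs discussed below; EchoEndpoint and the Echo message are invented for illustration and are not Spark classes:

import org.apache.spark.{SecurityManager, SparkConf}
import org.apache.spark.rpc.{RpcCallContext, RpcEndpoint, RpcEndpointRef, RpcEnv}

// Hypothetical message and endpoint, for illustration only.
case class Echo(text: String)

class EchoEndpoint(override val rpcEnv: RpcEnv) extends RpcEndpoint {
  // Requests arrive here; the reply travels back through the caller's ref.
  override def receiveAndReply(context: RpcCallContext): PartialFunction[Any, Unit] = {
    case Echo(text) => context.reply(text.toUpperCase)
  }
}

val conf = new SparkConf()
val env = RpcEnv.create("demo", "localhost", 52345, conf, new SecurityManager(conf))
// Register under a String name; the returned ref is the endpoint's 1:1 remote handle.
val ref: RpcEndpointRef = env.setupEndpoint("echo", new EchoEndpoint(env))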
Below, the master code is used to walk through RpcEnv, RpcEndpoint, and RpcEndpointRef.
1.1 RpcEnv
The entry point where the master creates its RpcEnv:
val rpcEnv = RpcEnv.create(SYSTEM_NAME, host, port, conf, securityMgr)
Next, an RpcEnvConfig instance is built, and the NettyRpcEnvFactory factory creates the RpcEnv from it:
val config = RpcEnvConfig(conf, name, bindAddress, advertiseAddress, port,
  securityManager, clientMode)
new NettyRpcEnvFactory().create(config)
Then a serializer instance is initialized; since RpcEnv ships objects across the network, a single, consistent serialization/deserialization mechanism is needed:
val javaSerializerInstance =
  new JavaSerializer(sparkConf).newInstance().asInstanceOf[JavaSerializerInstance]
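To illustrate what this instance does, a quick round-trip sketch using the SerializerInstance API (the payload value is made up):

// An RPC payload is serialized to a ByteBuffer for the wire, then restored on receipt.
val payload = "hello master"
val bytes: java.nio.ByteBuffer = javaSerializerInstance.serialize(payload)
val restored = javaSerializerInstance.deserialize[String](bytes)
assert(restored == payload)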
The NettyRpcEnv itself is then constructed:

val nettyEnv =
  new NettyRpcEnv(sparkConf, javaSerializerInstance, config.advertiseAddress,
    config.securityManager)
The NettyRpcEnv constructor initializes the dispatcher:
private val dispatcher: Dispatcher = new Dispatcher(this)
The Dispatcher maintains:
1. the message inbox of each registered RpcEndpoint (wrapped together with its NettyRpcEndpointRef in an EndpointData);
2. the mapping between each RpcEndpoint and its NettyRpcEndpointRef;
3. a queue of endpoints whose inboxes may contain pending messages.
private val endpoints: ConcurrentMap[String, EndpointData] =
  new ConcurrentHashMap[String, EndpointData]
private val endpointRefs: ConcurrentMap[RpcEndpoint, RpcEndpointRef] =
  new ConcurrentHashMap[RpcEndpoint, RpcEndpointRef]

// Track the receivers whose inboxes may contain messages.
private val receivers = new LinkedBlockingQueue[EndpointData]
The Dispatcher also creates a thread pool for receiving and dispatching messages; the pool size defaults to max(2, number of available processors):
/** Thread pool used for dispatching messages. */
private val threadpool: ThreadPoolExecutor = {
  val numThreads = nettyEnv.conf.getInt("spark.rpc.netty.dispatcher.numThreads",
    math.max(2, Runtime.getRuntime.availableProcessors()))
  val pool = ThreadUtils.newDaemonFixedThreadPool(numThreads, "dispatcher-event-loop")
  for (i <- 0 until numThreads) {
    pool.execute(new MessageLoop)
  }
  pool
}
/** Message loop used for dispatching messages. */
private class MessageLoop extends Runnable {
  override def run(): Unit = {
    try {
      while (true) {
        try {
          val data = receivers.take()
          if (data == PoisonPill) {
            // Put PoisonPill back so that other MessageLoops can see it.
            receivers.offer(PoisonPill)
            return
          }
          data.inbox.process(Dispatcher.this)
        } catch {
          case NonFatal(e) => logError(e.getMessage, e)
        }
      }
    } catch {
      case ie: InterruptedException => // exit
    }
  }
}
If receivers is empty, the thread blocks on val data = receivers.take(). As soon as receivers becomes non-empty, processing resumes: the loop first checks whether the element is the PoisonPill (the "suicide" message); if not, it calls the inbox's process method to handle the message.
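The PoisonPill pattern is worth isolating. The following self-contained sketch (the names are mine, not Spark's) shows how a single sentinel shuts down every loop thread sharing one blocking queue:

import java.util.concurrent.LinkedBlockingQueue

object PoisonPillDemo {
  private val queue = new LinkedBlockingQueue[String]()
  private val PoisonPill = "POISON_PILL"

  private def loop(id: Int): Unit = {
    var running = true
    while (running) {
      val item = queue.take()       // blocks while the queue is empty
      if (item == PoisonPill) {
        queue.offer(PoisonPill)     // re-offer so sibling loops also see it and exit
        running = false
      } else {
        println(s"loop-$id processed $item")
      }
    }
  }

  def main(args: Array[String]): Unit = {
    val threads = (1 to 2).map(i => new Thread(() => loop(i)))
    threads.foreach(_.start())
    Seq("a", "b", "c").foreach(queue.put)
    queue.put(PoisonPill)           // one pill eventually stops every loop
    threads.foreach(_.join())
  }
}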
Next, a sentinel EndpointData instance is created:
/** A poison endpoint that indicates MessageLoop should exit its message loop. */
private val PoisonPill = new EndpointData(null, null, null)

private class EndpointData(
    val name: String,
    val endpoint: RpcEndpoint,
    val ref: NettyRpcEndpointRef) {
  val inbox = new Inbox(ref, endpoint)
}
When the Inbox is constructed, it adds an initial OnStart message to its message queue:
inbox.synchronized {
  messages.add(OnStart)
}
private val streamManager = new NettyStreamManager(this)
streamManager is used to transfer jars, files, and directories over the network.
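For instance, a hedged sketch against the RpcEnvFileServer interface that streamManager backs (the file path is made up): the master can publish a jar and hand out the resulting spark:// URI.

// Publish a local jar through the RPC file server; callers fetch it over the RPC port.
val jarUri: String = rpcEnv.fileServer.addJar(new java.io.File("/tmp/app.jar"))
// jarUri is a spark://host:port/... URI that remote ends can stream from.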
Start the service listening on the configured port:
Utils.startServiceOnPort(config.port, startNettyRpcEnv, sparkConf, config.name)._1

val startNettyRpcEnv: Int => (NettyRpcEnv, Int) = { actualPort =>
  nettyEnv.startServer(config.bindAddress, actualPort)
  (nettyEnv, nettyEnv.address.port)
}
After the server starts, an RpcEndpoint is registered with the dispatcher:
server = transportContext.createServer(bindAddress, port, bootstraps)
dispatcher.registerRpcEndpoint(
  RpcEndpointVerifier.NAME, new RpcEndpointVerifier(this, dispatcher))
Name: endpoint-verifier
Object: RpcEndpointVerifier
val addr = RpcEndpointAddress(nettyEnv.address, name)
val endpointRef = new NettyRpcEndpointRef(nettyEnv.conf, addr, nettyEnv)

Here, for example: endpointRef = NettyRpcEndpointRef(spark://endpoint-verifier@blackeye.com:7077)
if (endpoints.putIfAbsent(name, new EndpointData(name, endpoint, endpointRef)) != null) {
  throw new IllegalArgumentException(s"There is already an RpcEndpoint called $name")
}
val data = endpoints.get(name)
endpointRefs.put(data.endpoint, data.ref)
receivers.offer(data)  // for the OnStart message
Once receivers.offer(data) has executed, receivers is non-empty, so one of the message-loop threads wakes up and processes the OnStart message:
case OnStart =>
  endpoint.onStart()
  if (!endpoint.isInstanceOf[ThreadSafeRpcEndpoint]) {
    inbox.synchronized {
      if (!stopped) {
        enableConcurrent = true
      }
    }
  }
This in turn invokes Master's onStart method:
override def onStart(): Unit = {
  logInfo("Starting Spark master at " + masterUrl)
  logInfo(s"Running Spark version ${org.apache.spark.SPARK_VERSION}")
  webUi = new MasterWebUI(this, webUiPort)
  webUi.bind()
  masterWebUiUrl = "http://" + masterPublicAddress + ":" + webUi.boundPort
  if (reverseProxy) {
    masterWebUiUrl = conf.get("spark.ui.reverseProxyUrl", masterWebUiUrl)
    logInfo(s"Spark Master is acting as a reverse proxy. Master, Workers and " +
      s"Applications UIs are available at $masterWebUiUrl")
  }
  checkForWorkerTimeOutTask = forwardMessageThread.scheduleAtFixedRate(new Runnable {
    override def run(): Unit = Utils.tryLogNonFatalError {
      self.send(CheckForWorkerTimeOut)
    }
  }, 0, WORKER_TIMEOUT_MS, TimeUnit.MILLISECONDS)

  if (restServerEnabled) {
    val port = conf.getInt("spark.master.rest.port", 6066)
    restServer = Some(new StandaloneRestServer(address.host, port, conf, self, masterUrl))
  }
  restServerBoundPort = restServer.map(_.start())

  masterMetricsSystem.registerSource(masterSource)
  masterMetricsSystem.start()
  applicationMetricsSystem.start()
  // Attach the master and app metrics servlet handler to the web ui after the metrics systems are
  // started.
  masterMetricsSystem.getServletHandlers.foreach(webUi.attachHandler)
  applicationMetricsSystem.getServletHandlers.foreach(webUi.attachHandler)

  val serializer = new JavaSerializer(conf)
  val (persistenceEngine_, leaderElectionAgent_) = RECOVERY_MODE match {
    case "ZOOKEEPER" =>
      logInfo("Persisting recovery state to ZooKeeper")
      val zkFactory =
        new ZooKeeperRecoveryModeFactory(conf, serializer)
      (zkFactory.createPersistenceEngine(), zkFactory.createLeaderElectionAgent(this))
    case "FILESYSTEM" =>
      val fsFactory =
        new FileSystemRecoveryModeFactory(conf, serializer)
      (fsFactory.createPersistenceEngine(), fsFactory.createLeaderElectionAgent(this))
    case "CUSTOM" =>
      val clazz = Utils.classForName(conf.get("spark.deploy.recoveryMode.factory"))
      val factory = clazz.getConstructor(classOf[SparkConf], classOf[Serializer])
        .newInstance(conf, serializer)
        .asInstanceOf[StandaloneRecoveryModeFactory]
      (factory.createPersistenceEngine(), factory.createLeaderElectionAgent(this))
    case _ =>
      (new BlackHolePersistenceEngine(), new MonarchyLeaderAgent(this))
  }
  persistenceEngine = persistenceEngine_
  leaderElectionAgent = leaderElectionAgent_
}
webUi.bind() binds port 8080 and starts the web server.
A CheckForWorkerTimeOut message is sent to the master on a fixed schedule.
If restServerEnabled is set, restServer.map(_.start()) starts the REST service listening on port 6066.
RECOVERY_MODE defaults to "NONE" and can be configured as "ZOOKEEPER", "FILESYSTEM", or "CUSTOM":
private val RECOVERY_MODE = conf.get("spark.deploy.recoveryMode", "NONE")
1.2 RpcEndpoint
The rpcEnv created in 1.1 is now used to create the RpcEndpoint:
val masterEndpoint = rpcEnv.setupEndpoint(ENDPOINT_NAME,
  new Master(rpcEnv, rpcEnv.address, webUiPort, securityMgr, conf))
Name: Master
Object: Master

Creating the endpoint:
override def setupEndpoint(name: String, endpoint: RpcEndpoint): RpcEndpointRef = {
  dispatcher.registerRpcEndpoint(name, endpoint)
}
def registerRpcEndpoint(name: String, endpoint: RpcEndpoint): NettyRpcEndpointRef = {
  val addr = RpcEndpointAddress(nettyEnv.address, name)
  val endpointRef = new NettyRpcEndpointRef(nettyEnv.conf, addr, nettyEnv)
  synchronized {
    if (stopped) {
      throw new IllegalStateException("RpcEnv has been stopped")
    }
    if (endpoints.putIfAbsent(name, new EndpointData(name, endpoint, endpointRef)) != null) {
      throw new IllegalArgumentException(s"There is already an RpcEndpoint called $name")
    }
    val data = endpoints.get(name)
    endpointRefs.put(data.endpoint, data.ref)
    receivers.offer(data)  // for the OnStart message
  }
  endpointRef
}
A NettyRpcEndpointRef instance is constructed and registered in endpoints and endpointRefs, the EndpointData is placed on receivers, and the NettyRpcEndpointRef instance is returned to the caller.
val portsResponse = masterEndpoint.askWithRetry[BoundPortsResponse](BoundPortsRequest)
The message is sent through the NettyRpcEndpointRef:
def askWithRetry[T: ClassTag](message: Any, timeout: RpcTimeout): T = {
  // TODO: Consider removing multiple attempts
  var attempts = 0
  var lastException: Exception = null
  while (attempts < maxRetries) {
    attempts += 1
    try {
      val future = ask[T](message, timeout)
      val result = timeout.awaitResult(future)
      if (result == null) {
        throw new SparkException("RpcEndpoint returned null")
      }
      return result
    } catch {
      case ie: InterruptedException => throw ie
      case e: Exception =>
        lastException = e
        logWarning(s"Error sending message [message = $message] in $attempts attempts", e)
    }
    if (attempts < maxRetries) {
      Thread.sleep(retryWaitMs)
    }
  }

  throw new SparkException(
    s"Error sending message [message = $message]", lastException)
}
The message is sent and the call waits for the result; if the retry cap has not been reached, the thread sleeps for retryWaitMs and tries again (maxRetries comes from spark.rpc.numRetries, default 3; retryWaitMs from spark.rpc.retry.wait, default 3s).
override def ask[T: ClassTag](message: Any, timeout: RpcTimeout): Future[T] = {
  nettyEnv.ask(RequestMessage(nettyEnv.address, this, message), timeout)
}
private[netty] def ask[T: ClassTag](message: RequestMessage, timeout: RpcTimeout): Future[T] = {
  val promise = Promise[Any]()
  val remoteAddr = message.receiver.address

  def onFailure(e: Throwable): Unit = {
    if (!promise.tryFailure(e)) {
      logWarning(s"Ignored failure: $e")
    }
  }

  def onSuccess(reply: Any): Unit = reply match {
    case RpcFailure(e) => onFailure(e)
    case rpcReply =>
      if (!promise.trySuccess(rpcReply)) {
        logWarning(s"Ignored message: $reply")
      }
  }

  try {
    if (remoteAddr == address) {
      val p = Promise[Any]()
      p.future.onComplete {
        case Success(response) => onSuccess(response)
        case Failure(e) => onFailure(e)
      }(ThreadUtils.sameThread)
      dispatcher.postLocalMessage(message, p)
    } else {
      val rpcMessage = RpcOutboxMessage(serialize(message),
        onFailure,
        (client, response) => onSuccess(deserialize[Any](client, response)))
      postToOutbox(message.receiver, rpcMessage)
      promise.future.onFailure {
        case _: TimeoutException => rpcMessage.onTimeout()
        case _ =>
      }(ThreadUtils.sameThread)
    }

    val timeoutCancelable = timeoutScheduler.schedule(new Runnable {
      override def run(): Unit = {
        onFailure(new TimeoutException(s"Cannot receive any reply in ${timeout.duration}"))
      }
    }, timeout.duration.toNanos, TimeUnit.NANOSECONDS)
    promise.future.onComplete { v =>
      timeoutCancelable.cancel(true)
    }(ThreadUtils.sameThread)
  } catch {
    case NonFatal(e) =>
      onFailure(e)
  }
  promise.future.mapTo[T].recover(timeout.addMessageIfTimeout)(ThreadUtils.sameThread)
}
If remoteAddr == address, the message is delivered locally; otherwise it is handed to an Outbox for network delivery.
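The whole ask path hinges on Scala's Promise/Future pair; here is a standalone sketch of the same try-complete pattern (the values are made up):

import scala.concurrent.{Promise, TimeoutException}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.util.{Failure, Success}

val promise = Promise[String]()

// Registered once; fires for whichever completion wins the race.
promise.future.onComplete {
  case Success(reply) => println(s"got reply: $reply")
  case Failure(e)     => println(s"failed: $e")
}

promise.trySuccess("pong")                        // the reply arrives first...
promise.tryFailure(new TimeoutException("late"))  // ...so the timeout is ignored, as in ask

The local path goes through Dispatcher.postLocalMessage: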
def postLocalMessage(message: RequestMessage, p: Promise[Any]): Unit = {
  val rpcCallContext =
    new LocalNettyRpcCallContext(message.senderAddress, p)
  val rpcMessage = RpcMessage(message.senderAddress, message.content, rpcCallContext)
  postMessage(message.receiver.name, rpcMessage, (e) => p.tryFailure(e))
}
An RpcCallContext and an RpcMessage are initialized, and the message is then posted:
private def postMessage(
    endpointName: String,
    message: InboxMessage,
    callbackIfStopped: (Exception) => Unit): Unit = {
  val error = synchronized {
    val data = endpoints.get(endpointName)
    if (stopped) {
      Some(new RpcEnvStoppedException())
    } else if (data == null) {
      Some(new SparkException(s"Could not find $endpointName."))
    } else {
      data.inbox.post(message)
      receivers.offer(data)
      None
    }
  }
  // We don't need to call `onStop` in the `synchronized` block
  error.foreach(callbackIfStopped)
}
The name of the target RpcEndpoint is used to look up its EndpointData; the message is then put into that endpoint's inbox, and the endpoint is placed on the queue of endpoints with pending messages.
Immediately afterwards, a Dispatcher message-loop thread processes the message:
def process(dispatcher: Dispatcher): Unit = {
  var message: InboxMessage = null
  inbox.synchronized {
    if (!enableConcurrent && numActiveThreads != 0) {
      return
    }
    message = messages.poll()
    if (message != null) {
      numActiveThreads += 1
    } else {
      return
    }
  }
  while (true) {
    safelyCall(endpoint) {
      message match {
        case RpcMessage(_sender, content, context) =>
          try {
            endpoint.receiveAndReply(context).applyOrElse[Any, Unit](content, { msg =>
              throw new SparkException(s"Unsupported message $message from ${_sender}")
            })
          } catch {
            case NonFatal(e) =>
              context.sendFailure(e)
              // Throw the exception -- this exception will be caught by the safelyCall function.
              // The endpoint's onError function will be called.
              throw e
          }
        // ... cases for OneWayMessage, OnStart, OnStop, etc. omitted here
      }
    }
    // ... per-iteration bookkeeping (fetch next message or exit) omitted here
  }
}
Here the endpoint is the Master, so the call lands in Master's receiveAndReply:
override def receiveAndReply(context: RpcCallContext): PartialFunction[Any, Unit] = {
  // ... other cases omitted
  case BoundPortsRequest =>
    context.reply(BoundPortsResponse(address.port, webUi.boundPort, restServerBoundPort))
}
private[netty] abstract class NettyRpcCallContext(override val senderAddress: RpcAddress)
  extends RpcCallContext with Logging {

  protected def send(message: Any): Unit

  override def reply(response: Any): Unit = {
    send(response)
  }

  override def sendFailure(e: Throwable): Unit = {
    send(RpcFailure(e))
  }
}
override protected def send(message: Any): Unit = {
  p.success(message)
}
1.3 Summary
The analysis above should give you a clear picture of the Spark Master startup process, along with a glimpse into how Spark RPC works.