A Spark cluster involves communication between distributed nodes; for example, when a Worker starts up it registers itself with the Master, which requires transmitting a message. This section traces that transmission. Worker startup begins in its main method; during initialization the onStart callback invokes registerWithMaster, which after a chain of calls reaches org.apache.spark.deploy.worker.Worker#registerWithMaster, where the registration message is actually sent to the Master:
/**
 * Registers this worker with the master.
 *
 * @param masterEndpoint the RpcEndpointRef of the master
 */
private def registerWithMaster(masterEndpoint: RpcEndpointRef): Unit = {
  // RegisterWorker is the message sent to the Master
  masterEndpoint.ask[RegisterWorkerResponse](RegisterWorker(
      workerId, host, port, self, cores, memory, workerWebUiUrl))
    .onComplete {
      // This is a very fast action so we can use "ThreadUtils.sameThread"
      case Success(msg) =>
        Utils.tryLogNonFatalError {
          handleRegisterResponse(msg)
        }
      case Failure(e) =>
        logError(s"Cannot register with master: ${masterEndpoint.address}", e)
        System.exit(1)
    }(ThreadUtils.sameThread)
}
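The ask-and-callback flow above can be sketched in miniature. The sketch below is illustrative only: AskDemo and its methods are hypothetical stand-ins, not Spark APIs. `ask` plays the role of masterEndpoint.ask, and the whenComplete handler mirrors the Success/Failure cases of onComplete: the caller sends a request and registers completion handlers instead of blocking on the reply.

```java
import java.util.concurrent.CompletableFuture;

// Minimal sketch (hypothetical names, not Spark code) of the
// ask-and-handle-response-asynchronously pattern.
public class AskDemo {
    // Stand-in for the remote endpoint: replies asynchronously.
    static CompletableFuture<String> ask(String message) {
        return CompletableFuture.supplyAsync(() -> "registered:" + message);
    }

    static String register(String workerId) {
        StringBuilder outcome = new StringBuilder();
        ask(workerId)
            .whenComplete((reply, error) -> {
                if (error == null) {
                    outcome.append(reply);    // cf. handleRegisterResponse(msg)
                } else {
                    outcome.append("failed"); // cf. logError + System.exit
                }
            })
            .join(); // block only so this demo has a deterministic result
        return outcome.toString();
    }
}
```

The real code never blocks: the registration continues on the callback thread, which is why Spark can note that the handler is cheap enough to run on "ThreadUtils.sameThread".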
The single-argument ask overload on RpcEndpointRef simply applies the default timeout:

def ask[T: ClassTag](message: Any): Future[T] = ask(message, defaultAskTimeout)

which delegates to org.apache.spark.rpc.netty.NettyRpcEndpointRef#ask:
override def ask[T: ClassTag](message: Any, timeout: RpcTimeout): Future[T] = {
  // `this` is the sender's NettyRpcEndpointRef, included so the receiver
  // can easily identify who sent the message
  nettyEnv.ask(RequestMessage(nettyEnv.address, this, message), timeout)
}
which in turn calls org.apache.spark.rpc.netty.NettyRpcEnv#ask:
private[netty] def ask[T: ClassTag](message: RequestMessage, timeout: RpcTimeout): Future[T] = {
  // ... (code omitted: `promise`, `remoteAddr`, `onSuccess` and `onFailure`
  //      are defined here)
  try {
    if (remoteAddr == address) {
      val p = Promise[Any]()
      p.future.onComplete {
        case Success(response) => onSuccess(response)
        case Failure(e) => onFailure(e)
      }(ThreadUtils.sameThread)
      // local message: dispatched in-process, no network involved
      dispatcher.postLocalMessage(message, p)
    } else {
      // message for a remote node: serialize it and hand it to the outbox
      val rpcMessage = RpcOutboxMessage(serialize(message),
        onFailure,
        (client, response) => onSuccess(deserialize[Any](client, response)))
      // send the message
      postToOutbox(message.receiver, rpcMessage)
      promise.future.onFailure {
        case _: TimeoutException => rpcMessage.onTimeout()
        case _ =>
      }(ThreadUtils.sameThread)
    }
  // ... (code omitted: timeout scheduling and error handling)
}
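The branch above boils down to: same address means deliver in-process, anything else means serialize and queue for the network. Here is a toy model of that decision; DispatchDemo, LOCAL_ADDRESS, and the result strings are invented for illustration and are not Spark code.

```java
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.concurrent.CompletableFuture;

// Toy model of NettyRpcEnv#ask's branch: a message addressed to the local
// endpoint completes a promise directly, while a remote message is turned
// into bytes and handed to an outbox. All names here are invented.
public class DispatchDemo {
    static final String LOCAL_ADDRESS = "node-a:7077";
    static final Queue<byte[]> outbox = new ArrayDeque<>(); // stand-in for Outbox

    static CompletableFuture<String> ask(String targetAddress, String message) {
        CompletableFuture<String> promise = new CompletableFuture<>();
        if (targetAddress.equals(LOCAL_ADDRESS)) {
            promise.complete("local:" + message); // cf. dispatcher.postLocalMessage
        } else {
            outbox.add(message.getBytes());       // cf. postToOutbox(serialize(message))
            promise.complete("queued:" + message);
        }
        return promise;
    }
}
```

Keeping both paths behind the same Future-returning signature is the design point: callers of ask never need to know whether the endpoint is local or remote.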
Next, org.apache.spark.rpc.netty.NettyRpcEnv#postToOutbox is called:
private def postToOutbox(receiver: NettyRpcEndpointRef, message: OutboxMessage): Unit = {
  if (receiver.client != null) {
    // the receiver's ref already holds a TransportClient; send through it directly
    message.sendWith(receiver.client)
  } else {
    require(receiver.address != null,
      "Cannot send message to client endpoint with no listen address.")
    val targetOutbox = {
      // `outboxes` maps each remote RpcAddress to its Outbox
      val outbox = outboxes.get(receiver.address)
      if (outbox == null) {
        val newOutbox = new Outbox(this, receiver.address)
        val oldOutbox = outboxes.putIfAbsent(receiver.address, newOutbox)
        if (oldOutbox == null) {
          newOutbox
        } else {
          oldOutbox
        }
      } else {
        outbox
      }
    }
    if (stopped.get) {
      // It's possible that we put `targetOutbox` after stopping. So we need to clean it.
      outboxes.remove(receiver.address)
      targetOutbox.stop()
    } else {
      targetOutbox.send(message)
    }
  }
}
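The get/putIfAbsent dance above is a standard lock-free create-if-missing idiom over a ConcurrentHashMap: if two threads race to create an Outbox for the same address, putIfAbsent returns the instance that won the race, and the loser's freshly built object is discarded. A minimal sketch of the same idiom, with StringBuilder standing in for Outbox:

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Sketch of the putIfAbsent idiom in postToOutbox: create-if-missing
// without locking, keeping whichever instance won the race.
public class OutboxRegistry {
    static final ConcurrentMap<String, StringBuilder> outboxes = new ConcurrentHashMap<>();

    static StringBuilder getOrCreate(String address) {
        StringBuilder outbox = outboxes.get(address);
        if (outbox == null) {
            StringBuilder newOutbox = new StringBuilder(address);
            StringBuilder old = outboxes.putIfAbsent(address, newOutbox);
            // if another thread inserted first, use its instance instead
            return (old == null) ? newOutbox : old;
        }
        return outbox;
    }
}
```

All callers are guaranteed to see the same Outbox instance per address, which is what makes the per-address message queueing safe.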
Next, org.apache.spark.rpc.netty.OutboxMessage#sendWith is invoked, which transmits the message through org.apache.spark.network.client.TransportClient. An RpcOutboxMessage, which expects a reply, goes through TransportClient#sendRpc with a response callback; a one-way message uses TransportClient#send, whose implementation is:
public void send(ByteBuffer message) {
  channel.writeAndFlush(new OneWayMessage(new NioManagedBuffer(message)));
}
At this point the message has been handed to Netty and the send is complete. Browsing TransportClient reveals further data-transfer methods (sendRpc, fetchChunk, and so on); TransportClient acts as the client side of the communication.
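For a rough picture of what writeAndFlush eventually puts on the wire, here is a toy framing step: a type tag, a body length, then the body. This is illustrative only; Spark's real wire format is implemented by the encoders in the network-common module, and the tag value below is made up for the demo.

```java
import java.nio.ByteBuffer;

// Toy framing step in the spirit of wrapping a payload in a OneWayMessage
// before writing it to the channel. Illustrative only: the layout and the
// type-tag value are invented, not Spark's actual encoding.
public class FramingDemo {
    static final byte ONE_WAY = 9; // hypothetical type tag

    static ByteBuffer frame(byte[] payload) {
        ByteBuffer buf = ByteBuffer.allocate(1 + 4 + payload.length);
        buf.put(ONE_WAY);           // message type
        buf.putInt(payload.length); // body length
        buf.put(payload);           // body
        buf.flip();                 // make the frame readable from the start
        return buf;
    }
}
```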