Spark 2.0.2 Source Code Analysis: RPC Communication Mechanism (Message Handling)

myllxy

Category: Spark Source Code Analysis

Article link: https://blog.csdn.net/qq_39327985/article/details/87633248

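RPC (remote procedure call) lets a process on one node invoke a procedure on another node; in Spark it is how two nodes exchange messages.
Every component has its own execution environment, and for RPC that environment is RpcEnv. In Spark 2.x the Netty-based RpcEnv replaces the Akka used in earlier versions.
RPC startup begins with SparkEnv.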

1. Creating SparkEnv

Every node starts a SparkEnv. On the driver, creating a SparkContext creates the SparkEnv:
the try/catch block in SparkContext contains _env = createSparkEnv(_conf, isLocal, listenerBus).

2. Creating RpcEnv

In SparkEnv#create:

val rpcEnv = RpcEnv.create(systemName, hostname, port, conf, securityManager, clientMode = !isDriver)

NettyRpcEnvFactory#create then instantiates a NettyRpcEnv.
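The create-through-a-factory flow can be pictured with a toy model. This is only a sketch: every Toy* name is hypothetical, the parameter list is trimmed down, and the rule that a client-mode env has no listening address is an assumption made for the example rather than something quoted from the Spark source.

```scala
object RpcEnvFactorySketch {
  // Hypothetical, trimmed-down counterpart of the arguments passed to RpcEnv.create.
  final case class ToyRpcEnvConfig(name: String, host: String, port: Int, clientMode: Boolean)

  trait ToyRpcEnv {
    // Some("host:port") when this env listens for connections, None in client mode
    // (an assumption of this sketch).
    def address: Option[String]
  }

  trait ToyRpcEnvFactory {
    def create(config: ToyRpcEnvConfig): ToyRpcEnv
  }

  final class ToyNettyRpcEnv(config: ToyRpcEnvConfig) extends ToyRpcEnv {
    override val address: Option[String] =
      if (config.clientMode) None else Some(s"${config.host}:${config.port}")
  }

  final class ToyNettyRpcEnvFactory extends ToyRpcEnvFactory {
    override def create(config: ToyRpcEnvConfig): ToyRpcEnv = new ToyNettyRpcEnv(config)
  }

  // Entry point: callers go through the factory and never name the concrete env class.
  def create(name: String, host: String, port: Int, clientMode: Boolean): ToyRpcEnv =
    new ToyNettyRpcEnvFactory().create(ToyRpcEnvConfig(name, host, port, clientMode))
}
```

The point of the indirection is that callers depend only on the abstract env type, so the transport behind it can be swapped without touching them.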

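3. Creating RpcEndpoint

RpcEndpoint is the abstraction for a component that handles RPC requests and takes the place of the Akka actor. It declares the message-handling and lifecycle callbacks, for example receive, receiveAndReply, onStart, onStop, onConnected, onDisconnected and onNetworkError.
[Figure: RpcEndpoint inheritance diagram]

The RPC framework works with the ThreadSafeRpcEndpoint sub-trait, whose messages are processed by one thread at a time.
[Figure: ThreadSafeRpcEndpoint inheritance diagram]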
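To make the callback style concrete, here is a minimal sketch of an endpoint whose receive is a PartialFunction, delivered to with applyOrElse in the same way Inbox#process does later in this article. ToyEndpoint, CounterEndpoint and deliver are hypothetical names for illustration, not Spark classes.

```scala
object EndpointSketch {
  // Hypothetical, reduced endpoint: one message handler plus two lifecycle hooks.
  trait ToyEndpoint {
    def receive: PartialFunction[Any, Unit] = PartialFunction.empty
    def onStart(): Unit = ()
    def onStop(): Unit = ()
  }

  // Example endpoint that only understands Int messages.
  final class CounterEndpoint extends ToyEndpoint {
    var total = 0
    var started = false
    override def onStart(): Unit = started = true
    override def receive: PartialFunction[Any, Unit] = {
      case n: Int => total += n
    }
  }

  // Delivers a message; returns false when the endpoint has no case for it.
  def deliver(endpoint: ToyEndpoint, msg: Any): Boolean = {
    var handled = true
    endpoint.receive.applyOrElse[Any, Unit](msg, _ => handled = false)
    handled
  }
}
```

Using a PartialFunction lets the caller detect an unsupported message through the fallback of applyOrElse instead of a MatchError.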

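4. Creating RpcEndpointRef

Akka has two components: actor and actorRef.
As noted above, RpcEndpoint replaces the Akka actor, and RpcEndpointRef replaces actorRef.
An RpcEndpointRef is a reference to an RpcEndpoint; messages are sent to the target node through this reference.
As a result, every node holds one or more RpcEndpointRef and RpcEndpoint instances.

RpcEndpointRef#send: sends a one-way asynchronous message (fire-and-forget); no reply comes back.
RpcEndpointRef#ask: sends a message and returns a Future for the reply, which must arrive within the default timeout.
RpcEndpointRef#askWithRetry: sends a request and blocks until the response arrives within the default timeout, retrying on failure.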
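The difference between a one-way send and an ask can be sketched in-process with a Promise. This is a toy model under stated assumptions: ToyEndpoint, ToyEndpointRef and EchoEndpoint are hypothetical, there is no network, timeout or retry logic, and the reply is completed synchronously.

```scala
import scala.concurrent.{Await, Future, Promise}
import scala.concurrent.duration._

object SendVsAskSketch {
  // Hypothetical endpoint: one handler for one-way messages, one for messages that expect a reply.
  trait ToyEndpoint {
    def receive(msg: Any): Unit
    def receiveAndReply(msg: Any, reply: Any => Unit): Unit
  }

  // Hypothetical reference to that endpoint.
  final class ToyEndpointRef(endpoint: ToyEndpoint) {
    // Fire-and-forget: nothing is returned to the caller.
    def send(msg: Any): Unit = endpoint.receive(msg)

    // Returns a Future that is completed when the endpoint replies.
    def ask(msg: Any): Future[Any] = {
      val promise = Promise[Any]()
      endpoint.receiveAndReply(msg, r => { promise.success(r); () })
      promise.future
    }
  }

  final class EchoEndpoint extends ToyEndpoint {
    var oneWayCount = 0
    def receive(msg: Any): Unit = oneWayCount += 1
    def receiveAndReply(msg: Any, reply: Any => Unit): Unit = reply(s"echo: $msg")
  }

  // Sends one one-way message and one ask; returns (one-way count, reply).
  def demo(): (Int, Any) = {
    val endpoint = new EchoEndpoint
    val ref = new ToyEndpointRef(endpoint)
    ref.send("ping")
    val answer = Await.result(ref.ask("hello"), 1.second)
    (endpoint.oneWayCount, answer)
  }
}
```

A blocking variant in the spirit of askWithRetry would wrap the Await in a retry loop; that part is left out here.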

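5. What is InboxMessage

Messages passed between nodes are wrapped in InboxMessage. Its subtypes, as matched in Inbox#process, are RpcMessage, OneWayMessage, OnStart, OnStop, RemoteProcessConnected, RemoteProcessDisconnected and RemoteProcessConnectionError.
[Figure: InboxMessage inheritance diagram]
When a message is received, the RpcEndpoint method that matches the message type is called to handle it.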
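A sealed trait plus pattern matching is the natural way to express "one callback per message kind". The sketch below mirrors only four of the message kinds and returns the callback name instead of calling it; the Toy* types are hypothetical simplifications (the real messages also carry a sender and a context).

```scala
object InboxMessageSketch {
  // Simplified, hypothetical mirror of some message kinds handled in Inbox#process.
  sealed trait ToyInboxMessage
  final case class ToyRpcMessage(content: Any) extends ToyInboxMessage
  final case class ToyOneWayMessage(content: Any) extends ToyInboxMessage
  case object ToyOnStart extends ToyInboxMessage
  case object ToyOnStop extends ToyInboxMessage

  // Returns the name of the endpoint callback that handles each kind.
  def callbackFor(message: ToyInboxMessage): String = message match {
    case ToyRpcMessage(_)    => "receiveAndReply"
    case ToyOneWayMessage(_) => "receive"
    case ToyOnStart          => "onStart"
    case ToyOnStop           => "onStop"
  }
}
```

Because the trait is sealed, the compiler can warn when a new message kind is added without a matching case.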

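6. How the Dispatcher handles messages

The Dispatcher is a message router: it delivers each message to the matching RpcEndpoint and lets NettyRpcEnv process messages for different endpoints concurrently.
So how is the Dispatcher created?
NettyRpcEnv#dispatcher: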
private val dispatcher: Dispatcher = new Dispatcher(this)

Components inside the Dispatcher:

RpcEndpoint
RpcEndpointRef
InboxMessage
Inbox
EndpointData
receivers
threadpool

Every InboxMessage is stored in an Inbox, in Inbox#messages:
protected val messages = new java.util.LinkedList[InboxMessage]()
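Each RpcEndpoint has exactly one Inbox.

Note EndpointData, which bundles three components (RpcEndpointRef, RpcEndpoint and Inbox) into one class.

receivers is a blocking queue of EndpointData. An EndpointData is put on this queue only when its Inbox has a new InboxMessage.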
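The relationship between registration, the inbox and the receivers queue can be sketched as follows. All Toy* names are hypothetical, messages are plain strings, and the sketch deliberately skips the endpoint and the ref so that only the queueing rule remains: posting a message appends it to the inbox and then enqueues the endpoint's data on receivers.

```scala
import java.util.concurrent.{ConcurrentHashMap, LinkedBlockingQueue}

object DispatcherRegistrySketch {
  // Hypothetical inbox: a synchronized list of pending messages.
  final class ToyInbox {
    private val messages = new java.util.LinkedList[String]()
    def post(m: String): Unit = synchronized { messages.add(m); () }
    def size: Int = synchronized { messages.size }
  }

  // Hypothetical EndpointData: here just a name and its inbox.
  final class ToyEndpointData(val name: String) {
    val inbox = new ToyInbox
  }

  final class ToyDispatcher {
    private val endpoints = new ConcurrentHashMap[String, ToyEndpointData]()
    // Only endpoints that have pending messages are queued here.
    val receivers = new LinkedBlockingQueue[ToyEndpointData]()

    def register(name: String): Unit = {
      if (endpoints.putIfAbsent(name, new ToyEndpointData(name)) != null) {
        throw new IllegalArgumentException(s"There is already an endpoint called $name")
      }
    }

    // Returns false when no endpoint with that name is registered.
    def postMessage(name: String, message: String): Boolean = {
      val data = endpoints.get(name)
      if (data == null) {
        false
      } else {
        data.inbox.post(message)
        receivers.offer(data)
        true
      }
    }
  }
}
```

Queueing the endpoint data rather than the message itself means worker threads wake up per endpoint and then drain that endpoint's inbox.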

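Since concurrency is involved, there has to be a thread pool. Its size is read from spark.rpc.netty.dispatcher.numThreads and defaults to max(2, number of available processors).
Each pool thread runs a Dispatcher#MessageLoop:

The loop takes an EndpointData from receivers. If it is the PoisonPill sentinel, which signals shutdown, the loop puts it back so that the other MessageLoops see it too, receivers.offer(PoisonPill), and then exits.

Otherwise it calls data.inbox.process(Dispatcher.this).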
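The take, check for PoisonPill, re-offer, process cycle can be tried out in isolation. The following is a minimal sketch rather than the Dispatcher itself: ToyEndpointData, the string messages and the demo method are hypothetical, and "processing" just records the message.

```scala
import java.util.concurrent.{ConcurrentLinkedQueue, CountDownLatch, Executors, LinkedBlockingQueue, TimeUnit}

object MessageLoopSketch {
  // Hypothetical EndpointData: a name plus the endpoint's pending messages.
  final class ToyEndpointData(val name: String) {
    val inbox = new ConcurrentLinkedQueue[String]()
  }

  // Sentinel that tells every loop to stop.
  val PoisonPill = new ToyEndpointData("poison-pill")

  // Starts numThreads loops, posts the messages to one endpoint, then shuts down.
  def demo(numThreads: Int, messages: Seq[String]): Set[String] = {
    val receivers = new LinkedBlockingQueue[ToyEndpointData]()
    val processed = new ConcurrentLinkedQueue[String]()
    val done = new CountDownLatch(numThreads)
    val pool = Executors.newFixedThreadPool(numThreads)

    (1 to numThreads).foreach { _ =>
      pool.execute(new Runnable {
        override def run(): Unit = {
          try {
            var running = true
            while (running) {
              val data = receivers.take() // blocks while the queue is empty
              if (data eq PoisonPill) {
                receivers.offer(PoisonPill) // put it back so the other loops see it
                running = false
              } else {
                // Drain whatever is pending for this endpoint.
                var m = data.inbox.poll()
                while (m != null) {
                  processed.add(s"${data.name}:$m")
                  m = data.inbox.poll()
                }
              }
            }
          } finally {
            done.countDown()
          }
        }
      })
    }

    val endpoint = new ToyEndpointData("endpoint-a")
    messages.foreach { m =>
      endpoint.inbox.add(m)       // first store the message
      receivers.offer(endpoint)   // then announce that the endpoint has work
    }
    receivers.offer(PoisonPill)   // request shutdown after all work was queued
    done.await(5, TimeUnit.SECONDS)
    pool.shutdown()
    processed.toArray(Array.empty[String]).toSet
  }
}
```

Re-offering the sentinel is what lets a single PoisonPill stop any number of loops: each loop that takes it hands it on before exiting.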

[Figure: Dispatcher memory model]

7. Inbox#process

def process(dispatcher: Dispatcher): Unit = {
    var message: InboxMessage = null
    inbox.synchronized {
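      // enableConcurrent (private var, initially false): whether several threads may process this inbox at once
      // numActiveThreads (private var, initially 0): number of threads currently processing this inbox
      if (!enableConcurrent && numActiveThreads != 0) {
        return
      }
      // Reached when no thread is active, or when concurrent processing is enabled
      message = messages.poll()
      // If there is no message, return; otherwise increment numActiveThreads
      // to record that this thread is now processing
      if (message != null) {
        numActiveThreads += 1
      } else {
        return
      }
    }
    // Dispatch on the message type and call the matching endpoint method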
    while (true) {
      safelyCall(endpoint) {
        message match {
          case RpcMessage(_sender, content, context) =>
            try {
              endpoint.receiveAndReply(context).applyOrElse[Any, Unit](content, { msg =>
                throw new SparkException(s"Unsupported message $message from ${_sender}")
              })
            } catch {
              case NonFatal(e) =>
                context.sendFailure(e)
                // Throw the exception -- this exception will be caught by the safelyCall function.
                // The endpoint's onError function will be called.
                throw e
            }

          case OneWayMessage(_sender, content) =>
            endpoint.receive.applyOrElse[Any, Unit](content, { msg =>
              throw new SparkException(s"Unsupported message $message from ${_sender}")
            })

          case OnStart =>
            endpoint.onStart()
            if (!endpoint.isInstanceOf[ThreadSafeRpcEndpoint]) {
              inbox.synchronized {
                if (!stopped) {
                  enableConcurrent = true
                }
              }
            }

          case OnStop =>
            val activeThreads = inbox.synchronized { inbox.numActiveThreads }
            assert(activeThreads == 1,
              s"There should be only a single active thread but found $activeThreads threads.")
            dispatcher.removeRpcEndpointRef(endpoint)
            endpoint.onStop()
            assert(isEmpty, "OnStop should be the last message")

          case RemoteProcessConnected(remoteAddress) =>
            endpoint.onConnected(remoteAddress)

          case RemoteProcessDisconnected(remoteAddress) =>
            endpoint.onDisconnected(remoteAddress)

          case RemoteProcessConnectionError(cause, remoteAddress) =>
            endpoint.onNetworkError(cause, remoteAddress)
        }
      }

      inbox.synchronized {
        // "enableConcurrent" will be set to false after `onStop` is called, so we should check it
        // every time.
        if (!enableConcurrent && numActiveThreads != 1) {
          // If we are not the only one worker, exit
          numActiveThreads -= 1
          return
        }
        message = messages.poll()
        if (message == null) {
          numActiveThreads -= 1
          return
        }
      }
    }
  }

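Note: the while (true) loop keeps taking messages from the inbox until messages.poll() returns null, at which point process returns; it does not spin waiting for new messages. When a new message arrives, the Dispatcher queues the EndpointData on receivers again and process is called once more.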
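The gate around numActiveThreads and the drain loop can be reduced to a small single-threaded sketch. ToyInbox is hypothetical, messages are strings, and the sketch omits the second in-loop check that lets extra workers exit, so it illustrates the control flow only.

```scala
import scala.collection.mutable.ArrayBuffer

object InboxGateSketch {
  final class ToyInbox(enableConcurrent: Boolean) {
    private val messages = new java.util.LinkedList[String]()
    private var numActiveThreads = 0
    val handled = ArrayBuffer.empty[String]

    def post(m: String): Unit = synchronized { messages.add(m); () }

    // Handles pending messages and returns how many this call handled.
    def process(): Int = {
      // Gate: bail out if another thread is active and concurrency is off.
      var next: Option[String] = synchronized {
        if (!enableConcurrent && numActiveThreads != 0) {
          None
        } else {
          val m = Option(messages.poll())
          if (m.isDefined) numActiveThreads += 1
          m
        }
      }
      var count = 0
      // Drain loop: ends as soon as the inbox is empty.
      while (next.isDefined) {
        handled += next.get
        count += 1
        next = synchronized {
          val m = Option(messages.poll())
          if (m.isEmpty) numActiveThreads -= 1
          m
        }
      }
      count
    }
  }
}
```

Calling process on an inbox with three pending messages handles all three and returns; a second call finds nothing to do and returns immediately.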

Summary:

Description in words

Initializing SparkContext creates the SparkEnv. After a chain of delegating calls, RpcEnv#create instantiates a NettyRpcEnvFactory and calls its create method, and NettyRpcEnvFactory#create finally returns the env we want: nettyEnv.

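RpcEndpoint is the abstraction for a component that handles RPC requests and takes the place of the Akka actor; it declares the message-handling methods, such as receive and receiveAndReply. Akka has two components, actor and actorRef: RpcEndpoint replaces the actor, and RpcEndpointRef replaces actorRef. An RpcEndpointRef is a reference to an RpcEndpoint, and messages are sent to the target node through this reference, so every node holds one or more RpcEndpointRef and RpcEndpoint instances. Note that after an RpcEndpoint is created, an RpcEndpointRef is built for it from its RpcEndpointAddress, and both are registered together in an EndpointData.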

Messages passed between nodes are wrapped in InboxMessage, and every InboxMessage is stored in Inbox#messages. When a message is received, the RpcEndpoint method that matches the message type is called to handle it.

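The Dispatcher is a message router: it delivers each message to the matching RpcEndpoint and lets NettyRpcEnv process messages for different endpoints concurrently. It is created when the env is initialized. Each RpcEndpoint has exactly one Inbox, and EndpointData bundles the three components (RpcEndpointRef, RpcEndpoint and Inbox) into one class. Dispatcher#receivers is a blocking queue of EndpointData: an EndpointData is put on this queue only when its Inbox has a new InboxMessage, and it is then picked up by the thread pool.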

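The thread pool is created when the Dispatcher is initialized (by default max(2, number of available processors) threads): private val threadpool: ThreadPoolExecutor = {...}. While the receivers queue is empty, the pool threads block. As soon as a request for an RpcEndpoint (that is, an InboxMessage) arrives, it is processed; processing an InboxMessage really means calling the inbox of the corresponding RpcEndpoint. Dispatcher#MessageLoop#run contains a while (true) loop that takes EndpointData from the receivers queue. If the EndpointData is the PoisonPill sentinel, the loop puts it back so that the other MessageLoops see it and then exits; otherwise it calls data.inbox.process(Dispatcher.this).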