Understanding the Kafka Server in Depth: How the Processor Thread Works

Latest recommended article published on 2022-03-30 14:34:14

weixin_39644139

I. Scenario Analysis

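When the broker starts, it creates and starts the Acceptor thread and the Processor threads; by default each Acceptor manages 3 Processor thread objects. The previous article analyzed the Acceptor thread's main job: it accepts client connection requests, creates the corresponding SocketChannel, and hands it to a Processor thread in round-robin fashion. So how does a Processor thread handle these SocketChannels, and what does its workflow look like? This article works through that in detail.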

II. Diagram

III. Source Code Analysis

First, look at what a Processor thread does when it is initialized. Each Processor creates three collections:

// Queue of connections waiting to be set up, capacity 20; it holds SocketChannel objects.
// configureNewConnections takes SocketChannel objects from this queue when it sets up connections.
private val newConnections = new ArrayBlockingQueue[SocketChannel](connectionQueueSize)
// Responses that are currently being sent
private val inflightResponses = mutable.Map[String, RequestChannel.Response]()
// Response queue; each Processor thread maintains its own
private val responseQueue = new LinkedBlockingDeque[RequestChannel.Response]()

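newConnections: holds the SocketChannels assigned by the Acceptor, that is, the new connections to be set up. The default capacity of this queue is 20.
inflightResponses: a temporary collection of Responses (a Map keyed by connection id). After the Processor sends a Response back to the sender of the Request, it also places the Response here. The reason: some Response callbacks can only run after the Response has actually been sent, so the Response has to be parked somewhere until then.
responseQueue: the Response queue, which holds the responses returned by the KafkaRequestHandler worker threads. This shows that the Response queue is maintained by the Processor itself, one per Processor thread.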
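The roles of the three collections can be illustrated with a small Java sketch. This is not Kafka's code: `ProcessorQueuesSketch` and `tryAccept` are hypothetical names, and plain strings stand in for `SocketChannel` and `RequestChannel.Response`. It only shows that the connection queue is bounded while the response queue is not.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingDeque;

// Hypothetical stand-in for a Processor's three collections.
class ProcessorQueuesSketch {
    // Bounded hand-off from the Acceptor (String replaces SocketChannel).
    final BlockingQueue<String> newConnections;
    // connectionId -> response currently being sent.
    final Map<String, String> inflightResponses = new HashMap<>();
    // Unbounded queue of responses produced by the handler threads.
    final LinkedBlockingDeque<String> responseQueue = new LinkedBlockingDeque<>();

    ProcessorQueuesSketch(int connectionQueueSize) {
        this.newConnections = new ArrayBlockingQueue<>(connectionQueueSize);
    }

    // Non-blocking hand-off: returns false when the bounded queue is full.
    boolean tryAccept(String channel) {
        return newConnections.offer(channel);
    }

    public static void main(String[] args) {
        ProcessorQueuesSketch processor = new ProcessorQueuesSketch(2);
        System.out.println(processor.tryAccept("conn-1")); // true
        System.out.println(processor.tryAccept("conn-2")); // true
        System.out.println(processor.tryAccept("conn-3")); // false, queue is full
    }
}
```

A bounded queue gives the Acceptor back-pressure: when one Processor is saturated, the hand-off fails (or blocks) instead of growing without limit.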

With initialization covered, the next question is how the Processor does its work. Processor implements the Runnable interface, so its workflow lives in its run() method:

override def run() {
  // Signal that this Processor thread has finished starting up
  startupComplete()
  try {
    // Keep looping until shutdown
    while (isRunning) {
      try {
        // Set up new connections: drain the SocketChannels from the blocking queue and register an OP_READ event on the selector for each one
        configureNewConnections()
        // Send Responses and put them into the temporary inflightResponses collection
        processNewResponses()
        // Run the NIO poll to pick up the I/O operations that are ready on the SocketChannels
        poll()
        // Put the received Requests into the Request queue
        processCompletedReceives()
        // Run the callback logic for the Responses in the temporary collection
        processCompletedSends()
        // Handle connections that were disconnected because a send failed
        processDisconnected()
        // Close connections that exceed the quota
        closeExcessConnections()
      } catch {
        case e: Throwable => processException("Processor got uncaught exception.", e)
      }
    }
  } finally {
    // Release resources
    debug(s"Closing selector - processor $id")
    CoreUtils.swallow(closeAll(), this, Level.ERROR)
    shutdownComplete()
  }
}
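The shape of this loop (signal startup, iterate while running, survive per-iteration failures, signal shutdown in `finally`) can be sketched in plain Java. Everything here is hypothetical: `LoopThreadSketch`, its `step` method, and the simulated failure on every third iteration are illustration only, not Kafka's implementation.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical skeleton of a long-running loop thread.
class LoopThreadSketch implements Runnable {
    private final CountDownLatch startupLatch = new CountDownLatch(1);
    private final CountDownLatch shutdownLatch = new CountDownLatch(1);
    private final AtomicBoolean running = new AtomicBoolean(true);
    final AtomicInteger iterations = new AtomicInteger();
    final AtomicInteger errors = new AtomicInteger();

    @Override
    public void run() {
        startupLatch.countDown();             // signal: the thread is up
        try {
            while (running.get()) {
                try {
                    step(iterations.incrementAndGet());
                } catch (Throwable t) {
                    errors.incrementAndGet(); // record the failure and keep looping
                }
            }
        } finally {
            shutdownLatch.countDown();        // signal: the loop has exited
        }
    }

    // One iteration of work; every third iteration fails on purpose.
    private void step(int n) throws InterruptedException {
        if (n % 3 == 0) throw new IllegalStateException("simulated failure in iteration " + n);
        Thread.sleep(1);
    }

    // Starts the loop, lets it run at least six iterations, then stops it.
    static LoopThreadSketch runBriefly() {
        LoopThreadSketch loop = new LoopThreadSketch();
        new Thread(loop, "loop-sketch").start();
        try {
            loop.startupLatch.await(1, TimeUnit.SECONDS);   // wait until the thread is up
            while (loop.iterations.get() < 6) Thread.sleep(1);
            loop.running.set(false);
            loop.shutdownLatch.await(1, TimeUnit.SECONDS);  // wait until the loop has exited
        } catch (InterruptedException e) {
            throw new RuntimeException(e);
        }
        return loop;
    }

    public static void main(String[] args) {
        LoopThreadSketch loop = runBriefly();
        System.out.println(loop.iterations.get() + " iterations, " + loop.errors.get() + " errors");
    }
}
```

Catching `Throwable` inside the loop body is the important design choice: one bad connection or request should not take down the thread that serves every other connection.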

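As you can see, run() wraps each step in its own method:
1. startupComplete(): mainly calls CountDownLatch.countDown(). This was covered in the previous article and is not repeated here. The call marks the Processor thread as started, which releases any thread waiting on that latch.
2. configureNewConnections():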

private def configureNewConnections() {
  var connectionsProcessed = 0
  // While the blocking queue is not empty (and the per-call limit is not reached)
  while (connectionsProcessed < connectionQueueSize && !newConnections.isEmpty) {
    // Take one SocketChannel from the blocking queue
    val channel = newConnections.poll()
    try {
      debug(s"Processor $id listening to new connection from ${channel.socket.getRemoteSocketAddress}")
      // Register an OP_READ event on the selector
      selector.register(connectionId(channel.socket), channel)
      connectionsProcessed += 1
    } catch {
      case e: Throwable =>
        val remoteAddress = channel.socket.getRemoteSocketAddress
        close(listenerName, channel)
        processException(s"Processor $id closed connection from $remoteAddress", e)
    }
  }
}

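As long as the newConnections blocking queue is not empty, this method keeps taking SocketChannel objects from it (at most connectionQueueSize per call) and calls selector.register for each one. That method internally calls registerChannel, which registers an OP_READ event on the nioSelector for the SocketChannel, stores the mapping from connection id to KafkaChannel in channels (a Map), and finally returns a SelectionKey: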

protected SelectionKey registerChannel(String id, SocketChannel socketChannel, int interestedOps) throws IOException {
    SelectionKey key = socketChannel.register(nioSelector, interestedOps);
    KafkaChannel channel = buildAndAttachKafkaChannel(socketChannel, id, key);
    // Cache this KafkaChannel
    this.channels.put(id, channel);
    if (idleExpiryManager != null)
        idleExpiryManager.update(channel.id(), time.nanoseconds());
    return key;
}
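To see what registering OP_READ buys us, here is a minimal, self-contained Java NIO sketch. It makes two loud assumptions: a `java.nio.channels.Pipe` replaces the real `SocketChannel` so the example runs without a network, and the attachment is just the connection id string where Kafka attaches a `KafkaChannel`. The class and method names are hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.Pipe;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class RegisterReadSketch {
    private final Selector nioSelector;
    // connectionId -> key, standing in for the id -> KafkaChannel map
    private final Map<String, SelectionKey> channels = new HashMap<>();

    RegisterReadSketch() throws IOException {
        this.nioSelector = Selector.open();
    }

    SelectionKey register(String id, Pipe.SourceChannel channel) throws IOException {
        channel.configureBlocking(false);  // a channel must be non-blocking before it is registered
        SelectionKey key = channel.register(nioSelector, SelectionKey.OP_READ);
        key.attach(id);                    // assumption: an id instead of a KafkaChannel
        channels.put(id, key);
        return key;
    }

    // Returns the ids of the channels that became readable within timeoutMs.
    List<String> pollReadable(long timeoutMs) throws IOException {
        nioSelector.select(timeoutMs);
        List<String> ready = new ArrayList<>();
        for (SelectionKey key : nioSelector.selectedKeys()) {
            if (key.isReadable()) ready.add((String) key.attachment());
        }
        nioSelector.selectedKeys().clear(); // selected keys must be cleared by the caller
        return ready;
    }

    static List<String> demo() {
        try {
            RegisterReadSketch sketch = new RegisterReadSketch();
            Pipe pipe = Pipe.open();
            sketch.register("conn-1", pipe.source());
            pipe.sink().write(ByteBuffer.wrap(new byte[] {42})); // simulate incoming bytes
            List<String> ready = sketch.pollReadable(1000);
            pipe.sink().close();
            pipe.source().close();
            sketch.nioSelector.close();
            return ready;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo()); // [conn-1]
    }
}
```

Once the key is registered, a single thread can watch many connections and is only woken for the ones that actually have data, which is why one Processor can serve many clients.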

3. processNewResponses():

private def processNewResponses() {
  var currentResponse: RequestChannel.Response = null
  // Take Response objects from responseQueue until it returns null
  while ({currentResponse = dequeueResponse(); currentResponse != null}) {
    // Get the connection id.
    // This also shows that every Response object keeps a reference to its Request.
    val channelId = currentResponse.request.context.connectionId
    try {
      currentResponse match {
        case response: NoOpResponse =>
          updateRequestMetrics(response)
          trace(s"Socket server received empty response to send, registering for read: $response")
          handleChannelMuteEvent(channelId, ChannelMuteEvent.RESPONSE_SENT)
          tryUnmuteChannel(channelId)
        case response: SendResponse =>
          sendResponse(response, response.responseSend)
        case response: CloseConnectionResponse =>
          updateRequestMetrics(response)
          trace("Closing socket connection actively according to the response code.")
          close(channelId)
        case _: StartThrottlingResponse =>
          handleChannelMuteEvent(channelId, ChannelMuteEvent.THROTTLE_STARTED)
        case _: EndThrottlingResponse =>
          // Try unmuting the channel. The channel will be unmuted only if the response has already been sent out to
          // the client.
          handleChannelMuteEvent(channelId, ChannelMuteEvent.THROTTLE_ENDED)
          tryUnmuteChannel(channelId)
        case _ =>
          throw new IllegalArgumentException(s"Unknown response type: ${currentResponse.getClass}")
      }
    } catch {
      case e: Throwable =>
        processChannelException(channelId, s"Exception while processing response for $channelId", e)
    }
  }
}

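Response objects are taken from responseQueue one by one, and the connectionId of the corresponding Request is read. With that id the Processor can find the matching KafkaChannel and use it to return the response. Note that every Response object keeps a reference to its Request.
The type of the Response object is then checked. There are 5 concrete Response types:
- SendResponse: the Response subclass that carries the result of a Request. Most Requests need some callback logic to run after they are processed; the onCompleteCallback property of SendResponse specifies that callback.
- NoOpResponse: for Requests that need no separate callback logic
- CloseConnectionResponse: when an error requires the connection to be closed, a CloseConnectionResponse is returned to the sender of the Request to tell it explicitly that the connection is being closed
- StartThrottlingResponse: notifies the Broker's SocketServer component that throttling has started on a given TCP connection
- EndThrottlingResponse: the counterpart of StartThrottlingResponse; notifies the Broker's SocketServer component that throttling on a given TCP connection has ended
Different handling logic runs depending on the Response type.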
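The dispatch on response type can be sketched as follows. This is a simplified Java model, not the Scala pattern match above: the `Kind` enum, the string action names, and the string payloads are all hypothetical. It shows one detail worth noticing in the code, namely that only the send case leaves an entry in `inflightResponses`.

```java
import java.util.HashMap;
import java.util.Map;

class ResponseDispatchSketch {
    // Hypothetical mirror of the five Response subclasses.
    enum Kind { SEND, NO_OP, CLOSE_CONNECTION, START_THROTTLING, END_THROTTLING }

    // connectionId -> payload that is being written to the client.
    final Map<String, String> inflightResponses = new HashMap<>();

    // Returns a label for the action taken; the labels are illustrative only.
    String handle(Kind kind, String connectionId, String payload) {
        switch (kind) {
            case SEND:
                // Only a real send is tracked until the write completes.
                inflightResponses.put(connectionId, payload);
                return "send";
            case NO_OP:
                return "try-unmute";      // nothing to write, resume reading
            case CLOSE_CONNECTION:
                return "close";
            case START_THROTTLING:
                return "throttle-started";
            case END_THROTTLING:
                return "try-unmute";      // throttling over, resume reading if possible
            default:
                throw new IllegalArgumentException("Unknown response type: " + kind);
        }
    }

    public static void main(String[] args) {
        ResponseDispatchSketch processor = new ResponseDispatchSketch();
        System.out.println(processor.handle(Kind.SEND, "conn-1", "produce-ack")); // send
        System.out.println(processor.handle(Kind.NO_OP, "conn-2", null));         // try-unmute
        System.out.println(processor.inflightResponses.keySet());                 // [conn-1]
    }
}
```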

Since most Responses in Kafka are of type SendResponse, the focus here is on the handling of SendResponse, that is, the call to sendResponse:

protected[network] def sendResponse(response: RequestChannel.Response, responseSend: Send) {
  // Get the connection id
  val connectionId = response.request.context.connectionId
  trace(s"Socket server received response to send to $connectionId, registering for write and sending data: $response")
  // If there is no open KafkaChannel in channels for this connection id
  if (channel(connectionId).isEmpty) {
    warn(s"Attempting to send response via channel for which there is no open connection, connection id $connectionId")
    response.request.updateRequestMetrics(0L, response)
  }
  // If the connection with this id is usable
  if (openOrClosingChannel(connectionId).isDefined) {
    // Send the Response
    selector.send(responseSend)
    // Put the Response into the inflightResponses collection
    inflightResponses += (connectionId -> response)
  }
}

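This method first gets the connection id of the Response and uses it to look up the KafkaChannel in channels. If the KafkaChannel is usable, it sends the Response and also puts it into the inflightResponses collection. The selector.send method that does the sending looks like this: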

public void send(Send send) {
    String connectionId = send.destination();
    // Get the KafkaChannel
    KafkaChannel channel = openOrClosingChannelOrFail(connectionId);
    // If the channel is already closing, add its connection id to failedSends
    if (closingChannels.containsKey(connectionId)) {
        this.failedSends.add(connectionId);
    } else {
        try {
            // Set the pending send, which registers interest in OP_WRITE
            channel.setSend(send);
        } catch (Exception e) {
            channel.state(ChannelState.FAILED_SEND);
            this.failedSends.add(connectionId);
            close(channel, CloseMode.DISCARD_NO_NOTIFY);
            if (!(e instanceof CancelledKeyException)) {
                log.error("Unexpected exception during send, closing connection {} and rethrowing exception {}",
                        connectionId, e);
                throw e;
            }
        }
    }
}

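The main effect of this method is to register an OP_WRITE event on the Selector for the corresponding KafkaChannel; the bytes themselves are written later, during poll().
4. poll(): this method internally calls the Selector's poll method: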

public void poll(long timeout) throws IOException {
    ...
    // Number of SelectionKeys (channels) that are ready for I/O
    int numReadyKeys = select(timeout);
    long endSelect = time.nanoseconds();
    this.sensors.selectTime.record(endSelect - startSelect, time.milliseconds());
    if (numReadyKeys > 0 || !immediatelyConnectedKeys.isEmpty() || dataInBuffers) {
        // Get all SelectionKeys that are ready
        Set<SelectionKey> readyKeys = this.nioSelector.selectedKeys();
        ...
        // Iterate over the SelectionKeys and process them
        pollSelectionKeys(readyKeys, false, endSelect);
        readyKeys.clear();
        pollSelectionKeys(immediatelyConnectedKeys, true, endSelect);
        immediatelyConnectedKeys.clear();
    } else {
        madeReadProgressLastPoll = true; //no work is also "progress"
    }
    long endIo = time.nanoseconds();
    this.sensors.ioTime.record(endIo - endSelect, time.milliseconds());
    ...
    // Handle the data that has been received:
    // move the NetworkReceive objects from stagedReceives into completedReceives
    // stagedReceives: Map<KafkaChannel, Deque<NetworkReceive>>
    // completedReceives: List<NetworkReceive>
    addToCompletedReceives();
}

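Selector.poll() is where the I/O actually happens. Underneath, it calls the select method of the Java NIO Selector and handles the I/O operations that are ready, whether that means receiving a Request or sending a Response. This method was analyzed in detail in the earlier article on how the client sends messages ("Understanding the Kafka Client in Depth: How Messages Are Sent"). In short, it does different work depending on which events were registered, so the details are not repeated here. One point deserves attention: the client sends requests to operate on data and receives responses with the results, while the server does the opposite; it receives requests to operate on data and sends out responses.
5. processCompletedReceives(): this method wraps each received client request in a Request object and stores it in the requestQueue: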

private def processCompletedReceives() {
  // Iterate over the NetworkReceives that have been fully received
  selector.completedReceives.asScala.foreach { receive =>
    try {
      // Look up the KafkaChannel that this NetworkReceive came from
      openOrClosingChannel(receive.source) match {
        case Some(channel) =>
          val header = RequestHeader.parse(receive.payload)
          if (header.apiKey() == ApiKeys.SASL_HANDSHAKE && channel.maybeBeginServerReauthentication(receive, nowNanosSupplier))
            trace(s"Begin re-authentication: $channel")
          else {
            val nowNanos = time.nanoseconds()
            // If the session has expired, close the connection
            if (channel.serverAuthenticationSessionExpired(nowNanos)) {
              debug(s"Disconnecting expired channel: $channel : $header")
              close(channel.id)
              expiredConnectionsKilledCount.record(null, 1, 0)
            } else {
              // Get the connection id
              val connectionId = receive.source
              // Build the RequestContext
              val context = new RequestContext(header, connectionId, channel.socketAddress,
                channel.principal, listenerName, securityProtocol)
              // Wrap everything in a Request object
              val req = new RequestChannel.Request(processor = id, context = context,
                startTimeNanos = nowNanos, memoryPool, receive.payload, requestChannel.metrics)
              // Core step: put the Request into requestQueue
              requestChannel.sendRequest(req)
              selector.mute(connectionId)
              handleChannelMuteEvent(connectionId, ChannelMuteEvent.REQUEST_RECEIVED)
            }
          ...
}

The key code:

// Wrap everything in a Request object
val req = new RequestChannel.Request(processor = id, context = context,
  startTimeNanos = nowNanos, memoryPool, receive.payload, requestChannel.metrics)
// Core step: put the Request into requestQueue
requestChannel.sendRequest(req)

The requestChannel.sendRequest method simply puts the wrapped Request into the requestQueue:

def sendRequest(request: RequestChannel.Request) {
  requestQueue.put(request)
}
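The hand-off between the network thread and the worker threads is a classic producer/consumer queue. The Java sketch below illustrates it under stated assumptions: `RequestQueueSketch`, `receiveRequest`, the capacity of 1, and the string requests are hypothetical and only mirror the `put` call shown above.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;

class RequestQueueSketch {
    private final BlockingQueue<String> requestQueue;

    RequestQueueSketch(int capacity) {
        this.requestQueue = new ArrayBlockingQueue<>(capacity);
    }

    // Network-thread side: blocks while the queue is full.
    void sendRequest(String request) throws InterruptedException {
        requestQueue.put(request);
    }

    // Handler side: waits up to timeoutMs and returns null if nothing arrived.
    String receiveRequest(long timeoutMs) throws InterruptedException {
        return requestQueue.poll(timeoutMs, TimeUnit.MILLISECONDS);
    }

    // One network thread hands a single request to one handler thread.
    static String demo() {
        RequestQueueSketch channel = new RequestQueueSketch(1);
        String[] handled = new String[1];
        Thread handler = new Thread(() -> {
            try {
                handled[0] = channel.receiveRequest(1000);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "handler-sketch");
        handler.start();
        try {
            channel.sendRequest("FETCH from conn-1");
            handler.join(); // join makes the handler's write to handled[0] visible here
        } catch (InterruptedException e) {
            throw new RuntimeException(e);
        }
        return handled[0];
    }

    public static void main(String[] args) {
        System.out.println(demo()); // FETCH from conn-1
    }
}
```

Because `put` blocks on a full queue, slow handlers eventually slow down the network threads as well, which keeps the number of buffered requests bounded.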

6. processCompletedSends(): runs the callback logic for the Responses that were put into the inflightResponses collection:

private def processCompletedSends() {
  selector.completedSends.asScala.foreach { send =>
    try {
      // Remove the matching Response from inflightResponses
      val response = inflightResponses.remove(send.destination).getOrElse {
        throw new IllegalStateException(s"Send for ${send.destination} completed, but not in `inflightResponses`")
      }
      updateRequestMetrics(response)
      // Run the callback logic
      response.onComplete.foreach(onComplete => onComplete(send))
      handleChannelMuteEvent(send.destination, ChannelMuteEvent.RESPONSE_SENT)
      tryUnmuteChannel(send.destination)
    } catch {
      case e: Throwable => processChannelException(send.destination,
        s"Exception while processing completed send to ${send.destination}", e)
    }
  }
}

The callback that runs here is the onCompleteCallback parameter that was passed in when the Response object was built:

override def onComplete: Option[Send => Unit] = onCompleteCallback
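The reason for parking responses in `inflightResponses` becomes clearer with a small model of the completion step. In this hedged Java sketch, `Optional<Consumer<String>>` stands in for Scala's `Option[Send => Unit]`, the destination string replaces the `Send` object, and the class names are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.function.Consumer;

class CompletedSendSketch {
    // Hypothetical response: a body plus an optional completion callback.
    static final class Response {
        final String body;
        final Optional<Consumer<String>> onComplete;

        Response(String body, Consumer<String> callback) {
            this.body = body;
            this.onComplete = Optional.ofNullable(callback);
        }
    }

    final Map<String, Response> inflightResponses = new HashMap<>();

    // Called once the bytes for `destination` have been fully written.
    void completeSend(String destination) {
        Response response = inflightResponses.remove(destination);
        if (response == null) {
            throw new IllegalStateException(
                "Send for " + destination + " completed, but not in inflightResponses");
        }
        // Run the callback only if one was supplied.
        response.onComplete.ifPresent(callback -> callback.accept(destination));
    }

    static List<String> demo() {
        CompletedSendSketch processor = new CompletedSendSketch();
        List<String> log = new ArrayList<>();
        processor.inflightResponses.put("conn-1",
            new Response("fetch-data", dest -> log.add("callback for " + dest)));
        processor.inflightResponses.put("conn-2", new Response("produce-ack", null));
        processor.completeSend("conn-1"); // runs the callback
        processor.completeSend("conn-2"); // no callback registered
        return log;
    }

    public static void main(String[] args) {
        System.out.println(demo()); // [callback for conn-1]
    }
}
```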

7. processDisconnected(): removes the Responses of all disconnected connections from the inflightResponses collection and updates the quota data:

private def processDisconnected() {
  // Iterate over all connections that have been disconnected
  selector.disconnected.keySet.asScala.foreach { connectionId =>
    try {
      // Get the remote host name of the disconnected connection
      val remoteHost = ConnectionId.fromString(connectionId).getOrElse {
        throw new IllegalStateException(s"connectionId has unexpected format: $connectionId")
      }.remoteHost
      // Remove the Response of the failed connection from the temporary inflightResponses collection
      inflightResponses.remove(connectionId).foreach(updateRequestMetrics)
      // Update the quota data
      connectionQuotas.dec(listenerName, InetAddress.getByName(remoteHost))
    } catch {
      case e: Throwable => processException(s"Exception while processing disconnection of $connectionId", e)
    }
  }
}

8. closeExcessConnections(): closes connections that exceed the limit:

private def closeExcessConnections(): Unit = {
  // If the total number of connections exceeds the maximum allowed on a single broker
  if (connectionQuotas.maxConnectionsExceeded(listenerName)) {
    // Find the connection that should be closed first
    val channel = selector.lowestPriorityChannel()
    if (channel != null)
      // Close the connection
      close(channel.id)
  }
}

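The connection to close first is the least recently used one among the TCP connections, that is, a connection over which no Request has reached the Processor thread in the recent past.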
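One common way to find the least recently used entry in Java is a `LinkedHashMap` in access order. The sketch below illustrates that idea only; it does not claim to reproduce how the Selector picks its lowest-priority channel, and `LruConnectionsSketch`, `recordActivity`, and `lowestPriority` are hypothetical names.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

class LruConnectionsSketch {
    // accessOrder = true: iteration runs from least to most recently accessed.
    private final Map<String, Long> lruConnections = new LinkedHashMap<>(16, 0.75f, true);

    // Called whenever a request arrives on a connection.
    void recordActivity(String connectionId, long nowNanos) {
        lruConnections.put(connectionId, nowNanos); // put on an existing key moves it to the end
    }

    // The first key in iteration order is the least recently used connection.
    String lowestPriority() {
        Iterator<String> ids = lruConnections.keySet().iterator();
        return ids.hasNext() ? ids.next() : null;
    }

    public static void main(String[] args) {
        LruConnectionsSketch connections = new LruConnectionsSketch();
        connections.recordActivity("conn-a", 1L);
        connections.recordActivity("conn-b", 2L);
        connections.recordActivity("conn-c", 3L);
        connections.recordActivity("conn-a", 4L);         // conn-a is active again
        System.out.println(connections.lowestPriority()); // conn-b
    }
}
```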

Summary:

The workflow of a Processor thread breaks down into the following steps:

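Signal that the Processor thread has finished starting up
Take SocketChannels from the newConnections queue one by one, register an OP_READ event on the Selector, wrap each in a KafkaChannel, and store the mapping from connection id to KafkaChannel in channels
Take Response objects from the responseQueue, act on each according to its type, and put the Responses that are being sent into the inflightResponses collection
Call the Selector's poll method to do the actual receiving of client requests and sending of responses
Wrap each received request in a Request object and put it into the requestQueue
Run the callback logic for the Responses in the inflightResponses collection
Find the connections that have been disconnected, remove their Responses from inflightResponses, and update the quota data
If the connection limit is exceeded, close the connection that has not sent a Request to the Processor thread recently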