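I. Scenario Analysis
When the broker starts, it creates and starts the Acceptor thread and the Processor threads; by default each Acceptor manages 3 Processor threads. The previous article covered the Acceptor thread's main job: accept client connection requests, create the corresponding SocketChannel, and hand it to a Processor thread in round-robin fashion. So how does a Processor thread handle these SocketChannels, and what does its workflow look like? This article walks through it in detail.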
II. Diagram
III. Source Code Analysis
First, let's look at what a Processor thread does when it is initialized. Each Processor thread creates three collections:
// Queue of connections waiting to be set up (capacity 20); it holds SocketChannel objects.
// configureNewConnections takes SocketChannel objects from this queue when it sets up connections.
private val newConnections = new ArrayBlockingQueue[SocketChannel](connectionQueueSize)
// Responses that are currently being sent
private val inflightResponses = mutable.Map[String, RequestChannel.Response]()
// Response queue; each Processor thread maintains its own
private val responseQueue = new LinkedBlockingDeque[RequestChannel.Response]()
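- newConnections: holds the SocketChannels assigned by the Acceptor, that is, the new connections still to be set up. The queue's default capacity is 20.
- inflightResponses: a temporary holding area for Responses (a Map keyed by connection id). When the Processor hands a Response to the selector to be sent back to the requester, it also stores the Response here. The reason is that some Response callbacks can only run after the Response has actually been sent, so the Response has to be kept somewhere in the meantime.
- responseQueue: the Response queue, which holds the responses returned by the KafkaRequestHandler worker threads. As the code shows, the Response queue is maintained by the Processor itself, one per Processor thread.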
override def run() {
  // Signal that this Processor thread has finished starting up
  startupComplete()
  try {
    // Keep looping while the thread is running
    while (isRunning) {
      try {
        // Set up new connections: for each SocketChannel in the blocking queue, register an OP_READ event on the selector
        configureNewConnections()
        // Send Responses and put them into the temporary inflightResponses collection
        processNewResponses()
        // Run the NIO poll to perform the I/O operations that are ready on each SocketChannel
        poll()
        // TODO Put the received Requests into the Request queue
        processCompletedReceives()
        // Run the callbacks for the Responses held in inflightResponses
        processCompletedSends()
        // Handle connections that were disconnected, for example because a send failed
        processDisconnected()
        // Close connections that exceed the quota
        closeExcessConnections()
      } catch {
        case e: Throwable => processException("Processor got uncaught exception.", e)
      }
    }
  } finally {
    // Release resources
    debug(s"Closing selector - processor $id")
    CoreUtils.swallow(closeAll(), this, Level.ERROR)
    shutdownComplete()
  }
}
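As you can see, the run method wraps each step in its own method:
1. startupComplete(): this mainly calls CountDownLatch.countDown(), which was covered in the previous article and is not repeated here. The call signals that the Processor thread has finished starting up, which releases any thread waiting on that latch.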
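The latch handshake described in step 1 can be sketched in plain Java. This is a minimal illustration, not Kafka's code: the class `StartupLatchSketch` and the methods `awaitStartup` and `startAndAwait` are hypothetical names chosen for the example, and the five-second timeout is an arbitrary assumption.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for a Processor-like thread; not Kafka's actual class.
class StartupLatchSketch {
    private final CountDownLatch startupLatch = new CountDownLatch(1);

    // Called by the worker thread as the first step of its run loop.
    void startupComplete() {
        startupLatch.countDown();
    }

    // Called by whoever started the worker; blocks until the worker has signalled or the timeout expires.
    boolean awaitStartup(long timeoutMs) {
        try {
            return startupLatch.await(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    // Starts a worker thread and reports whether it signalled startup in time.
    static boolean startAndAwait() {
        StartupLatchSketch sketch = new StartupLatchSketch();
        Thread worker = new Thread(sketch::startupComplete, "processor-sketch");
        worker.start();
        return sketch.awaitStartup(5000);
    }

    public static void main(String[] args) {
        System.out.println("started = " + startAndAwait());
    }
}
```

The point of the pattern is that the starting thread does not proceed until the worker has really entered its run method.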
2. configureNewConnections():
private def configureNewConnections() {
  var connectionsProcessed = 0
  // While the blocking queue is not empty (and at most connectionQueueSize connections per call)
  while (connectionsProcessed < connectionQueueSize && !newConnections.isEmpty) {
    // Take one SocketChannel from the blocking queue
    val channel = newConnections.poll()
    try {
      debug(s"Processor $id listening to new connection from ${channel.socket.getRemoteSocketAddress}")
      // Register an OP_READ event on the selector
      selector.register(connectionId(channel.socket), channel)
      connectionsProcessed += 1
    } catch {
      case e: Throwable =>
        val remoteAddress = channel.socket.getRemoteSocketAddress
        close(listenerName, channel)
        processException(s"Processor $id closed connection from $remoteAddress", e)
    }
  }
}
While the newConnections blocking queue is not empty, this method keeps taking SocketChannel objects from it (at most connectionQueueSize per call) and calls selector.register for each one. That method internally calls registerChannel: each SocketChannel registers an OP_READ event on nioSelector, the mapping from connection id to KafkaChannel is stored in channels (a Map), and finally a SelectionKey is returned.
protected SelectionKey registerChannel(String id, SocketChannel socketChannel, int interestedOps) throws IOException {
    SelectionKey key = socketChannel.register(nioSelector, interestedOps);
    KafkaChannel channel = buildAndAttachKafkaChannel(socketChannel, id, key);
    // Cache this KafkaChannel
    this.channels.put(id, channel);
    if (idleExpiryManager != null)
        idleExpiryManager.update(channel.id(), time.nanoseconds());
    return key;
}
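To make the OP_READ registration concrete, here is a small self-contained sketch using only the JDK's NIO classes. It is an illustration under assumptions, not Kafka's implementation: a `java.nio.channels.Pipe` stands in for the client SocketChannel so the example needs no network, a plain String attachment stands in for the KafkaChannel, and `ReadRegistrationSketch` and `registerAndSelect` are hypothetical names.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.Pipe;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.charset.StandardCharsets;

class ReadRegistrationSketch {
    // Registers a channel for OP_READ, attaches a connection id to the key,
    // writes a few bytes, and returns the id of the key the selector reports as readable.
    static String registerAndSelect(String connectionId) {
        try (Selector selector = Selector.open()) {
            Pipe pipe = Pipe.open(); // stand-in for an accepted SocketChannel
            try {
                pipe.source().configureBlocking(false); // selectable channels must be non-blocking
                SelectionKey key = pipe.source().register(selector, SelectionKey.OP_READ);
                key.attach(connectionId); // Kafka attaches a KafkaChannel; a String is used here
                pipe.sink().write(ByteBuffer.wrap("ping".getBytes(StandardCharsets.UTF_8)));
                if (selector.select(5000) == 0) {
                    return null; // nothing became ready within the timeout
                }
                SelectionKey ready = selector.selectedKeys().iterator().next();
                return ready.isReadable() ? (String) ready.attachment() : null;
            } finally {
                pipe.sink().close();
                pipe.source().close();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("readable connection: " + registerAndSelect("conn-1"));
    }
}
```

Attaching per-connection state to the SelectionKey is what lets the poll loop get from a ready key back to the connection it belongs to.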
3. processNewResponses():
private def processNewResponses() {
  var currentResponse: RequestChannel.Response = null
  // Take Response objects from responseQueue one at a time until none are left
  while ({currentResponse = dequeueResponse(); currentResponse != null}) {
    // Get the connection id
    // This also shows that every Response object holds its corresponding Request object
    val channelId = currentResponse.request.context.connectionId
    try {
      currentResponse match {
        case response: NoOpResponse =>
          updateRequestMetrics(response)
          trace(s"Socket server received empty response to send, registering for read: $response")
          handleChannelMuteEvent(channelId, ChannelMuteEvent.RESPONSE_SENT)
          tryUnmuteChannel(channelId)
        case response: SendResponse =>
          sendResponse(response, response.responseSend)
        case response: CloseConnectionResponse =>
          updateRequestMetrics(response)
          trace("Closing socket connection actively according to the response code.")
          close(channelId)
        case _: StartThrottlingResponse =>
          handleChannelMuteEvent(channelId, ChannelMuteEvent.THROTTLE_STARTED)
        case _: EndThrottlingResponse =>
          // Try unmuting the channel. The channel will be unmuted only if the response has already been sent out to
          // the client.
          handleChannelMuteEvent(channelId, ChannelMuteEvent.THROTTLE_ENDED)
          tryUnmuteChannel(channelId)
        case _ =>
          throw new IllegalArgumentException(s"Unknown response type: ${currentResponse.getClass}")
      }
    } catch {
      case e: Throwable =>
        processChannelException(channelId, s"Exception while processing response for $channelId", e)
    }
  }
}
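- Take Response objects from responseQueue one by one and read the connectionId of the corresponding Request. With this id the Processor can look up the matching KafkaChannel and use it to return the response. Note that every Response object holds its corresponding Request object.
- Check the type of the Response object. There are five concrete Response types:
  - SendResponse: the Response subclass that carries the result of a Request. Most Requests need some callback logic to run once processing has finished; the onCompleteCallback property of SendResponse specifies that callback.
  - NoOpResponse: for Requests where nothing has to be sent back and no separate callback has to run; the Processor only tries to unmute the channel.
  - CloseConnectionResponse: when an error requires the connection to be closed, a CloseConnectionResponse is returned to tell the Processor explicitly to close the connection to the requester.
  - StartThrottlingResponse: notifies the broker's SocketServer component that a TCP connection's channel has started being throttled.
  - EndThrottlingResponse: the counterpart of StartThrottlingResponse; notifies the broker's SocketServer component that throttling of a TCP connection's channel has ended.
- Run different handling logic depending on the Response type.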
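The type-based dispatch can be summarized as a small table in code. The sketch below is only a reading aid for the five cases listed above: the enum `ResponseKind`, the class `ResponseDispatchSketch` and the action strings are made up for illustration and do not exist in Kafka, which matches on the Response subclasses directly.

```java
class ResponseDispatchSketch {
    // Hypothetical labels for the five Response subclasses discussed in the article.
    enum ResponseKind { NO_OP, SEND, CLOSE_CONNECTION, START_THROTTLING, END_THROTTLING }

    // Returns a short description of what the Processor does for each kind of response.
    static String actionFor(ResponseKind kind) {
        switch (kind) {
            case NO_OP:
                return "mark response as sent, try to unmute channel";
            case SEND:
                return "queue the send, track response as in flight";
            case CLOSE_CONNECTION:
                return "close the channel";
            case START_THROTTLING:
                return "record that throttling started";
            case END_THROTTLING:
                return "record that throttling ended, try to unmute channel";
            default:
                throw new IllegalArgumentException("Unknown response type: " + kind);
        }
    }

    public static void main(String[] args) {
        for (ResponseKind kind : ResponseKind.values()) {
            System.out.println(kind + " -> " + actionFor(kind));
        }
    }
}
```

Only the SEND case puts anything into the in-flight collection; the other four act on the channel state directly.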
protected[network] def sendResponse(response: RequestChannel.Response, responseSend: Send) {
  // Get the connection id
  val connectionId = response.request.context.connectionId
  trace(s"Socket server received response to send to $connectionId, registering for write and sending data: $response")
  // If channels has no KafkaChannel for this connection id
  if (channel(connectionId).isEmpty) {
    warn(s"Attempting to send response via channel for which there is no open connection, connection id $connectionId")
    response.request.updateRequestMetrics(0L, response)
  }
  // If the connection with this id is still usable (open or closing)
  if (openOrClosingChannel(connectionId).isDefined) {
    // Send the Response
    selector.send(responseSend)
    // Put the Response into the inflightResponses collection
    inflightResponses += (connectionId -> response)
  }
}
This method first gets the connection id of the Response and then uses it to look up the KafkaChannel in channels. If the KafkaChannel is usable, it sends the Response and also puts the Response into the inflightResponses collection. The selector.send method that does the sending looks like this:
public void send(Send send) {
    String connectionId = send.destination();
    // Get the KafkaChannel
    KafkaChannel channel = openOrClosingChannelOrFail(connectionId);
    // If the channel is already closing, add this connection id to failedSends
    if (closingChannels.containsKey(connectionId)) {
        this.failedSends.add(connectionId);
    } else {
        try {
            // Register the OP_WRITE event
            channel.setSend(send);
        } catch (Exception e) {
            channel.state(ChannelState.FAILED_SEND);
            this.failedSends.add(connectionId);
            close(channel, CloseMode.DISCARD_NO_NOTIFY);
            if (!(e instanceof CancelledKeyException)) {
                log.error("Unexpected exception during send, closing connection {} and rethrowing exception {}", connectionId, e);
                throw e;
            }
        }
    }
}
The main effect of this method is to register an OP_WRITE event on the Selector for the corresponding KafkaChannel; no data is written at this point.
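Registering OP_WRITE amounts to adding a bit to the key's interest set so that the next select call reports the channel once it can accept data. The following sketch shows that mechanism with JDK classes only. It is an assumption-laden illustration: a `Pipe` sink replaces the client socket, and `WriteInterestSketch` and `writableAfterAddingInterest` are hypothetical names; the body of `channel.setSend` is not shown in this article, so this is not a copy of it.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.channels.Pipe;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;

class WriteInterestSketch {
    // Adds OP_WRITE to a key's interest set, as a pending send would,
    // and reports whether the selector only then sees the channel as writable.
    static boolean writableAfterAddingInterest() {
        try (Selector selector = Selector.open()) {
            Pipe pipe = Pipe.open(); // stand-in for a client connection
            try {
                Pipe.SinkChannel sink = pipe.sink();
                sink.configureBlocking(false);
                // Empty interest set: nothing is queued for sending yet.
                SelectionKey key = sink.register(selector, 0);
                boolean readyBefore = selector.selectNow() > 0;
                // A response has been queued: ask to be notified when the channel is writable.
                key.interestOps(key.interestOps() | SelectionKey.OP_WRITE);
                boolean readyAfter = selector.select(5000) > 0 && key.isWritable();
                return !readyBefore && readyAfter;
            } finally {
                pipe.sink().close();
                pipe.source().close();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("writable only after OP_WRITE was added: " + writableAfterAddingInterest());
    }
}
```

Keeping OP_WRITE out of the interest set until there is something to send avoids the selector waking up continuously for a channel that is almost always writable.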
4. poll(): this method internally calls the Selector's poll method:
public void poll(long timeout) throws IOException {
    ...
    // Number of SelectionKeys (channels) that are ready for I/O
    int numReadyKeys = select(timeout);
    long endSelect = time.nanoseconds();
    this.sensors.selectTime.record(endSelect - startSelect, time.milliseconds());
    if (numReadyKeys > 0 || !immediatelyConnectedKeys.isEmpty() || dataInBuffers) {
        // Get all SelectionKeys that are ready
        Set<SelectionKey> readyKeys = this.nioSelector.selectedKeys();
        ...
        // Iterate over the SelectionKeys and handle each one
        pollSelectionKeys(readyKeys, false, endSelect);
        readyKeys.clear();
        pollSelectionKeys(immediatelyConnectedKeys, true, endSelect);
        immediatelyConnectedKeys.clear();
    } else {
        madeReadProgressLastPoll = true; //no work is also "progress"
    }
    long endIo = time.nanoseconds();
    this.sensors.ioTime.record(endIo - endSelect, time.milliseconds());
    ...
    // TODO Process the data that has been received:
    // move the NetworkReceive objects from stagedReceives into completedReceives
    // stagedReceives: Map<KafkaChannel, Deque<NetworkReceive>>
    // completedReceives: List<NetworkReceive>
    addToCompletedReceives();
}
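Selector.poll() is where the I/O is actually performed. Underneath, it calls the select method of the Java NIO Selector and then handles the I/O operations that are ready, whether that means receiving a Request or sending a Response.
This method was analyzed in detail earlier, when looking at how the client sends messages (see the article "深入理解Kafka客户端之消息是如何发送的", on how the Kafka client sends messages). In short, it does different work depending on which events were registered, so the details are not repeated here. One thing to note: for the client, what goes out is a request to operate on data and what comes back is a response with the result; for the server it is exactly the opposite: what goes out is a response and what comes in is a request.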
5. processCompletedReceives(): this method wraps each received client request in a Request object and stores it in the requestQueue.
private def processCompletedReceives() {
  // Iterate over the NetworkReceives that have been received
  selector.completedReceives.asScala.foreach { receive =>
    try {
      // Get the KafkaChannel that this NetworkReceive arrived on
      openOrClosingChannel(receive.source) match {
        case Some(channel) =>
          val header = RequestHeader.parse(receive.payload)
          if (header.apiKey() == ApiKeys.SASL_HANDSHAKE && channel.maybeBeginServerReauthentication(receive, nowNanosSupplier))
            trace(s"Begin re-authentication: $channel")
          else {
            val nowNanos = time.nanoseconds()
            // If the session has expired, close the connection
            if (channel.serverAuthenticationSessionExpired(nowNanos)) {
              debug(s"Disconnecting expired channel: $channel : $header")
              close(channel.id)
              expiredConnectionsKilledCount.record(null, 1, 0)
            } else {
              // Get the connection id
              val connectionId = receive.source
              // Build the RequestContext
              val context = new RequestContext(header, connectionId, channel.socketAddress,
                channel.principal, listenerName, securityProtocol)
              // Wrap everything in a Request object
              val req = new RequestChannel.Request(processor = id, context = context,
                startTimeNanos = nowNanos, memoryPool, receive.payload, requestChannel.metrics)
              // TODO Core code: put the Request into requestQueue
              requestChannel.sendRequest(req)
              selector.mute(connectionId)
              handleChannelMuteEvent(connectionId, ChannelMuteEvent.REQUEST_RECEIVED)
            }
          ...
}
The key code:
// Wrap everything in a Request object
val req = new RequestChannel.Request(processor = id, context = context,
  startTimeNanos = nowNanos, memoryPool, receive.payload, requestChannel.metrics)
// TODO Core code: put the Request into requestQueue
requestChannel.sendRequest(req)
The requestChannel.sendRequest method simply puts the wrapped Request into the requestQueue:
def sendRequest(request: RequestChannel.Request) {
  requestQueue.put(request)
}
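The hand-off from Processor threads to the request handler threads is a producer/consumer exchange over a blocking queue. Below is a minimal sketch of that exchange. It assumes a bounded `ArrayBlockingQueue` and uses Strings in place of Request objects; the class `RequestQueueSketch`, the method `receiveRequest` and the capacity are illustrative choices, not taken from the code shown above.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;

class RequestQueueSketch {
    // Assumed to be bounded; Strings stand in for Request objects.
    private final BlockingQueue<String> requestQueue;

    RequestQueueSketch(int capacity) {
        this.requestQueue = new ArrayBlockingQueue<>(capacity);
    }

    // Producer side (the Processor): put blocks while the queue is full.
    void sendRequest(String request) {
        try {
            requestQueue.put(request);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while enqueuing " + request, e);
        }
    }

    // Consumer side (a handler thread): waits up to timeoutMs, returns null if nothing arrived.
    String receiveRequest(long timeoutMs) {
        try {
            return requestQueue.poll(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return null;
        }
    }

    public static void main(String[] args) {
        RequestQueueSketch channel = new RequestQueueSketch(2);
        channel.sendRequest("produce-1");
        channel.sendRequest("fetch-2");
        System.out.println(channel.receiveRequest(100));
        System.out.println(channel.receiveRequest(100));
    }
}
```

Because put blocks on a full queue, a slow consumer naturally pushes back on the threads that read from the network.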
6. processCompletedSends():
Runs the callback logic for the Responses that were put into the inflightResponses collection:
private def processCompletedSends() {
  selector.completedSends.asScala.foreach { send =>
    try {
      // Take the Response object out of inflightResponses
      val response = inflightResponses.remove(send.destination).getOrElse {
        throw new IllegalStateException(s"Send for ${send.destination} completed, but not in `inflightResponses`")
      }
      updateRequestMetrics(response)
      // Run the callback logic
      response.onComplete.foreach(onComplete => onComplete(send))
      handleChannelMuteEvent(send.destination, ChannelMuteEvent.RESPONSE_SENT)
      tryUnmuteChannel(send.destination)
    } catch {
      case e: Throwable => processChannelException(send.destination,
        s"Exception while processing completed send to ${send.destination}", e)
    }
  }
}
The callback that runs here is the onCompleteCallback parameter passed in when the Response object was constructed:
override def onComplete: Option[Send => Unit] = onCompleteCallback
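The life cycle of an in-flight response (store it when the send is queued, remove it and run its callback when the send completes) can be sketched as follows. This is a simplified model, not Kafka's code: `InflightResponsesSketch`, `onSendQueued`, `onSendCompleted` and the String-based `Response` class are hypothetical, and the callback receives the connection id rather than a Send object.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

class InflightResponsesSketch {
    // Simplified response: a body plus an optional completion callback (null when absent).
    static final class Response {
        final String body;
        final Consumer<String> onComplete;

        Response(String body, Consumer<String> onComplete) {
            this.body = body;
            this.onComplete = onComplete;
        }
    }

    // Keyed by connection id, so at most one response per connection is in flight.
    private final Map<String, Response> inflight = new HashMap<>();

    // Called when a response has been handed to the selector.
    void onSendQueued(String connectionId, Response response) {
        inflight.put(connectionId, response);
    }

    // Called when the selector reports the send as complete.
    void onSendCompleted(String connectionId) {
        Response response = inflight.remove(connectionId);
        if (response == null) {
            throw new IllegalStateException("Send for " + connectionId + " completed, but not in flight");
        }
        if (response.onComplete != null) {
            response.onComplete.accept(connectionId);
        }
    }

    int inflightCount() {
        return inflight.size();
    }

    // Walks through one queue/complete cycle and records what happened, in order.
    static List<String> demo() {
        List<String> log = new ArrayList<>();
        InflightResponsesSketch sketch = new InflightResponsesSketch();
        sketch.onSendQueued("conn-1", new Response("ok", id -> log.add("callback ran for " + id)));
        log.add("in flight: " + sketch.inflightCount());
        sketch.onSendCompleted("conn-1");
        log.add("in flight: " + sketch.inflightCount());
        return log;
    }

    public static void main(String[] args) {
        demo().forEach(System.out::println);
    }
}
```

The callback cannot run at queue time because the bytes have not left the broker yet; the map bridges the gap between the two steps of the loop.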
7. processDisconnected():
Removes the Responses that belong to disconnected connections from the inflightResponses collection and updates the quota data.
private def processDisconnected() {
  // Iterate over all connections that have been disconnected
  selector.disconnected.keySet.asScala.foreach { connectionId =>
    try {
      // Get the remote host name of the disconnected connection
      val remoteHost = ConnectionId.fromString(connectionId).getOrElse {
        throw new IllegalStateException(s"connectionId has unexpected format: $connectionId")
      }.remoteHost
      // Remove the Response of the failed connection from the temporary inflightResponses collection
      inflightResponses.remove(connectionId).foreach(updateRequestMetrics)
      // Update the quota data
      connectionQuotas.dec(listenerName, InetAddress.getByName(remoteHost))
    } catch {
      case e: Throwable => processException(s"Exception while processing disconnection of $connectionId", e)
    }
  }
}
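8. closeExcessConnections(): closes connections that exceed the limit
private def closeExcessConnections(): Unit = {
  // If the total number of connections exceeds the maximum allowed for a single broker
  if (connectionQuotas.maxConnectionsExceeded(listenerName)) {
    // Find the connection that should be closed first
    val channel = selector.lowestPriorityChannel()
    if (channel != null)
      // Close the connection
      close(channel.id)
  }
}
The connection that should be closed first is the least recently used one among the open TCP connections, that is, the one over which no Request has reached the Processor thread for the longest time.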
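Picking the least recently used connection is a classic job for an access-ordered `LinkedHashMap`. The sketch below shows that technique in isolation. How `lowestPriorityChannel` is implemented is not shown in this article, so treat this as one possible approach under that assumption; `LruConnectionSketch`, `touch` and `lowestPriority` are hypothetical names.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class LruConnectionSketch {
    // accessOrder = true: iteration runs from least recently to most recently used entry.
    private final Map<String, Long> lastUsedNanos = new LinkedHashMap<>(16, 0.75f, true);

    // Record activity on a connection; re-inserting an existing key moves it to the end.
    void touch(String connectionId, long nowNanos) {
        lastUsedNanos.put(connectionId, nowNanos);
    }

    // The first key in iteration order is the connection that has been idle the longest.
    String lowestPriority() {
        return lastUsedNanos.isEmpty() ? null : lastUsedNanos.keySet().iterator().next();
    }

    // Three connections become active, then the oldest one is used again.
    static String demo() {
        LruConnectionSketch sketch = new LruConnectionSketch();
        sketch.touch("conn-a", 1L);
        sketch.touch("conn-b", 2L);
        sketch.touch("conn-c", 3L);
        sketch.touch("conn-a", 4L); // conn-a is now the most recently used
        return sketch.lowestPriority();
    }

    public static void main(String[] args) {
        System.out.println("close first: " + demo());
    }
}
```

In the demo the candidate for closing is conn-b, because conn-a was touched again after conn-c.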
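Summary:
The Processor thread's workflow consists of the following steps:
1. Signal that the Processor thread has finished starting up.
2. Take SocketChannels from the newConnections queue one by one, register an OP_READ event on the Selector for each, wrap each in a KafkaChannel, and store the mapping from connection id to KafkaChannel in channels.
3. Take Response objects from responseQueue and act on each according to its type; a Response that has to be sent is also put into the inflightResponses collection.
4. Call the Selector's poll method, which performs the actual receiving of client requests and sending of responses.
5. Wrap each received request in a Request object and put it into the requestQueue.
6. Run the callback logic for the Responses that were put into the inflightResponses collection.
7. Find the connections that have been disconnected, remove their Responses from inflightResponses, and update the quota data.
8. If the connection limit is exceeded, close the connection that has gone the longest without sending a Request to the Processor thread.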