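In the previous section we saw the response being written to the ResponseQueue. Next, let's look at how the server prepares to send that response. Go back to the Processor's run method. The Processor thread is critical: it performs all network I/O for the connections assigned to it, from registering new connections to reading requests and writing responses.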
override def run() {
  startupComplete()
  while (isRunning) {
    try {
      // setup any new connections that have been queued up
      //TODO Register the OP_READ event for each new connection
      configureNewConnections()
      // register any new responses for writing
      //TODO Process the responses waiting in the response queue
      processNewResponses()
      //TODO poll handles read events and stores incoming requests in stagedReceives, then completedReceives
      poll()
      //TODO Process the requests that have been fully received
      processCompletedReceives()
      processCompletedSends()
      processDisconnected()
    } catch {
      // We catch all the throwables here to prevent the processor thread from exiting. We do this because
      // letting a processor exit might cause a bigger impact on the broker. Usually the exceptions thrown would
      // be either associated with a specific socket channel or a bad request. We just ignore the bad socket channel
      // or request. This behavior might need to be reviewed if we see an exception that need the entire broker to stop.
      case e: ControlThrowable => throw e
      case e: Throwable =>
        error("Processor got uncaught exception.", e)
    }
  }
  debug("Closing selector - processor " + id)
  swallowError(closeAll())
  shutdownComplete()
}
This time, focus on the processNewResponses() method:
private def processNewResponses() {
  // Take a response from this processor's response queue
  var curr = requestChannel.receiveResponse(id)
  // (The body of requestChannel.receiveResponse is inlined below for reference)
  /** Get a response for the given processor if there is one */
  /* def receiveResponse(processor: Int): RequestChannel.Response = {
       val response = responseQueues(processor).poll()
       if (response != null)
         response.request.responseDequeueTimeMs = SystemTime.milliseconds
       response
     } */
  while (curr != null) {
    try {
      curr.responseAction match {
        case RequestChannel.NoOpAction =>
          // There is no response to send to the client, we need to read more pipelined requests
          // that are sitting in the server's socket buffer
          curr.request.updateRequestMetrics
          trace("Socket server received empty response to send, registering for read: " + curr)
          selector.unmute(curr.request.connectionId)
        case RequestChannel.SendAction =>
          //TODO Send the response
          sendResponse(curr)
        case RequestChannel.CloseConnectionAction =>
          curr.request.updateRequestMetrics
          trace("Closing socket connection actively according to the response code.")
          close(selector, curr.request.connectionId)
      }
    } finally {
      curr = requestChannel.receiveResponse(id)
    }
  }
}
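The drain-and-dispatch loop above can be illustrated with a minimal Java sketch. The names SketchAction, SketchResponse and ResponseDrainSketch are hypothetical stand-ins, not Kafka classes, and each action is only recorded as a string instead of touching a real selector. The point is the shape of the loop: a non-blocking poll() that returns null on an empty queue, and a finally block that always fetches the next response, so one failing response cannot stall the queue.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical stand-in for the three response actions matched in processNewResponses.
enum SketchAction { NO_OP, SEND, CLOSE_CONNECTION }

// Hypothetical stand-in for RequestChannel.Response; only the fields the loop needs.
final class SketchResponse {
    final String connectionId;
    final SketchAction action;

    SketchResponse(String connectionId, SketchAction action) {
        this.connectionId = connectionId;
        this.action = action;
    }
}

final class ResponseDrainSketch {
    // One queue per processor: handler threads add responses, the processor polls them.
    final LinkedBlockingQueue<SketchResponse> responseQueue = new LinkedBlockingQueue<>();
    // Records what the processor would have done, in order.
    final List<String> actionsTaken = new ArrayList<>();

    // Drains everything currently queued without blocking.
    void processNewResponses() {
        SketchResponse curr = responseQueue.poll(); // null when the queue is empty
        while (curr != null) {
            try {
                switch (curr.action) {
                    case NO_OP:
                        actionsTaken.add("unmute " + curr.connectionId);
                        break;
                    case SEND:
                        actionsTaken.add("send " + curr.connectionId);
                        break;
                    case CLOSE_CONNECTION:
                        actionsTaken.add("close " + curr.connectionId);
                        break;
                }
            } finally {
                // Always advance, even if handling the current response threw.
                curr = responseQueue.poll();
            }
        }
    }

    public static void main(String[] args) {
        ResponseDrainSketch processor = new ResponseDrainSketch();
        processor.responseQueue.add(new SketchResponse("c1", SketchAction.SEND));
        processor.responseQueue.add(new SketchResponse("c2", SketchAction.NO_OP));
        processor.processNewResponses();
        System.out.println(processor.actionsTaken);
    }
}
```

Because poll() never blocks, the processor returns to its main loop as soon as the queue is empty, which keeps it free to service socket events.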
Next, follow sendResponse(curr):
/* `protected` for test usage */
protected[network] def sendResponse(response: RequestChannel.Response) {
  trace(s"Socket server received response to send, registering for write and sending data: $response")
  val channel = selector.channel(response.responseSend.destination)
  // `channel` can be null if the selector closed the connection because it was idle for too long
  if (channel == null) {
    warn(s"Attempting to send response via channel for which there is no open connection, connection id $id")
    response.request.updateRequestMetrics()
  }
  else {
    // Bind the response to its connection and register the OP_WRITE event
    selector.send(response.responseSend)
    // Cache the mapping (connection -> response)
    inflightResponses += (response.request.connectionId -> response)
  }
}
Step into selector.send(response.responseSend):
public void send(Send send) {
    // The response must be bound to its corresponding connection.
    KafkaChannel channel = channelOrFail(send.destination());
    try {
        channel.setSend(send);
    } catch (CancelledKeyException e) {
        this.failedSends.add(send.destination());
        close(channel);
    }
}
Step into channel.setSend(send):
public void setSend(Send send) {
    if (this.send != null)
        throw new IllegalStateException("Attempt to begin a send operation with prior send operation still in progress.");
    this.send = send;
    //TODO Here is the key line: it adds OP_WRITE to the events this connection is registered for
    this.transportLayer.addInterestOps(SelectionKey.OP_WRITE);
}
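Summary: in the code above, each Processor thread keeps taking responses from its queue, binds each response to its KafkaChannel, and registers interest in the OP_WRITE event for the underlying SocketChannel. That completes the preparation for sending the response back to the client.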
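To make the two effects of setSend concrete, here is a minimal Java sketch. ChannelSendSketch is a hypothetical class, not Kafka's KafkaChannel: it models the interest set as a plain int bitmask built from the real SelectionKey constants, and it assumes the channel starts out interested in OP_READ only. It shows the single-pending-send rule and how OP_WRITE is added with a bitwise OR without disturbing the other bits.

```java
import java.nio.channels.SelectionKey;

// Hypothetical model of a channel that allows one pending send at a time.
final class ChannelSendSketch {
    // Assumption for this sketch: the channel starts out interested in reads only.
    private int interestOps = SelectionKey.OP_READ;
    private byte[] send; // the single pending send, or null

    void setSend(byte[] newSend) {
        // Only one send may be in progress per channel.
        if (send != null)
            throw new IllegalStateException("prior send operation still in progress");
        send = newSend;
        // Adding interest is a bitwise OR, so existing interests are preserved.
        interestOps |= SelectionKey.OP_WRITE;
    }

    boolean isInterestedIn(int op) {
        return (interestOps & op) != 0;
    }

    public static void main(String[] args) {
        ChannelSendSketch channel = new ChannelSendSketch();
        channel.setSend(new byte[] {1, 2, 3});
        System.out.println("wants write: " + channel.isInterestedIn(SelectionKey.OP_WRITE));
    }
}
```

With a real java.nio SelectionKey the same update would be written as key.interestOps(key.interestOps() | SelectionKey.OP_WRITE).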
Now let's see how the response is actually sent. Again, start from the Processor thread's run method:
override def run() {
  startupComplete()
  while (isRunning) {
    try {
      // setup any new connections that have been queued up
      //TODO Register the OP_READ event for each new connection
      configureNewConnections()
      // register any new responses for writing
      //TODO Process the responses waiting in the response queue
      processNewResponses()
      //TODO poll handles the ready I/O events
      poll()
      //TODO Process the requests that have been fully received
      processCompletedReceives()
      processCompletedSends()
      processDisconnected()
    } catch {
      // We catch all the throwables here to prevent the processor thread from exiting. We do this because
      // letting a processor exit might cause a bigger impact on the broker. Usually the exceptions thrown would
      // be either associated with a specific socket channel or a bad request. We just ignore the bad socket channel
      // or request. This behavior might need to be reviewed if we see an exception that need the entire broker to stop.
      case e: ControlThrowable => throw e
      case e: Throwable =>
        error("Processor got uncaught exception.", e)
    }
  }
  debug("Closing selector - processor " + id)
  swallowError(closeAll())
  shutdownComplete()
}
The part to focus on is the poll method. Tracing into it:
public void poll(long timeout) throws IOException {
    if (timeout < 0)
        throw new IllegalArgumentException("timeout should be >= 0");
    clear();
    if (hasStagedReceives() || !immediatelyConnectedKeys.isEmpty())
        timeout = 0;
    /* check ready keys */
    long startSelect = time.nanoseconds();
    //TODO Run the select operation; it returns the number of ready keys
    int readyKeys = select(timeout);
    long endSelect = time.nanoseconds();
    this.sensors.selectTime.record(endSelect - startSelect, time.milliseconds());
    if (readyKeys > 0 || !immediatelyConnectedKeys.isEmpty()) {
        //TODO Process every ready key on the Selector (each connection corresponds to one key)
        pollSelectionKeys(this.nioSelector.selectedKeys(), false, endSelect);
        pollSelectionKeys(immediatelyConnectedKeys, true, endSelect);
    }
    addToCompletedReceives();
    long endIo = time.nanoseconds();
    this.sensors.ioTime.record(endIo - endSelect, time.milliseconds());
    // we use the time at the end of select to ensure that we don't close any connections that
    // have just been processed in pollSelectionKeys
    maybeCloseOldestConnection(endSelect);
}
Next, step into pollSelectionKeys:
private void pollSelectionKeys(Iterable<SelectionKey> selectionKeys,
                               boolean isImmediatelyConnected,
                               long currentTimeNanos) {
    //TODO Iterate over every SelectionKey
    Iterator<SelectionKey> iterator = selectionKeys.iterator();
    while (iterator.hasNext()) {
        SelectionKey key = iterator.next();
        iterator.remove();
        //TODO Look up the KafkaChannel that belongs to this key
        KafkaChannel channel = channel(key);
        // register all per-connection metrics at once
        sensors.maybeRegisterConnectionMetrics(channel.id());
        if (idleExpiryManager != null)
            idleExpiryManager.update(channel.id(), currentTimeNanos);
        try {
            /* complete any connections that have finished their handshake (either normally or immediately) */
            // This branch matters on the client side, where connecting registers OP_CONNECT:
            //   SelectionKey key = socketChannel.register(nioSelector, SelectionKey.OP_CONNECT);
            // The broker's Processor registers its connections for OP_READ instead (see configureNewConnections),
            // so this branch is not the one taken in the response path traced here.
            if (isImmediatelyConnected || key.isConnectable()) {
                // Call finishConnect to complete connection establishment
                if (channel.finishConnect()) {
                    this.connected.add(channel.id());
                    this.sensors.connectionCreated.record();
                    // Get the SocketChannel from the key
                    SocketChannel socketChannel = (SocketChannel) key.channel();
                    log.debug("Created socket with SO_RCVBUF = {}, SO_SNDBUF = {}, SO_TIMEOUT = {} to node {}",
                            socketChannel.socket().getReceiveBufferSize(),
                            socketChannel.socket().getSendBufferSize(),
                            socketChannel.socket().getSoTimeout(),
                            channel.id());
                } else
                    continue;
            }
            /* if channel is not ready finish prepare */
            if (channel.isConnected() && !channel.ready())
                channel.prepare();
            /* if channel is ready read from any connections that have readable data */
            //TODO Important
            // This branch handles the NIO read event. On the broker it runs when data from a client arrives.
            // It runs only if the key is registered for OP_READ and the socket is readable.
            // It is not the branch of interest while tracing how a response is written.
            if (channel.ready() && key.isReadable() && !hasStagedReceive(channel)) {
                NetworkReceive networkReceive;
                //TODO Read the data
                // A single read may yield more than one message, so the code inside has to handle
                // message framing (the TCP "sticky packet" problem).
                while ((networkReceive = channel.read()) != null)
                    addToStagedReceives(channel, networkReceive);
            }
            /* if channel is ready write to any sockets that have space in their buffer and for which we have data */
            //TODO Important
            // This branch writes pending data to the socket. On the broker that data is a response
            // (on a client it would be a request). It runs only if the key is registered for OP_WRITE.
            if (channel.ready() && key.isWritable()) {
                //TODO Important
                // This is the branch taken this time, because setSend registered the OP_WRITE event.
                // channel.write() sends the data and then removes the interest in OP_WRITE.
                Send send = channel.write();
                if (send != null) {
                    // Add the Send that has been written out to completedSends
                    this.completedSends.add(send);
                    this.sensors.recordBytesSent(channel.id(), send.size());
                }
            }
            /* cancel any defunct sockets */
            if (!key.isValid()) {
                close(channel);
                this.disconnected.add(channel.id());
            }
        } catch (Exception e) {
            String desc = channel.socketDescription();
            if (e instanceof IOException)
                log.debug("Connection with {} disconnected", desc, e);
            else
                log.warn("Unexpected error from {}; closing connection", desc, e);
            close(channel);
            this.disconnected.add(channel.id());
        }
    }
}
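Because the OP_WRITE event was registered earlier, channel.write() in the code above sends the response and then stops listening for OP_WRITE. The Send object that went out is then added to completedSends.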
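The walkthrough does not show the body of channel.write(), so the following Java sketch is only an assumption about the general pattern: a non-blocking write may accept just part of the data, and the interest in OP_WRITE is dropped only once the whole payload has gone out. PartialWriteSketch and its maxBytesPerWrite limit are hypothetical; the limit stands in for a socket buffer that accepts a bounded number of bytes per call.

```java
import java.nio.ByteBuffer;
import java.nio.channels.SelectionKey;

// Hypothetical model of writing one pending send across several write-ready events.
final class PartialWriteSketch {
    private int interestOps = SelectionKey.OP_WRITE; // a send is pending, so OP_WRITE is set
    private ByteBuffer pending;                      // bytes of the response still to write
    private final int maxBytesPerWrite;              // hypothetical per-call socket capacity
    private int bytesWritten = 0;

    PartialWriteSketch(byte[] payload, int maxBytesPerWrite) {
        this.pending = ByteBuffer.wrap(payload);
        this.maxBytesPerWrite = maxBytesPerWrite;
    }

    // Called once per write-ready event; returns true only when the send has completed.
    boolean write() {
        if (pending == null)
            return false; // nothing to send
        int n = Math.min(pending.remaining(), maxBytesPerWrite);
        pending.position(pending.position() + n); // pretend n bytes reached the socket
        bytesWritten += n;
        if (pending.hasRemaining())
            return false; // keep OP_WRITE so the selector wakes us up again
        // Fully written: stop listening for OP_WRITE and release the send.
        interestOps &= ~SelectionKey.OP_WRITE;
        pending = null;
        return true;
    }

    boolean wantsWrite() {
        return (interestOps & SelectionKey.OP_WRITE) != 0;
    }

    int bytesWritten() {
        return bytesWritten;
    }

    public static void main(String[] args) {
        PartialWriteSketch channel = new PartialWriteSketch(new byte[10], 4);
        int calls = 0;
        boolean done = false;
        while (!done) {
            done = channel.write();
            calls++;
        }
        System.out.println("completed after " + calls + " write calls");
    }
}
```

Removing OP_WRITE after completion matters because a socket is writable almost all the time; leaving the interest set would make every select return immediately even with nothing to send.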
Return to the Processor thread's run method once more. This time, focus on the processCompletedSends method:
private def processCompletedSends() {
  // Iterate over every send in completedSends
  selector.completedSends.asScala.foreach { send =>
    // Remove the successfully sent response from the cache (connection -> response)
    val resp = inflightResponses.remove(send.destination).getOrElse {
      throw new IllegalStateException(s"Send for ${send.destination} completed, but not in `inflightResponses`")
    }
    resp.request.updateRequestMetrics()
    //TODO Tracing further down shows that this registers the OP_READ event again,
    // so further messages from the client can be received
    selector.unmute(send.destination)
  }
}
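The bookkeeping in processCompletedSends can be sketched in Java as follows. InflightSketch and its method names are hypothetical, and the sketch assumes that a connection is muted (no longer read from) once a request from it has been received, which is what the unmute call above implies. Responses are plain strings here.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical model of the inflight-response cache and the mute/unmute cycle.
final class InflightSketch {
    final Map<String, String> inflightResponses = new HashMap<>(); // connection -> response
    final Set<String> mutedConnections = new HashSet<>();

    // Assumption: reading stops for a connection while its request is being handled.
    void requestReceived(String connectionId) {
        mutedConnections.add(connectionId);
    }

    // Mirrors sendResponse: remember which response is on its way to which connection.
    void sendResponse(String connectionId, String response) {
        inflightResponses.put(connectionId, response);
    }

    // Mirrors processCompletedSends: forget the response, then resume reading.
    void processCompletedSends(List<String> completedDestinations) {
        for (String destination : completedDestinations) {
            String response = inflightResponses.remove(destination);
            if (response == null)
                throw new IllegalStateException(
                    "Send for " + destination + " completed, but not in inflightResponses");
            mutedConnections.remove(destination); // the equivalent of selector.unmute
        }
    }

    public static void main(String[] args) {
        InflightSketch processor = new InflightSketch();
        processor.requestReceived("c1");
        processor.sendResponse("c1", "response-1");
        processor.processCompletedSends(Arrays.asList("c1"));
        System.out.println("still muted: " + processor.mutedConnections.contains("c1"));
    }
}
```

Under that assumption, each connection has at most one request being handled at a time, so responses go back in the order the requests arrived.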
Overall flow diagram: