Kafka Source Code Analysis: NIO

Let's go straight to the lowest layer and see how Kafka handles its network processing. Java NIO is still fairly low-level and not directly suited to application development, so people generally either use a framework like netty or wrap NIO themselves as needed, so that the upper business layer doesn't have to care about the details of network handling: it only needs to create a listening port, accept requests, process them, and write responses back. When reading Java frameworks that involve networking, such as netty or thrift, I like to study how they wrap NIO; it's a place where an author's skill really shows. The basic elements of Java NIO are Selector, Channel, and ByteBuffer.
We'll analyze the server side and the client side separately.
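Before digging into Kafka's wrappers, here is what those three primitives look like when used directly: a minimal echo server, given purely for orientation (a sketch, not Kafka code).

import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.util.Iterator;

// Minimal NIO echo server: one Selector multiplexes accept and read events.
public class MiniNioEchoServer {
    public static void main(String[] args) throws Exception {
        Selector selector = Selector.open();
        ServerSocketChannel server = ServerSocketChannel.open();
        server.bind(new InetSocketAddress(9999));
        server.configureBlocking(false);
        server.register(selector, SelectionKey.OP_ACCEPT);
        ByteBuffer buffer = ByteBuffer.allocate(1024);
        while (true) {
            selector.select();                        // block until a channel is ready
            Iterator<SelectionKey> it = selector.selectedKeys().iterator();
            while (it.hasNext()) {
                SelectionKey key = it.next();
                it.remove();
                if (key.isAcceptable()) {             // new connection: register for reads
                    SocketChannel client = server.accept();
                    client.configureBlocking(false);
                    client.register(selector, SelectionKey.OP_READ);
                } else if (key.isReadable()) {        // data arrived: echo it back
                    SocketChannel client = (SocketChannel) key.channel();
                    buffer.clear();
                    if (client.read(buffer) < 0) { client.close(); continue; }
                    buffer.flip();
                    client.write(buffer);
                }
            }
        }
    }
}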

The Kafka server side is encapsulated in org.apache.kafka.common.network, just as its package.html puts it:


The network server for kafka. No application specific code here, just general network server stuff.

The classes Receive and Send encapsulate the incoming and outgoing transmission of bytes. A Handler is a mapping between a Receive and a Send, and represents the users hook to add logic for mapping requests to actual processing code. Any uncaught exceptions in the reading or writing of the transmissions will result in the server logging an error and closing the offending socket. As a result it is the duty of the Handler implementation to catch and serialize any application-level errors that should be sent to the client.

This slightly lower-level interface that models sending and receiving rather than requests and responses is necessary in order to allow the send or receive to be overridden with a non-user-space writing of bytes using FileChannel.transferTo.

Startup process

The network layer is started from SocketServer.scala, as part of the KafkaServer startup sequence.
First, the network-related settings in server.properties:

  • listeners: the local hostname and port to bind. If unset, the hostname from InetAddress and the default port (9092) are used.
  • num.network.threads: comparable to the worker thread count in netty; these are the threads that handle socket reads and writes. In NIO's reactor pattern there is usually an Acceptor in front that establishes connections, after which the reactor dispatches read/write events to the various handlers; this setting is the number of I/O threads that dispatch and process those read/write events.
  • num.io.threads: the number of Handlers; each Handler generally runs on its own thread (also called an IOThread).

  • queued.max.requests: how many requests may queue up before the Handlers finish processing them; effectively an application-level request buffer.
  • socket.send.buffer.bytes: the SO_SNDBUF socket option.
  • socket.receive.buffer.bytes: the SO_RCVBUF socket option.
  • socket.request.max.bytes: the maximum size in bytes of a single request. The server must allocate memory to hold each request as it reads it, so an oversized request could cause an OOM; this is a safety limit.


# The number of threads that the server uses for receiving requests from the network and sending responses to the network

num.network.threads=3

# The number of threads that the server uses for processing requests, which may include disk I/O

num.io.threads=8

# The number of queued requests allowed before blocking the network threads

#queued.max.requests

# The send buffer (SO_SNDBUF) used by the socket server

socket.send.buffer.bytes=102400

# The receive buffer (SO_RCVBUF) used by the socket server

socket.receive.buffer.bytes=102400

# The maximum size of a request that the socket server will accept (protection against OOM)

socket.request.max.bytes=104857600

SocketServer

The class-level comment lays out the Kafka server's I/O threading model:


/**

 * An NIO socket server. The threading model is

 *   1 Acceptor thread that handles new connections

 *   Acceptor has N Processor threads that each have their own selector and read requests from sockets

 *   M Handler threads that handle requests and produce responses back to the processor threads for writing.

 */

Three kinds of threads in total: one Acceptor thread handles new connection requests; N Processor threads, each with its own Selector, read requests from the sockets and write results back; and M Handler threads process the requests and hand the results back to the Processors for writing.
Keeping the Acceptor separate from the Processors prevents heavy read/write traffic from interfering with accepting new connections.

SocketServer initialization

When the SocketServer is created it pulls its settings from server.properties and the defaults, e.g. numProcessorThreads (num.network.threads, the N in the threading model), and creates the processors array, the acceptors map (requests may be accepted on multiple EndPoints), the memoryPool (SimpleMemoryPool mainly tracks and monitors ByteBuffer usage), the requestChannel, and so on.


private val endpoints = config.listeners.map(l => l.listenerName -> l).toMap

private val numProcessorThreads = config.numNetworkThreads

private val maxQueuedRequests = config.queuedMaxRequests

private val totalProcessorThreads = numProcessorThreads * endpoints.size

private val maxConnectionsPerIp = config.maxConnectionsPerIp

private val maxConnectionsPerIpOverrides = config.maxConnectionsPerIpOverrides

this.logIdent = "[Socket Server on Broker " + config.brokerId + "], "

private val memoryPoolSensor = metrics.sensor("MemoryPoolUtilization")

private val memoryPoolDepletedPercentMetricName = metrics.metricName("MemoryPoolAvgDepletedPercent", "socket-server-metrics")

memoryPoolSensor.add(memoryPoolDepletedPercentMetricName, new Rate(TimeUnit.MILLISECONDS))

private val memoryPool = if (config.queuedMaxBytes > 0) new SimpleMemoryPool(config.queuedMaxBytes, config.socketRequestMaxBytes, false, memoryPoolSensor) else MemoryPool.NONE

val requestChannel = new RequestChannel(totalProcessorThreads, maxQueuedRequests)

private val processors = new Array[Processor](totalProcessorThreads)

private[network] val acceptors = mutable.Map[EndPoint, Acceptor]()

private var connectionQuotas: ConnectionQuotas = _

RequestChannel

One consequence of NIO's asynchrony is that multiple application-level requests can be sent back-to-back on a single connection, each effectively yielding a future for its response. RequestChannel keeps requests and responses in separate BlockingQueues, a requestQueue and responseQueues, where "request" means a request sent by a client. The requestQueue is bounded by queued.max.requests (default 500), while each RequestChannel holds numProcessor responseQueues (unbounded LinkedBlockingQueues).
Handlers take a request from the requestQueue, process it into a response, and put that onto a responseQueue; Processors turn received bytes into requests and put them onto the requestQueue, and pull responses from their own responseQueue to write back to the corresponding socket.
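This two-queue handoff can be sketched with plain BlockingQueues. Below is a simplified Java illustration of the pattern only (the real RequestChannel is Scala; the class and method names here are illustrative):

import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Illustrative sketch of RequestChannel's queues: one bounded requestQueue
// shared by all Handlers, one unbounded responseQueue per Processor.
class MiniRequestChannel {
    private final BlockingQueue<Object> requestQueue = new ArrayBlockingQueue<>(500); // queued.max.requests
    private final List<BlockingQueue<Object>> responseQueues = new ArrayList<>();

    MiniRequestChannel(int numProcessors) {
        for (int i = 0; i < numProcessors; i++)
            responseQueues.add(new LinkedBlockingQueue<>());
    }

    // Processor: enqueue a decoded request (blocks when the queue is full)
    void sendRequest(Object request) throws InterruptedException {
        requestQueue.put(request);
    }

    // Handler: take the next request to process
    Object receiveRequest() throws InterruptedException {
        return requestQueue.take();
    }

    // Handler: queue the response back to the Processor that owns the connection
    void sendResponse(int processorId, Object response) {
        responseQueues.get(processorId).add(response);
    }

    // Processor: non-blocking poll for responses to write back
    Object receiveResponse(int processorId) {
        return responseQueues.get(processorId).poll();
    }
}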

startup

startup creates the Processors and Acceptors, and creates connectionQuotas to cap the number of connections per client IP.


  val sendBufferSize = config.socketSendBufferBytes

  val recvBufferSize = config.socketReceiveBufferBytes

  val brokerId = config.brokerId

  var processorBeginIndex = 0

  config.listeners.foreach { endpoint =>

    val listenerName = endpoint.listenerName

    val securityProtocol = endpoint.securityProtocol

    val processorEndIndex = processorBeginIndex + numProcessorThreads

    for (i <- processorBeginIndex until processorEndIndex)

      processors(i) = newProcessor(i, connectionQuotas, listenerName, securityProtocol, memoryPool)

    val acceptor = new Acceptor(endpoint, sendBufferSize, recvBufferSize, brokerId,

      processors.slice(processorBeginIndex, processorEndIndex), connectionQuotas)

    acceptors.put(endpoint, acceptor)

    KafkaThread.nonDaemon(s"kafka-socket-acceptor-$listenerName-$securityProtocol-${endpoint.port}", acceptor).start()

    acceptor.awaitStartup()

    processorBeginIndex = processorEndIndex

  }

}

The Acceptor starts the Processor threads as part of its construction.


/**

 * Thread that accepts and configures new connections. There is one of these per endpoint.

 */

private[kafka] class Acceptor(val endPoint: EndPoint,

                              val sendBufferSize: Int,

                              val recvBufferSize: Int,

                              brokerId: Int,

                              processors: Array[Processor],

                              connectionQuotas: ConnectionQuotas) extends AbstractServerThread(connectionQuotas) with KafkaMetricsGroup {

  private val nioSelector = NSelector.open()

  val serverChannel = openServerSocket(endPoint.host, endPoint.port)

  this.synchronized {

    processors.foreach { processor =>

      KafkaThread.nonDaemon(s"kafka-network-thread-$brokerId-${endPoint.listenerName}-${endPoint.securityProtocol}-${processor.id}",

        processor).start()

    }

  }

Once started, the Acceptor and the Processors each run their own loop.

The Acceptor is only responsible for accepting new connections, which it hands to the Processors in round-robin fashion:


/**

  * Accept loop that checks for new connection attempts

  */

 def run() {

   serverChannel.register(nioSelector, SelectionKey.OP_ACCEPT)

   startupComplete()

   try {

     var currentProcessor = 0

     while (isRunning) {

       try {

         val ready = nioSelector.select(500)

         if (ready > 0) {

           val keys = nioSelector.selectedKeys()

           val iter = keys.iterator()

           while (iter.hasNext && isRunning) {

             try {

               val key = iter.next

               iter.remove()

               if (key.isAcceptable)

                 accept(key, processors(currentProcessor))

               else

                 throw new IllegalStateException("Unrecognized key state for acceptor thread.")

               // round robin to the next processor thread

               currentProcessor = (currentProcessor + 1) % processors.length

             } catch {

               case e: Throwable => error("Error while accepting connection", e)

             }

           }

         }

       }

       catch {

         // We catch all the throwables to prevent the acceptor thread from exiting on exceptions due

         // to a select operation on a specific channel or a bad request. We don't want

         // the broker to stop responding to requests from other clients in these scenarios.

         case e: ControlThrowable => throw e

         case e: Throwable => error("Error occurred", e)

       }

     }

   } finally {

     debug("Closing server socket and selector.")

     swallowError(serverChannel.close())

     swallowError(nioSelector.close())

     shutdownComplete()

   }

 }

The Acceptor accepts and configures the socket, then passes it to a Processor:


/*

   * Accept a new connection

   */

  def accept(key: SelectionKey, processor: Processor) {

    val serverSocketChannel = key.channel().asInstanceOf[ServerSocketChannel]

    val socketChannel = serverSocketChannel.accept()

    try {

      connectionQuotas.inc(socketChannel.socket().getInetAddress)

      socketChannel.configureBlocking(false)

      socketChannel.socket().setTcpNoDelay(true)

      socketChannel.socket().setKeepAlive(true)

      if (sendBufferSize != Selectable.USE_DEFAULT_BUFFER_SIZE)

        socketChannel.socket().setSendBufferSize(sendBufferSize)

      debug("Accepted connection from %s on %s and assigned it to processor %d, sendBufferSize [actual|requested]: [%d|%d] recvBufferSize [actual|requested]: [%d|%d]"

            .format(socketChannel.socket.getRemoteSocketAddress, socketChannel.socket.getLocalSocketAddress, processor.id,

                  socketChannel.socket.getSendBufferSize, sendBufferSize,

                  socketChannel.socket.getReceiveBufferSize, recvBufferSize))

      processor.accept(socketChannel)

    } catch {

      case e: TooManyConnectionsException =>

        info("Rejected connection from %s, address already has the configured maximum of %d connections.".format(e.ip, e.count))

        close(socketChannel)

    }

  }

The Processor's loop:

  • set up any newly queued connections
  • if there are responses pending, try to write them back
  • poll once with a timeout
  • receive requests
  • process responses whose sends have completed
  • process disconnected connections


override def run() {

   startupComplete()

   while (isRunning) {

     try {

       // setup any new connections that have been queued up

       configureNewConnections()

       // register any new responses for writing

       processNewResponses()

       poll()

       processCompletedReceives()

       processCompletedSends()

       processDisconnected()

     } catch {

       // We catch all the throwables here to prevent the processor thread from exiting. We do this because

       // letting a processor exit might cause a bigger impact on the broker. Usually the exceptions thrown would

       // be either associated with a specific socket channel or a bad request. We just ignore the bad socket channel

       // or request. This behavior might need to be reviewed if we see an exception that need the entire broker to stop.

       case e: ControlThrowable => throw e

       case e: Throwable =>

         error("Processor got uncaught exception.", e)

     }

   }

   debug("Closing selector - processor " + id)

   swallowError(closeAll())

   shutdownComplete()

 }

Handling new connections: configureNewConnections

New sockets handed over by the Acceptor sit in a ConcurrentLinkedQueue;
configureNewConnections() pulls out the IP, port, and related information and registers each channel with the Processor's own selector. This selector is KSelector, Kafka's wrapper around the Java Selector:


/**

   * Register any new connections that have been queued up

   */

  private def configureNewConnections() {

    while (!newConnections.isEmpty) {

      val channel = newConnections.poll()

      try {

        debug(s"Processor $id listening to new connection from ${channel.socket.getRemoteSocketAddress}")

        val localHost = channel.socket().getLocalAddress.getHostAddress

        val localPort = channel.socket().getLocalPort

        val remoteHost = channel.socket().getInetAddress.getHostAddress

        val remotePort = channel.socket().getPort

        val connectionId = ConnectionId(localHost, localPort, remoteHost, remotePort).toString

        selector.register(connectionId, channel)

      } catch {

        // We explicitly catch all non fatal exceptions and close the socket to avoid a socket leak. The other

        // throwables will be caught in processor and logged as uncaught exceptions.

        case NonFatal(e) =>

          val remoteAddress = channel.getRemoteAddress

          // need to close the channel here to avoid a socket leak.

          close(channel)

          error(s"Processor $id closed connection from $remoteAddress", e)

      }

    }

  }

processNewResponses

Polls the requestChannel for responses waiting to be written back. This sets the channel's send field to the response's responseSend, to be handled by the Selector later:


private def processNewResponses() {

    var curr = requestChannel.receiveResponse(id)

    while (curr != null) {

      try {

        curr.responseAction match {

          case RequestChannel.NoOpAction =>

            // There is no response to send to the client, we need to read more pipelined requests

            // that are sitting in the server's socket buffer

            updateRequestMetrics(curr.request)

            trace("Socket server received empty response to send, registering for read: " + curr)

            val channelId = curr.request.connectionId

            if (selector.channel(channelId) != null || selector.closingChannel(channelId) != null)

                selector.unmute(channelId)

          case RequestChannel.SendAction =>

            val responseSend = curr.responseSend.getOrElse(

              throw new IllegalStateException(s"responseSend must be defined for SendAction, response: $curr"))

            sendResponse(curr, responseSend)

          case RequestChannel.CloseConnectionAction =>

            updateRequestMetrics(curr.request)

            trace("Closing socket connection actively according to the response code.")

            close(selector, curr.request.connectionId)

        }

      } finally {

        curr = requestChannel.receiveResponse(id)

      }

    }

  }

Selector.send


/**

    * Queue the given request for sending in the subsequent {@link #poll(long)} calls

    * @param send The request to send

    */

   public void send(Send send) {

       String connectionId = send.destination();

       if (closingChannels.containsKey(connectionId))

           this.failedSends.add(connectionId);

       else {

           KafkaChannel channel = channelOrFail(connectionId, false);

           try {

               channel.setSend(send);

           } catch (CancelledKeyException e) {

               this.failedSends.add(connectionId);

               close(channel, false);

           }

       }

   }

processCompletedReceives

After the Selector receives a request it puts the data into a list; the Processor takes it from there and puts it onto the requestChannel's requestQueue:


private def processCompletedReceives() {

    selector.completedReceives.asScala.foreach { receive =>

      try {

        val openChannel = selector.channel(receive.source)

        // Only methods that are safe to call on a disconnected channel should be invoked on 'openOrClosingChannel'.

        val openOrClosingChannel = if (openChannel != null) openChannel else selector.closingChannel(receive.source)

        val session = RequestChannel.Session(new KafkaPrincipal(KafkaPrincipal.USER_TYPE, openOrClosingChannel.principal.getName), openOrClosingChannel.socketAddress)

        val req = new RequestChannel.Request(processor = id, connectionId = receive.source, session = session,

          startTimeNanos = time.nanoseconds, listenerName = listenerName, securityProtocol = securityProtocol,

          memoryPool, receive.payload)

        requestChannel.sendRequest(req)

        selector.mute(receive.source)

      } catch {

        case e @ (_: InvalidRequestException | _: SchemaException) =>

          // note that even though we got an exception, we can assume that receive.source is valid. Issues with constructing a valid receive object were handled earlier

          error(s"Closing socket for ${receive.source} because of error", e)

          close(selector, receive.source)

      }

    }

  }

processCompletedSends

After the Selector finishes sending a response, the connection -> response entry is removed from inflightResponses. At present inflightResponses serves only as a sanity check on responses: any data written on a channel must have been recorded in inflightResponses before being sent.


private def processCompletedSends() {

  selector.completedSends.asScala.foreach { send =>

    val resp = inflightResponses.remove(send.destination).getOrElse {

      throw new IllegalStateException(s"Send for ${send.destination} completed, but not in `inflightResponses`")

    }

    updateRequestMetrics(resp.request)

    selector.unmute(send.destination)

  }

}

processDisconnected

Connections whose writes failed, or that were closed for any reason, need the resources they occupied cleaned up, e.g. their entries in inflightResponses:


private def processDisconnected() {

    selector.disconnected.keySet.asScala.foreach { connectionId =>

      val remoteHost = ConnectionId.fromString(connectionId).getOrElse {

        throw new IllegalStateException(s"connectionId has unexpected format: $connectionId")

      }.remoteHost

      inflightResponses.remove(connectionId).foreach(response => updateRequestMetrics(response.request))

      // the channel has been closed by the selector but the quotas still need to be updated

      connectionQuotas.dec(InetAddress.getByName(remoteHost))

    }

  }

This essentially completes the analysis of the network layer; details that matter later will be covered separately when they come up.
Once this startup finishes, KafkaServer proceeds with the rest of its startup.

Kafka client-side network code

The clients package breaks down into four main parts: Send, Receive, KafkaChannel, and Selector.

Selectable is the interface for the network operations; Selector is the concrete implementation, covering sending requests, receiving responses, establishing connections, disconnecting, and so on.


/**

 * An interface for asynchronous, multi-channel network I/O

 */

public interface Selectable {

    /**

     * See {@link #connect(String, InetSocketAddress, int, int) connect()}

     */

    public static final int USE_DEFAULT_BUFFER_SIZE = -1;

    /**

     * Begin establishing a socket connection to the given address identified by the given address

     * @param id The id for this connection

     * @param address The address to connect to

     * @param sendBufferSize The send buffer for the socket

     * @param receiveBufferSize The receive buffer for the socket

     * @throws IOException If we cannot begin connecting

     */

    public void connect(String id, InetSocketAddress address, int sendBufferSize, int receiveBufferSize) throws IOException;

    /**

     * Wakeup this selector if it is blocked on I/O

     */

    public void wakeup();

    /**

     * Close this selector

     */

    public void close();

    /**

     * Close the connection identified by the given id

     */

    public void close(String id);

    /**

     * Queue the given request for sending in the subsequent {@link #poll(long) poll()} calls

     * @param send The request to send

     */

    public void send(Send send);

    /**

     * Do I/O. Reads, writes, connection establishment, etc.

     * @param timeout The amount of time to block if there is nothing to do

     * @throws IOException

     */

    public void poll(long timeout) throws IOException;

    /**

     * The list of sends that completed on the last {@link #poll(long) poll()} call.

     */

    public List<Send> completedSends();

    /**

     * The list of receives that completed on the last {@link #poll(long) poll()} call.

     */

    public List<NetworkReceive> completedReceives();

    /**

     * The connections that finished disconnecting on the last {@link #poll(long) poll()}

     * call. Channel state indicates the local channel state at the time of disconnection.

     */

    public Map<String, ChannelState> disconnected();

    /**

     * The list of connections that completed their connection on the last {@link #poll(long) poll()}

     * call.

     */

    public List<String> connected();

    ...

}
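Pulling the interface together, a caller drives it in a connect → send → poll → drain-results rhythm. A hedged sketch follows; building the concrete Selector (metrics, ChannelBuilder, and so on) is elided, and the node id and buffer are made up:

import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import org.apache.kafka.common.network.NetworkReceive;
import org.apache.kafka.common.network.NetworkSend;
import org.apache.kafka.common.network.Selectable;

// Usage sketch of the Selectable contract shown above; `selector` is assumed
// to be an already-constructed org.apache.kafka.common.network.Selector.
class SelectableUsageSketch {
    static void roundTrip(Selectable selector, ByteBuffer requestBuffer) throws Exception {
        selector.connect("node-1", new InetSocketAddress("broker1", 9092),
                Selectable.USE_DEFAULT_BUFFER_SIZE, Selectable.USE_DEFAULT_BUFFER_SIZE);
        // ...wait for "node-1" to appear in connected(), then queue a size-delimited send
        selector.send(new NetworkSend("node-1", requestBuffer));
        selector.poll(1000);                         // one round of real reads and writes
        for (NetworkReceive receive : selector.completedReceives())
            System.out.println("received " + receive.payload().remaining()
                    + " bytes from " + receive.source());
    }
}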

Send is the interface for data to be sent. Implementations provide completed() to report whether the send has finished, writeTo(GatheringByteChannel channel) to write the data into a channel, and size() to return the size of the data being sent.


/**

 * This interface models the in-progress sending of data to a destination identified by an integer id.

 */

public interface Send {

    /**

     * The numeric id for the destination of this send

     */

    String destination();

    /**

     * Is this send complete?

     */

    boolean completed();

    /**

     * Write some as-yet unwritten bytes from this send to the provided channel. It may take multiple calls for the send

     * to be completely written

     * @param channel The Channel to write to

     * @return The number of bytes written

     * @throws IOException If the write fails

     */

    long writeTo(GatheringByteChannel channel) throws IOException;

    /**

     * Size of the send

     */

    long size();

}

Take the ByteBufferSend implementation as an example: it holds an array of ByteBuffers as the content to send, and size is the sum of those buffers' remaining() values. Sending simply delegates to channel.write. After each write it recomputes how much is left; the send is complete once nothing remains to be written and the channel has no pending writes either.


public class ByteBufferSend implements Send {

    private final String destination;

    private final int size;

    protected final ByteBuffer[] buffers;

    private int remaining;

    private boolean pending = false;

    public ByteBufferSend(String destination, ByteBuffer... buffers) {

        this.destination = destination;

        this.buffers = buffers;

        for (ByteBuffer buffer : buffers)

            remaining += buffer.remaining();

        this.size = remaining;

    }

    @Override

    public String destination() {

        return destination;

    }

    @Override

    public boolean completed() {

        return remaining <= 0 && !pending;

    }

    @Override

    public long size() {

        return this.size;

    }

    @Override

    public long writeTo(GatheringByteChannel channel) throws IOException {

        long written = channel.write(buffers);

        if (written < 0)

            throw new EOFException("Wrote negative bytes to channel. This shouldn't happen.");

        remaining -= written;

        pending = TransportLayers.hasPendingWrites(channel);

        return written;

    }

}

The NetworkSend class extends ByteBufferSend, prepending 4 bytes that carry the size of the content (excluding those 4 bytes themselves).


/**

 * A size delimited Send that consists of a 4 byte network-ordered size N followed by N bytes of content

 */

public class NetworkSend extends ByteBufferSend {

    public NetworkSend(String destination, ByteBuffer buffer) {

        super(destination, sizeDelimit(buffer));

    }

    private static ByteBuffer[] sizeDelimit(ByteBuffer buffer) {

        return new ByteBuffer[] {sizeBuffer(buffer.remaining()), buffer};

    }

    private static ByteBuffer sizeBuffer(int size) {

        ByteBuffer sizeBuffer = ByteBuffer.allocate(4);

        sizeBuffer.putInt(size);

        sizeBuffer.rewind();

        return sizeBuffer;

    }

}

The counterpart of Send is Receive, representing data read from a channel:


public interface Receive extends Closeable {

    /**

     * The numeric id of the source from which we are receiving data.

     */

    String source();

    /**

     * Are we done receiving data?

     */

    boolean complete();

    /**

     * Read bytes into this receive from the given channel

     * @param channel The channel to read from

     * @return The number of bytes read

     * @throws IOException If the reading fails

     */

    long readFrom(ScatteringByteChannel channel) throws IOException;

    /**

     * Do we know yet how much memory we require to fully read this

     */

    boolean requiredMemoryAmountKnown();

    /**

     * Has the underlying memory required to complete reading been allocated yet?

     */

    boolean memoryAllocated();

}
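On the wire, the counterpart of NetworkSend first reads the 4-byte size header, then allocates a buffer of that size and fills it. The real class is NetworkReceive, which additionally enforces a maximum size and allocates from the MemoryPool; a simplified sketch of the core idea:

import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.ScatteringByteChannel;

// Simplified sketch of reading one size-delimited frame (the inverse of
// NetworkSend); the real NetworkReceive also checks maxSize and uses MemoryPool.
class MiniNetworkReceive {
    private final ByteBuffer sizeBuffer = ByteBuffer.allocate(4);
    private ByteBuffer payload;                      // allocated once the size is known

    boolean complete() {
        return payload != null && !payload.hasRemaining();
    }

    long readFrom(ScatteringByteChannel channel) throws IOException {
        long read = 0;
        if (sizeBuffer.hasRemaining()) {             // still reading the 4-byte header
            int r = channel.read(sizeBuffer);
            if (r < 0) throw new IOException("connection closed mid-header");
            read += r;
            if (!sizeBuffer.hasRemaining()) {
                sizeBuffer.rewind();
                payload = ByteBuffer.allocate(sizeBuffer.getInt()); // size excludes the header itself
            }
        }
        if (payload != null && payload.hasRemaining()) {
            int r = channel.read(payload);
            if (r < 0) throw new IOException("connection closed mid-payload");
            read += r;
        }
        return read;
    }
}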

The org.apache.kafka.common.network.Selector class performs the concrete operations: connecting, writing, reading, and so on. Let's look at how each of these is implemented.

The connect flow: since connect is asynchronous, the connection is not necessarily established by the time the connect method returns. Only after SelectionKey.isConnectable() fires and Channel.finishConnect() returns true does the connection count as established.


public void connect(String id, InetSocketAddress address, int sendBufferSize, int receiveBufferSize) throws IOException {

        if (this.channels.containsKey(id))

            throw new IllegalStateException("There is already a connection for id " + id);

        SocketChannel socketChannel = SocketChannel.open();

        socketChannel.configureBlocking(false);

        Socket socket = socketChannel.socket();

        socket.setKeepAlive(true);

        if (sendBufferSize != Selectable.USE_DEFAULT_BUFFER_SIZE)

            socket.setSendBufferSize(sendBufferSize);

        if (receiveBufferSize != Selectable.USE_DEFAULT_BUFFER_SIZE)

            socket.setReceiveBufferSize(receiveBufferSize);

        socket.setTcpNoDelay(true);

        boolean connected;

        try {

            connected = socketChannel.connect(address);

        } catch (UnresolvedAddressException e) {

            socketChannel.close();

            throw new IOException("Can't resolve address: " + address, e);

        } catch (IOException e) {

            socketChannel.close();

            throw e;

        }

        SelectionKey key = socketChannel.register(nioSelector, SelectionKey.OP_CONNECT);

        KafkaChannel channel;

        try {

            channel = channelBuilder.buildChannel(id, key, maxReceiveSize, memoryPool);

        } catch (Exception e) {

            try {

                socketChannel.close();

            } finally {

                key.cancel();

            }

            throw new IOException("Channel could not be created for socket " + socketChannel, e);

        }

        key.attach(channel);

        this.channels.put(id, channel);

        if (connected) {

            // OP_CONNECT won't trigger for immediately connected channels

            log.debug("Immediately connected to node {}", channel.id());

            immediatelyConnectedKeys.add(key);

            key.interestOps(0);

        }

    }

The poll method invokes the Java Selector's select once, then processes the SelectionKeys, split into connectable, readable, and writable cases:


public void poll(long timeout) throws IOException {

     if (timeout < 0)

         throw new IllegalArgumentException("timeout should be >= 0");

     boolean madeReadProgressLastCall = madeReadProgressLastPoll;

     clear();

     boolean dataInBuffers = !keysWithBufferedRead.isEmpty();

     if (hasStagedReceives() || !immediatelyConnectedKeys.isEmpty() || (madeReadProgressLastCall && dataInBuffers))

         timeout = 0;

     if (!memoryPool.isOutOfMemory() && outOfMemory) {

         //we have recovered from memory pressure. unmute any channel not explicitly muted for other reasons

         log.trace("Broker no longer low on memory - unmuting incoming sockets");

         for (KafkaChannel channel : channels.values()) {

             if (channel.isInMutableState() && !explicitlyMutedChannels.contains(channel)) {

                 channel.unmute();

             }

         }

         outOfMemory = false;

     }

     /* check ready keys */

     long startSelect = time.nanoseconds();

     int numReadyKeys = select(timeout);

     long endSelect = time.nanoseconds();

     this.sensors.selectTime.record(endSelect - startSelect, time.milliseconds());

     if (numReadyKeys > 0 || !immediatelyConnectedKeys.isEmpty() || dataInBuffers) {

         Set<SelectionKey> readyKeys = this.nioSelector.selectedKeys();

         keysWithBufferedRead.removeAll(readyKeys); //so no channel gets polled twice

         //poll from channels that have buffered data (but nothing more from the underlying socket)

         if (!keysWithBufferedRead.isEmpty()) {

             Set<SelectionKey> toPoll = keysWithBufferedRead;

             keysWithBufferedRead = new HashSet<>(); //poll() calls will repopulate if needed

             pollSelectionKeys(toPoll, false, endSelect);

         }

         //poll from channels where the underlying socket has more data

         pollSelectionKeys(readyKeys, false, endSelect);

         pollSelectionKeys(immediatelyConnectedKeys, true, endSelect);

     } else {

         madeReadProgressLastPoll = true; //no work is also "progress"

     }

     long endIo = time.nanoseconds();

     this.sensors.ioTime.record(endIo - endSelect, time.milliseconds());

     // we use the time at the end of select to ensure that we don't close any connections that

     // have just been processed in pollSelectionKeys

     maybeCloseOldestConnection(endSelect);

     // Add to completedReceives after closing expired connections to avoid removing

     // channels with completed receives until all staged receives are completed.

     addToCompletedReceives();

 }

pollSelectionKeys


/**

     * handle any ready I/O on a set of selection keys

     * @param selectionKeys set of keys to handle

     * @param isImmediatelyConnected true if running over a set of keys for just-connected sockets

     * @param currentTimeNanos time at which set of keys was determined

     */

    private void pollSelectionKeys(Set<SelectionKey> selectionKeys,

                                   boolean isImmediatelyConnected,

                                   long currentTimeNanos) {

        Iterator<SelectionKey> iterator = determineHandlingOrder(selectionKeys).iterator();

        while (iterator.hasNext()) {

            SelectionKey key = iterator.next();

            iterator.remove();

            KafkaChannel channel = channel(key);

            long channelStartTimeNanos = recordTimePerConnection ? time.nanoseconds() : 0;

            // register all per-connection metrics at once

            sensors.maybeRegisterConnectionMetrics(channel.id());

            if (idleExpiryManager != null)

                idleExpiryManager.update(channel.id(), currentTimeNanos);

            try {

                /* complete any connections that have finished their handshake (either normally or immediately) */

                if (isImmediatelyConnected || key.isConnectable()) {

                    if (channel.finishConnect()) {

                        this.connected.add(channel.id());

                        this.sensors.connectionCreated.record();

                        SocketChannel socketChannel = (SocketChannel) key.channel();

                        log.debug("Created socket with SO_RCVBUF = {}, SO_SNDBUF = {}, SO_TIMEOUT = {} to node {}",

                                socketChannel.socket().getReceiveBufferSize(),

                                socketChannel.socket().getSendBufferSize(),

                                socketChannel.socket().getSoTimeout(),

                                channel.id());

                    } else

                        continue;

                }

                /* if channel is not ready finish prepare */

                if (channel.isConnected() && !channel.ready()) {

                    channel.prepare();

                }

                attemptRead(key, channel);

                if (channel.hasBytesBuffered()) {

                    //this channel has bytes enqueued in intermediary buffers that we could not read

                    //(possibly because no memory). it may be the case that the underlying socket will

                    //not come up in the next poll() and so we need to remember this channel for the

                    //next poll call otherwise data may be stuck in said buffers forever.

                    keysWithBufferedRead.add(key);

                }

                /* if channel is ready write to any sockets that have space in their buffer and for which we have data */

                if (channel.ready() && key.isWritable()) {

                    Send send = channel.write();

                    if (send != null) {

                        this.completedSends.add(send);

                        this.sensors.recordBytesSent(channel.id(), send.size());

                    }

                }

                /* cancel any defunct sockets */

                if (!key.isValid())

                    close(channel, true);

            } catch (Exception e) {

                String desc = channel.socketDescription();

                if (e instanceof IOException)

                    log.debug("Connection with {} disconnected", desc, e);

                else

                    log.warn("Unexpected error from {}; closing connection", desc, e);

                close(channel, true);

            } finally {

                maybeRecordTimePerConnection(channel, channelStartTimeNanos);

            }

        }

    }

The Selector.send(Send send) method simply looks up the corresponding channel and calls KafkaChannel.setSend(Send send). A KafkaChannel allows only one Send to be in flight at a time; the next one cannot be queued until the current one has finished sending.
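Inside KafkaChannel, setSend enforces that one-at-a-time rule and registers write interest with the transport layer, so the next poll() will attempt the write. Paraphrased from the KafkaChannel source of this era, it looks roughly like:

public void setSend(Send send) {
    if (this.send != null)
        throw new IllegalStateException("Attempt to begin a send operation with prior send operation still in progress.");
    this.send = send;
    // register OP_WRITE so the following poll() tries to write this Send out
    this.transportLayer.addInterestOps(SelectionKey.OP_WRITE);
}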

KafkaClient is the higher-level interface defined by Kafka:


/**

 * Queue up the given request for sending. Requests can only be sent on ready connections.

 * @param request The request

 * @param now The current timestamp

 */

void send(ClientRequest request, long now);

/**

 * Do actual reads and writes from sockets.

 *

 * @param timeout The maximum amount of time to wait for responses in ms, must be non-negative. The implementation

 *                is free to use a lower value if appropriate (common reasons for this are a lower request or

 *                metadata update timeout)

 * @param now The current time in ms

 * @throws IllegalStateException If a request is sent to an unready node

 */

List<ClientResponse> poll(long timeout, long now);

The key methods are send and poll: send stores the content to be sent, and the real channel reads and writes happen inside poll.
KafkaClient's implementation class is NetworkClient.
A ClientRequest carries a requestBuilder that supplies a different request body for each request type:


public final class ClientRequest {

    private final String destination;

    private final AbstractRequest.Builder<?> requestBuilder;

    private final int correlationId;

    private final String clientId;

    private final long createdTimeMs;

    private final boolean expectResponse;

    private final RequestCompletionHandler callback;
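As an example of the builder in action, a metadata request could be queued and driven like this (a sketch: the NetworkClient and Time instances are assumed to exist already, and the node id and topic name are made up):

import java.util.Collections;
import java.util.List;
import org.apache.kafka.clients.ClientRequest;
import org.apache.kafka.clients.ClientResponse;
import org.apache.kafka.clients.NetworkClient;
import org.apache.kafka.common.requests.MetadataRequest;
import org.apache.kafka.common.utils.Time;

// Sketch: queue a MetadataRequest on node "1", then let poll() do the real I/O.
class ClientRequestSketch {
    static List<ClientResponse> fetchMetadata(NetworkClient client, Time time) {
        ClientRequest request = client.newClientRequest(
                "1",                                            // destination node id
                new MetadataRequest.Builder(Collections.singletonList("my-topic"), true),
                time.milliseconds(),
                true);                                          // expectResponse
        client.send(request, time.milliseconds());
        return client.poll(100, time.milliseconds());           // reads/writes happen here
    }
}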

Likewise, ClientResponse carries the response body corresponding to each request type:


public class ClientResponse {

    private final RequestHeader requestHeader;

    private final RequestCompletionHandler callback;

    private final String destination;

    private final long receivedTimeMs;

    private final long latencyMs;

    private final boolean disconnected;

    private final UnsupportedVersionException versionMismatch;

    private final AbstractResponse responseBody;

NetworkClient.send eventually reaches doSend, which wraps the request into an InFlightRequest, records it in inFlightRequests, and hands its Send to the selector:

private void doSend(ClientRequest clientRequest, boolean isInternalRequest, long now, AbstractRequest request) {

    String nodeId = clientRequest.destination();

    ...

    Send send = request.toSend(nodeId, header);

    InFlightRequest inFlightRequest = new InFlightRequest(

            header,

            clientRequest.createdTimeMs(),

            clientRequest.destination(),

            clientRequest.callback(),

            clientRequest.expectResponse(),

            isInternalRequest,

            request,

            send,

            now);

    this.inFlightRequests.add(inFlightRequest);

    selector.send(inFlightRequest.send);

}


The poll flow:

  1. call selector.poll
  2. handle Sends that have completed (some requests don't expect a response)
  3. handle received responses
  4. handle disconnected connections
  5. handle newly established connections
  6. handle the API-version requests initiated after a new connection is established
  7. handle timed-out requests
  8. invoke each response's onComplete, which actually invokes the callback set on the ClientRequest


public List<ClientResponse> poll(long timeout, long now) {

    if (!abortedSends.isEmpty()) {

        // If there are aborted sends because of unsupported version exceptions or disconnects,

        // handle them immediately without waiting for Selector#poll.

        List<ClientResponse> responses = new ArrayList<>();

        handleAbortedSends(responses);

        completeResponses(responses);

        return responses;

    }

    long metadataTimeout = metadataUpdater.maybeUpdate(now);

    try {

        this.selector.poll(Utils.min(timeout, metadataTimeout, requestTimeoutMs));

    } catch (IOException e) {

        log.error("Unexpected error during I/O", e);

    }

    // process completed actions

    long updatedNow = this.time.milliseconds();

    List<ClientResponse> responses = new ArrayList<>();

    handleCompletedSends(responses, updatedNow);

    handleCompletedReceives(responses, updatedNow);

    handleDisconnections(responses, updatedNow);

    handleConnections();

    handleInitiateApiVersionRequests(updatedNow);

    handleTimedOutRequests(responses, updatedNow);

    completeResponses(responses);

    return responses;

}
