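1. Background
In the first article of this series, we briefly explained why the Netty network I/O framework is needed and gave a very simple Netty server example. Although that example runs as is, it is too bare-bones to use directly in production. Starting with this article, we introduce the main components of Netty. This article covers the components commonly used in a Netty server program.
2. Common Netty Server-Side Components
Let us return to the example from the Netty overview article. The server-side code there relies mainly on the following components:
- ServerBootstrap: the bootstrap helper that configures and starts the server;
- Channel: the abstraction of a network communication channel;
- EventLoopGroup: the thread pool that processes channel events;
- ChannelHandler: the processor for events that occur on a Channel.
To understand how these components work and how they cooperate to start the server, listen on a port, and process events, we start from the point where ServerBootstrap binds to the given port and work downward step by step.
2.1 Port binding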
bootstrap.group(eventLoopGroup, childEventLoopGroup)
        .channel(serverSocketChannelClass)
        .childHandler(new ChannelInitializer<SocketChannel>() {
            @Override
            protected void initChannel(SocketChannel socketChannel) throws Exception {
                socketChannel.pipeline().addLast(new SimpleChannelHandler());
            }
        })
        // TCP_NODELAY and SO_KEEPALIVE apply to accepted connections, so both are child options.
        .childOption(ChannelOption.TCP_NODELAY, true)
        .childOption(ChannelOption.SO_KEEPALIVE, true);
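This first step sets the EventLoopGroup, the Channel type, and the ChannelHandler on the ServerBootstrap. We can think of ServerBootstrap as a container for the server-side Netty components: this step places each component into it, and each one plays its part later in the startup flow.
Next comes the key step: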
bootstrap.bind(port).addListener(
        (ChannelFutureListener) channelFuture -> {
            if (channelFuture.isSuccess()) {
                LOG.info("netty server started at port:{}", port);
            } else {
                // Log the cause here; sync() below rethrows the bind failure to the caller,
                // whereas an exception thrown from inside a listener is only logged by Netty.
                LOG.error("netty server failed to start at port:{}", port, channelFuture.cause());
            }
        }).sync();
Follow bind(port) down through the call chain:
public ChannelFuture bind(SocketAddress localAddress) {
    this.validate();
    if (localAddress == null) {
        throw new NullPointerException("localAddress");
    } else {
        return this.doBind(localAddress);
    }
}
As the code shows, a SocketAddress is built from the given port, validate() checks the bootstrap configuration, and this.doBind then performs the actual bind.
private ChannelFuture doBind(final SocketAddress localAddress) {
final ChannelFuture regFuture = initAndRegister();
final Channel channel = regFuture.channel();
if (regFuture.cause() != null) {
return regFuture;
}
if (regFuture.isDone()) {
// At this point we know that the registration was complete and successful.
ChannelPromise promise = channel.newPromise();
doBind0(regFuture, channel, localAddress, promise);
return promise;
} else {
// Registration future is almost always fulfilled already, but just in case it's not.
final PendingRegistrationPromise promise = new PendingRegistrationPromise(channel);
regFuture.addListener(new ChannelFutureListener() {
@Override
public void operationComplete(ChannelFuture future) throws Exception {
Throwable cause = future.cause();
if (cause != null) {
// Registration on the EventLoop failed so fail the ChannelPromise directly to not cause an
// IllegalStateException once we try to access the EventLoop of the Channel.
promise.setFailure(cause);
} else {
// Registration was successful, so set the correct executor to use.
// See https://github.com/netty/netty/issues/2586
promise.executor = channel.eventLoop();
}
doBind0(regFuture, channel, localAddress, promise);
}
});
return promise;
}
}
doBind is fairly long, so we take it one step at a time.
First, look at initAndRegister.
final ChannelFuture initAndRegister() {
final Channel channel = channelFactory().newChannel();
try {
init(channel);
} catch (Throwable t) {
channel.unsafe().closeForcibly();
// as the Channel is not registered yet we need to force the usage of the GlobalEventExecutor
return new DefaultChannelPromise(channel, GlobalEventExecutor.INSTANCE).setFailure(t);
}
ChannelFuture regFuture = group().register(channel);
if (regFuture.cause() != null) {
if (channel.isRegistered()) {
channel.close();
} else {
channel.unsafe().closeForcibly();
}
}
// If we are here and the promise is not failed, it's one of the following cases:
// 1) If we attempted registration from the event loop, the registration has been completed at this point.
// i.e. It's safe to attempt bind() or connect() now because the channel has been registered.
// 2) If we attempted registration from the other thread, the registration request has been successfully
// added to the event loop's task queue for later execution.
// i.e. It's safe to attempt bind() or connect() now:
// because bind() or connect() will be executed *after* the scheduled registration task is executed
// because register(), bind(), and connect() are all bound to the same thread.
return regFuture;
}
channelFactory().newChannel() creates the Channel. Assuming the channel type set on ServerBootstrap is NioServerSocketChannel, its constructor runs during creation: it puts the underlying channel into non-blocking mode and creates a DefaultChannelPipeline bound to the new channel. From then on, the DefaultChannelPipeline maintains the ChannelHandlers that handle this Channel's events.
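The idea of creating a channel from a configured class can be sketched with plain JDK reflection. This is only an illustration under stated assumptions: FakeServerChannel and ReflectiveFactorySketch are hypothetical names invented here, not Netty classes, and the real factory may differ in detail.

```java
import java.util.function.Supplier;

// Hypothetical stand-in for a channel type; not a Netty class.
class FakeServerChannel {
    final boolean blocking;

    FakeServerChannel() {
        // Mirrors the idea that the server channel is non-blocking from creation.
        this.blocking = false;
    }
}

// Hypothetical factory: creates instances of the configured class
// through its no-argument constructor.
final class ReflectiveFactorySketch<T> implements Supplier<T> {
    private final Class<? extends T> clazz;

    ReflectiveFactorySketch(Class<? extends T> clazz) {
        this.clazz = clazz;
    }

    @Override
    public T get() {
        try {
            return clazz.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Unable to create channel from " + clazz, e);
        }
    }
}
```

Each call to get() yields a fresh instance, which is why a bootstrap only needs to remember the class rather than a concrete channel object.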
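Next, init is called. For ServerBootstrap, init adds a ChannelInitializer to the server channel's pipeline. That initializer installs a ServerBootstrapAcceptor, which holds the configured childHandler and adds it to the end of the pipeline of each accepted child channel.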
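After that, group().register(channel) registers the channel with the EventLoopGroup set on the bootstrap. Taking NioEventLoopGroup as the example again: it is a MultithreadEventLoopGroup, whose register(Channel channel) picks one of its EventLoops via next() and calls that loop's register(Channel channel), implemented in SingleThreadEventLoop. That method in turn calls channel.unsafe().register(EventLoop eventLoop, final ChannelPromise promise), shown here: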
@Override
public final void register(EventLoop eventLoop, final ChannelPromise promise) {
if (eventLoop == null) {
throw new NullPointerException("eventLoop");
}
if (isRegistered()) {
promise.setFailure(new IllegalStateException("registered to an event loop already"));
return;
}
if (!isCompatible(eventLoop)) {
promise.setFailure(
new IllegalStateException("incompatible event loop type: " + eventLoop.getClass().getName()));
return;
}
AbstractChannel.this.eventLoop = eventLoop;
if (eventLoop.inEventLoop()) {
register0(promise);
} else {
try {
eventLoop.execute(new OneTimeTask() {
@Override
public void run() {
register0(promise);
}
});
} catch (Throwable t) {
logger.warn(
"Force-closing a channel whose registration task was not accepted by an event loop: {}",
AbstractChannel.this, t);
closeForcibly();
closeFuture.setClosed();
safeSetFailure(promise, t);
}
}
}
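In essence, this registers the channel with the Selector owned by its EventLoop (the actual registration happens in register0), which at the lowest level relies on Java NIO. Once the channel is registered, the Selector can detect events on it, so that they can then be processed according to business needs.
Back in the top-level doBind method, the second step is doBind0.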
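At the Java NIO level, registration amounts to attaching a non-blocking channel to a Selector and later enabling the events of interest. The following is a minimal JDK-only sketch of that idea; the class and method names are hypothetical, and the choice to register with no interest ops first and enable OP_ACCEPT after binding is an assumption made for illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetSocketAddress;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;

final class NioRegistrationSketch {
    // Registers a non-blocking server channel with a selector, first with no
    // interest ops, then enables OP_ACCEPT once the channel is bound.
    // Returns the interest ops observed before and after activation.
    static int[] registerAndActivate(Object attachment) {
        try (Selector selector = Selector.open();
             ServerSocketChannel server = ServerSocketChannel.open()) {
            server.configureBlocking(false);
            // The attachment lets the selector loop find the owning object later.
            SelectionKey key = server.register(selector, 0, attachment);
            int before = key.interestOps();
            server.bind(new InetSocketAddress("127.0.0.1", 0)); // port 0: any free port
            key.interestOps(SelectionKey.OP_ACCEPT);
            int after = key.interestOps();
            return new int[] {before, after};
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Registering with an empty interest set keeps the channel silent until it is ready to accept connections, so no event can fire before setup is complete.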
private static void doBind0(
final ChannelFuture regFuture, final Channel channel,
final SocketAddress localAddress, final ChannelPromise promise) {
// This method is invoked before channelRegistered() is triggered. Give user handlers a chance to set up
// the pipeline in its channelRegistered() implementation.
channel.eventLoop().execute(new OneTimeTask() {
@Override
public void run() {
if (regFuture.isSuccess()) {
channel.bind(localAddress, promise).addListener(ChannelFutureListener.CLOSE_ON_FAILURE);
} else {
promise.setFailure(regFuture.cause());
}
}
});
}
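The most important part here is obtaining the channel's eventLoop and calling execute on it. Netty uses two components, EventLoopGroup and EventLoop, to wrap a Java thread pool and a producer-consumer implementation. Through them, Netty wraps the bind operation in a task, starts the event loop thread if necessary (or stays on the current thread if it already is the event loop thread), and submits the task to the task queue. That is the producer side. The event loop thread, as the consumer, then takes the task from the queue and runs it.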
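When the task runs, it first checks the result of the earlier registration of the channel with the Selector. If registration succeeded, it binds to the given IP address and port, and then notifies the ChannelInboundHandlers in the channel's pipeline by calling their channelActive method. Because a ChannelHandler exists precisely so that applications can attach custom logic to channel events, channelActive is where such logic can be placed.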
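The shape of doBind0 (submit a task to the event loop, check the registration result inside the task, then either bind or fail the promise) can be sketched with JDK futures. This is a simplified model, not Netty code: DoBindSketch is a hypothetical name, CompletableFuture stands in for ChannelFuture and ChannelPromise, and the "bound:" string stands in for the real bind call.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executor;

final class DoBindSketch {
    // Submits the bind step as a task to the given executor (the stand-in for
    // the event loop). The task completes the promise according to the outcome
    // of the registration future.
    static CompletableFuture<String> doBind0(
            CompletableFuture<Void> regFuture, Executor eventLoop, int port) {
        CompletableFuture<String> promise = new CompletableFuture<>();
        eventLoop.execute(() -> regFuture.whenComplete((ignored, cause) -> {
            if (cause == null) {
                // Registration succeeded: perform the (simulated) bind.
                promise.complete("bound:" + port);
            } else {
                // Registration failed: propagate the cause to the bind promise.
                promise.completeExceptionally(cause);
            }
        }));
        return promise;
    }
}
```

Because the bind step is queued on the same executor that performed registration, it cannot run before registration when that executor is single-threaded, which is the ordering guarantee the Netty source comments describe.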
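2.2 Reading from a Channel
Reading from and writing to a Channel follow much the same flow as binding a port.
Before describing the read flow, let us briefly review two components that play an important role in it: EventLoop and EventLoopGroup.
The main Netty components were introduced briefly above. Here is a short recap: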
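ServerBootstrap acts as the container in which the Netty components run. Channel is the communication pipe, and all I/O operations go through it. To avoid the blocking behaviour of BIO, Netty uses NIO and relies on a Selector to "select" the I/O events that are currently ready. Netty supports chained processing of Channel events through ChannelPipeline: a ChannelPipeline is bound to a Channel and logically maintains a linked chain of ChannelHandlers (Channel event processors). When an I/O event occurs on a Channel, the event is passed along the ChannelHandlers in the ChannelPipeline through ChannelHandlerContext.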
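The chained processing described above can be illustrated with a small JDK-only model. All names here (InboundHandlerSketch, ContextSketch, PipelineSketch) are hypothetical and much simpler than Netty's real ChannelPipeline. The sketch only shows how a context lets each handler decide whether to pass an event to the next one.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical handler: receives a message plus a context used to pass it on.
interface InboundHandlerSketch {
    void channelRead(ContextSketch ctx, Object msg);
}

// Hypothetical context: knows its position in the chain and forwards to the next handler.
final class ContextSketch {
    private final List<InboundHandlerSketch> handlers;
    private final int index;
    private final List<Object> tail;

    ContextSketch(List<InboundHandlerSketch> handlers, int index, List<Object> tail) {
        this.handlers = handlers;
        this.index = index;
        this.tail = tail;
    }

    void fireChannelRead(Object msg) {
        int next = index + 1;
        if (next < handlers.size()) {
            handlers.get(next).channelRead(new ContextSketch(handlers, next, tail), msg);
        } else {
            tail.add(msg); // end of the chain: record what arrived
        }
    }
}

final class PipelineSketch {
    private final List<InboundHandlerSketch> handlers = new ArrayList<>();
    final List<Object> tail = new ArrayList<>();

    PipelineSketch addLast(InboundHandlerSketch handler) {
        handlers.add(handler);
        return this;
    }

    // Entry point: starts propagation before the first handler.
    void fireChannelRead(Object msg) {
        new ContextSketch(handlers, -1, tail).fireChannelRead(msg);
    }
}
```

A pipeline with a trimming handler followed by an upper-casing handler turns the input "  ping " into "PING" at the tail. A handler that does not call ctx.fireChannelRead stops propagation at that point.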
To improve concurrent processing in this flow, a thread pool is needed, and EventLoop and EventLoopGroup are Netty's wrappers around Java thread pools.
First, look at how an EventLoopGroup is instantiated.
childEventLoopGroup = new NioEventLoopGroup(2);
Following its constructor down to the bottom leads to this code:
protected MultithreadEventExecutorGroup(int nThreads, ThreadFactory threadFactory, Object... args) {
if (nThreads <= 0) {
throw new IllegalArgumentException(String.format("nThreads: %d (expected: > 0)", nThreads));
}
if (threadFactory == null) {
threadFactory = newDefaultThreadFactory();
}
children = new SingleThreadEventExecutor[nThreads];
if (isPowerOfTwo(children.length)) {
chooser = new PowerOfTwoEventExecutorChooser();
} else {
chooser = new GenericEventExecutorChooser();
}
for (int i = 0; i < nThreads; i ++) {
boolean success = false;
try {
children[i] = newChild(threadFactory, args);
success = true;
} catch (Exception e) {
// TODO: Think about if this is a good exception type
throw new IllegalStateException("failed to create a child event loop", e);
} finally {
if (!success) {
for (int j = 0; j < i; j ++) {
children[j].shutdownGracefully();
}
for (int j = 0; j < i; j ++) {
EventExecutor e = children[j];
try {
while (!e.isTerminated()) {
e.awaitTermination(Integer.MAX_VALUE, TimeUnit.SECONDS);
}
} catch (InterruptedException interrupted) {
Thread.currentThread().interrupt();
break;
}
}
}
}
}
final FutureListener<Object> terminationListener = new FutureListener<Object>() {
@Override
public void operationComplete(Future<Object> future) throws Exception {
if (terminatedChildren.incrementAndGet() == children.length) {
terminationFuture.setSuccess(null);
}
}
};
for (EventExecutor e: children) {
e.terminationFuture().addListener(terminationListener);
}
}
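As an aside, the constructor above picks one of two choosers depending on whether the number of children is a power of two. The arithmetic behind that choice can be sketched as follows. This is an illustrative reconstruction under the assumption that both choosers implement round-robin selection over an incrementing counter; ChooserSketch and its method names are hypothetical.

```java
final class ChooserSketch {
    // True when val has exactly one bit set (val is assumed to be > 0,
    // as the constructor rejects nThreads <= 0).
    static boolean isPowerOfTwo(int val) {
        return (val & -val) == val;
    }

    // Round-robin index for a power-of-two length: a bit mask replaces the modulo.
    static int powerOfTwoIndex(int counter, int length) {
        return counter & (length - 1);
    }

    // Round-robin index for any length: modulo, made non-negative so that a
    // counter that has overflowed to a negative value still yields a valid index.
    static int genericIndex(int counter, int length) {
        return Math.abs(counter % length);
    }
}
```

With four children, counters 0, 1, 2, 3, 4, 5 map to indices 0, 1, 2, 3, 0, 1 under the mask. The mask version avoids a division, which is the usual motivation for special-casing powers of two.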
The most important statement in this constructor is:
children[i] = newChild(threadFactory, args);
For NioEventLoopGroup, the method that actually runs is:
@Override
protected EventExecutor newChild(
        ThreadFactory threadFactory, Object... args) throws Exception {
    return new NioEventLoop(this, threadFactory, (SelectorProvider) args[0],
            ((SelectStrategyFactory) args[1]).newSelectStrategy());
}
One level further down:
NioEventLoop(NioEventLoopGroup parent, ThreadFactory threadFactory, SelectorProvider selectorProvider,
SelectStrategy strategy) {
super(parent, threadFactory, false);
if (selectorProvider == null) {
throw new NullPointerException("selectorProvider");
}
if (strategy == null) {
throw new NullPointerException("selectStrategy");
}
provider = selectorProvider;
selector = openSelector();
selectStrategy = strategy;
}
Its first statement invokes the superclass constructor, which ends up in the SingleThreadEventExecutor constructor:
protected SingleThreadEventExecutor(
EventExecutorGroup parent, ThreadFactory threadFactory, boolean addTaskWakesUp) {
if (threadFactory == null) {
throw new NullPointerException("threadFactory");
}
this.parent = parent;
this.addTaskWakesUp = addTaskWakesUp;
thread = threadFactory.newThread(new Runnable() {
@Override
public void run() {
boolean success = false;
updateLastExecutionTime();
try {
SingleThreadEventExecutor.this.run();
success = true;
} catch (Throwable t) {
logger.warn("Unexpected exception from an event executor: ", t);
} finally {
for (;;) {
int oldState = STATE_UPDATER.get(SingleThreadEventExecutor.this);
if (oldState >= ST_SHUTTING_DOWN || STATE_UPDATER.compareAndSet(
SingleThreadEventExecutor.this, oldState, ST_SHUTTING_DOWN)) {
break;
}
}
// Check if confirmShutdown() was called at the end of the loop.
if (success && gracefulShutdownStartTime == 0) {
logger.error(
"Buggy " + EventExecutor.class.getSimpleName() + " implementation; " +
SingleThreadEventExecutor.class.getSimpleName() + ".confirmShutdown() must be called " +
"before run() implementation terminates.");
}
try {
// Run all remaining tasks and shutdown hooks.
for (;;) {
if (confirmShutdown()) {
break;
}
}
} finally {
try {
cleanup();
} finally {
STATE_UPDATER.set(SingleThreadEventExecutor.this, ST_TERMINATED);
threadLock.release();
if (!taskQueue.isEmpty()) {
logger.warn(
"An event executor terminated with " +
"non-empty task queue (" + taskQueue.size() + ')');
}
terminationFuture.setSuccess(null);
}
}
}
}
});
threadProperties = new DefaultThreadProperties(thread);
taskQueue = newTaskQueue();
}
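The constructor shows that an EventLoop is essentially a single-threaded thread pool. It does two main things: it creates the thread, whose body calls the run method that contains the main execution logic, and it creates the task queue, which decouples the producers that submit tasks from the thread that consumes them.
For NioEventLoop, the run method is as follows: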
@Override
protected void run() {
for (;;) {
try {
switch (selectStrategy.calculateStrategy(selectNowSupplier, hasTasks())) {
case SelectStrategy.CONTINUE:
continue;
case SelectStrategy.SELECT:
select(wakenUp.getAndSet(false));
// 'wakenUp.compareAndSet(false, true)' is always evaluated
// before calling 'selector.wakeup()' to reduce the wake-up
// overhead. (Selector.wakeup() is an expensive operation.)
//
// However, there is a race condition in this approach.
// The race condition is triggered when 'wakenUp' is set to
// true too early.
//
// 'wakenUp' is set to true too early if:
// 1) Selector is waken up between 'wakenUp.set(false)' and
// 'selector.select(...)'. (BAD)
// 2) Selector is waken up between 'selector.select(...)' and
// 'if (wakenUp.get()) { ... }'. (OK)
//
// In the first case, 'wakenUp' is set to true and the
// following 'selector.select(...)' will wake up immediately.
// Until 'wakenUp' is set to false again in the next round,
// 'wakenUp.compareAndSet(false, true)' will fail, and therefore
// any attempt to wake up the Selector will fail, too, causing
// the following 'selector.select(...)' call to block
// unnecessarily.
//
// To fix this problem, we wake up the selector again if wakenUp
// is true immediately after selector.select(...).
// It is inefficient in that it wakes up the selector for both
// the first case (BAD - wake-up required) and the second case
// (OK - no wake-up required).
if (wakenUp.get()) {
selector.wakeup();
}
default:
// fallthrough
}
cancelledKeys = 0;
needsToSelectAgain = false;
final int ioRatio = this.ioRatio;
if (ioRatio == 100) {
processSelectedKeys();
runAllTasks();
} else {
final long ioStartTime = System.nanoTime();
processSelectedKeys();
final long ioTime = System.nanoTime() - ioStartTime;
runAllTasks(ioTime * (100 - ioRatio) / ioRatio);
}
if (isShuttingDown()) {
closeAll();
if (confirmShutdown()) {
break;
}
}
} catch (Throwable t) {
logger.warn("Unexpected exception in the selector loop.", t);
// Prevent possible consecutive immediate failures that lead to
// excessive CPU consumption.
try {
Thread.sleep(1000);
} catch (InterruptedException e) {
// Ignore.
}
}
}
}
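Once the thread has started (when it is started is covered later; so far only the thread pool has been initialized), processSelectedKeys takes the SelectionKeys that the Selector has "selected", which identify the different I/O events, and dispatches them to the corresponding ChannelHandler methods for business processing, which may in turn add tasks to the task queue. In addition, runAllTasks consumes the tasks in the task queue. We do not dig into these two methods yet and return to them after seeing how the thread is started.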
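The expression passed to runAllTasks in the run method above, ioTime * (100 - ioRatio) / ioRatio, turns the time just spent on I/O into a time budget for queued tasks. A small sketch of that arithmetic, with hypothetical class and method names:

```java
final class IoRatioSketch {
    // Time budget in nanoseconds for non-I/O tasks, given the time just spent
    // on I/O and the percentage of loop time that I/O is meant to receive.
    // Note: in the run method shown above, ioRatio == 100 takes a separate branch
    // that runs all tasks without a deadline; this formula would yield 0 there.
    static long taskBudgetNanos(long ioTimeNanos, int ioRatio) {
        if (ioRatio <= 0 || ioRatio > 100) {
            throw new IllegalArgumentException("ioRatio: " + ioRatio + " (expected: 1..100)");
        }
        return ioTimeNanos * (100 - ioRatio) / ioRatio;
    }
}
```

For example, with ioRatio 50 the tasks get as much time as the I/O just took, and with ioRatio 80 they get a quarter of it (1000 ns of I/O gives a 250 ns budget).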
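Next, look at how the EventLoopGroup is set on the ServerBootstrap. Calling group on ServerBootstrap stores the EventLoopGroups in the bootstrap. There is a parentGroup and a childGroup because Netty follows the Reactor pattern here: the parentGroup handles Accept events on the server Channel, and the accepted connections are handed to the childGroup, which processes their remaining I/O events.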
public ServerBootstrap group(EventLoopGroup parentGroup, EventLoopGroup childGroup) {
    super.group(parentGroup);
    if (childGroup == null) {
        throw new NullPointerException("childGroup");
    } else if (this.childGroup != null) {
        throw new IllegalStateException("childGroup set already");
    } else {
        this.childGroup = childGroup;
        return this;
    }
}
Now look at how the thread is started. Return to the port binding operation from section 2.1, which ends up in this code:
private static void doBind0(
final ChannelFuture regFuture, final Channel channel,
final SocketAddress localAddress, final ChannelPromise promise) {
// This method is invoked before channelRegistered() is triggered. Give user handlers a chance to set up
// the pipeline in its channelRegistered() implementation.
channel.eventLoop().execute(new OneTimeTask() {
@Override
public void run() {
if (regFuture.isSuccess()) {
channel.bind(localAddress, promise).addListener(ChannelFutureListener.CLOSE_ON_FAILURE);
} else {
promise.setFailure(regFuture.cause());
}
}
});
}
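The code obtains the eventLoop assigned to the channel and calls its execute method. If the caller is already on the event loop thread, execute only adds the task. Otherwise it first starts the thread, which runs NioEventLoop's run method, and then adds the task.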
public void execute(Runnable task) {
    if (task == null) {
        throw new NullPointerException("task");
    } else {
        boolean inEventLoop = this.inEventLoop();
        if (inEventLoop) {
            this.addTask(task);
        } else {
            this.startThread();
            this.addTask(task);
            if (this.isShutdown() && this.removeTask(task)) {
                reject();
            }
        }
        if (!this.addTaskWakesUp && this.wakesUpForTask(task)) {
            this.wakeup(inEventLoop);
        }
    }
}
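The lazy-start behaviour of execute can be modelled with a few JDK classes. This is a minimal sketch, not Netty's implementation: MiniEventLoop is a hypothetical name, a LinkedBlockingQueue with a timed poll stands in for the selector-driven loop, and shutdown handling is reduced to a flag.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

final class MiniEventLoop {
    private final BlockingQueue<Runnable> taskQueue = new LinkedBlockingQueue<>();
    private final AtomicBoolean started = new AtomicBoolean();
    private final Thread thread;
    private volatile boolean shuttingDown;

    MiniEventLoop() {
        // The thread is created up front but not started until the first task arrives.
        thread = new Thread(this::runLoop, "mini-event-loop");
        thread.setDaemon(true);
    }

    boolean inEventLoop() {
        return Thread.currentThread() == thread;
    }

    boolean isStarted() {
        return started.get();
    }

    void execute(Runnable task) {
        if (task == null) {
            throw new NullPointerException("task");
        }
        if (!inEventLoop()) {
            startThread(); // a caller outside the loop may be the first one
        }
        taskQueue.add(task);
    }

    void shutdown() {
        shuttingDown = true;
    }

    private void startThread() {
        // compareAndSet ensures the thread is started at most once.
        if (started.compareAndSet(false, true)) {
            thread.start();
        }
    }

    private void runLoop() {
        // Consumer side: drain tasks until shutdown is requested and the queue is empty.
        while (!shuttingDown || !taskQueue.isEmpty()) {
            try {
                Runnable task = taskQueue.poll(50, TimeUnit.MILLISECONDS);
                if (task != null) {
                    task.run();
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
        }
    }
}
```

Constructing a MiniEventLoop starts nothing. The first execute call from another thread starts the loop thread, and every task then observes inEventLoop() as true while it runs.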
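With that, we can return to the run method of NioEventLoop. For each ready key, processSelectedKeys ends up in processSelectedKey, which wraps the Java NIO read and write operations: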
private static void processSelectedKey(SelectionKey k, AbstractNioChannel ch) {
final NioUnsafe unsafe = ch.unsafe();
if (!k.isValid()) {
// close the channel if the key is not valid anymore
unsafe.close(unsafe.voidPromise());
return;
}
try {
int readyOps = k.readyOps();
// Also check for readOps of 0 to workaround possible JDK bug which may otherwise lead
// to a spin loop
if ((readyOps & (SelectionKey.OP_READ | SelectionKey.OP_ACCEPT)) != 0 || readyOps == 0) {
unsafe.read();
if (!ch.isOpen()) {
// Connection already closed - no need to handle write.
return;
}
}
if ((readyOps & SelectionKey.OP_WRITE) != 0) {
// Call forceFlush which will also take care of clear the OP_WRITE once there is nothing left to write
ch.unsafe().forceFlush();
}
if ((readyOps & SelectionKey.OP_CONNECT) != 0) {
// remove OP_CONNECT as otherwise Selector.select(..) will always return without blocking
// See https://github.com/netty/netty/issues/924
int ops = k.interestOps();
ops &= ~SelectionKey.OP_CONNECT;
k.interestOps(ops);
unsafe.finishConnect();
}
} catch (CancelledKeyException ignored) {
unsafe.close(unsafe.voidPromise());
}
}
When an I/O Read event arrives, the ChannelHandlers in the pipeline are notified in turn, and their channelRead methods process the data that was read.
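The bit tests in processSelectedKey can be isolated into a small function that maps a readyOps value to the actions taken. This sketch uses only the JDK's SelectionKey constants. ReadyOpsSketch and the action labels are hypothetical, and the early return after a read that closes the channel is left out.

```java
import java.nio.channels.SelectionKey;
import java.util.ArrayList;
import java.util.List;

final class ReadyOpsSketch {
    // Returns the actions that the dispatch logic would take for the given readyOps bits.
    static List<String> actionsFor(int readyOps) {
        List<String> actions = new ArrayList<>();
        // Read and accept share one branch; readyOps == 0 is also treated as a
        // read, mirroring the spin-loop workaround noted in the source comments.
        if ((readyOps & (SelectionKey.OP_READ | SelectionKey.OP_ACCEPT)) != 0 || readyOps == 0) {
            actions.add("read");
        }
        if ((readyOps & SelectionKey.OP_WRITE) != 0) {
            actions.add("flush");
        }
        if ((readyOps & SelectionKey.OP_CONNECT) != 0) {
            actions.add("finishConnect");
        }
        return actions;
    }
}
```

Because readyOps is a bit set, one key can trigger several actions in a single pass: OP_READ | OP_WRITE yields both "read" and "flush".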
2.3 Writing to a Channel
As with reading from a Channel in the previous section, I/O Write events are picked up and handled on the same event loop thread.