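Netty is an NIO framework that most Java developers already know. But do you know how it is actually implemented?
Let's start the source-code analysis with a piece of server-side code written with Netty 3: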
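import java.net.InetSocketAddress;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

import org.jboss.netty.bootstrap.ServerBootstrap;
import org.jboss.netty.channel.ChannelPipeline;
import org.jboss.netty.channel.ChannelPipelineFactory;
import org.jboss.netty.channel.Channels;
import org.jboss.netty.channel.socket.nio.NioServerSocketChannelFactory;
import org.jboss.netty.handler.codec.string.StringDecoder;
import org.jboss.netty.handler.codec.string.StringEncoder;

public class NettyServer {
    public static void main(String[] args) {
        ServerBootstrap bootstrap = new ServerBootstrap();
        // boss listens on the port, worker handles reads and writes
        ExecutorService boss = Executors.newCachedThreadPool();
        ExecutorService worker = Executors.newCachedThreadPool();
        // NIO server socket channel factory
        bootstrap.setFactory(new NioServerSocketChannelFactory(boss, worker));
        // set the pipeline factory
        bootstrap.setPipelineFactory(new ChannelPipelineFactory() {
            @Override
            public ChannelPipeline getPipeline() throws Exception {
                ChannelPipeline pipeline = Channels.pipeline();
                pipeline.addLast("decoder", new StringDecoder());
                pipeline.addLast("encoder", new StringEncoder());
                // NettyHandler is the application's own handler class (not shown here)
                pipeline.addLast("handler", new NettyHandler());
                return pipeline;
            }
        });
        bootstrap.bind(new InetSocketAddress(9999));
        System.out.println("start!!!");
    }
}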
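The Netty 3 API is easier to read than the Netty 5 API. The core principles carry over, though: Netty 5 mostly adds further encapsulation on top of parts of Netty 3. That is why today's walkthrough starts from Netty 3.
The code above is a simple Netty 3 server. It first creates two thread pools, boss and worker: the boss pool listens on the port, and the worker pool reads and writes data. It then creates a socket channel factory and a pipeline factory. The pipeline factory works like a chain of filters to which you can add extra processing steps (here a decoder, an encoder and a handler).
The call new NioServerSocketChannelFactory(boss, worker), highlighted in red in the original post, is the focus of today's analysis. It is only a constructor call, yet it contains the way Netty builds on NIO.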
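First, think about these questions:
- We all know that NIO lets one thread serve many client connections. But a single thread can only do so much, so how do we improve NIO performance? You might think of going from one thread to several, but how do you use multiple threads with NIO?
- Can an NIO application have only one selector?
- Can a selector have only one ServerSocketChannel registered on it?
Once these questions are clear, Netty's design is easy to understand. Using multiple threads is indeed a workable way to improve NIO performance. An NIO application can also have several selectors, and one selector can have several ServerSocketChannels registered on it. Netty provides examples of all three.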
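The last two answers can be checked directly against the JDK. The following is a minimal sketch that uses plain java.nio rather than Netty; the class and method names (MultiSelectorDemo, listenOn, describe) are made up for illustration. It opens two selectors in one process and registers two ServerSocketChannels on the first one.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetSocketAddress;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;

// Hypothetical demo class; plain JDK NIO only, no Netty involved.
class MultiSelectorDemo {

    // Opens a non-blocking server channel on an ephemeral port and
    // registers it with the given selector for accept events.
    static ServerSocketChannel listenOn(Selector selector) throws IOException {
        ServerSocketChannel server = ServerSocketChannel.open();
        server.configureBlocking(false);
        server.bind(new InetSocketAddress("127.0.0.1", 0)); // port 0: let the OS pick
        server.register(selector, SelectionKey.OP_ACCEPT);
        return server;
    }

    // Reports how many channels are registered on each of two selectors.
    static String describe() {
        try (Selector first = Selector.open();
             Selector second = Selector.open();          // several selectors per process
             ServerSocketChannel a = listenOn(first);
             ServerSocketChannel b = listenOn(first);    // two server channels, one selector
             ServerSocketChannel c = listenOn(second)) {
            return "first=" + first.keys().size() + ", second=" + second.keys().size();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

Each selector can then be driven by its own thread, which is the idea the rest of this article traces through Netty.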
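Before walking through the source, let me sort out the relationships between a few classes; this will help when you read the code.
When you step into new NioServerSocketChannelFactory(boss, worker), you will come across the classes in the diagram above one after another. The diagram shows inheritance: parents on the left, children on the right.
The diagram shows the most important elements in Netty:
- By role, the classes divide into boss and worker. The boss listens on the port, which corresponds to the job of ServerSocketChannel in NIO. The worker performs reads and writes, which corresponds to the job of SocketChannel in NIO.
- By function, the classes divide into pools and selectors.
This analysis raises a few questions:
- We said that the boss listens on the port and the worker reads and writes, and that each has its own pool (thread pool). But listening and reading/writing cannot be entirely separate. In plain NIO we call accept() on the ServerSocketChannel to obtain a SocketChannel and then register it with a selector. How is that step carried out here?
- Each pool holds an array (bosses and workers). What are these two arrays for?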
Source code walkthrough
Let's first look at what new NioServerSocketChannelFactory(boss, worker) from the first listing does:
public NioServerSocketChannelFactory(
        Executor bossExecutor, Executor workerExecutor) {
    this(bossExecutor, workerExecutor, getMaxThreads(workerExecutor));
}

public NioServerSocketChannelFactory(
        Executor bossExecutor, Executor workerExecutor,
        int workerCount) {
    this(bossExecutor, 1, workerExecutor, workerCount);
}

public NioServerSocketChannelFactory(
        Executor bossExecutor, int bossCount, Executor workerExecutor,
        int workerCount) {
    this(bossExecutor, bossCount, new NioWorkerPool(workerExecutor, workerCount));
}

public NioServerSocketChannelFactory(
        Executor bossExecutor, int bossCount, WorkerPool<NioWorker> workerPool) {
    this(new NioServerBossPool(bossExecutor, bossCount, null), workerPool);
}
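These constructors call one another in a chain. Once all four have run, two objects have been created: a NioServerBossPool and a NioWorkerPool. Each is given a size when it is constructed: NioServerBossPool gets 1 by default, and NioWorkerPool gets the maximum thread count returned by getMaxThreads(workerExecutor) (highlighted in red in the original post). These two sizes are the initial lengths of the bosses and workers arrays.
Let's look at how NioServerBossPool is initialized first: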
public NioServerBossPool(Executor bossExecutor, int bossCount, ThreadNameDeterminer determiner) {
    super(bossExecutor, bossCount, false);
    this.determiner = determiner;
    init();
}

AbstractNioBossPool(Executor bossExecutor, int bossCount, boolean autoInit) {
    if (bossExecutor == null) {
        throw new NullPointerException("bossExecutor");
    }
    if (bossCount <= 0) {
        throw new IllegalArgumentException(
                "bossCount (" + bossCount + ") " +
                        "must be a positive integer.");
    }
    bosses = new Boss[bossCount];
    this.bossExecutor = bossExecutor;
    if (autoInit) {
        init();
    }
}

protected void init() {
    if (!initialized.compareAndSet(false, true)) {
        throw new IllegalStateException("initialized already");
    }
    for (int i = 0; i < bosses.length; i++) {
        bosses[i] = newBoss(bossExecutor);
    }
    waitForBossThreads();
}
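The code above is straightforward. On construction, the pool stores the executor that was passed in, creates the bosses array, and fills it with NioServerBoss objects (NioServerBoss implements the Boss interface).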
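One detail in this listing is easy to miss: NioServerBossPool passes autoInit = false to the superclass and calls init() itself only after assigning determiner. The sketch below reproduces that shape with deliberately simplified, hypothetical classes (SketchPool, NamedPool, and String items instead of Boss objects) to show why the deferred init matters: the factory method reads a subclass field that does not exist yet while the superclass constructor is running.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical, simplified stand-in for AbstractNioBossPool:
// same structure, but the "bosses" are plain strings.
abstract class SketchPool {
    private final String[] items;
    private final AtomicBoolean initialized = new AtomicBoolean();

    SketchPool(int count, boolean autoInit) {
        if (count <= 0) {
            throw new IllegalArgumentException("count (" + count + ") must be a positive integer.");
        }
        items = new String[count];
        if (autoInit) {
            init();
        }
    }

    // The compare-and-set guard allows exactly one successful initialization.
    protected void init() {
        if (!initialized.compareAndSet(false, true)) {
            throw new IllegalStateException("initialized already");
        }
        for (int i = 0; i < items.length; i++) {
            items[i] = newItem(i);
        }
    }

    protected abstract String newItem(int index);

    String item(int index) {
        return items[index];
    }
}

// Hypothetical subclass playing the role of NioServerBossPool.
final class NamedPool extends SketchPool {
    private final String prefix;

    NamedPool(int count, String prefix) {
        super(count, false);   // defer init(): 'prefix' is not assigned yet
        this.prefix = prefix;
        init();                // now newItem() can safely read 'prefix'
    }

    @Override
    protected String newItem(int index) {
        return prefix + "-" + index;
    }
}
```

With autoInit = true the superclass would call newItem() before prefix is set, and every item would start with "null". Calling init() a second time fails with IllegalStateException, just as in the original code.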
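Next, let's see what happens inside newBoss. It constructs a NioServerBoss, and that constructor chain ends in AbstractNioSelector: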
AbstractNioSelector(Executor executor, ThreadNameDeterminer determiner) {
    this.executor = executor;
    openSelector(determiner);
}

private void openSelector(ThreadNameDeterminer determiner) {
    try {
        selector = SelectorUtil.open();
    } catch (Throwable t) {
        throw new ChannelException("Failed to create a selector.", t);
    }
    // Start the worker thread with the new Selector.
    boolean success = false;
    try {
        DeadLockProofWorker.start(executor, newThreadRenamingRunnable(id, determiner));
        success = true;
    } finally {
        if (!success) {
            // Release the Selector if the execution fails.
            try {
                selector.close();
            } catch (Throwable t) {
                logger.warn("Failed to close a selector.", t);
            }
            selector = null;
            // The method will return to the caller at this point.
        }
    }
    assert selector != null && selector.isOpen();
}
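openSelector is a method of AbstractNioSelector, and it is one of the key points in the Netty source. When a boss is created, it opens a selector and keeps it in a field. A Boss is therefore essentially a class that plays the role of a selector: it wraps extra logic around the selector, but at its core it is a selector. The bosses and workers arrays can be thought of as two arrays of selectors.
The DeadLockProofWorker.start(...) call (highlighted in red in the original post) starts the boss thread. The object created here is a NioServerBoss, which implements the Boss interface and is also a Runnable.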
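The pattern in openSelector, "open a selector, then hand yourself to the executor, and release the selector if that fails", can be sketched with the JDK alone. The class below is a hypothetical simplification (SelectorLoop is not a Netty class, and it calls Executor.execute directly instead of going through DeadLockProofWorker); it only illustrates the idea that one object owns one selector and one loop thread.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.channels.Selector;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executor;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch: one object = one selector + one loop thread.
class SelectorLoop implements Runnable {
    final Selector selector;
    private final CountDownLatch started = new CountDownLatch(1);
    private volatile boolean shutdown;

    SelectorLoop(Executor executor) {
        try {
            selector = Selector.open();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        boolean success = false;
        try {
            executor.execute(this);   // the loop thread is started here
            success = true;
        } finally {
            if (!success) {
                closeQuietly();       // release the selector if the start failed
            }
        }
    }

    @Override
    public void run() {
        started.countDown();
        try {
            while (!shutdown) {
                selector.select(500);            // wait for events or a wakeup
                selector.selectedKeys().clear(); // a real loop would handle the keys here
            }
        } catch (IOException e) {
            // leave the loop; the selector is closed below
        } finally {
            closeQuietly();
        }
    }

    // Waits until the loop thread has entered run().
    boolean awaitStarted() {
        try {
            return started.await(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    void shutdown() {
        shutdown = true;
        selector.wakeup();
    }

    private void closeQuietly() {
        try {
            selector.close();
        } catch (IOException ignored) {
            // nothing useful to do in a sketch
        }
    }
}
```

Creating a SelectorLoop with Executors.newCachedThreadPool() occupies one thread of that pool for as long as the loop runs, which is why the pool sizes discussed earlier translate directly into numbers of selectors.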
So let's look at the run method that NioServerBoss executes:
public void run() {
    thread = Thread.currentThread();
    startupLatch.countDown();
    int selectReturnsImmediately = 0;
    Selector selector = this.selector;
    if (selector == null) {
        return;
    }
    // use 80% of the timeout for measure
    final long minSelectTimeout = SelectorUtil.SELECT_TIMEOUT_NANOS * 80 / 100;
    boolean wakenupFromLoop = false;
    for (;;) {
        wakenUp.set(false);
        try {
            long beforeSelect = System.nanoTime();
            int selected = select(selector);
            if (selected == 0 && !wakenupFromLoop && !wakenUp.get()) {
                long timeBlocked = System.nanoTime() - beforeSelect;
                if (timeBlocked < minSelectTimeout) {
                    boolean notConnected = false;
                    // loop over all keys as the selector may was unblocked because of a closed channel
                    for (SelectionKey key: selector.keys()) {
                        SelectableChannel ch = key.channel();
                        try {
                            if (ch instanceof DatagramChannel && !ch.isOpen() ||
                                    ch instanceof SocketChannel && !((SocketChannel) ch).isConnected() &&
                                            // Only cancel if the connection is not pending
                                            // See https://github.com/netty/netty/issues/2931
                                            !((SocketChannel) ch).isConnectionPending()) {
                                notConnected = true;
                                // cancel the key just to be on the safe side
                                key.cancel();
                            }
                        } catch (CancelledKeyException e) {
                            // ignore
                        }
                    }
                    if (notConnected) {
                        selectReturnsImmediately = 0;
                    } else {
                        if (Thread.interrupted() && !shutdown) {
                            // Thread was interrupted but NioSelector was not shutdown.
                            // As this is most likely a bug in the handler of the user or it's client
                            // library we will log it.
                            //
                            // See https://github.com/netty/netty/issues/2426
                            if (logger.isDebugEnabled()) {
                                logger.debug("Selector.select() returned prematurely because the I/O thread " +
                                        "has been interrupted. Use shutdown() to shut the NioSelector down.");
                            }
                            selectReturnsImmediately = 0;
                        } else {
                            // Returned before the minSelectTimeout elapsed with nothing selected.
                            // This may be because of a bug in JDK NIO Selector provider, so increment the counter
                            // which we will use later to see if it's really the bug in JDK.
                            selectReturnsImmediately ++;
                        }
                    }
                } else {
                    selectReturnsImmediately = 0;
                }
            } else {
                selectReturnsImmediately = 0;
            }
            if (SelectorUtil.EPOLL_BUG_WORKAROUND) {
                if (selectReturnsImmediately == 1024) {
                    // The selector returned immediately for 10 times in a row,
                    // so recreate one selector as it seems like we hit the
                    // famous epoll(..) jdk bug.
                    rebuildSelector();
                    selector = this.selector;
                    selectReturnsImmediately = 0;
                    wakenupFromLoop = false;
                    // try to select again
                    continue;
                }
            } else {
                // reset counter
                selectReturnsImmediately = 0;
            }
            // 'wakenUp.compareAndSet(false, true)' is always evaluated
            // before calling 'selector.wakeup()' to reduce the wake-up
            // overhead. (Selector.wakeup() is an expensive operation.)
            //
            // However, there is a race condition in this approach.
            // The race condition is triggered when 'wakenUp' is set to
            // true too early.
            //
            // 'wakenUp' is set to true too early if:
            // 1) Selector is waken up between 'wakenUp.set(false)' and
            //    'selector.select(...)'. (BAD)
            // 2) Selector is waken up between 'selector.select(...)' and
            //    'if (wakenUp.get()) { ... }'. (OK)
            //
            // In the first case, 'wakenUp' is set to true and the
            // following 'selector.select(...)' will wake up immediately.
            // Until 'wakenUp' is set to false again in the next round,
            // 'wakenUp.compareAndSet(false, true)' will fail, and therefore
            // any attempt to wake up the Selector will fail, too, causing
            // the following 'selector.select(...)' call to block
            // unnecessarily.
            //
            // To fix this problem, we wake up the selector again if wakenUp
            // is true immediately after selector.select(...).
            // It is inefficient in that it wakes up the selector for both
            // the first case (BAD - wake-up required) and the second case
            // (OK - no wake-up required).
            if (wakenUp.get()) {
                wakenupFromLoop = true;
                selector.wakeup();
            } else {
                wakenupFromLoop = false;
            }
            cancelledKeys = 0;
            processTaskQueue();
            selector = this.selector; // processTaskQueue() can call rebuildSelector()
            if (shutdown) {
                this.selector = null;
                // process one time again
                processTaskQueue();
                for (SelectionKey k: selector.keys()) {
                    close(k);
                }
                try {
                    selector.close();
                } catch (IOException e) {
                    logger.warn(
                            "Failed to close a selector.", e);
                }
                shutdownLatch.countDown();
                break;
            } else {
                process(selector);
            }
        } catch (Throwable t) {
            logger.warn(
                    "Unexpected exception in the selector loop.", t);
            // Prevent possible consecutive immediate failures that lead to
            // excessive CPU consumption.
            try {
                Thread.sleep(1000);
            } catch (InterruptedException e) {
                // Ignore.
            }
        }
    }
}
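This method is inherited from AbstractNioSelector. In other words, it is not specific to NioServerBoss; NioWorker uses it as well.
The method is long, but we only need to focus on two calls (highlighted in red in the original post): processTaskQueue and process. As the name suggests, processTaskQueue runs the tasks in the task queue. process is abstract in AbstractNioSelector and is implemented in the subclasses, so the work that is specific to NioServerBoss lives in its own process method.
What is the task queue that processTaskQueue works on used for? Here is the method: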
private void processTaskQueue() {
    for (;;) {
        final Runnable task = taskQueue.poll();
        if (task == null) {
            break;
        }
        task.run();
        try {
            cleanUpCancelledKeys();
        } catch (IOException e) {
            // Ignore
        }
    }
}
Below is the implementation of process in NioServerBoss:
protected void process(Selector selector) {
    Set<SelectionKey> selectedKeys = selector.selectedKeys();
    if (selectedKeys.isEmpty()) {
        return;
    }
    for (Iterator<SelectionKey> i = selectedKeys.iterator(); i.hasNext();) {
        SelectionKey k = i.next();
        i.remove();
        NioServerSocketChannel channel = (NioServerSocketChannel) k.attachment();
        try {
            // accept connections in a for loop until no new connection is ready
            for (;;) {
                SocketChannel acceptedSocket = channel.socket.accept();
                if (acceptedSocket == null) {
                    break;
                }
                registerAcceptedChannel(channel, acceptedSocket, thread);
            }
        } catch (CancelledKeyException e) {
            // Raised by accept() when the server socket was closed.
            k.cancel();
            channel.close();
        } catch (SocketTimeoutException e) {
            // Thrown every second to get ClosedChannelException
            // raised.
        } catch (ClosedChannelException e) {
            // Closed as requested.
        } catch (Throwable t) {
            if (logger.isWarnEnabled()) {
                logger.warn(
                        "Failed to accept a connection.", t);
            }
            try {
                Thread.sleep(1000);
            } catch (InterruptedException e1) {
                // Ignore
            }
        }
    }
}
This method shows how NIO is used inside Netty. Once NioServerBoss has accepted a SocketChannel, the channel is registered with a selector. How exactly? That is the job of registerAcceptedChannel(channel, acceptedSocket, thread):
private static void registerAcceptedChannel(NioServerSocketChannel parent, SocketChannel acceptedSocket,
                                            Thread currentThread) {
    try {
        ChannelSink sink = parent.getPipeline().getSink();
        ChannelPipeline pipeline =
                parent.getConfig().getPipelineFactory().getPipeline();
        NioWorker worker = parent.workerPool.nextWorker();
        worker.register(new NioAcceptedSocketChannel(
                parent.getFactory(), pipeline, parent, sink
                , acceptedSocket,
                worker, currentThread), null);
    } catch (Exception e) {
        if (logger.isWarnEnabled()) {
            logger.warn(
                    "Failed to initialize an accepted socket.", e);
        }
        try {
            acceptedSocket.close();
        } catch (IOException e2) {
            if (logger.isWarnEnabled()) {
                logger.warn(
                        "Failed to close a partially accepted socket.",
                        e2);
            }
        }
    }
}
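It is actually simple. Remember the method that runs the task queue, processTaskQueue? The worker.register(...) call here puts a task into the worker's queue. The task is a Runnable whose job is to register the channel with the worker's selector. The task is executed by the worker itself, not by the boss; the boss only adds tasks to the queue. This keeps the two threads nicely independent of each other.
Anyone who has studied multithreading knows the producer-consumer pattern: the boss is a producer and the worker is a consumer. Having read this far, you should have a reasonable picture of how Netty is implemented.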
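To make the hand-off concrete, here is a minimal sketch of the same producer-consumer idea in plain JDK NIO. It is not Netty code: HandoffWorker, register, registeredBy and demo are hypothetical names, and the "boss" is simply the calling thread doing a blocking accept(). The point it illustrates is that the accepting thread only enqueues a task and wakes the selector, while the registration itself runs on the worker thread.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetSocketAddress;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch of a worker that owns a selector and a task queue.
class HandoffWorker implements Runnable {
    private final Selector selector;
    private final Queue<Runnable> taskQueue = new ConcurrentLinkedQueue<>();
    private volatile boolean shutdown;
    volatile String registeredBy;   // name of the thread that performed the registration

    HandoffWorker() {
        try {
            selector = Selector.open();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Producer side, called by the accepting ("boss") thread: enqueue and wake up.
    void register(SocketChannel accepted, CountDownLatch done) {
        taskQueue.add(() -> {
            try {
                accepted.configureBlocking(false);
                accepted.register(selector, SelectionKey.OP_READ);
                registeredBy = Thread.currentThread().getName();
                done.countDown();
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        });
        selector.wakeup();   // let a blocked select() return so the task runs promptly
    }

    // Consumer side: the worker thread drains its own queue.
    private void processTaskQueue() {
        for (Runnable task; (task = taskQueue.poll()) != null; ) {
            task.run();
        }
    }

    @Override
    public void run() {
        try {
            while (!shutdown) {
                selector.select(500);
                processTaskQueue();
                selector.selectedKeys().clear();   // a real worker would read/write here
            }
            selector.close();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    void shutdown() {
        shutdown = true;
        selector.wakeup();
    }

    // Accepts one loopback connection and hands it to a worker thread.
    static String demo() {
        HandoffWorker worker = new HandoffWorker();
        new Thread(worker, "worker-0").start();
        try (ServerSocketChannel server = ServerSocketChannel.open()) {
            server.bind(new InetSocketAddress("127.0.0.1", 0));
            try (SocketChannel client = SocketChannel.open(server.getLocalAddress());
                 SocketChannel accepted = server.accept()) {   // "boss" role
                CountDownLatch done = new CountDownLatch(1);
                worker.register(accepted, done);
                done.await(5, TimeUnit.SECONDS);
                return worker.registeredBy;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            worker.shutdown();
        }
    }
}
```

Calling HandoffWorker.demo() returns the name of the thread that ran the registration task, which is the worker thread rather than the thread that accepted the connection.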
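What about the worker? It runs in much the same way as the boss. The workers array mentioned earlier holds the worker objects. When the boss accepts a channel, it picks one worker from workers and has the channel bound to that worker's selector. When events arrive, the worker selects the ready channels on its own selector and performs the corresponding reads and writes.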
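The listings in this article do not include the body of nextWorker(), so how a worker is chosen is not shown. A common choice for this kind of pool is round-robin over the array; the sketch below assumes that policy and uses a hypothetical RoundRobinPool with string placeholders instead of real workers.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: pick the next worker in round-robin order.
// The round-robin policy is an assumption, not taken from the listings above.
class RoundRobinPool {
    private final String[] workers;
    private final AtomicInteger index = new AtomicInteger();

    RoundRobinPool(int workerCount) {
        if (workerCount <= 0) {
            throw new IllegalArgumentException("workerCount must be a positive integer.");
        }
        workers = new String[workerCount];
        for (int i = 0; i < workers.length; i++) {
            workers[i] = "worker-" + i;
        }
    }

    // Safe to call from several boss threads at once; floorMod keeps the
    // array index non-negative even after the counter overflows.
    String nextWorker() {
        return workers[Math.floorMod(index.getAndIncrement(), workers.length)];
    }
}
```

With three workers, successive calls return worker-0, worker-1, worker-2 and then worker-0 again, so accepted channels are spread evenly across the selectors.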
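Summary of how Netty works
As the diagram above shows, a typical single-threaded NIO server has one waiter (selector) who serves every guest. Netty has several waiters (selectors), each responsible for one area, which is more efficient. This is implemented with the workers array described above.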