Netty Source Code Analysis: The Reactor Threading Model

xxb249

Original article: https://blog.csdn.net/xxb249/article/details/78866583

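I. Background

I have been studying the Netty source code recently, and this post on Netty's threading framework, the Reactor threading model, summarizes what I have found so far. If you are not yet familiar with the Reactor pattern, there is plenty of introductory material online.

When I started on Netty I first read the Netty Definitive Guide (《netty权威指南》). It is a good book that covers everything from principles to source code, so why write this post? Its treatment of the threading model is not detailed enough, and it never connects the whole flow end to end. In my view this part is the foundation: once it is clear, Channel and ChannelPipeline processing become much easier to follow. This analysis is based on Netty 5.0.

The analysis uses the TimeServer example from the book. Its core implementation is as follows: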

public void bind(int port) throws Exception {
    EventLoopGroup bossGroup = new NioEventLoopGroup();
    EventLoopGroup workerGroup = new NioEventLoopGroup();
    try {
        ServerBootstrap b = new ServerBootstrap();
        b.group(bossGroup, workerGroup)
                .channel(NioServerSocketChannel.class)
                .option(ChannelOption.SO_BACKLOG, 1024)
                .childHandler(new ChildChannelHandler());
        ChannelFuture f = b.bind(port).sync();
        f.channel().closeFuture().sync();
    } finally {
        bossGroup.shutdownGracefully();
        workerGroup.shutdownGracefully();
    }
}

private class ChildChannelHandler extends ChannelInitializer<SocketChannel> {

    protected void initChannel(SocketChannel ch) throws Exception {
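        // Add the handler to the ChannelPipeline, which keeps its handlers in a linked list.
        // Several handlers can be added; if no name is given, one is generated automatically.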
        ch.pipeline().addLast("GetTime", new TimeServerHandler());
    }
}

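As the code above shows, the two most important classes are NioEventLoopGroup and ServerBootstrap (Bootstrap on the client side). [Figure: UML class diagrams of these two classes]

II. The NioEventLoopGroup thread group

The main job of NioEventLoopGroup is to create a thread pool. The code above creates two EventLoopGroups: bossGroup, which is mainly used for listening, and workerGroup, which is mainly used for client/server communication. These two thread pools are the foundation of the Reactor threading model. The rest of this section follows the UML class hierarchy from the bottom up.

NioEventLoopGroup itself contains little code. Its most important method is the one below, an abstract method declared by the parent class MultithreadEventLoopGroup. It creates a NioEventLoop, which corresponds to one thread; we will see where it is called later.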

@Override
protected EventLoop newChild(Executor executor, Object... args) throws Exception {
    return new NioEventLoop(this, executor, (SelectorProvider) args[0]);
}

Calling the no-argument NioEventLoopGroup constructor eventually reaches:

public NioEventLoopGroup(
        int nThreads, ThreadFactory threadFactory, final SelectorProvider selectorProvider) {
    super(nThreads, threadFactory, selectorProvider);
}

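A word on the arguments here: nThreads is 0, threadFactory is null, and selectorProvider is the result of SelectorProvider.provider(). The third argument is what creates the Selector (on Linux the JDK's NIO implementation is backed by epoll rather than select). The call then continues into the constructor of the parent class MultithreadEventLoopGroup, which replaces an nThreads of 0 with a default based on the number of CPU cores before reaching the MultithreadEventExecutorGroup constructor; otherwise the nThreads <= 0 check there would throw.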

protected MultithreadEventExecutorGroup(int nThreads, Executor executor, Object... args) {
    if (nThreads <= 0) {
        throw new IllegalArgumentException(String.format("nThreads: %d (expected: > 0)", nThreads));
    }

    if (executor == null) {
        executor = new ThreadPerTaskExecutor(newDefaultThreadFactory());
    }

    children = new EventExecutor[nThreads];
    for (int i = 0; i < nThreads; i ++) {
        boolean success = false;
        try {
            children[i] = newChild(executor, args);
            success = true;
        } catch (Exception e) {
            // TODO: Think about if this is a good exception type
            throw new IllegalStateException("failed to create a child event loop", e);
        } finally {
            if (!success) {
                for (int j = 0; j < i; j ++) {
                    children[j].shutdownGracefully();
                }

                for (int j = 0; j < i; j ++) {
                    EventExecutor e = children[j];
                    try {
                        while (!e.isTerminated()) {
                            e.awaitTermination(Integer.MAX_VALUE, TimeUnit.SECONDS);
                        }
                    } catch (InterruptedException interrupted) {
                        Thread.currentThread().interrupt();
                        break;
                    }
                }
            }
        }
    }

    final FutureListener<Object> terminationListener = new FutureListener<Object>() {
        @Override
        public void operationComplete(Future<Object> future) throws Exception {
            if (terminatedChildren.incrementAndGet() == children.length) {
                terminationFuture.setSuccess(null);
            }
        }
    };

    for (EventExecutor e: children) {
        e.terminationFuture().addListener(terminationListener);
    }

    Set<EventExecutor> childrenSet = new LinkedHashSet<EventExecutor>(children.length);
    Collections.addAll(childrenSet, children);
    readonlyChildren = Collections.unmodifiableSet(childrenSet);
}

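Two points about this constructor:

1) On this call path executor is always null, so a default executor is created here. ThreadPerTaskExecutor has only one concrete method, its implementation of execute, which will be called later.

2) The first for loop creates the child event loops, one per thread. The newChild() call inside it resolves to the newChild method of NioEventLoopGroup.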
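To make point 1 concrete, here is a minimal sketch of what a thread-per-task executor looks like in plain Java. It is my own stand-in written against the JDK only, not a copy of Netty's ThreadPerTaskExecutor; the class name and the runAndReportThread helper are hypothetical and exist only for this illustration.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executor;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadFactory;

/** Illustrative executor that starts one new thread for every submitted task. */
final class ThreadPerTaskExecutorSketch implements Executor {
    private final ThreadFactory threadFactory;

    ThreadPerTaskExecutorSketch(ThreadFactory threadFactory) {
        if (threadFactory == null) {
            throw new NullPointerException("threadFactory");
        }
        this.threadFactory = threadFactory;
    }

    @Override
    public void execute(Runnable command) {
        // No pooling and no queue: each call creates and starts a fresh thread.
        threadFactory.newThread(command).start();
    }

    /** Hypothetical helper: runs one task and returns the name of the thread that ran it. */
    static String runAndReportThread() {
        Executor executor = new ThreadPerTaskExecutorSketch(Executors.defaultThreadFactory());
        CompletableFuture<String> name = new CompletableFuture<>();
        executor.execute(() -> name.complete(Thread.currentThread().getName()));
        return name.join(); // waits until the new thread has run the task
    }

    public static void main(String[] args) {
        System.out.println("caller: " + Thread.currentThread().getName());
        System.out.println("task ran on: " + runAndReportThread());
    }
}
```

The point of the design is that the executor itself holds no threads. Each event loop calls execute exactly once, when it starts, so "one thread per task" ends up meaning "one thread per event loop".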

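III. The NioEventLoop thread

[Figure: UML class diagram of NioEventLoop]

The NioEventLoop constructor does two main things:

1. It passes the executor to the parent class, which also creates the task queue.

2. It creates the Selector and initializes selectedKeys.

The most important method in NioEventLoop is run. It is an infinite loop that normally exits only on shutdown, and it polls for events, including accept, read, and write events. It is not called while NioEventLoopGroup is being initialized (it is called during bind); run is covered in detail later.

IV. Starting the server with ServerBootstrap

As the code above shows, ServerBootstrap needs a thread pool, a Channel type, and a ChannelPipeline. Once these are set, bind starts the listening flow and eventually calls doBind: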

private ChannelFuture doBind(final SocketAddress localAddress) {
    final ChannelFuture regFuture = initAndRegister();
    final Channel channel = regFuture.channel();
    if (regFuture.cause() != null) {
        return regFuture;
    }

    final ChannelPromise promise;
    if (regFuture.isDone()) {
        promise = channel.newPromise();
        doBind0(regFuture, channel, localAddress, promise);
    } else {
        // Registration future is almost always fulfilled already, but just in case it's not.
        promise = new DefaultChannelPromise(channel, GlobalEventExecutor.INSTANCE);
        regFuture.addListener(new ChannelFutureListener() {
            @Override
            public void operationComplete(ChannelFuture future) throws Exception {
                doBind0(regFuture, channel, localAddress, promise);
            }
        });
    }

    return promise;
}
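The two branches of doBind follow a common pattern: if registration has already finished, bind immediately; otherwise attach a listener and bind when it completes. The sketch below shows the same shape with the JDK's CompletableFuture standing in for ChannelFuture. The class and method names are hypothetical, and this is an analogy rather than Netty code.

```java
import java.util.concurrent.CompletableFuture;

/** Illustrative version of the "bind now, or bind when registration completes" pattern. */
final class BindAfterRegisterSketch {
    private BindAfterRegisterSketch() {}

    static CompletableFuture<Void> bindWhenRegistered(
            CompletableFuture<Void> registration, Runnable bindAction) {
        if (registration.isCompletedExceptionally()) {
            // Mirrors "if (regFuture.cause() != null) return regFuture;".
            return registration;
        }
        CompletableFuture<Void> bound = new CompletableFuture<>();
        if (registration.isDone()) {
            // Registration already finished: bind right away.
            bindAction.run();
            bound.complete(null);
        } else {
            // Registration still pending: defer the bind to a completion callback.
            registration.whenComplete((ignored, error) -> {
                if (error != null) {
                    bound.completeExceptionally(error);
                } else {
                    bindAction.run();
                    bound.complete(null);
                }
            });
        }
        return bound;
    }
}
```

Either way the caller gets a future for the bind result back immediately, which is why b.bind(port) in the TimeServer example has to be followed by sync() to wait for it.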

initAndRegister initializes and registers the channel. It calls createChannel and init(channel):

final ChannelFuture initAndRegister() {
    Channel channel;
    try {
        channel = createChannel();
    } catch (Throwable t) {
        return VoidChannel.INSTANCE.newFailedFuture(t);
    }

    try {
        init(channel);
    } catch (Throwable t) {
        channel.unsafe().closeForcibly();
        return channel.newFailedFuture(t);
    }

    ChannelPromise regFuture = channel.newPromise();
    channel.unsafe().register(regFuture);
    if (regFuture.cause() != null) {
        if (channel.isRegistered()) {
            channel.close();
        } else {
            channel.unsafe().closeForcibly();
        }
    }

    return regFuture;
}

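createChannel is implemented in ServerBootstrap. group() returns bossGroup, and next() picks one thread (event loop) from the bossGroup pool; this thread is mainly used for the listening socket. The second argument of newChannel, childGroup, is the workerGroup pool, whose threads serve client/server traffic once a client connection has been established. This split between the two pools is the Reactor threading model.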

@Override
Channel createChannel() {
    EventLoop eventLoop = group().next();
    return channelFactory().newChannel(eventLoop, childGroup);
}
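As far as I can tell, next() simply walks the group's children in round-robin order. The sketch below illustrates that selection strategy in isolation. The class name is hypothetical, the elements are plain strings instead of event loops, and the use of Math.floorMod is my own choice, not necessarily what the Netty source does.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Illustrative round-robin chooser, similar in spirit to EventLoopGroup.next(). */
final class RoundRobinChooserSketch<T> {
    private final T[] children;
    private final AtomicInteger index = new AtomicInteger();

    @SafeVarargs
    RoundRobinChooserSketch(T... children) {
        if (children.length == 0) {
            throw new IllegalArgumentException("children must not be empty");
        }
        this.children = children.clone();
    }

    T next() {
        // floorMod keeps the index non-negative even after the counter overflows.
        return children[Math.floorMod(index.getAndIncrement(), children.length)];
    }

    public static void main(String[] args) {
        RoundRobinChooserSketch<String> group =
                new RoundRobinChooserSketch<>("loop-0", "loop-1", "loop-2");
        for (int i = 0; i < 4; i++) {
            System.out.println(group.next());
        }
    }
}
```

With a strategy like this, each call takes the next child in order, so listening channels (from bossGroup) and accepted connections (from workerGroup) are spread evenly across the available threads.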

newChannel creates the channel object dynamically through reflection; here it creates a NioServerSocketChannel.
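For readers who have not used reflection for this purpose, here is a minimal sketch of a factory that instantiates whatever class it was given. It is deliberately simplified: the class name is hypothetical and it calls a no-argument constructor, whereas the channel factory in the version analyzed here has to pass the event loop and the child group to the channel's constructor.

```java
/** Illustrative reflective factory: creates instances from a Class object. */
final class ReflectiveFactorySketch<T> {
    private final Class<? extends T> type;

    ReflectiveFactorySketch(Class<? extends T> type) {
        if (type == null) {
            throw new NullPointerException("type");
        }
        this.type = type;
    }

    T newInstance() {
        try {
            // Look up the no-argument constructor and invoke it.
            return type.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Unable to create instance of " + type, e);
        }
    }

    public static void main(String[] args) {
        ReflectiveFactorySketch<StringBuilder> factory =
                new ReflectiveFactorySketch<>(StringBuilder.class);
        System.out.println(factory.newInstance().append("created reflectively"));
    }
}
```

This is why .channel(NioServerSocketChannel.class) in the TimeServer code takes a Class object rather than an instance: the bootstrap stores the class and creates the channel later, during bind.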

init(channel) is fairly simple: it mainly sets the options and the pipeline.

Here is the register method:

public final void register(final ChannelPromise promise) {
    if (eventLoop.inEventLoop()) {
        register0(promise);
    } else {
        try {
            eventLoop.execute(new Runnable() {
                @Override
                public void run() {
                    register0(promise);
                }
            });
        } catch (Throwable t) {
            logger.warn(
                    "Force-closing a channel whose registration task was not accepted by an event loop: {}",
                    AbstractChannel.this, t);
            closeForcibly();
            closeFuture.setClosed();
            promise.setFailure(t);
        }
    }
}

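The method first checks whether the thread calling register is the eventLoop's own thread (eventLoop comes from bossGroup and was set in createChannel). On the first call it cannot be, because the current thread is the main thread, so the else branch is taken. eventLoop.execute is implemented in SingleThreadEventExecutor: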

public void execute(Runnable task) {
    if (task == null) {
        throw new NullPointerException("task");
    }

    boolean inEventLoop = inEventLoop();
    if (inEventLoop) {
        addTask(task);
    } else {
        startThread();
        addTask(task);
        if (isShutdown() && removeTask(task)) {
            reject();
        }
    }

    if (!addTaskWakesUp) {
        wakeup(inEventLoop);
    }
}

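Following the analysis above, this call takes the else branch: it starts the thread and adds the task to the blocking queue. The started thread then takes the task from the queue and runs it.

startThread calls doStartThread, which calls executor.execute. That interface method is implemented by ThreadPerTaskExecutor.execute, which calls start to launch the thread. The most important line in the Runnable it runs is SingleThreadEventExecutor.this.run(); this is the first call to run, and its implementation is the run method in NioEventLoop.java.

After the main thread has started sub-thread A, it adds the task to the queue and then goes on to doBind0. Sub-thread A, once started, takes the task from the queue and executes it. doBind0 itself runs on the main thread, but it only places the actual bind operation on the queue, and sub-thread A performs the bind. At this point the main thread's work is done, and it returns to the main method and blocks there.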
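The hand-off between the main thread and sub-thread A can be reproduced with a few lines of plain Java. The sketch below is a much simplified model of the behaviour described above: the thread starts lazily on the first execute, and a queue is drained by that one thread. All names are hypothetical, and shutdown handling, wakeup, and the selector are left out.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.atomic.AtomicBoolean;

/** Illustrative single-thread executor whose thread starts on the first execute(). */
final class LazySingleThreadExecutorSketch {
    private final BlockingQueue<Runnable> taskQueue = new LinkedBlockingQueue<>();
    private final AtomicBoolean started = new AtomicBoolean();
    private volatile Thread thread;

    boolean inEventLoop() {
        return Thread.currentThread() == thread;
    }

    void execute(Runnable task) {
        if (task == null) {
            throw new NullPointerException("task");
        }
        if (!inEventLoop()) {
            startThread(); // only the very first call actually starts the thread
        }
        taskQueue.add(task);
    }

    private void startThread() {
        if (started.compareAndSet(false, true)) {
            Thread t = new Thread(this::run, "event-loop-sketch");
            t.setDaemon(true); // lets the JVM exit; this sketch has no shutdown logic
            thread = t;
            t.start();
        }
    }

    private void run() {
        for (;;) {
            try {
                taskQueue.take().run(); // block until a task arrives, then run it
            } catch (InterruptedException e) {
                return;
            }
        }
    }

    /** Hypothetical helper: submits a task and returns the thread that ran it. */
    Thread submitAndGetThread() {
        CompletableFuture<Thread> ran = new CompletableFuture<>();
        execute(() -> ran.complete(Thread.currentThread()));
        return ran.join();
    }
}
```

Calling submitAndGetThread() twice from main returns the same non-main thread both times. That matches the flow above: the register task and the later bind task are both queued by main and executed by the same sub-thread A.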

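V. Sub-thread A executes the task

The task that the sub-thread runs is defined in doStartThread. The line that ultimately matters in that code is SingleThreadEventExecutor.this.run(); run is abstract there, so where is it implemented?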

private void doStartThread() {
    assert thread == null;
    executor.execute(new Runnable() {
        @Override
        public void run() {
            thread = Thread.currentThread();
            if (interrupted) {
                thread.interrupt();
            }

            boolean success = false;
            updateLastExecutionTime();
            try {
                SingleThreadEventExecutor.this.run();
                success = true;
            } catch (Throwable t) {
                logger.warn("Unexpected exception from an event executor: ", t);
            } finally {
                if (state < ST_SHUTTING_DOWN) {
                    state = ST_SHUTTING_DOWN;
                }

                // Check if confirmShutdown() was called at the end of the loop.
                if (success && gracefulShutdownStartTime == 0) {
                    logger.error("Buggy " + EventExecutor.class.getSimpleName() + " implementation; " +
                            SingleThreadEventExecutor.class.getSimpleName() + ".confirmShutdown() must be called " +
                            "before run() implementation terminates.");
                }

                try {
                    // Run all remaining tasks and shutdown hooks.
                    for (;;) {
                        if (confirmShutdown()) {
                            break;
                        }
                    }
                } finally {
                    try {
                        cleanup();
                    } finally {
                        synchronized (stateLock) {
                            state = ST_TERMINATED;
                        }
                        threadLock.release();
                        if (!taskQueue.isEmpty()) {
                            logger.warn(
                                    "An event executor terminated with " +
                                            "non-empty task queue (" + taskQueue.size() + ')');
                        }

                        terminationFuture.setSuccess(null);
                    }
                }
            }
        }
    });
}

It is implemented by the run method in NioEventLoop.java, which ties back to what we saw earlier.

protected void run() {
    for (;;) {
        oldWakenUp = wakenUp.getAndSet(false);
        try {
            if (hasTasks()) {
                selectNow();
            } else {
                select();
                if (wakenUp.get()) {
                    selector.wakeup();
                }
            }

            cancelledKeys = 0;

            final long ioStartTime = System.nanoTime();
            needsToSelectAgain = false;
            if (selectedKeys != null) {
                processSelectedKeysOptimized(selectedKeys.flip());
            } else {
                processSelectedKeysPlain(selector.selectedKeys());
            }
            final long ioTime = System.nanoTime() - ioStartTime;

            final int ioRatio = this.ioRatio;
            runAllTasks(ioTime * (100 - ioRatio) / ioRatio);

            if (isShuttingDown()) {
                closeAll();
                if (confirmShutdown()) {
                    break;
                }
            }
        } catch (Throwable t) {
            logger.warn("Unexpected exception in the selector loop.", t);

            // Prevent possible consecutive immediate failures that lead to
            // excessive CPU consumption.
            try {
                Thread.sleep(1000);
            } catch (InterruptedException e) {
                // Ignore.
            }
        }
    }
}

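As the code shows, this is an infinite loop whose main job is to poll for events. If tasks are pending it calls selectNow(), which returns immediately; otherwise it calls select(), which blocks for a while, much like the Linux select model. Next it processes the selected keys. By default it enters processSelectedKeysOptimized and iterates over the keys, taking the if branch by default. The processSelectedKey method called for each key consists mainly of three if checks: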

if ((readyOps & (SelectionKey.OP_READ | SelectionKey.OP_ACCEPT)) != 0 || readyOps == 0) {
    unsafe.read();
    if (!ch.isOpen()) {
        // Connection already closed - no need to handle write.
        return;
    }
}

OP_READ and OP_ACCEPT: these cover incoming client connections and messages sent by clients.

if ((readyOps & SelectionKey.OP_WRITE) != 0) {
    // Call forceFlush which will also take care of clear the OP_WRITE once there is nothing left to write
    ch.unsafe().forceFlush();
}

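OP_WRITE: used to send data to the peer. It fires when the socket becomes writable again, typically after an earlier flush could not write everything, and forceFlush then writes the pending data.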

if ((readyOps & SelectionKey.OP_CONNECT) != 0) {
    // remove OP_CONNECT as otherwise Selector.select(..) will always return without blocking
    // See https://github.com/netty/netty/issues/924
    int ops = k.interestOps();
    ops &= ~SelectionKey.OP_CONNECT;
    k.interestOps(ops);

    unsafe.finishConnect();
}

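OP_CONNECT: only client programs reach this branch. It indicates that the TCP connection has been established, and the OP_CONNECT flag must be cleared here.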
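The three checks are plain bit tests on readyOps, and a key can match more than one of them in a single pass. The sketch below reproduces only the branching logic with the JDK's SelectionKey constants; the class name, the dispatch method, and the action strings are hypothetical labels for the calls shown above.

```java
import java.nio.channels.SelectionKey;
import java.util.ArrayList;
import java.util.List;

/** Illustrative decoding of a readyOps bit mask into the actions taken for it. */
final class ReadyOpsSketch {
    private ReadyOpsSketch() {}

    static List<String> dispatch(int readyOps) {
        List<String> actions = new ArrayList<>();
        // Read and accept share one branch; readyOps == 0 is also treated as a read.
        if ((readyOps & (SelectionKey.OP_READ | SelectionKey.OP_ACCEPT)) != 0 || readyOps == 0) {
            actions.add("read");
        }
        if ((readyOps & SelectionKey.OP_WRITE) != 0) {
            actions.add("forceFlush");
        }
        if ((readyOps & SelectionKey.OP_CONNECT) != 0) {
            actions.add("finishConnect");
        }
        return actions;
    }

    public static void main(String[] args) {
        System.out.println(dispatch(SelectionKey.OP_ACCEPT));
        System.out.println(dispatch(SelectionKey.OP_READ | SelectionKey.OP_WRITE));
        System.out.println(dispatch(SelectionKey.OP_CONNECT));
    }
}
```

Note that OP_ACCEPT and OP_READ lead to the same unsafe.read() call. Which read implementation actually runs depends on the channel type, which is the subject of the next part.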

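Let us look at the read event in detail. In the Reactor threading model, a newly accepted connection is handed to a worker thread that serves it. Following unsafe.read(), where is that thread assigned? unsafe.read is an interface method with two implementations:

1) The listening channel, NioServerSocketChannel, which mainly handles client accept requests

It is implemented by read() in AbstractNioMessageChannel.java: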

@Override
public void read() {
    assert eventLoop().inEventLoop();
    if (!config().isAutoRead()) {
        removeReadOp();
    }

    final ChannelConfig config = config();
    final int maxMessagesPerRead = config.getMaxMessagesPerRead();
    final boolean autoRead = config.isAutoRead();
    final ChannelPipeline pipeline = pipeline();
    boolean closed = false;
    Throwable exception = null;
    try {
        for (;;) {
            int localRead = doReadMessages(readBuf);
            if (localRead == 0) {
                break;
            }
            if (localRead < 0) {
                closed = true;
                break;
            }

            if (readBuf.size() >= maxMessagesPerRead | !autoRead) {
                break;
            }
        }
    } catch (Throwable t) {
        exception = t;
    }

    int size = readBuf.size();
    for (int i = 0; i < size; i ++) {
        pipeline.fireChannelRead(readBuf.get(i));
    }
    readBuf.clear();
    pipeline.fireChannelReadComplete();

    if (exception != null) {
        if (exception instanceof IOException) {
            // ServerChannel should not be closed even on IOException because it can often continue
            // accepting incoming connections. (e.g. too many open files)
            closed = !(AbstractNioMessageChannel.this instanceof ServerChannel);
        }

        pipeline.fireExceptionCaught(exception);
    }

    if (closed) {
        if (isOpen()) {
            close(voidPromise());
        }
    }
}

The most important call in this method is doReadMessages():

protected int doReadMessages(List<Object> buf) throws Exception {
    SocketChannel ch = javaChannel().accept();

    try {
        if (ch != null) {
            buf.add(new NioSocketChannel(this, childEventLoopGroup().next(), ch));
            return 1;
        }
    } catch (Throwable t) {
        logger.warn("Failed to create a new channel from an accepted socket.", t);

        try {
            ch.close();
        } catch (Throwable t2) {
            logger.warn("Failed to close a socket.", t2);
        }
    }

    return 0;
}

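Note the add call above: childEventLoopGroup().next() picks a thread from workerGroup, and that thread then serves the traffic between this client and the server. This is the heart of the Reactor threading model.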

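2) The serving channel, NioSocketChannel, which communicates with the client and mainly handles messages sent by the peer

Any other message (for example, a client sending data normally) goes through the following method: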

@Override
    public void read() {
        final ChannelConfig config = config();
        final ChannelPipeline pipeline = pipeline();
        final ByteBufAllocator allocator = config.getAllocator();
        final int maxMessagesPerRead = config.getMaxMessagesPerRead();
        RecvByteBufAllocator.Handle allocHandle = this.allocHandle;
        if (allocHandle == null) {
            this.allocHandle = allocHandle = config.getRecvByteBufAllocator().newHandle();
        }
        if (!config.isAutoRead()) {
            removeReadOp();
        }

        ByteBuf byteBuf = null;
        int messages = 0;
        boolean close = false;
        try {
            int byteBufCapacity = allocHandle.guess();
            int totalReadAmount = 0;
            do {
                byteBuf = allocator.ioBuffer(byteBufCapacity);
                int writable = byteBuf.writableBytes();
                int localReadAmount = doReadBytes(byteBuf);
                if (localReadAmount <= 0) {
                    // not was read release the buffer
                    byteBuf.release();
                    close = localReadAmount < 0;
                    break;
                }

                pipeline.fireChannelRead(byteBuf);
                byteBuf = null;

                if (totalReadAmount >= Integer.MAX_VALUE - localReadAmount) {
                    // Avoid overflow.
                    totalReadAmount = Integer.MAX_VALUE;
                    break;
                }

                totalReadAmount += localReadAmount;
                if (localReadAmount < writable) {
                    // Read less than what the buffer can hold,
                    // which might mean we drained the recv buffer completely.
                    break;
                }
            } while (++ messages < maxMessagesPerRead);

            pipeline.fireChannelReadComplete();
            allocHandle.record(totalReadAmount);

            if (close) {
                closeOnRead(pipeline);
                close = false;
            }
        } catch (Throwable t) {
            handleReadException(pipeline, byteBuf, t, close);
        }
    }
}

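doReadBytes reads the data from the socket, and fireChannelRead passes it to the handlers for processing.

The two cases share one pattern: data is first read from the underlying socket, a fireChannelRead event is fired for it, and once everything has been read a fireChannelReadComplete event is fired.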
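The read pattern summarized above can be tried without Netty. The sketch below reads from any ReadableByteChannel in fixed-size chunks, records a "channelRead" event per chunk, stops when a chunk does not fill the buffer or the message limit is reached, and then records "channelReadComplete". The class, the event strings, and the limit of 16 in demo are assumptions made for this illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.Channels;
import java.nio.channels.ReadableByteChannel;
import java.util.ArrayList;
import java.util.List;

/** Illustrative read loop: one event per chunk, then a completion event. */
final class ReadLoopSketch {
    private ReadLoopSketch() {}

    static List<String> readAll(ReadableByteChannel source, int bufferSize, int maxMessagesPerRead)
            throws IOException {
        List<String> events = new ArrayList<>();
        int messages = 0;
        do {
            ByteBuffer buf = ByteBuffer.allocate(bufferSize);
            int writable = buf.remaining();
            int n = source.read(buf);
            if (n <= 0) {
                break; // 0: nothing available right now; negative: end of stream
            }
            events.add("channelRead:" + n);
            if (n < writable) {
                break; // buffer not filled, so the source is probably drained
            }
        } while (++messages < maxMessagesPerRead);
        events.add("channelReadComplete");
        return events;
    }

    /** Hypothetical helper that feeds an in-memory byte array through the loop. */
    static List<String> demo(byte[] data, int bufferSize) {
        try (ReadableByteChannel source = Channels.newChannel(new ByteArrayInputStream(data))) {
            return readAll(source, bufferSize, 16);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo(new byte[10], 4));
    }
}
```

The early exit on a partially filled buffer is the same heuristic as the "localReadAmount < writable" check in the Netty code above, and the message limit plays the role of maxMessagesPerRead so that one busy connection cannot monopolize the event loop.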

This concludes the source analysis of Netty server startup and the Reactor threading model. Later posts will cover Channel and ChannelPipeline.

[Additional notes]

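The Selector described above can suffer from empty polling. What is empty polling? select() returns even though no event has occurred, and when this happens repeatedly the Selector spins and CPU usage reaches 100%. This is a bug in the JDK's epoll-based Selector. Netty's workaround is to rebuild the Selector: it creates a new Selector, moves every registration from the old Selector to the new one, and continues polling on the new one. Netty sets a threshold for this: after N consecutive empty polls (N is held in a static variable in the code), the Selector is rebuilt. Netty thus works around the JDK epoll bug indirectly, though I still hope the JDK fixes it soon (it was still unfixed in Java 7).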
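A rough sketch of the workaround, using only the JDK, is shown below. It has two parts: a counter that decides when too many select() calls have returned early with nothing selected, and a rebuild step that moves the registrations to a fresh Selector. The class name, method names, and the way "returned early" is detected are my own simplifications; Netty's actual implementation differs in detail.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;

/** Illustrative empty-poll detector plus selector rebuild. */
final class SelectorRebuildSketch {
    private final int threshold;
    private int prematureEmptyReturns;

    SelectorRebuildSketch(int threshold) {
        this.threshold = threshold;
    }

    /** Records one select() call; returns true when the selector should be rebuilt. */
    boolean recordSelect(int selectedKeys, long elapsedNanos, long timeoutNanos) {
        if (selectedKeys != 0 || elapsedNanos >= timeoutNanos) {
            prematureEmptyReturns = 0; // a real event or a genuine timeout
            return false;
        }
        if (++prematureEmptyReturns >= threshold) {
            prematureEmptyReturns = 0;
            return true;
        }
        return false;
    }

    /** Moves every valid registration to a new Selector and closes the old one. */
    static Selector rebuild(Selector oldSelector) {
        try {
            Selector newSelector = Selector.open();
            for (SelectionKey key : oldSelector.keys()) {
                if (!key.isValid()) {
                    continue;
                }
                int interestOps = key.interestOps();
                Object attachment = key.attachment();
                key.cancel(); // drop the registration on the old selector
                key.channel().register(newSelector, interestOps, attachment);
            }
            oldSelector.close();
            return newSelector;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

In an event loop, recordSelect would be called after every select(timeout), and a true result would trigger rebuild before the next iteration. Copying interestOps and the attachment is what lets the channels keep working without noticing the swap.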