A Brief Look at the Netty Source: NioEventLoop and the Reactor Thread

胖柯G

Last modified 2023-07-30 15:43:56

Category: netty  Tags: java

First published 2021-05-09 21:01:26

Article link: https://blog.csdn.net/GeekerJava/article/details/116566631

NioEventLoop: Reactor Thread Analysis

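From the earlier analysis we know that the NioEventLoop thread is started the first time execute is called to add a task to the task queue. The thread itself is started in ThreadPerTaskExecutor: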

public void execute(Runnable command) {
        // Create a thread through the thread factory, start it, and have it run command
        threadFactory.newThread(command).start();
    }

Once started, the thread runs the logic of the Runnable:

private void doStartThread() {
  assert thread == null;
  executor.execute(new Runnable() {
      @Override
      public void run() {
          // 1. Record the current thread in the thread field
          thread = Thread.currentThread();
          if (interrupted) {
              thread.interrupt();
          }

          boolean success = false;
          // 2. Update the last execution time
          updateLastExecutionTime();
          try {
          // 3. Invoke the run method of SingleThreadEventExecutor
              SingleThreadEventExecutor.this.run();
              success = true;
          } catch (Throwable t) {
              logger.warn("Unexpected exception from an event executor: ", t);
          } finally {
              for (;;) {
                  int oldState = state;
                  if (oldState >= ST_SHUTTING_DOWN || STATE_UPDATER.compareAndSet(
                          SingleThreadEventExecutor.this, oldState, ST_SHUTTING_DOWN)) {
                      break;
                  }
              }
              // ... remaining code omitted ...

We went through this code earlier as well, but we did not look into SingleThreadEventExecutor.this.run(). Following that call leads to the run method of NioEventLoop:

protected void run() {
       for (;;) {
           try {
               switch (selectStrategy.calculateStrategy(selectNowSupplier, hasTasks())) {
                   case SelectStrategy.CONTINUE:
                       continue;
                       // 1. Poll for ready IO events (select)
                   case SelectStrategy.SELECT:
                       select(wakenUp.getAndSet(false));
                       if (wakenUp.get()) {
                           selector.wakeup();
                       }
                   default:
               }

               cancelledKeys = 0;
               needsToSelectAgain = false;
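               // ioRatio: percentage of loop time given to IO events; the rest goes to queued tasks
               final int ioRatio = this.ioRatio;
               // ioRatio == 100: no time budget; process all IO events, then run all queued tasks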
               if (ioRatio == 100) {
                   try {
                       processSelectedKeys();
                   } finally {
                       runAllTasks();
                   }
               } else {
                   final long ioStartTime = System.nanoTime();
                   try {
                       // 2. Process the ready events
                       processSelectedKeys();
                   } finally {
                       // Ensure we always run tasks.
                       final long ioTime = System.nanoTime() - ioStartTime;
                       // 3. Run queued tasks
                       runAllTasks(ioTime * (100 - ioRatio) / ioRatio);
                   }
               }
           } catch (Throwable t) {
               handleLoopException(t);
           }
           try {
               if (isShuttingDown()) {
                   closeAll();
                   if (confirmShutdown()) {
                       return;
                   }
               }
           } catch (Throwable t) {
               handleLoopException(t);
           }
       }
   }

As we can see, the thread runs an infinite loop. The work of the NioEventLoop thread falls into three steps:
1. Poll the IO events of all channels registered on the selector

select(wakenUp.getAndSet(false));

2. Process the ready IO events

processSelectedKeys();

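3. Run all tasks added to the task queue (including scheduled tasks whose deadline has arrived, which are moved from the scheduled task queue to the normal task queue)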

runAllTasks(ioTime * (100 - ioRatio) / ioRatio);

Let us go through them one at a time, starting with the select method.

private void select(boolean oldWakenUp) throws IOException {
        Selector selector = this.selector;
        try {
            int selectCnt = 0;
            long currentTimeNanos = System.nanoTime();
            // Deadline of the first scheduled task: now plus its remaining delay
            long selectDeadLineNanos = currentTimeNanos + delayNanos(currentTimeNanos);
            for (;;) {
                // If the first scheduled task is due within 0.5 ms, leave the loop
                long timeoutMillis = (selectDeadLineNanos - currentTimeNanos + 500000L) / 1000000L;
                if (timeoutMillis <= 0) {
                    if (selectCnt == 0) {
                        selector.selectNow();
                        selectCnt = 1;
                    }
                    break;
                }
                // A task was added while polling, so cut this select short
                if (hasTasks() && wakenUp.compareAndSet(false, true)) {
                    selector.selectNow();
                    selectCnt = 1;
                    break;
                }
                // Blocking select for timeoutMillis, the time left until the first scheduled task is due
                int selectedKeys = selector.select(timeoutMillis);
                selectCnt ++;

                if (selectedKeys != 0 || oldWakenUp || wakenUp.get() || hasTasks() || hasScheduledTasks()) {
                    break;
                }
                if (Thread.interrupted()) {
                    if (logger.isDebugEnabled()) {
                        logger.debug("Selector.select() returned prematurely because " +
                                "Thread.currentThread().interrupt() was called. Use " +
                                "NioEventLoop.shutdownGracefully() to shutdown the NioEventLoop.");
                    }
                    selectCnt = 1;
                    break;
                }
                // Work around the JDK empty-polling bug
                long time = System.nanoTime();
                if (time - TimeUnit.MILLISECONDS.toNanos(timeoutMillis) >= currentTimeNanos) {
                    // timeoutMillis elapsed without anything selected.
                    selectCnt = 1;
                } else if (SELECTOR_AUTO_REBUILD_THRESHOLD > 0 &&
                        selectCnt >= SELECTOR_AUTO_REBUILD_THRESHOLD) {
                    logger.warn(
                            "Selector.select() returned prematurely {} times in a row; rebuilding Selector {}.",
                            selectCnt, selector);
                    // Rebuild the selector
                    rebuildSelector();
                    selector = this.selector;

                    // Select again to populate selectedKeys.
                    selector.selectNow();
                    selectCnt = 1;
                    break;
                }

                currentTimeNanos = time;
            }

            if (selectCnt > MIN_PREMATURE_SELECTOR_RETURNS) {
                if (logger.isDebugEnabled()) {
                    logger.debug("Selector.select() returned prematurely {} times in a row for Selector {}.",
                            selectCnt - 1, selector);
                }
            }
        } catch (CancelledKeyException e) {
            if (logger.isDebugEnabled()) {
                logger.debug(CancelledKeyException.class.getSimpleName() + " raised by a Selector {} - JDK bug?",
                        selector, e);
            }
        }
    }

From the source we can see that select is also an infinite loop. Let us look at each way out of it in turn.

 // Number of select calls
 int selectCnt = 0;
 // Current time
 long currentTimeNanos = System.nanoTime();
 // Deadline of the first scheduled task: now plus its remaining delay
 long selectDeadLineNanos = currentTimeNanos + delayNanos(currentTimeNanos);
 for (;;) {
     // If the first scheduled task is due within 0.5 ms, leave the loop
     long timeoutMillis = (selectDeadLineNanos - currentTimeNanos + 500000L) / 1000000L;
     if (timeoutMillis <= 0) {
         if (selectCnt == 0) {
             selector.selectNow();
             selectCnt = 1;
         }
         break;
     }

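First the select call count is set to 0 and the current system time in nanoseconds is read. Then the deadline of the first task due to run is taken from the scheduled task queue. Let us step into delayNanos first: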

 protected long delayNanos(long currentTimeNanos) {
        ScheduledFutureTask<?> scheduledTask = peekScheduledTask();
        // No scheduled task in the queue: return one second (in nanoseconds)
        if (scheduledTask == null) {
            return SCHEDULE_PURGE_INTERVAL;
        }
        // Otherwise return the remaining delay of that task
        return scheduledTask.delayNanos(currentTimeNanos);
    }
 // Peek at the first task in the scheduled task queue
    final ScheduledFutureTask<?> peekScheduledTask() {
        Queue<ScheduledFutureTask<?>> scheduledTaskQueue = this.scheduledTaskQueue;
        if (scheduledTaskQueue == null) {
            return null;
        }
        return scheduledTaskQueue.peek();
    }

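This fetches the first task in the scheduled task queue. If there is one, it returns the time left until that task should run, in nanoseconds; if there is none, it returns one second expressed in nanoseconds. Why the first task of the queue? Let us look at the scheduled task queue: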

PriorityQueue<ScheduledFutureTask<?>> scheduledTaskQueue;
PriorityQueue<ScheduledFutureTask<?>> scheduledTaskQueue() {
        if (scheduledTaskQueue == null) {
            scheduledTaskQueue = new DefaultPriorityQueue<ScheduledFutureTask<?>>(
                    SCHEDULED_FUTURE_TASK_COMPARATOR,
                    // Use same initial capacity as java.util.PriorityQueue
                    11);
        }
        return scheduledTaskQueue;
    }
private static final Comparator<ScheduledFutureTask<?>> SCHEDULED_FUTURE_TASK_COMPARATOR =
            new Comparator<ScheduledFutureTask<?>>() {
                @Override
                public int compare(ScheduledFutureTask<?> o1, ScheduledFutureTask<?> o2) {
                    return o1.compareTo(o2);
                }
            };

The name tells us this is a priority queue. What determines the priority? Our guess is the scheduled time. Let us follow the compareTo method of ScheduledFutureTask:

 public int compareTo(Delayed o) {
        if (this == o) {
            return 0;
        }

        ScheduledFutureTask<?> that = (ScheduledFutureTask<?>) o;
        long d = deadlineNanos() - that.deadlineNanos();
        // Compare deadlines; the earlier deadline sorts first
        if (d < 0) {
            return -1;
        } else if (d > 0) {
            return 1;
            // Same deadline: compare ids
        } else if (id < that.id) {
            return -1;
            // Same deadline and same id: this should never happen, so throw
        } else if (id == that.id) {
            throw new Error();
        } else {
            return 1;
        }
    }

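Sure enough, tasks are ordered by the time they are due to run, with the earliest at the head. That is why we took the deadline of the first element of the queue earlier.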
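The ordering can be tried out with the JDK's own java.util.PriorityQueue. The sketch below is only an illustration: DemoTask and DeadlineOrderDemo are hypothetical names, not Netty classes, and unlike the compareTo shown above, equal ids simply compare as equal instead of throwing.

```java
import java.util.PriorityQueue;

// Hypothetical stand-in for a scheduled task: ordered by deadline, then by id.
final class DemoTask implements Comparable<DemoTask> {
    final long deadlineNanos;
    final long id;

    DemoTask(long deadlineNanos, long id) {
        this.deadlineNanos = deadlineNanos;
        this.id = id;
    }

    @Override
    public int compareTo(DemoTask that) {
        // Earlier deadline first; ties are broken by the smaller id.
        int byDeadline = Long.compare(deadlineNanos, that.deadlineNanos);
        return byDeadline != 0 ? byDeadline : Long.compare(id, that.id);
    }
}

final class DeadlineOrderDemo {
    // Returns the ids in the order in which the queue hands the tasks out.
    static long[] pollOrder() {
        PriorityQueue<DemoTask> queue = new PriorityQueue<>();
        queue.add(new DemoTask(300L, 1L));
        queue.add(new DemoTask(100L, 3L));
        queue.add(new DemoTask(100L, 2L));
        long[] order = new long[3];
        for (int i = 0; i < order.length; i++) {
            order[i] = queue.poll().id;
        }
        return order;
    }
}
```

The two tasks with deadline 100 come out before the task with deadline 300, and among them the smaller id comes first.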

long timeoutMillis = (selectDeadLineNanos - currentTimeNanos + 500000L) / 1000000L;

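If the time left before the first scheduled task is due, plus 500000L, is still below 1000000 nanoseconds, timeoutMillis comes out as 0. In other words, if the next scheduled task is due in less than 0.5 ms, the for loop exits. Before exiting, the code checks whether select has been called at all in this round; if not, it calls the non-blocking selectNow once and then leaves the loop.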
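A few concrete values make the rounding easier to see. The helper below repeats the arithmetic of the timeoutMillis line; the class name SelectTimeoutDemo is made up for this illustration.

```java
final class SelectTimeoutDemo {
    // Same arithmetic as the timeoutMillis line: add 0.5 ms, then truncate to whole milliseconds.
    static long timeoutMillis(long selectDeadLineNanos, long currentTimeNanos) {
        return (selectDeadLineNanos - currentTimeNanos + 500_000L) / 1_000_000L;
    }
}
```

With a current time of 0, a deadline 400000 ns away gives 0 (the loop exits), 500000 ns gives 1, 1499999 ns still gives 1, and 1500000 ns gives 2. Adding 500000L before the integer division therefore rounds to the nearest millisecond instead of always rounding down.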
Let us continue.

// A task was added while polling, so cut this select short
if (hasTasks() && wakenUp.compareAndSet(false, true)) {
     selector.selectNow();
     selectCnt = 1;
     break;
 }

Note that wakenUp is set to false every time before select is executed:

// Call select, resetting wakenUp to false first
select(wakenUp.getAndSet(false));

Second, if a new task has been added to the task queue while polling, selectNow is executed once and the loop exits.

// Blocking select for timeoutMillis, the time left until the first scheduled task is due
int selectedKeys = selector.select(timeoutMillis);
selectCnt ++;
// Any of the following conditions breaks out of the loop
if (selectedKeys != 0 || oldWakenUp || wakenUp.get() || hasTasks() || hasScheduledTasks()) {
    break;
}

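Reaching this point means that no scheduled task is about to run and the task queue is empty, so a blocking select is performed. It blocks for the time left until the first scheduled task is due, or for 1 s by default when there is no scheduled task. If the scheduled task is far in the future, will select block for a long time? It will not: as we saw when analysing execute, adding a task to the task queue ends with a call to the selector's wakeup method.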

public void execute(Runnable task) {
        // Validate the argument
        if (task == null) {
            throw new NullPointerException("task");
        }
        // 1. Check whether the caller is the EventLoop thread
        boolean inEventLoop = inEventLoop();
        // 2. On the EventLoop thread, just add the task to the task queue
        if (inEventLoop) {
            addTask(task);
        } else {
            // 3. On an external thread, start the EventLoop thread if it is not running yet
            startThread();
            addTask(task);
            if (isShutdown() && removeTask(task)) {
                reject();
            }
        }
        // 4. Decide whether the NioEventLoop thread must be woken from a blocking select
        if (!addTaskWakesUp && wakesUpForTask(task)) {
            wakeup(inEventLoop);
        }
    }

So when an external thread adds a task to the task queue, the blocking select returns. Fourth, the following situations also exit the loop:

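IO events were selected
The user woke the selector up
A new task was added to the task queue
A scheduled task needs to run
The method was invoked with the argument true (oldWakenUp)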
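The effect of wakeup on a blocking select can be observed with plain JDK NIO, without Netty. The sketch below relies on documented Selector behaviour: if wakeup() is called while no selection is in progress, the next select call returns immediately. WakeupDemo is a hypothetical name used only here.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.channels.Selector;

final class WakeupDemo {
    // Returns how long select(5000) actually blocked, in milliseconds, after a prior wakeup().
    static long blockedMillisAfterWakeup() {
        try (Selector selector = Selector.open()) {
            // No selection is in progress, so this wakeup applies to the next select call.
            selector.wakeup();
            long start = System.nanoTime();
            // Asks for up to 5 seconds but returns at once because of the pending wakeup.
            selector.select(5_000L);
            return (System.nanoTime() - start) / 1_000_000L;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

No channel is registered on the selector, so without the wakeup the call would block for the full five seconds.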

// An interrupt of the running thread also breaks out of the loop
if (Thread.interrupted()) {
    if (logger.isDebugEnabled()) {
        logger.debug("Selector.select() returned prematurely because " +
                "Thread.currentThread().interrupt() was called. Use " +
                "NioEventLoop.shutdownGracefully() to shutdown the NioEventLoop.");
    }
    selectCnt = 1;
    break;
}

If the current thread has been interrupted, the loop also exits.

// Work around the JDK empty-polling bug
    long time = System.nanoTime();
    // Check whether select really blocked for timeoutMillis
    if (time - TimeUnit.MILLISECONDS.toNanos(timeoutMillis) >= currentTimeNanos) {
        // timeoutMillis elapsed without anything selected.
        selectCnt = 1;
    } else if (SELECTOR_AUTO_REBUILD_THRESHOLD > 0 &&
            selectCnt >= SELECTOR_AUTO_REBUILD_THRESHOLD) {
        logger.warn(
                "Selector.select() returned prematurely {} times in a row; rebuilding Selector {}.",
                selectCnt, selector);
        // Rebuild the selector
        rebuildSelector();
        selector = this.selector;

        // Select again to populate selectedKeys.
        selector.selectNow();
        selectCnt = 1;
        break;
    }

    currentTimeNanos = time;
}

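This is where Netty works around the NIO empty-polling problem that drives the CPU to 100%.
It first checks whether select really blocked for timeoutMillis. If it did, this was a valid blocking select. If it did not, an empty poll may have occurred, so the code checks whether the select count has reached a threshold, 512 by default. Once the threshold is reached, the current selector is discarded, a new selector is created, and the channels on the old selector are registered on the new one.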
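The counting logic can be isolated from the selector itself. The following is a minimal sketch under the assumption that the caller reports the start time, end time and requested timeout of each select that returned nothing; EmptyPollDetector and its methods are hypothetical names, not Netty API.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical helper that mirrors the premature-return counting in select().
final class EmptyPollDetector {
    private final int rebuildThreshold;
    private int selectCnt;

    EmptyPollDetector(int rebuildThreshold) {
        this.rebuildThreshold = rebuildThreshold;
    }

    // Call after a select() that selected nothing. Returns true when the selector should be rebuilt.
    boolean afterSelect(long startNanos, long endNanos, long timeoutMillis) {
        selectCnt++;
        if (endNanos - TimeUnit.MILLISECONDS.toNanos(timeoutMillis) >= startNanos) {
            // select blocked for the full timeout, so this was a normal return.
            selectCnt = 1;
            return false;
        }
        if (rebuildThreshold > 0 && selectCnt >= rebuildThreshold) {
            // Too many premature returns in a row.
            selectCnt = 1;
            return true;
        }
        return false;
    }

    // Counts how many consecutive premature returns it takes to request a rebuild.
    static int prematureReturnsUntilRebuild(int threshold) {
        EmptyPollDetector detector = new EmptyPollDetector(threshold);
        int calls = 1;
        // Each simulated select asked for 1000 ms but returned after 10 ns.
        while (!detector.afterSelect(0L, 10L, 1_000L)) {
            calls++;
        }
        return calls;
    }
}
```

With a threshold of 512, the 512th premature return in a row is the one that requests the rebuild, while a single select that blocks for its full timeout resets the count.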

private void rebuildSelector0() {
        final Selector oldSelector = selector;
        final SelectorTuple newSelectorTuple;

        if (oldSelector == null) {
            return;
        }

        try {
            newSelectorTuple = openSelector();
        } catch (Exception e) {
            logger.warn("Failed to create a new Selector.", e);
            return;
        }

        // Register all channels to the new Selector.
        int nChannels = 0;
        // Iterate over all SelectionKeys
        for (SelectionKey key: oldSelector.keys()) {
            Object a = key.attachment();
            try {
                if (!key.isValid() || key.channel().keyFor(newSelectorTuple.unwrappedSelector) != null) {
                    continue;
                }

                int interestOps = key.interestOps();
                key.cancel();
                // Register the channel from the old selector on the new selector
                SelectionKey newKey = key.channel().register(newSelectorTuple.unwrappedSelector, interestOps, a);
                if (a instanceof AbstractNioChannel) {
                    // Update SelectionKey
                    ((AbstractNioChannel) a).selectionKey = newKey;
                }
                nChannels ++;
            } catch (Exception e) {
                logger.warn("Failed to re-register a Channel to the new Selector.", e);
                if (a instanceof AbstractNioChannel) {
                    AbstractNioChannel ch = (AbstractNioChannel) a;
                    ch.unsafe().close(ch.unsafe().voidPromise());
                } else {
                    @SuppressWarnings("unchecked")
                    NioTask<SelectableChannel> task = (NioTask<SelectableChannel>) a;
                    invokeChannelUnregistered(task, key, e);
                }
            }
        }

        selector = newSelectorTuple.selector;
        unwrappedSelector = newSelectorTuple.unwrappedSelector;

        try {
            // time to close the old selector as everything else is registered to the new one
            oldSelector.close();
        } catch (Throwable t) {
            if (logger.isWarnEnabled()) {
                logger.warn("Failed to close the old Selector.", t);
            }
        }

        logger.info("Migrated " + nChannels + " channel(s) to the new Selector.");
    }

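As we can see, it iterates over the SelectionKeys registered on the old selector, cancels each key, and registers its channel on the newly created selector. Once the migration is done, the old selector is closed.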
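The same migration can be reproduced with nothing but JDK NIO. This is a simplified sketch, not Netty's code: SelectorRebuildSketch is a hypothetical class, error handling for individual channels is left out, and the demo registers a single unbound ServerSocketChannel just to have a key to move.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.util.ArrayList;

final class SelectorRebuildSketch {
    // Moves every valid key to a fresh selector, closes the old one, and returns the new one.
    static Selector rebuild(Selector oldSelector) throws IOException {
        Selector newSelector = Selector.open();
        for (SelectionKey key : new ArrayList<>(oldSelector.keys())) {
            if (!key.isValid() || key.channel().keyFor(newSelector) != null) {
                continue;
            }
            int interestOps = key.interestOps();
            Object attachment = key.attachment();
            // Cancel on the old selector, then register with the same ops and attachment.
            key.cancel();
            key.channel().register(newSelector, interestOps, attachment);
        }
        oldSelector.close();
        return newSelector;
    }

    // Registers one channel, rebuilds, and reports how many keys the new selector holds.
    static int keysAfterRebuild() {
        try (ServerSocketChannel channel = ServerSocketChannel.open()) {
            channel.configureBlocking(false);
            Selector oldSelector = Selector.open();
            channel.register(oldSelector, SelectionKey.OP_ACCEPT, "attachment");
            Selector newSelector = rebuild(oldSelector);
            try {
                return newSelector.keys().size();
            } finally {
                newSelector.close();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The interest ops and the attachment travel with the channel, which is why the original code can re-point AbstractNioChannel.selectionKey at the new key afterwards.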
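To summarise, select keeps polling for IO events. While polling it checks whether a new task has been added, whether a scheduled task is due, and whether the thread has been interrupted; any of these ends the polling. It also measures how long select took in order to detect empty polling, and works around that bug by rebuilding the selector.
Let us go back to the run method.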

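// ioRatio: percentage of loop time given to IO events; the rest goes to queued tasks
final int ioRatio = this.ioRatio;
// ioRatio == 100: no time budget; process all IO events, then run all queued tasks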
if (ioRatio == 100) {
    try {
        processSelectedKeys();
    } finally {
        // Ensure we always run tasks.
        runAllTasks();
    }
} else {
    final long ioStartTime = System.nanoTime();
    try {
        // 2. Process the ready events
        processSelectedKeys();
    } finally {
        // Ensure we always run tasks.
        final long ioTime = System.nanoTime() - ioStartTime;
        // 3. Run queued tasks
        runAllTasks(ioTime * (100 - ioRatio) / ioRatio);
    }
}

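The code above shows that the split between handling events and running tasks is controlled by ioRatio, which defaults to 50, meaning that event handling and task execution each get half of the time. A value of 100 means that all events are processed first and then the tasks in the task queue are run, with no time limit.
Let us first look at processSelectedKeys(), which handles IO events.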

private void processSelectedKeys() {
        // Optimized vs. plain key set
        if (selectedKeys != null) {
            processSelectedKeysOptimized();
        } else {
            processSelectedKeysPlain(selector.selectedKeys());
        }
    }

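We covered this in the server startup analysis: Netty replaces the HashSet with an array, which is faster to iterate. The optimization is used by default, so let us follow the optimized method.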
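To make the idea concrete, here is a rough sketch of an array-backed key container. It is an assumption-laden illustration rather than Netty's SelectedSelectionKeySet: ArrayKeySet, its initial capacity of 4 and its method names are invented for this example.

```java
import java.util.Arrays;

// Hypothetical array-backed key set: append-only add, indexed read, bulk reset.
final class ArrayKeySet<T> {
    private Object[] keys = new Object[4];
    private int size;

    boolean add(T key) {
        if (key == null) {
            return false;
        }
        if (size == keys.length) {
            // Double the array when it is full.
            keys = Arrays.copyOf(keys, keys.length << 1);
        }
        keys[size++] = key;
        return true;
    }

    @SuppressWarnings("unchecked")
    T get(int index) {
        return (T) keys[index];
    }

    int size() {
        return size;
    }

    // Null out the used slots so the entries can be garbage collected.
    void reset() {
        Arrays.fill(keys, 0, size, null);
        size = 0;
    }

    static int sizeAfterAdding(int count) {
        ArrayKeySet<String> set = new ArrayKeySet<>();
        for (int i = 0; i < count; i++) {
            set.add("key-" + i);
        }
        return set.size();
    }
}
```

Adding is a single array store and iteration is a plain indexed loop, with no hashing and no iterator objects, which is where the speed-up over a HashSet comes from.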

// Event processing over the optimized key set
    private void processSelectedKeysOptimized() {
        // selectedKeys is a SelectedSelectionKeySet, Netty's replacement for Set<SelectionKey>
        // It swaps the original HashSet for an array, which is faster to iterate
        for (int i = 0; i < selectedKeys.size; ++i) {
            // The i-th selected key
            final SelectionKey k = selectedKeys.keys[i];
            // Null out the slot so the key can be garbage collected; as in plain NIO,
            // select() does not remove handled keys from the selected set by itself
            selectedKeys.keys[i] = null;
            // Fetch the attachment
            final Object a = k.attachment();
            // Usually an AbstractNioChannel
            if (a instanceof AbstractNioChannel) {
                processSelectedKey(k, (AbstractNioChannel) a);
            } else {
                @SuppressWarnings("unchecked")
                NioTask<SelectableChannel> task = (NioTask<SelectableChannel>) a;
                processSelectedKey(k, task);
            }

            if (needsToSelectAgain) {
                selectedKeys.reset(i + 1);

                selectAgain();
                i = -1;
            }
        }
    }

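This is the traversal over the optimized selectedKeys, with each entry of selectedKeys.keys[] set to null after it has been read. Because the attachment can be large, leaving the entries in place could hold on to a lot of memory, and a burst of events at peak time might even lead to an out-of-memory error. Setting them to null helps the GC reclaim them.
In the source above, the instance returned by k.attachment() is an AbstractNioChannel, because when the server channel was registered it attached this, which is an AbstractNioChannel. So execution goes into processSelectedKey.
The this attached at registration: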

  // 1. Call the underlying NIO register, attaching the channel (this) with no interest ops
selectionKey = javaChannel().register(eventLoop().unwrappedSelector(), 0, this);

// Handle the NIO events of one key
    private void processSelectedKey(SelectionKey k, AbstractNioChannel ch) {
        // Get the NioUnsafe that performs the IO for this channel
        final AbstractNioChannel.NioUnsafe unsafe = ch.unsafe();
        // The SelectionKey is no longer valid
        if (!k.isValid()) {
            final EventLoop eventLoop;
            try {
                eventLoop = ch.eventLoop();
            } catch (Throwable ignored) {
                return;
            }

            if (eventLoop != this || eventLoop == null) {
                return;
            }
            // close the channel if the key is not valid anymore
            unsafe.close(unsafe.voidPromise());
            return;
        }

        try {
            // Get the ready ops
            int readyOps = k.readyOps();
            // Connect event is ready
            if ((readyOps & SelectionKey.OP_CONNECT) != 0) {
                int ops = k.interestOps();
                ops &= ~SelectionKey.OP_CONNECT;
                k.interestOps(ops);

                unsafe.finishConnect();
            }
            // Write event is ready
            if ((readyOps & SelectionKey.OP_WRITE) != 0) {
                // Call forceFlush, which clears OP_WRITE once everything has been written
                ch.unsafe().forceFlush();
            }
            // Read or accept event is ready: call unsafe.read()
            if ((readyOps & (SelectionKey.OP_READ | SelectionKey.OP_ACCEPT)) != 0 || readyOps == 0) {
                unsafe.read();
            }
        } catch (CancelledKeyException ignored) {
            unsafe.close(unsafe.voidPromise());
        }
    }

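Here the various polled events are handled. A bossGroup mainly receives OP_ACCEPT events, which are handled by the ServerBootstrapAcceptor added to the pipeline; it hands each accepted channel over to the workGroup, which mainly deals with read and write events. We will cover the details later.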

// Once 256 channels have been cancelled from the selector, clear the remaining selectedKeys
// and run selectNow again
if (needsToSelectAgain) {
    selectedKeys.reset(i + 1);
    selectAgain();
    i = -1;
}
private void selectAgain() {
    needsToSelectAgain = false;
    try {
        selector.selectNow();
    } catch (Throwable t) {
        logger.warn("Failed to update SelectionKeys.", t);
    }
}

So when is selectAgain executed? Let us first find where needsToSelectAgain is set to true. Tracing it leads here:

void cancel(SelectionKey key) {
        key.cancel();
        cancelledKeys ++;
        // Once cancel has been called 256 times, set needsToSelectAgain to true
        if (cancelledKeys >= CLEANUP_INTERVAL) {
            cancelledKeys = 0;
            needsToSelectAgain = true;
        }
    }

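In other words, once 256 channels have been removed from the selector, this flag is set to true, the remaining entries of selectedKeys are set to null, and polling is run again. My guess is that after 256 channels have disconnected Netty clears these entries so that the array does not keep holding stale keys and consuming memory, and calls selectNow again so that selectedKeys stays valid.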
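The counter behind needsToSelectAgain is small enough to model on its own. In the sketch below, CancelCounter is a hypothetical class that keeps only the counting from the cancel method above and leaves out the actual key.cancel() call.

```java
// Hypothetical model of the cancel counter; the real method also cancels the SelectionKey.
final class CancelCounter {
    static final int CLEANUP_INTERVAL = 256;

    private int cancelledKeys;
    private boolean needsToSelectAgain;

    void cancel() {
        cancelledKeys++;
        if (cancelledKeys >= CLEANUP_INTERVAL) {
            // Reset the counter and ask the event loop to select again.
            cancelledKeys = 0;
            needsToSelectAgain = true;
        }
    }

    boolean needsToSelectAgain() {
        return needsToSelectAgain;
    }

    // Reports whether the flag is set after the given number of cancellations.
    static boolean flaggedAfter(int cancellations) {
        CancelCounter counter = new CancelCounter();
        for (int i = 0; i < cancellations; i++) {
            counter.cancel();
        }
        return counter.needsToSelectAgain();
    }
}
```

The flag stays false after 255 cancellations and flips on the 256th. In the run loop shown earlier, cancelledKeys and needsToSelectAgain are both reset at the start of every iteration, so the count only accumulates within one pass.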
That completes 1. the select method and 2. the handling of the polled selectedKeys. Next comes runAllTasks.

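// ioRatio: percentage of loop time given to IO events; the rest goes to queued tasks
final int ioRatio = this.ioRatio;
// ioRatio == 100: no time budget; process all IO events, then run all queued tasks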
if (ioRatio == 100) {
    try {
        processSelectedKeys();
    } finally {
        // Ensure we always run tasks.
        runAllTasks();
    }
} else {
    final long ioStartTime = System.nanoTime();
    try {
        // 2. Process the ready events
        processSelectedKeys();
    } finally {
        // Ensure we always run tasks.
        final long ioTime = System.nanoTime() - ioStartTime;
        // 3. Run queued tasks
        runAllTasks(ioTime * (100 - ioRatio) / ioRatio);
    }
}

By default ioRatio is 50, so processSelectedKeys and runAllTasks get equal time. The run time of processSelectedKeys() is measured first, and ioTime * (100 - ioRatio) / ioRatio then gives the time budget for runAllTasks. Let us step into runAllTasks:
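A quick calculation shows how the budget moves with ioRatio. The helper below only wraps the formula from the run method; IoRatioDemo is a made-up name, and the range check is an assumption of this sketch, since the ioRatio == 100 branch applies no budget at all.

```java
final class IoRatioDemo {
    // Time budget for tasks, derived from the time just spent on IO.
    static long taskBudgetNanos(long ioTimeNanos, int ioRatio) {
        if (ioRatio <= 0 || ioRatio >= 100) {
            throw new IllegalArgumentException("ioRatio: " + ioRatio + " (expected: 1..99)");
        }
        return ioTimeNanos * (100 - ioRatio) / ioRatio;
    }
}
```

If IO took 10 ms and ioRatio is 50, tasks get 10 ms. With ioRatio 80 and 8 ms of IO, tasks get 2 ms. With ioRatio 20 and 1 ms of IO, tasks get 4 ms.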

protected boolean runAllTasks(long timeoutNanos) {
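        // Move scheduled tasks whose deadline has arrived into the normal task queue
        fetchFromScheduledTaskQueue();
        // Poll the normal task queue for a task that is not the empty WAKEUP_TASK
        Runnable task = pollTask();
        if (task == null) {
            // Run the work that follows task execution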
            afterRunningAllTasks();
            return false;
        }
        // Deadline up to which tasks may run
        final long deadline = ScheduledFutureTask.nanoTime() + timeoutNanos;
        long runTasks = 0;
        long lastExecutionTime;
        for (;;) {
            // Run the task
            safeExecute(task);
            // Count the tasks that have been run
            runTasks ++;

            // Check timeout every 64 tasks because nanoTime() is relatively expensive.
            // XXX: Hard-coded value - will make it configurable if it is really a problem.
            // After every 64 tasks, check whether the time budget is used up; if so, leave the loop
            if ((runTasks & 0x3F) == 0) {
                lastExecutionTime = ScheduledFutureTask.nanoTime();
                if (lastExecutionTime >= deadline) {
                    break;
                }
            }
            // If the normal task queue is empty, record the last execution time and stop
            task = pollTask();
            if (task == null) {
                lastExecutionTime = ScheduledFutureTask.nanoTime();
                break;
            }
        }
        // Run the work that follows task execution
        afterRunningAllTasks();
        this.lastExecutionTime = lastExecutionTime;
        return true;
    }

Let us first look at the ways a task can be added to the NioEventLoop task queue.
1. Submitting a normal task

ctx.channel().eventLoop().execute(new Runnable() {
            @Override
            public void run() {

            }
        });

Following execute leads to SingleThreadEventExecutor#execute:

public void execute(Runnable task) {
        // Validate the argument
        if (task == null) {
            throw new NullPointerException("task");
        }
        // 1. Check whether the caller is the EventLoop thread
        boolean inEventLoop = inEventLoop();
        // 2. On the EventLoop thread, just add the task to the task queue
        if (inEventLoop) {
            addTask(task);
        } else {
            // 3. On an external thread, start the EventLoop thread if it is not running yet
            startThread();
            addTask(task);
            if (isShutdown() && removeTask(task)) {
                reject();
            }
        }
        // 4. Decide whether the NioEventLoop thread must be woken from a blocking select
        if (!addTaskWakesUp && wakesUpForTask(task)) {
            wakeup(inEventLoop);
        }
    }

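We analysed this method earlier, where it was the call that started the NioEventLoop thread. Here the thread is already running, so the task only needs to be added to the task queue. Let us look at addTask: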

protected void addTask(Runnable task) {
        if (task == null) {
            throw new NullPointerException("task");
        }
        // 1. Add the task
        if (!offerTask(task)) {
            // 2. Adding failed: apply the rejection policy
            reject(task);
        }
    }
final boolean offerTask(Runnable task) {
        // 1. Check whether the executor has been shut down
        if (isShutdown()) {
            reject();
        }
        return taskQueue.offer(task);
    }

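This method simply adds the task to the queue. In SingleThreadEventExecutor the default queue is a LinkedBlockingQueue, although NioEventLoop overrides newTaskQueue to supply its own queue implementation. If adding fails, the rejection policy is invoked directly.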
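The offer-then-reject pattern can be shown with a bounded JDK queue. This is a sketch under assumptions: BoundedTaskQueueDemo is hypothetical, a LinkedBlockingQueue with a small capacity stands in for the task queue, and the rejection is modelled by throwing RejectedExecutionException directly.

```java
import java.util.Queue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.RejectedExecutionException;

final class BoundedTaskQueueDemo {
    private final Queue<Runnable> taskQueue;

    BoundedTaskQueueDemo(int maxPendingTasks) {
        this.taskQueue = new LinkedBlockingQueue<>(maxPendingTasks);
    }

    void addTask(Runnable task) {
        if (task == null) {
            throw new NullPointerException("task");
        }
        // offer returns false instead of blocking when the queue is full.
        if (!taskQueue.offer(task)) {
            throw new RejectedExecutionException("task queue is full");
        }
    }

    // With room for one pending task, the second add must be rejected.
    static boolean secondTaskRejected() {
        BoundedTaskQueueDemo demo = new BoundedTaskQueueDemo(1);
        demo.addTask(() -> { });
        try {
            demo.addTask(() -> { });
            return false;
        } catch (RejectedExecutionException expected) {
            return true;
        }
    }
}
```

Using offer rather than put matters here: the caller may be the event loop thread itself, which must never block waiting for queue space that only it can free.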
2. Scheduling a timed task

ctx.channel().eventLoop().scheduleAtFixedRate(new Runnable() {
            @Override
            public void run() {

            }
        },0,10, TimeUnit.SECONDS);
<V> ScheduledFuture<V> schedule(final ScheduledFutureTask<V> task) {
        if (inEventLoop()) {
            scheduledTaskQueue().add(task);
        } else {
            execute(new Runnable() {
                @Override
                public void run() {
                    scheduledTaskQueue().add(task);
                }
            });
        }

        return task;
    }

As we can see, the scheduled task is added to the scheduled task queue.
Let us go back to runAllTasks and look at it piece by piece.

// Move scheduled tasks whose deadline has arrived into the normal task queue
fetchFromScheduledTaskQueue();

private boolean fetchFromScheduledTaskQueue() {
        // Time elapsed since startup, in nanoseconds
        long nanoTime = AbstractScheduledEventExecutor.nanoTime();
        Runnable scheduledTask  = pollScheduledTask(nanoTime);
        while (scheduledTask != null) {
            // Move the due task from the scheduled task queue to the normal queue
            if (!taskQueue.offer(scheduledTask)) {
                // No space left in the task queue add it back to the scheduledTaskQueue so we pick it up again.
                // If the normal task queue rejects it, put the task back into the scheduled task queue
                scheduledTaskQueue().add((ScheduledFutureTask<?>) scheduledTask);
                return false;
            }
            // Poll the scheduled task queue again until no due task is left
            scheduledTask  = pollScheduledTask(nanoTime);
        }
        return true;
    }
protected final Runnable pollScheduledTask(long nanoTime) {
        // Must be called from the event loop thread
        assert inEventLoop();

        Queue<ScheduledFutureTask<?>> scheduledTaskQueue = this.scheduledTaskQueue;
        // If the scheduled task queue exists, peek at its first task; otherwise use null
        ScheduledFutureTask<?> scheduledTask = scheduledTaskQueue == null ? null : scheduledTaskQueue.peek();
        if (scheduledTask == null) {
            return null;
        }
        // The scheduled task has reached its deadline
        if (scheduledTask.deadlineNanos() <= nanoTime) {
            scheduledTaskQueue.remove();
            return scheduledTask;
        }
        return null;
    }

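The first task of the scheduled task queue is examined, and it is returned only if its deadline has arrived. Each due task is then added to the normal task queue; if that fails, the task is put back into the scheduled task queue.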
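The transfer can be modelled with two JDK queues. In the sketch below, which is an illustration only, a task is reduced to its deadline (a Long), the current time is passed in as a parameter, and DueTaskTransferDemo is a hypothetical name.

```java
import java.util.ArrayDeque;
import java.util.PriorityQueue;
import java.util.Queue;

final class DueTaskTransferDemo {
    // Moves every deadline that is <= nowNanos to taskQueue and returns how many were moved.
    static int transferDue(PriorityQueue<Long> scheduled, Queue<Long> taskQueue, long nowNanos) {
        int moved = 0;
        for (;;) {
            Long head = scheduled.peek();
            if (head == null || head > nowNanos) {
                // Nothing left, or the earliest task is not due yet.
                return moved;
            }
            scheduled.remove();
            if (!taskQueue.offer(head)) {
                // No room in the task queue: put the task back and stop.
                scheduled.add(head);
                return moved;
            }
            moved++;
        }
    }

    // Returns {number moved, number still scheduled} for deadlines 10, 20, 30 at time 20.
    static int[] demo() {
        PriorityQueue<Long> scheduled = new PriorityQueue<>();
        scheduled.add(30L);
        scheduled.add(10L);
        scheduled.add(20L);
        Queue<Long> taskQueue = new ArrayDeque<>();
        int moved = transferDue(scheduled, taskQueue, 20L);
        return new int[] {moved, scheduled.size()};
    }
}
```

Because the queue is ordered by deadline, the loop can stop at the first task that is not yet due; everything behind it is due even later.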

 // Poll the normal task queue for a task that is not the empty WAKEUP_TASK
        Runnable task = pollTask();
        if (task == null) {
            // Run the work that follows task execution
            afterRunningAllTasks();
            return false;
        }
protected Runnable pollTask() {
        assert inEventLoop();
        return pollTaskFrom(taskQueue);
    }
protected static Runnable pollTaskFrom(Queue<Runnable> taskQueue) {
        for (;;) {
            Runnable task = taskQueue.poll();
            if (task == WAKEUP_TASK) {
                continue;
            }
            return task;
        }
    }

It loops over the normal task queue until it gets a task that is not the empty WAKEUP_TASK. If no task is obtained, some wrap-up work is done:
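The sentinel check is an identity comparison, which a small standalone example makes visible. WakeupTaskDemo and its WAKEUP_TASK field are stand-ins defined for this sketch; only the loop shape is taken from pollTaskFrom above.

```java
import java.util.ArrayDeque;
import java.util.Queue;

final class WakeupTaskDemo {
    // Sentinel whose only job is to unblock a waiting consumer; it does no work.
    static final Runnable WAKEUP_TASK = new Runnable() {
        @Override
        public void run() {
            // Intentionally empty.
        }
    };

    // Returns the next real task, skipping sentinels, or null if the queue is drained.
    static Runnable pollTaskFrom(Queue<Runnable> taskQueue) {
        for (;;) {
            Runnable task = taskQueue.poll();
            if (task == WAKEUP_TASK) {
                continue;
            }
            return task;
        }
    }

    static boolean skipsSentinel() {
        Queue<Runnable> queue = new ArrayDeque<>();
        Runnable real = () -> { };
        queue.add(WAKEUP_TASK);
        queue.add(WAKEUP_TASK);
        queue.add(real);
        // First poll skips both sentinels; second poll finds the queue empty.
        return pollTaskFrom(queue) == real && pollTaskFrom(queue) == null;
    }
}
```

A queue that contains only sentinels yields null, which the caller treats the same as an empty queue.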

protected void afterRunningAllTasks() {
        runAllTasksFrom(tailTasks);
    }
protected final boolean runAllTasksFrom(Queue<Runnable> taskQueue) {
        Runnable task = pollTaskFrom(taskQueue);
        if (task == null) {
            return false;
        }
        for (;;) {
            safeExecute(task);
            task = pollTaskFrom(taskQueue);
            if (task == null) {
                return true;
            }
        }
    }

The wrap-up work is itself a task queue (tailTasks), whose tasks are run in turn.

// Deadline up to which tasks may run
 final long deadline = ScheduledFutureTask.nanoTime() + timeoutNanos;
 long runTasks = 0;
 long lastExecutionTime;

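This computes the deadline for this round of task execution and declares the bookkeeping variables: the number of tasks run and the time of the last executed task.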

for (;;) {
      // Run the task
      safeExecute(task);
      // Count the tasks that have been run
      runTasks ++;
      // After every 64 tasks, check whether the time budget is used up; if so, leave the loop
      if ((runTasks & 0x3F) == 0) {
          lastExecutionTime = ScheduledFutureTask.nanoTime();
          if (lastExecutionTime >= deadline) {
              break;
          }
      }
      // If the normal task queue is empty, record the last execution time and stop
      task = pollTask();
      if (task == null) {
          lastExecutionTime = ScheduledFutureTask.nanoTime();
          break;
      }
  }

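Inside the loop, the run method of each task is executed and the number of tasks run is counted. After every 64 tasks the code checks whether the deadline has been reached; if so it leaves the loop, and the remaining tasks wait for the next iteration. Netty does this so that running tasks without pause does not hold up a large number of incoming user requests. If the normal queue has been drained, the time of the last executed task is recorded. Note that a single long-running task can still leave many user requests blocked and unserved, so blocking work should not be run on the eventLoop thread.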
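A consequence of checking the clock only every 64 tasks is that the budget is a soft limit. The sketch below reproduces the loop shape with System.nanoTime() in place of ScheduledFutureTask.nanoTime(); BudgetedRunDemo is a hypothetical name, and the tail-task handling is omitted.

```java
import java.util.ArrayDeque;
import java.util.Queue;

final class BudgetedRunDemo {
    // Runs queued tasks until the queue is empty or the budget is found to be exceeded.
    static int runTasks(Queue<Runnable> taskQueue, long timeoutNanos) {
        Runnable task = taskQueue.poll();
        if (task == null) {
            return 0;
        }
        final long deadline = System.nanoTime() + timeoutNanos;
        int runTasks = 0;
        for (;;) {
            task.run();
            runTasks++;
            // The clock is read only once per 64 tasks.
            if ((runTasks & 0x3F) == 0 && System.nanoTime() - deadline >= 0) {
                break;
            }
            task = taskQueue.poll();
            if (task == null) {
                break;
            }
        }
        return runTasks;
    }

    // Queues the given number of no-op tasks and reports how many were run.
    static int runWithBudget(int queued, long timeoutNanos) {
        Queue<Runnable> queue = new ArrayDeque<>();
        for (int i = 0; i < queued; i++) {
            queue.add(() -> { });
        }
        return runTasks(queue, timeoutNanos);
    }
}
```

With a budget of zero and 200 queued tasks, 64 tasks still run before the first clock check stops the loop. With only 10 queued tasks and a zero budget, all 10 run, because the check is never reached.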

// Run the work that follows task execution
afterRunningAllTasks();
this.lastExecutionTime = lastExecutionTime;
return true;

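Finally, as above, some wrap-up work is done and the last execution time is recorded. That completes runAllTasks. Now let us look at the scheduled tasks mentioned earlier.
There are three kinds of scheduled task: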

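Tasks that run once after a delay
Tasks that run periodically at a fixed rate after an initial delay
Tasks that run again a fixed delay after each run finishes
Netty tells the three apart by the value of periodNanos: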

/* 0 - no repeat, >0 - repeat at fixed rate, <0 - repeat with fixed delay */
    private final long periodNanos;
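The sign convention in that comment suggests how the next deadline could be computed after a run. The article does not show the rescheduling code, so the following is an assumption-based sketch: PeriodDemo and nextDeadline are hypothetical, and the two formulas are the usual meaning of fixed rate (previous deadline plus period) and fixed delay (finish time plus delay).

```java
import java.util.OptionalLong;

final class PeriodDemo {
    // Next deadline after a run that finished at nowNanos; empty if the task does not repeat.
    static OptionalLong nextDeadline(long deadlineNanos, long periodNanos, long nowNanos) {
        if (periodNanos == 0) {
            // 0: run once, no repeat.
            return OptionalLong.empty();
        }
        if (periodNanos > 0) {
            // > 0: fixed rate, measured from the previous deadline.
            return OptionalLong.of(deadlineNanos + periodNanos);
        }
        // < 0: fixed delay, measured from the time the run finished.
        return OptionalLong.of(nowNanos - periodNanos);
    }
}
```

For a task that was due at 100 and finished at 150, a period of 10 gives a next deadline of 110, while a period of -10 gives 160. A fixed-rate task can therefore already be overdue when it is rescheduled; a fixed-delay task never is.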

Let us start with the first kind.

ctx.channel().eventLoop().schedule(new Runnable() {
            @Override
            public void run() {
                System.out.println(1);
            }
        },10,TimeUnit.SECONDS);

Following the call leads to the schedule method:

public ScheduledFuture<?> schedule(Runnable command, long delay, TimeUnit unit) {
        // Validate the arguments
        ObjectUtil.checkNotNull(command, "command");
        ObjectUtil.checkNotNull(unit, "unit");
        if (delay < 0) {
            delay = 0;
        }
        // Wrap the command in a ScheduledFutureTask
        return schedule(new ScheduledFutureTask<Void>(
                this, command, null, ScheduledFutureTask.deadlineNanos(unit.toNanos(delay))));
    }
ScheduledFutureTask(
            AbstractScheduledEventExecutor executor,
            Runnable runnable, V result, long nanoTime) {

        this(executor, toCallable(runnable, result), nanoTime);
    }
ScheduledFutureTask(
            AbstractScheduledEventExecutor executor,
            Callable<V> callable, long nanoTime) {

        super(executor, callable);
        deadlineNanos = nanoTime;
        periodNanos = 0;
    }

Here a ScheduledFutureTask is created with periodNanos set to 0 by default, and then schedule is called:

<V> ScheduledFuture<V> schedule(final ScheduledFutureTask<V> task) {
        // On the event loop thread, add the task to the scheduled task queue directly
        if (inEventLoop()) {
            scheduledTaskQueue().add(task);
        } else {
            // Otherwise wrap the add in a task and put it on the normal task queue to run later
            execute(new Runnable() {
                @Override
                public void run() {
                    scheduledTaskQueue().add(task);
                }
            });
        }

        return task;
    }

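The task is added to the DefaultPriorityQueue that backs the scheduled task queue. As mentioned earlier, this queue orders tasks by deadline, earliest first; if two deadlines are equal the smaller id goes first, and if the ids are also equal an Error is thrown.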

public int compareTo(Delayed o) {
        if (this == o) {
            return 0;
        }

        ScheduledFutureTask<?> that = (ScheduledFutureTask<?>) o;
        long d = deadlineNanos() - that.deadlineNanos();
        // Compare deadlines; the earlier deadline sorts first
        if (d < 0) {
            return -1;
        } else if (d > 0) {
            return 1;
            // Same deadline: compare ids
        } else if (id < that.id) {
            return -1;
            // Same deadline and same id: this should never happen, so throw
        } else if (id == that.id) {
            throw new Error();
        } else {
            return 1;
        }
    }

Now the second case, a task run periodically at a fixed rate:

ctx.channel().eventLoop().scheduleAtFixedRate(new Runnable() {
            @Override
            public void run() {
                System.out.println(1);
            }
        },0,10, TimeUnit.SECONDS);
public ScheduledFuture<?> scheduleAtFixedRate(Runnable command, long initialDelay, long period, TimeUnit unit) {
        // Validate the arguments
        ObjectUtil.checkNotNull(command, "command");
        ObjectUtil.checkNotNull(unit, "unit");
        if (initialDelay < 0) {
            throw new IllegalArgumentException(
                    String.format("initialDelay: %d (expected: >= 0)", initialDelay));
        }
        if (period <= 0) {
            throw new IllegalArgumentException(
                    String.format("period: %d (expected: > 0)", period));
        }

        return schedule(new ScheduledFutureTask<Void>(
                this, Executors.<Void>callable(command, null),
                ScheduledFutureTask.deadlineNanos(unit.toNanos(initialDelay)), unit.toNanos(period)));
    }

ScheduledFutureTask(
            AbstractScheduledEventExecutor executor,
            Callable<V> callable, long nanoTime, long period) {

        super(executor, callable);
        if (period == 0) {
            throw new IllegalArgumentException("period: 0 (expected: != 0)");
        }
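        // deadline of the first run
        deadlineNanos = nanoTime;
        // period between runs (positive: fixed rate, negative: fixed delay)
        periodNanos = period;
    }

<V> ScheduledFuture<V> schedule(final ScheduledFutureTask<V> task) {
        // on the event loop thread: add directly to the scheduled task queue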
        if (inEventLoop()) {
            scheduledTaskQueue().add(task);
        } else {
            // not on the event loop thread: wrap the add in a task and submit it to the regular task queue to run later
            execute(new Runnable() {
                @Override
                public void run() {
                    scheduledTaskQueue().add(task);
                }
            });
        }

        return task;
    }

As before, a ScheduledFutureTask instance is created to wrap the task to be scheduled, but this time periodNanos is assigned a value.
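
The hand-off in schedule() (add directly when already on the event loop thread, otherwise submit a task that does the add) can be sketched with plain JDK classes. This is an illustration under stated assumptions, not netty's implementation: `MiniLoop`, `runOnLoop`, `snapshot` and `isConfined` are hypothetical names, and a single-thread ExecutorService plays the role of the event loop.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class ConfinedQueueDemo {
    // Hypothetical miniature event loop: one thread owns a plain, non-thread-safe list.
    static final class MiniLoop {
        private final ExecutorService executor = Executors.newSingleThreadExecutor();
        private final Thread loopThread;
        private final List<String> scheduled = new ArrayList<>(); // touched only on loopThread
        private boolean confined = true;                          // touched only on loopThread

        MiniLoop() {
            // remember which thread is the loop thread
            loopThread = callOnLoop(Thread::currentThread);
        }

        boolean inEventLoop() {
            return Thread.currentThread() == loopThread;
        }

        void schedule(String task) {
            if (inEventLoop()) {
                add(task);                          // already on the loop thread: add directly
            } else {
                executor.execute(() -> add(task));  // otherwise let the loop thread do the add
            }
        }

        private void add(String task) {
            confined &= inEventLoop();              // becomes false if any add ran off the loop thread
            scheduled.add(task);
        }

        // Runs r on the loop thread and waits for it to finish.
        void runOnLoop(Runnable r) {
            callOnLoop(Executors.callable(r));
        }

        // Copies the list on the loop thread, so the caller never touches it directly.
        List<String> snapshot() {
            Callable<List<String>> copy = () -> new ArrayList<String>(scheduled);
            return callOnLoop(copy);
        }

        boolean isConfined() {
            Callable<Boolean> check = () -> confined;
            return callOnLoop(check);
        }

        void shutdown() {
            executor.shutdown();
        }

        private <T> T callOnLoop(Callable<T> c) {
            try {
                return executor.submit(c).get();
            } catch (InterruptedException | ExecutionException e) {
                throw new IllegalStateException(e);
            }
        }
    }

    public static void main(String[] args) {
        MiniLoop loop = new MiniLoop();
        loop.schedule("from-main");                       // off the loop thread: add is queued
        loop.runOnLoop(() -> loop.schedule("from-loop")); // on the loop thread: added directly
        System.out.println(loop.snapshot() + " confined=" + loop.isConfined());
        loop.shutdown();
    }
}
```

Because every add happens on the same thread, the list needs no lock, which is the reason schedule() bothers with the inEventLoop() check at all.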
Next, the third case: a task that runs again a fixed delay after each execution finishes.

ctx.channel().eventLoop().scheduleWithFixedDelay(new Runnable() {
            @Override
            public void run() {
                System.out.println(1);
            }
        },0,10, TimeUnit.SECONDS);

public ScheduledFuture<?> scheduleWithFixedDelay(Runnable command, long initialDelay, long delay, TimeUnit unit) {
        // validate the arguments
        ObjectUtil.checkNotNull(command, "command");
        ObjectUtil.checkNotNull(unit, "unit");
        if (initialDelay < 0) {
            throw new IllegalArgumentException(
                    String.format("initialDelay: %d (expected: >= 0)", initialDelay));
        }
        if (delay <= 0) {
            throw new IllegalArgumentException(
                    String.format("delay: %d (expected: > 0)", delay));
        }
        // create a ScheduledFutureTask and add it to the scheduled task queue
        return schedule(new ScheduledFutureTask<Void>(
                this, Executors.<Void>callable(command, null),
                ScheduledFutureTask.deadlineNanos(unit.toNanos(initialDelay)), -unit.toNanos(delay)));
    }

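Note that the delay is negated, so periodNanos is less than 0. As mentioned above, once a scheduled task's deadline arrives it is moved from the scheduled task queue to the regular task queue, and its run method is then executed. Let's look at what the scheduled task's run method does.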

public void run() {
        // must be running on the event loop thread
        assert executor().inEventLoop();
        try {
            // a period of 0 means a one-shot delayed task
            if (periodNanos == 0) {
                // mark the promise as uncancellable
                if (setUncancellableInternal()) {
                    // run the task and capture its result
                    V result = task.call();
                    // store the result; registered listeners are notified
                    setSuccessInternal(result);
                }
            } else {
                // check if is done as it may was cancelled
                // periodic task: the caller can cancel it through the returned ScheduledFuture
                if (!isCancelled()) {
                    // run the task
                    task.call();
                    // check whether the executor has been shut down
                    if (!executor().isShutdown()) {
                        // not shut down: compute the next deadline from the period
                        long p = periodNanos;
                        if (p > 0) {
                            // fixed rate: previous deadline plus the period,
                            // regardless of how long this run took
                            deadlineNanos += p;
                        } else {
                            // fixed delay: p is negative, so subtracting it adds
                            // the delay to the time at which this run finished
                            deadlineNanos = nanoTime() - p;
                        }
                        // check again whether the task was cancelled
                        if (!isCancelled()) {
                            Queue<ScheduledFutureTask<?>> scheduledTaskQueue =
                                    ((AbstractScheduledEventExecutor) executor()).scheduledTaskQueue;
                            // the scheduled task queue must not be null
                            assert scheduledTaskQueue != null;
                            // re-add the task to the scheduled task queue for its next run
                            scheduledTaskQueue.add(this);
                        }
                    }
                }
            }
        } catch (Throwable cause) {
            // on failure, record the cause on this future
            setFailureInternal(cause);
        }
    }
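
The two deadline updates in run() differ only in what they are anchored to, and the effect is easiest to see with numbers. The sketch below is a simulation under stated assumptions, not netty code: `nextDeadline` and `startTimes` are hypothetical helpers, time is in arbitrary units, and each run is assumed to start exactly at its deadline and take a fixed `runCost`.

```java
import java.util.Arrays;

public class NextDeadlineDemo {
    // Mirrors the branch in run(): p > 0 is fixed rate, p < 0 is fixed delay.
    static long nextDeadline(long deadlineNanos, long periodNanos, long nowNanos) {
        if (periodNanos > 0) {
            return deadlineNanos + periodNanos; // anchored to the previous deadline
        }
        return nowNanos - periodNanos;          // anchored to the finish time; period is negative
    }

    // Returns the start time of each run, assuming every run starts on its deadline.
    static long[] startTimes(long periodNanos, long runCost, int runs) {
        long[] starts = new long[runs];
        long deadline = 0;
        for (int i = 0; i < runs; i++) {
            starts[i] = deadline;
            long finished = deadline + runCost; // the moment task.call() would return
            deadline = nextDeadline(deadline, periodNanos, finished);
        }
        return starts;
    }

    public static void main(String[] args) {
        // fixed rate, period 10, each run costs 3
        System.out.println(Arrays.toString(startTimes(10, 3, 4)));  // [0, 10, 20, 30]
        // fixed delay, delay 10 (stored as -10), each run costs 3
        System.out.println(Arrays.toString(startTimes(-10, 3, 4))); // [0, 13, 26, 39]
    }
}
```

With a fixed rate the run cost does not shift later deadlines, while with a fixed delay every run pushes the schedule back by its own duration. One consequence of the fixed-rate formula is that a run longer than the period leaves the next deadline already in the past.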

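Here we can see that the three cases described above correspond to periodNanos being zero, greater than zero, and less than zero, and that the only real difference between them is how deadlineNanos is updated. In the first case deadlineNanos is never reassigned, and the task is not put back into the scheduled task queue after it runs. In the second case, deadlineNanos += p moves the next run p after the previous deadline, and the task is re-added to the scheduled task queue. In the third case, deadlineNanos = nanoTime() - p (with p negative) sets the next run to the configured delay after the current run finished, and the task is likewise re-added.

While running, the task also checks whether it has been cancelled and whether the executor has been shut down; if either is true it is not rescheduled. For a one-shot task the result is stored in the ScheduledFutureTask itself and can be read through the returned ScheduledFuture. For a periodic task the return value of each run is discarded, and only a failure is recorded.

Because all of this runs inside NioEventLoop, on a single thread, there is no concurrency to worry about. This is one instance of netty's serialized, lock-free design.

That completes the analysis of NioEventLoop's run method, and with it this introduction to netty's reactor thread. If you spot any mistakes in the analysis, corrections are welcome. Thanks!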
