BIO, NIO, select, poll, epoll

C0oOder

Last modified: 2022-03-13 22:45:33

Categories: Redis, Computer Fundamentals. Tags: linux, epoll, nio

First published: 2022-03-13 22:44:28

Article link: https://blog.csdn.net/weixin_44244088/article/details/123468190

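1. Background

1.1 System architecture

Take Linux as an example; its system architecture is shown in the figure.

[Figure: Linux system architecture]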

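1.2 User mode and kernel mode

Kernel mode: a CPU running in kernel mode can access any data, including peripheral devices such as network cards and disks. It can also switch from one program to another, and its hold on the CPU is not preempted. This state corresponds to privilege level 0.

User mode: a CPU running in user mode has only restricted access to memory and may not access peripheral devices. The CPU cannot be monopolized in user mode, which means other programs can take it over.

System calls divide Linux as a whole into user mode and kernel mode, also called user space and kernel space.

Everything a user program does relies on support from the underlying API.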

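1.3 System calls

System calls form the basic interface between user mode and kernel mode. The biggest difference between the two modes is the system resources each may use and the privileges each holds. When a user-mode thread needs to perform an operation that requires higher privileges, it must switch to kernel mode and perform the operation there. This context switch is relatively expensive.

The I/O operations discussed in this article are system calls.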

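1.4 Blocking

Blocking: the caller issues a request and waits until the result comes back. The current thread is suspended and cannot do any other work; it resumes only once the condition is ready.
Non-blocking: the caller issues a request and does not have to wait for the result; it can go do other things first.

1.5 Synchronous and asynchronous

Synchronous: after a call is made, it does not return until the callee has finished processing the request.
Asynchronous: after a call is made, the callee responds immediately to acknowledge the request but does not yet return a result. The caller can handle other requests in the meantime, and the callee usually delivers the result through events, callbacks, or similar mechanisms.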
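The difference between the two call styles can be sketched in a few lines of Java. This is only an illustration: `slowCall`, `callSync`, and `callAsync` are hypothetical names, and `CompletableFuture` merely stands in for whatever event or callback mechanism a real callee would use.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicReference;

class SyncAsyncSketch {
    // Hypothetical slow operation standing in for any request.
    static String slowCall() {
        try {
            Thread.sleep(50);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return "result";
    }

    // Synchronous: the call does not return until the result is ready.
    static String callSync() {
        return slowCall();
    }

    // Asynchronous: the call returns at once; the callback stores the result later.
    static CompletableFuture<Void> callAsync(AtomicReference<String> sink) {
        return CompletableFuture.supplyAsync(SyncAsyncSketch::slowCall)
                .thenAccept(sink::set);
    }

    public static void main(String[] args) {
        System.out.println("sync: " + callSync());

        AtomicReference<String> sink = new AtomicReference<>();
        CompletableFuture<Void> pending = callAsync(sink);
        System.out.println("async call returned; the caller is free to do other work");

        pending.join(); // wait only so that the demo can print the callback's result
        System.out.println("async: " + sink.get());
    }
}
```

Note that blocking versus non-blocking describes what the caller's thread does while waiting, whereas synchronous versus asynchronous describes how the result is delivered.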

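2. BIO

BIO stands for blocking I/O. In essence, BIO calls kernel APIs such as bind(), listen(), and accept() so that the server can listen for clients.

Reading from the socket stream blocks, so the server can handle only one client at a time, and a thread that is listening in accept() is suspended. This is synchronous blocking I/O.

Java example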

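import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

/**
     * @Description  Single-threaded server
     * @Date 23:39 2022/3/12
     **/
    public static void startBIOServerV1() {
        try (ServerSocket serverSocket = new ServerSocket()) {
            serverSocket.bind(new InetSocketAddress(8888));
            while (true) {
                System.out.println("listening ...start ");
                // accept() blocks until a client connects; the socket is closed after each client
                try (Socket socket = serverSocket.accept();
                     BufferedReader bufferedReader = new BufferedReader(
                             new InputStreamReader(socket.getInputStream(), StandardCharsets.UTF_8))) {
                    long start = System.currentTimeMillis();
                    // readLine() blocks until a full line or end-of-stream arrives
                    String message = bufferedReader.readLine();
                    // compare content with isEmpty(); != "" would compare references
                    while (message != null && !message.isEmpty()) {
                        // simulate processing time
                        Thread.sleep(100);
                        System.out.println("client message:" + message);
                        message = bufferedReader.readLine();
                    }
                    System.out.println(" one client over ");
                    System.out.println(" listening ...end ");
                    System.out.println("handle time : " + (System.currentTimeMillis() - start));
                }
            }
        } catch (IOException e) {
            e.printStackTrace();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }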

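import java.io.IOException;
import java.io.OutputStream;
import java.net.Socket;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.atomic.AtomicInteger;

    // Message counter; the original snippet used it without showing its declaration.
    private static final AtomicInteger index = new AtomicInteger();

/**
     * @Description  Client
     * @Date 23:39 2022/3/12
     **/
public static void createBIOClient() {
        // try-with-resources closes the stream and the socket even if write() fails
        try (Socket socket = new Socket("127.0.0.1", 8888);
             OutputStream outputStream = socket.getOutputStream()) {
            outputStream.write(("hello server!!  ==>" + index.incrementAndGet()).getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            e.printStackTrace();
        }
    }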

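To serve clients concurrently, every client connection needs its own user thread.

Drawback: for each incoming client connection, the server usually starts a thread to handle it. Threads are expensive, so this wastes resources.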
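The thread-per-connection model described above might look like the following minimal sketch. It is not the article's own code: the class name, the echo protocol, and the helper `roundTrip` are assumptions made for illustration, and the server accepts a fixed number of clients only so that the demo terminates.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.PrintWriter;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

class ThreadPerConnectionSketch {
    // Handles one client: echoes every line back until the client closes.
    static void handle(Socket socket) {
        try (Socket s = socket;
             BufferedReader in = new BufferedReader(
                     new InputStreamReader(s.getInputStream(), StandardCharsets.UTF_8));
             PrintWriter out = new PrintWriter(s.getOutputStream(), true)) {
            String line;
            while ((line = in.readLine()) != null) { // blocks, but only this thread
                out.println("echo: " + line);
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    // Accepts `connections` clients and starts one thread for each of them.
    static void serve(ServerSocket server, int connections) throws IOException {
        for (int i = 0; i < connections; i++) {
            Socket socket = server.accept();          // blocks until a client arrives
            new Thread(() -> handle(socket)).start(); // the accept loop is free again
        }
    }

    // Hypothetical client helper: sends one line and returns the server's reply.
    static String roundTrip(int port, String message) throws IOException {
        try (Socket s = new Socket("127.0.0.1", port);
             PrintWriter out = new PrintWriter(s.getOutputStream(), true);
             BufferedReader in = new BufferedReader(
                     new InputStreamReader(s.getInputStream(), StandardCharsets.UTF_8))) {
            out.println(message);
            return in.readLine();
        }
    }

    public static void main(String[] args) throws Exception {
        try (ServerSocket server = new ServerSocket(0)) { // port 0: any free port
            Thread acceptor = new Thread(() -> {
                try {
                    serve(server, 2);
                } catch (IOException e) {
                    e.printStackTrace();
                }
            });
            acceptor.start();
            System.out.println(roundTrip(server.getLocalPort(), "first"));
            System.out.println(roundTrip(server.getLocalPort(), "second"));
            acceptor.join();
        }
    }
}
```

Each blocked `readLine()` now ties up only its own thread, which is exactly why the thread count grows with the number of connections.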

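3. NIO

NIO supports buffer-oriented, channel-based I/O. A connection does not sit blocked; data is read only once it has actually arrived. This is synchronous non-blocking I/O.

3.1 User-mode polling

The drawbacks of BIO are obvious, which led to NIO. The early way of implementing NIO was to poll in user mode: a thread repeatedly checks every client connection and processes whichever ones have data. Because this is done by a user thread, a single thread is enough.

NIO has a drawback too: when there are many connections, every pass has to traverse all of them, which also wastes performance.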
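A minimal sketch of user-mode polling with non-blocking channels, assuming a hypothetical `pollOnce` helper that performs one pass over all connections. Every pass touches every client whether or not it has data, which is the cost described above.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

class UserSpacePollingSketch {
    // One polling pass: accept a pending connection, then try to read from every client.
    static List<String> pollOnce(ServerSocketChannel server, List<SocketChannel> clients)
            throws IOException {
        List<String> messages = new ArrayList<>();
        SocketChannel accepted = server.accept(); // non-blocking: null if nobody is waiting
        if (accepted != null) {
            accepted.configureBlocking(false);
            clients.add(accepted);
        }
        ByteBuffer buffer = ByteBuffer.allocate(1024);
        for (Iterator<SocketChannel> it = clients.iterator(); it.hasNext(); ) {
            SocketChannel client = it.next();
            buffer.clear();
            int n = client.read(buffer); // non-blocking: 0 if no data yet, -1 if closed
            if (n > 0) {
                buffer.flip();
                messages.add(StandardCharsets.UTF_8.decode(buffer).toString());
            } else if (n < 0) {
                client.close();
                it.remove();
            }
        }
        return messages;
    }

    public static void main(String[] args) throws Exception {
        try (ServerSocketChannel server = ServerSocketChannel.open()) {
            server.bind(new InetSocketAddress("127.0.0.1", 0));
            server.configureBlocking(false);
            List<SocketChannel> clients = new ArrayList<>();
            int port = ((InetSocketAddress) server.getLocalAddress()).getPort();
            try (SocketChannel c = SocketChannel.open(new InetSocketAddress("127.0.0.1", port))) {
                c.write(ByteBuffer.wrap("hello".getBytes(StandardCharsets.UTF_8)));
                List<String> got = new ArrayList<>();
                while (got.isEmpty()) { // busy loop: one thread, but it never rests
                    got.addAll(pollOnce(server, clients));
                }
                System.out.println(got);
            }
            for (SocketChannel client : clients) {
                client.close();
            }
        }
    }
}
```

Each `read` in the loop is itself a system call, so with many idle connections most of the work done per pass is wasted.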

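3.2 Kernel-mode polling (I/O multiplexing)

3.2.1 select

man 2 select

Here the kernel itself provides the polling, for example through Linux's select(); its documentation is excerpted below.

This is still synchronous I/O handled by a single thread: select() itself can block while waiting for readiness, but the reads and writes that follow do not.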


 int select(int nfds, fd_set *readfds, fd_set *writefds,
                  fd_set *exceptfds, struct timeval *timeout);
                  
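// Monitors multiple file descriptors, waiting for I/O on any of them.
===> A file descriptor is considered ready if the corresponding I/O operation (for example read(2)) can be performed without blocking. In other words, the I/O operation can be carried out directly on any descriptor that select reports.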
select()  and  pselect() allow a program to monitor multiple file descriptors, 
waiting until one or more of the file descriptors become "ready" for some class of I/O operation (e.g., input possible).  A file descriptor is considered ready if it is  possible to perform the corresponding I/O operation (e.g., read(2)) without blocking.

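[Figure: select flow chart]

Drawback 1: the file descriptor sets are copied back and forth between user space and kernel space on every call.

Drawback 2: select works by setting and checking a data structure that holds one flag bit per fd. The number of descriptors a single process can monitor is therefore limited: select can only handle file descriptor numbers below FD_SETSIZE, which is 1024 on Linux with glibc regardless of whether the system is 32-bit or 64-bit.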
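The flag-bit structure and its fixed capacity can be modeled in Java as a rough sketch. This is a model only, not the real API: `FdSetModel` and its methods are hypothetical, the readiness check is supplied by the caller instead of the kernel, and the constant 1024 is the usual glibc value of FD_SETSIZE.

```java
import java.util.BitSet;
import java.util.function.IntPredicate;

class FdSetModel {
    static final int FD_SETSIZE = 1024; // usual glibc value on Linux (assumed here)

    private final BitSet bits = new BitSet(FD_SETSIZE);

    // Models FD_SET: descriptors at or above FD_SETSIZE cannot be represented.
    void set(int fd) {
        if (fd < 0 || fd >= FD_SETSIZE) {
            throw new IllegalArgumentException("fd out of range for fd_set: " + fd);
        }
        bits.set(fd);
    }

    // Models FD_ISSET.
    boolean isSet(int fd) {
        return fd >= 0 && fd < FD_SETSIZE && bits.get(fd);
    }

    // Models the scan behind select: every fd below nfds is examined, and the set
    // is overwritten so that only ready descriptors remain. Returns their count.
    int scan(int nfds, IntPredicate isReady) {
        int ready = 0;
        for (int fd = 0; fd < nfds; fd++) {
            if (bits.get(fd)) {
                if (isReady.test(fd)) {
                    ready++;
                } else {
                    bits.clear(fd);
                }
            }
        }
        return ready;
    }

    public static void main(String[] args) {
        FdSetModel readfds = new FdSetModel();
        readfds.set(3);
        readfds.set(7);
        int ready = readfds.scan(8, fd -> fd == 7); // pretend only fd 7 has data
        System.out.println("ready count: " + ready);
        System.out.println("fd 3 still set: " + readfds.isSet(3));
        System.out.println("fd 7 still set: " + readfds.isSet(7));
    }
}
```

Because the scan overwrites the set, a caller has to rebuild it before every call, and the loop always runs over the whole range up to nfds no matter how few descriptors are ready.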

3.2.2 poll

The flow chart is essentially the same as for select.

man 2 poll


int poll(struct pollfd *fds, nfds_t nfds, int timeout);

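poll does much the same job as select(): it waits for one of a set of file descriptors to become ready for I/O. The difference lies in the parameters: select takes fixed-size fd_set bitmaps, whereas poll takes an array of pollfd structures whose length the caller passes in nfds, so the FD_SETSIZE limit does not apply.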
poll() performs a similar task to select(2): it waits for one of a set of file descriptors to become ready to perform I/O.
The set of file descriptors to be monitored is specified in the fds argument, which is an array of structures of the following form:
           struct pollfd {
               int   fd;         /* file descriptor */
               short events;     /* requested events */
               short revents;    /* returned events */
};
The caller should specify the number of items in the fds array in nfds.

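Advantage: unlike select, there is no fixed limit on the number of descriptors. The caller passes an array of pollfd of any length, and the kernel keeps its copy in a linked list of chunks.

Drawback 1: the same as select. The structures describing the file descriptors still have to be copied back and forth between user space and kernel space.

Drawback 2: poll is level-triggered. If an fd has been reported but not handled, the next poll reports it again.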

3.2.3 epoll

Unlike poll and select, epoll is made up of several functions.

man epoll

DESCRIPTION
       The epoll API performs a similar task to poll(2): monitoring multiple file descriptors to see if I/O is possible on any of them.  The epoll API can be used either as an edge-triggered or a level-triggered interface and scales well to large numbers
       of watched file descriptors.  The following system calls are provided to create and manage an epoll instance:

       *  epoll_create(2) creates an epoll instance and returns a file descriptor referring to that instance.  (The more recent epoll_create1(2) extends the functionality of epoll_create(2).)

       *  Interest in particular file descriptors is then registered via epoll_ctl(2).  The set of file descriptors currently registered on an epoll instance is sometimes called an epoll set.

       *  epoll_wait(2) waits for I/O events, blocking the calling thread if no events are currently available.

//===============================================================================================================
## epoll_create
epoll_create() creates an epoll(7) instance.  Since Linux 2.6.8, the size argument is ignored, but must be greater than zero; see NOTES below;
 epoll_create()  returns  a  file  descriptor referring to the new epoll instance.  This file descriptor is used for all the subsequent calls to the epoll interface.  When no longer required, the file descriptor returned by epoll_create() should be
       closed by using close(2).  When all file descriptors referring to an epoll instance have been closed, the kernel destroys the instance and releases the associated resources for reuse.
In the initial epoll_create() implementation, the size argument informed the kernel of the number of file descriptors that the caller expected to add to the epoll instance.  The kernel used this information as a hint for the  amount  of  space  to
       initially allocate in internal data structures describing events.  (If necessary, the kernel would allocate more space if the caller's usage exceeded the hint given in size.)  Nowadays, this hint is no longer required (the kernel dynamically sizes
       the required data structures without needing the hint), but size must still be greater than zero, in order to ensure backward compatibility when new epoll applications are run on older kernels.
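In summary: epoll_create creates an epoll instance. The size argument used to be a hint for the initial allocation; since Linux 2.6.8 it is ignored but must still be greater than zero. Close the file descriptor with close when it is no longer needed.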

## epoll_ctl
 This system call performs control operations on the epoll(7) instance referred to by the file descriptor epfd.  It requests that the operation op be performed for the target file descriptor, fd.
  int epoll_ctl(int epfd, int op, int fd, struct epoll_event *event);
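The op argument
Valid values for the op argument are: EPOLL_CTL_ADD, EPOLL_CTL_MOD, EPOLL_CTL_DEL.
The event argument associates events with fd through flags such as EPOLLIN, EPOLLOUT.........
In other words, the call asks for the operation op to be performed on the target file descriptor fd.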

## epoll_wait
The  epoll_wait()  system call waits for events on the epoll(7) instance referred to by the file descriptor epfd.  The memory area pointed to by events will contain the events that will be available for the caller.  Up to maxevents are returned by
       epoll_wait().  The maxevents argument must be greater than zero.
The timeout argument specifies the minimum number of milliseconds that epoll_wait() will block.  (This interval will be rounded up to the system clock granularity, and kernel scheduling delays mean that the blocking interval may overrun by a small
       amount.)  Specifying a timeout of -1 causes epoll_wait() to block indefinitely, while specifying a timeout equal to zero cause epoll_wait() to return immediately, even if no events are available.
In summary: epoll_wait() waits for events on the epoll instance referred to by epfd and writes at most maxevents of them (maxevents must be greater than zero) into the memory area pointed to by events. A timeout of -1 blocks indefinitely, and a timeout of 0 returns immediately even if no events are available.

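As the man page says, the epoll API does the same job as poll, monitoring multiple file descriptors to see whether I/O is possible on any of them, and it can be used either as an edge-triggered (ET) or as a level-triggered (LT) interface:

In LT mode, as long as the fd still has data to read, every epoll_wait call returns its event again, reminding the program to act on it.
In ET mode, the event is reported only once (which is more efficient); nothing further is reported until new data arrives, regardless of whether unread data remains on the fd. In ET mode, therefore, a read on an fd must drain its buffer completely, which in practice means using a non-blocking fd and reading until read fails with EAGAIN.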
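The "read until the buffer is empty" rule can be sketched as a loop over a non-blocking channel. This is an analogy rather than epoll itself: Java's channel API has no edge-triggered mode that I can rely on here, so the sketch uses a `Pipe` as a stand-in data source, and `drain` is a hypothetical helper. A return value of 0 from a non-blocking `read` plays the role of EAGAIN.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.Pipe;
import java.nio.channels.ReadableByteChannel;

class DrainSketch {
    // Reads until the non-blocking channel has nothing more right now (0)
    // or reaches end-of-stream (-1). Returns the number of bytes consumed.
    static int drain(ReadableByteChannel channel, ByteBuffer buffer) throws IOException {
        int total = 0;
        while (true) {
            buffer.clear();
            int n = channel.read(buffer);
            if (n <= 0) {
                return total; // nothing left for now: safe to wait for the next event
            }
            total += n;       // a real server would process the buffer's contents here
        }
    }

    public static void main(String[] args) throws IOException {
        Pipe pipe = Pipe.open();
        pipe.source().configureBlocking(false);

        // Write more data than one read buffer can hold.
        ByteBuffer data = ByteBuffer.allocate(5000);
        while (data.hasRemaining()) {
            pipe.sink().write(data);
        }

        // A single read with a 1024-byte buffer would leave data behind;
        // the loop keeps going until the channel is empty.
        int total = drain(pipe.source(), ByteBuffer.allocate(1024));
        System.out.println("drained " + total + " bytes");

        pipe.sink().close();
        pipe.source().close();
    }
}
```

With edge-triggered notification, stopping after the first read would strand the remaining bytes until some later write happened to trigger a new event.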

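The following system calls create and manage an epoll instance:

epoll_create creates an epoll instance and returns a file descriptor referring to it. (The more recent epoll_create1 extends the functionality of epoll_create.)
epoll_ctl then registers interest in particular file descriptors. The set of file descriptors currently registered on an epoll instance is sometimes called the epoll set.
epoll_wait waits for I/O events, blocking the calling thread if no events are currently available.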
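The same create, register, wait sequence is visible in Java's `Selector`, which on Linux is, to my knowledge, backed by epoll in the default JDK implementation. The sketch below is an illustration under that assumption; the class and method names are hypothetical. `Selector.open()` corresponds roughly to epoll_create, `register` to epoll_ctl, and `select()` to epoll_wait.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.nio.charset.StandardCharsets;
import java.util.Iterator;

class SelectorSketch {
    // Runs the event loop until one message has been read, then returns it.
    static String readOneMessage(Selector selector) throws IOException {
        ByteBuffer buffer = ByteBuffer.allocate(1024);
        while (true) {
            selector.select(); // blocks until at least one registered channel is ready
            Iterator<SelectionKey> keys = selector.selectedKeys().iterator();
            while (keys.hasNext()) {
                SelectionKey key = keys.next();
                keys.remove(); // the selected-key set is not cleared automatically
                if (key.isAcceptable()) {
                    SocketChannel client = ((ServerSocketChannel) key.channel()).accept();
                    if (client != null) {
                        client.configureBlocking(false);
                        client.register(selector, SelectionKey.OP_READ);
                    }
                } else if (key.isReadable()) {
                    SocketChannel client = (SocketChannel) key.channel();
                    buffer.clear();
                    int n = client.read(buffer);
                    if (n < 0) {
                        client.close(); // peer closed without sending anything
                    } else if (n > 0) {
                        buffer.flip();
                        String message = StandardCharsets.UTF_8.decode(buffer).toString();
                        client.close();
                        return message;
                    }
                }
            }
        }
    }

    public static void main(String[] args) throws Exception {
        try (Selector selector = Selector.open();
             ServerSocketChannel server = ServerSocketChannel.open()) {
            server.bind(new InetSocketAddress("127.0.0.1", 0));
            server.configureBlocking(false);
            server.register(selector, SelectionKey.OP_ACCEPT);
            int port = ((InetSocketAddress) server.getLocalAddress()).getPort();
            try (SocketChannel c = SocketChannel.open(new InetSocketAddress("127.0.0.1", port))) {
                c.write(ByteBuffer.wrap("hello".getBytes(StandardCharsets.UTF_8)));
                System.out.println(readOneMessage(selector));
            }
        }
    }
}
```

Unlike the user-mode polling loop, the thread here sleeps inside `select()` and only ever iterates over channels that are actually ready.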

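Advantage 1: there is no fixed cap on the number of concurrent connections. The bound is the system's open-file limit, roughly 100,000 descriptors on a machine with 1 GB of memory.

Advantage 2: the descriptor set is not copied on every call. Descriptors are registered once through epoll_ctl and stay inside the kernel; epoll_wait copies only the ready events out to user space.

Advantage 3: nothing is traversed in either user mode or kernel mode. When a file descriptor becomes ready, a callback moves it onto a ready list, so the thread obtains the ready descriptors directly instead of scanning all of them.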

4. epoll

After all of the above, epoll is the real focus. The idea behind the Linux epoll implementation is as follows:

struct eventpoll{
    ....
    /* Root of the red-black tree that stores every event added to epoll for monitoring */
    struct rb_root  rbr;
    /* Doubly linked list of the ready events that epoll_wait will return to the user */
    struct list_head rdlist;
    ....
};

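The data structure stored in the red-black tree and in the list:

struct epitem{
    struct rb_node  rbn;       // red-black tree node
    struct list_head    rdllink; // doubly linked list node
    struct epoll_filefd  ffd;  // event handle information (the fd)
    struct eventpoll *ep;      // the eventpoll object this item belongs to
    struct epoll_event event;  // the event types of interest
};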

When epoll_wait is called, it checks whether eventpoll->rdlist contains any epitem. If rdlist is not empty, the events are returned to the user.
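The interplay of the tree and the ready list can be modeled in user-space Java as a rough sketch. This is a toy model and not the kernel code: `EventPollModel` and its methods are hypothetical, `TreeMap` (a red-black tree) stands in for rbr, `ArrayDeque` stands in for rdlist, and `onEvent` plays the part of the callback that the kernel runs when a device reports activity.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.TreeMap;

class EventPollModel {
    static final int EPOLLIN = 0x001;  // same numeric values as in <sys/epoll.h>
    static final int EPOLLOUT = 0x004;

    private final TreeMap<Integer, Integer> rbr = new TreeMap<>(); // fd -> interest mask
    private final Deque<int[]> rdlist = new ArrayDeque<>();        // {fd, ready events}

    // Models epoll_ctl(EPOLL_CTL_ADD): insert the fd into the tree.
    void ctlAdd(int fd, int events) {
        if (rbr.putIfAbsent(fd, events) != null) {
            throw new IllegalStateException("fd already registered: " + fd);
        }
    }

    // Models epoll_ctl(EPOLL_CTL_DEL): remove the fd from the tree and the ready list.
    void ctlDel(int fd) {
        rbr.remove(fd);
        rdlist.removeIf(item -> item[0] == fd);
    }

    // Models the readiness callback: only fds in the tree, and only the events
    // they asked for, reach the ready list.
    void onEvent(int fd, int events) {
        Integer interest = rbr.get(fd);
        if (interest == null) {
            return;
        }
        int matched = interest & events;
        if (matched == 0) {
            return;
        }
        for (int[] item : rdlist) {
            if (item[0] == fd) {   // already queued: merge (the model scans for simplicity)
                item[1] |= matched;
                return;
            }
        }
        rdlist.addLast(new int[] {fd, matched});
    }

    // Models epoll_wait without blocking: hand over at most maxEvents ready items.
    List<int[]> waitEvents(int maxEvents) {
        List<int[]> out = new ArrayList<>();
        while (out.size() < maxEvents && !rdlist.isEmpty()) {
            out.add(rdlist.pollFirst());
        }
        return out;
    }

    public static void main(String[] args) {
        EventPollModel ep = new EventPollModel();
        ep.ctlAdd(5, EPOLLIN);
        ep.ctlAdd(9, EPOLLOUT);
        ep.onEvent(9, EPOLLIN); // not of interest for fd 9, so it is ignored
        ep.onEvent(5, EPOLLIN);
        for (int[] item : ep.waitEvents(10)) {
            System.out.println("ready fd=" + item[0] + " events=" + item[1]);
        }
    }
}
```

The point of the two structures is that `waitEvents` never looks at the tree: the cost of a wait depends on how many descriptors are ready, not on how many are registered.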
