I/O multiplexing: a summary of select, poll, and epoll

Latest recommended article published on 2024-08-03 20:16:32

wlgoc

Article link: https://blog.csdn.net/me4weizhen/article/details/52165157

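I/O multiplexing is widely used in network programming. The concept touches many topics, so I will work through it step by step. If you lack the background, read the SegmentFault article below first. It covers the basics well and explains user mode, kernel mode, blocking, file descriptors, and the I/O models in plain language. Its comparison of the three mechanisms is close to mine, so here I mostly summarize. I also drew on an English blog post I read last year; if your English is good, go straight to it, since it explains things very clearly.

Background on SegmentFault: Linux IO模式及 select、poll、epoll详解 (Linux I/O modes and select, poll, epoll explained)

English article on the differences between the three: select / poll / epoll: practical difference for system architects

I started with three blog posts by one author, but frankly I find them confusing for beginners because they use a lot of jargon: [link] [link] [link]

There is also a summary article: [link]

Below are my notes. They may jump around, and corrections are welcome. Each section contains sample code (fragments for illustration, not complete programs) and a summary. If you already know the topic you can go straight to the code; otherwise read the explanation first.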

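Distinguishing select, poll, and epoll

select

Introduction

select was introduced in 4.2BSD in 1983. It tracks descriptors in a bitmap of 32*32 = 1024 bits (FD_SETSIZE). The sets you pass in, such as readfds, are modified in place: on return, only the descriptors with a ready event remain set. You therefore have to test every descriptor with FD_ISSET(fd, &readfds) after the call and rebuild the sets before the next call.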

Prototype

#include <sys/select.h>
#include <sys/time.h>
int select(int maxfdp1, fd_set *readset, fd_set *writeset, fd_set *exceptset, struct timeval *timeout);

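Usage

There are three steps:

1. Initialize the three fd_sets with FD_ZERO and FD_SET.
2. Call select.
3. Check every fd against each fd_set with FD_ISSET, and handle the ones that are set.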
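
The three steps can be sketched as a small self-contained helper. This is only an illustration: wait_readable_select and select_pipe_demo are hypothetical names, and a pipe stands in for a socket so that the example needs no network setup.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/* Hypothetical helper: wait up to timeout_ms for fd to become readable.
 * Returns 1 if readable, 0 on timeout, -1 on error. */
int wait_readable_select(int fd, int timeout_ms)
{
    /* select() cannot track descriptors at or above FD_SETSIZE. */
    assert(fd >= 0 && fd < FD_SETSIZE);

    fd_set readfds;
    struct timeval tv;

    /* Step 1: rebuild the set; select() overwrites it on every call. */
    FD_ZERO(&readfds);
    FD_SET(fd, &readfds);
    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;

    /* Step 2: the first argument is the highest descriptor plus one. */
    int ret = select(fd + 1, &readfds, NULL, NULL, &tv);
    if (ret <= 0)
        return ret;

    /* Step 3: only descriptors that are ready remain set. */
    return FD_ISSET(fd, &readfds) ? 1 : 0;
}

/* Hypothetical demo: a pipe is not readable until a byte is written.
 * Returns 0 if both observations match that expectation. */
int select_pipe_demo(void)
{
    int p[2];
    int rc = pipe(p);
    assert(rc == 0);

    int before = wait_readable_select(p[0], 0);
    ssize_t n = write(p[1], "x", 1);
    assert(n == 1);
    int after = wait_readable_select(p[0], 1000);

    close(p[0]);
    close(p[1]);
    return (before == 0 && after == 1) ? 0 : -1;
}
```

Because the set is rebuilt inside the helper, the caller never sees the in-place modification that select performs, at the cost of repeating step 1 on every call.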

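Disadvantages

1. Every call to select copies the fd_sets into the kernel again.
2. Step 3 has to scan all descriptors every time.
3. There is a hard limit of 1024 descriptors.
4. If another thread suddenly needs to use one of the monitored sockets while select is waiting, the two conflict.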

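Why select is still around

1. Legacy and portability. select has been around for a long time, so practically every platform supports it, whereas you cannot count on a platform supporting poll or epoll. And I am not talking about ancient machines like ENIAC. Have you heard of Windows XP? Do you know how large a share it still holds in China and worldwide as of today, 2016/9/10? It supports only select.
2. Timeout precision. select takes its timeout as a struct timeval, which has microsecond resolution (pselect takes a struct timespec with nanosecond resolution), while poll and epoll_wait only accept milliseconds. You may object that most systems cannot actually deliver that precision. On real-time operating systems, however, for example in high-precision industrial control or in a nuclear power plant, the finer timeout matters: there it is not just about keeping your job, it is about public safety.
3. For simple scenarios, say fewer than 200 sockets, it hardly matters which one you use; the programmer's skill matters far more.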
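
The difference in timeout units shows up directly in the argument types. The sketch below uses both calls purely as timers with no descriptors; the function names are hypothetical, and the delay actually achieved depends on the kernel and hardware.

```c
#include <assert.h>
#include <poll.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>

/* Hypothetical helper: sleep for usec microseconds by calling select()
 * with no descriptors. Returns 0 when the timeout expires, -1 on error. */
int sleep_us_with_select(long usec)
{
    assert(usec >= 0);
    struct timeval tv;
    tv.tv_sec = usec / 1000000;
    tv.tv_usec = usec % 1000000;   /* microsecond field */
    return select(0, NULL, NULL, NULL, &tv);
}

/* Hypothetical helper: the poll() timeout is an int in milliseconds, so
 * 1 ms is the shortest non-zero wait it can express. */
int sleep_ms_with_poll(int msec)
{
    assert(msec >= 0);
    return poll(NULL, 0, msec);
}
```

A call such as sleep_us_with_select(250) requests a quarter of a millisecond, which has no equivalent in the poll interface.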

Code

// sock1 and sock2 are assumed to be already-open sockets
fd_set fd_in, fd_out;
struct timeval tv;

// Reset the sets
FD_ZERO( &fd_in );
FD_ZERO( &fd_out );

// Monitor sock1 for input events
FD_SET( sock1, &fd_in );

// Monitor sock2 for output events
FD_SET( sock2, &fd_out );

// Find out which socket has the largest numeric value as select requires it
int largest_sock = sock1 > sock2 ? sock1 : sock2;

// Wait up to 10 seconds
tv.tv_sec = 10;
tv.tv_usec = 0;

// Call the select
int ret = select( largest_sock + 1, &fd_in, &fd_out, NULL, &tv );

// Check if select actually succeeded
if ( ret == -1 )
{
    // report error and abort
}
else if ( ret == 0 )
{
    // timeout; no event detected
}
else
{
    if ( FD_ISSET( sock1, &fd_in ) )
    {
        // input event on sock1
    }

    if ( FD_ISSET( sock2, &fd_out ) )
    {
        // output event on sock2
    }
}

poll

Prototype

#include <poll.h>
int poll(struct pollfd *fds, nfds_t nfds, int timeout);

struct pollfd {
    int fd;        /* file descriptor */
    short events;  /* requested events */
    short revents; /* events that actually occurred */
};

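Introduction

poll first appeared in System V Release 3. Instead of passing three sets, you pass fds, a pointer to an array of struct pollfd, each with three fields: fd, events, and revents. The kernel fills in revents during the call, so afterwards a single if on revents tells you whether the descriptor is readable, writable, or in some other I/O condition. With select the sets have to be rebuilt before every call. Here they do not, because each entry is keyed by fd and the kernel leaves fd and events untouched.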

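Usage

Again three steps:
1. Initialize the pollfd array: bind each entry to a socket and set events (revents is filled in by the kernel). Choose the timeout.
2. Call poll.
3. Iterate over the entries and check whether their events occurred; reset revents to 0 for the ones that did.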
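
As a minimal sketch of these three steps, the hypothetical function below watches two pipes (standing in for sockets) and reports which one became readable. The array is filled in once, and only revents changes across the call.

```c
#include <assert.h>
#include <poll.h>
#include <unistd.h>

/* Hypothetical demo: watch two pipes and return a bitmask in which bit i
 * is set when fds[i] is readable. Only the second pipe receives data. */
int poll_two_pipes_demo(void)
{
    int a[2], b[2];
    int rc = pipe(a);
    assert(rc == 0);
    rc = pipe(b);
    assert(rc == 0);

    /* Step 1: set fd and events once; poll() only writes revents. */
    struct pollfd fds[2] = {
        { .fd = a[0], .events = POLLIN },
        { .fd = b[0], .events = POLLIN },
    };

    ssize_t n = write(b[1], "x", 1);
    assert(n == 1);

    /* Step 2: wait up to 1000 ms for any of the entries. */
    int ret = poll(fds, 2, 1000);

    /* Step 3: scan every entry and test its revents field. */
    int mask = 0;
    for (int i = 0; ret > 0 && i < 2; i++) {
        if (fds[i].revents & POLLIN)
            mask |= 1 << i;
    }

    close(a[0]); close(a[1]);
    close(b[0]); close(b[1]);
    return mask;
}
```

The scan in step 3 still visits every entry, which is the polling cost that both select and poll share.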

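Advantages

1. No 1024-descriptor limit (such an awkward number anyway).
2. It does not modify the fd and events fields of pollfd, so you do not have to refill the array before every call.
3. It makes it easy to learn the state of the remote end, for example that the peer has gone down.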
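
Point 3 can be illustrated with POLLHUP, which poll reports in revents even when no events were requested. The sketch assumes Linux behavior and uses a pipe whose write end is closed as a stand-in for a peer that went away; poll_detects_hangup is a hypothetical name.

```c
#include <assert.h>
#include <poll.h>
#include <unistd.h>

/* Hypothetical demo: returns 1 if poll() reports POLLHUP on the read end
 * of a pipe after the write end has been closed, otherwise 0. */
int poll_detects_hangup(void)
{
    int p[2];
    int rc = pipe(p);
    assert(rc == 0);

    close(p[1]);   /* the "remote" side goes away */

    /* No read or write interest is requested; POLLHUP is reported anyway. */
    struct pollfd pfd = { .fd = p[0], .events = 0 };
    int ret = poll(&pfd, 1, 1000);
    int hung_up = (ret == 1) && (pfd.revents & POLLHUP);

    close(p[0]);
    return hung_up;
}
```

With select the same condition only becomes visible as a readable descriptor whose read then returns 0.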

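Disadvantages

1. You still have to scan all descriptors.
2. The set cannot be modified while poll is waiting.
Most clients do not need to care about this unless they are P2P applications; some servers do.
Most of the time poll is a better choice than select, and in the scenarios below it is even better than epoll.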

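When to use it

1. You need to be cross-platform, since epoll is Linux-only.
2. You have fewer than 1000 sockets.
3. You have more than 1000 sockets, but they are short-lived.
4. No other thread interferes with the descriptor set while you wait.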

Code

// The structure for two events
struct pollfd fds[2];

// Monitor sock1 for input
fds[0].fd = sock1;
fds[0].events = POLLIN;

// Monitor sock2 for output
fds[1].fd = sock2;
fds[1].events = POLLOUT;

// Wait 10 seconds
int ret = poll( fds, 2, 10000 );
// Check if poll actually succeeded
if ( ret == -1 )
{
    // report error and abort
}
else if ( ret == 0 )
{
    // timeout; no event detected
}
else
{
    // If we detect the event, zero it out so we can reuse the structure
    if ( fds[0].revents & POLLIN )
    {
        fds[0].revents = 0;
        // input event on sock1
    }

    if ( fds[1].revents & POLLOUT )
    {
        fds[1].revents = 0;
        // output event on sock2
    }
}

epoll

Prototype

#include <sys/epoll.h>
int epoll_create(int size);
int epoll_ctl(int epfd, int op, int fd, struct epoll_event *event);
int epoll_wait(int epfd, struct epoll_event * events, int maxevents, int timeout);

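Introduction

epoll was added in the Linux 2.6 kernel, so you will not find it on older systems, and it is supported only on Linux. The set of monitored descriptors lives in the kernel and the user registers descriptors through system calls, which removes the per-call copying and scanning cost. There are three system calls. epoll_create is called once at startup to create an epoll instance. epoll_ctl adds events, that is, registers them with the epoll instance managed by the kernel. After that you just call epoll_wait, which returns when registered events are ready.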

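Usage

epoll is a bit more complicated.

1. There is more setup, because the bookkeeping lives in the kernel.
1) Create the epoll descriptor by calling epoll_create.
2) Initialize a struct epoll_event with the events you want and a context data pointer.
3) Call epoll_ctl to add the file descriptor.
4) Call epoll_wait to handle up to 20 events at a time. You pass in an empty array and the kernel fills it in. If 200 descriptors are being monitored, perhaps only one entry is filled in; if 50 are ready, you still get 20 back. The rest are not lost and are returned by later calls.
5) Iterate over the returned entries. Every one of them is a descriptor with a pending event.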
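
The steps above, including the batching behavior described in step 4, can be sketched with pipes in place of sockets. This is an illustration under assumptions: epoll_batch_demo is a hypothetical name, the batch size is 1 instead of 20 to keep the example small, and it only runs on Linux.

```c
#include <assert.h>
#include <sys/epoll.h>
#include <unistd.h>

#define NPIPES 3

/* Hypothetical demo: register three pipes, make two of them readable, and
 * fetch events one per call. Returns the number of events handled. */
int epoll_batch_demo(void)
{
    int pipes[NPIPES][2];

    /* 1) Create the epoll instance; the size hint only has to be > 0. */
    int epfd = epoll_create(1);
    assert(epfd >= 0);

    for (int i = 0; i < NPIPES; i++) {
        int rc = pipe(pipes[i]);
        assert(rc == 0);

        /* 2) Describe the wanted events and attach context (here the fd). */
        struct epoll_event ev = { 0 };
        ev.events = EPOLLIN;
        ev.data.fd = pipes[i][0];

        /* 3) Register the descriptor with the kernel. */
        rc = epoll_ctl(epfd, EPOLL_CTL_ADD, pipes[i][0], &ev);
        assert(rc == 0);
    }

    /* Two of the three descriptors become ready. */
    ssize_t n = write(pipes[0][1], "a", 1);
    assert(n == 1);
    n = write(pipes[2][1], "c", 1);
    assert(n == 1);

    /* 4) and 5) Ask for at most one event per call; the event that does
     * not fit into this call is delivered by the next one. */
    int handled = 0;
    struct epoll_event out[1];
    while (epoll_wait(epfd, out, 1, 0) == 1) {
        char byte;
        n = read(out[0].data.fd, &byte, 1);   /* consume the data */
        assert(n == 1);
        handled++;
    }

    for (int i = 0; i < NPIPES; i++) {
        close(pipes[i][0]);
        close(pipes[i][1]);
    }
    close(epfd);
    return handled;
}
```

The idle pipe never appears in the output array, so the loop touches only descriptors that actually have work.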

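Advantages

1. Only triggered events are returned, which saves both the copying cost and the scanning cost.
2. You can attach arbitrary context to each event, not just the socket.
3. Sockets can be added or removed at any time; the kernel takes care of it. I still need to study this part further.
4. Edge-triggered mode is available.
5. Multiple threads can wait in epoll_wait on the same epoll instance.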
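
The difference between the default level-triggered mode and edge-triggered mode (EPOLLET) can be observed by calling epoll_wait twice without reading the data in between. The sketch below is Linux-only, uses a pipe instead of a socket, and second_wait_count is a hypothetical name.

```c
#include <assert.h>
#include <stdint.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Hypothetical demo: returns how many events a second epoll_wait() call
 * reports while the data from the first notification is still unread.
 * Pass 0 for level-triggered or EPOLLET for edge-triggered mode. */
int second_wait_count(uint32_t extra_flags)
{
    int p[2];
    int rc = pipe(p);
    assert(rc == 0);
    int epfd = epoll_create(1);
    assert(epfd >= 0);

    struct epoll_event ev = { 0 };
    ev.events = EPOLLIN | extra_flags;
    ev.data.fd = p[0];
    rc = epoll_ctl(epfd, EPOLL_CTL_ADD, p[0], &ev);
    assert(rc == 0);

    ssize_t n = write(p[1], "x", 1);
    assert(n == 1);

    struct epoll_event out;
    int first = epoll_wait(epfd, &out, 1, 0);
    assert(first == 1);

    /* Nothing was read and nothing new arrived in between. */
    int second = epoll_wait(epfd, &out, 1, 0);

    close(p[0]);
    close(p[1]);
    close(epfd);
    return second;
}
```

In level-triggered mode the descriptor is reported again as long as data remains; in edge-triggered mode it is reported only when new data arrives, so the handler has to read until the descriptor is drained.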

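Disadvantages

1. Changing the read/write interest is more work. With poll you just flip a bit in events; here you have to fill in a whole struct epoll_event and make a system call (epoll_ctl) for every change.
2. Every new socket costs two system calls, because after creating it you also have to register it with epoll_ctl.
3. It only works on Linux.
4. It is complex and hard to debug.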
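
The first point is easiest to see side by side. In this sketch, which assumes Linux and uses a socketpair as a stand-in for a real connection, all function names are hypothetical.

```c
#include <assert.h>
#include <poll.h>
#include <sys/epoll.h>
#include <sys/socket.h>
#include <unistd.h>

/* poll: changing interest is a plain store into the caller's own array. */
void poll_watch_for_write(struct pollfd *pfd)
{
    pfd->events = POLLOUT;
}

/* epoll: the same change needs a filled-in struct and one system call. */
int epoll_watch_for_write(int epfd, int fd)
{
    struct epoll_event ev = { 0 };
    ev.events = EPOLLOUT;
    ev.data.fd = fd;
    return epoll_ctl(epfd, EPOLL_CTL_MOD, fd, &ev);
}

/* Hypothetical demo: returns 1 if an idle socket reports nothing while it
 * is watched for input and one event after the switch to output. */
int interest_change_demo(void)
{
    int sv[2];
    int rc = socketpair(AF_UNIX, SOCK_STREAM, 0, sv);
    assert(rc == 0);
    int epfd = epoll_create(1);
    assert(epfd >= 0);

    struct epoll_event ev = { 0 }, out;
    ev.events = EPOLLIN;
    ev.data.fd = sv[0];
    rc = epoll_ctl(epfd, EPOLL_CTL_ADD, sv[0], &ev);
    assert(rc == 0);

    int before = epoll_wait(epfd, &out, 1, 0);   /* nothing to read yet */
    rc = epoll_watch_for_write(epfd, sv[0]);
    assert(rc == 0);
    int after = epoll_wait(epfd, &out, 1, 0);    /* empty send buffer: writable */

    /* The poll version of the same switch involves no extra system call. */
    struct pollfd pfd = { .fd = sv[0], .events = POLLIN };
    poll_watch_for_write(&pfd);
    int polled = poll(&pfd, 1, 0);

    close(sv[0]);
    close(sv[1]);
    close(epfd);
    return before == 0 && after == 1 && polled == 1;
}
```

For connections that flip between reading and writing very often, these extra epoll_ctl calls are the overhead this point describes.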

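Suitable scenarios

1. Multithreaded applications with many connections; in a single thread it is no better than poll.
2. A large number of monitored sockets, 1000 or more.
3. Relatively long-lived connections; with short-lived ones the extra system calls become expensive.
4. Your application already depends on Linux-specific features.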

Code

// Create the epoll descriptor. Only one is needed per app, and is used to monitor all sockets.
// The function argument is ignored (it was not before, but now it is), but it must be greater than zero
int pollingfd = epoll_create( 0xCAFE );

if ( pollingfd < 0 )
{
    // report error
}

// Initialize the epoll structure in case more members are added in future
struct epoll_event ev = { 0 };

// Associate the connection class instance with the event. You can associate anything
// you want, epoll does not use this information. We store a connection class pointer, pConnection1
ev.data.ptr = pConnection1;

// Monitor for input, and do not automatically rearm the descriptor after the event
ev.events = EPOLLIN | EPOLLONESHOT;
// Add the descriptor into the monitoring list. We can do it even if another thread is
// waiting in epoll_wait - the descriptor will be properly added
if ( epoll_ctl( pollingfd, EPOLL_CTL_ADD, pConnection1->getSocket(), &ev ) != 0 )
{
    // report error
}

// Wait for up to 20 events (assuming we have added maybe 200 sockets before that it may happen)
struct epoll_event pevents[ 20 ];

// Wait for 10 seconds
int ret = epoll_wait( pollingfd, pevents, 20, 10000 );
// Check if epoll actually succeeded
if ( ret == -1 )
{
    // report error and abort
}
else if ( ret == 0 )
{
    // timeout; no event detected
}
else
{
    // Check if any events detected
    for ( int i = 0; i < ret; i++ )
    {
        if ( pevents[i].events & EPOLLIN )
        {
            // Get back our connection pointer
            Connection * c = (Connection*) pevents[i].data.ptr;
            c->handleReadEvent();
        }
    }
}

libevent

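libevent is a library that wraps select, poll, and epoll. Underneath it still uses these three mechanisms, so their strengths and weaknesses remain. What it gives you is adaptation to different environments, because it picks among the three automatically. It is still somewhat cumbersome to use, but better than writing all three yourself.

To sum up: select passes the read and write fd_sets, which must be rebuilt and passed again after every call. poll passes an array of per-fd events, and the events fields do not need to be refilled between calls.
epoll replaces all of those descriptions with a single descriptor, and the state behind it lives in kernel space.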