Select VS Poll VS Epoll
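A basic concept in Linux: everything in Unix/Linux is a file. Every process has a table of file descriptors that point to files, sockets, devices, or other operating system objects.
A typical system that works with many I/O sources has an initialization phase and then enters a kind of standby mode, waiting for any client to send a request and then responding to it.
The simplest solution is to create a thread or process per client, block on read until a request arrives, and then write the response. This works for a small number of clients, but if we want to scale to hundreds of clients, creating a thread for each one is clearly not a good solution.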
I/O Multiplexing
The alternative is to use a kernel mechanism that monitors a set of file descriptors. There are three options today:
- select
- poll
- epoll
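All three are based on the same idea: build a set of file descriptors, tell the kernel which operation (read, write, ...) you are interested in for each one, and use a single thread that blocks in one function call until at least one of the requested operations becomes possible on some descriptor.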
Select
The select() system call provides a mechanism for synchronous I/O multiplexing.
int select(int nfds, fd_set *readfds, fd_set *writefds, fd_set *exceptfds, struct timeval *timeout);
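A call to select() blocks until one of the given file descriptors is ready for I/O, or until the optional timeout expires.
The watched file descriptors fall into three sets:
- File descriptors listed in the readfds set are watched to see if data is available for reading. This is the read set, the second parameter.
- File descriptors listed in the writefds set are watched to see if a write operation will complete without blocking. This is the write set, the third parameter.
- File descriptors in the exceptfds set are watched to see if an exception has occurred, or if out-of-band data is available (these states apply only to sockets). This is the exception set, the fourth parameter.
Any of the sets may be NULL, in which case select() does not watch for that kind of event.
On successful return, each set is modified so that it contains only the file descriptors that are ready for the kind of I/O that set describes.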
#include <stdio.h>
#include <stdlib.h>      /* exit, random, srandom */
#include <string.h>      /* memset, strlen */
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>   /* inet_addr */
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

#define MAXBUF 256

/* Client: connect to the server, then send a message every 1-10 seconds. */
void child_process(void)
{
    sleep(2);   /* give the parent time to bind and listen */
    char msg[MAXBUF];
    struct sockaddr_in addr = {0};
    int sockfd, num = 1;
    ssize_t n;
    srandom(getpid());
    /* Create socket and connect to server */
    sockfd = socket(AF_INET, SOCK_STREAM, 0);
    addr.sin_family = AF_INET;
    addr.sin_port = htons(2000);
    addr.sin_addr.s_addr = inet_addr("127.0.0.1");
    if (connect(sockfd, (struct sockaddr*)&addr, sizeof(addr)) < 0) {
        perror("connect");
        exit(1);
    }
    printf("child {%d} connected \n", getpid());
    while (1) {
        int sl = (random() % 10) + 1;
        num++;
        sleep(sl);
        snprintf(msg, sizeof(msg), "Test message %d from client %d", num, getpid());
        n = write(sockfd, msg, strlen(msg)); /* Send message */
        if (n < 0)
            break;
    }
}

int main()
{
    char buffer[MAXBUF];
    int fds[5];
    struct sockaddr_in addr;
    struct sockaddr_in client;
    socklen_t addrlen;   /* accept() expects a socklen_t *, not an int * */
    ssize_t n;
    int i, max = 0;
    int sockfd;
    fd_set rset;

    /* Start 5 clients */
    for (i = 0; i < 5; i++) {
        if (fork() == 0) {
            child_process();
            exit(0);
        }
    }

    sockfd = socket(AF_INET, SOCK_STREAM, 0);
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_port = htons(2000);
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    if (bind(sockfd, (struct sockaddr*)&addr, sizeof(addr)) < 0) {
        perror("bind");
        exit(1);
    }
    listen(sockfd, 5);

    /* Accept the 5 connections and remember the highest fd */
    for (i = 0; i < 5; i++) {
        memset(&client, 0, sizeof(client));
        addrlen = sizeof(client);
        fds[i] = accept(sockfd, (struct sockaddr*)&client, &addrlen);
        if (fds[i] > max)
            max = fds[i];
    }

    while (1) {
        /* select() overwrites rset, so rebuild it on every iteration */
        FD_ZERO(&rset);
        for (i = 0; i < 5; i++) {
            FD_SET(fds[i], &rset);
        }
        puts("round again");
        select(max + 1, &rset, NULL, NULL, NULL);
        for (i = 0; i < 5; i++) {
            if (FD_ISSET(fds[i], &rset)) {
                memset(buffer, 0, MAXBUF);
                /* read at most MAXBUF-1 bytes so buffer stays NUL-terminated */
                n = read(fds[i], buffer, MAXBUF - 1);
                if (n > 0)
                    puts(buffer);
            }
        }
    }
    return 0;
}
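We start by creating 5 child processes; each one connects to the server and sends it messages. The server process uses accept() to obtain a separate file descriptor for each client. The first argument to select() should be the highest-numbered file descriptor in any of the three sets, plus 1, which is why we keep track of the largest fd in max.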
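How a select() call proceeds:
- select() is a blocking call: while there is no data, the program stays blocked on the select() line.
- When data arrives, the kernel leaves the bits for the ready descriptors set to 1 in rset and clears the rest.
- select() returns and the program is no longer blocked.
- The program loops over the file descriptor array to find out which fds are set.
- It reads the data and processes it.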
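The steps above can be tried on a single descriptor without any sockets. The following is a minimal sketch, not part of the original example: wait_readable is a hypothetical helper name, and it assumes the descriptor number is below FD_SETSIZE.

```c
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/* Hypothetical helper: wait up to timeout_ms for fd to become readable.
 * Returns 1 if readable, 0 on timeout, -1 on error.
 * Assumes 0 <= fd < FD_SETSIZE. */
int wait_readable(int fd, int timeout_ms)
{
    fd_set rset;
    struct timeval tv;

    FD_ZERO(&rset);              /* the set is rebuilt before every call */
    FD_SET(fd, &rset);
    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;

    /* First argument: highest descriptor in any set, plus 1 */
    int ready = select(fd + 1, &rset, NULL, NULL, &tv);
    if (ready < 0)
        return -1;

    /* On return, rset contains only the descriptors that are ready */
    return FD_ISSET(fd, &rset) ? 1 : 0;
}
```

Called on the read end of an empty pipe, this should time out and return 0; after one byte is written to the other end, it should return 1.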
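Drawbacks of select()
- The bitmap has a fixed size, FD_SETSIZE, which is 1024 by default on Linux; even where it can be raised, there is still an upper limit.
- rset must be cleared and refilled on every iteration and cannot be reused. That is the FD_ZERO call and FD_SET loop at the top of the while loop.
- Although the kernel is the one that checks for data, rset still has to be copied from user space to kernel space on every call, and that copy has a cost.
- select() returns as soon as data is available, but it does not tell us which file descriptor has the data; we have to loop over the whole descriptor array again afterwards, which is inefficient.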
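The size limit is easy to trip over, because calling FD_SET with a descriptor at or above FD_SETSIZE is undefined behavior rather than a reported error. A small guard such as the sketch below can catch it; fd_set_checked is a hypothetical helper name, not a standard function.

```c
#include <sys/select.h>

/* Hypothetical helper: add fd to set only if select() can represent it.
 * Returns 0 on success, -1 if fd is outside [0, FD_SETSIZE). */
int fd_set_checked(int fd, fd_set *set)
{
    if (fd < 0 || fd >= FD_SETSIZE)
        return -1;               /* would overflow the bitmap */
    FD_SET(fd, set);
    return 0;
}
```

When this guard fails, the usual way out is to switch to poll() or epoll, which do not have this limit.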
Poll
Unlike select(), poll() does not use a bitmap to mark which file descriptors are ready. Instead it takes an array of struct pollfd, whose length is passed in nfds.
int poll(struct pollfd *fds, nfds_t nfds, int timeout);
The pollfd structure has separate fields for the requested events and the returned events, so we don't need to rebuild it on every call:
struct pollfd {
    int fd;
    // requested events
    short events;
    // returned events
    short revents;
};
For each file descriptor, build an object of type pollfd and fill in the events of interest. After poll() returns, check the revents field.
To change the above example to use poll():
struct pollfd pollfds[5];   /* declared alongside the other locals in main() */

for (i = 0; i < 5; i++) {
    memset(&client, 0, sizeof(client));
    addrlen = sizeof(client);
    pollfds[i].fd = accept(sockfd, (struct sockaddr*)&client, &addrlen);
    pollfds[i].events = POLLIN;
}
sleep(1);
while (1) {
    puts("round again");
    poll(pollfds, 5, 50000);   /* timeout is in milliseconds */
    for (i = 0; i < 5; i++) {
        if (pollfds[i].revents & POLLIN) {
            pollfds[i].revents = 0;
            memset(buffer, 0, MAXBUF);
            /* read at most MAXBUF-1 bytes so buffer stays NUL-terminated */
            read(pollfds[i].fd, buffer, MAXBUF - 1);
            puts(buffer);
        }
    }
}
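As with select(), we still have to check every pollfd object to see whether its file descriptor is ready, but we don't need to rebuild the set on each iteration.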
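To see the reuse in isolation, here is a minimal sketch that polls the same array repeatedly. poll_readable is a hypothetical helper name, and the ready output array is an assumption made for illustration.

```c
#include <poll.h>

/* Hypothetical helper: poll `count` descriptors and return how many are
 * readable (-1 on error). ready[i] is set to 1 or 0 for each entry.
 * Only revents is written by poll(), so the caller can pass the same
 * array again without rebuilding it. */
int poll_readable(struct pollfd *pfds, int count, int timeout_ms, int *ready)
{
    int n = poll(pfds, (nfds_t)count, timeout_ms);
    if (n < 0)
        return -1;

    int found = 0;
    for (int i = 0; i < count; i++) {
        ready[i] = (pfds[i].revents & POLLIN) ? 1 : 0;
        found += ready[i];
    }
    return found;
}
```

The fd and events fields are filled in once. poll() is level-triggered, so a descriptor whose data has not been read yet is reported again on the next call.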
Poll vs Select
- poll() does not require the user to calculate the value of the highest-numbered file descriptor plus 1.
- poll() is more efficient for large-valued file descriptors. Imagine watching a single file descriptor with the value 900 via select(): the kernel would have to check each bit of each passed-in set, up to the 900th bit.
- select()'s file descriptor sets are statically sized.
- With select(), the file descriptor sets are reconstructed on return, so each subsequent call must reinitialize them. The poll() system call separates the input (events field) from the output (revents field), allowing the array to be reused without change.
- The timeout parameter to select() is undefined on return, so portable code needs to reinitialize it. This is not an issue with pselect().
- select() is more portable, as some Unix systems do not support poll().
Epoll
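With select() and poll(), we manage everything in user space and pass the whole set to the kernel on every call. To add another socket, we add it to the set and call select()/poll() again.
The epoll_* system calls let us create and manage the context in the kernel instead. The work is divided into 3 steps:
- create a context in the kernel using epoll_create
- add and remove file descriptors to/from the context using epoll_ctl
- wait for events in the context using epoll_wait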
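struct epoll_event events[5];
int epfd = epoll_create(10);   /* the size argument is only a hint; it must be > 0 */
int nfds;
...
...
for (i = 0; i < 5; i++) {
    static struct epoll_event ev;
    memset(&client, 0, sizeof(client));
    addrlen = sizeof(client);
    ev.data.fd = accept(sockfd, (struct sockaddr*)&client, &addrlen);
    ev.events = EPOLLIN;
    /* register the new connection with the kernel context */
    epoll_ctl(epfd, EPOLL_CTL_ADD, ev.data.fd, &ev);
}
while (1) {
    puts("round again");
    nfds = epoll_wait(epfd, events, 5, 10000);   /* timeout is in milliseconds */
    /* only the ready descriptors are returned, so no full scan is needed */
    for (i = 0; i < nfds; i++) {
        memset(buffer, 0, MAXBUF);
        /* read at most MAXBUF-1 bytes so buffer stays NUL-terminated */
        read(events[i].data.fd, buffer, MAXBUF - 1);
        puts(buffer);
    }
}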
We first create a context. When a client connects, we create an epoll_event object and add it to the context; in the infinite loop, we just wait on the context.
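The register-then-wait pattern can be factored into two small functions. This is a sketch only: watch_for_read and next_ready_fd are hypothetical helper names, and the batch size of 8 is an arbitrary choice.

```c
#include <sys/epoll.h>

/* Hypothetical helper: register fd for read readiness on the epoll
 * context epfd. Returns 0 on success, -1 on error. */
int watch_for_read(int epfd, int fd)
{
    struct epoll_event ev = {0};
    ev.events = EPOLLIN;
    ev.data.fd = fd;             /* handed back to us by epoll_wait() */
    return epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev);
}

/* Hypothetical helper: wait once and return the fd of the first ready
 * descriptor, or -1 on timeout or error. */
int next_ready_fd(int epfd, int timeout_ms)
{
    struct epoll_event events[8];
    int n = epoll_wait(epfd, events, 8, timeout_ms);
    if (n <= 0)
        return -1;
    return events[0].data.fd;
}
```

The context would be created once, for example with epoll_create1(0), and each new connection registered as it is accepted. The wait call takes no descriptor list, because the kernel already holds it.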
Epoll vs Select/Poll
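- We can add and remove file descriptors while waiting.
- epoll_wait returns only the objects with ready file descriptors.
- epoll performs better with many descriptors: the cost of epoll_wait grows with the number of ready descriptors rather than with the number being watched (often summarized as O(1) instead of O(n)).
- epoll can behave as level-triggered or edge-triggered.
- epoll is Linux-specific, so it is not portable.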
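The difference between level-triggered and edge-triggered behavior shows up when a descriptor is reported ready and the data is then left unread. The sketch below makes that visible with a pipe; second_wait_count is a hypothetical helper written for this illustration, and the 50 ms timeout is arbitrary.

```c
#include <stdint.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Hypothetical helper: register the read end of a fresh pipe with
 * EPOLLIN | extra_flags (0 = level-triggered, EPOLLET = edge-triggered),
 * write one byte, then call epoll_wait() twice WITHOUT reading.
 * Returns the number of events reported by the second call,
 * or -1 if setup fails. */
int second_wait_count(uint32_t extra_flags)
{
    int p[2];
    struct epoll_event ev = {0}, out;
    int result = -1;

    if (pipe(p) < 0)
        return -1;
    int epfd = epoll_create1(0);
    if (epfd < 0) {
        close(p[0]);
        close(p[1]);
        return -1;
    }

    ev.events = EPOLLIN | extra_flags;
    ev.data.fd = p[0];
    if (epoll_ctl(epfd, EPOLL_CTL_ADD, p[0], &ev) == 0 &&
        write(p[1], "x", 1) == 1 &&
        epoll_wait(epfd, &out, 1, 50) == 1) {
        /* The byte is still unread: does epoll report it again? */
        result = epoll_wait(epfd, &out, 1, 50);
    }

    close(epfd);
    close(p[0]);
    close(p[1]);
    return result;
}
```

In level-triggered mode the second wait should report the descriptor again, since data is still pending. In edge-triggered mode it should report nothing until new data arrives, which is why edge-triggered code normally reads until read() fails with EAGAIN on a non-blocking descriptor.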
Reference
Adapted from this article: https://devarea.com/linux-io-multiplexing-select-vs-poll-vs-epoll/#.X6ypn8gzZEb
These are my own study notes, so please go easy on me; if you spot any mistakes, do point them out!