Linux epoll Functions Illustrated

I came across a good article, "The method to epoll's madness" by Cindy Sridharan (link to the original).

An excerpt follows:

epoll stands for event poll and is a Linux specific construct. It allows for a process to monitor multiple file descriptors and get notifications when I/O is possible on them. It allows for both edge-triggered as well as level-triggered notifications. Before we look into the bowels of epoll, first let’s explore the syntax.

The syntax of epoll

Unlike poll, epoll itself is not a system call. It's a kernel data structure that allows a process to multiplex I/O on multiple file descriptors.

This data structure can be created, modified and deleted by three system calls.

1) epoll_create

The epoll instance is created by means of the epoll_create system call, which returns a file descriptor to the epoll instance. The signature of epoll_create is as follows:

#include <sys/epoll.h>
int epoll_create(int size);

The size argument is a hint to the kernel about the number of file descriptors a process wants to monitor, which originally helped the kernel decide how large to make the epoll instance. Since Linux 2.6.8 this argument is ignored, because the epoll data structure resizes dynamically as file descriptors are added to or removed from it, but it must still be greater than zero.

The epoll_create system call returns a file descriptor to the newly created epoll kernel data structure. The calling process can then use this file descriptor to add, remove or modify other file descriptors it wants to monitor for I/O to the epoll instance.
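As a minimal sketch of the call described above: the helper name try_epoll_create is made up for this illustration. It relies on the behaviour documented in the epoll_create(2) man page, namely that a size of zero or less is rejected with EINVAL even though the value itself is otherwise ignored.

```c
#include <errno.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if epoll_create(size) succeeds,
 * 0 if the kernel rejects the size with EINVAL, -1 on any other error. */
int try_epoll_create(int size)
{
    int epfd = epoll_create(size);
    if (epfd == -1)
        return (errno == EINVAL) ? 0 : -1;

    /* The returned value is an ordinary file descriptor, so release it. */
    close(epfd);
    return 1;
}
```

Any positive size should behave the same way on a current kernel, so try_epoll_create(1) and try_epoll_create(1024) are expected to be equivalent.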

 

There is another system call epoll_create1 which is defined as follows:

int epoll_create1(int flags);

The flags argument can either be 0 or EPOLL_CLOEXEC.

When set to 0, epoll_create1 behaves the same way as epoll_create.

When the EPOLL_CLOEXEC flag is set, the new descriptor is marked close-on-exec: it is closed automatically when the process (or a child forked from it) calls one of the exec functions, so the newly executed program won’t have access to the epoll instance.

It’s important to note that the file descriptor associated with the epoll instance needs to be released with a close() system call. Multiple processes might hold a descriptor to the same epoll instance (for example, a fork duplicates the descriptor to the epoll instance in the child process, whether or not EPOLL_CLOEXEC was set). When all of these processes have relinquished their descriptor to the epoll instance (by either calling close() or by exiting), the kernel destroys the epoll instance.
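The effect of the flags argument can be observed without forking, by reading the descriptor flags back with fcntl. This is only a sketch; the function name create_and_inspect is an assumption made for the example.

```c
#include <fcntl.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Hypothetical helper: creates an epoll instance with the given flags,
 * checks whether the descriptor is marked close-on-exec, then closes it.
 * Returns 1 if FD_CLOEXEC is set, 0 if it is not, -1 on error. */
int create_and_inspect(int flags)
{
    int epfd = epoll_create1(flags);
    if (epfd == -1)
        return -1;

    int fdflags = fcntl(epfd, F_GETFD);
    int result = (fdflags == -1) ? -1 : ((fdflags & FD_CLOEXEC) != 0);

    /* Last descriptor closed: the kernel can now destroy the instance. */
    close(epfd);
    return result;
}
```

Calling create_and_inspect(0) should report that the flag is clear, and create_and_inspect(EPOLL_CLOEXEC) that it is set.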

2) epoll_ctl

A process can add file descriptors it wants monitored to the epoll instance by calling epoll_ctl. All the file descriptors registered with an epoll instance are collectively called an epoll set or the interest list.

 

In the original article's diagram, process 483 has registered file descriptors fd1, fd2, fd3, fd4 and fd5 with the epoll instance. This is the interest list or the epoll set of that particular epoll instance. Subsequently, when any of the registered file descriptors becomes ready for I/O, it is considered to be in the ready list.

The ready list is a subset of the interest list.
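The relationship between the two lists can be illustrated with two pipes, assuming a pipe's read end becomes ready only once something has been written to it. The function name count_ready_of_two is invented for this sketch, and it uses epoll_ctl and epoll_wait ahead of their detailed description in the article.

```c
#include <sys/epoll.h>
#include <unistd.h>

/* Hypothetical demo: puts the read ends of two pipes in the interest list,
 * makes only the first one readable, and returns how many descriptors
 * epoll_wait reports as ready (-1 on any error). */
int count_ready_of_two(void)
{
    int a[2], b[2];
    if (pipe(a) == -1)
        return -1;
    if (pipe(b) == -1) {
        close(a[0]);
        close(a[1]);
        return -1;
    }

    int ready = -1;
    int epfd = epoll_create1(0);
    if (epfd != -1) {
        struct epoll_event ev = { .events = EPOLLIN };

        /* Interest list: both read ends. */
        ev.data.fd = a[0];
        int ok = epoll_ctl(epfd, EPOLL_CTL_ADD, a[0], &ev) == 0;
        ev.data.fd = b[0];
        ok = ok && epoll_ctl(epfd, EPOLL_CTL_ADD, b[0], &ev) == 0;

        /* Only pipe a receives data, so only a[0] should become ready. */
        if (ok && write(a[1], "x", 1) == 1) {
            struct epoll_event out[2];
            ready = epoll_wait(epfd, out, 2, 0);
            if (ready == 1 && out[0].data.fd != a[0])
                ready = -1; /* the wrong descriptor was reported */
        }
        close(epfd);
    }

    close(a[0]);
    close(a[1]);
    close(b[0]);
    close(b[1]);
    return ready;
}
```

Two descriptors are in the interest list here, but only one of them is expected to show up in the ready list.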

 

The signature of the epoll_ctl syscall is as follows:

#include <sys/epoll.h>
int epoll_ctl(int epfd, int op, int fd, struct epoll_event *event);

 

epfd — is the file descriptor returned by epoll_create which identifies the epoll instance in the kernel.

fd — is the file descriptor we want to add to the epoll list/interest list.

op — refers to the operation to be performed on the file descriptor fd. In general, three operations are supported:

— Register fd with the epoll instance (EPOLL_CTL_ADD) and get notified about events that occur on fd 
— Delete/deregister fd from the epoll instance (EPOLL_CTL_DEL). The process will no longer get any notifications about events on that file descriptor. If a file descriptor has been added to multiple epoll instances, then closing it removes it from all of the epoll interest lists to which it was added, provided no duplicate descriptor (for example one created by dup or fork) still refers to the same open file description.
— Modify the events fd is monitoring (EPOLL_CTL_MOD)
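The three operations can be exercised in sequence on a single descriptor. The sketch below uses two made-up helpers, ctl and ctl_sequence, and records the errno values that the epoll_ctl(2) man page documents for the misuse cases: EEXIST when adding a descriptor that is already registered, and ENOENT when deleting one that is not.

```c
#include <errno.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Hypothetical wrapper: performs op on (epfd, fd) and returns 0 on
 * success or the errno value on failure. */
static int ctl(int epfd, int op, int fd, unsigned int events)
{
    struct epoll_event ev = { .events = events };
    ev.data.fd = fd;
    return epoll_ctl(epfd, op, fd, &ev) == 0 ? 0 : errno;
}

/* Walks one pipe read end through add, duplicate add, modify, delete and
 * a second delete, storing each outcome in results[0..4].
 * Returns 0 if the setup worked, -1 otherwise. */
int ctl_sequence(int results[5])
{
    int p[2];
    if (pipe(p) == -1)
        return -1;

    int epfd = epoll_create1(0);
    if (epfd == -1) {
        close(p[0]);
        close(p[1]);
        return -1;
    }

    results[0] = ctl(epfd, EPOLL_CTL_ADD, p[0], EPOLLIN);
    results[1] = ctl(epfd, EPOLL_CTL_ADD, p[0], EPOLLIN);           /* already registered */
    results[2] = ctl(epfd, EPOLL_CTL_MOD, p[0], EPOLLIN | EPOLLET); /* change the event mask */
    results[3] = ctl(epfd, EPOLL_CTL_DEL, p[0], 0);
    results[4] = ctl(epfd, EPOLL_CTL_DEL, p[0], 0);                 /* no longer registered */

    close(epfd);
    close(p[0]);
    close(p[1]);
    return 0;
}
```

On Linux the recorded outcomes should be success, EEXIST, success, success and ENOENT, in that order.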

 

event is a pointer to a structure called epoll_event which stores the event we actually want to monitor fd for.

 

The first field events of the epoll_event structure is a bitmask that indicates which events fd is being monitored for.

For example, if fd is a socket, we might want to monitor it for the arrival of new data on the socket buffer (EPOLLIN). We might also want edge-triggered notifications for fd, which is done by OR-ing EPOLLET with EPOLLIN. Or we might want to be notified of a registered event only once and then stop monitoring fd for subsequent occurrences of that event. This is accomplished by OR-ing the flag for only-once notification delivery, EPOLLONESHOT, with the other flags (EPOLLET, EPOLLIN) we want to set for fd. All possible flags can be found in the man page.
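The difference between these flags shows up when the same unread data is polled twice. The sketch below assumes a pipe as the monitored descriptor, and the name second_poll_result is invented for the example. It writes one byte, never reads it, and returns what the second epoll_wait call reports.

```c
#include <stdint.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Hypothetical demo: registers a pipe's read end with EPOLLIN | extra_flags,
 * writes one byte, and polls twice without reading the data.
 * Returns the result of the second epoll_wait, or -1 on error. */
int second_poll_result(uint32_t extra_flags)
{
    int p[2];
    if (pipe(p) == -1)
        return -1;

    int result = -1;
    int epfd = epoll_create1(0);
    if (epfd != -1) {
        struct epoll_event ev = { .events = EPOLLIN | extra_flags };
        ev.data.fd = p[0];
        struct epoll_event out;

        if (epoll_ctl(epfd, EPOLL_CTL_ADD, p[0], &ev) == 0 &&
            write(p[1], "x", 1) == 1 &&
            epoll_wait(epfd, &out, 1, 0) == 1) {
            /* The byte is still unread when we poll again. */
            result = epoll_wait(epfd, &out, 1, 0);
        }
        close(epfd);
    }

    close(p[0]);
    close(p[1]);
    return result;
}
```

With no extra flags (level-triggered) the second poll should report the descriptor again. With EPOLLET or EPOLLONESHOT it should report nothing, because no new event arrived after the first notification.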

The second field of the epoll_event struct is a union field.
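For reference, the layout documented in the epoll_ctl(2) man page is reproduced as a comment in the sketch below. The kernel never interprets the union. It simply hands back whatever was stored at registration time, so a common pattern is to keep a pointer to per-descriptor state in data.ptr. The struct conn type and the function roundtrip_user_data are assumptions made for this illustration only.

```c
#include <sys/epoll.h>
#include <unistd.h>

/*
 * Layout as documented in epoll_ctl(2):
 *
 *   typedef union epoll_data {
 *       void     *ptr;
 *       int       fd;
 *       uint32_t  u32;
 *       uint64_t  u64;
 *   } epoll_data_t;
 *
 *   struct epoll_event {
 *       uint32_t     events;   // bitmask of monitored events
 *       epoll_data_t data;     // user data, returned untouched
 *   };
 */

/* Hypothetical per-descriptor context; not part of the epoll API. */
struct conn {
    int fd;
    int id;
};

/* Registers a pipe read end with data.ptr pointing at a context object,
 * makes it readable, and returns the id recovered from the event that
 * epoll_wait hands back (-1 on error). */
int roundtrip_user_data(int id)
{
    int p[2];
    if (pipe(p) == -1)
        return -1;

    struct conn ctx = { .fd = p[0], .id = id };
    int got = -1;
    int epfd = epoll_create1(0);
    if (epfd != -1) {
        struct epoll_event ev = { .events = EPOLLIN };
        ev.data.ptr = &ctx;
        struct epoll_event out;

        if (epoll_ctl(epfd, EPOLL_CTL_ADD, p[0], &ev) == 0 &&
            write(p[1], "x", 1) == 1 &&
            epoll_wait(epfd, &out, 1, 0) == 1) {
            got = ((struct conn *)out.data.ptr)->id;
        }
        close(epfd);
    }

    close(p[0]);
    close(p[1]);
    return got;
}
```

Because data is a union, only one member can be used per registration: storing a pointer in data.ptr means data.fd no longer holds the descriptor, which is why the context object carries its own fd field.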

3) epoll_wait

A thread can be notified of events that happened on the epoll set/interest set of an epoll instance by calling the epoll_wait system call, which blocks until any of the descriptors being monitored becomes ready for I/O.

The signature of epoll_wait is as follows:

#include <sys/epoll.h>
int epoll_wait(int epfd, struct epoll_event *evlist, int maxevents, int timeout);

epfd — is the file descriptor returned by epoll_create which identifies the epoll instance in the kernel.

evlist — is an array of epoll_event structures. evlist is allocated by the calling process, and when epoll_wait returns, this array has been filled in with information about the subset of file descriptors in the interest list that are in the ready state (this is called the ready list).

maxevents — is the length of the evlist array

timeout — this argument behaves the same way as it does for poll or select. This value specifies for how long the epoll_wait system call will block:

— when the timeout is set to 0, epoll_wait does not block but returns immediately after checking which file descriptors in the interest list for epfd are ready
— when timeout is set to -1, epoll_wait will block “forever”. When epoll_wait blocks, the kernel can put the process to sleep until epoll_wait returns. epoll_wait will block until 1) one or more descriptors specified in the interest list for epfd become ready or 2) the call is interrupted by a signal handler
— when timeout is set to a positive value, epoll_wait will block until 1) one or more descriptors specified in the interest list for epfd become ready or 2) the call is interrupted by a signal handler or 3) the amount of time specified by timeout, in milliseconds, has expired
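The timeout cases above can be tried against a descriptor that never becomes ready. This is a sketch under that assumption: the pipe is never written to, and the helper name wait_on_idle_pipe is invented. Passing -1 here would block until a signal arrives, so the sketch is meant to be called only with 0 or a positive number of milliseconds.

```c
#include <sys/epoll.h>
#include <unistd.h>

/* Hypothetical demo: waits on an epoll instance whose only registered
 * descriptor (an empty pipe's read end) never becomes ready.
 * Returns whatever epoll_wait returns, or -1 if the setup failed. */
int wait_on_idle_pipe(int timeout_ms)
{
    int p[2];
    if (pipe(p) == -1)
        return -1;

    int n = -1;
    int epfd = epoll_create1(0);
    if (epfd != -1) {
        struct epoll_event ev = { .events = EPOLLIN };
        ev.data.fd = p[0];
        struct epoll_event out;

        if (epoll_ctl(epfd, EPOLL_CTL_ADD, p[0], &ev) == 0)
            n = epoll_wait(epfd, &out, 1, timeout_ms);
        close(epfd);
    }

    close(p[0]);
    close(p[1]);
    return n;
}
```

With a timeout of 0 the call should return at once, and with a small positive timeout it should return after roughly that many milliseconds. In both cases nothing is ready.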

The return values of epoll_wait are the following:

— if an error (EBADF or EINTR or EFAULT or EINVAL) occurred, then the return code is -1
— if the call timed out before any file descriptor in the interest list became ready, then the return code is 0
— if one or more file descriptors in the interest list became ready, then the return code is a positive integer which indicates the total number of file descriptors in the evlist array. The evlist is then examined to determine which events occurred on which file descriptors.
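Putting the three return cases together, a caller might handle them roughly as follows. The names wait_retrying, read_ready and MAX_EVENTS are made up for this sketch, and it assumes every descriptor was registered with its own number stored in data.fd. Retrying on EINTR restarts the full timeout, which is a simplification.

```c
#include <errno.h>
#include <sys/epoll.h>
#include <unistd.h>

#define MAX_EVENTS 8 /* arbitrary evlist length chosen for this sketch */

/* Hypothetical wrapper: calls epoll_wait again if a signal handler
 * interrupted it. Returns the same values as epoll_wait otherwise. */
int wait_retrying(int epfd, struct epoll_event *evlist, int maxevents,
                  int timeout_ms)
{
    int n;
    do {
        n = epoll_wait(epfd, evlist, maxevents, timeout_ms);
    } while (n == -1 && errno == EINTR);
    return n;
}

/* Waits once and reads from every descriptor reported as readable.
 * Returns the total number of bytes read, 0 on timeout, -1 on error. */
long read_ready(int epfd, int timeout_ms)
{
    struct epoll_event evlist[MAX_EVENTS];
    int n = wait_retrying(epfd, evlist, MAX_EVENTS, timeout_ms);
    if (n <= 0)
        return n; /* -1: error, 0: timed out with nothing ready */

    long total = 0;
    for (int i = 0; i < n; i++) {
        /* Only the first n entries of evlist are valid. */
        if (evlist[i].events & EPOLLIN) {
            char buf[256];
            ssize_t got = read(evlist[i].data.fd, buf, sizeof buf);
            if (got > 0)
                total += got;
        }
    }
    return total;
}
```

A caller would typically invoke read_ready in a loop, treating -1 as fatal and 0 as an idle tick.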

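### Answer 1:

`epoll` is a Linux I/O multiplexing mechanism for handling large numbers of concurrent connections. It maintains a set of file descriptors and monitors their state, so that the program is notified quickly when an I/O event occurs.

The basic flow for using `epoll` is:

1. Create an `epoll` handle: call `epoll_create` or `epoll_create1`.
2. Register file descriptors: call `epoll_ctl` to add the descriptors to be monitored, setting the events of interest for each one.
3. Wait: call `epoll_wait`, which blocks until one or more file descriptors are ready.
4. Handle events: when descriptors are ready, process the corresponding I/O events.
5. Repeat steps 3 and 4 to keep monitoring the descriptors and handling I/O events.

A common pattern is to create a thread that calls `epoll_wait` and, when descriptors become ready, handles the corresponding I/O events through callback functions.

### Answer 2:

epoll is an efficient I/O event notification mechanism on Linux, used to manage large numbers of file descriptors. It is used as follows:

1. Create an epoll handle:
   int epoll_create(int size);
   This creates an epoll instance and returns a file descriptor. size is the expected number of descriptors to monitor and can usually be set to any positive integer.
2. Register file descriptors and events:
   int epoll_ctl(int epfd, int op, int fd, struct epoll_event *event);
   epfd is the file descriptor of the epoll instance, op is the operation (EPOLL_CTL_ADD to add, EPOLL_CTL_MOD to modify, EPOLL_CTL_DEL to delete), fd is the descriptor to monitor, and event is a pointer to the structure describing the events of interest.
3. Wait for events on the descriptors:
   int epoll_wait(int epfd, struct epoll_event *events, int maxevents, int timeout);
   epfd is the file descriptor of the epoll instance, events is the array that receives the events, maxevents is the maximum number of events returned per call, and timeout is the wait time (-1 waits indefinitely, 0 returns immediately, a positive value is the timeout in milliseconds).
4. Handle the returned events: after epoll_wait returns, iterate over the events array and handle each entry according to its file descriptor and event type.
5. Close the epoll instance:
   close(epfd);
   Once the epoll instance is no longer needed, call close to release its resources.

That is the basic flow for using epoll. It can monitor events on a large number of file descriptors efficiently and reduces the consumption of system resources. In practice, different event types and callback functions can be set up as needed to implement the application logic.

### Answer 3:

epoll is an efficient Linux mechanism for handling I/O events. It monitors a group of file descriptors and lets the program react when an event occurs on any of them.

The basic steps for using epoll are:

1. Call epoll_create to create an epoll handle, which is used for all subsequent operations.
2. Use epoll_ctl to register the file descriptors and events to monitor on the epoll handle. This function adds, modifies and deletes descriptors and their associated events.
3. Use epoll_wait to wait for events. With a timeout of -1, epoll_wait blocks until an event occurs on one of the descriptors, then returns the descriptors on which events occurred together with the event types.
4. Handle each event according to its type.

epoll consists of three basic system calls:

1. epoll_create creates an epoll instance and returns an epoll handle.
2. epoll_ctl operates on the epoll instance, adding, modifying and deleting file descriptors and their associated events.
3. epoll_wait waits for events and, once they occur, returns the corresponding file descriptors and event types.

The advantages of epoll include:

- It supports a large number of connections and can monitor tens of thousands of file descriptors.
- The data structure that stores the monitored descriptors (the epoll instance) is reused across calls, so the set does not have to be rebuilt each time.
- Waiting with epoll_wait avoids scanning every descriptor on each call, which improves efficiency.

In short, epoll is an efficient Linux mechanism for handling I/O events: a program creates and manipulates an epoll instance and waits on it in order to monitor file descriptors and handle their events.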
