1. select
2. pselect
3. poll
4. ppoll
5. epoll
5.1 General
The epoll API performs a similar task to poll(2): monitoring multiple
file descriptors to see if I/O is possible on any of them. The epoll
API can be used either as an edge-triggered or a level-triggered
interface and scales well to large numbers of watched file
descriptors. The following system calls are provided to create and
manage an epoll instance:
* epoll_create(2) creates an epoll instance and returns a file
descriptor referring to that instance. (The more recent
epoll_create1(2) extends the functionality of epoll_create(2).)
* Interest in particular file descriptors is then registered via
epoll_ctl(2). The set of file descriptors currently registered on
an epoll instance is sometimes called an epoll set.
* epoll_wait(2) waits for I/O events, blocking the calling thread if
no events are currently available.
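
The three calls above can be sketched roughly as follows. This is only a
minimal illustration on a pipe, not a complete program: the helper name
epoll_basic_demo, the 4-entry event array and the 1000 ms timeout are all
assumptions made for the example.

```c
#include <sys/epoll.h>
#include <unistd.h>

/* Hypothetical helper (name chosen for this sketch): registers the read
 * end of a pipe, makes it readable, and returns what epoll_wait(2)
 * reports: the number of ready descriptors, or -1 on any error. */
int epoll_basic_demo(void)
{
    int pipefd[2];
    struct epoll_event ev = {0}, ready[4];
    int nready = -1;

    if (pipe(pipefd) == -1)
        return -1;

    /* 1. Create the epoll instance. */
    int epfd = epoll_create1(0);
    if (epfd != -1) {
        /* 2. Register interest in "readable" on the pipe's read end. */
        ev.events = EPOLLIN;
        ev.data.fd = pipefd[0];
        if (epoll_ctl(epfd, EPOLL_CTL_ADD, pipefd[0], &ev) == 0 &&
            write(pipefd[1], "x", 1) == 1) {
            /* 3. Wait for events (timeout in ms is an assumption). */
            nready = epoll_wait(epfd, ready, 4, 1000);
        }
        close(epfd);
    }
    close(pipefd[0]);
    close(pipefd[1]);
    return nready;
}
```

Since one byte is pending on the only registered descriptor, the call is
expected to report exactly one ready descriptor.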
5.2 Level-triggered and edge-triggered
The epoll event distribution interface is able to behave both as
edge-triggered (ET) and as level-triggered (LT). The difference
between the two mechanisms can be described as follows. Suppose that
this scenario happens:
1. The file descriptor that represents the read side of a pipe (rfd)
is registered on the epoll instance.
2. A pipe writer writes 2 kB of data on the write side of the pipe.
3. A call to epoll_wait(2) is done that will return rfd as a ready
file descriptor.
4. The pipe reader reads 1 kB of data from rfd.
5. A call to epoll_wait(2) is done.
If the rfd file descriptor has been added to the epoll interface
using the EPOLLET (edge-triggered) flag, the call to epoll_wait(2)
done in step 5 will probably hang despite the available data still
present in the file input buffer; meanwhile the remote peer might be
expecting a response based on the data it already sent. The reason
for this is that edge-triggered mode delivers events only when
changes occur on the monitored file descriptor. So, in step 5 the
caller might end up waiting for some data that is already present
inside the input buffer. In the above example, an event on rfd will
be generated because of the write done in 2 and the event is consumed
in 3. Since the read operation done in 4 does not consume the whole
buffer data, the call to epoll_wait(2) done in step 5 might block
indefinitely.
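
Steps 1 to 5 can be replayed with a small sketch like the one below. The
helper name second_wait_result and the timeouts are assumptions for
illustration; a 200 ms timeout is used in step 5 so that the call returns
0 instead of blocking indefinitely.

```c
#include <stdint.h>
#include <string.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Hypothetical helper: replays steps 1-5 on a fresh pipe and returns the
 * result of the step-5 epoll_wait(2) (0 means it timed out), or -1 on
 * error. Pass EPOLLET for edge-triggered, 0 for level-triggered. */
int second_wait_result(uint32_t mode_flags)
{
    int pipefd[2];
    char buf[2048];
    struct epoll_event ev = {0}, ready;
    int result = -1;

    if (pipe(pipefd) == -1)
        return -1;
    int epfd = epoll_create1(0);
    if (epfd == -1) {
        close(pipefd[0]);
        close(pipefd[1]);
        return -1;
    }

    /* Step 1: register rfd on the epoll instance. */
    ev.events = EPOLLIN | mode_flags;
    ev.data.fd = pipefd[0];
    if (epoll_ctl(epfd, EPOLL_CTL_ADD, pipefd[0], &ev) == -1)
        goto out;

    /* Step 2: the writer sends 2 kB. */
    memset(buf, 'x', sizeof buf);
    if (write(pipefd[1], buf, sizeof buf) != (ssize_t)sizeof buf)
        goto out;

    /* Step 3: the first wait reports rfd as ready. */
    if (epoll_wait(epfd, &ready, 1, 1000) != 1)
        goto out;

    /* Step 4: the reader consumes only 1 kB. */
    if (read(pipefd[0], buf, 1024) != 1024)
        goto out;

    /* Step 5: wait again; 1 kB is still buffered in the pipe. */
    result = epoll_wait(epfd, &ready, 1, 200);

out:
    close(epfd);
    close(pipefd[0]);
    close(pipefd[1]);
    return result;
}
```

On Linux, second_wait_result(EPOLLET) should time out and return 0 even
though 1 kB is still unread, while second_wait_result(0) (the default
level-triggered mode) should return 1.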
An application that employs the EPOLLET flag should use nonblocking
file descriptors to avoid having a blocking read or write starve a
task that is handling multiple file descriptors. The suggested way
to use epoll as an edge-triggered (EPOLLET) interface is as follows:
i  with nonblocking file descriptors; and
ii by waiting for an event only after read(2) or write(2)
returns EAGAIN.
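
One possible shape for this pattern on the read side is sketched below.
The helper names set_nonblocking, drain_fd and drain_demo, as well as the
512-byte buffer, are assumptions for illustration; a real program would
process the data where the comment indicates.

```c
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: switch fd to nonblocking mode; 0 on success. */
int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1 ? -1 : 0;
}

/* Hypothetical helper: keep reading until read(2) fails with EAGAIN
 * (or hits end of file). Returns the total number of bytes consumed,
 * or -1 on any other error. */
long drain_fd(int fd)
{
    char buf[512];
    long total = 0;

    for (;;) {
        ssize_t n = read(fd, buf, sizeof buf);
        if (n > 0) {
            total += n;     /* a real program would process buf here */
        } else if (n == 0) {
            return total;   /* end of file */
        } else if (errno == EINTR) {
            continue;       /* interrupted by a signal: retry */
        } else if (errno == EAGAIN || errno == EWOULDBLOCK) {
            return total;   /* drained: safe to call epoll_wait(2) again */
        } else {
            return -1;
        }
    }
}

/* Demo: queue 2 kB in a pipe and drain it through the nonblocking end. */
long drain_demo(void)
{
    int pipefd[2];
    char data[2048];
    long got = -1;

    if (pipe(pipefd) == -1)
        return -1;
    memset(data, 'x', sizeof data);
    if (write(pipefd[1], data, sizeof data) == (ssize_t)sizeof data &&
        set_nonblocking(pipefd[0]) == 0)
        got = drain_fd(pipefd[0]);
    close(pipefd[0]);
    close(pipefd[1]);
    return got;
}
```

Because the loop stops only at EAGAIN, no data is left behind in the
buffer when the caller goes back to epoll_wait(2), which avoids the hang
described in the scenario above.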
By contrast, when used as a level-triggered interface (the default,
when EPOLLET is not specified), epoll is simply a faster poll(2), and
can be used wherever the latter is used since it shares the same
semantics.
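In other words, edge-triggered mode reports an I/O event only when the
state of the descriptor changes (for example, when new data arrives),
not for as long as the descriptor stays readable or writable;
level-triggered mode does the latter.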
Since even with edge-triggered epoll, multiple events can be
generated upon receipt of multiple chunks of data, the caller has the
option to specify the EPOLLONESHOT flag, to tell epoll to disable the
associated file descriptor after the receipt of an event with
epoll_wait(2). When the EPOLLONESHOT flag is specified, it is the
caller's responsibility to rearm the file descriptor using
epoll_ctl(2) with EPOLL_CTL_MOD.
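In other words, when the EPOLLONESHOT flag is specified, a file
descriptor is disabled in the epoll interest list once it has triggered
an I/O event. Unless the program re-arms the descriptor through
epoll_ctl(2) with EPOLL_CTL_MOD, for example:
    struct epoll_event event;
    event.data.fd = fd;
    event.events = EPOLLIN | EPOLLET | EPOLLONESHOT;
    epoll_ctl(epollfd, EPOLL_CTL_MOD, fd, &event);
no further I/O events on that descriptor are reported. That is, the
system leaves it to the user to decide whether events on a given
descriptor continue to be delivered.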
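
The disable and re-arm cycle can be observed with a sketch like the one
below. The helper name oneshot_demo, the result array r[] and the 200 ms
timeouts are assumptions for illustration only.

```c
#include <sys/epoll.h>
#include <unistd.h>

/* Hypothetical helper: stores the results of three epoll_wait(2) calls
 * in r[]. Returns 0 on success, -1 if any setup step fails. */
int oneshot_demo(int r[3])
{
    int pipefd[2];
    struct epoll_event ev = {0}, ready;
    int rc = -1;

    if (pipe(pipefd) == -1)
        return -1;
    int epfd = epoll_create1(0);
    if (epfd == -1) {
        close(pipefd[0]);
        close(pipefd[1]);
        return -1;
    }

    ev.events = EPOLLIN | EPOLLET | EPOLLONESHOT;
    ev.data.fd = pipefd[0];
    if (epoll_ctl(epfd, EPOLL_CTL_ADD, pipefd[0], &ev) == -1)
        goto out;

    /* First chunk: the event is delivered, then the fd is disabled. */
    if (write(pipefd[1], "a", 1) != 1)
        goto out;
    r[0] = epoll_wait(epfd, &ready, 1, 200);

    /* Second chunk: new data arrives, but nothing is reported. */
    if (write(pipefd[1], "b", 1) != 1)
        goto out;
    r[1] = epoll_wait(epfd, &ready, 1, 200);

    /* Re-arm with EPOLL_CTL_MOD; the pending data is reported again. */
    ev.events = EPOLLIN | EPOLLET | EPOLLONESHOT;
    ev.data.fd = pipefd[0];
    if (epoll_ctl(epfd, EPOLL_CTL_MOD, pipefd[0], &ev) == -1)
        goto out;
    r[2] = epoll_wait(epfd, &ready, 1, 200);

    rc = 0;
out:
    close(epfd);
    close(pipefd[0]);
    close(pipefd[1]);
    return rc;
}
```

On Linux the three waits are expected to return 1, 0 and 1: the second
write would normally produce a fresh edge-triggered event, but the
descriptor stays silent until it is re-armed.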
5.3 Example