epoll
epoll - I/O event notification facility
The epoll API performs a similar task to poll(2): monitoring multiple file descriptors to see if I/O is possible on any of them. The epoll API can be used either as an edge-triggered or a level-triggered interface and scales well to large numbers of watched file descriptors.
1. epoll_create(2) creates an epoll instance and returns a file descriptor referring to that instance. (The more recent epoll_create1(2) extends the functionality of epoll_create(2).)
2. Interest in particular file descriptors is then registered via epoll_ctl(2). The set of file descriptors currently registered on an epoll instance is sometimes called an epoll set.
3. epoll_wait(2) waits for I/O events, blocking the calling thread if no events are currently available.
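The three steps above can be sketched as one small helper. This is only an illustration and not part of the manual pages; the function name wait_for_readable and its timeout_ms parameter are assumptions made for the example.

```c
#include <sys/epoll.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if an event is reported for fd within
 * timeout_ms milliseconds, 0 on timeout, and -1 on error. */
int wait_for_readable(int fd, int timeout_ms)
{
    /* Step 1: create the epoll instance. */
    int epfd = epoll_create1(0);
    if (epfd == -1)
        return -1;

    /* Step 2: register interest in input on fd. */
    struct epoll_event ev = { .events = EPOLLIN, .data.fd = fd };
    if (epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev) == -1) {
        close(epfd);
        return -1;
    }

    /* Step 3: wait for an event (or for the timeout to expire). */
    struct epoll_event ready;
    int n = epoll_wait(epfd, &ready, 1, timeout_ms);

    close(epfd);    /* destroys the instance; fd itself stays open */
    return n;
}
```

A real program would normally create the epoll instance once and reuse it for many calls to epoll_wait(2) rather than creating one per wait.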
epoll_create
SYNOPSIS
#include <sys/epoll.h>
int epoll_create(int size);
int epoll_create1(int flags);
DESCRIPTION
epoll_create() creates an epoll(7) instance. Since Linux 2.6.8, the size argument is ignored, but must be greater than zero; see NOTES below.
epoll_create() returns a file descriptor referring to the new epoll instance. This file descriptor is used for all the subsequent calls to the epoll interface. When no longer required, the
file descriptor returned by epoll_create() should be closed by using close(2). When all file descriptors referring to an epoll instance have been closed, the kernel destroys the instance and
releases the associated resources for reuse.
epoll_create1()
If flags is 0, then, other than the fact that the obsolete size argument is dropped, epoll_create1() is the same as epoll_create(). The following value can be included in flags to obtain
different behavior:
EPOLL_CLOEXEC
Set the close-on-exec (FD_CLOEXEC) flag on the new file descriptor. See the description of the O_CLOEXEC flag in open(2) for reasons why this may be useful.
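The effect of EPOLL_CLOEXEC can be observed with fcntl(2) and F_GETFD. The following is a minimal sketch; the helper names has_cloexec and cloexec_of_new_instance are hypothetical.

```c
#include <fcntl.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Returns 1 if FD_CLOEXEC is set on fd, 0 if it is clear, -1 on error. */
int has_cloexec(int fd)
{
    int flags = fcntl(fd, F_GETFD);
    if (flags == -1)
        return -1;
    return (flags & FD_CLOEXEC) ? 1 : 0;
}

/* Creates an epoll instance with the given epoll_create1() flags and
 * reports whether the new descriptor has close-on-exec set. */
int cloexec_of_new_instance(int create_flags)
{
    int epfd = epoll_create1(create_flags);
    if (epfd == -1)
        return -1;

    int result = has_cloexec(epfd);
    close(epfd);
    return result;
}
```

With these helpers, cloexec_of_new_instance(0) should report the flag as clear and cloexec_of_new_instance(EPOLL_CLOEXEC) should report it as set.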
RETURN VALUE
On success, these system calls return a nonnegative file descriptor. On error, -1 is returned, and errno is set to indicate the error.
ERRORS
EINVAL size is not positive.
EINVAL (epoll_create1()) Invalid value specified in flags.
EMFILE The per-user limit on the number of epoll instances imposed by /proc/sys/fs/epoll/max_user_instances was encountered. See epoll(7) for further details.
EMFILE The per-process limit on the number of open file descriptors has been reached.
ENFILE The system-wide limit on the total number of open files has been reached.
ENOMEM There was insufficient memory to create the kernel object.
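The two EINVAL cases listed above are easy to reproduce. The sketch below uses hypothetical wrappers (create_errno and create1_errno) that return 0 on success or the errno value on failure; the flag value 1 is assumed to be an invalid flag because it is not EPOLL_CLOEXEC.

```c
#include <errno.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Returns 0 if epoll_create(size) succeeds, otherwise the errno value. */
int create_errno(int size)
{
    int epfd = epoll_create(size);
    if (epfd == -1)
        return errno;
    close(epfd);
    return 0;
}

/* Returns 0 if epoll_create1(flags) succeeds, otherwise the errno value. */
int create1_errno(int flags)
{
    int epfd = epoll_create1(flags);
    if (epfd == -1)
        return errno;
    close(epfd);
    return 0;
}
```

For example, create_errno(0) and create1_errno(1) are both expected to return EINVAL, while create_errno(1) and create1_errno(EPOLL_CLOEXEC) are expected to return 0.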
VERSIONS
epoll_create() was added to the kernel in version 2.6. Library support is provided in glibc starting with version 2.3.2.
epoll_create1() was added to the kernel in version 2.6.27. Library support is provided in glibc starting with version 2.9.
epoll_ctl
SYNOPSIS
#include <sys/epoll.h>
int epoll_ctl(int epfd, int op, int fd, struct epoll_event *event);
DESCRIPTION
This system call performs control operations on the epoll(7) instance referred to by the file descriptor epfd. It requests that the operation op be performed for the target file descriptor,
fd.
Valid values for the op argument are:
EPOLL_CTL_ADD
Register the target file descriptor fd on the epoll instance referred to by the file descriptor epfd and associate the event event with the internal file linked to fd.
EPOLL_CTL_MOD
Change the event event associated with the target file descriptor fd.
EPOLL_CTL_DEL
Remove (deregister) the target file descriptor fd from the epoll instance referred to by epfd. The event is ignored and can be NULL (but see BUGS below).
The event argument describes the object linked to the file descriptor fd. The struct epoll_event is defined as:
typedef union epoll_data {
void *ptr;
int fd;
uint32_t u32;
uint64_t u64;
} epoll_data_t;
struct epoll_event {
uint32_t events; /* Epoll events */
epoll_data_t data; /* User data variable */
};
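Because epoll_data_t is a union, only one of its members can carry a value at a time. A common pattern is to store a pointer to per-descriptor state in data.ptr instead of the descriptor number in data.fd. The sketch below illustrates this; struct conn and the helpers watch_conn and next_ready_conn are hypothetical names introduced for the example.

```c
#include <stddef.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Hypothetical per-descriptor state kept by the application. */
struct conn {
    int fd;
    const char *name;
};

/* Registers c->fd for input events and attaches c as the user data. */
int watch_conn(int epfd, struct conn *c)
{
    struct epoll_event ev;

    ev.events = EPOLLIN;
    ev.data.ptr = c;    /* union: data.fd is not usable once ptr is set */
    return epoll_ctl(epfd, EPOLL_CTL_ADD, c->fd, &ev);
}

/* Returns the struct conn attached to the next ready descriptor,
 * or NULL on timeout or error. */
struct conn *next_ready_conn(int epfd, int timeout_ms)
{
    struct epoll_event ev;
    int n = epoll_wait(epfd, &ev, 1, timeout_ms);

    return n == 1 ? ev.data.ptr : NULL;
}
```

The kernel never interprets the data member; it simply hands back whatever was stored at registration time, so the pointed-to object must stay alive for as long as the descriptor is registered.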
The events member is a bit mask composed using the following available event types:
EPOLLIN
The associated file is available for read(2) operations.
EPOLLOUT
The associated file is available for write(2) operations.
EPOLLRDHUP (since Linux 2.6.17)
Stream socket peer closed connection, or shut down writing half of connection. (This flag is especially useful for writing simple code to detect peer shutdown when using Edge Triggered monitoring.)
EPOLLPRI
There is urgent data available for read(2) operations.
EPOLLERR
Error condition happened on the associated file descriptor. epoll_wait(2) will always wait for this event; it is not necessary to set it in events.
EPOLLHUP
Hang up happened on the associated file descriptor. epoll_wait(2) will always wait for this event; it is not necessary to set it in events. Note that when reading from a channel such
as a pipe or a stream socket, this event merely indicates that the peer closed its end of the channel. Subsequent reads from the channel will return 0 (end of file) only after all
outstanding data in the channel has been consumed.
EPOLLET
Sets the Edge Triggered behavior for the associated file descriptor. The default behavior for epoll is Level Triggered. See epoll(7) for more detailed information about Edge and
Level Triggered event distribution architectures.
EPOLLONESHOT (since Linux 2.6.2)
Sets the one-shot behavior for the associated file descriptor. This means that after an event is pulled out with epoll_wait(2) the associated file descriptor is internally disabled
and no other events will be reported by the epoll interface. The user must call epoll_ctl() with EPOLL_CTL_MOD to rearm the file descriptor with a new event mask.
EPOLLWAKEUP (since Linux 3.5)
If EPOLLONESHOT and EPOLLET are clear and the process has the CAP_BLOCK_SUSPEND capability, ensure that the system does not enter "suspend" or "hibernate" while this event is pending
or being processed. The event is considered as being "processed" from the time when it is returned by a call to epoll_wait(2) until the next call to epoll_wait(2) on the same epoll(7)
file descriptor, the closure of that file descriptor, the removal of the event file descriptor with EPOLL_CTL_DEL, or the clearing of EPOLLWAKEUP for the event file descriptor with
EPOLL_CTL_MOD. See also BUGS.
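The differences between the default level-triggered behavior, EPOLLET, and EPOLLONESHOT, and the fact that EPOLLHUP is reported without being requested, can be observed with a short experiment. The sketch below is an illustration under stated assumptions, not a reference implementation: the helper names poll_sequence and mask_after_writer_close are hypothetical, a UNIX domain socket pair stands in for a network connection, and the data written is never read.

```c
#include <stdint.h>
#include <sys/epoll.h>
#include <sys/socket.h>
#include <unistd.h>

/* Registers one end of a socket pair with EPOLLIN | flags and records the
 * result of four zero-timeout epoll_wait() calls in out[]:
 *   out[0] after one byte was sent, out[1] with nothing new,
 *   out[2] after a second byte was sent, out[3] after EPOLL_CTL_MOD.
 * Returns 0 on success, -1 if the setup failed. */
int poll_sequence(uint32_t flags, int out[4])
{
    int sv[2], rc = -1;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1)
        return -1;

    int epfd = epoll_create1(0);
    struct epoll_event ev = { .events = EPOLLIN | flags, .data.fd = sv[0] };
    struct epoll_event got;

    if (epfd != -1 && write(sv[1], "a", 1) == 1 &&
        epoll_ctl(epfd, EPOLL_CTL_ADD, sv[0], &ev) == 0) {
        out[0] = epoll_wait(epfd, &got, 1, 0);
        out[1] = epoll_wait(epfd, &got, 1, 0);
        if (write(sv[1], "b", 1) == 1) {
            out[2] = epoll_wait(epfd, &got, 1, 0);
            if (epoll_ctl(epfd, EPOLL_CTL_MOD, sv[0], &ev) == 0) {
                out[3] = epoll_wait(epfd, &got, 1, 0);
                rc = 0;
            }
        }
    }

    if (epfd != -1)
        close(epfd);
    close(sv[0]);
    close(sv[1]);
    return rc;
}

/* Closes the write end of a pipe (optionally leaving one unread byte in
 * it) and returns the event mask reported for the read end, which was
 * registered with EPOLLIN only. Returns 0 if no event was obtained. */
uint32_t mask_after_writer_close(int leave_data)
{
    int p[2];
    uint32_t mask = 0;

    if (pipe(p) == -1)
        return 0;

    int ok = !leave_data || write(p[1], "x", 1) == 1;
    close(p[1]);    /* the peer goes away */

    int epfd = epoll_create1(0);
    struct epoll_event ev = { .events = EPOLLIN, .data.fd = p[0] };
    struct epoll_event got;

    if (ok && epfd != -1 &&
        epoll_ctl(epfd, EPOLL_CTL_ADD, p[0], &ev) == 0 &&
        epoll_wait(epfd, &got, 1, 0) == 1)
        mask = got.events;

    if (epfd != -1)
        close(epfd);
    close(p[0]);
    return mask;
}
```

With flags set to 0 every wait is expected to report the descriptor, because unread data is still present. With EPOLLET the second wait should report nothing until new data arrives. With EPOLLONESHOT nothing further should be reported, even after new data, until the descriptor is rearmed with EPOLL_CTL_MOD.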
RETURN VALUE
When successful, epoll_ctl() returns zero. When an error occurs, epoll_ctl() returns -1 and errno is set appropriately.
ERRORS
EBADF epfd or fd is not a valid file descriptor.
EEXIST op was EPOLL_CTL_ADD, and the supplied file descriptor fd is already registered with this epoll instance.
EINVAL epfd is not an epoll file descriptor, or fd is the same as epfd, or the requested operation op is not supported by this interface.
ENOENT op was EPOLL_CTL_MOD or EPOLL_CTL_DEL, and fd is not registered with this epoll instance.
ENOMEM There was insufficient memory to handle the requested op control operation.
ENOSPC The limit imposed by /proc/sys/fs/epoll/max_user_watches was encountered while trying to register (EPOLL_CTL_ADD) a new file descriptor on an epoll instance. See epoll(7) for further
details.
EPERM The target file fd does not support epoll. This error can occur if fd refers to, for example, a regular file or a directory.
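Several of the errors above can be triggered deliberately, which is a convenient way to check the registration logic of a program. The following sketch uses two hypothetical helpers, ctl_errno and add_directory_errno, that return 0 on success or the errno value set by epoll_ctl().

```c
#include <errno.h>
#include <fcntl.h>
#include <stdint.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Performs one epoll_ctl() operation and returns 0 on success,
 * otherwise the errno value. */
int ctl_errno(int epfd, int op, int fd, uint32_t events)
{
    struct epoll_event ev = { .events = events, .data.fd = fd };

    return epoll_ctl(epfd, op, fd, &ev) == -1 ? errno : 0;
}

/* Tries to register a directory, which does not support epoll.
 * Returns the ctl_errno() result, or -1 if the directory cannot be opened. */
int add_directory_errno(int epfd, const char *path)
{
    int dfd = open(path, O_RDONLY);
    if (dfd == -1)
        return -1;

    int err = ctl_errno(epfd, EPOLL_CTL_ADD, dfd, EPOLLIN);
    close(dfd);
    return err;
}
```

For instance, adding the same descriptor twice is expected to yield EEXIST on the second call, modifying or deleting an unregistered descriptor ENOENT, and add_directory_errno(epfd, "/") EPERM.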
VERSIONS
epoll_ctl() was added to the kernel in version 2.6.
CONFORMING TO
epoll_ctl() is Linux-specific. Library support is provided in glibc starting with version 2.3.2.
epoll_wait
SYNOPSIS
#include <sys/epoll.h>
int epoll_wait(int epfd, struct epoll_event *events,
int maxevents, int timeout);
int epoll_pwait(int epfd, struct epoll_event *events,
int maxevents, int timeout,
const sigset_t *sigmask);
DESCRIPTION
The epoll_wait() system call waits for events on the epoll(7) instance referred to by the file descriptor epfd. The memory area pointed to by events will contain the events that will be
available for the caller. Up to maxevents are returned by epoll_wait(). The maxevents argument must be greater than zero.
The timeout argument specifies the number of milliseconds that epoll_wait() will block. The call will block until either:
* a file descriptor delivers an event;
* the call is interrupted by a signal handler; or
* the timeout expires.
Note that the timeout interval will be rounded up to the system clock granularity, and kernel scheduling delays mean that the blocking interval may overrun by a small amount. Specifying a
timeout of -1 causes epoll_wait() to block indefinitely, while specifying a timeout equal to zero causes epoll_wait() to return immediately, even if no events are available.
The struct epoll_event is defined as:
typedef union epoll_data {
void *ptr;
int fd;
uint32_t u32;
uint64_t u64;
} epoll_data_t;
struct epoll_event {
uint32_t events; /* Epoll events */
epoll_data_t data; /* User data variable */
};
The data of each returned structure will contain the same data the user set with an epoll_ctl(2) (EPOLL_CTL_ADD, EPOLL_CTL_MOD) while the events member will contain the returned event bit
field.
epoll_pwait()
The relationship between epoll_wait() and epoll_pwait() is analogous to the relationship between select(2) and pselect(2): like pselect(2), epoll_pwait() allows an application to safely wait
until either a file descriptor becomes ready or until a signal is caught.
The following epoll_pwait() call:
ready = epoll_pwait(epfd, &events, maxevents, timeout, &sigmask);
is equivalent to atomically executing the following calls:
sigset_t origmask;
pthread_sigmask(SIG_SETMASK, &sigmask, &origmask);
ready = epoll_wait(epfd, &events, maxevents, timeout);
pthread_sigmask(SIG_SETMASK, &origmask, NULL);
The sigmask argument may be specified as NULL, in which case epoll_pwait() is equivalent to epoll_wait().
RETURN VALUE
When successful, epoll_wait() returns the number of file descriptors ready for the requested I/O, or zero if no file descriptor became ready during the requested timeout milliseconds. When
an error occurs, epoll_wait() returns -1 and errno is set appropriately.
ERRORS
EBADF epfd is not a valid file descriptor.
EFAULT The memory area pointed to by events is not accessible with write permissions.
EINTR The call was interrupted by a signal handler before either (1) any of the requested events occurred or (2) the timeout expired; see signal(7).
EINVAL epfd is not an epoll file descriptor, or maxevents is less than or equal to zero.
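The error cases above suggest two small habits: validate maxevents, and decide explicitly what to do on EINTR. The sketch below shows a probe for the errno value and a retry wrapper; both helper names (wait_errno and epoll_wait_retry) are hypothetical, and restarting with the full timeout after EINTR is a simplification.

```c
#include <errno.h>
#include <sys/epoll.h>

/* Calls epoll_wait() with a zero timeout and returns 0 on success,
 * otherwise the errno value. maxevents is capped at the buffer size. */
int wait_errno(int epfd, int maxevents)
{
    struct epoll_event events[8];

    if (maxevents > 8)
        maxevents = 8;
    return epoll_wait(epfd, events, maxevents, 0) == -1 ? errno : 0;
}

/* Repeats epoll_wait() while it is interrupted by a signal handler.
 * Note that each retry starts again with the full timeout. */
int epoll_wait_retry(int epfd, struct epoll_event *events,
                     int maxevents, int timeout)
{
    int n;

    do {
        n = epoll_wait(epfd, events, maxevents, timeout);
    } while (n == -1 && errno == EINTR);
    return n;
}
```

For example, wait_errno(epfd, 0) is expected to return EINVAL for a valid epfd, and wait_errno(-1, 1) is expected to return EBADF.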
VERSIONS
epoll_wait() was added to the kernel in version 2.6. Library support is provided in glibc starting with version 2.3.2.
epoll_pwait() was added to Linux in kernel 2.6.19. Library support is provided in glibc starting with version 2.6.
example
#include <sys/epoll.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define MAX_SIZE 5
#define MAX_BUF_SIZE 128

/*
 * Watch stdin with epoll and echo every chunk read from it.
 * Input starting with "exit", or end of input, terminates the program.
 */
int main(void) {
    struct epoll_event ev, events[MAX_SIZE];
    char buf[MAX_BUF_SIZE];

    int epoll_fd = epoll_create1(0);
    if (epoll_fd == -1) {
        perror("epoll_create1");
        exit(EXIT_FAILURE);
    }

    ev.events = EPOLLIN;
    ev.data.fd = STDIN_FILENO;
    if (epoll_ctl(epoll_fd, EPOLL_CTL_ADD, STDIN_FILENO, &ev) == -1) {
        perror("epoll_ctl: stdin");
        exit(EXIT_FAILURE);
    }

    while (1) {
        /* A timeout of -1 blocks until at least one event is available. */
        int nfds = epoll_wait(epoll_fd, events, MAX_SIZE, -1);
        if (nfds == -1) {
            perror("epoll_wait");
            exit(EXIT_FAILURE);
        }

        for (int n = 0; n < nfds; ++n) {
            if (events[n].data.fd != STDIN_FILENO)
                continue;

            /* Leave room for the terminating null byte. */
            ssize_t len = read(STDIN_FILENO, buf, MAX_BUF_SIZE - 1);
            if (len <= 0) {     /* read error or end of input */
                close(epoll_fd);
                return len == 0 ? EXIT_SUCCESS : EXIT_FAILURE;
            }
            buf[len] = '\0';

            if (strncmp("exit", buf, 4) == 0) {
                printf("will exit!\n");
                close(epoll_fd);
                return EXIT_SUCCESS;
            }
            printf("receive : len[%zd] %s\n", len, buf);
        }
    }
}