epoll source code analysis: the sys_epoll_wait() function

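This article analyzes the epoll mechanism in the Linux kernel in detail, focusing on the sys_epoll_wait function: the situations in which it returns the EBADF error, how the timeout is handled, and how epoll_wait behaves for its different return values. It also covers how key functions such as ep_poll and ep_scan_ready_list work.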

I. The sys_epoll_wait() function

The source code, with annotations, is shown below:

/*
 * Implement the event wait interface for the eventpoll file. It is the kernel
 * part of the user space epoll_wait(2).
 */
SYSCALL_DEFINE4(epoll_wait, int, epfd, struct epoll_event __user *, events,
		int, maxevents, int, timeout)
{
	int error;
	struct file *file;
	struct eventpoll *ep;

	/* The maximum number of event must be greater than zero */
	/*
	 * Validate the maxevents argument.
	 */
	if (maxevents <= 0 || maxevents > EP_MAX_EVENTS)
		return -EINVAL;

	/* Verify that the area passed by the user is writeable */
	/*
	 * Check that the user-space memory pointed to by events is writable. See __range_not_ok().
	 */
	if (!access_ok(VERIFY_WRITE, events, maxevents * sizeof(struct epoll_event))) {
		error = -EFAULT;
		goto error_return;
	}

	/* Get the "struct file *" for the eventpoll file */
	/*
	 * Get the file instance of the eventpoll file that epfd refers to; the file structure was created in epoll_create.
	 */
	error = -EBADF;
	file = fget(epfd);
	if (!file)
		goto error_return;

	/*
	 * We have to check that the file structure underneath the fd
	 * the user passed to us _is_ an eventpoll file.
	 */
	/*
	 * Determine whether epfd is an eventpoll file by checking whether
	 * its file operations are eventpoll_fops. If they are not,
	 * return the EINVAL error.
	 */
	error = -EINVAL;
	if (!is_file_epoll(file))
		goto error_fput;

	/*
	 * At this point it is safe to assume that the "private_data" contains
	 * our own data structure.
	 */
	ep = file->private_data;

	/* Time to fish for events ... */
	error = ep_poll(ep, events, maxevents, timeout);

error_fput:
	fput(file);
error_return:

	return error;
}
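
The error paths in the function above can be observed from user space. The following is a minimal sketch, assuming a Linux system with glibc; the helper names (epoll_wait_errno, probe_bad_maxevents, probe_not_epoll, probe_closed_fd) are hypothetical and exist only for this illustration. Each probe triggers one of the checks and returns the errno that epoll_wait() reported.

```c
#include <assert.h>
#include <errno.h>
#include <sys/epoll.h>
#include <unistd.h>

/*
 * Hypothetical helper: call epoll_wait() once with a zero timeout and
 * return 0 on success or the errno value on failure.
 */
int epoll_wait_errno(int epfd, int maxevents)
{
	struct epoll_event events[8];

	/* The local buffer only has room for 8 events. */
	assert(maxevents <= 8);

	if (epoll_wait(epfd, events, maxevents, 0) == -1)
		return errno;
	return 0;
}

/* maxevents <= 0 fails the first check in sys_epoll_wait(). */
int probe_bad_maxevents(void)
{
	int epfd = epoll_create(1);
	int err = epoll_wait_errno(epfd, 0);

	if (epfd != -1)
		close(epfd);
	return err;
}

/* A pipe is an open descriptor, but it is not an eventpoll file. */
int probe_not_epoll(void)
{
	int fds[2];
	int err;

	if (pipe(fds) == -1)
		return errno;
	err = epoll_wait_errno(fds[0], 8);
	close(fds[0]);
	close(fds[1]);
	return err;
}

/* Once the descriptor is closed, the fget() lookup finds nothing. */
int probe_closed_fd(void)
{
	int epfd = epoll_create(1);

	if (epfd != -1)
		close(epfd);
	return epoll_wait_errno(epfd, 8);
}
```

Calling the three probes from a small main() and printing the results with strerror() should show EINVAL for the first two and EBADF for the last, matching the order of the checks in the kernel source.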

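sys_epoll_wait() is the system call behind epoll_wait(). Its main job is to collect events for files whose state has become ready: it validates its arguments, obtains the eventpoll file, and then calls ep_poll() to do the real work. Before analyzing ep_poll(), here are some mistakes that are easy to make when using epoll_wait() (the ones described next are mistakes I have made myself):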

1. Returning the EBADF error

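Unless you deliberately pass a file descriptor that does not exist, this error almost certainly means your program has a bug! As the source shows, it is returned when fget() returns NULL. The source of fget() is as follows: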

struct file *fget(unsigned int fd)
{
	struct file *file;
	struct files_struct *files = current->files;

	rcu_read_lock();
	file = fcheck_files(files, fd);
	if (file) {
		if (!atomic_long_inc_not_zero(&file->f_count)) {
			/* File object ref couldn't be taken */
			rcu_read_unlock();
			return NULL;
		}
	}
	rcu_read_unlock();

	return file;
}

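The key line is (struct files_struct *files = current->files;). It obtains the files_struct that describes the files the current process has open, and the file instance for the given fd is then looked up in that structure. If nothing is found, the fd is not among the files open in the current process, so the cause is almost certainly a flaw in the program's design. My own program failed for exactly this reason: the file descriptor was created in the parent process, but the child was then turned into a daemon (daemonizing code commonly closes the descriptors it inherited), so the files the parent had opened were no longer open in the child.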
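
The situation described above can be reproduced in a few lines. This is a minimal sketch, assuming Linux with glibc; the function name child_epoll_wait_errno and its close_in_child flag are hypothetical. Note that fork() by itself gives the child a copy of the parent's descriptor table, so the sketch assumes the daemonizing step closes the inherited descriptors, and it models that step with a plain close().

```c
#include <assert.h>
#include <errno.h>
#include <sys/epoll.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: create an epoll descriptor in the parent, fork,
 * and call epoll_wait() in the child. If close_in_child is non-zero the
 * child first closes the inherited descriptor, as a daemonizing routine
 * that closes all descriptors would. Returns the errno seen by the child
 * (0 on success), or -1 if the setup itself failed.
 */
int child_epoll_wait_errno(int close_in_child)
{
	struct epoll_event ev;
	int status;
	pid_t pid;
	int epfd = epoll_create(1);

	if (epfd == -1)
		return -1;

	pid = fork();
	if (pid == -1) {
		close(epfd);
		return -1;
	}

	if (pid == 0) {
		int err;

		if (close_in_child)
			close(epfd);	/* assumed effect of daemonizing */

		/* Zero timeout: return immediately instead of blocking. */
		err = (epoll_wait(epfd, &ev, 1, 0) == -1) ? errno : 0;

		/* The errno value is passed back through the exit status. */
		assert(err >= 0 && err <= 255);
		_exit(err);
	}

	close(epfd);
	if (waitpid(pid, &status, 0) == -1 || !WIFEXITED(status))
		return -1;
	return WEXITSTATUS(status);
}
```

With close_in_child set to 0 the child still holds the descriptor and epoll_wait() succeeds; with it set to 1 the lookup through current->files fails in the child and the call reports EBADF. One way to avoid the problem is to create the epoll descriptor after the process has been daemonized.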
2. Infinite loop (a mistake you generally won't make,