An Excellent Q&A on select/poll versus epoll

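epoll is a feature introduced in Linux kernel 2.5.44, intended to replace the older select and poll system calls. It was a qualitative leap for Linux I/O. For a detailed introduction to epoll, see Wikipedia.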

In the vast majority of cases epoll far outperforms select and poll. Speed aside, though, how do the three compare in CPU overhead and memory consumption?

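This article comes from an excellent question and answer on Stack Overflow. Besides comparing poll, select, and epoll in terms of performance and system resource consumption, it also points out where epoll falls short of select/poll in handling regular files. (Of course, none of the three actually supports regular files; the author of the answer simply considers epoll's handling of this case poor, which is a matter of opinion and does not mean that view is correct.) I originally meant to translate it into Chinese, but my translation skills are limited, and after a few sentences I found I could barely follow my own translation, so I gave up. If a capable reader is willing to translate it, please get in touch; if the quality is good, the translation can be appended after the original. Anyway, I hope readers interested in this topic will read the whole piece; I believe it will give you a deeper understanding of epoll.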

Q:

Everything I’ve read and experienced (Tornado-based apps) leads me to believe that epoll is a natural replacement for select- and poll-based networking, especially with Twisted. Which makes me paranoid: it’s pretty rare for a better technique or methodology not to come with a price.

Reading a couple dozen comparisons between epoll and alternatives shows that epoll is clearly the champion for speed and scalability, specifically that it scales in a linear fashion which is fantastic. That said, what about processor and memory utilization, is epoll still the champ?

A:

For very small numbers of sockets (varies depending on your hardware, of course, but we’re talking about something on the order of 10 or fewer), select can beat epoll in memory usage and runtime speed. Of course, for such small numbers of sockets, both mechanisms are so fast that you don’t really care about this difference in the vast majority of cases.

One clarification, though. Both select and epoll scale linearly. A big difference, though, is that the userspace-facing APIs have complexities that are based on different things. The cost of a select call goes roughly with the value of the highest numbered file descriptor you pass it. If you select on a single fd, 100, then that’s roughly twice as expensive as selecting on a single fd, 50. Adding more fds below the highest isn’t quite free, so it’s a little more complicated than this in practice, but this is a good first approximation for most implementations.
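The "twice as expensive" estimate follows from select's `nfds` argument, which is the highest descriptor number plus one and bounds how many descriptor slots have to be considered. A toy model of that first approximation (the function name is hypothetical, and real kernels are more nuanced, as the answer itself notes):

```python
def select_scan_width(fds):
    """Toy cost model: select() considers descriptor slots 0..max(fds),
    i.e. the nfds argument, regardless of how many fds are actually set."""
    return max(fds) + 1


if __name__ == "__main__":
    # Watching the single descriptor 100 spans roughly twice as many
    # slots as watching the single descriptor 50.
    print(select_scan_width([100]), select_scan_width([50]))
```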

The cost of epoll is closer to the number of file descriptors that actually have events on them. If you’re monitoring 200 file descriptors, but only 100 of them have events on them, then you’re (very roughly) only paying for those 100 active file descriptors. This is where epoll tends to offer one of its major advantages over select. If you have a thousand clients that are mostly idle, then when you use select you’re still paying for all one thousand of them. However, with epoll, it’s like you’ve only got a few – you’re only paying for the ones that are active at any given time.
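A minimal sketch of this behaviour, assuming Linux and Python's standard `select.epoll` wrapper: many mostly idle sockets are registered, a handful are made readable, and a single wait hands back only the active ones. The helper name `active_count` and the socket counts are illustrative choices, not part of the original answer.

```python
import select
import socket


def active_count(total=100, active=5):
    """Register `total` sockets with epoll, make `active` of them readable,
    and return how many events one epoll wait reports."""
    pairs = [socket.socketpair() for _ in range(total)]
    ep = select.epoll()
    try:
        for reader, _writer in pairs:
            ep.register(reader.fileno(), select.EPOLLIN)
        # Only the first `active` connections receive data; the rest stay idle.
        for _reader, writer in pairs[:active]:
            writer.send(b"x")
        events = ep.poll(timeout=1, maxevents=total)
        return len(events)
    finally:
        ep.close()
        for reader, writer in pairs:
            reader.close()
            writer.close()


if __name__ == "__main__":
    print(active_count())
```

The idle sockets never appear in the returned list, so the caller's loop does work proportional to the active connections rather than to everything that was registered.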

All this means that epoll will lead to less CPU usage for most workloads. As far as memory usage goes, it’s a bit of a toss up. select does manage to represent all the necessary information in a highly compact way (one bit per file descriptor). And the FD_SETSIZE (typically 1024) limitation on how many file descriptors you can use with select means that you’ll never spend more than 128 bytes for each of the three fd sets you can use with select (read, write, exception). Compared to those 384 bytes max, epoll is sort of a pig. Each file descriptor is represented by a multi-byte structure. However, in absolute terms, it’s still not going to use much memory. You can represent a huge number of file descriptors in a few dozen kilobytes (roughly 20k per 1000 file descriptors, I think). And you can also throw in the fact that you have to spend all 384 of those bytes with select if you only want to monitor one file descriptor but its value happens to be 1023 (the highest that fits when FD_SETSIZE is 1024), whereas with epoll you’d only spend 20 bytes. Still, all these numbers are pretty small, so it doesn’t make much difference.
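The figures in this paragraph can be reproduced with a few lines of arithmetic. This is only a worked version of the answer's own estimates: the 20 bytes per descriptor is the answer's rough guess rather than a measured kernel figure, and the function names are hypothetical.

```python
FD_SETSIZE = 1024         # typical value quoted in the answer
EPOLL_BYTES_PER_FD = 20   # the answer's rough guess, not a measurement


def select_bytes(num_sets=3):
    """One bit per descriptor, for each of the read/write/exception sets."""
    return num_sets * (FD_SETSIZE // 8)


def epoll_bytes(num_fds):
    """Per-descriptor bookkeeping under the answer's 20-byte assumption."""
    return num_fds * EPOLL_BYTES_PER_FD


if __name__ == "__main__":
    print(select_bytes(1))    # bytes for a single fd set
    print(select_bytes())     # bytes for all three sets
    print(epoll_bytes(1000))  # roughly "20k per 1000 file descriptors"
    print(epoll_bytes(1))     # a single monitored descriptor
```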

And there’s also that other benefit of epoll, which perhaps you’re already aware of, that it is not limited to FD_SETSIZE file descriptors. You can use it to monitor as many file descriptors as you have. And if you only have one file descriptor, but its value is greater than FD_SETSIZE, epoll works with that too, but select does not.
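The FD_SETSIZE ceiling is easy to observe from Python. As a sketch, assuming CPython on Linux: the standard `select.select` wrapper rejects any descriptor number at or above FD_SETSIZE with a `ValueError` before it reaches the kernel. The helper name `select_can_watch` and the value 100000 are illustrative.

```python
import select
import socket


def select_can_watch(fd):
    """Return False if CPython's select() wrapper rejects `fd` as being
    out of range (at or above FD_SETSIZE), True otherwise."""
    try:
        select.select([fd], [], [], 0)
    except ValueError:
        return False
    return True


if __name__ == "__main__":
    a, b = socket.socketpair()
    print(select_can_watch(a.fileno()))  # a small, ordinary descriptor
    print(select_can_watch(100000))      # far above FD_SETSIZE
    a.close()
    b.close()
```

epoll has no such bitmap, so a descriptor with a large number can be registered with it as long as the process is allowed to open that many files.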

Randomly, I’ve also recently discovered one slight drawback to epoll as compared to select or poll. While none of these three APIs supports normal files (i.e., files on a file system), select and poll present this lack of support by reporting such descriptors as always readable and always writable. Although this makes them unsuitable for any meaningful kind of non-blocking filesystem I/O, a program which uses select or poll and happens to encounter a file descriptor from the filesystem will at least continue to operate (or if it fails, it won’t be because of select or poll), albeit perhaps not with the best performance.

On the other hand, epoll will fail fast with an error (EPERM, apparently) when asked to monitor such a file descriptor. Strictly speaking, this is hardly incorrect. It’s merely signalling its lack of support in an explicit way. Normally I would applaud explicit failure conditions, but this one is undocumented (as far as I can tell) and results in a completely broken application, rather than one which merely operates with potentially degraded performance.
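The contrast between the two behaviours can be checked directly. The sketch below assumes Linux and Python's standard `select` module; it opens a temporary regular file, asks select about it, then tries to register it with epoll and records the error name. The helper name `probe_regular_file` is hypothetical.

```python
import errno
import select
import tempfile


def probe_regular_file():
    """Return (select_says_readable, epoll_error_name) for a regular file."""
    with tempfile.TemporaryFile() as f:
        fd = f.fileno()
        # select reports a regular file as ready straight away.
        readable, _, _ = select.select([fd], [], [], 0)
        ep = select.epoll()
        try:
            ep.register(fd, select.EPOLLIN)
            epoll_error = None
        except OSError as exc:
            # Translate the numeric errno into its symbolic name.
            epoll_error = errno.errorcode.get(exc.errno)
        finally:
            ep.close()
        return (fd in readable, epoll_error)


if __name__ == "__main__":
    print(probe_regular_file())
```

A program that may receive a redirected stdin or stdout can catch this error at registration time and fall back to plain blocking reads and writes for that descriptor, instead of failing outright.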

In practice, the only place I’ve seen this come up is when interacting with stdio. A user might redirect stdin or stdout from/to a normal file. Whereas previously stdin and stdout would have been a pipe — supported by epoll just fine — it then becomes a normal file and epoll fails loudly, breaking the application.

