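Large numbers of TIME_WAIT connections on an httpd server
Background:
From a client (172.25.254.11) I ran an ab stress test against the server (172.25.254.200) and found a large number of connections in the TIME_WAIT state on the server.
ab test on the client: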
[root@localhost linux]# ab -n 10000 -c 100 http://172.25.254.200/index.html
On the server, quickly check with netstat:
[root@dhcp conf]# netstat -antple | grep TIME_WAIT |wc -l
862
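What causes so many TIME_WAIT connections
What the TIME_WAIT state is
HTTP runs on top of TCP, a full-duplex, reliable transport protocol, so the TCP four-way close tells us the following:
The side that closes the connection first (the active closer) enters TIME_WAIT after the last step of the close, so that it can resend its final ACK if the passive closer did not receive it. TIME_WAIT lasts 2*MSL, twice the Maximum Segment Lifetime. RFC 793 sets MSL to 2 minutes, which gives 240 s by default. In practice the value is often set to 30 s, 1 min, or 2 min, and Linux fixes the TIME_WAIT period at 60 s. So whenever the server is the side that closes the connection first, the server is the one that holds the TIME_WAIT state.
Why TIME_WAIT hurts
A busy web server accumulates many TIME_WAIT entries. If the server accepts 1000 requests per second, each on its own connection, and each entry lives for 240 s, then 240*1000 = 240000 TIME_WAIT records pile up. That many TIME_WAIT entries can keep later requests from being served efficiently.
Crawler hosts and web servers are where this problem usually shows up. The host keeps each entry for 2*MSL, and only then fully closes the connection and reclaims its resources.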
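The backlog arithmetic above is simply connection rate multiplied by TIME_WAIT duration. A minimal sketch, assuming every request uses its own connection and the server always closes first; the function name tw_backlog is made up for this example:

```shell
# Estimate how many TIME_WAIT entries accumulate at steady state.
# $1 = connections closed by the server per second
# $2 = how long one TIME_WAIT entry lives, in seconds
tw_backlog() {
    echo $(( $1 * $2 ))
}

tw_backlog 1000 240   # 2*MSL with MSL = 2 minutes
tw_backlog 1000 60    # a 60 s TIME_WAIT period
```

The two calls show how strongly the estimate depends on the TIME_WAIT duration the system actually uses.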
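What if the server does not close first
If TIME_WAIT is this bad, why does the server close connections itself? Wouldn't it be better to let the client carry the TIME_WAIT state? Not really: if the server never closes connections on its own, even more resources are wasted.
To make better use of server resources, httpd has settings that make the server close client connections on its own initiative. Without them, idle connections would tie up even more resources.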
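How to fix it
Approach:
- Make the server reclaim TIME_WAIT resources as quickly as possible
- Make the server reuse TIME_WAIT resources as quickly as possible
To make each TIME_WAIT entry expire as early as possible, edit /etc/sysctl.conf and add a few parameters: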
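# Enable SYN cookies: when the SYN queue overflows, cookies are used to handle new SYNs, which guards against small SYN floods (0 turns this off)
net.ipv4.tcp_syncookies = 1
# Allow sockets in TIME_WAIT to be reused for new outgoing connections (needs TCP timestamps)
net.ipv4.tcp_tw_reuse = 1
# Speed up recycling of TIME_WAIT sockets; it breaks clients behind NAT and was removed in Linux 4.12
net.ipv4.tcp_tw_recycle = 1
# How long an orphaned connection stays in FIN_WAIT_2, in seconds
net.ipv4.tcp_fin_timeout = 30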
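Not every kernel exposes all four keys (tcp_tw_recycle, for example, no longer exists from Linux 4.12 onward), so it can help to check before editing /etc/sysctl.conf. A minimal sketch, assuming the usual /proc/sys layout; the function name sysctl_exists and its optional root-directory argument are made up for this example:

```shell
# Succeed if a sysctl key exists on this kernel, judged by its /proc/sys entry.
# $1 = key such as net.ipv4.tcp_tw_reuse
# $2 = optional root directory (hypothetical hook, defaults to /proc/sys)
sysctl_exists() {
    key=$1
    root=${2:-/proc/sys}
    # net.ipv4.tcp_tw_reuse -> net/ipv4/tcp_tw_reuse
    path=$root/$(printf '%s' "$key" | tr . /)
    [ -e "$path" ]
}

for key in net.ipv4.tcp_syncookies net.ipv4.tcp_tw_reuse \
           net.ipv4.tcp_tw_recycle net.ipv4.tcp_fin_timeout; do
    if sysctl_exists "$key"; then
        echo "$key: available"
    else
        echo "$key: not available on this kernel"
    fi
done
```

A key reported as not available would make `sysctl -p` print an error for that line, so it is better left out of the file on such a system.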
Test:
The httpd server here uses the prefork MPM.
Before changing the kernel parameters:
Client request:
[root@localhost linux]# ab -n 10000 -c 100 http://172.25.254.200/index.html
Monitoring on the server:
[root@dhcp modules]# netstat -n | awk '/^tcp/ {++S[$NF]} END {for(a in S) print a, S[a]}'
ESTABLISHED 88
TIME_WAIT 2048
[root@dhcp modules]# netstat -n | awk '/^tcp/ {++S[$NF]} END {for(a in S) print a, S[a]}'
ESTABLISHED 2
TIME_WAIT 2048
[root@dhcp modules]# netstat -n | awk '/^tcp/ {++S[$NF]} END {for(a in S) print a, S[a]}'
ESTABLISHED 2
TIME_WAIT 2048
[root@dhcp modules]# netstat -n | awk '/^tcp/ {++S[$NF]} END {for(a in S) print a, S[a]}'
ESTABLISHED 2
TIME_WAIT 2048
As the output shows, the server held 2048 connections in TIME_WAIT for a long time while the client was running its test.
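The awk one-liner used for monitoring counts TCP lines by their last field, which is the connection state. A minimal sketch of the same tally wrapped in a function, with the output sorted so that successive samples are easier to compare; the name count_states and the sample lines are made up for this example:

```shell
# Tally TCP connection states from netstat-style input on stdin.
# The state is the last field ($NF) of every line that starts with "tcp".
count_states() {
    awk '/^tcp/ { ++count[$NF] } END { for (s in count) print s, count[s] }' | sort
}

# Illustrative input only; on a real server pipe in `netstat -n` instead.
count_states <<'EOF'
tcp        0      0 172.25.254.200:80       172.25.254.11:40001     TIME_WAIT
tcp        0      0 172.25.254.200:80       172.25.254.11:40002     TIME_WAIT
tcp        0      0 172.25.254.200:22       172.25.254.11:40003     ESTABLISHED
EOF
```

On the server, running `netstat -n | count_states` repeatedly gives the same kind of samples as the monitoring output above.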
Now the ab result on the client:
[root@localhost linux]# ab -n 10000 -c 100 http://172.25.254.200/index.html
This is ApacheBench, Version 2.3 <$Revision: 1430300 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/

Benchmarking 172.25.254.200 (be patient)
Completed 1000 requests
Completed 2000 requests
Completed 3000 requests
Completed 4000 requests
Completed 5000 requests
Completed 6000 requests
Completed 7000 requests
Completed 8000 requests
Completed 9000 requests
Completed 10000 requests
Finished 10000 requests

Server Software:        Apache/2.4.6
Server Hostname:        172.25.254.200
Server Port:            80

Document Path:          /index.html
Document Length:        94 bytes

Concurrency Level:      100
Time taken for tests:   5.615 seconds
Complete requests:      10000
Failed requests:        0
Write errors:           0
Total transferred:      3720000 bytes
HTML transferred:       940000 bytes
Requests per second:    1780.85 [#/sec] (mean)
Time per request:       56.153 [ms] (mean)
Time per request:       0.562 [ms] (mean, across all concurrent requests)
Transfer rate:          646.95 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    3   3.1      2      20
Processing:     8   53  13.6     51     266
Waiting:        4   53  13.7     51     266
Total:         12   56  13.7     54     278

Percentage of the requests served within a certain time (ms)
  50%     54
  66%     58
  75%     61
  80%     63
  90%     70
  95%     76
  98%     86
  99%     94
 100%    278 (longest request)
Change the kernel parameters:
[root@dhcp conf.modules.d]# cat <<eof >>/etc/sysctl.conf
> net.ipv4.tcp_syncookies = 1
> net.ipv4.tcp_tw_reuse = 1
> net.ipv4.tcp_tw_recycle = 1
> net.ipv4.tcp_fin_timeout = 30
> eof
[root@dhcp conf.modules.d]# sysctl -p
net.ipv4.tcp_syncookies = 1
net.ipv4.tcp_tw_reuse = 1
net.ipv4.tcp_tw_recycle = 1
net.ipv4.tcp_fin_timeout = 30
Client request:
[root@localhost linux]# ab -n 10000 -c 100 http://172.25.254.200/index.html
Monitoring on the server:
[root@dhcp modules]# netstat -n | awk '/^tcp/ {++S[$NF]} END {for(a in S) print a, S[a]}'
ESTABLISHED 81
FIN_WAIT1 3
TIME_WAIT 2048
[root@dhcp modules]# netstat -n | awk '/^tcp/ {++S[$NF]} END {for(a in S) print a, S[a]}'
ESTABLISHED 86
FIN_WAIT1 5
TIME_WAIT 1518
[root@dhcp modules]# netstat -n | awk '/^tcp/ {++S[$NF]} END {for(a in S) print a, S[a]}'
ESTABLISHED 2
TIME_WAIT 197
[root@dhcp modules]# netstat -n | awk '/^tcp/ {++S[$NF]} END {for(a in S) print a, S[a]}'
ESTABLISHED 2
TIME_WAIT 197
This time the server released most of its TIME_WAIT connections within a short time: the count dropped from 2048 to 197.
Now the ab result on the client:
[root@localhost linux]# ab -n 10000 -c 100 http://172.25.254.200/index.html
This is ApacheBench, Version 2.3 <$Revision: 1430300 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/

Benchmarking 172.25.254.200 (be patient)
Completed 1000 requests
Completed 2000 requests
Completed 3000 requests
Completed 4000 requests
Completed 5000 requests
Completed 6000 requests
Completed 7000 requests
Completed 8000 requests
Completed 9000 requests
Completed 10000 requests
Finished 10000 requests

Server Software:        Apache/2.4.6
Server Hostname:        172.25.254.200
Server Port:            80

Document Path:          /index.html
Document Length:        94 bytes

Concurrency Level:      100
Time taken for tests:   4.685 seconds
Complete requests:      10000
Failed requests:        0
Write errors:           0
Total transferred:      3720000 bytes
HTML transferred:       940000 bytes
Requests per second:    2134.55 [#/sec] (mean)
Time per request:       46.848 [ms] (mean)
Time per request:       0.468 [ms] (mean, across all concurrent requests)
Transfer rate:          775.44 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    3   3.1      2      22
Processing:     8   43   7.3     43      76
Waiting:        1   43   7.3     43      75
Total:         10   47   6.9     46      77

Percentage of the requests served within a certain time (ms)
  50%     46
  66%     49
  75%     51
  80%     52
  90%     55
  95%     58
  98%     61
  99%     67
 100%     77 (longest request)
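How the key figures changed after the kernel parameters were modified:
| | Requests per second | Time per request | Transfer rate |
|---|---|---|---|
| Before | 1780.85 [#/sec] | 56.153 [ms] | 646.95 [Kbytes/sec] |
| After | 2134.55 [#/sec] | 46.848 [ms] | 775.44 [Kbytes/sec] |
Requests per second: the number of requests handled per second;
Time per request: the mean response time per request;
Transfer rate: the rate at which data was transferred.
Performance improved: in this run, requests per second and transfer rate both rose by about 20%, and the mean time per request fell by about 17%.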
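To repeat this comparison for other runs, the relative change can be computed directly from the two ab reports. A minimal sketch; the function name pct_change is made up for this example, and the inputs are the figures from the table above:

```shell
# Print the relative change from $1 (before) to $2 (after) as a percentage.
# awk does the arithmetic because the shell has no built-in floating point.
pct_change() {
    awk -v before="$1" -v after="$2" \
        'BEGIN { printf "%.1f\n", (after - before) / before * 100 }'
}

pct_change 1780.85 2134.55   # requests per second
pct_change 56.153 46.848     # time per request (negative means faster)
pct_change 646.95 775.44     # transfer rate
```

A single ab run per configuration is a small sample, so repeating each measurement a few times would make the comparison more trustworthy.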