A routine log check showed that the company's web server kept logging the following kernel messages:
Jul 5 15:40:37 mail kernel: printk: 272 messages suppressed.
Jul 5 15:40:37 mail kernel: TCP: time wait bucket table overflow
Jul 5 15:40:37 mail kernel: TCP: time wait bucket table overflow
Jul 5 15:40:43 mail kernel: printk: 92 messages suppressed.
Jul 5 15:40:43 mail kernel: TCP: time wait bucket table overflow
(That is, the kernel's table of TCP TIME_WAIT sockets is full.)
Troubleshooting steps:
1. Check the server's network connections:
[root@mail ~]# netstat -pant |awk '/^tcp/ {++state[$6]} END {for(key in state) printf("%-10s\t%d\n",key,state[key]) }'
TIME_WAIT 4944
CLOSE_WAIT 1
FIN_WAIT1 93
FIN_WAIT2 66
ESTABLISHED 292
SYN_RECV 29
CLOSING 32
LAST_ACK 9
LISTEN 14
[root@mail ~]#
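To see how close the TIME_WAIT count is to the kernel's limit, the tally above can be reduced to a single number and compared with the value in /proc/sys/net/ipv4/tcp_max_tw_buckets. The following is a minimal sketch, not part of the original procedure; the helper names count_time_wait and tw_usage are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helpers (not from the original post).

# Count TIME_WAIT sockets in netstat-style output read from stdin.
# Column 6 of "netstat -ant" is the connection state.
count_time_wait() {
    awk '/^tcp/ && $6 == "TIME_WAIT" {n++} END {print n+0}'
}

# Print the count, the limit, and the integer percentage in use.
# $1 = current TIME_WAIT count, $2 = limit (must be non-zero).
tw_usage() {
    count=$1
    limit=$2
    echo "TIME_WAIT: $count / $limit ($((count * 100 / limit))%)"
}

# On a live system (needs netstat and a readable /proc):
# tw_usage "$(netstat -ant | count_time_wait)" \
#          "$(cat /proc/sys/net/ipv4/tcp_max_tw_buckets)"
```

With the figures from this server, tw_usage 4944 5000 prints "TIME_WAIT: 4944 / 5000 (98%)", which shows the table was almost full.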
2. Check and raise the kernel parameter
vi /etc/sysctl.conf
Change: net.ipv4.tcp_max_tw_buckets = 5000
to:     net.ipv4.tcp_max_tw_buckets = 10000
3. Apply the changed kernel parameter
sysctl -p
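After reloading, it can be worth confirming that the running value matches what the file says. The sketch below assumes the usual "key = value" layout of /etc/sysctl.conf; conf_value is a hypothetical helper, and the comparison at the end is left commented out because it needs the sysctl command on a real host.

```shell
#!/bin/sh
# Hypothetical helper (not from the original post).
# Print the last value assigned to a key in a sysctl.conf-style file.
# $1 = file, $2 = key. Lines starting with # or ; are comments.
conf_value() {
    awk -F= -v key="$2" '
        /^[ \t]*[#;]/ { next }
        {
            k = $1
            gsub(/[ \t]/, "", k)
        }
        k == key {
            v = $2
            gsub(/^[ \t]+|[ \t]+$/, "", v)
            val = v
        }
        END { if (val != "") print val }
    ' "$1"
}

# On a live system:
# want=$(conf_value /etc/sysctl.conf net.ipv4.tcp_max_tw_buckets)
# have=$(sysctl -n net.ipv4.tcp_max_tw_buckets)
# if [ "$want" = "$have" ]; then echo "applied: $have"
# else echo "mismatch: file=$want running=$have"; fi
```

The helper returns the last assignment because sysctl -p applies the file top to bottom, so a later duplicate line overrides an earlier one.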
4. Check the server's network connections again:
[root@mail ~]# netstat -pant |awk '/^tcp/ {++state[$6]} END {for(key in state) printf("%-10s\t%d\n",key,state[key]) }'
TIME_WAIT 6644
CLOSE_WAIT 1
FIN_WAIT1 93
FIN_WAIT2 66
ESTABLISHED 292
SYN_RECV 29
CLOSING 32
LAST_ACK 9
LISTEN 14
5. Check /var/log/messages and dmesg again. The error no longer appears, so net.ipv4.tcp_max_tw_buckets = 10000 seems to be enough for now.
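Rather than scanning the log by eye, the overflow lines can be counted. This is a small sketch under the assumption that the message text is exactly the one shown at the top of this post; count_overflow is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper (not from the original post).
# Count "time wait bucket table overflow" lines in log text read from stdin.
count_overflow() {
    awk '/TCP: time wait bucket table overflow/ {n++} END {print n+0}'
}

# On a live system:
# count_overflow < /var/log/messages
# dmesg | count_overflow
```

Note that the kernel rate-limits these messages (the "printk: N messages suppressed" lines), so the count is a lower bound on how often the table overflowed. Running it before and after the change, and checking that the number stops growing, is a simple way to confirm the fix.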
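6. Cause
The number of sockets in the TIME_WAIT state exceeded the kernel's configured maximum, net.ipv4.tcp_max_tw_buckets. In step 1 the server already had 4944 TIME_WAIT sockets against a limit of 5000. Once the limit is reached, the kernel destroys further TIME_WAIT sockets immediately and prints this warning.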
Reposted from: https://blog.51cto.com/lzk2xx/1710537