The project started hitting socket connection timeouts and broken pipes.
Checking nginx turned up this error in its log:
recv() failed (104: Connection reset by peer) while reading response header from upstream
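A "Connection reset by peer" entry usually points to one of the following:
(1) the number of concurrent connections exceeded what the server can handle, so the server dropped some of them;
(2) the client closed the browser while the server was still sending it data;
(3) the user pressed Stop in the browser.
Here the reset was reported on the connection to the upstream, so the upstream side is the place to look.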
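To see which upstream is resetting connections, the error log can be tallied per upstream. This is a minimal sketch: the function name count_resets_by_upstream is made up for illustration, and it assumes the log lines carry the usual upstream: "..." field that nginx appends to upstream errors.

```shell
# Read nginx error-log lines on stdin and print "<count> <upstream>" for
# every upstream that produced a "104: Connection reset by peer" error.
count_resets_by_upstream() {
    awk '/104: Connection reset by peer/ {
        if (match($0, /upstream: "[^"]*"/)) {
            # skip the 11-character prefix and the closing quote
            u = substr($0, RSTART + 11, RLENGTH - 12)
            n[u]++
        }
    }
    END { for (u in n) print n[u], u }' | sort -rn
}
```

Typical use would be count_resets_by_upstream < /var/log/nginx/error.log, where the log path is an assumption and depends on the error_log directive.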
Check the resource limits on Linux (the "open files" value caps how many sockets a process can hold):
ulimit -a
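Note that ulimit -a reports the limits of the current shell, which may differ from those of a running daemon. A small sketch for reading the limits of a specific process from /proc/<pid>/limits follows; the function name open_files_limits and the PID file path in the comment are assumptions.

```shell
# Read a /proc/<pid>/limits file on stdin and print the soft and hard
# "Max open files" limits of that process.
open_files_limits() {
    awk '/^Max open files/ { print "soft=" $4, "hard=" $5 }'
}
# Example (assumes nginx writes its master PID to /run/nginx.pid):
#   open_files_limits < "/proc/$(cat /run/nginx.pid)/limits"
```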
Check the number of concurrent requests on the web server (Nginx, Apache) and the state of its TCP connections:
netstat -n | awk '/^tcp/ {++S[$NF]} END {for(a in S) print a, S[a]}'
# or, tab-separated:
netstat -n | awk '/^tcp/ {++state[$NF]} END {for(key in state) print key,"\t",state[key]}'
# count httpd processes (the [h] keeps grep from counting itself)
ps aux | grep '[h]ttpd' | wc -l
ps -ef | grep '[h]ttpd' | wc -l
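On systems where netstat is not installed, the same tally can be made from ss output. This is a sketch under that assumption; tally_ss_states is a name made up here, and ss prints the state in the first column with slightly different names (ESTAB, TIME-WAIT).

```shell
# Read `ss -tan` output on stdin and print "<state><TAB><count>" per state.
tally_ss_states() {
    awk 'NR > 1 { ++s[$1] } END { for (k in s) print k "\t" s[k] }' | sort
}
# Example:
#   ss -tan | tally_ss_states
```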
Check the system log:
tail -f /var/log/messages
Dec 10 11:31:36 localhost kernel: TCP: time wait bucket table overflow
Dec 10 11:31:36 localhost kernel: TCP: time wait bucket table overflow
Dec 10 11:31:36 localhost kernel: TCP: time wait bucket table overflow
Dec 10 11:31:36 localhost kernel: TCP: time wait bucket table overflow
Dec 10 11:31:36 localhost kernel: TCP: time wait bucket table overflow
Dec 10 11:31:41 localhost kernel: __ratelimit: 2039 callbacks suppressed
Dec 10 11:31:41 localhost kernel: TCP: time wait bucket table overflow
Dec 10 11:31:41 localhost kernel: TCP: time wait bucket table overflow
Dec 10 11:31:41 localhost kernel: TCP: time wait bucket table overflow
Dec 10 11:31:41 localhost kernel: TCP: time wait bucket table overflow
Dec 10 11:31:41 localhost kernel: TCP: time wait bucket table overflow
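The __ratelimit line means the kernel suppressed further rate-limited messages, so the visible lines likely understate how often the overflow happened. A rough way to summarize a log excerpt is sketched below; summarize_tw_overflow is a hypothetical helper, and treating every suppressed callback as an overflow message is an assumption.

```shell
# Read syslog lines on stdin; count logged overflow messages and add up
# the "N callbacks suppressed" figures reported by the rate limiter.
summarize_tw_overflow() {
    awk '/time wait bucket table overflow/ { logged++ }
    /__ratelimit: [0-9]+ callbacks suppressed/ {
        for (i = 1; i < NF; i++)
            if ($i == "__ratelimit:") suppressed += $(i + 1)
    }
    END { printf "logged=%d suppressed=%d\n", logged, suppressed }'
}
# Example:
#   summarize_tw_overflow < /var/log/messages
```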
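How it works:
The kernel uses the ip_conntrack module (nf_conntrack on newer kernels) to track the state of packets passing through iptables, and keeps that state in a table (/proc/net/ip_conntrack or /proc/net/nf_conntrack).
Under heavy traffic, such as a large number of connections or high concurrency, the table gradually fills up.
The table is normally large enough that it rarely fills, and it cleans itself up: an entry stays in the table until the source IP sends an RST packet or the entry times out.
But under an attack, or with a bad network configuration, a faulty route or router, or a faulty NIC, the RST packet from the source IP may never arrive,
so entries pile up until the table is full. Once it is full, iptables drops packets and outside hosts can no longer connect to the server.
Once the cause is known the fix is simple: either enlarge the table so it can record more connections (at the cost of a little memory), or unload the ip_conntrack module. The "time wait bucket table overflow" message itself is governed by a separate limit, tcp_max_tw_buckets, which is what the next steps check.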
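Whether the conntrack table is actually close to full can be checked from its counters. The sketch below assumes a kernel that exposes nf_conntrack_count and nf_conntrack_max under /proc/sys/net/netfilter; the function names are made up for illustration.

```shell
# Print conntrack table usage as a whole-number percentage.
conntrack_usage_pct() {
    count=$1
    max=$2
    echo $(( count * 100 / max ))
}

# Read the live counters; the files only exist while nf_conntrack is loaded.
conntrack_report() {
    dir=/proc/sys/net/netfilter
    if [ -r "$dir/nf_conntrack_count" ] && [ -r "$dir/nf_conntrack_max" ]; then
        count=$(cat "$dir/nf_conntrack_count")
        max=$(cat "$dir/nf_conntrack_max")
        echo "conntrack: $count / $max ($(conntrack_usage_pct "$count" "$max")%)"
    else
        echo "conntrack: module not loaded"
    fi
}
```

Running conntrack_report on the affected host shows how much of the table is in use; a value near 100% would support the conntrack explanation.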
1. Check the current number of TCP TIME_WAIT connections
netstat -an | grep TIME_WAIT | wc -l
6880
2. Check the time wait bucket setting
cat /proc/sys/net/ipv4/tcp_max_tw_buckets
5000
The TIME_WAIT count has clearly exceeded the configured value (5000).
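The two checks can be combined into one comparison. This is only a sketch; tw_headroom is a hypothetical helper that takes the current TIME_WAIT count and the configured limit as arguments.

```shell
# Compare a TIME_WAIT count against the tcp_max_tw_buckets limit.
tw_headroom() {
    current=$1
    limit=$2
    if [ "$current" -gt "$limit" ]; then
        echo "over by $(( current - limit ))"
    else
        echo "under by $(( limit - current ))"
    fi
}
# Live usage:
#   tw_headroom "$(netstat -an | grep -c TIME_WAIT)" \
#               "$(cat /proc/sys/net/ipv4/tcp_max_tw_buckets)"
```

With the figures above, tw_headroom 6880 5000 reports that the count is over the limit by 1880.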
Fix: increase the table capacity
Tune the Linux kernel parameters
vim /etc/sysctl.conf
# Disable IP forwarding
net.ipv4.ip_forward = 0
# Enable reverse path filtering
net.ipv4.conf.all.rp_filter= 1
net.ipv4.conf.default.rp_filter = 1
# Do not accept source-routed packets
net.ipv4.conf.all.accept_source_route = 0
net.ipv4.conf.default.accept_source_route = 0
kernel.sysrq = 0
kernel.core_uses_pid = 1
kernel.msgmnb = 65536
kernel.msgmax = 65536
kernel.shmmax = 68719476736
kernel.shmall = 4294967296
# TTL
net.ipv4.ip_default_ttl = 64
# Reboot automatically 1 second after a kernel panic
kernel.panic = 1
# Maximum PID value (32768 is the traditional default, so this line does not raise it)
kernel.pid_max = 32768
# TCP connection setup settings, to cope with DDoS (SYN flood) attacks
net.ipv4.tcp_syncookies = 1
net.ipv4.tcp_synack_retries = 1
net.ipv4.tcp_syn_retries = 1
net.ipv4.tcp_max_syn_backlog = 262144
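# TCP connection teardown settings, to cope with a high TIME_WAIT count
net.ipv4.tcp_max_tw_buckets = 6000
# Caution: tcp_tw_recycle is known to drop connections from clients behind NAT
# and was removed in Linux 4.12; both it and tcp_tw_reuse rely on TCP
# timestamps, which the tcp_timestamps = 0 line below turns off.
net.ipv4.tcp_tw_recycle = 1
net.ipv4.tcp_tw_reuse = 0
net.ipv4.tcp_timestamps = 0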
net.ipv4.tcp_fin_timeout = 10
net.ipv4.ip_local_port_range = 10000 65000
# Memory usage settings
net.core.wmem_default = 8388608
net.core.rmem_default = 8388608
net.core.rmem_max = 16777216
net.core.wmem_max = 16777216
net.ipv4.tcp_rmem = 4096 87380 4194304
net.ipv4.tcp_wmem = 4096 16384 4194304
# Note: tcp_mem is measured in memory pages, not bytes
net.ipv4.tcp_mem = 94500000 915000000 927000000
# TCP keepalive settings
net.ipv4.tcp_keepalive_time = 30
net.ipv4.tcp_keepalive_intvl = 15
net.ipv4.tcp_keepalive_probes = 5
# Other TCP-related tuning
net.core.somaxconn = 262144
net.core.netdev_max_backlog = 262144
net.ipv4.tcp_max_orphans = 3276800
net.ipv4.tcp_sack = 1
net.ipv4.tcp_window_scaling = 1
# Apply the settings
/sbin/sysctl -p
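After sysctl -p it is worth confirming that the running kernel accepted each value, since keys that a kernel does not support are rejected. The following is a rough sketch; sysctl_conf_pairs and check_sysctl are names invented here, not part of any tool.

```shell
# Read a sysctl.conf on stdin; print normalized "key=value" lines,
# skipping blank lines and comments.
sysctl_conf_pairs() {
    awk -F= '!/^[ \t]*(#|;|$)/ && NF >= 2 {
        key = $1
        val = substr($0, index($0, "=") + 1)
        gsub(/^[ \t]+|[ \t]+$/, "", key)
        gsub(/^[ \t]+|[ \t]+$/, "", val)
        gsub(/[ \t]+/, " ", val)
        print key "=" val
    }'
}

# Compare each configured value with what the running kernel reports.
check_sysctl() {
    sysctl_conf_pairs | while IFS='=' read -r key want; do
        have=$(sysctl -n "$key" 2>/dev/null | tr -s ' \t' ' ')
        [ "$have" = "$want" ] || echo "MISMATCH $key: want '$want', have '$have'"
    done
}
# Example:
#   check_sysctl < /etc/sysctl.conf
```

A key that prints as a mismatch with an empty current value is usually one the kernel does not provide.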
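After this change the project no longer reported connection timeouts or broken pipes.
The system log still shows the following entry, though, about once a minute: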
Dec 10 15:06:18 localhost kernel: possible SYN flooding on port 7012. Sending cookies.
Dec 10 15:07:20 localhost kernel: possible SYN flooding on port 7012. Sending cookies.
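This message is logged when the SYN queue of a listening socket overflows. The backlog a socket actually gets is the smaller of the value the application passes to listen() and net.core.somaxconn, so raising the sysctl alone may not be enough for the service on port 7012. One way to inspect this is sketched below: for LISTEN sockets, ss -ltn shows the current accept queue length in Recv-Q and the backlog limit in Send-Q. The function name listen_backlog is an assumption.

```shell
# Read `ss -ltn` output on stdin and print the accept-queue length and
# backlog limit of listening sockets on the given port.
listen_backlog() {
    port=$1
    awk -v port="$port" 'NR > 1 && $1 == "LISTEN" {
        n = split($4, parts, ":")
        if (parts[n] == port) print $4, "queued=" $2, "backlog=" $3
    }'
}
# Example:
#   ss -ltn | listen_backlog 7012
```

If the reported backlog is much smaller than somaxconn, the limit comes from the application's own listen() call or configuration.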
Thanks to:
https://blog.csdn.net/u010098331/article/details/50730002
http://blog.51cto.com/qiangsh/2050974
https://blog.csdn.net/a_bang/article/details/64930109
http://www.cnblogs.com/sunmmi/articles/6809377.html
https://blog.csdn.net/heiyeshuwu/article/details/45692407
http://blog.51cto.com/nanchunle/1657410
https://www.cnblogs.com/wanghuaijun/p/7214319.html