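I ran into a problem where a client kept failing to connect to Redis. The client connects through an nginx proxy, so I also tried bypassing nginx and connecting to the Redis address directly, but that connection was unstable too and dropped every so often.
This part of the configuration had been running for over a year, so it never occurred to me that nginx might be the problem. I checked the network in every way I could think of and, after a long time, still had no conclusion.
Then I happened to glance at the nginx error log and found a whole new world... Clearly, enabling the nginx error log in production really, really matters.
The nginx error log looked like this: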
2020/03/31 16:02:10 [alert] 25058#0: *5913973074 open socket #228 left in connection 182
2020/03/31 16:02:10 [alert] 25058#0: *5915419717 open socket #151 left in connection 183
2020/03/31 16:02:10 [alert] 25058#0: *5915419718 open socket #152 left in connection 184
...
It looked as though too many connections were open.
I then checked the network state with netstat:
tcp 0 0 xxx:42992 192.168.1.105:6379 TIME_WAIT
tcp 0 0 xxx:43010 192.168.1.105:6379 TIME_WAIT
tcp 0 0 xxx:35034 192.168.1.105:6379 TIME_WAIT
tcp 0 0 xxx:60994 192.168.1.105:6379 TIME_WAIT
tcp 0 0 xxx:35976 192.168.1.105:6379 TIME_WAIT
tcp 0 0 xxx:34210 192.168.1.105:6379 TIME_WAIT
...
An enormous number of TIME_WAIT entries...
After some searching, the suggestion was to raise this setting in the nginx configuration:
events {
worker_connections 10240; # previously 1024, the value from the default config
}
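After the change, the number of TIME_WAIT connections did not go down, but the service that connects to Redis worked normally again, and nginx stopped logging errors.
Appendix: the nginx documentation on worker_connections.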
Sets the maximum number of simultaneous connections that can be opened by a worker process.
It should be kept in mind that this number includes all connections (e.g. connections with proxied servers, among others), not only connections with clients. Another consideration is that the actual number of simultaneous connections cannot exceed the current limit on the maximum number of open files, which can be changed by worker_rlimit_nofile.
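
Going by the documentation quoted above, a larger worker_connections value only helps if the open-file limit is at least as large. Below is a minimal sketch of how the two settings might be combined. It assumes the Redis proxy uses the nginx stream module (plain TCP proxying). The worker_rlimit_nofile value, the upstream name redis_backend and the listen port are illustrative assumptions, not taken from the original configuration.

```nginx
# Assumed value: should be at least as large as worker_connections,
# since every connection consumes one file descriptor.
worker_rlimit_nofile 20480;

events {
    # Counts connections to clients AND to proxied servers, so each
    # proxied Redis session uses two of these.
    worker_connections 10240;
}

# Assumption: Redis is proxied as plain TCP through the stream module.
stream {
    upstream redis_backend {            # hypothetical upstream name
        server 192.168.1.105:6379;      # Redis address seen in the netstat output
    }

    server {
        listen 6379;                    # assumed listen port
        proxy_pass redis_backend;
    }
}
```

Because each proxied session holds one client-side and one upstream-side connection, a worker with worker_connections 10240 can serve at most about 5120 simultaneous Redis clients in a setup like this.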