Since HTTP services often have to absorb heavy traffic, it is worth tuning a few parameters so that ingress-nginx performs at its best.
1. Kernel tuning:
# sysctl -p
kernel.pid_max = 65535
net.ipv4.tcp_tw_reuse = 1
net.ipv4.tcp_tw_recycle = 0
net.ipv4.tcp_syncookies = 0
net.ipv4.tcp_keepalive_time = 600
net.ipv4.tcp_fin_timeout = 30
net.ipv4.tcp_max_tw_buckets = 5000
net.ipv4.ip_local_port_range = 1024 61000
net.ipv4.tcp_max_syn_backlog = 262144
net.ipv4.tcp_syn_retries = 2
After applying these settings, the number of connections in TIME_WAIT drops by more than half.
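To make the settings above survive a reboot, they can be written to a file under /etc/sysctl.d and reloaded. This is a sketch assuming a systemd-style layout; the file name 90-ingress-nginx.conf is arbitrary and the commands require root:

```shell
# Persist the kernel settings across reboots, then reload all sysctl files
cat <<'EOF' >/etc/sysctl.d/90-ingress-nginx.conf
kernel.pid_max = 65535
net.ipv4.tcp_tw_reuse = 1
net.ipv4.tcp_tw_recycle = 0
net.ipv4.tcp_syncookies = 0
net.ipv4.tcp_keepalive_time = 600
net.ipv4.tcp_fin_timeout = 30
net.ipv4.tcp_max_tw_buckets = 5000
net.ipv4.ip_local_port_range = 1024 61000
net.ipv4.tcp_max_syn_backlog = 262144
net.ipv4.tcp_syn_retries = 2
EOF
sysctl --system
```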
2. Docker startup parameter tuning:
$ cat docker.service
LimitMEMLOCK=1288490188800
LimitSTACK=infinity
LimitNPROC=infinity
LimitNOFILE=196605
LimitCORE=infinity
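Rather than editing docker.service in place, the same limits can be applied through a systemd drop-in, which survives package upgrades. A sketch, assuming the standard systemd layout and root access:

```shell
# Create a drop-in carrying the limits from the section above,
# then reload systemd and restart Docker to pick them up
mkdir -p /etc/systemd/system/docker.service.d
cat <<'EOF' >/etc/systemd/system/docker.service.d/limits.conf
[Service]
LimitMEMLOCK=1288490188800
LimitSTACK=infinity
LimitNPROC=infinity
LimitNOFILE=196605
LimitCORE=infinity
EOF
systemctl daemon-reload
systemctl restart docker
```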
3. Raise the system open-file limits:
# cat /etc/security/limits.d/20-nproc.conf
# add the following lines:
root soft nofile 196605
root hard nofile 196605
* soft nofile 196605
* hard nofile 196605
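After logging in again, the new limit can be verified from any shell; this check reads the current process's own limits and works on any Linux:

```shell
# Show the soft limit seen by the shell
ulimit -n
# Show soft and hard open-file limits from /proc (columns 4 and 5)
awk '/Max open files/ {print $4, $5}' /proc/self/limits
```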
4. Miscellaneous:
4.1 net.ipv4.tcp_syncookies = 0
0 disables the kernel's SYN-flood protection; 1 enables it.
For large-scale concurrency testing, set it to 0; otherwise the kernel will conclude the system is under a SYN flood attack and turn on its automatic protection, logging:
Dec 22 21:40:23 etcd-host2 kernel: TCP: request_sock_TCP: Possible SYN flooding on port 80. Dropping request. Check SNMP counters.
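The "Check SNMP counters" hint in that log line refers to the TcpExt counters in /proc/net/netstat. A small sketch that pairs counter names with values and prints only the SYN-cookie ones:

```shell
# Both TcpExt lines in /proc/net/netstat start with "TcpExt:"; the first
# holds counter names, the second holds their values. Print the
# Syncookies* counters (all zero if SYN cookies never fired).
awk '/^TcpExt:/ {
    if (!hdr) { for (i = 1; i <= NF; i++) name[i] = $i; hdr = 1 }
    else { for (i = 2; i <= NF; i++) if (name[i] ~ /Syncookies/) print name[i], $i }
}' /proc/net/netstat
```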
5. Check the limit settings of the ingress-nginx-controller process:
# ps ax |grep nginx |grep master
14004 ? S 0:00 nginx: master process /usr/sbin/nginx -c /etc/nginx/nginx.conf
15118 ? Ss 0:00 nginx: master process nginx -g daemon off;
# cat /proc/15118/limits
Limit                     Soft Limit     Hard Limit     Units
Max cpu time              unlimited      unlimited      seconds
Max file size             unlimited      unlimited      bytes
Max data size             unlimited      unlimited      bytes
Max stack size            unlimited      unlimited      bytes
Max core file size        unlimited      unlimited      bytes
Max resident set          unlimited      unlimited      bytes
Max processes             unlimited      unlimited      processes
Max open files            196605         196605         files
Max locked memory         1288490188800  1288490188800  bytes
Max address space         unlimited      unlimited      bytes
Max file locks            unlimited      unlimited      locks
Max pending signals       15078          15078          signals
Max msgqueue size         819200         819200         bytes
Max nice priority         0              0
Max realtime priority     0              0
Max realtime timeout      unlimited      unlimited      us
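The same check can be run without shelling into the node, via kubectl exec. A sketch; the pod name is taken from the example above and must be adjusted to your cluster, and `pgrep -o` (oldest matching process, i.e. the master) is assumed to be available in the controller image:

```shell
# Print the limits of the nginx master process inside the controller pod
kubectl -n ingress-nginx exec nginx-ingress-controller-8fcbbc8d7-wwr56 -- \
  sh -c 'cat /proc/$(pgrep -o nginx)/limits'
```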
6. For stress testing, wrk is recommended (https://blog.csdn.net/kozazyh/article/details/86763096):
$ wrk -t100 -c6000 -d60s -H "Host: my.test.com" --latency http://my.test.com/
Running 1m test @ http://my.test.com/
100 threads and 3000 connections
Thread Stats Avg Stdev Max +/- Stdev
Latency 198.82ms 276.58ms 2.00s 91.03%
Req/Sec 34.03 22.62 225.00 64.64%
Latency Distribution
50% 107.37ms
75% 216.68ms
90% 436.46ms
99% 1.52s
189283 requests in 1.00m, 156.92MB read
Socket errors: connect 0, read 0, write 0, timeout 19547
Non-2xx or 3xx responses: 1289
Requests/sec: 3149.40
Transfer/sec: 2.61MB
While the stress test runs, the statistics can be watched in real time (see: https://blog.csdn.net/kozazyh/article/details/86763144).
Sometimes during stress testing the pod (nginx-ingress-controller) is observed to restart. There are several possible causes:
- With net.ipv4.tcp_tw_recycle = 1, testing with ab or wrk makes the nginx controller restart.
- Insufficient resources on the k8s node also cause restarts. (This one matters a lot.)
- Insufficient or unavailable downstream resources also cause restarts (for example, the ingress carries 6000 concurrent connections while the downstream nginx is limited to 1024 connections, causing a backlog). With the proxy-next-upstream: http_502 http_503 http_504 option enabled in the ConfigMap, the ingress controller log reads:
2018/12/24 14:32:49 [error] 61#61: *50903 upstream prematurely closed connection while reading response header from upstream, client: 192.168.5.102, server: my.test.com, request: "GET / HTTP/1.1", upstream: "http://172.30.53.4:80/", host: "my.test.com"
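The option mentioned above can be enabled by patching the controller's ConfigMap. A sketch; the ConfigMap name nginx-configuration matches the stock deploy/mandatory.yaml and may differ in your installation:

```shell
# Tell the controller to retry the next upstream on 502/503/504
kubectl -n ingress-nginx patch configmap nginx-configuration \
  --type merge -p '{"data":{"proxy-next-upstream":"http_502 http_503 http_504"}}'
```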
If the pod logs show that the health check triggered the restart, a temporary workaround is to delete the health checks from deploy/mandatory.yaml:
$ kubectl describe po nginx-ingress-controller-8fcbbc8d7-wwr56 -n ingress-nginx
.....
Normal Killing 1m (x2 over 2m) kubelet, 192.168.5.86 Killing container with id docker://nginx-ingress-controller:Container failed liveness probe.. Container will be killed and recreated.
Warning Unhealthy 1m (x6 over 2m) kubelet, 192.168.5.86 Liveness probe failed: Get http://172.30.18.5:10254/healthz: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
Warning Unhealthy 1m (x8 over 2m) kubelet, 192.168.5.86 Readiness probe failed: Get http://172.30.18.5:10254/healthz: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
......
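Instead of deleting the probes outright, relaxing them is often enough. A sketch that raises the liveness-probe timeout in place; the deployment name and the JSON path assume the stock single-container manifest, and the livenessProbe field must already exist for the replace op to succeed:

```shell
# Give the liveness probe more time before kubelet kills the container
kubectl -n ingress-nginx patch deployment nginx-ingress-controller --type json \
  -p '[{"op":"replace","path":"/spec/template/spec/containers/0/livenessProbe/timeoutSeconds","value":10}]'
```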
Compared with nginx installed directly on Linux, the current version (0.22.0) achieves only about 1/3 of the throughput in stress tests. If you do run into performance problems, you can also increase the number of nginx-ingress-controller replicas to spread the load.
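Scaling out the controller as suggested is a one-liner; the deployment name below comes from the stock manifest, and on a DaemonSet-based installation you would instead label more nodes:

```shell
# Run three controller replicas to spread the ingress load
kubectl -n ingress-nginx scale deployment nginx-ingress-controller --replicas=3
```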