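I recently deployed Zabbix monitoring on a web server and added monitoring of TCP connection states, but Zabbix frequently reported that keys in the TCP template were not supported.
For example: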
item "xxxxxx:tcp.status[listen]" became not supported: Received value [39 39] is not suitable for value type [Numeric (unsigned)] and data type [Decimal]
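The TCP connection states are monitored with the template from "Monitoring TCP connection states with Zabbix", which uses netstat to collect the TCP connection information: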
/bin/netstat -an|awk '/^tcp/{++S[$NF]}END{for(a in S) print a,S[a]}'
Running the command by hand on this host:
$ time /bin/netstat -an|awk '/^tcp/{++S[$NF]}END{for(a in S) print a,S[a]}'
TIME_WAIT 17164
ESTABLISHED 4330
SYN_RECV 2
LAST_ACK 1
LISTEN 39
real    0m8.944s
user    0m0.495s
sys     0m8.508s
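The command took 8.944 seconds, and there were 17164 connections in TIME_WAIT. My conclusion is that each time Zabbix calls the script, netstat runs for so long that the writes of its result to /tmp/tcp_status.txt overlap between invocations, which is why the file sometimes contains two LISTEN lines.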
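The value "[39 39]" in the error message fits this explanation. The sketch below reproduces it with a hand-made status file, assuming the netstat version of the script extracted the value with the same awk and unquoted echo pattern as the ss version of the script in this post. The file contents here are made up to imitate two overlapping writes.

```shell
#!/bin/bash
# Reproduce the "39 39" value with a hand-made status file.
# The duplicated LISTEN line imitates two overlapping writes (made-up data).
tmp_file=$(mktemp)
printf 'LISTEN 39\nESTAB 4330\nLISTEN 39\n' > "$tmp_file"

# Same extraction pattern as the monitoring script: print column 2 of
# every line that matches LISTEN. With two matches, output holds two lines.
output=$(awk '/LISTEN/{print $2}' "$tmp_file")

# Unquoted $output is split on the newline, and echo joins the words
# with a single space, so Zabbix receives one line with two numbers.
reported=$(echo $output)
echo "$reported"

rm -f "$tmp_file"
```

Zabbix then rejects the value because "39 39" is not a single unsigned integer.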
So I switched to the ss command and modified the script.
$ whereis ss
ss: /usr/sbin/ss /usr/share/man/man8/ss.8.gz
$ rpm -qf /usr/sbin/ss
iproute-2.6.32-23.el6.x86_64
ss is another command that offers much the same functionality as netstat.
ss -t -a shows all TCP sockets
ss -u -a shows all UDP sockets
Results after switching to ss:
$ time ss -tan|awk 'NR>1{++S[$1]}END{for (a in S) print a,S[a]}'
SYN-RECV 1
ESTAB 4687
TIME-WAIT 20250
LISTEN 39
real    0m0.238s
user    0m0.192s
sys     0m0.083s
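As the timings show, ss has a clear advantage when the server has a very large number of connections: 0.238 seconds here versus 8.944 seconds for netstat.
So the monitoring script needs to be changed to collect the states with ss.
The ss man page does not document the State column of the output. Reading misc/ss.c in the iproute source shows that the State column can take the following values: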
static const char *sstate_name[] = {
"UNKNOWN",
[TCP_ESTABLISHED] = "ESTAB",
[TCP_SYN_SENT] = "SYN-SENT",
[TCP_SYN_RECV] = "SYN-RECV",
[TCP_FIN_WAIT1] = "FIN-WAIT-1",
[TCP_FIN_WAIT2] = "FIN-WAIT-2",
[TCP_TIME_WAIT] = "TIME-WAIT",
[TCP_CLOSE] = "UNCONN",
[TCP_CLOSE_WAIT] = "CLOSE-WAIT",
[TCP_LAST_ACK] = "LAST-ACK",
[TCP_LISTEN] = "LISTEN",
[TCP_CLOSING] = "CLOSING",
};
ESTAB
SYN-SENT
SYN-RECV
FIN-WAIT-1
FIN-WAIT-2
TIME-WAIT
UNCONN
CLOSE-WAIT
LAST-ACK
LISTEN
CLOSING
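Note that the ss names differ from the netstat ones (ESTAB instead of ESTABLISHED, hyphens instead of underscores, UNCONN for the closed state), so each Zabbix key argument has to be mapped to the new spelling. A minimal sketch of that mapping follows; the metric names are the ones the monitoring script accepts, while the function name ss_state_name is hypothetical and only used for illustration.

```shell
#!/bin/bash
# Map the argument of the Zabbix key tcp.status[<metric>] to the State
# string printed by ss. ss_state_name is a hypothetical helper name.
# Prints nothing and returns 1 for an unknown metric.
ss_state_name() {
    case "$1" in
        closed)      echo "UNCONN" ;;
        listen)      echo "LISTEN" ;;
        synrecv)     echo "SYN-RECV" ;;
        synsent)     echo "SYN-SENT" ;;
        established) echo "ESTAB" ;;
        timewait)    echo "TIME-WAIT" ;;
        closing)     echo "CLOSING" ;;
        closewait)   echo "CLOSE-WAIT" ;;
        lastack)     echo "LAST-ACK" ;;
        finwait1)    echo "FIN-WAIT-1" ;;
        finwait2)    echo "FIN-WAIT-2" ;;
        *)           return 1 ;;
    esac
}

ss_state_name timewait   # prints TIME-WAIT
```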
So I modified the script:
#!/bin/bash
#this script is used to get tcp and udp connection status
#tcp status
metric=$1
tmp_file=/tmp/tcp_status.txt
#/bin/netstat -an|awk '/^tcp/{++S[$NF]}END{for(a in S) print a,S[a]}' > $tmp_file
/usr/sbin/ss -tan|awk 'NR>1{++S[$1]}END{for (a in S) print a,S[a]}' > $tmp_file

#ESTAB
#SYN-SENT
#SYN-RECV
#FIN-WAIT-1
#FIN-WAIT-2
#TIME-WAIT
#UNCONN
#CLOSE-WAIT
#LAST-ACK
#LISTEN
#CLOSING

case $metric in
    closed)
        output=$(awk '/UNCONN/{print $2}' $tmp_file)
        if [ "$output" == "" ];then
            echo 0
        else
            echo $output
        fi
        ;;
    listen)
        output=$(awk '/LISTEN/{print $2}' $tmp_file)
        if [ "$output" == "" ];then
            echo 0
        else
            echo $output
        fi
        ;;
    synrecv)
        output=$(awk '/SYN-RECV/{print $2}' $tmp_file)
        if [ "$output" == "" ];then
            echo 0
        else
            echo $output
        fi
        ;;
    synsent)
        output=$(awk '/SYN-SENT/{print $2}' $tmp_file)
        if [ "$output" == "" ];then
            echo 0
        else
            echo $output
        fi
        ;;
    established)
        output=$(awk '/ESTAB/{print $2}' $tmp_file)
        if [ "$output" == "" ];then
            echo 0
        else
            echo $output
        fi
        ;;
    timewait)
        output=$(awk '/TIME-WAIT/{print $2}' $tmp_file)
        if [ "$output" == "" ];then
            echo 0
        else
            echo $output
        fi
        ;;
    closing)
        output=$(awk '/CLOSING/{print $2}' $tmp_file)
        if [ "$output" == "" ];then
            echo 0
        else
            echo $output
        fi
        ;;
    closewait)
        output=$(awk '/CLOSE-WAIT/{print $2}' $tmp_file)
        if [ "$output" == "" ];then
            echo 0
        else
            echo $output
        fi
        ;;
    lastack)
        output=$(awk '/LAST-ACK/{print $2}' $tmp_file)
        if [ "$output" == "" ];then
            echo 0
        else
            echo $output
        fi
        ;;
    finwait1)
        output=$(awk '/FIN-WAIT-1/{print $2}' $tmp_file)
        if [ "$output" == "" ];then
            echo 0
        else
            echo $output
        fi
        ;;
    finwait2)
        output=$(awk '/FIN-WAIT-2/{print $2}' $tmp_file)
        if [ "$output" == "" ];then
            echo 0
        else
            echo $output
        fi
        ;;
    *)
        echo -e "\e[033mUsage: sh $0 [closed|closing|closewait|synrecv|synsent|finwait1|finwait2|listen|established|lastack|timewait]\e[0m"
esac
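The script above still writes to the shared file /tmp/tcp_status.txt on every call, so overlapping invocations could in principle still collide; ss only makes the window much shorter. As a possible variation (not the script used in this post), the count can be computed directly in a pipeline with no temporary file. This is a minimal sketch: the function name count_state is hypothetical, and the sample input is made up so the example runs without ss.

```shell
#!/bin/bash
# count_state STATE (hypothetical helper)
# Reads "ss -tan"-style output on stdin, skips the header line, and prints
# the number of rows whose first column equals STATE exactly (0 if none).
count_state() {
    awk -v want="$1" 'NR > 1 && $1 == want { n++ } END { print n + 0 }'
}

# On a real host the input would come from ss, for example:
#   /usr/sbin/ss -tan | count_state LISTEN
# Here a small made-up sample stands in for the ss output.
sample='State Recv-Q Send-Q Local-Address:Port Peer-Address:Port
LISTEN 0 128 *:80 *:*
ESTAB 0 0 10.0.0.1:80 10.0.0.2:51234
LISTEN 0 128 *:22 *:*'

printf '%s\n' "$sample" | count_state LISTEN     # prints 2
printf '%s\n' "$sample" | count_state TIME-WAIT  # prints 0
```

Because awk always prints exactly one number, the result is a single value even when a state is absent. The trade-off is that every item check runs ss once, which seems acceptable at the roughly 0.2 seconds measured above but would not have been with netstat.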
After switching from netstat to ss for collecting the connection statistics, I received no more alerts.
Reposted from: https://blog.51cto.com/john88wang/1705239