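I recently deployed Zabbix monitoring on a web server and added monitoring of TCP connection states, but Zabbix frequently reports that keys in the TCP template are not supported.

For example: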

item "xxxxxx:tcp.status[listen]" became not supported: Received value [39 39] is not suitable for value type [Numeric (unsigned)] and data type [Decimal]


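The TCP connection states are monitored with the template from "Monitoring TCP connection states with Zabbix" (使用zabbix监控TCP连接状态), which uses netstat to collect the TCP connection information: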

/bin/netstat -an|awk '/^tcp/{++S[$NF]}END{for(a in S) print a,S[a]}'
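To see what the awk part of this pipeline does on its own, here is a minimal sketch that feeds it a few made-up netstat-style lines instead of live output. The sample addresses and the function name count_by_last_field are assumptions for illustration only; the awk program is the one from the template, with sort added because awk does not guarantee the order of a for-in loop.

```shell
#!/bin/bash
# Count lines that start with "tcp", grouped by their last field (the state column).
count_by_last_field() {
    awk '/^tcp/{++S[$NF]}END{for(a in S) print a,S[a]}' | sort
}

# Made-up sample resembling `netstat -an` output (not real data).
sample='Proto Recv-Q Send-Q Local Address Foreign Address State
tcp 0 0 0.0.0.0:80 0.0.0.0:* LISTEN
tcp 0 0 10.0.0.1:80 10.0.0.2:51000 ESTABLISHED
tcp 0 0 10.0.0.1:80 10.0.0.3:51001 ESTABLISHED
udp 0 0 0.0.0.0:123 0.0.0.0:*'

# The header and the udp line do not match /^tcp/ and are ignored.
printf '%s\n' "$sample" | count_by_last_field
```

Each output line is a state name followed by its count, which is the format the monitoring script later reads back from its status file.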


Running it by hand on this host:

$ time /bin/netstat -an|awk '/^tcp/{++S[$NF]}END{for(a in S) print a,S[a]}'

TIME_WAIT 17164

ESTABLISHED 4330

SYN_RECV 2

LAST_ACK 1

LISTEN 39


real    0m8.944s

user    0m0.495s

sys     0m8.508s


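The command took 8.944 seconds, with 17164 connections in TIME_WAIT. My guess is that because each invocation of the script by Zabbix takes this long, successive runs overlap while the netstat result is being written to /tmp/tcp_status.txt, which is why the file sometimes ends up with two LISTEN lines.

So I switched to the ss command and modified the script.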
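The following sketch reproduces the symptom only, not the race itself: it assumes the status file already contains two LISTEN lines and shows how the lookup used by the monitoring script then returns "39 39", the value Zabbix rejected. The function name lookup_listen and the file contents are hypothetical.

```shell
#!/bin/bash
# Mimics the LISTEN lookup of the monitoring script against a given status file.
lookup_listen() {
    local output
    output=$(awk '/LISTEN/{print $2}' "$1")
    # Unquoted on purpose, as in the script: two matching lines collapse into one line.
    echo $output
}

status_file=$(mktemp)
# Hypothetical file contents after two overlapping writes.
printf 'LISTEN 39\nESTABLISHED 4330\nLISTEN 39\n' > "$status_file"
lookup_listen "$status_file"   # prints: 39 39
rm -f "$status_file"
```

A value with a space in it cannot be parsed as an unsigned decimal, which matches the "not suitable for value type" message.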


$ whereis ss

ss: /usr/sbin/ss /usr/share/man/man8/ss.8.gz

$ rpm -qf /usr/sbin/ss

iproute-2.6.32-23.el6.x86_64


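ss is another command that does much the same job as netstat; on this host it is provided by the iproute package.

ss -t -a  shows all TCP connections

ss -u -a  shows all UDP connections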


Results after switching to ss:

$ time ss  -tan|awk 'NR>1{++S[$1]}END{for (a in S) print a,S[a]}'

SYN-RECV 1

ESTAB 4687

TIME-WAIT 20250

LISTEN 39


real    0m0.238s

user    0m0.192s

sys     0m0.083s


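This shows that ss has a clear advantage when the server has a very large number of connections: 0.238 seconds against 8.944 seconds for netstat.

So the monitoring script needs to be changed to get the states from ss.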

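The ss man page does not describe the values of the State column in the output. Reading misc/ss.c in the iproute source shows that the State column can take the following values: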

static const char *sstate_name[] = {
        "UNKNOWN",
        [TCP_ESTABLISHED] = "ESTAB",
        [TCP_SYN_SENT] = "SYN-SENT",
        [TCP_SYN_RECV] = "SYN-RECV",
        [TCP_FIN_WAIT1] = "FIN-WAIT-1",
        [TCP_FIN_WAIT2] = "FIN-WAIT-2",
        [TCP_TIME_WAIT] = "TIME-WAIT",
        [TCP_CLOSE] = "UNCONN",
        [TCP_CLOSE_WAIT] = "CLOSE-WAIT",
        [TCP_LAST_ACK] = "LAST-ACK",
        [TCP_LISTEN] =  "LISTEN",
        [TCP_CLOSING] = "CLOSING",
};


ESTAB

SYN-SENT

SYN-RECV

FIN-WAIT-1

FIN-WAIT-2

TIME-WAIT

UNCONN

CLOSE-WAIT

LAST-ACK

LISTEN

CLOSING
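These names differ from the netstat ones seen earlier (for example TIME-WAIT instead of TIME_WAIT, ESTAB instead of ESTABLISHED), so the metric names used in the Zabbix keys have to be mapped to the ss spelling. A minimal sketch of that mapping follows; the function name ss_state_for_metric is an assumption, while the metric names are the ones used by the monitoring script.

```shell
#!/bin/bash
# Translate a Zabbix metric name into the State string printed by ss.
# Returns non-zero for an unknown metric.
ss_state_for_metric() {
    case "$1" in
        closed)      echo UNCONN ;;
        listen)      echo LISTEN ;;
        synrecv)     echo SYN-RECV ;;
        synsent)     echo SYN-SENT ;;
        established) echo ESTAB ;;
        timewait)    echo TIME-WAIT ;;
        closing)     echo CLOSING ;;
        closewait)   echo CLOSE-WAIT ;;
        lastack)     echo LAST-ACK ;;
        finwait1)    echo FIN-WAIT-1 ;;
        finwait2)    echo FIN-WAIT-2 ;;
        *)           return 1 ;;
    esac
}

ss_state_for_metric timewait   # prints: TIME-WAIT
```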


So I modified the script:

#!/bin/bash
#this script is used to get tcp connection status
#tcp status
metric=$1
tmp_file=/tmp/tcp_status.txt
#/bin/netstat -an|awk '/^tcp/{++S[$NF]}END{for(a in S) print a,S[a]}' > $tmp_file
/usr/sbin/ss  -tan|awk 'NR>1{++S[$1]}END{for (a in S) print a,S[a]}' > $tmp_file

#ESTAB
#SYN-SENT
#SYN-RECV
#FIN-WAIT-1
#FIN-WAIT-2
#TIME-WAIT
#UNCONN
#CLOSE-WAIT
#LAST-ACK
#LISTEN
#CLOSING



case $metric in
   closed)
          output=$(awk '/UNCONN/{print $2}' $tmp_file)
          if [ "$output" == "" ];then
             echo 0
          else
             echo $output
          fi
        ;;
   listen)
          output=$(awk '/LISTEN/{print $2}' $tmp_file)
          if [ "$output" == "" ];then
             echo 0
          else
             echo $output
          fi
        ;;
   synrecv)
          output=$(awk '/SYN-RECV/{print $2}' $tmp_file)
          if [ "$output" == "" ];then
             echo 0
          else
             echo $output
          fi
        ;;
   synsent)
          output=$(awk '/SYN-SENT/{print $2}' $tmp_file)
          if [ "$output" == "" ];then
             echo 0
          else
             echo $output
          fi
        ;;
   established)
          output=$(awk '/ESTAB/{print $2}' $tmp_file)
          if [ "$output" == "" ];then
             echo 0
          else
             echo $output
          fi
        ;;
   timewait)
          output=$(awk '/TIME-WAIT/{print $2}' $tmp_file)
          if [ "$output" == "" ];then
             echo 0
          else
             echo $output
          fi
        ;;
   closing)
          output=$(awk '/CLOSING/{print $2}' $tmp_file)
          if [ "$output" == "" ];then
             echo 0
          else
             echo $output
          fi
        ;;
   closewait)
          output=$(awk '/CLOSE-WAIT/{print $2}' $tmp_file)
          if [ "$output" == "" ];then
             echo 0
          else
             echo $output
          fi
        ;;
   lastack)
          output=$(awk '/LAST-ACK/{print $2}' $tmp_file)
          if [ "$output" == "" ];then
             echo 0
          else
             echo $output
          fi
         ;;
   finwait1)
          output=$(awk '/FIN-WAIT-1/{print $2}' $tmp_file)
          if [ "$output" == "" ];then
             echo 0
          else
             echo $output
          fi
         ;;
   finwait2)
          output=$(awk '/FIN-WAIT-2/{print $2}' $tmp_file)
          if [ "$output" == "" ];then
             echo 0
          else
             echo $output
          fi
         ;;
         *)
          echo -e "\e[033mUsage: sh  $0 [closed|closing|closewait|synrecv|synsent|finwait1|finwait2|listen|established|lastack|timewait]\e[0m"
  
esac
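The script above still goes through the shared file /tmp/tcp_status.txt. As a hedged alternative, and not what was deployed here, the count can be taken straight from the pipe so that overlapping runs have nothing to collide on. This is a minimal sketch: the function name count_state is hypothetical, and it expects the exact ss state name as its argument.

```shell
#!/bin/bash
# Count the sockets in one state from `ss -tan` output read on stdin.
# $1 is the exact ss state name, e.g. LISTEN or TIME-WAIT.
count_state() {
    # NR>1 skips the header line; n+0 prints 0 when nothing matched.
    awk -v state="$1" 'NR>1 && $1==state {n++} END{print n+0}'
}

# Only query live sockets when ss is available on this machine.
if command -v ss >/dev/null 2>&1; then
    ss -tan | count_state LISTEN
fi
```

The trade-off is that every Zabbix item runs ss once instead of reusing one snapshot, which looks affordable given the 0.238 second runtime measured above.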



After switching from netstat to ss for collecting the connection statistics, I have not received the alert again.