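Monitoring Docker container status with Zabbix
Preface: a while ago I was deploying Zabbix and needed to monitor container status, that is, CPU, memory, and I/O usage. So I wrote a script and a template myself, and I am sharing them here : )
Without further ado, let's get started.
First, configure zabbix_agentd: vim /usr/local/zabbix/etc/zabbix_agentd.conf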
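The agent needs one user parameter for discovery and one for the per-container metrics. Here is a minimal sketch of those two entries, derived from the item keys used in the template (docker.discovery and docker.[...]). The script path /usr/local/zabbix/scripts/docker.py and the helper name user_parameter_lines are assumptions for illustration, not the author's actual values.

```python
# Sketch: build the two UserParameter lines that zabbix_agentd.conf would need.
# The script path is an assumption; the keys come from the template's item keys.

def user_parameter_lines(script_path):
    """Return UserParameter entries for discovery and per-container metrics."""
    return [
        # No arguments: the script prints the discovery JSON.
        "UserParameter=docker.discovery,{0}".format(script_path),
        # $1 = container name, $2 = action (ping, CPUPerc, MemPerc, NetIO, BlockIO).
        "UserParameter=docker.[*],{0} $1 $2".format(script_path),
    ]


if __name__ == "__main__":
    # Hypothetical install location of docker.py.
    for line in user_parameter_lines("/usr/local/zabbix/scripts/docker.py"):
        print(line)
```

The printed lines can be pasted into zabbix_agentd.conf. The agent has to be restarted afterwards so that it picks up the new keys.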
Below is the docker.py script. It uses a low-level discovery rule to find the containers, and then fetches status information for a specified container.
#!/usr/bin/python
# With no arguments: print the discovery JSON for Zabbix.
# With <container> <action>: print one value for that container.
import sys
import os
import json


def discover():
    # Build {"data": [{"{#CONTAINERNAME}": name}, ...]} from all containers.
    d = {'data': []}
    with os.popen('docker ps -a --format "{{.Names}}"') as pipe:
        for line in pipe:
            name = line.strip()
            if name:
                d['data'].append({'{#CONTAINERNAME}': name})
    print(json.dumps(d))


def status(name, action):
    if action == "ping":
        # Print 1 if the container is running, 0 otherwise.
        cmd = 'docker inspect --format="{{.State.Running}}" %s' % name
        result = os.popen(cmd).read().strip()
        print(1 if result == "true" else 0)
    else:
        # action is a docker stats field: CPUPerc, MemPerc, NetIO, BlockIO.
        cmd = 'docker stats %s --no-stream --format "{{.%s}}"' % (name, action)
        result = os.popen(cmd).read().strip()
        if "%" in result:
            # Drop the percent sign so Zabbix can store the value as a float.
            print(float(result.replace("%", "")))
        else:
            print(result)


if __name__ == '__main__':
    # Two arguments mean "query one container"; anything less means discovery.
    if len(sys.argv) >= 3:
        status(sys.argv[1], sys.argv[2])
    else:
        discover()
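A word on the pitfalls of discovery rules, which took me a long time to track down. First, the script must return its content as JSON. Second, the key in info['{#CONTAINERNAME}'] must be written exactly in this macro form: {#CONTAINERNAME}.
The output looks like this, and it must have exactly this nesting: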
{"data": [{"{#CONTAINERNAME}": "node-3"}, {"{#CONTAINERNAME}": "node-2"}, {"{#CONTAINERNAME}": "node-1"}, {"{#CONTAINERNAME}": "web"}, {"{#CONTAINERNAME}": "cadvisor"}, {"{#CONTAINERNAME}": "updatol"}, {"{#CONTAINERNAME}": "research"}, {"{#CONTAINERNAME}": "services"}, {"{#CONTAINERNAME}": "data"}, {"{#CONTAINERNAME}": "rabbitmq"}, {"{#CONTAINERNAME}": "redis"}, {"{#CONTAINERNAME}": "mysql"}, {"{#CONTAINERNAME}": "ssdb"}]}
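Since both pitfalls concern the shape of the output, it can help to check the payload before pointing Zabbix at it. The following is a rough sketch, not part of the original script. The function name check_discovery is hypothetical, and the character set allowed in macro names (uppercase letters, digits, underscore, dot) is my reading of the Zabbix documentation.

```python
import json
import re

# LLD macro keys look like {#NAME}; the allowed characters are an assumption
# based on the Zabbix documentation (A-Z, 0-9, underscore, dot).
MACRO_RE = re.compile(r"^\{#[A-Z0-9_.]+\}$")


def check_discovery(text):
    """Return a list of problems found in a discovery payload (empty if none)."""
    try:
        payload = json.loads(text)
    except ValueError as exc:
        return ["not valid JSON: %s" % exc]
    if not isinstance(payload, dict) or not isinstance(payload.get("data"), list):
        return ['top level must be an object with a "data" list']
    problems = []
    for i, entry in enumerate(payload["data"]):
        if not isinstance(entry, dict):
            problems.append("entry %d is not an object" % i)
            continue
        for key in entry:
            if not MACRO_RE.match(key):
                problems.append("entry %d has a bad macro key: %r" % (i, key))
    return problems


if __name__ == "__main__":
    sample = '{"data": [{"{#CONTAINERNAME}": "node-1"}, {"{#CONTAINERNAME}": "web"}]}'
    print(check_discovery(sample))
```

Feeding it the output of docker.py (run without arguments) gives a quick local check of the nesting and the key spelling.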
The other function is simple: it just calls docker commands to fetch the data.
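As an illustration of that step, here is a small sketch of the same idea using subprocess with an argument list instead of a shell string, so the container name is never interpreted by the shell. This is a variation, not the author's code. The helper names build_stats_cmd, parse_value, and read_stat are made up for the example.

```python
import subprocess


def build_stats_cmd(name, field):
    """Argument list for: docker stats <name> --no-stream --format {{.<field>}}"""
    return ["docker", "stats", name, "--no-stream",
            "--format", "{{.%s}}" % field]


def parse_value(raw):
    """Turn '0.50%' into 0.5; leave other values (e.g. '1.2kB / 648B') as text."""
    value = raw.strip()
    if value.endswith("%"):
        return float(value[:-1])
    return value


def read_stat(name, field):
    # Requires a working docker CLI; not exercised in the demo below.
    out = subprocess.check_output(build_stats_cmd(name, field))
    return parse_value(out.decode("utf-8"))


if __name__ == "__main__":
    print(build_stats_cmd("web", "CPUPerc"))
    print(parse_value("0.50%\n"))
    print(parse_value("1.2kB / 648B\n"))
```

Note that NetIO and BlockIO come back as text such as "1.2kB / 648B", which is why the template stores those items as character values rather than numbers.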
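The discovery rule itself is straightforward. It monitors only these metrics (CPU, memory, block I/O, and network I/O), plus one trigger on the ping item that checks whether the container is currently running and raises an alert if it is not.
The template is as follows: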
<?xml version="1.0" encoding="UTF-8"?>
<zabbix_export>
    <version>3.2</version>
    <date>2018-06-04T04:12:36Z</date>
    <groups>
        <group>
            <name>Templates</name>
        </group>
    </groups>
    <templates>
        <template>
            <template>docker-status</template>
            <name>docker-status</name>
            <description/>
            <groups>
                <group>
                    <name>Templates</name>
                </group>
            </groups>
            <applications>
                <application>
                    <name>docker_test</name>
                </application>
            </applications>
            <items/>
            <discovery_rules>
                <discovery_rule>
                    <name>docker.discovery</name>
                    <type>0</type>
                    <snmp_community/>
                    <snmp_oid/>
                    <key>docker.discovery</key>
                    <delay>60</delay>
                    <status>0</status>
                    <allowed_hosts/>
                    <snmpv3_contextname/>
                    <snmpv3_securityname/>
                    <snmpv3_securitylevel>0</snmpv3_securitylevel>
                    <snmpv3_authprotocol>0</snmpv3_authprotocol>
                    <snmpv3_authpassphrase/>
                    <snmpv3_privprotocol>0</snmpv3_privprotocol>
                    <snmpv3_privpassphrase/>
                    <delay_flex/>
                    <params/>
                    <ipmi_sensor/>
                    <authtype>0</authtype>
                    <username/>
                    <password/>
                    <publickey/>
                    <privatekey/>
                    <port/>
                    <filter>
                        <evaltype>0</evaltype>
                        <formula/>
                        <conditions>
                            <condition>
                                <macro>{#CONTAINERNAME}</macro>
                                <value>@ CONTAINER NAME</value>
                                <operator>8</operator>
                                <formulaid>A</formulaid>
                            </condition>
                        </conditions>
                    </filter>
                    <lifetime>30</lifetime>
                    <description/>
                    <item_prototypes>
                        <item_prototype>
                            <name>Container {#CONTAINERNAME} Diskio usage:</name>
                            <type>0</type>
                            <snmp_community/>
                            <multiplier>0</multiplier>
                            <snmp_oid/>
                            <key>docker.[{#CONTAINERNAME} ,BlockIO]</key>
                            <delay>60</delay>
                            <history>90</history>
                            <trends>0</trends>
                            <status>0</status>
                            <value_type>1</value_type>
                            <allowed_hosts/>
                            <units/>
                            <delta>0</delta>
                            <snmpv3_contextname/>
                            <snmpv3_securityname/>
                            <snmpv3_securitylevel>0</snmpv3_securitylevel>
                            <snmpv3_authprotocol>0</snmpv3_authprotocol>
                            <snmpv3_authpassphrase/>
                            <snmpv3_privprotocol>0</snmpv3_privprotocol>
                            <snmpv3_privpassphrase/>
                            <formula>1</formula>
                            <delay_flex/>
                            <params/>
                            <ipmi_sensor/>
                            <data_type>0</data_type>
                            <authtype>0</authtype>
                            <username/>
                            <password/>
                            <publickey/>
                            <privatekey/>
                            <port/>
                            <description/>
                            <inventory_link>0</inventory_link>
                            <applications>
                                <application>
                                    <name>docker_test</name>
                                </application>
                            </applications>
                            <valuemap/>
                            <logtimefmt/>
                            <application_prototypes/>
                        </item_prototype>
                        <item_prototype>
                            <name>Container{#CONTAINERNAME} CPU usage:</name>
                            <type>0</type>
                            <snmp_community/>
                            <multiplier>0</multiplier>
                            <snmp_oid/>
                            <key>docker.[{#CONTAINERNAME},CPUPerc]</key>
                            <delay>60</delay>
                            <history>90</history>
                            <trends>365</trends>
                            <status>0</status>
                            <value_type>0</value_type>
                            <allowed_hosts/>
                            <units>%</units>
                            <delta>0</delta>
                            <snmpv3_contextname/>
                            <snmpv3_securityname/>
                            <snmpv3_securitylevel>0</snmpv3_securitylevel>
                            <snmpv3_authprotocol>0</snmpv3_authprotocol>
                            <snmpv3_authpassphrase/>
                            <snmpv3_privprotocol>0</snmpv3_privprotocol>
                            <snmpv3_privpassphrase/>
                            <formula>1</formula>
                            <delay_flex/>
                            <params/>
                            <ipmi_sensor/>
                            <data_type>0</data_type>
                            <authtype>0</authtype>
                            <username/>
                            <password/>
                            <publickey/>
                            <privatekey/>
                            <port/>
                            <description/>
                            <inventory_link>0</inventory_link>
                            <applications>
                                <application>
                                    <name>docker_test</name>
                                </application>
                            </applications>
                            <valuemap/>
                            <logtimefmt/>
                            <application_prototypes/>
                        </item_prototype>
                        <item_prototype>
                            <name>Container {#CONTAINERNAME} mem usage:</name>
                            <type>0</type>
                            <snmp_community/>
                            <multiplier>0</multiplier>
                            <snmp_oid/>
                            <key>docker.[{#CONTAINERNAME},MemPerc]</key>
                            <delay>60</delay>
                            <history>90</history>
                            <trends>365</trends>
                            <status>0</status>
                            <value_type>0</value_type>
                            <allowed_hosts/>
                            <units>%</units>
                            <delta>0</delta>
                            <snmpv3_contextname/>
                            <snmpv3_securityname/>
                            <snmpv3_securitylevel>0</snmpv3_securitylevel>
                            <snmpv3_authprotocol>0</snmpv3_authprotocol>
                            <snmpv3_authpassphrase/>
                            <snmpv3_privprotocol>0</snmpv3_privprotocol>
                            <snmpv3_privpassphrase/>
                            <formula>1</formula>
                            <delay_flex/>
                            <params/>
                            <ipmi_sensor/>
                            <data_type>0</data_type>
                            <authtype>0</authtype>
                            <username/>
                            <password/>
                            <publickey/>
                            <privatekey/>
                            <port/>
                            <description/>
                            <inventory_link>0</inventory_link>
                            <applications>
                                <application>
                                    <name>docker_test</name>
                                </application>
                            </applications>
                            <valuemap/>
                            <logtimefmt/>
                            <application_prototypes/>
                        </item_prototype>
                        <item_prototype>
                            <name>Container {#CONTAINERNAME} NETio usage:</name>
                            <type>0</type>
                            <snmp_community/>
                            <multiplier>0</multiplier>
                            <snmp_oid/>
                            <key>docker.[{#CONTAINERNAME},NetIO]</key>
                            <delay>60</delay>
                            <history>90</history>
                            <trends>0</trends>
                            <status>0</status>
                            <value_type>1</value_type>
                            <allowed_hosts/>
                            <units/>
                            <delta>0</delta>
                            <snmpv3_contextname/>
                            <snmpv3_securityname/>
                            <snmpv3_securitylevel>0</snmpv3_securitylevel>
                            <snmpv3_authprotocol>0</snmpv3_authprotocol>
                            <snmpv3_authpassphrase/>
                            <snmpv3_privprotocol>0</snmpv3_privprotocol>
                            <snmpv3_privpassphrase/>
                            <formula>1</formula>
                            <delay_flex/>
                            <params/>
                            <ipmi_sensor/>
                            <data_type>0</data_type>
                            <authtype>0</authtype>
                            <username/>
                            <password/>
                            <publickey/>
                            <privatekey/>
                            <port/>
                            <description/>
                            <inventory_link>0</inventory_link>
                            <applications>
                                <application>
                                    <name>docker_test</name>
                                </application>
                            </applications>
                            <valuemap/>
                            <logtimefmt/>
                            <application_prototypes/>
                        </item_prototype>
                        <item_prototype>
                            <name>Container{#CONTAINERNAME} is_run :</name>
                            <type>0</type>
                            <snmp_community/>
                            <multiplier>0</multiplier>
                            <snmp_oid/>
                            <key>docker.[{#CONTAINERNAME} ,ping]</key>
                            <delay>30</delay>
                            <history>90</history>
                            <trends>365</trends>
                            <status>0</status>
                            <value_type>3</value_type>
                            <allowed_hosts/>
                            <units/>
                            <delta>0</delta>
                            <snmpv3_contextname/>
                            <snmpv3_securityname/>
                            <snmpv3_securitylevel>0</snmpv3_securitylevel>
                            <snmpv3_authprotocol>0</snmpv3_authprotocol>
                            <snmpv3_authpassphrase/>
                            <snmpv3_privprotocol>0</snmpv3_privprotocol>
                            <snmpv3_privpassphrase/>
                            <formula>1</formula>
                            <delay_flex/>
                            <params/>
                            <ipmi_sensor/>
                            <data_type>0</data_type>
                            <authtype>0</authtype>
                            <username/>
                            <password/>
                            <publickey/>
                            <privatekey/>
                            <port/>
                            <description/>
                            <inventory_link>0</inventory_link>
                            <applications>
                                <application>
                                    <name>docker_test</name>
                                </application>
                            </applications>
                            <valuemap/>
                            <logtimefmt/>
                            <application_prototypes/>
                        </item_prototype>
                    </item_prototypes>
                    <trigger_prototypes>
                        <trigger_prototype>
                            <expression>{docker-status:docker.[{#CONTAINERNAME} ,ping].last()}=0</expression>
                            <recovery_mode>0</recovery_mode>
                            <recovery_expression/>
                            <name>docker_{#CONTAINERNAME}_down</name>
                            <correlation_mode>0</correlation_mode>
                            <correlation_tag/>
                            <url/>
                            <status>0</status>
                            <priority>5</priority>
                            <description/>
                            <type>0</type>
                            <manual_close>0</manual_close>
                            <dependencies/>
                            <tags/>
                        </trigger_prototype>
                    </trigger_prototypes>
                    <graph_prototypes/>
                    <host_prototypes/>
                </discovery_rule>
            </discovery_rules>
            <httptests/>
            <macros/>
            <templates/>
            <screens/>
        </template>
    </templates>
</zabbix_export>
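To wrap up: update the zabbix_agentd configuration, put the docker.py script in the path referenced there, do not forget to make it executable, and import the template. If data comes in, everything is working. If it does not, use zabbix_get to debug and find out where the problem lies.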
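For the debugging step, a small sketch of how the zabbix_get calls could be assembled. It uses only the standard -s (host), -p (port), and -k (key) options. The agent address 127.0.0.1, the container name web, and the helper names are placeholders chosen for the example.

```python
import subprocess


def item_key(container, action):
    """Key in the same form as the template's item prototypes."""
    return "docker.[%s,%s]" % (container, action)


def zabbix_get_cmd(agent_host, key, port=10050):
    """Argument list for zabbix_get (-s host, -p port, -k key)."""
    return ["zabbix_get", "-s", agent_host, "-p", str(port), "-k", key]


def query(agent_host, key):
    # Needs zabbix_get installed and an agent that accepts this host;
    # not exercised in the demo below.
    out = subprocess.check_output(zabbix_get_cmd(agent_host, key))
    return out.decode("utf-8").strip()


if __name__ == "__main__":
    # Placeholder agent address and container name.
    print(" ".join(zabbix_get_cmd("127.0.0.1", "docker.discovery")))
    print(" ".join(zabbix_get_cmd("127.0.0.1", item_key("web", "ping"))))
```

When typing the printed commands into a shell by hand, quote the key so that the square brackets are not interpreted by the shell. If the script works when run directly but zabbix_get returns nothing, it is worth checking whether the user the agent runs as is allowed to run docker commands.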