Custom LLD in Zabbix

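Some of our production real-time jobs are built on Storm. To monitor data latency, each job writes the timestamp of the log entry it is processing into Redis, and Zabbix then monitors the delay. New jobs go live frequently, so configuring monitoring items by hand becomes tedious; to free up our time, the process needs to be automated.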

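When I added NIC and partition monitoring earlier, I used LLD (low-level discovery) with its built-in macros. Newer versions of Zabbix also support custom LLD. The steps are as follows: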

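1. In the template, set up a discovery rule (with a UserParameter key) that calls a script and returns JSON in the format Zabbix requires (carrying the custom macros), and configure the discovery rule properly (filters, for example).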

The official documentation, together with the agent log from a production host, shows the data format Zabbix requires:

 
    143085:20141127:000548.967 Requested [vfs.fs.discovery]
    143085:20141127:000548.967 Sending back [{"data":[
        {"{#FSNAME}":"\/","{#FSTYPE}":"rootfs"},
        {"{#FSNAME}":"\/proc\/sys\/fs\/binfmt_misc","{#FSTYPE}":"binfmt_misc"},
        {"{#FSNAME}":"\/data","{#FSTYPE}":"ext4"}]}]
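The envelope in the log above is an object with a "data" list whose entries map macro names to values. A minimal sketch of producing that shape with `json.dumps` follows; the helper name `build_lld` and the sample filesystem list are illustrative assumptions, not part of the original setup.

```python
import json


def build_lld(rows):
    """Wrap a list of {macro: value} dicts in the {"data": [...]} envelope."""
    return json.dumps({"data": rows})


# Hypothetical input that mirrors two entries from the agent log above
filesystems = [("/", "rootfs"), ("/data", "ext4")]
rows = [{"{#FSNAME}": name, "{#FSTYPE}": fstype} for name, fstype in filesystems]

lld_json = build_lld(rows)
print(lld_json)
```

Letting the JSON library do the quoting avoids hand-built strings that break when a value contains a quote or backslash.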

For example, the key that returns the JSON data in production:

 
    
    UserParameter=storm.delay.discovery,python2.6 /apps/sh/zabbix_scripts/storm/storm_delay_discovery.py
 
  

Then run

    zabbix_get -s 127.0.0.1 -k storm.delay.discovery

to verify that the returned data is correct.
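Checking the returned JSON by eye is error-prone, so a small checker can help. The sketch below is an assumption-laden illustration, not part of the original tooling: `check_lld` is a hypothetical helper, the sample strings are made up, and the macro pattern assumes LLD macro names use only uppercase letters, digits, underscores and dots.

```python
import json
import re

# Assumed shape of an LLD macro name, e.g. {#STORMHASHNAME}
MACRO_RE = re.compile(r"^\{#[A-Z0-9_.]+\}$")


def check_lld(text):
    """Return a list of problems found in an LLD JSON string (empty if none)."""
    try:
        doc = json.loads(text)
    except ValueError as e:
        return ["not valid JSON: %s" % e]
    rows = doc.get("data") if isinstance(doc, dict) else None
    if not isinstance(rows, list):
        return ['top level must be an object with a "data" list']
    problems = []
    for i, row in enumerate(rows):
        if not isinstance(row, dict):
            problems.append("entry %d is not an object" % i)
            continue
        for key in row:
            if not MACRO_RE.match(key):
                problems.append("entry %d has a bad macro name %r" % (i, key))
    return problems


# Made-up samples: one well-formed, one with a macro missing its {# } wrapper
good = '{"data": [{"{#STORMHASHNAME}": "stormdelay_demo", "{#STORMHASHFIELD}": "job1"}]}'
bad = '{"data": [{"STORMHASHNAME": "stormdelay_demo"}]}'
print(check_lld(good))
print(check_lld(bad))
```

The output of `zabbix_get` could be piped into such a check; a bare `-1` (what the discovery script prints on failure) is also reported as a problem because it is not an object with a "data" list.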

The contents of storm_delay_discovery.py are as follows:

 
    #!/usr/bin/python2.6
    # Auto discovery for storm job delay monitoring
    # edit by ericni.ni
    # 2014-11-27
    import json
    import redis

    _hashtables = []  # names of the redis hashes that hold delay data
    _alllist = []     # one LLD entry per (hash, field) pair


    class RedisException(Exception):
        def __init__(self, errorlog):
            self.errorlog = errorlog

        def __str__(self):
            return "error log is %s" % (self.errorlog)


    def scan_one(cursor, conn):
        """Run one SCAN step, collect matching keys, return the next cursor."""
        try:
            cursor_next, cursor_value = conn.scan(cursor)
            for line in cursor_value:
                if line.startswith("com-vip-storm") or line.startswith("stormdelay_"):
                    _hashtables.append(line)
            return cursor_next
        except Exception as e:
            raise RedisException(str(e))


    def scan_all(conn):
        """Iterate SCAN until the cursor returns to 0."""
        cursor = scan_one('0', conn)
        # Stop as soon as redis returns cursor 0; scanning again from 0
        # would restart the iteration and add duplicate keys.
        while int(cursor) != 0:
            cursor = scan_one(cursor, conn)


    def hget_fields(conn, hashname):
        """Add one LLD entry for every field of the given hash."""
        for field in conn.hkeys(hashname):
            _alllist.append({"{#STORMHASHNAME}": hashname,
                             "{#STORMHASHFIELD}": field})


    if __name__ == '__main__':
        try:
            r = redis.StrictRedis(host='xxx', port=xxx, db=0)
            scan_all(r)
            for hashtable in _hashtables:
                hget_fields(r, hashtable)
            # json.dumps handles quoting, unlike manual string replacement
            print json.dumps({"data": _alllist})
        except Exception as e:
            print -1
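The script's key step is following the SCAN cursor until Redis hands back 0. The sketch below isolates that loop so it can be run without a Redis server. `FakeConn` and `scan_matching` are hypothetical names, and the fake only assumes that `scan(cursor)` returns a `(next_cursor, keys)` pair, as the script above relies on.

```python
class FakeConn(object):
    """Hypothetical stand-in for a redis client; scan() returns (next_cursor, keys)."""

    def __init__(self, pages):
        self.pages = pages  # maps cursor -> (next_cursor, keys)

    def scan(self, cursor):
        return self.pages[int(cursor)]


def scan_matching(conn, prefixes):
    """Follow SCAN cursors until 0 comes back, keeping keys with a wanted prefix."""
    found = []
    cursor = 0
    while True:
        cursor, keys = conn.scan(cursor)
        found.extend(k for k in keys if k.startswith(prefixes))
        if int(cursor) == 0:
            return found


# Two made-up pages: cursor 0 leads to cursor 7, which leads back to 0
conn = FakeConn({
    0: (7, ["stormdelay_a", "other"]),
    7: (0, ["com-vip-storm-b"]),
})
keys = scan_matching(conn, ("com-vip-storm", "stormdelay_"))
print(keys)
```

Checking the cursor right after each call means a keyspace that fits in a single page is scanned exactly once.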

2. Set up the item/graph/trigger prototypes:

Taking items as an example, define item prototypes (each also needs a key) whose key parameters are the macros.

For example, Free inodes on {#FSNAME} (percentage) ---> vfs.fs.inode[{#FSNAME},pfree]

In this case, the item simply uses the macros returned above:

 
    storm_delay[hget,{#STORMHASHNAME},{#STORMHASHFIELD}]
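To see which concrete item keys a prototype would yield, the macro substitution can be imitated offline. This is only a simplified illustration of the idea, not how Zabbix implements it (Zabbix may quote values where needed); `expand_key` and the sample rows are hypothetical.

```python
def expand_key(prototype, row):
    """Replace each LLD macro in the prototype key with its discovered value."""
    for macro, value in row.items():
        prototype = prototype.replace(macro, value)
    return prototype


prototype = "storm_delay[hget,{#STORMHASHNAME},{#STORMHASHFIELD}]"

# Made-up discovery rows in the shape the discovery script returns
rows = [
    {"{#STORMHASHNAME}": "stormdelay_demo", "{#STORMHASHFIELD}": "job1"},
    {"{#STORMHASHNAME}": "stormdelay_demo", "{#STORMHASHFIELD}": "job2"},
]

item_keys = [expand_key(prototype, row) for row in rows]
for key in item_keys:
    print(key)
```

Each discovered (hash, field) pair becomes one item, which is why a new job showing up in Redis needs no manual configuration.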

Finally, link the template that contains the LLD rule to the host.

Combined with the screen.create/screenitem.update APIs, this automates both adding monitoring and adding or updating screens.
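As a rough sketch of the screen side, the snippet below only builds a JSON-RPC request body for screen.create and does not send it. The field names (`hsize`, `vsize`, `screenitems`, `resourcetype` 0 for a graph) follow the Zabbix API documentation for screens as I recall it and should be checked against the Zabbix version in use, since later releases replaced screens with dashboards. The function name, graph ids, and token are placeholders.

```python
import json


def screen_create_payload(name, graph_ids, columns, auth_token):
    """Lay graphs out left to right, top to bottom, in a grid `columns` wide."""
    rows = (len(graph_ids) + columns - 1) // columns  # ceiling division
    items = [
        {
            "resourcetype": 0,        # assumed: 0 means "graph"
            "resourceid": str(gid),
            "rowspan": 1,
            "colspan": 1,
            "x": i % columns,
            "y": i // columns,
        }
        for i, gid in enumerate(graph_ids)
    ]
    return {
        "jsonrpc": "2.0",
        "method": "screen.create",
        "params": {"name": name, "hsize": columns, "vsize": rows,
                   "screenitems": items},
        "auth": auth_token,
        "id": 1,
    }


# Placeholder graph ids and token, for illustration only
payload = screen_create_payload("storm delay", ["612", "613", "614"], 2, "TOKEN")
print(json.dumps(payload, indent=2))
```

The body would then be POSTed to the Zabbix API endpoint; the graph ids would come from the graphs that the graph prototypes created.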

This article is reposted from 菜菜光's 51CTO blog. Original link: http://blog.51cto.com/caiguangguang/1583536. To republish it, please contact the original author.