Aggregate monitoring with Zabbix


weixin_34294649


Original link: https://yq.aliyun.com/articles/434649


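Zabbix is an increasingly popular monitoring tool. Compared with tools such as Nagios and Cacti, its most distinctive feature is that the data is stored in a relational database, which makes later querying and processing much easier. For example, to find out what fraction of a day a machine's ioutil stayed above 80, a single SQL query against the Zabbix database is enough, whereas in Cacti this is far less convenient. There is also no need to worry about the data being thinned out as it ages.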
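As a rough illustration of the "one SQL query" idea, the sketch below shows what such a query might look like, together with a pure-Python equivalent of the same calculation. It assumes the ioutil item stores float values in the `history` table (columns `itemid`, `clock`, `value`) and that samples are collected at a regular interval, so the fraction of samples above the threshold approximates the fraction of time. The names `IOUTIL_RATIO_SQL` and `ratio_above` are hypothetical and do not come from the original article.

```python
# Hypothetical query (MySQL syntax, DB-API "%s" placeholders):
# fraction of samples above a threshold for one item in a time window.
# Parameters, in order: threshold, itemid, window start, window end (Unix time).
IOUTIL_RATIO_SQL = (
    "select sum(value > %s) / count(*) from history "
    "where itemid = %s and clock >= %s and clock < %s"
)


def ratio_above(values, threshold=80.0):
    """Return the fraction of samples strictly above `threshold`.

    Mirrors what the SQL above computes, but on an in-memory sequence.
    Returns 0.0 for an empty sequence instead of dividing by zero.
    """
    values = list(values)
    if not values:
        return 0.0
    above = sum(1 for v in values if v > threshold)
    return above / float(len(values))
```

With a MySQLdb cursor, the query could be run as `cursor.execute(IOUTIL_RATIO_SQL, (80, itemid, start, end))`, where `itemid`, `start`, and `end` are values you look up yourself.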

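When analyzing Zabbix data, the tables used most often are hosts, items, interface, and the history* and trend* tables. For example, to monitor mapred usage across an entire Hadoop cluster through Zabbix, you only need to aggregate the lastvalue of each machine.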

A simple way to do this is shown below:

 
    #!/usr/bin/python
    # -*- coding: utf8 -*-
    # edit by ericni
    # Get Hadoop cluster-wide totals from the Zabbix database.

    import sys

    import MySQLdb

    # Command-line argument -> Zabbix item key summed over all datanodes.
    ITEM_KEYS = {
        "all_mapTaskSlots": "hadoop_metrics[mrmetrics.log,mapred.tasktracker,mapTaskSlots]",
        "all_maps_running": "hadoop_metrics[mrmetrics.log,mapred.tasktracker,maps_running]",
        "all_reduceTaskSlots": "hadoop_metrics[mrmetrics.log,mapred.tasktracker,reduceTaskSlots]",
        "all_reduces_running": "hadoop_metrics[mrmetrics.log,mapred.tasktracker,reduces_running]",
        "all_ThreadsBlocked": "hadoop_stats[datanode,ThreadsBlocked]",
        "all_ThreadsRunnable": "hadoop_stats[datanode,ThreadsRunnable]",
        "all_ThreadsWaiting": "hadoop_stats[datanode,ThreadsWaiting]",
    }

    # Sum the last collected value of one item key across all datanode hosts.
    SQL = ("select sum(b.lastvalue) from hosts a, items b "
           "where b.key_ = %s and lower(a.host) like %s "
           "and a.hostid = b.hostid")


    def get_total_value(key):
        db = MySQLdb.connect(host='xxx', user='xxxx', passwd='xxx', db='xxx')
        try:
            cursor = db.cursor()
            cursor.execute(SQL, (key, '%-hadoop-datanode%'))
            row = cursor.fetchone()
            cursor.close()
        finally:
            db.close()
        # sum() yields NULL when no rows match; report 0 in that case.
        return row[0] if row and row[0] is not None else 0


    if __name__ == '__main__':
        # Unknown argument: exit without output (a missing argument is
        # treated the same way instead of raising IndexError).
        if len(sys.argv) < 2 or sys.argv[1] not in ITEM_KEYS:
            sys.exit(0)
        print(get_total_value(ITEM_KEYS[sys.argv[1]]))

Then plot the total available map slots and the total running maps on the same graph to see how heavily the map slots are being used.
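If a single utilization number is preferred over reading two curves, the two totals can be combined as a percentage. The helper below is a minimal sketch of that arithmetic; the function name `slot_utilization` is hypothetical and not part of the original script.

```python
def slot_utilization(running, slots):
    """Return running tasks as a percentage of available slots.

    Returns 0.0 when no slots are reported, to avoid division by zero.
    """
    if not slots:
        return 0.0
    return 100.0 * float(running) / float(slots)
```

For example, feeding it the totals for all_maps_running and all_mapTaskSlots gives the map slot utilization of the cluster.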

Of course, Zabbix also has its own aggregation feature in the frontend, but the approach above is comparatively more flexible.

This article is reposted from 菜菜光's 51CTO blog. Original link: http://blog.51cto.com/caiguangguang/1369808. To reprint it, please contact the original author.
