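Zabbix is an increasingly popular monitoring tool. Compared with tools such as Nagios and Cacti, its most distinctive feature is that it stores its data in a relational database, which makes later querying and processing much easier. For example, to find out what proportion of a day a machine's ioutil was above 80, a single SQL query against the Zabbix database is enough, whereas this is far less convenient in Cacti. There is also no need to worry about the data being diluted as it ages.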
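As a rough illustration of the "single SQL query" idea, here is a minimal sketch. It assumes the ioutil values are stored in the Zabbix `history` table (columns `itemid`, `clock`, `value`) and that samples arrive at a fixed interval, so the fraction of samples above the threshold approximates the fraction of time. The function name `ratio_above`, the `ph` placeholder argument, and the in-memory SQLite stand-in used by `demo` are illustrative assumptions, not part of Zabbix.

```python
import sqlite3


def ratio_above(conn, itemid, start, end, threshold, ph="%s"):
    """Return the fraction of samples in [start, end) whose value exceeds threshold.

    conn is any DB-API connection; ph is the driver's placeholder
    ("%s" for MySQLdb, "?" for sqlite3).
    """
    sql = (
        "select avg(case when value > {p} then 1.0 else 0.0 end) "
        "from history where itemid = {p} and clock >= {p} and clock < {p}"
    ).format(p=ph)
    cur = conn.cursor()
    cur.execute(sql, (threshold, itemid, start, end))
    row = cur.fetchone()
    cur.close()
    # avg() yields NULL when no sample falls in the window.
    return row[0] if row and row[0] is not None else 0.0


def demo():
    """Run the query against a tiny in-memory stand-in for the history table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table history (itemid integer, clock integer, value real)")
    samples = [(1, 0, 50.0), (1, 60, 85.0), (1, 120, 90.0), (1, 180, 10.0), (2, 0, 99.0)]
    conn.executemany("insert into history values (?, ?, ?)", samples)
    result = ratio_above(conn, 1, 0, 240, 80, ph="?")
    conn.close()
    return result
```

Against a real Zabbix MySQL database the same function would be called with a MySQLdb connection and the default `ph="%s"`; the `itemid` of the ioutil item would first have to be looked up in the `items` table.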
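When analyzing Zabbix data, the tables used most often are hosts, items, interface, and the history* and trend* families. For example, to monitor mapred usage across an entire Hadoop cluster with Zabbix, it is enough to aggregate the lastvalue of every machine.
A simple way to do this is as follows: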
#!/usr/bin/python
# -*- coding: utf8 -*-
# edit by ericni
# Get Hadoop cluster-wide totals from the Zabbix database
import sys

import MySQLdb

# Map each command-line metric name to the Zabbix item key summed across hosts.
ITEM_KEYS = {
    'all_mapTaskSlots': 'hadoop_metrics[mrmetrics.log,mapred.tasktracker,mapTaskSlots]',
    'all_maps_running': 'hadoop_metrics[mrmetrics.log,mapred.tasktracker,maps_running]',
    'all_reduceTaskSlots': 'hadoop_metrics[mrmetrics.log,mapred.tasktracker,reduceTaskSlots]',
    'all_reduces_running': 'hadoop_metrics[mrmetrics.log,mapred.tasktracker,reduces_running]',
    'all_ThreadsBlocked': 'hadoop_stats[datanode,ThreadsBlocked]',
    'all_ThreadsRunnable': 'hadoop_stats[datanode,ThreadsRunnable]',
    'all_ThreadsWaiting': 'hadoop_stats[datanode,ThreadsWaiting]',
}

# The item key and the host pattern are passed as query parameters.
SQL = ("select sum(b.lastvalue) from hosts a, items b "
       "where b.key_ = %s and lower(a.host) like %s and a.hostid = b.hostid")


def get_total_value(key):
    db = MySQLdb.connect(host='xxx', user='xxxx', passwd='xxx', db='xxx')
    try:
        cursor = db.cursor()
        cursor.execute(SQL, (key, '%-hadoop-datanode%'))
        row = cursor.fetchone()
        cursor.close()
    finally:
        db.close()
    # sum() yields NULL when no item matches; report 0 in that case.
    if row is None or row[0] is None:
        return 0
    return row[0]


if __name__ == '__main__':
    # Exit quietly on a missing or unknown metric name.
    if len(sys.argv) < 2 or sys.argv[1] not in ITEM_KEYS:
        sys.exit(0)
    print(get_total_value(ITEM_KEYS[sys.argv[1]]))
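Then plot the total available map slots and the total running maps in the same graph to see how heavily the map slots are being used.
Of course, Zabbix also has its own aggregation feature in the frontend, but doing it this way is more flexible.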
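If a single usage figure is preferred over reading two lines off a graph, the ratio can be computed directly from the two totals. This is only a sketch: the function name `map_utilization` is hypothetical, and it assumes the two inputs are the summed running maps and the summed map slots.

```python
def map_utilization(running, slots):
    """Return running/slots as a percentage, or 0.0 when no slots are reported."""
    if not slots:
        return 0.0
    return 100.0 * float(running) / float(slots)
```

For instance, 30 running maps against 120 available slots gives a utilization of 25 percent.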
This article is reposted from 菜菜光's 51CTO blog. Original link: http://blog.51cto.com/caiguangguang/1369808. To repost it, please contact the original author.