falcon 关于mysql监控_Open-Falcon 监控系统监控 MySQL/Redis/MongoDB 状态监控

背景:

Open-Falcon 是小米运维部开源的一款互联网企业级监控系统解决方案,具体的安装和使用说明请见官网:http://open-falcon.org/,是一款比较全的监控。而且提供各种API,只需要把数据按照规定给出就能出图,以及报警、集群支持等等。

监控:

1) MySQL 收集信息脚本(mysql_monitor.py)

48304ba5e6f9fe08f3fa1abda7d326ab.png

#!/bin/env python

# -*- encoding: utf-8 -*-

from __future__ import division

import MySQLdb

import datetime

import time

import os

import sys

import fileinput

import requests

import json

import re

class MySQLMonitorInfo():

def __init__(self,host,port,user,password):

self.host = host

self.port = port

self.user = user

self.password = password

def stat_info(self):

try:

m = MySQLdb.connect(host=self.host,user=self.user,passwd=self.password,port=self.port,charset='utf8')

query = "SHOW GLOBAL STATUS"

cursor = m.cursor()

cursor.execute(query)

Str_string = cursor.fetchall()

Status_dict = {}

for Str_key,Str_value in Str_string:

Status_dict[Str_key] = Str_value

cursor.close()

m.close()

return Status_dict

except Exception, e:

print (datetime.datetime.now()).strftime("%Y-%m-%d %H:%M:%S")

print e

Status_dict = {}

return Status_dict

def engine_info(self):

try:

m = MySQLdb.connect(host=self.host,user=self.user,passwd=self.password,port=self.port,charset='utf8')

_engine_regex = re.compile(ur'(History list length) ([0-9]+\.?[0-9]*)\n')

query = "SHOW ENGINE INNODB STATUS"

cursor = m.cursor()

cursor.execute(query)

Str_string = cursor.fetchone()

a,b,c = Str_string

cursor.close()

m.close()

return dict(_engine_regex.findall(c))

except Exception, e:

print (datetime.datetime.now()).strftime("%Y-%m-%d %H:%M:%S")

print e

return dict(History_list_length=0)

if __name__ == '__main__':

open_falcon_api = 'http://192.168.200.86:1988/v1/push'

db_list= []

for line in fileinput.input():

db_list.append(line.strip())

for db_info in db_list:

# host,port,user,password,endpoint,metric = db_info.split(',')

host,port,user,password,endpoint = db_info.split(',')

timestamp = int(time.time())

step = 60

# tags = "port=%s" %port

tags = ""

conn = MySQLMonitorInfo(host,int(port),user,password)

stat_info = conn.stat_info()

engine_info = conn.engine_info()

mysql_stat_list = []

monitor_keys = [

('Com_select','COUNTER'),

('Qcache_hits','COUNTER'),

('Com_insert','COUNTER'),

('Com_update','COUNTER'),

('Com_delete','COUNTER'),

('Com_replace','COUNTER'),

('MySQL_QPS','COUNTER'),

('MySQL_TPS','COUNTER'),

('ReadWrite_ratio','GAUGE'),

('Innodb_buffer_pool_read_requests','COUNTER'),

('Innodb_buffer_pool_reads','COUNTER'),

('Innodb_buffer_read_hit_ratio','GAUGE'),

('Innodb_buffer_pool_pages_flushed','COUNTER'),

('Innodb_buffer_pool_pages_free','GAUGE'),

('Innodb_buffer_pool_pages_dirty','GAUGE'),

('Innodb_buffer_pool_pages_data','GAUGE'),

('Bytes_received','COUNTER'),

('Bytes_sent','COUNTER'),

('Innodb_rows_deleted','COUNTER'),

('Innodb_rows_inserted','COUNTER'),

('Innodb_rows_read','COUNTER'),

('Innodb_rows_updated','COUNTER'),

('Innodb_os_log_fsyncs','COUNTER'),

('Innodb_os_log_written','COUNTER'),

('Created_tmp_disk_tables','COUNTER'),

('Created_tmp_tables','COUNTER'),

('Connections','COUNTER'),

('Innodb_log_waits','COUNTER'),

('Slow_queries','COUNTER'),

('Binlog_cache_disk_use','COUNTER')

]

for _key,falcon_type in monitor_keys:

if _key == 'MySQL_QPS':

_value = int(stat_info.get('Com_select',0)) + int(stat_info.get('Qcache_hits',0))

elif _key == 'MySQL_TPS':

_value = int(stat_info.get('Com_insert',0)) + int(stat_info.get('Com_update',0)) + int(stat_info.get('Com_delete',0)) + int(stat_info.get('Com_replace',0))

elif _key == 'Innodb_buffer_read_hit_ratio':

try:

_value = round((int(stat_info.get('Innodb_buffer_pool_read_requests',0)) - int(stat_info.get('Innodb_buffer_pool_reads',0)))/int(stat_info.get('Innodb_buffer_pool_read_requests',0)) * 100,3)

except ZeroDivisionError:

_value = 0

elif _key == 'ReadWrite_ratio':

try:

_value = round((int(stat_info.get('Com_select',0)) + int(stat_info.get('Qcache_hits',0)))/(int(stat_info.get('Com_insert',0)) + int(stat_info.get('Com_update',0)) + int(stat_info.get('Com_delete',0)) + int(stat_info.get('Com_replace',0))),2)

except ZeroDivisionError:

_value = 0

else:

_value = int(stat_info.get(_key,0))

falcon_format = {

'Metric': '%s' % (_key),

'Endpoint': endpoint,

'Timestamp': timestamp,

'Step': step,

'Value': _value,

'CounterType': falcon_type,

'TAGS': tags

}

mysql_stat_list.append(falcon_format)

#_key : History list length

for _key,_value in engine_info.items():

_key = "Undo_Log_Length"

falcon_format = {

'Metric': '%s' % (_key),

'Endpoint': endpoint,

'Timestamp': timestamp,

'Step': step,

'Value': int(_value),

'CounterType': "GAUGE",

'TAGS': tags

}

mysql_stat_list.append(falcon_format)

print json.dumps(mysql_stat_list,sort_keys=True,indent=4)

requests.post(open_falcon_api, data=json.dumps(mysql_stat_list))

48304ba5e6f9fe08f3fa1abda7d326ab.png

指标说明:收集指标里的COUNTER表示每秒执行次数,GAUGE表示直接输出值。

指标

类型

说明

Undo_Log_Length

GAUGE

未清除的Undo事务数

Com_select

COUNTER

select/秒=QPS

Com_insert

COUNTER

insert/秒

Com_update

COUNTER

update/秒

Com_delete

COUNTER

delete/秒

Com_replace

COUNTER

replace/秒

MySQL_QPS

COUNTER

QPS

MySQL_TPS

COUNTER

TPS

ReadWrite_ratio

GAUGE

读写比例

Innodb_buffer_pool_read_requests

COUNTER

innodb buffer pool 读次数/秒

Innodb_buffer_pool_reads

COUNTER

Disk 读次数/秒

Innodb_buffer_read_hit_ratio

GAUGE

innodb buffer pool 命中率

Innodb_buffer_pool_pages_flushed

COUNTER

innodb buffer pool 刷写到磁盘的页数/秒

Innodb_buffer_pool_pages_free

GAUGE

innodb buffer pool 空闲页的数量

Innodb_buffer_pool_pages_dirty

GAUGE

innodb buffer pool 脏页的数量

Innodb_buffer_pool_pages_data

GAUGE

innodb buffer pool 数据页的数量

Bytes_received

COUNTER

接收字节数/秒

Bytes_sent

COUNTER

发送字节数/秒

Innodb_rows_deleted

COUNTER

innodb表删除的行数/秒

Innodb_rows_inserted

COUNTER

innodb表插入的行数/秒

Innodb_rows_read

COUNTER

innodb表读取的行数/秒

Innodb_rows_updated

COUNTER

innodb表更新的行数/秒

Innodb_os_log_fsyncs

COUNTER

Redo Log fsync次数/秒

Innodb_os_log_written

COUNTER

Redo Log 写入的字节数/秒

Created_tmp_disk_tables

COUNTER

创建磁盘临时表的数量/秒

Created_tmp_tables

COUNTER

创建内存临时表的数量/秒

Connections

COUNTER

连接数/秒

Innodb_log_waits

COUNTER

innodb log buffer不足等待的数量/秒

Slow_queries

COUNTER

慢查询数/秒

Binlog_cache_disk_use

COUNTER

Binlog Cache不足的数量/秒

使用说明:读取配置到都数据库列表执行,配置文件格式如下(mysqldb_list.txt):

IP,Port,User,Password,endpoint

192.168.2.21,3306,root,123,mysql-21:3306

192.168.2.88,3306,root,123,mysql-88:3306

最后执行:

python mysql_monitor.py mysqldb_list.txt

2) Redis 收集信息脚本(redis_monitor.py)

48304ba5e6f9fe08f3fa1abda7d326ab.png

#!/bin/env python

#-*- coding:utf-8 -*-

import json

import time

import re

import redis

import requests

import fileinput

import datetime

class RedisMonitorInfo():

def __init__(self,host,port,password):

self.host = host

self.port = port

self.password = password

def stat_info(self):

try:

r = redis.Redis(host=self.host, port=self.port, password=self.password)

stat_info = r.info()

return stat_info

except Exception, e:

print (datetime.datetime.now()).strftime("%Y-%m-%d %H:%M:%S")

print e

return dict()

def cmdstat_info(self):

try:

r = redis.Redis(host=self.host, port=self.port, password=self.password)

cmdstat_info = r.info('Commandstats')

return cmdstat_info

except Exception, e:

print (datetime.datetime.now()).strftime("%Y-%m-%d %H:%M:%S")

print e

return dict()

if __name__ == '__main__':

open_falcon_api = 'http://192.168.200.86:1988/v1/push'

db_list= []

for line in fileinput.input():

db_list.append(line.strip())

for db_info in db_list:

# host,port,password,endpoint,metric = db_info.split(',')

host,port,password,endpoint = db_info.split(',')

timestamp = int(time.time())

step = 60

falcon_type = 'COUNTER'

# tags = "port=%s" %port

tags = ""

conn = RedisMonitorInfo(host,port,password)

#查看各个命令每秒执行次数

redis_cmdstat_dict = {}

redis_cmdstat_list = []

cmdstat_info = conn.cmdstat_info()

for cmdkey in cmdstat_info:

redis_cmdstat_dict[cmdkey] = cmdstat_info[cmdkey]['calls']

for _key,_value in redis_cmdstat_dict.items():

falcon_format = {

'Metric': '%s' % (_key),

'Endpoint': endpoint,

'Timestamp': timestamp,

'Step': step,

'Value': int(_value),

'CounterType': falcon_type,

'TAGS': tags

}

redis_cmdstat_list.append(falcon_format)

#查看Redis各种状态,根据需要增删监控项,str的值需要转换成int

redis_stat_list = []

monitor_keys = [

('connected_clients','GAUGE'),

('blocked_clients','GAUGE'),

('used_memory','GAUGE'),

('used_memory_rss','GAUGE'),

('mem_fragmentation_ratio','GAUGE'),

('total_commands_processed','COUNTER'),

('rejected_connections','COUNTER'),

('expired_keys','COUNTER'),

('evicted_keys','COUNTER'),

('keyspace_hits','COUNTER'),

('keyspace_misses','COUNTER'),

('keyspace_hit_ratio','GAUGE'),

('keys_num','GAUGE'),

]

stat_info = conn.stat_info()

for _key,falcon_type in monitor_keys:

#计算命中率

if _key == 'keyspace_hit_ratio':

try:

_value = round(float(stat_info.get('keyspace_hits',0))/(int(stat_info.get('keyspace_hits',0)) + int(stat_info.get('keyspace_misses',0))),4)*100

except ZeroDivisionError:

_value = 0

#碎片率是浮点数

elif _key == 'mem_fragmentation_ratio':

_value = float(stat_info.get(_key,0))

#拿到key的数量

elif _key == 'keys_num':

_value = 0

for i in range(16):

_key = 'db'+str(i)

_num = stat_info.get(_key)

if _num:

_value += int(_num.get('keys'))

_key = 'keys_num'

#其他的都采集成counter,int

else:

try:

_value = int(stat_info[_key])

except:

continue

falcon_format = {

'Metric': '%s' % (_key),

'Endpoint': endpoint,

'Timestamp': timestamp,

'Step': step,

'Value': _value,

'CounterType': falcon_type,

'TAGS': tags

}

redis_stat_list.append(falcon_format)

load_data = redis_stat_list+redis_cmdstat_list

print json.dumps(load_data,sort_keys=True,indent=4)

requests.post(open_falcon_api, data=json.dumps(load_data))

48304ba5e6f9fe08f3fa1abda7d326ab.png

指标说明:收集指标里的COUNTER表示每秒执行次数,GAUGE表示直接输出值。

指标

类型

说明

connected_clients

GAUGE

连接的客户端个数

blocked_clients

GAUGE

被阻塞客户端的数量

used_memory

GAUGE

Redis分配的内存的总量

used_memory_rss

GAUGE

OS分配的内存的总量

mem_fragmentation_ratio

GAUGE

内存碎片率,used_memory_rss/used_memory

total_commands_processed

COUNTER

每秒执行的命令数,比较准确的QPS

rejected_connections

COUNTER

被拒绝的连接数/秒

expired_keys

COUNTER

过期KEY的数量/秒

evicted_keys

COUNTER

被驱逐KEY的数量/秒

keyspace_hits

COUNTER

命中KEY的数量/秒

keyspace_misses

COUNTER

未命中KEY的数量/秒

keyspace_hit_ratio

GAUGE

KEY的命中率

keys_num

GAUGE

KEY的数量

cmd_*

COUNTER

各种名字都执行次数/秒

使用说明:读取配置到都数据库列表执行,配置文件格式如下(redisdb_list.txt):

IP,Port,Password,endpoint

192.168.1.56,7021,zhoujy,redis-56:7021

192.168.1.55,7021,zhoujy,redis-55:7021

最后执行:

python redis_monitor.py redisdb_list.txt

3) MongoDB 收集信息脚本(mongodb_monitor.py)

...后续添加

4)其他相关的监控(需要装上agent),比如下面的指标:

告警项触发条件备注

load.1min

all(#3)>10

Redis服务器过载,处理能力下降

cpu.idle

all(#3)<10

CPU idle过低,处理能力下降

df.bytes.free.percent

all(#3)<20

磁盘可用空间百分比低于20%,影响从库RDB和AOF持久化

mem.memfree.percent

all(#3)<15

内存剩余低于15%,Redis有OOM killer和使用swap的风险

mem.swapfree.percent

all(#3)<80

使用20% swap,Redis性能下降或OOM风险

net.if.out.bytes

all(#3)>94371840

网络出口流量超90MB,影响Redis响应

net.if.in.bytes

all(#3)>94371840

网络入口流量超90MB,影响Redis响应

disk.io.util

all(#3)>90

磁盘IO可能存负载,影响从库持久化和阻塞写

相关文档:

https://www.cnblogs.com/zhoujinyi/p/6645104.html

  • 0
    点赞
  • 0
    收藏
    觉得还不错? 一键收藏
  • 0
    评论

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值