网上有很多关于nagios 监控mongodb 的介绍,但是无一例外,手工执行 python 能获取客户端的值,nagios 监控界面上面显示null, 针对该问题,本人用shell 脚本重新封装了部分监控选项,让nagios 能正常监控mongodb 服务器

如果想了解nagios 监控mongodb的部署过程,请点击:

http://www.2cto.com/database/201410/341855.html

https://github.com/mzupan/nagios-plugin-mongodb/blob/master/README.md


实现原理:

利用shell 脚本 获取 check_mongodb.py 的 值,然后传给nagios 实现nagios 警告

 

 

系统环境变量:

centos5.8 64bit

python2.4.3

pymongo  1.9

 

安装pymongo

tar -xvzfpymongo-1.9.tar.gz

cd pymongo-1.9

pythonsetup.py install

 

检查是否安装pymongo

[root@P-masternagios-plugin-mongodb]# python

Python2.4.3 (#1, Feb 22 2012, 16:05:45)

[GCC4.1.2 20080704 (Red Hat 4.1.2-52)] on linux2

Type"help", "copyright", "credits" or"license" for more information.

>>>import pymongo

>>>pymongo.version

'1.9'

>>>import sys

>>>sys.exit()

 

 

mv  nagios-plugin-mongodb-bycsc.zip  /usr/local/nagios/libexec/

unzip nagios-plugin-mongodb-bycsc.zip

chown -R nagios:nagios  /usr/local/nagios/libexec/nagios-plugin-mongodb

chmod -R 755  /usr/local/nagios/libexec/nagios-plugin-mongodb

 

 

执行check_mongodb.py 看是否能正常运行,如下显示说明正常运行

[root@P-masternagios-plugin-mongodb]# ./check_mongodb.py -h

。。。省略

  -c COLLECTION, --collection=COLLECTION

                        Specify the collectionto check

  -T SAMPLE_TIME, --time=SAMPLE_TIME

                        Time used to samplenumber of pages faults

 

nagios服务配置:

1.nagios 服务器root 账户 定时任务配置:

具体参数请参考:/usr/local/nagios/libexec/nagios-plugin-mongodb/check_mongodb.sh脚本

 

*/10 * * * *  /usr/local/nagios/libexec/nagios-plugin-mongodb/check_mongodb.sh  10.0.8.17 ALL 30000

*/10 * * * * /usr/local/nagios/libexec/nagios-plugin-mongodb/check_mongodb.sh  10.0.8.18 ALL 30000     

*/10 * * * * /usr/local/nagios/libexec/nagios-plugin-mongodb/check_mongodb.sh  10.0.8.19 ALL 30000     

 

利用定时任务,把以上服务器的状态检查结果重定向到/tmp文件夹下

2.nagios 服务器端 配置文件配置

commands 配置文件:

vi  /usr/local/nagios/etc/objects/commands.cfg  添加:

definecommand {

    command_name    check_mongodb

    command_line   /usr/local/nagios/libexec/nagios-plugin-mongodb/check_mongodb.sh  '$HOSTADDRESS$'  '$ARG1$' '$ARG2$'

 

}

 

命令解说:

/usr/local/nagios/libexec/nagios-plugin-mongodb/check_mongodb.sh    IP地址  监控选项  端口

 

check_mongodb.py监控选项可以有如下:

usage:check_mongodb.py [options]

 

check_mongodb.py:error: option -A: invalid choice: 'memordfd' (choose from 'connect','connections', 'replication_lag', 'replication_lag_percent', 'replset_state','memory', 'memory_mapped', 'lock', 'flushing', 'last_flush_time','index_miss_ratio', 'databases', 'collections', 'database_size','database_indexes', 'collection_indexes', 'collection_size', 'queues', 'oplog','journal_commits_in_wl', 'write_data_files', 'journaled', 'opcounters','current_lock', 'replica_primary', 'page_faults', 'asserts','queries_per_second', 'page_faults', 'chunks_balance', 'connect_primary','collection_state', 'row_count', 'replset_quorum')

 

目前check_mongodb.sh只是配置了 'connect'  'connections' replset_state 'memory' 选项

 

可以参考 README.md 列子进行配置!

测试配置:

nagios 账户下运行命令检查mongodb服务器:(nagios账户下面执行)

su -nagios

# su  - nagios

$/usr/local/nagios/libexec/nagios-plugin-mongodb/check_mongodb.sh 10.0.8.19memory 30000

OK -Memory Usage: 0.04GB resident, 0.78GB virtual, 0.08GB mapped, 0.16GBmappedWithJournal

如显示以上结果,则通过

 

编写nagios 服务器端 mongodb 服务器的配置文件:

 

[root@P-masternagios-plugin-mongodb]# cat  /usr/local/nagios/etc/objects/server-8-17.cfg

definehost{

        use           linux-server

        host_name               server-8-17

        alias                   server-8-17

        address                 10.0.8.17

        }

 

defineservice{

        use             generic-service

        host_name       server-8-17

        service_description     SSH

        check_command   check_ssh

        }

 

......省略其他配置文件

#检测mongodb服务的连接时间

defineservice{

        use             generic-service

        host_name       server-8-17

        service_description     check mongodb connect 30000

        check_command            check_mongodb!connect!30000

 

        }

 

#检查mongodb的连接数

defineservice{

        use             generic-service

        host_name       server-8-17

        service_description     check mongodb connections 30000

        check_command            check_mongodb!connections!30000

 

        }

 

#检查mongodb内存使用率

defineservice{

        use             generic-service

        host_name       server-8-17

        service_description     check mongodb memory 30000

        check_command           check_mongodb!memory!30000

 

        }

 

#mongo复制的状态

defineservice{

        use             generic-service

        host_name       server-8-17

        service_description     check mongodb replset state 30000

        check_command           check_mongodb!replset_state!30000

}

 

#检查mongodb复制完成的百分比率确保primary和standby的time是一致的

defineservice{

        use             generic-service

        host_name       server-8-17

        service_description     check mongodb replication lag 30000

        check_command           check_mongodb!replication_lag!30000

 

}

备注:

检查筏值 请在/usr/local/nagios/libexec/nagios-plugin-mongodb/check_mongodb.sh 里面配置 -W -C 参数



请点击下载 重新封装过的nagios-plugin-mongodb: http://down.51cto.com/data/2061502

下载后 放在/usr/local/nagios/libexec/目录,

解压:unzip nagios-plugin-mongodb-bycsc.zip

赋权:chown -R nagios:nagios nagios-plugin-mongodb 

 

##################################################################

 more /usr/local/nagios/libexec/nagios-plugin-mongodb/check_mongodb.sh

#!/bin/sh


# crontab by user root

# */5 * * * *  /usr/local/nagios/libexec/nagios-plugin-mongodb/check_mongodb.sh 192.168.0.1 ALL 27017 > /dev/null 2>&1


# run process nrpe by user nagios

#command[check_mongodb_connect]=/usr/local/nagios/libexec/nagios-plugin-mongodb/check_mongodb.sh connect

#command[check_mongodb_connections]=/usr/local/nagios/libexec/nagios-plugin-mongodb/check_mongodb.sh connections

#command[check_mongodb_memory]=/usr/local/nagios/libexec/nagios-plugin-mongodb/check_mongodb.sh memory

#command[check_mongodb_replset_state]=/usr/local/nagios/libexec/nagios-plugin-mongodb/check_mongodb.sh replset_state



#VERSION="mycheck_mongodb.sh v1.0a, by csc, 2015-06-17."


######################################

RUN_BY_ROOT()

{

tmpfile=/tmp/check_mongodb_$1_$2_$3.tmp

case $2 in 

connect)

        /usr/local/nagios/libexec/nagios-plugin-mongodb/check_mongodb.py   -H $1  -A $2 -P $3 -W 2 -C 4 >$tmpfile

        ;;

connections)

        /usr/local/nagios/libexec/nagios-plugin-mongodb/check_mongodb.py   -H $1  -A $2 -P $3 -W 70 -C 80 >$tmpfile

        ;;


memory)

        /usr/local/nagios/libexec/nagios-plugin-mongodb/check_mongodb.py   -H $1  -A $2 -P $3 -W 20 -C 28 >$tmpfile

        ;;



replset_state)

         /usr/local/nagios/libexec/nagios-plugin-mongodb/check_mongodb.py   -H $1  -A $2 -P $3 -W 0 -C 0 >$tmpfile

        ;;


replication_lag)

/usr/local/nagios/libexec/nagios-plugin-mongodb/check_mongodb.py   -H $1  -A $2 -P $3 -W 15 -C 30 >$tmpfile

        ;;




*)

        echo "./check_mongodb.sh  192.168.0.1 connect  27017"

        ;;

esac



}



######################################

RUN_BY_NAGIOS()

{


tmpfile=/tmp/check_mongodb_$1_$2_$3.tmp


if [ -f $tmpfile ];then

cat $tmpfile|grep OK|grep -v grep >/dev/null

if [ $? -eq 0 ];then

/bin/echo -e |cat $tmpfile

 exit 0

fi


cat  $tmpfile|grep WARNING|grep -v grep>/dev/null

if [ $? -eq 0 ];then

/bin/echo -e |cat  $tmpfile

exit 1

fi


cat  $tmpfile |grep CRITICAL|grep -v grep>/dev/null

if [ $? -eq 0 ];then

/bin/echo -e |cat  $tmpfile

exit 2

fi

 

else

echo "$tmpfile is not exist!"

exit 1

fi


}

######################################


USER_NAME=`/usr/bin/whoami`

if [ "$USER_NAME" = "root" ]; then

  RUN_BY_ROOT $1 connect $3 

  RUN_BY_ROOT $1 connections $3

  RUN_BY_ROOT $1 memory $3

  RUN_BY_ROOT $1 replset_state $3

  RUN_BY_ROOT $1 replication_lag $3

else

  RUN_BY_NAGIOS $1 $2 $3

fi