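Goal: while monitoring the service on port 3306, restart MySQL remotely on the first CRITICAL soft state, or on the first CRITICAL hard state reached after the previous attempt did not succeed. Before each remote restart, record the reported event in a database.
Technologies involved:
(1) How Nagios event handlers work
(2) Running commands over passwordless SSH login
(3) Accessing MySQL from Perl
If you are comfortable with these three topics, following this article should be no trouble.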
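The trigger condition in the goal can be sketched as a small shell function. This is only an illustration, not the handler used later in the article: the function name should_restart is hypothetical, and its three arguments are assumed to be passed from the standard Nagios macros $SERVICESTATE$, $SERVICESTATETYPE$ and $SERVICEATTEMPT$.

```shell
#!/bin/sh
# Hypothetical helper: decide whether the event handler should restart MySQL.
# $1 = $SERVICESTATE$, $2 = $SERVICESTATETYPE$, $3 = $SERVICEATTEMPT$
should_restart() {
    state=$1
    state_type=$2
    attempt=$3

    # Only CRITICAL results are of interest.
    [ "$state" = "CRITICAL" ] || return 1

    case "$state_type" in
        SOFT) [ "$attempt" -eq 1 ] ;;  # first soft CRITICAL: try a restart
        HARD) return 0 ;;              # hard CRITICAL: the earlier attempt did not help
        *)    return 1 ;;
    esac
}
```

A handler script could call it as `should_restart "$1" "$2" "$3" && ...` and do nothing otherwise.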
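##Getting started##
Preparation
I. Setting up passwordless SSH login
Goal: the nagios user logs in to the server without a password
Most of you have set up passwordless login for root before. Today, though, I am setting it up for an ordinary user, nagios (thanks to my colleague for the technical support).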
Role      Host_ip        Notes
Client    192.168.x.x    The Nagios monitoring host acts as the client so that it can run scripts remotely
Server    192.168.x.y    Holds the service start scripts, e.g. the mysql script
Client side (192.168.x.x) setup
---------------------------------------------------------------------------------------------------
(1) Create the nagios user (steps omitted; the server needs this user too)
(2) Switch to the nagios user with su - nagios and run
ssh-keygen -t rsa
Press Enter at every prompt; no passphrase is needed.
(3) Copy the public key to the nagios home directory on the server
[nagios@nagios ~]$ scp .ssh/id_rsa.pub nagios@192.168.x.y:/home/nagios/
The authenticity of host '192.168.x.y (192.168.x.y)' can't be established.
RSA key fingerprint is 66:9a:b5:86:3d:81:22:9b:f8:67:9e:af:aa:4c:4a:97.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added '192.168.x.y' (RSA) to the list of known hosts.
nagios@192.168.x.y's password:
id_rsa.pub 100% 411 0.4KB/s 00:00
---------------------------------------------------------------------------------------------------
Server side (192.168.x.y) setup
--------------------------------------------------------------------------------------------------
(1) Log in to the server with the nagios account
(2) Create the directory: mkdir /home/nagios/.ssh
(3) Append the public key to the authorized_keys file:
cat /home/nagios/id_rsa.pub >>.ssh/authorized_keys
(4) Set the permissions (as root, or as nagios with rights granted through visudo):
chmod 700 /home/nagios/.ssh
chmod 600 /home/nagios/.ssh/authorized_keys
Check
Permission check on the server
[root@centos-server nagios]# ls -la /home/nagios|grep .ssh
drwx------ 2 nagios nagios 4096 Aug 3 09:04 .ssh
[root@centos-server nagios]# ls -la /home/nagios/.ssh/
total 12
drwx------ 2 nagios nagios 4096 Aug 3 09:04 .
drwx------ 4 nagios nagios 4096 Aug 3 09:03 ..
-rw------- 1 nagios nagios 411 Aug 3 09:04 authorized_keys
Make sure of the following: the .ssh directory has mode 700, authorized_keys has mode 600,
and both are owned by the nagios user.
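The permission check above can also be scripted. The following is a minimal sketch, assuming GNU stat (the -c '%a' option prints the octal mode); the function name check_ssh_perms is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if <home>/.ssh is 700 and
# <home>/.ssh/authorized_keys is 600. Assumes GNU coreutils stat.
check_ssh_perms() {
    home_dir=$1
    dir_mode=$(stat -c '%a' "$home_dir/.ssh") || return 1
    key_mode=$(stat -c '%a' "$home_dir/.ssh/authorized_keys") || return 1
    [ "$dir_mode" = "700" ] && [ "$key_mode" = "600" ]
}
```

For example, `check_ssh_perms /home/nagios || echo "fix the permissions"` run on the server.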
Login test from the client
[nagios@nagios ~]$ ssh nagios@192.168.x.y
Last login: Wed Aug 3 09:15:59 2011 from 192.168.x.x
[nagios@centos-server ~]$
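See? Logging in from 192.168.x.x to 192.168.x.y no longer asks for a password.
If it does not work for you, check the permissions described above first. I once kept my colleague busy for half a day over exactly this kind of permission problem.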
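Since the event handler will later run without anyone at the keyboard, it can help to test the login non-interactively as well. A minimal sketch: the ssh option BatchMode=yes makes ssh fail instead of prompting for a password. The function name and the SSH_CMD override (there only so the function can be tried without a real host) are assumptions for illustration.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if key-based login works without a prompt.
# SSH_CMD is an illustrative override; it defaults to the real ssh client.
can_login_without_password() {
    target=$1
    # BatchMode=yes disables password prompts; "true" is a harmless remote command.
    ${SSH_CMD:-ssh} -o BatchMode=yes -o ConnectTimeout=5 "$target" true
}
```

On the client, `can_login_without_password nagios@192.168.x.y && echo OK` should print OK once the key and permissions are in place.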
II. Running remote commands over passwordless login
Goal: the nagios user starts the MySQL service on the server remotely
-----------------------------------------------------------------------------------------------
Server side (192.168.x.y) setup
------------------------------------------------------------------------------------------------
(1) Set up the MySQL control script
Run the following SQL statements to create a user (admin) with root privileges and the password (controlmysql):
GRANT ALL PRIVILEGES ON *.* TO 'admin'@'localhost' IDENTIFIED BY 'controlmysql';
GRANT ALL PRIVILEGES ON *.* TO 'admin'@'127.0.0.1' IDENTIFIED BY 'controlmysql';
Purpose: this account is used to start and shut down the MySQL service
MySQL control (start/stop, etc.) script
#!/bin/sh
# Control script for the MySQL instance on port 3306.
mysql_port=3306
mysql_username="admin"
mysql_password="controlmysql"
mysql_scripts_path="/data0/mysql/3306"
mysqld_path="/usr/local/webserver/mysql"

start_mysql()
{
    printf "Starting MySQL...\n"
    # Redirect stdout first, then stderr, so that both are discarded and
    # nothing stays attached to the calling (ssh) session.
    /bin/sh ${mysqld_path}/bin/mysqld_safe --defaults-file=/data0/mysql/${mysql_port}/my.cnf > /dev/null 2>&1 &
}

stop_mysql()
{
    printf "Stopping MySQL...\n"
    ${mysqld_path}/bin/mysqladmin -u ${mysql_username} -p${mysql_password} -S /tmp/mysql.sock shutdown
}

restart_mysql()
{
    printf "Restarting MySQL...\n"
    stop_mysql
    sleep 5
    start_mysql
}

kill_mysql()
{
    # print (not printf) emits one PID per line, so several matches stay separate.
    kill -9 $(ps -ef | grep 'bin/mysqld_safe' | grep -v 'grep' | awk '{print $2}')
    kill -9 $(ps -ef | grep 'libexec/mysqld' | grep -v 'grep' | awk '{print $2}')
}

if [ "$1" = "start" ]; then
    start_mysql
elif [ "$1" = "stop" ]; then
    stop_mysql
elif [ "$1" = "restart" ]; then
    restart_mysql
elif [ "$1" = "kill" ]; then
    kill_mysql
else
    printf "Usage: ${mysql_scripts_path}/mysql {start|stop|restart|kill}\n"
fi
(2) Configure sudo so that the nagios user may run the script
**If sudo is not installed: yum -y install sudo**
#visudo
Add
nagios ALL=(root) NOPASSWD:/data0/mysql/3306/mysql start
Check
Script test on the server
[root@centos-server ~]# netstat -an|grep 3306
[root@centos-server ~]#
This shows that MySQL is not running
[root@centos-server ~]# /data0/mysql/3306/mysql start
Starting MySQL...
[root@centos-server ~]# netstat -an|grep 3306
tcp 0 0 :::3306 :::* LISTEN
[root@centos-server ~]#
The script works as expected
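The manual netstat check can be wrapped in a small function for later reuse in scripts. This is a sketch only; the name is_listening is hypothetical. It reads netstat -an output on standard input and, unlike a plain grep 3306, does not match other ports such as 33060.

```shell
#!/bin/sh
# Hypothetical helper: read "netstat -an" output on stdin and succeed
# if some socket is in LISTEN state on the given port.
is_listening() {
    port=$1
    # Match ":PORT" or ".PORT" followed by whitespace on a LISTEN line.
    grep -Eq "[:.]${port}[[:space:]].*LISTEN"
}
```

Usage: `netstat -an | is_listening 3306 && echo "mysql is up"`.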
Test from the client (logged in as the nagios user)
[nagios@nagios ~]$ ssh nagios@192.168.x.y "sudo /data0/mysql/3306/mysql start"
sudo: sorry, you must have a tty to run sudo
Fix:
On the server, run visudo and comment out the following line
Defaults requiretty
Try again
[nagios@nagios ~]$ ssh nagios@192.168.x.y "sudo /data0/mysql/3306/mysql start"
Starting MySQL...
MySQL starts normally
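Commenting out requiretty changes the behaviour for every user. As an alternative, sudoers also accepts a per-user override of the form Defaults:nagios !requiretty. The sketch below writes such a fragment to a file so that it can be syntax-checked before it is merged. The function name is hypothetical, and allowing the restart argument in addition to start is my assumption, based on the restart mentioned in the goal.

```shell
#!/bin/sh
# Hypothetical helper: write a sudoers fragment for the nagios user to the
# file given as $1. The "restart" entry is an assumption, not from the article.
write_nagios_sudoers() {
    cat > "$1" <<'EOF'
# Let nagios use sudo without a tty; requiretty stays on for everyone else.
Defaults:nagios !requiretty
nagios ALL=(root) NOPASSWD: /data0/mysql/3306/mysql start, /data0/mysql/3306/mysql restart
EOF
}
```

After `write_nagios_sudoers /tmp/nagios.sudoers`, running `visudo -cf /tmp/nagios.sudoers` as root checks the syntax; the lines can then be added through visudo as before.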
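Check on the server that port 3306 is now listening
Congratulations, the groundwork is done. We can now move on to the Nagios configuration on the monitoring host.
III. Configuration on the Nagios monitoring host
(1) The basic Nagios configuration files are as follows: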
mfs_hosts.cfg
define host{
use mfs-server
host_name mfs-1