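Overview
This setup covers alerts for MySQL downtime, abnormal replication IO/SQL thread status, and excessive replication lag.
Sending mail through an external SMTP server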
# vi /etc/mail.rc
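Add the following settings:
# Sender address
set from=WENTAO_Wanna@126.com
# External SMTP server address
set smtp=smtp.126.com
# Username for authenticating with the external SMTP server
set smtp-auth-user=WENTAO_Wanna
# Password for SMTP authentication. Note that this is the authorization code issued by the mail provider, not the mailbox login password
set smtp-auth-password=123456
# Authentication method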
set smtp-auth=login
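A quick way to confirm that all five settings made it into the file is to grep for each key. The following is only a sketch: the function name check_mailrc is hypothetical, and it checks that each key is present, not that the credentials are valid.

```shell
#!/bin/bash
# Hypothetical helper: report which of the mailx settings used above
# are missing from a given rc file (defaults to /etc/mail.rc).
check_mailrc() {
    local rc_file="${1:-/etc/mail.rc}"
    local key missing=0
    for key in from smtp smtp-auth-user smtp-auth-password smtp-auth; do
        # Match lines of the form "set key=value", ignoring leading spaces.
        if ! grep -Eq "^[[:space:]]*set[[:space:]]+${key}=" "$rc_file"; then
            echo "missing: set ${key}=..."
            missing=1
        fi
    done
    return "$missing"
}
```

Calling check_mailrc with no argument inspects /etc/mail.rc; a zero exit status means every key was found.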
Testing
# echo mail_content | mail -s "mail_title" WENTAO_Wanna@126.com,WENTAO_Wanna@foxmail.com
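For reuse in scripts, the test command above can be wrapped in a small function. This is a sketch under assumptions: send_alert, ALERT_TO and MAIL_CMD are hypothetical names, not part of mailx, and MAIL_CMD exists only so that the mail command can be swapped out while testing.

```shell
#!/bin/bash
# Hypothetical wrapper around the mail test shown above.
ALERT_TO="${ALERT_TO:-WENTAO_Wanna@126.com}"   # recipient(s), comma-separated
MAIL_CMD="${MAIL_CMD:-mail}"                   # assumption: mailx is installed as "mail"

send_alert() {
    local subject="$1" body="$2"
    # The body is fed on stdin so mail never waits for interactive input.
    if printf '%s\n' "$body" | "$MAIL_CMD" -s "$subject" "$ALERT_TO"; then
        return 0
    fi
    echo "failed to send alert: $subject" >&2
    return 1
}
```

A call such as send_alert "mail_title" "mail_content" then behaves like the one-line test, with a non-zero exit status if the mail command fails.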
Automatic alert emails with a shell script
Create the directory that holds the alert files: /data/mysql_warning
# cd /data
# mkdir mysql_warning
Create the monitoring script: /data/mysql_warning/mysql_monitoring.sh
# cd /data/mysql_warning
# touch mysql_monitoring.sh
# vi mysql_monitoring.sh
Enter the following content:
#!/bin/bash
# Check the MySQL slave's running status
# Crontab time: 00:10
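# Prints 3306 if something is listening on that port (IPv4 or IPv6), otherwise nothing
MYSQLPORT=`netstat -lnt | awk '$4 ~ /:3306$/ {print "3306"; exit}'`
# -w matches "inet" but not "inet6", so only the IPv4 address line is kept
MYSQLIP=`ifconfig eth0 | grep -w "inet" | awk '{print $2}'`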
STATUS=$(/usr/bin/mysql -uroot -pyour-password -S /data/mysql/mysql.sock -e "show slave status\G" | grep -i "running")
DELAYED=$(/usr/bin/mysql -uroot -pyour-password -S /data/mysql/mysql.sock -e "show slave status\G" | grep -i "Seconds_Behind_Master")
#echo "$STATUS"
IO_env=`echo $STATUS | grep IO | awk ' {print $2}'`
echo "============================="
echo "$STATUS"
echo "$DELAYED"
SQL_env=`echo $STATUS | grep SQL | awk ' {print $4}'`
DELAYED_env=`echo $DELAYED | awk ' {print $2}'`
DATA=`date +"%y-%m-%d %H:%M:%S"`
checkMysqlStatus() {
    if [ "$MYSQLPORT" = "3306" ]
    then
        # The port is listening: confirm that the server also answers queries
        /usr/bin/mysql -uroot -pyour-password --connect_timeout=5 -e "show databases;" >/dev/null 2>&1
        if [ $? -ne 0 ]
        then
            echo "Server: $MYSQLIP mysql is down, please try to restart mysql manually!" | mail -s "WARN! server: $MYSQLIP mysql is down." WENTAO_Wanna@126.com
        else
            echo "mysql is running..."
        fi
    else
        # Nothing is listening on port 3306; pipe a body in so mail does not wait on stdin
        echo "Server: $MYSQLIP mysql is not listening on port 3306." | mail -s "WARN!Server: $MYSQLIP mysql is down." WENTAO_Wanna@126.com
    fi
}
echo "================================="
echo "$IO_env"
echo "$SQL_env"
echo "$DELAYED_env"
echo "================================="
checkMysqlStatus
# Alert when the lag is 60 s or more; a non-numeric value such as NULL fails the test quietly
if [ "$DELAYED_env" -ge 60 ] 2>/dev/null
then
    echo "MySQL Slave is delayed $DELAYED_env s!"
    echo "####### $DATA #########" >> /data/mysql_slave/mysql_slave_status.log
    echo "MySQL Slave is delayed $DELAYED_env s!" >> /data/mysql_slave/mysql_slave_status.log
    echo "MySQL Slave is delayed $DELAYED_env s!" | mail -s "WARN! $MYSQLIP MySQL Slave is delayed $DELAYED_env s!" WENTAO_Wanna@126.com
fi
if [ "$IO_env" = "Yes" ] && [ "$SQL_env" = "Yes" ]
then
    echo "MySQL Slave is running!"
else
    echo "####### $DATA #########" >> /data/mysql_slave/mysql_slave_status.log
    echo "MySQL Slave is not running!" >> /data/mysql_slave/mysql_slave_status.log
    echo "MySQL Slave is not running!" | mail -s "WARN! $MYSQLIP MySQL Slave is not running." WENTAO_Wanna@126.com
fi
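The script extracts the thread states by collapsing the status output onto one line and counting words, which breaks if the order or number of matching lines changes. A more targeted alternative is to pick each field by name. The sketch below assumes the usual "Name: value" layout of show slave status\G; the function name slave_field and the sample text are hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: read "show slave status\G" output on stdin and
# print the value of one named field.
slave_field() {
    local name="$1"
    # Strip leading whitespace, match the field name exactly (including
    # the colon), and print everything after "Name: ".
    awk -v name="$name" '
        { sub(/^[ \t]+/, "") }
        index($0, name ":") == 1 {
            print substr($0, length(name) + 3)
            exit
        }
    '
}

# Sample input standing in for the real mysql output.
sample_status='             Slave_IO_Running: Yes
            Slave_SQL_Running: No
        Seconds_Behind_Master: NULL'

io_state=$(printf '%s\n' "$sample_status" | slave_field Slave_IO_Running)
sql_state=$(printf '%s\n' "$sample_status" | slave_field Slave_SQL_Running)
echo "IO=$io_state SQL=$sql_state"
```

Because the colon is part of the match, Slave_SQL_Running is not confused with Slave_SQL_Running_State. In the monitoring script the sample text would be replaced by the output of the mysql command.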
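Monitored parameters:
Slave_IO_Running: monitors the io_thread. Yes means the io_thread is connected to the master and replicating normally; No means communication with the master has failed, most often because of a network problem between master and slave.
Slave_SQL_Running: indicates whether the sql_thread is working, that is, whether the replicated statements execute successfully. Typical failures are a duplicate primary key or a table that does not exist.
Seconds_Behind_Master: the difference between the timestamp (ts) of the event the sql_thread is executing and the timestamp of the event the io_thread has already copied. Its possible values are: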
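NULL: either the io_thread or the sql_thread has failed, that is, the Running status of that thread is No rather than Yes.
0: the value we most want to see. Replication is healthy and lag can be considered nonexistent.
Positive value: the slave is lagging; the larger the number, the further it is behind the master.
Negative value: very rare. I have only heard senior DBAs say they have seen it. It is in fact a bug: the parameter does not support negative values, so they should never appear.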
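The four cases above can be turned into a small classification step before alerting. This is a sketch only: classify_lag and its labels are hypothetical names, and the default threshold of 60 seconds simply mirrors the monitoring script.

```shell
#!/bin/bash
# Hypothetical helper: map a Seconds_Behind_Master value to one of the
# cases listed above.
classify_lag() {
    local value="$1" threshold="${2:-60}"
    case "$value" in
        NULL|"")   echo "stopped" ;;   # a replication thread is not running
        -*)        echo "invalid" ;;   # negative: should never appear
        *[!0-9]*)  echo "invalid" ;;   # anything else that is not a number
        *)
            if [ "$value" -ge "$threshold" ]; then
                echo "lagging"         # at or above the alert threshold
            else
                echo "ok"              # zero or below the alert threshold
            fi
            ;;
    esac
}
```

For example, classify_lag NULL prints stopped, while classify_lag 75 prints lagging.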
Make the file executable
# chmod +x /data/mysql_warning/mysql_monitoring.sh
Edit /etc/crontab
# vi /etc/crontab
Append the following line:
10 00 * * * root /data/mysql_warning/mysql_monitoring.sh
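This runs the monitoring script as root once a day at 00:10. To run it every 30 minutes between 08:00 and midnight instead, the time fields would be */30 8-23 * * *.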
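Entries in /etc/crontab carry a user field between the five time fields and the command, which per-user crontabs do not have. The sketch below splits an entry into named fields to make that layout explicit; describe_cron_line is a hypothetical helper, not a cron tool.

```shell
#!/bin/bash
# Hypothetical helper: split an /etc/crontab entry into its named fields.
describe_cron_line() {
    local minute hour dom month dow user command
    # read assigns the first six words to the first six names and the
    # remainder of the line to "command".
    read -r minute hour dom month dow user command <<< "$1"
    if [ -z "$command" ]; then
        echo "incomplete entry: expected 5 time fields, a user and a command" >&2
        return 1
    fi
    echo "minute=$minute hour=$hour day=$dom month=$month weekday=$dow user=$user command=$command"
}

describe_cron_line "10 00 * * * root /data/mysql_warning/mysql_monitoring.sh"
```

An entry copied from a per-user crontab, which lacks the user field, is reported as incomplete when its command is a single word.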
Restart crond so that the setting takes effect
# systemctl enable crond.service #start at boot
# systemctl restart crond.service #restart now
Output of a manual run
# sh /data/mysql_warning/mysql_monitoring.sh
Warning: Using a password on the command line interface can be insecure.
Warning: Using a password on the command line interface can be insecure.
=============================
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Slave_SQL_Running_State: Slave has read all relay log; waiting for the slave I/O thread to update it
Seconds_Behind_Master: 0
=================================
Yes
Yes
0
=================================
mysql is running...
MySQL Slave is running!
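The two warnings at the top of the output come from passing the password with -p on the command line. One way to avoid them is to keep the credentials in a client option file readable only by its owner and point mysql at it with --defaults-extra-file, which has to be the first option on the command line. The sketch below assumes a hypothetical helper name and file location, and a password without double quotes or backslashes.

```shell
#!/bin/bash
# Hypothetical helper: write a MySQL client option file with restrictive
# permissions so the password does not appear on the command line.
write_client_cnf() {
    local cnf_file="$1" user="$2" password="$3" socket="$4"
    (
        umask 077   # new files are created without group/other access
        printf '[client]\nuser=%s\npassword="%s"\nsocket=%s\n' \
            "$user" "$password" "$socket" > "$cnf_file"
    )
    chmod 600 "$cnf_file"   # also tighten a file that already existed
}

# Example (the path is an assumption, adjust it to your layout):
#   write_client_cnf /root/.mysql_monitor.cnf root 'your-password' /data/mysql/mysql.sock
#   /usr/bin/mysql --defaults-extra-file=/root/.mysql_monitor.cnf -e "show slave status\G"
```

With the credentials in the option file, the -uroot, -pyour-password and -S arguments can be dropped from the mysql calls in the monitoring script.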
Email result
[Image: screenshot of the alert email]
References