Mysql的高可用方案五花八门,本人人为最可靠、稳定、实用的还是mha。
mha方案涉及到5个节点:manager\secondary-check1\secondary-check2\db1\db2
备注:
a.secondary-check1\secondary-check2虽是可选的,但强烈建议使用,防止因为网络原因误切换;
b.官方推荐数据库3节点,但我人为业务量不是超大的情况下2节点足够,经过测试2节点可以正常运行,只是切换后有些日志要注意下(不要认为是异常)。
安装
1.在manager\db1\db2上安装依赖
# rpm -Uvh http://mirrors.kernel.org/fedora-epel/6/i386/epel-release-6-8.noarch.rpm
# yum install -yperl-DBD-MySQL perl-Config-Tiny perl-Log-Dispatch
#yum install -yperl-Module-Install perl-CPAN
2.修改源码
修改源码不是为了装13,而是确实有不爽的地方。
MHA5.6推荐使用统一的binlog server,理由是binlog server性能比slave好,同步日志时不容易丢数据;但我认为:
a、我们的slave压力不大,而且slave即使有压力,也只是slave_sql会有些压力,slave_io压力很有限,不会因为这个导致切换丢数据;
b、我们的slave都是半同步复制的,主从不同步会有报警监控的,正常情况下都一直同步状态,不会丢数据;
c、引入binlog server,还要考虑单点问题,太繁琐。。。
所以,我没有实用binlog server.
但不使用binlog server的情况下,如果用了gtid,主从切换时mha就不会从主库上拷贝日志补齐从库,NND,这个受不了,所以要改下逻辑判断,不管有没有使用gtid,都要从主库拷贝日志补齐从库。
另外,从库我习惯把event_scheduler关闭,虽然这不是必须的(从库job都是disable on slave状态,也能保证不运行)。
1.原逻辑是未启用gtidde执行save_master_binlog,改为不管是否启用gitd,都执行save_master_binlog
vi mha4mysql-manager-0.56/lib/MHA/MasterFailover.pm
# if ( !$_server_manager->is_gtid_auto_pos_enabled() ) {
$log->info();
$log->info("* Phase 3.2: Saving Dead Master's Binlog Phase..\n");
$log->info();
save_master_binlog($dead_master);
# }
2.ave_master_binlog_internal方法中,save_binary_logs参数改为和save_from_binlog_server方法中的相同,修改了一下三个参数:
--handle_raw_binlog=0
--skip_filter=1
--oldest_version=$_server_manager->get_oldest_version()
vi mha4mysql-manager-0.56/lib/MHA/MasterFailover.pm
# my $command =
#"save_binary_logs --command=save --start_file=$master_log_file --start_pos=$read_master_log_pos --binlog_dir=$dead_master->{master_binlog_dir} --output_file=$_diff_binary_log_remote --handle_raw_binlog=$dead_master->{handle_raw_binlog} --disable_log_bin=$dead_master->{disable_log_bin} --manager_version=$MHA::ManagerConst::VERSION";
my $command =
"save_binary_logs --command=save --start_file=$master_log_file --start_pos=$read_master_log_pos --binlog_dir=$dead_master->{master_binlog_dir} --output_file=$_diff_binary_log_remote --handle_raw_binlog=0 --skip_filter=1 --disable_log_bin=$dead_master->{disable_log_bin} --manager_version=$MHA::ManagerConst::VERSION";
# unless ( $dead_master->{handle_raw_binlog} ) {
my $oldest_version = $_server_manager->get_oldest_version();
$command .= " --oldest_version=$oldest_version ";
# }
3.新增event_scheduler设置
vi mha4mysql-manager-0.56/lib/MHA/DBHelper.pm
sub disable_read_only($) {
my $self = shift;
if ( $self->is_read_only() eq "0" ) {
return 0;
}
else {
$self->execute("SET GLOBAL event_scheduler=1");
return $self->execute(Unset_Readonly_SQL);
}
}
3.在对应节点安装mha软件
manager,db1,db2上安装node
# cd mha4mysql-node-0.56
# perl Makefile.PL
# make
# make install
manager上安装manager
# cd mha4mysql-manager-0.56
# perl Makefile.PL
# make
# make install
manager上安装mysql客户端
略
4.ssh配置
以下节点要配置SSH免密登录
manager -> db1
manager -> db2
manager -> secondary
manager -> secondary
db1 -> db2
db2 -> db1
配置文件
==== /etc/masterha_default.cnf ====
[server default]
ssh_user=root
master_binlog_dir=/data1/binlog
remote_workdir=/data1/mha
secondary_check_script= masterha_secondary_check -s 192.168.1.10 -s 192.168.1.11
ping_interval=3
master_ip_failover_script= /root/mha/scripts/master_ip_failover
shutdown_script= /root/mha/scripts/power_manager.sh
report_script= /root/mha/scripts/send_report
#master_ip_online_change_script= /root/mha/scripts/master_ip_online_change
==== /root/mha/conf/app1.conf ====
[server default]
user=root
password=111111
manager_workdir=/root/mha/logs/app1
manager_log=/root/mha/logs/app1/manager.log
master_ip_failover_script= /root/mha/scripts/master_ip_failover --vip=192.168.1.100/24
[server1]
hostname=192.168.1.1
port=3306
ssh_port=22
check_repl_delay=0
[server2]
hostname=192.168.1.2
port=3306
ssh_port=22
check_repl_delay=0
脚本
master_ip_failover
#!/usr/bin/env perl
use strict;
use warnings FATAL =>'all';
use Getopt::Long;
my (
$command, $ssh_user, $orig_master_host, $orig_master_ip,
$orig_master_port, $new_master_host, $new_master_ip, $new_master_port,
$vip
);
GetOptions(
'command=s' => \$command,
'ssh_user=s' => \$ssh_user,
'orig_master_host=s' => \$orig_master_host,
'orig_master_ip=s' => \$orig_master_ip,
'orig_master_port=i' => \$orig_master_port,
'new_master_host=s' => \$new_master_host,
'new_master_ip=s' => \$new_master_ip,
'new_master_port=i' => \$new_master_port,
'vip=s' => \$vip,
);
my $ssh_start_vip = "/sbin/ip addr add $vip dev eth1";
my $ssh_stop_vip = "/sbin/ip addr del $vip dev eth1";
my $exit_code = 0;
exit &main();
sub main {
#print "\n\nIN SCRIPT TEST====$ssh_stop_vip==$ssh_start_vip===\n\n";
if ( $command eq "stopssh" ) {
# $orig_master_host, $orig_master_ip, $orig_master_port are passed.
# If you manage master ip address at global catalog database,
# invalidate orig_master_ip here.
my $exit_code = 1;
eval {
print "\n\n\n***************************************************************\n";
print "Disabling the VIP - $vip on old master: $orig_master_host\n";
print "***************************************************************\n\n\n\n";
&stop_vip();
$exit_code = 0;
};
if ($@) {
warn "Got Error: $@\n";
exit $exit_code;
}
exit $exit_code;
}
elsif ( $command eq "start" ) {
# all arguments are passed.
# If you manage master ip address at global catalog database,
# activate new_master_ip here.
# You can also grant write access (create user, set read_only=0, etc) here.
my $exit_code = 10;
eval {
print "\n\n\n***************************************************************\n";
print "Enabling the VIP - $vip on new master: $new_master_host \n";
print "***************************************************************\n\n\n\n";
&start_vip();
$exit_code = 0;
};
if ($@) {
warn $@;
exit $exit_code;
}
exit $exit_code;
}
elsif ( $command eq "status" ) {
print "Checking the Status of the script.. OK \n";
`ssh $ssh_user\@$orig_master_host \" $ssh_start_vip \"`;
exit 0;
}
#by meishidong
elsif ( $command eq "stop" ) {
print "command = stopssh, do nothing.\n";
exit 0;
}
else {
&usage();
exit 1;
}
}
# A simple system call that enable the VIP on the new master
sub start_vip() {
`ssh $ssh_user\@$new_master_host \" $ssh_start_vip \"`;
}
# A simple system call that disable the VIP on the old_master
sub stop_vip() {
`ssh $ssh_user\@$orig_master_host \" $ssh_stop_vip \"`;
}
sub usage {
print
"Usage: master_ip_failover -command=start|stop|stopssh|status -orig_master_host=host -orig_master_ip=ip -orig_master_port=port -new_master_host=host -new_master_ip=ip -new_master_port=port\n";
}
power_manager.sh
为了防止脑裂,如果主库无法SSH,则强制将主关闭,防止脑裂;
如果是物理机,则用远程访问管理卡关机(如dell的idrac);如果是vmware虚拟机,用vmware的API关闭虚拟机。具体逻辑封装到脚本power_manager.py中。
另一个文章会说明如何通过调用vmware的api关闭虚拟机。
#!/bin/bash
# If ssh is reachable reachable (OS is alive but mysqld is not running, i.e. data file is corrupt), MHA Manager passes the following arguments.
# --command=stopssh
# --ssh_user=(ssh username so that you can connect to the master)
# --host=(master's hostname)
# --ip=(master's ip address)
# --port=(master's port number)
# --pid_file=(master's pid file)
# If the master is not reachable via SSH, MHA Manager passes the following arguments.
# --command=stop
# --host=(master's hostname)
# --ip=(master's ip address)
#vmware server
vms_ip=192.168.1.101
vms_username=m******
vms_passwd=111111
vms_port=443
for arg in $@
do
eval `echo $arg | sed 's/--//g'`
done
po_tmp_log=/root/mha/logs/power_manager/power_manager.tmp.$ip.log
po_manager_log=/root/mha/logs/power_manager/power_manager.$ip.log
stopmysql="service mysql stop; sleep 3; ps -ef | grep mysqld_safe | grep -v grep | awk '{print \$2}' | xargs -r kill -9; sleep 1; ps -ef | grep mysqld | grep -v grep | awk '{print \$2}' | xarg
s -r kill -9"
echo >> $po_manager_log
date >> $po_manager_log
echo "command = $command" >> $po_manager_log
echo "ip = $ip" >> $po_manager_log
function stopssh()
{
ssh -p 22 $ssh_user@$ip $stopmysql >> $po_manager_log 2>&1
mysqlproc=`ssh -p 22 $ssh_user@$ip 'ps -ef | grep mysqld | grep -v grep | wc -l'`
if [ $mysqlproc -eq 0 ]; then
echo "mysql process are killed successfully." >> $po_manager_log
return 10
else
echo "mysql process are killed falied." >> $po_manager_log
return 1
fi
}
function stop()
{
python3 /root/mha/scripts/power_manager.py -t=poweroff -s=$vms_ip -u=$vms_username -p=$vms_passwd -o=$vms_port -i=$ip > $po_tmp_log
cat $po_tmp_log >> $po_manager_log
powerstate=`cat $po_tmp_log | grep vm_powerstate | awk '{print $2}'`
if [ "$powerstate"x = "poweredOffx" ]; then
echo "power off $ip successfully." >> $po_manager_log
return 0
else
echo "power off $ip failed." >> $po_manager_log
return 1
fi
}
function status()
{
python3 /root/mha/scripts/power_manager.py -t=status -s=$vms_ip -u=$vms_username -p=$vms_passwd -o=$vms_port -i=$ip > $po_tmp_log
cat $po_tmp_log >> $po_manager_log
powerstate=`cat $po_tmp_log | grep vm_powerstate | awk '{print $2}'`
echo "get powerState of $ip: $powerstate." >> $po_manager_log
if [ "$powerstate"x = "poweredOnx" ]; then
return 0
else
return 1
fi
}
case "$command" in
stopssh)
stopssh
exit $?
;;
stop)
stop
exit $?
;;
status)
status
exit $?
;;
*)
echo "Usage: power_manager [stopssh|stop|status]"
exit 2
esac
send_report
#!/usr/bin/perl
use strict;
use warnings FATAL => 'all';
use Getopt::Long;
#new_master_host and new_slave_hosts are set only when recovering master succeeded
my ( $dead_master_host, $new_master_host, $new_slave_hosts, $conf, $subject, $body );
GetOptions(
'orig_master_host=s' => \$dead_master_host,
'new_master_host=s' => \$new_master_host,
'new_slave_hosts=s' => \$new_slave_hosts,
'conf=s' => \$conf,
'subject=s' => \$subject,
'body=s' => \$body,
);
my $mailto = 'm****@sina.com';
system("/root/mha/scripts/send_mail.py '$mailto' '$subject' '$body' > /dev/null 2>&1");
system("/usr/bin/curl -s -d \"phoneNumber=15895811111\" -d \"msg=$subject\" http://sms.7881.com/interfaces/sendMsg.htm > /dev/null 2>&1");
exit 0;
send_mail.py
#!/usr/bin/env python
#-*- coding:utf-8 -*-
import smtplib
from email.mime.text import MIMEText
from email.utils import formatdate
from email.header import Header
import sys
default_encoding = 'utf-8'
if sys.getdefaultencoding() != default_encoding:
reload(sys)
sys.setdefaultencoding(default_encoding)
smtpHost = 'smtp.exmail.qq.com'
smtpPort = '25'
sslPort = '465'
fromMail = 'mysql@***.com'
username = 'mysql@***.com'
password = '111111'
def send_mail(to_list, subject, content):
encoding = 'utf-8'
mail = MIMEText(content.encode(encoding), 'plain', encoding)
mail['Subject'] = Header(subject, encoding)
mail['From'] = fromMail
mail['To'] = to_list
mail['Date'] = formatdate()
try:
# 连接smtp服务器,明文/SSL/TLS三种方式,根据你使用的SMTP支持情况选择一种
# 普通方式,通信过程不加密
# smtp = smtplib.SMTP(smtpHost, smtpPort)
# smtp.ehlo()
# smtp.login(username, password)
# tls加密方式,通信过程加密,邮件数据安全,使用正常的smtp端口
# smtp = smtplib.SMTP(smtpHost,smtpPort)
# smtp.set_debuglevel(True)
# smtp.ehlo()
# smtp.starttls()
# smtp.ehlo()
# smtp.login(username,password)
# 纯粹的ssl加密方式,通信过程加密,邮件数据安全
smtp = smtplib.SMTP_SSL(smtpHost,sslPort)
smtp.ehlo()
smtp.login(username,password)
smtp.sendmail(fromMail, to_list, mail.as_string())
smtp.close()
print("OK")
except Exception as e:
print("e")
send_mail(sys.argv[1], sys.argv[2], sys.argv[3])
配置OK,在manager上:
nohup masterha_manager--conf=/root/mha/conf/app1/conf > /tmp/mha.log 2>&1 &