Automatic MySQL failover with MHA (VIP mode, without keepalived)

I: Introduction to MHA

What MHA is and what it offers

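1. Automatic master monitoring and failover

MHA monitors the master of a replication topology and fails over automatically as soon as it detects that the master is down. Even if some slaves have not received the latest relay log events, MHA identifies the differential relay log events on the most up-to-date slave and applies them to the other slaves, so all slaves end up consistent. Failover is fast: MHA needs 9-12 seconds to detect the master failure, 7-10 seconds to shut down the failed master to avoid split-brain, and a few seconds to apply the differential relay logs to the new master, so the whole process usually completes within 10-30 seconds. You can also set priorities to designate one slave as the candidate master. Because MHA repairs consistency among the slaves, any slave can be promoted to master without consistency problems that would break replication.

MHA Manager probes the master of the cluster periodically. When the master fails, it promotes the slave with the most recent data to be the new master and repoints all other slaves to it. The whole failover process is designed to be transparent to applications.

During automatic failover MHA tries to save the binary logs from the crashed master so that as little data as possible is lost, but this is not always feasible. For example, if the master has a hardware failure or cannot be reached over SSH, MHA cannot save the binary logs; it fails over anyway and the most recent data is lost. Semi-synchronous replication, available since MySQL 5.5, greatly reduces this risk, and MHA can be combined with it. As long as one slave has received the latest binary log events, MHA can apply them to all other slaves, which keeps the data on all nodes consistent.

MHA currently targets the one-master, multiple-slave architecture. Building an MHA setup requires at least three database servers in a replication cluster, one master and two slaves: one acts as master, one as standby master, and one as a plain slave. Because at least three servers are needed, Taobao modified MHA with machine cost in mind, and Taobao's TMHA now supports one master with one slave. (Source: 《深入浅出MySQL(第二版)》)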

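2. Interactive master failover

You can use MHA for failover only, without having it monitor the master; when the master fails, you invoke MHA by hand to perform the failover.

3. Non-interactive master failover

The master is not monitored by MHA, but failover is carried out automatically. This mode suits setups where other software already watches the master, for example heartbeat detecting master failure and taking over the virtual IP address; MHA is then used only to perform the failover and promote a slave to master.

4. Online master switchover

In many situations you need to migrate the current master to another server, for example when the master has a hardware fault, the RAID controller needs rebuilding, or you want to move the master to a more powerful machine. Maintenance on the master degrades performance and causes downtime during which, at the very least, no data can be written. In addition, blocking or killing the currently running sessions can lead to data inconsistency between the old and the new master. MHA provides a fast switchover with graceful write blocking: the switch takes only 0.5-2 seconds, during which writes are blocked. In many cases a 0.5-2 second write block is acceptable, so switching masters does not require a scheduled maintenance window (no more staying up through the night to switch the master).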

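5. MHA consists of two parts: MHA Manager (management node) and MHA Node (data node)

Manager module: can manage multiple master-slave replication groups

masterha_manager: the command that provides automatic failure detection and failover

Node module: deployed on every MySQL server

The management node may run on the same machine as a data node or on a separate machine.

6. MHA is flexible: you can plug in your own scripts for failover, master switchover and so on.

7. A master crash does not leave the slaves inconsistent with each other.

8. No major changes to the existing MySQL environment are required.

9. No large number of extra servers is needed (a single manager can manage hundreds of replication groups).

10. Any storage engine supported by replication is supported by MHA; it is not limited to InnoDB.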


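Drawbacks:

1.    MHA tries to save the binary logs from the crashed master, but this can fail. For example, if the master has a hardware failure or cannot be reached over SSH, MHA cannot save the binary logs; it fails over anyway and the most recent data is lost.

2.    After the master has failed and MHA has switched to another server, the old master cannot simply rejoin the MHA setup once it is repaired; it has to be redeployed. In addition, the monitoring process on the management node exits after every failover and has to be restarted by a script.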

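II: Test environment

Set up one master and two slaves for MySQL, with these requirements:

Put the slaves in read-only mode.

Create the replication user on every database server.

Enable semi-synchronous replication.

Enable GTID.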
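
The requirements above are only listed, not shown. As a rough sketch, the following prints a [mysqld] fragment that would cover them on a MySQL 5.6 slave. The function name render_slave_cnf is made up for illustration, the log_bin base name and the plugin file names (semisync_master.so, semisync_slave.so, as shipped with stock Linux builds) are assumptions, and the replication user still has to be created separately with SQL.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print a [mysqld] fragment for one slave.
# Usage: render_slave_cnf SERVER_ID
render_slave_cnf() {
    local server_id=$1
    cat <<EOF
[mysqld]
server_id                = ${server_id}
# GTID on MySQL 5.6 needs binary logging and log_slave_updates
log_bin                  = mysql-bin
log_slave_updates        = 1
gtid_mode                = ON
enforce_gtid_consistency = ON
# slaves reject writes from ordinary (non-SUPER) accounts
read_only                = 1
# load both semi-sync plugins so the node can act as master or slave
plugin_load              = "rpl_semi_sync_master=semisync_master.so;rpl_semi_sync_slave=semisync_slave.so"
rpl_semi_sync_master_enabled = 1
rpl_semi_sync_slave_enabled  = 1
EOF
}
```

Calling render_slave_cnf 52 prints the fragment for the slave with server id 52; review it before merging it into /etc/my.cnf, and leave read_only out on the master.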

  

VIP: 192.168.6.242, initially bound to the master.

III: Steps

1: Edit /etc/hosts

 On all three machines, add the host name of every server, for example:

192.168.6.51  master    # master

192.168.6.52  slave1    # slave (candidate master)

192.168.6.70  slave2    # slave
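
Since the same entries go onto three machines, it can help to add them idempotently. This is a minimal sketch, not part of the original procedure; the function names host_is_mapped and add_host_entry are made up for illustration, and the file path is passed in so the helper can be tried on a scratch file before touching /etc/hosts.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if NAME already appears as a host name or
# alias on a non-comment line of FILE.
host_is_mapped() {
    awk -v n="$2" '{ sub(/#.*/, "")                       # drop trailing comments
                     for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
                   END { exit !found }' "$1" 2>/dev/null
}

# Hypothetical helper: append "IP NAME" to FILE unless NAME is already mapped.
# Usage: add_host_entry FILE IP NAME
add_host_entry() {
    local file=$1 ip=$2 name=$3
    if ! host_is_mapped "$file" "$name"; then
        printf '%s %s\n' "$ip" "$name" >> "$file"
    fi
}
```

For example, add_host_entry /etc/hosts 192.168.6.52 slave1 can be run repeatedly on each machine without producing duplicate lines.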

2: Set up SSH trust between the hosts

 # On 192.168.6.51, generate a key pair, then copy the public key to the local machine, 192.168.6.52 and 192.168.6.70.

 # ssh-keygen

 # ssh-copy-id  root@192.168.6.51

 # ssh-copy-id  root@192.168.6.52  

# ssh-copy-id  root@192.168.6.70 

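Repeat on 192.168.6.52 and 192.168.6.70: generate a key pair there as well, then copy the public key to the local machine and to the other two machines.

When this is done, run ssh <ip> against each host to confirm that login works without a password.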

3: Install MHA

Download the packages from:

http://pan.baidu.com/s/1pJ0VkSz

or:

http://download.csdn.net/download/yabignshi/8974251

http://download.csdn.net/detail/yabignshi/8974265

Install on all data nodes:

yum install perl-DBD-MySQL -y

rpm -ivh mha4mysql-node-0.56-0.el6.noarch.rpm

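 After installation the following scripts are placed in /usr/bin (they are normally invoked by the MHA Manager scripts, so you do not need to run them by hand):

save_binary_logs               // saves and copies the master's binary logs
apply_diff_relay_logs          // identifies differential relay log events and applies them to the other slaves
filter_mysqlbinlog             // strips unnecessary ROLLBACK events (no longer used by MHA)
purge_relay_logs               // purges relay logs (without blocking the SQL thread)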

 

Install on the management node:

yum install perl-DBD-MySQL -y (skipped here, because the management node and a data node both run on 6.51)

yum install perl-Config-Tiny -y

yum install epel-release -y

yum install perl-Log-Dispatch -y

yum install perl-Parallel-ForkManager -y

rpm -ivh mha4mysql-node-0.56-0.el6.noarch.rpm

rpm -ivh mha4mysql-manager-0.56-0.el6.noarch.rpm 

After installation the following scripts are placed in /usr/bin:

-rwxr-xr-x. 1 root      root        1995 Apr  1  2014 masterha_check_repl
-rwxr-xr-x. 1 root      root        1779 Apr  1  2014 masterha_check_ssh
-rwxr-xr-x. 1 root      root        1865 Apr  1  2014 masterha_check_status
-rwxr-xr-x. 1 root      root        3201 Apr  1  2014 masterha_conf_host
-rwxr-xr-x. 1 root      root        2517 Apr  1  2014 masterha_manager
-rwxr-xr-x. 1 root      root        2165 Apr  1  2014 masterha_master_monitor
-rwxr-xr-x. 1 root      root        2373 Apr  1  2014 masterha_master_switch
-rwxr-xr-x. 1 root      root        5171 Apr  1  2014 masterha_secondary_check
-rwxr-xr-x. 1 root      root        1739 Apr  1  2014 masterha_stop

-rwxr-xr-x. 1 root      root        4807 Apr  1  2014 filter_mysqlbinlog

-rwxr-xr-x. 1 root      root        7525 Apr  1  2014 save_binary_logs

-rwxr-xr-x. 1 root      root        8261 Apr  1  2014 purge_relay_logs

-rwxr-xr-x. 1 root      root       16367 Apr  1  2014 apply_diff_relay_logs

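4: Slave configuration

Set relay_log_purge=0 on the slaves. MHA needs the relay logs to recover lagging slaves, so they must not be purged automatically; without this setting MHA prints the warning "relay_log_purge=0 is not set on slave".

4.1 Set it online

set global relay_log_purge = 0;

4.2 Edit the configuration file

vi /etc/my.cnf

Add:

relay_log_purge = 0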
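
With relay_log_purge=0 nothing removes old relay logs any more, so they accumulate on the slaves. The purge_relay_logs tool installed with MHA Node is intended for this and is usually scheduled from cron. The sketch below only prints a crontab line; the function name purge_cron_line, the log path and the working directory are assumptions, and the MySQL account is passed in because purge_relay_logs connects to the local server, so it must be an account that is allowed to log in locally (the 'mha'@'192.168.%' account created later may not qualify).

```shell
#!/usr/bin/env bash
# Hypothetical helper: print a daily crontab entry for purge_relay_logs.
# Usage: purge_cron_line MINUTE HOUR MYSQL_USER MYSQL_PASSWORD
purge_cron_line() {
    local minute=$1 hour=$2 user=$3 pass=$4
    # --disable_relay_log_purge keeps relay_log_purge at 0 after the run
    printf '%s %s * * * /usr/bin/purge_relay_logs --user=%s --password=%s --disable_relay_log_purge --workdir=/var/tmp >> /var/log/purge_relay_logs.log 2>&1\n' \
        "$minute" "$hour" "$user" "$pass"
}
```

Giving each slave a different minute or hour avoids purging relay logs on all slaves at the same moment, which would leave none of them able to supply relay logs during a failover.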

5: Configure the MHA manager

5.1 Create the management account

# Run the following on the data nodes

grant all privileges on *.* TO mha@'192.168.%' IDENTIFIED BY 'test'; 

5.2: Configure /etc/mha/app1.cnf

 # Only on the management node

 mkdir /etc/mha

 mkdir -p /var/log/mha/app1

vi /etc/mha/app1.cnf

Add:

[server default]
manager_log=/var/log/mha/app1/manager.log
manager_workdir=/var/log/mha/app1.log
master_binlog_dir=/data/mysql/data
master_ip_failover_script= /etc/mha/master_ip_failover
master_ip_online_change_script=/etc/mha/master_ip_online_change
report_script=/etc/mha/send_report
secondary_check_script=masterha_secondary_check -s 192.168.6.51 -s 192.168.6.52 -s 192.168.6.70
user=mha
password=test
ping_interval=2
repl_password=beijing
repl_user=rep_user
ssh_user=root

[server1]
hostname=192.168.6.51
port=3306

[server2]
candidate_master=1
#check_repl_delay=0
hostname=192.168.6.52
port=3306

[server3]
hostname=192.168.6.70
port=3306

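By default MHA Manager checks the master over a single route only: from the Manager to the master. This is not recommended. By calling an additional script (specified with the secondary_check_script parameter), MHA can check over two or more routes, which makes it more tolerant of network problems and avoids unnecessary failovers.

Settings in [server default] apply to all three servers; they can also be placed in an individual [serverN] section to customize that server.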
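
The configuration refers to three scripts that are only created in the following steps, and a wrong path is easy to miss. As a small pre-flight sketch (the function name check_mha_scripts is made up, and it only understands plain key=value lines like the ones above), the following reports every configured script that is missing or not executable. secondary_check_script is skipped on purpose because it is resolved through PATH.

```shell
#!/usr/bin/env bash
# Hypothetical helper: list *_script settings in an MHA config file whose
# program is missing or not executable. Returns 1 if any problem is found.
# Usage: check_mha_scripts /etc/mha/app1.cnf
check_mha_scripts() {
    local cnf=$1 key value prog status=0
    while IFS='=' read -r key value || [ -n "$key" ]; do
        key=${key//[[:space:]]/}              # tolerate spaces around the key
        case $key in
            master_ip_failover_script|master_ip_online_change_script|report_script) ;;
            *) continue ;;
        esac
        read -r prog _ <<< "$value"           # first word of the value is the program
        if [ ! -x "$prog" ]; then
            echo "not executable: $key -> $prog"
            status=1
        fi
    done < "$cnf"
    return "$status"
}
```

Running check_mha_scripts /etc/mha/app1.cnf after step 5.6 should print nothing and return 0.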

5.3 master_ip_failover

vi /etc/mha/master_ip_failover
#!/usr/bin/env perl
use strict;
use warnings FATAL => 'all';
use Getopt::Long;
my (
$command, $ssh_user, $orig_master_host, $orig_master_ip,
$orig_master_port, $new_master_host, $new_master_ip, $new_master_port
);
my $vip = '192.168.6.242/24';
my $key = '0';
my $ssh_start_vip = "/sbin/ifconfig enp0s3:$key $vip";
my $ssh_stop_vip = "/sbin/ifconfig enp0s3:$key down";
GetOptions(
'command=s' => \$command,
'ssh_user=s' => \$ssh_user,
'orig_master_host=s' => \$orig_master_host,
'orig_master_ip=s' => \$orig_master_ip,
'orig_master_port=i' => \$orig_master_port,
'new_master_host=s' => \$new_master_host,
'new_master_ip=s' => \$new_master_ip,
'new_master_port=i' => \$new_master_port,
);
exit &main();
sub main {
print "\n\nIN SCRIPT TEST====$ssh_stop_vip==$ssh_start_vip===\n\n";
if ( $command eq "stop" || $command eq "stopssh" ) {
my $exit_code = 1;
eval {
print "Disabling the VIP on old master: $orig_master_host \n";
&stop_vip();
$exit_code = 0;
};
if ($@) {
warn "Got Error: $@\n";
exit $exit_code;
}
exit $exit_code;
}
elsif ( $command eq "start" ) {
my $exit_code = 10;
eval {
print "Enabling the VIP - $vip on the new master - $new_master_host \n";
&start_vip();
$exit_code = 0;
};
if ($@) {
warn $@;
exit $exit_code;
}
exit $exit_code;
}
elsif ( $command eq "status" ) {
print "Checking the Status of the script.. OK \n";
exit 0;
}
else {
&usage();
exit 1;
}
}
sub start_vip() {
`ssh $ssh_user\@$new_master_host \" $ssh_start_vip \"`;
}
sub stop_vip() {
return 0 unless ($ssh_user);
`ssh $ssh_user\@$orig_master_host \" $ssh_stop_vip \"`;
}
sub usage {
print
"Usage: master_ip_failover --command=start|stop|stopssh|status --orig_master_host=host --orig_master_ip=ip
--orig_master_port=port --new_master_host=host --new_master_ip=ip --new_master_port=port\n";
}

# Change the floating IP and the network interface name in the script to the VIP and interface you actually use:

my $vip = '192.168.6.242/24';

my $ssh_start_vip = "/sbin/ifconfig enp0s3:$key $vip";

my $ssh_stop_vip = "/sbin/ifconfig enp0s3:$key down";
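
On systems where ifconfig (net-tools) is not installed, the two command strings can be built with ip instead. The sketch below is an alternative under stated assumptions, not what the script above does: the function names are made up, the /sbin/ip and /sbin/arping paths may differ on your distribution, and the arping flags are those of the iputils version (-U sends unsolicited ARP so neighbours refresh their caches after the VIP moves).

```shell
#!/usr/bin/env bash
# Hypothetical helpers that print command strings suitable as values for
# $ssh_start_vip and $ssh_stop_vip.

# Usage: vip_start_cmd DEV KEY CIDR   e.g. vip_start_cmd enp0s3 0 192.168.6.242/24
vip_start_cmd() {
    local dev=$1 key=$2 cidr=$3
    # add the VIP as a labelled secondary address, then announce it
    echo "/sbin/ip addr add ${cidr} dev ${dev} label ${dev}:${key} && /sbin/arping -c 3 -U -I ${dev} ${cidr%/*}"
}

# Usage: vip_stop_cmd DEV CIDR
vip_stop_cmd() {
    local dev=$1 cidr=$2
    echo "/sbin/ip addr del ${cidr} dev ${dev}"
}
```

The printed strings can be pasted into the Perl variables in place of the ifconfig commands; the rest of the script stays unchanged.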

5.4 master_ip_online_change

vi /etc/mha/master_ip_online_change
Add:
#!/usr/bin/env perl
use strict;
use warnings FATAL =>'all';
use Getopt::Long;
my $vip = '192.168.6.242/24';
my $key = '0';
my $ssh_start_vip = "/sbin/ifconfig enp0s3:$key $vip";
my $ssh_stop_vip = "/sbin/ifconfig enp0s3:$key down";
my $exit_code = 0;
my (
$command, $orig_master_is_new_slave, $orig_master_host,
$orig_master_ip, $orig_master_port, $orig_master_user,
$orig_master_password, $orig_master_ssh_user, $new_master_host,
$new_master_ip, $new_master_port, $new_master_user,
$new_master_password, $new_master_ssh_user,
);
GetOptions(
'command=s' => \$command,
'orig_master_is_new_slave' => \$orig_master_is_new_slave,
'orig_master_host=s' => \$orig_master_host,
'orig_master_ip=s' => \$orig_master_ip,
'orig_master_port=i' => \$orig_master_port,
'orig_master_user=s' => \$orig_master_user,
'orig_master_password=s' => \$orig_master_password,
'orig_master_ssh_user=s' => \$orig_master_ssh_user,
'new_master_host=s' => \$new_master_host,
'new_master_ip=s' => \$new_master_ip,
'new_master_port=i' => \$new_master_port,
'new_master_user=s' => \$new_master_user,
'new_master_password=s' => \$new_master_password,
'new_master_ssh_user=s' => \$new_master_ssh_user,
);
exit &main();
sub main {
#print "\n\nIN SCRIPT TEST====$ssh_stop_vip==$ssh_start_vip===\n\n";
if ( $command eq "stop" || $command eq "stopssh" ) {
# $orig_master_host, $orig_master_ip, $orig_master_port are passed.
# If you manage master ip address at global catalog database,
# invalidate orig_master_ip here.
my $exit_code = 1;
eval {
print "\n\n\n***************************************************************\n";
print "Disabling the VIP - $vip on old master: $orig_master_host\n";
print "***************************************************************\n\n\n\n";
&stop_vip();
$exit_code = 0;
};
if ($@) {
warn "Got Error: $@\n";
exit $exit_code;
}
exit $exit_code;
}
elsif ( $command eq "start" ) {
# all arguments are passed.
# If you manage master ip address at global catalog database,
# activate new_master_ip here.
# You can also grant write access (create user, set read_only=0, etc) here.
my $exit_code = 10;
eval {
print "\n\n\n***************************************************************\n";
print "Enabling the VIP - $vip on new master: $new_master_host \n";
print "***************************************************************\n\n\n\n";
&start_vip();
$exit_code = 0;
};
if ($@) {
warn $@;
exit $exit_code;
}
exit $exit_code;
}
elsif ( $command eq "status" ) {
print "Checking the Status of the script.. OK \n";
`ssh $orig_master_ssh_user\@$orig_master_host \" $ssh_start_vip \"`;
exit 0;
}
else {
&usage();
exit 1;
}}
# A simple system call that enable the VIP on the new master
sub start_vip() {
`ssh $new_master_ssh_user\@$new_master_host \" $ssh_start_vip \"`;
}
# A simple system call that disable the VIP on the old_master
sub stop_vip() {
`ssh $orig_master_ssh_user\@$orig_master_host \" $ssh_stop_vip \"`;
}
sub usage {
print
"Usage: master_ip_online_change --command=start|stop|stopssh|status --orig_master_host=host --orig_master_ip=ip
--orig_master_port=port --new_master_host=host --new_master_ip=ip --new_master_port=port\n";
}

The VIP and the interface name have to be adjusted in this script as well.

5.5 send_report

vi /etc/mha/send_report and add:

#!/usr/bin/perl
 
# Copyright (C) 2011 DeNA Co.,Ltd.
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, write to the Free Software
# Foundation, Inc.,
# 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301  USA
 
## Note: This is a sample script and is not complete. Modify the script based on your environment.
 
use strict;
use warnings FATAL => 'all';
use Mail::Sender;
use Getopt::Long;
 
# new_master_host and new_slave_hosts are set only when recovering master succeeded
my ( $dead_master_host, $new_master_host, $new_slave_hosts, $subject, $body );
my $smtp='smtp.163.com';
my $mail_from='xxxxxxx@163.com';
my $mail_user='xxxxxxx@163.com';
my $mail_pass='Password';
my $mail_to=['949538827@qq.com','15521xxxx@139.com'];
GetOptions(
 'orig_master_host=s' => \$dead_master_host,
 'new_master_host=s'  =>\$new_master_host,
 'new_slave_hosts=s'  =>\$new_slave_hosts,
 'subject=s'          =>\$subject,
 'body=s'             => \$body,
);
 
mailToContacts($smtp,$mail_from,$mail_user,$mail_pass,$mail_to,$subject,$body);
 
sub mailToContacts {
   my ( $smtp, $mail_from, $user, $passwd, $mail_to, $subject, $msg ) = @_;
   open my $DEBUG, "> /tmp/monitormail.log"
       or die "Can't open the debug file: $!\n";
   my $sender = new Mail::Sender {
       ctype       => 'text/plain;charset=utf-8',
       encoding    => 'utf-8',
       smtp        => $smtp,
       from        => $mail_from,
       auth        => 'LOGIN',
       TLS_allowed => '0',
       authid      => $user,
       authpwd     => $passwd,
       to          => $mail_to,
       subject     => $subject,
       debug       => $DEBUG
   };
 
   $sender->MailMsg(
       {   msg   => $msg,
           debug => $DEBUG
       }
    )or print $Mail::Sender::Error;
   return 1;
}
 
 
 
# Do whatever you want here
 
exit 0;

Note: change the following lines to match your environment:
my $smtp='smtp.163.com';
my $mail_from='xxxxxxx@163.com';
my $mail_user='xxxxxxx@163.com';
my $mail_pass='Password';
my $mail_to=['949538827@qq.com','15521xxxx@139.com'];

5.6 Make the scripts executable

chmod +x /etc/mha/master_ip_online_change

chmod +x /etc/mha/master_ip_failover

chmod +x /etc/mha/send_report

5.7 Bind the VIP manually on the master

ifconfig enp0s3:0 192.168.6.242 up

6: Check that the MHA manager is configured correctly

6.1 Check SSH login

 

[root@ser6-51 .ssh]#  masterha_check_ssh --conf=/etc/mha/app1.cnf  

Fri Aug  7 15:11:07 2015 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Fri Aug  7 15:11:07 2015 - [info] Reading application default configuration from /etc/mha/app1.cnf..
Fri Aug  7 15:11:07 2015 - [info] Reading server configuration from /etc/mha/app1.cnf..
Fri Aug  7 15:11:07 2015 - [info] Starting SSH connection tests..
Fri Aug  7 15:11:08 2015 - [debug] 
Fri Aug  7 15:11:07 2015 - [debug]  Connecting via SSH from root@192.168.6.51(192.168.6.51:22) to root@192.168.6.52(192.168.6.52:22)..
Fri Aug  7 15:11:07 2015 - [debug]   ok.
Fri Aug  7 15:11:07 2015 - [debug]  Connecting via SSH from root@192.168.6.51(192.168.6.51:22) to root@192.168.6.70(192.168.6.70:22)..
Fri Aug  7 15:11:08 2015 - [debug]   ok.
Fri Aug  7 15:11:08 2015 - [debug] 
Fri Aug  7 15:11:07 2015 - [debug]  Connecting via SSH from root@192.168.6.52(192.168.6.52:22) to root@192.168.6.51(192.168.6.51:22)..
Fri Aug  7 15:11:08 2015 - [debug]   ok.
Fri Aug  7 15:11:08 2015 - [debug]  Connecting via SSH from root@192.168.6.52(192.168.6.52:22) to root@192.168.6.70(192.168.6.70:22)..
Fri Aug  7 15:11:08 2015 - [debug]   ok.
Fri Aug  7 15:11:09 2015 - [debug] 
Fri Aug  7 15:11:08 2015 - [debug]  Connecting via SSH from root@192.168.6.70(192.168.6.70:22) to root@192.168.6.51(192.168.6.51:22)..
Fri Aug  7 15:11:08 2015 - [debug]   ok.
Fri Aug  7 15:11:09 2015 - [debug]  Connecting via SSH from root@192.168.6.70(192.168.6.70:22) to root@192.168.6.52(192.168.6.52:22)..
Fri Aug  7 15:11:09 2015 - [debug]   ok.
Fri Aug  7 15:11:09 2015 - [info] All SSH connection tests passed successfully.

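/*

If you see "All SSH connection tests passed successfully", SSH is configured correctly.

If it reports an error such as:

 [error][/usr/share/perl5/vendor_perl/MHA/SSHCheck.pm, ln63]

delete everything under .ssh and set up the keys again.

*/  

6.2 Check that MySQL replication is configured correctly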

[root@ser6-51 .ssh]#  masterha_check_repl --conf=/etc/mha/app1.cnf  
Fri Aug  7 15:33:11 2015 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Fri Aug  7 15:33:11 2015 - [info] Reading application default configuration from /etc/mha/app1.cnf..
Fri Aug  7 15:33:11 2015 - [info] Reading server configuration from /etc/mha/app1.cnf..
Fri Aug  7 15:33:11 2015 - [info] MHA::MasterMonitor version 0.56.
Fri Aug  7 15:33:11 2015 - [info] GTID failover mode = 0
Fri Aug  7 15:33:11 2015 - [info] Dead Servers:
Fri Aug  7 15:33:11 2015 - [info] Alive Servers:
Fri Aug  7 15:33:11 2015 - [info]   192.168.6.51(192.168.6.51:3306)
Fri Aug  7 15:33:11 2015 - [info]   192.168.6.52(192.168.6.52:3306)
Fri Aug  7 15:33:11 2015 - [info]   192.168.6.70(192.168.6.70:3306)
Fri Aug  7 15:33:11 2015 - [info] Alive Slaves:
Fri Aug  7 15:33:11 2015 - [info]   192.168.6.52(192.168.6.52:3306)  Version=5.6.20-r5436-log (oldest major version between slaves) log-bin:enabled
Fri Aug  7 15:33:11 2015 - [info]     Replicating from 192.168.6.51(192.168.6.51:3306)
Fri Aug  7 15:33:11 2015 - [info]     Primary candidate for the new Master (candidate_master is set)
Fri Aug  7 15:33:11 2015 - [info]   192.168.6.70(192.168.6.70:3306)  Version=5.6.20-r5436-log (oldest major version between slaves) log-bin:enabled
Fri Aug  7 15:33:11 2015 - [info]     Replicating from 192.168.6.51(192.168.6.51:3306)
Fri Aug  7 15:33:11 2015 - [info] Current Alive Master: 192.168.6.51(192.168.6.51:3306)
Fri Aug  7 15:33:11 2015 - [info] Checking slave configurations..
Fri Aug  7 15:33:11 2015 - [info] Checking replication filtering settings..
Fri Aug  7 15:33:11 2015 - [info]  binlog_do_db= , binlog_ignore_db= 
Fri Aug  7 15:33:11 2015 - [info]  Replication filtering check ok.
Fri Aug  7 15:33:11 2015 - [info] GTID (with auto-pos) is not supported
Fri Aug  7 15:33:11 2015 - [info] Starting SSH connection tests..
Fri Aug  7 15:33:13 2015 - [info] All SSH connection tests passed successfully.
Fri Aug  7 15:33:13 2015 - [info] Checking MHA Node version..
Fri Aug  7 15:33:14 2015 - [info]  Version check ok.
Fri Aug  7 15:33:14 2015 - [info] Checking SSH publickey authentication settings on the current master..
Fri Aug  7 15:33:14 2015 - [info] HealthCheck: SSH to 192.168.6.51 is reachable.
Fri Aug  7 15:33:14 2015 - [info] Master MHA Node version is 0.56.
Fri Aug  7 15:33:14 2015 - [info] Checking recovery script configurations on 192.168.6.51(192.168.6.51:3306)..
Fri Aug  7 15:33:14 2015 - [info]   Executing command: save_binary_logs --command=test --start_pos=4 --binlog_dir=/data/mysql/data --output_file=/var/tmp/save_binary_logs_test --manager_version=0.56 --start_file=mysql-bin.000032 
Fri Aug  7 15:33:14 2015 - [info]   Connecting to root@192.168.6.51(192.168.6.51:22).. 
  Creating /var/tmp if not exists..    ok.
  Checking output directory is accessible or not..
   ok.
  Binlog found at /data/mysql/data, up to mysql-bin.000032
Fri Aug  7 15:33:15 2015 - [info] Binlog setting check done.
Fri Aug  7 15:33:15 2015 - [info] Checking SSH publickey authentication and checking recovery script configurations on all alive slave servers..
Fri Aug  7 15:33:15 2015 - [info]   Executing command : apply_diff_relay_logs --command=test --slave_user='mha' --slave_host=192.168.6.52 --slave_ip=192.168.6.52 --slave_port=3306 --workdir=/var/tmp --target_version=5.6.20-r5436-log --manager_version=0.56 --relay_log_info=/data/mysql/data/relay-log.info  --relay_dir=/data/mysql/data/  --slave_pass=xxx
Fri Aug  7 15:33:15 2015 - [info]   Connecting to root@192.168.6.52(192.168.6.52:22).. 
  Checking slave recovery environment settings..
    Opening /data/mysql/data/relay-log.info ... ok.
    Relay log found at /data/mysql/data, up to mysql-relay-bin.000002
    Temporary relay log file is /data/mysql/data/mysql-relay-bin.000002
    Testing mysql connection and privileges..Warning: Using a password on the command line interface can be insecure.
 done.
    Testing mysqlbinlog output.. done.
    Cleaning up test file(s).. done.
Fri Aug  7 15:33:16 2015 - [info]   Executing command : apply_diff_relay_logs --command=test --slave_user='mha' --slave_host=192.168.6.70 --slave_ip=192.168.6.70 --slave_port=3306 --workdir=/var/tmp --target_version=5.6.20-r5436-log --manager_version=0.56 --relay_log_info=/data/mysql/data/relay-log.info  --relay_dir=/data/mysql/data/  --slave_pass=xxx
Fri Aug  7 15:33:16 2015 - [info]   Connecting to root@192.168.6.70(192.168.6.70:22).. 
  Checking slave recovery environment settings..
    Opening /data/mysql/data/relay-log.info ... ok.
    Relay log found at /data/mysql/data, up to mysql-relay-bin.000003
    Temporary relay log file is /data/mysql/data/mysql-relay-bin.000003
    Testing mysql connection and privileges..Warning: Using a password on the command line interface can be insecure.
 done.
    Testing mysqlbinlog output.. done.
    Cleaning up test file(s).. done.
Fri Aug  7 15:33:17 2015 - [info] Slaves settings check done.
Fri Aug  7 15:33:17 2015 - [info] 
192.168.6.51(192.168.6.51:3306) (current master)
 +--192.168.6.52(192.168.6.52:3306)
 +--192.168.6.70(192.168.6.70:3306)
 
Fri Aug  7 15:33:17 2015 - [info] Checking replication health on 192.168.6.52..
Fri Aug  7 15:33:17 2015 - [info]  ok.
Fri Aug  7 15:33:17 2015 - [info] Checking replication health on 192.168.6.70..
Fri Aug  7 15:33:17 2015 - [info]  ok.
Fri Aug  7 15:33:17 2015 - [warning] master_ip_failover_script is not defined.
Fri Aug  7 15:33:17 2015 - [warning] shutdown_script is not defined.
Fri Aug  7 15:33:17 2015 - [info] Got exit code 0 (Not master dead).
 
MySQL Replication Health is OK.

/*

If the command fails with:

……

Fri Aug  7 15:17:21 2015 - [info]   Connecting to root@192.168.6.52(192.168.6.52:22).. 
Can't exec "mysqlbinlog": No such file or directory at /usr/share/perl5/vendor_perl/MHA/BinlogManager.pm line 106.
mysqlbinlog version command failed with rc 1:0, please verify PATH, LD_LIBRARY_PATH, and client options
 at /usr/bin/apply_diff_relay_logs line 493
Fri Aug  7 15:17:21 2015 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln205] Slaves settings check failed!
Fri Aug  7 15:17:21 2015 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln413] Slave configuration failed.
Fri Aug  7 15:17:21 2015 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln424] Error happened on checking configurations.  at /usr/bin/masterha_check_repl line 48
Fri Aug  7 15:17:21 2015 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln523] Error happened on monitoring servers.
Fri Aug  7 15:17:21 2015 - [info] Got exit code 1 (Not master dead).
 
MySQL Replication Health is NOT OK!

create a symbolic link on all data nodes:

 ln -s /usr/local/mysql/bin/mysqlbinlog /usr/bin/mysqlbinlog

 Run the check again:

masterha_check_repl --conf=/etc/mha/app1.cnf  

A new error appears:

Testing mysql connection and privileges..sh: mysql: command not found
mysql command failed with rc 127:0!
 at /usr/bin/apply_diff_relay_logs line 375
main::check() called at /usr/bin/apply_diff_relay_logs line 497
eval {...} called at /usr/bin/apply_diff_relay_logs line 475
main::main() called at /usr/bin/apply_diff_relay_logs line 120
Fri Aug  7 15:28:08 2015 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln205] Slaves settings check failed!
Fri Aug  7 15:28:08 2015 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln413] Slave configuration failed.
Fri Aug  7 15:28:08 2015 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln424] Error happened on checking configurations.  at /usr/bin/masterha_check_repl line 48
Fri Aug  7 15:28:08 2015 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln523] Error happened on monitoring servers.
Fri Aug  7 15:28:08 2015 - [info] Got exit code 1 (Not master dead).

 Create another symbolic link on all data nodes:

 ln -s /usr/local/mysql/bin/mysql /usr/bin/mysql

*/  

7: Start monitoring on the management node

[root@ser6-51 .ssh]# nohup masterha_manager --conf=/etc/mha/app1.cnf --remove_dead_master_conf --ignore_last_failover < /dev/null > /var/log/mha/app1/manager.log 2>&1 & 

[1] 14694

[root@ser6-51 .ssh]# masterha_check_status --conf=/etc/mha/app1.cnf  // check the status

app1 (pid:14694) is running(0:PING_OK), master:192.168.6.51

 # masterha_stop --conf=/etc/mha/app1.cnf  // stop monitoring

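8: Verify MHA high availability

Shut down the master, then check that the floating IP has moved and that the other slave now replicates from the new master. Finally, insert two rows through the floating IP and check that they are replicated to the other slave.

Stop the MySQL instance on 6.51: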

[root@ser6-51 ~]# service mysql stop

Shutting down MySQL...                                     [  OK  ]

 Check the log on the management node:

[root@ser6-51 ~]# tail -f /var/log/mha/app1/manager.log
 
----- Failover Report -----
 
app1: MySQL Master failover 192.168.6.51(192.168.6.51:3306) to 192.168.6.52(192.168.6.52:3306) succeeded
 
Master 192.168.6.51(192.168.6.51:3306) is down!
 
Check MHA Manager logs at ser6-51:/var/log/mha/app1/manager.log for details.
 
Started automated(non-interactive) failover.
The latest slave 192.168.6.52(192.168.6.52:3306) has all relay logs for recovery.
Selected 192.168.6.52(192.168.6.52:3306) as a new master.
192.168.6.52(192.168.6.52:3306): OK: Applying all logs succeeded.
192.168.6.70(192.168.6.70:3306): This host has the latest relay log events.
Generating relay diff files from the latest slave succeeded.
192.168.6.70(192.168.6.70:3306): OK: Applying all logs succeeded. Slave started, replicating from 192.168.6.52(192.168.6.52:3306)
192.168.6.52(192.168.6.52:3306): Resetting slave info succeeded.
Master failover to 192.168.6.52(192.168.6.52:3306) completed successfully.

The master has been switched to 6.52 automatically.

Check on 6.70:

mysql> show slave status \G
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 192.168.6.52
                  Master_User: rep_user
                  Master_Port: 3306

Master_Host has changed to 192.168.6.52.

On the current master, the read_only variable has been switched off automatically, which means the former slave now accepts writes:

mysql> show variables like 'read_only';
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| read_only     | OFF   |
+---------------+-------+
1 row in set (0.01 sec)

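Running show slave status on the new master no longer returns any replication information; it is no longer a slave of the old master.

Hooray.

After restarting MySQL on 6.51, I found that it did not rejoin the cluster automatically. You have to point it at the current master yourself: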

change master to master_host='192.168.6.52',master_user='rep_user',master_password='……' ,master_port=3306,master_auto_position=1;

start slave;
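
When rejoining the old master it can be convenient to look up the statement MHA itself recorded. As far as I know, MHA 0.56 writes a line containing "CHANGE MASTER TO ..." (with the password masked) into the manager log during failover; treat that as an assumption and check your own log. The helper name last_change_master is made up for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the most recent CHANGE MASTER statement found
# in an MHA manager log, or nothing if the log contains none.
# Usage: last_change_master /var/log/mha/app1/manager.log
last_change_master() {
    grep -o 'CHANGE MASTER TO.*' "$1" | tail -n 1
}
```

The printed statement still needs the real replication password filled in before it is run on the old master.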

Once the failover is complete, the manager process exits on its own:

[root@manager ~]# masterha_check_status --conf=/etc/mha/app1.cnf

app1 is stopped(2:NOT_RUNNING).

The MHA manager process has to be started again by hand:

nohup masterha_manager --conf=/etc/mha/app1.cnf  --ignore_last_failover < /dev/null > /var/log/mha/app1/manager.log 2>&1 & 
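
Because the manager exits after every failover, the restart can be wrapped in a small watchdog. This is a minimal sketch under assumptions: the function names are made up, masterha_check_status is assumed to return a non-zero exit code when the manager is not running (the output above shows status 2:NOT_RUNNING), and the log is appended to rather than truncated so the previous failover report is kept.

```shell
#!/usr/bin/env bash
CONF=${CONF:-/etc/mha/app1.cnf}
LOG=${LOG:-/var/log/mha/app1/manager.log}

# Start the manager in the background with the same options as above.
start_manager() {
    nohup masterha_manager --conf="$CONF" --ignore_last_failover \
        < /dev/null >> "$LOG" 2>&1 &
}

# Hypothetical watchdog step: start the manager only if it is not running.
ensure_manager() {
    if masterha_check_status --conf="$CONF" > /dev/null 2>&1; then
        echo "running"
    else
        start_manager
        echo "restarted"
    fi
}
```

Calling ensure_manager from cron or a supervisor would bring the manager back after a failover; before relying on it, make sure the replication topology in app1.cnf matches reality again, otherwise the manager will fail its startup checks.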

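Testing showed that once two nodes in the MHA setup are down, failover is no longer possible; at least two nodes must be alive.

In practice, MHA picks as the new master the slave whose I/O thread has read the most recent data (a candidate master that is not the most up to date is not chosen), and it waits until that slave has finished applying its relay logs before switching over.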

 

This article draws on: http://blog.51yip.com/mysql/1722.html

http://www.cnblogs.com/wingsless/p/4033093.html

初探keepalive+mysql-ha架构 (A first look at the keepalive + mysql-ha architecture)

 

 
