Introduction
MHA (Master High Availability) is currently a relatively mature solution for MySQL high availability. Developed by Yoshinori Matsunobu at the Japanese company DeNA (he later moved to Facebook), it is an excellent piece of high-availability software for failover and master promotion in MySQL environments. During a MySQL failover, MHA can complete the database failover automatically within 0-30 seconds, and while doing so it preserves data consistency to the greatest extent possible, achieving high availability in the true sense.
The software consists of two parts: MHA Manager (the management node) and MHA Node (the data nodes). MHA Manager can be deployed on a separate machine to manage multiple master-slave clusters, or on one of the slave nodes. MHA Node runs on every MySQL server. MHA Manager periodically probes the master node of the cluster; when the master fails, it automatically promotes the slave holding the most recent data to be the new master and repoints all the other slaves at it. The whole failover process is completely transparent to the application.
During an automatic failover, MHA tries to save the binary logs from the crashed master to minimize data loss, but this is not always possible. For example, if the master's hardware has failed or it cannot be reached over SSH, MHA cannot save the binary logs and fails over anyway, losing the most recent data. Combining MHA with the semi-synchronous replication available since MySQL 5.5 greatly reduces this risk: as long as at least one slave has received the latest binary log events, MHA can apply them to all the other slaves, keeping every node consistent.
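Since the paragraph above recommends pairing MHA with semi-synchronous replication, here is a hedged my.cnf sketch of what enabling it can look like (the plugin file names are the stock MySQL semisync plugins; the timeout value is an illustrative choice, not taken from this article):

```ini
[mysqld]
# Load both semisync plugins so any server can act as master or slave after a switchover
plugin-load = "rpl_semi_sync_master=semisync_master.so;rpl_semi_sync_slave=semisync_slave.so"
rpl_semi_sync_master_enabled = 1
rpl_semi_sync_master_timeout = 1000   # ms to wait for a slave ACK before degrading to async
rpl_semi_sync_slave_enabled = 1
```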
Currently MHA mainly supports a one-master, multi-slave architecture. To build MHA, a replication cluster needs at least three database servers, one master and two slaves: one acting as master, one as standby master, and one as a pure slave. Because at least three servers are required, Taobao adapted MHA for machine-cost reasons; Taobao's TMHA now also supports a one-master, one-slave setup.
As shown in the figure:
The MHA software consists of two toolkits, the Manager tools and the Node tools.

| Manager tools | Purpose |
|---|---|
| masterha_check_ssh | checks MHA's SSH configuration |
| masterha_check_repl | checks the MySQL replication status |
| masterha_manager | starts MHA |
| masterha_check_status | checks the current MHA running state |
| masterha_master_monitor | monitors whether the master is down |
| masterha_master_switch | controls failover (automatic or manual) |
| masterha_conf_host | adds or removes configured server entries |

| Node tools (normally invoked by MHA Manager scripts; no manual operation required) | |
|---|---|
| save_binary_logs | saves and copies the master's binary logs |
| apply_diff_relay_logs | identifies differential relay log events and applies the differences to the other slaves |
| filter_mysqlbinlog | strips unnecessary ROLLBACK events (no longer used by MHA) |
| purge_relay_logs | purges relay logs (without blocking the SQL thread) |
I. Setting up MySQL MHA high availability

| OS | IP | Hostname | Role |
|---|---|---|---|
| CentOS 7.4 | 192.168.2.9 | master | master |
| CentOS 7.4 | 192.168.2.10 | slaver1 | slave 1 |
| CentOS 7.4 | 192.168.2.11 | slaver2 | slave 2 |
| CentOS 7.4 | 192.168.2.12 | monitor | MHA manager (monitor) |

1. Configure passwordless SSH login on all hosts
Run the following commands on every machine:
[root@master ~]# ssh-keygen -t rsa
[root@master ~]# for i in 9 10 11 12;do ssh-copy-id root@192.168.2.$i;done
Run the following on all hosts:
echo '
192.168.2.9 master
192.168.2.10 slave1
192.168.2.11 slave2
192.168.2.12 manger'>> /etc/hosts
2. Install MySQL on master, slaver1, and slaver2
For installation details, see the separate article: MySQL database management system
After installing MySQL on one host, you can scp it to the others to save time:
[root@master1 ~]# systemctl stop mysqld    # stop the mysql service before copying
-------------------------------------
Copy the main MySQL directory:
[root@master1 ~]# for i in 10 11;do scp -r /usr/local/mysql/ root@192.168.2.$i:/usr/local/mysql;done
Copy the MySQL systemd unit file:
[root@master1 ~]# for i in 10 11;do scp -r /usr/lib/systemd/system/mysqld.service root@192.168.2.$i:/usr/lib/systemd/system/;done
Copy the MySQL configuration file:
[root@master1 ~]# for i in 10 11;do scp -r /etc/my.cnf root@192.168.2.$i:/etc/;done
Copy the MySQL init script:
[root@master1 ~]# for i in 10 11;do scp -r /etc/init.d/mysql.server root@192.168.2.$i:/etc/init.d/;done
On slaver1 and slaver2:
echo '
yum -y install gcc gcc-c++ ncurses bison libgcrypt perl cmake ncurses-devel
groupadd mysql
useradd -r -g mysql mysql
chown -R mysql:mysql /usr/local/mysql
chkconfig --add mysql.server
rm -rf /usr/local/mysql/data/auto.cnf    # remove the copied server UUID so each node generates its own
systemctl start mysql
netstat -anptul | grep mysql
rm -rf jb.sh' > jb.sh
bash jb.sh
Press Enter when prompted.
echo 'export PATH=$PATH:/usr/local/mysql/bin/' >> /etc/profile
source /etc/profile
3. Configure master-slave replication
Configure on the master:
echo '
server-id=1
log-bin=mysql-bin
log-slave-updates
sync_binlog=1
auto_increment_increment=2
auto_increment_offset=1
relay-log=relay1-log-bin
relay-log-index=slave-relay1-bin.index
symbolic-links=0
' >> /etc/my.cnf
[root@C7--09 ~]# systemctl restart mysqld
[root@C7--09 ~]# mysql -uroot -p123.com
.......
mysql> show master status;
+------------------+----------+--------------+------------------+-------------------+
| File | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set |
+------------------+----------+--------------+------------------+-------------------+
| mysql-bin.000001 | 154 | | | |
+------------------+----------+--------------+------------------+-------------------+
1 row in set (0.00 sec)
mysql> grant replication slave on *.* to 'master'@'192.168.2.%' identified by '123.com';
Query OK, 0 rows affected, 1 warning (0.00 sec)
mysql> flush privileges;
Query OK, 0 rows affected (0.00 sec)
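The File/Position pair from `show master status` above is exactly what the slaves' CHANGE MASTER TO statement needs. A small helper sketch (not part of MHA; the heredoc below just replays the transcript's output, but on a live master you would pipe in `mysql -uroot -p123.com -e 'show master status'` instead):

```shell
# Parse File and Position out of the tab-separated output that `mysql -e`
# produces (row 2), then print the matching CHANGE MASTER TO statement.
out=$(cat <<'EOF'
File	Position	Binlog_Do_DB	Binlog_Ignore_DB	Executed_Gtid_Set
mysql-bin.000001	154			
EOF
)
MASTER_LOG_FILE=$(printf '%s\n' "$out" | awk 'NR==2 {print $1}')
MASTER_LOG_POS=$(printf '%s\n' "$out" | awk 'NR==2 {print $2}')
echo "change master to master_host='192.168.2.9',master_user='master',master_password='123.com',master_log_file='${MASTER_LOG_FILE}',master_log_pos=${MASTER_LOG_POS};"
```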
Configure on slave1 (slave2 is the same; remember to change the server-id):
echo '
server-id=2
log-bin=mysql-bin
log-slave-updates
sync_binlog=1
auto_increment_increment=2
auto_increment_offset=1
relay-log=relay1-log-bin
relay-log-index=slave-relay1-bin.index
symbolic-links=0
' >> /etc/my.cnf
[root@slaver1 ~]# systemctl restart mysql
[root@slaver1 ~]# mysql -uroot -p123.com
.....
mysql> change master to master_host='192.168.2.9',master_user='master',master_password='123.com',master_log_file='mysql-bin.000001',master_log_pos=154;
Query OK, 0 rows affected, 2 warnings (0.00 sec)
mysql> start slave;
Query OK, 0 rows affected (0.00 sec)
mysql> show slave status\G
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: 192.168.2.9
Master_User: master
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: mysql-bin.000001
Read_Master_Log_Pos: 611
Relay_Log_File: relay1-log-bin.000002
Relay_Log_Pos: 777
Relay_Master_Log_File: mysql-bin.000001
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
.............
....
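The two lines that matter in the `show slave status\G` output above are the thread states; a scripted health check can assert both are "Yes". A sketch (the variable holds sample output copied from the transcript; on a real slave, capture `mysql -e 'show slave status\G'` instead):

```shell
# Both Slave_IO_Running and Slave_SQL_Running must be "Yes" for healthy
# replication; anything else (No/Connecting) needs fixing before MHA is
# layered on top.
status='Slave_IO_Running: Yes
Slave_SQL_Running: Yes'
states=$(printf '%s\n' "$status" | awk -F': ' '/_Running:/ {print $2}' | sort -u)
if [ "$states" = "Yes" ]; then
  result="replication healthy"
else
  result="replication broken"
fi
echo "$result"
```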
Verification:
On the master:
mysql> show databases;
+--------------------+
| Database |
+--------------------+
| information_schema |
| mysql |
| performance_schema |
| sys |
+--------------------+
4 rows in set (0.00 sec)
mysql> create database user;
Query OK, 1 row affected (0.00 sec)
mysql> show databases;
+--------------------+
| Database |
+--------------------+
| information_schema |
| mysql |
| performance_schema |
| sys |
| user |
+--------------------+
5 rows in set (0.00 sec)
——————————————————————————————————————————
slaver1
mysql> show databases;
+--------------------+
| Database |
+--------------------+
| information_schema |
| mysql |
| performance_schema |
| sys |
| user |
+--------------------+
5 rows in set (0.00 sec)
————————————————————————————————————————
slaver2
mysql> show databases;
+--------------------+
| Database |
+--------------------+
| information_schema |
| mysql |
| performance_schema |
| sys |
| user |
+--------------------+
5 rows in set (0.00 sec)
Install the MySQL client on the monitor host:
[root@monitor ~]# yum install -y mariadb-server mariadb
On the master, grant a user for monitoring:
mysql> grant all on *.* to 'root'@'192.168.2.%' identified by '123.com';
Query OK, 0 rows affected, 1 warning (0.00 sec)
mysql> flush privileges;
Query OK, 0 rows affected (0.00 sec)
4. Install MHA
4.1. Install the Perl dependencies on master, slaver1, and slaver2
yum install perl-DBD-MySQL perl-ExtUtils-CBuilder perl-ExtUtils-MakeMaker perl-CPAN
4.2. Install the Perl modules MHA Manager depends on, on the monitor host
Upload the perl-Log-Dispatch directory, which contains the packages.
[root@monitor~]# vim /etc/yum.repos.d/centOS7.repo
............
......
..
[MHA]
baseurl=file:///root/perl-Log-Dispatch
enabled=1
gpgcheck=0
Configure the local yum repository, then save and exit.
[root@manger ~]# yum install perl-DBD-MySQL perl-Config-Tiny perl-Log-Dispatch perl-Parallel-ForkManager perl-Time-HiRes -y
............
.....
[root@manger ~]# yum install perl-DBD-MySQL perl-ExtUtils-CBuilder perl-ExtUtils-MakeMaker perl-CPAN
.....
..
4.3. Install the mha4mysql-node-0.56.tar.gz package on all servers
[root@master ~]# tar xf mha4mysql-node-0.56.tar.gz
[root@master ~]# cd mha4mysql-node-0.56
[root@master mha4mysql-node-0.56]# perl Makefile.PL
...............
........
..
*** Module::AutoInstall configuration finished.
Checking if your kit is complete...
Looks good
Writing Makefile for mha4mysql::node
[root@master mha4mysql-node-0.56]# make && make install
...........
.....
4.4. Install the mha4mysql-manager-0.56.tar.gz package on the monitor host
[root@manger ~]# tar xf mha4mysql-manager-0.56.tar.gz
[root@manger ~]# cd mha4mysql-manager-0.56
[root@manger mha4mysql-manager-0.56]# perl Makefile.PL
............
......
..
Checking if your kit is complete...
Looks good
Writing Makefile for mha4mysql::manager
[root@manger mha4mysql-manager-0.56]# make && make install
.........
...
..
After installation, the script files are generated under /usr/local/bin (they were described earlier, so they are not repeated here).
There are sample scripts under /root/mha4mysql-manager-0.56/samples/scripts/; copy them to /usr/local/bin/. These scripts are incomplete and must be modified by hand; the developers deliberately left them for users to adapt. If you enable a parameter that corresponds to one of these scripts without actually editing the script, an error will be thrown.
[root@manger bin]# ll /root/mha4mysql-manager-0.56/samples/scripts/
total 32
-rwxr-xr-x 1 4984 users 3648 Apr 1 2014 master_ip_failover
-rwxr-xr-x 1 4984 users 9870 Apr 1 2014 master_ip_online_change
-rwxr-xr-x 1 4984 users 11867 Apr 1 2014 power_manager
-rwxr-xr-x 1 4984 users 1360 Apr 1 2014 send_report
[root@manger bin]# cp /root/mha4mysql-manager-0.56/samples/scripts/* /usr/local/bin/
| Script | Purpose |
|---|---|
| master_ip_failover | manages the VIP during automatic failover; not required. If you use keepalived, you can write your own script to manage the VIP instead, for example by monitoring MySQL and stopping keepalived when MySQL fails so the VIP floats away automatically |
| master_ip_online_change | manages the VIP during an online (manual) switchover; not required, and can likewise be replaced by a simple shell script of your own |
| power_manager | powers off the host after a failure; not required |
| send_report | sends an alert after a failover; not required, can be a simple shell script of your own |
5. Configure MHA
5.1. Create MHA's working directory and the related configuration file (sample config files are included in the unpacked source directory)
[root@manger ~]# mkdir -p /etc/masterha
[root@manger ~]# cp /root/mha4mysql-manager-0.56/samples/conf/app1.cnf /etc/masterha/
5.2. Edit the app1.cnf configuration file
[root@manger ~]# vim /etc/masterha/app1.cnf
[server default]
manager_workdir=/var/log/masterha/app1
manager_log=/var/log/masterha/app1/manager.log
master_binlog_dir=/usr/local/mysql/data/
master_ip_failover_script=/usr/local/bin/master_ip_failover
master_ip_online_change_script=/usr/local/bin/master_ip_online_change
password=123.com
user=root
remote_workdir=/tmp
repl_password=123.com
repl_user=master
report_script=/usr/local/bin/send_report
secondary_check_script=/usr/local/bin/masterha_secondary_check -s 192.168.2.9 -s 192.168.2.10 -s 192.168.2.11
shutdown_script=""
ssh_user=root
[server1]
hostname=192.168.2.9
port=3306
[server2]
hostname=192.168.2.10
port=3306
candidate_master=1
check_repl_delay=0
[server3]
hostname=192.168.2.11
port=3306
no_master=1    # and delete the extra [server4] section
——————————————————Explanation
[server default]  the default section
manager_workdir=/var/log/masterha/app1    # the manager's working directory
manager_log=/var/log/masterha/app1/manager.log    # the manager's log file
master_binlog_dir=/usr/local/mysql/data/    # where the master stores its mysql-bin logs
master_ip_failover_script=/usr/local/bin/master_ip_failover    # script run during automatic failover
master_ip_online_change_script=/usr/local/bin/master_ip_online_change    # script run during a manual (online) switchover
password=123.com    # password of the MySQL monitoring user (root here)
user=root    # the monitoring user
remote_workdir=/tmp    # where the remote mysql-bin logs are saved during a switchover
repl_password=123.com    # password of the replication user
repl_user=master    # the replication user
report_script=/usr/local/bin/send_report    # alert script run after a switchover
secondary_check_script=/usr/local/bin/masterha_secondary_check -s 192.168.2.9 -s 192.168.2.10 -s 192.168.2.11    # checks master availability over multiple routes
shutdown_script=""    # script to power off the failed host (mainly to prevent split-brain; not used here)
ssh_user=root    # the SSH login user
——————————————————————————————
[server1]
hostname=192.168.2.9
port=3306
[server2]
hostname=192.168.2.10
port=3306
candidate_master=1    # marks this host as a candidate master: after a failover this instance is promoted to master even if it is not the slave with the newest events in the cluster
check_repl_delay=0    # by default, MHA will not choose a slave as the new master if it is more than 100MB of relay logs behind, because its recovery would take too long; with check_repl_delay=0, MHA ignores replication delay when choosing the new master. This is very useful together with candidate_master=1, since it guarantees the candidate becomes the new master during the switchover
5.3. Set the relay log purge mode (on every slave node)
[root@slaver1 ~]# mysql -uroot -p123.com -e "set global relay_log_purge=0"
mysql: [Warning] Using a password on the command line interface can be insecure.
[root@slaver2 ~]# mysql -uroot -p123.com -e "set global relay_log_purge=0"
mysql: [Warning] Using a password on the command line interface can be insecure.
Note: during a switchover, MHA relies on relay log information when recovering the other slaves, so automatic relay log purging must be turned OFF and relay logs purged manually instead. By default, relay logs on a slave are deleted automatically once the SQL thread has executed them, but in an MHA environment these relay logs may be needed to recover other slaves, so the automatic deletion must be disabled. Purging relay logs periodically must also take replication delay into account: on an ext3 filesystem, deleting a large file takes significant time and can cause serious replication delay. To avoid this, first create a hard link to the relay log, because on Linux deleting a large file through a hard link is fast. (The same hard-link trick is commonly used when dropping large tables in MySQL.)
The MHA Node package includes the purge_relay_logs tool. It creates hard links for the relay logs, executes SET GLOBAL relay_log_purge=1, waits a few seconds for the SQL thread to switch to a new relay log, and then executes SET GLOBAL relay_log_purge=0.
| purge_relay_logs options | |
|---|---|
| --user mysql | MySQL user name |
| --password mysql | MySQL password |
| --port | port number |
| --workdir | where the relay log hard links are created; defaults to /var/tmp. Hard links cannot be created across filesystems, so point this at a location on the same partition as the relay logs; once the script finishes successfully, the hard-linked relay log files are deleted |
| --disable_relay_log_purge | by default, if relay_log_purge=1 the script purges nothing and exits; with this option set, when relay_log_purge=1 it is changed to 0, the relay logs are purged, and the parameter is finally left OFF |
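Building on the options above, a common pattern is to run purge_relay_logs from cron on each slave, staggered so that no two slaves purge at the same moment. A hedged crontab sketch (the credentials, log path, and times are assumptions to adapt to your environment):

```
# crontab on slaver1
0 4 * * * /usr/local/bin/purge_relay_logs --user=root --password=123.com --disable_relay_log_purge --workdir=/usr/local/mysql/data >> /var/log/purge_relay_logs.log 2>&1
# crontab on slaver2 (offset by 20 minutes)
20 4 * * * /usr/local/bin/purge_relay_logs --user=root --password=123.com --disable_relay_log_purge --workdir=/usr/local/mysql/data >> /var/log/purge_relay_logs.log 2>&1
```

--workdir points at the data directory here so the hard links land on the same filesystem as the relay logs.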
5.4. Delete the original master_ip_failover and upload a new one
[root@manger ~]# cd /usr/local/bin/
[root@manger bin]# rm -rf master_ip_failover
————————————————————upload omitted
[root@manger bin]# ls
apply_diff_relay_logs masterha_check_ssh masterha_manager masterha_secondary_check master_ip_online_change save_binary_logs
filter_mysqlbinlog masterha_check_status masterha_master_monitor masterha_stop power_manager send_report
masterha_check_repl masterha_conf_host masterha_master_switch master_ip_failover purge_relay_logs
[root@manger bin]# chmod a+x master_ip_failover
[root@manger bin]# vim master_ip_failover
....
my $vip = '192.168.2.100/24';    # virtual IP
my $gateway = '192.168.2.254';   # gateway IP
my $interface = 'ens33';
....
5.5. Check the SSH configuration
[root@manger bin]# masterha_check_ssh --conf=/etc/masterha/app1.cnf
Sat Dec 25 18:58:39 2021 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Sat Dec 25 18:58:39 2021 - [info] Reading application default configuration from /etc/masterha/app1.cnf..
Sat Dec 25 18:58:39 2021 - [info] Reading server configuration from /etc/masterha/app1.cnf..
Failed to get IP address on host host4!
at /usr/local/share/perl5/MHA/Config.pm line 63.
[root@manger bin]# vim /etc/masterha/app1.cnf    # remove the leftover host4/[server4] entry that caused the error above
[root@manger bin]# masterha_check_ssh --conf=/etc/masterha/app1.cnf
Sat Dec 25 18:59:30 2021 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Sat Dec 25 18:59:30 2021 - [info] Reading application default configuration from /etc/masterha/app1.cnf..
Sat Dec 25 18:59:30 2021 - [info] Reading server configuration from /etc/masterha/app1.cnf..
Sat Dec 25 18:59:30 2021 - [info] Starting SSH connection tests..
Sat Dec 25 18:59:31 2021 - [debug]
Sat Dec 25 18:59:31 2021 - [debug] Connecting via SSH from root@192.168.2.10(192.168.2.10:22) to root@192.168.2.9(192.168.2.9:22)..
Sat Dec 25 18:59:31 2021 - [debug] ok.
Sat Dec 25 18:59:31 2021 - [debug] Connecting via SSH from root@192.168.2.10(192.168.2.10:22) to root@192.168.2.11(192.168.2.11:22)..
Sat Dec 25 18:59:31 2021 - [debug] ok.
Sat Dec 25 18:59:31 2021 - [debug]
Sat Dec 25 18:59:30 2021 - [debug] Connecting via SSH from root@192.168.2.9(192.168.2.9:22) to root@192.168.2.10(192.168.2.10:22)..
Sat Dec 25 18:59:31 2021 - [debug] ok.
Sat Dec 25 18:59:31 2021 - [debug] Connecting via SSH from root@192.168.2.9(192.168.2.9:22) to root@192.168.2.11(192.168.2.11:22)..
Sat Dec 25 18:59:31 2021 - [debug] ok.
Sat Dec 25 18:59:32 2021 - [debug]
Sat Dec 25 18:59:31 2021 - [debug] Connecting via SSH from root@192.168.2.11(192.168.2.11:22) to root@192.168.2.9(192.168.2.9:22)..
Sat Dec 25 18:59:32 2021 - [debug] ok.
Sat Dec 25 18:59:32 2021 - [debug] Connecting via SSH from root@192.168.2.11(192.168.2.11:22) to root@192.168.2.10(192.168.2.10:22)..
Sat Dec 25 18:59:32 2021 - [debug] ok.
Sat Dec 25 18:59:32 2021 - [info] All SSH connection tests passed successfully.
Seeing [info] All SSH connection tests passed successfully at the end means everything is fine.
5.6. Check the state of the whole replication environment
Create the following symlinks on master, slaver1, and slaver2:
ln -s /usr/local/mysql/bin/* /usr/local/bin/
[root@manger bin]# masterha_check_repl --conf=/etc/masterha/app1.cnf
Sat Dec 25 19:08:12 2021 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Sat Dec 25 19:08:12 2021 - [info] Reading application default configuration from /etc/masterha/app1.cnf..
Sat Dec 25 19:08:12 2021 - [info] Reading server configuration from /etc/masterha/app1.cnf..
Sat Dec 25 19:08:12 2021 - [info] MHA::MasterMonitor version 0.56.
Creating directory /var/log/masterha/app1.cnf.. done.
Sat Dec 25 19:08:13 2021 - [info] GTID failover mode = 0
Sat Dec 25 19:08:13 2021 - [info] Dead Servers:
Sat Dec 25 19:08:13 2021 - [info] Alive Servers:
Sat Dec 25 19:08:13 2021 - [info] 192.168.2.9(192.168.2.9:3306)
Sat Dec 25 19:08:13 2021 - [info] 192.168.2.10(192.168.2.10:3306)
Sat Dec 25 19:08:13 2021 - [info] 192.168.2.11(192.168.2.11:3306)
Sat Dec 25 19:08:13 2021 - [info] Alive Slaves:
.........
......
...
Checking the Status of the script.. OK
Sat Dec 25 19:08:16 2021 - [info] OK.
Sat Dec 25 19:08:16 2021 - [warning] shutdown_script is not defined.
Sat Dec 25 19:08:16 2021 - [info] Got exit code 0 (Not master dead).
MySQL Replication Health is OK.
5.7. Start MHA Manager monitoring
[root@manger ~]# nohup masterha_manager --conf=/etc/masterha/app1.cnf --remove_dead_master_conf --ignore_last_failover < /dev/null > /var/log/masterha/app1/manager.log 2>&1 &
[1] 38422
[root@manger ~]# jobs -l
[1]+ 38422 Running                 nohup masterha_manager --conf=/etc/masterha/app1.cnf --remove_dead_master_conf --ignore_last_failover < /dev/null > /var/log/masterha/app1/manager.log 2>&1 &
5.8. Check the monitoring status
[root@manger ~]# masterha_check_status --conf=/etc/masterha/app1.cnf
app1 (pid:38422) is running(0:PING_OK), master:192.168.2.9
It is now monitoring the master, whose IP address is 192.168.2.9.
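The one-line status output is easy to scrape in monitoring scripts. A sketch (the line below is copied from the transcript above; on a live manager, capture the output of `masterha_check_status --conf=/etc/masterha/app1.cnf` instead):

```shell
# Extract the current master's IP from masterha_check_status output.
line='app1 (pid:38422) is running(0:PING_OK), master:192.168.2.9'
current_master=${line##*master:}
echo "MHA reports master: $current_master"
```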
6. Failover testing
6.1. Simulate a MySQL failure
Note: after a switchover, the MHA service stops automatically.
Stop MySQL on the master:
[root@master ~]# systemctl stop mysqld
[root@master ~]#
6.2. On the manager, check the MHA service and the switchover log
[root@manger ~]# cat /var/log/masterha/app1/manager.log
........................
...........
app1: MySQL Master failover 192.168.2.9(192.168.2.9:3306) to 192.168.2.10(192.168.2.10:3306) succeeded
Master 192.168.2.9(192.168.2.9:3306) is down!
Check MHA Manager logs at manger:/var/log/masterha/app1/manager.log for details.
Started automated(non-interactive) failover.
Invalidated master IP address on 192.168.2.9(192.168.2.9:3306)
The latest slave 192.168.2.10(192.168.2.10:3306) has all relay logs for recovery.
Selected 192.168.2.10(192.168.2.10:3306) as a new master.
192.168.2.10(192.168.2.10:3306): OK: Applying all logs succeeded.
192.168.2.10(192.168.2.10:3306): OK: Activated master IP address.
192.168.2.11(192.168.2.11:3306): This host has the latest relay log events.
Generating relay diff files from the latest slave succeeded.
192.168.2.11(192.168.2.11:3306): OK: Applying all logs succeeded. Slave started, replicating from 192.168.2.10(192.168.2.10:3306)
192.168.2.10(192.168.2.10:3306): Resetting slave info succeeded.
Master failover to 192.168.2.10(192.168.2.10:3306) completed successfully
Seeing Master failover to 192.168.2.10(192.168.2.10:3306) completed successfully means the candidate master has taken over.
6.3. Check master-slave replication on slaver2
mysql> show slave status\G
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: 192.168.2.10
Master_User: master
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: mysql-bin.000003
Read_Master_Log_Pos: 154
Relay_Log_File: relay1-log-bin.000002
Relay_Log_Pos: 320
Relay_Master_Log_File: mysql-bin.000003
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
6.4. Restart MHA Manager monitoring and check the cluster's current master
[root@manger ~]# vim /etc/masterha/app1.cnf
[server default]
manager_log=/var/log/masterha/app1/manager.log
manager_workdir=/var/log/masterha/app1.cnf
master_binlog_dir=/usr/local/mysql/data/
master_ip_failover_script=/usr/local/bin/master_ip_failover
master_ip_online_change_script=/usr/local/bin/master_ip_online_change
password=123.com
ping_interval=1
remote_workdir=/tmp
repl_password=123.com
repl_user=master
report_script=/usr/local/bin/send_report
secondary_check_script=/usr/local/bin/masterha_secondary_check -s 192.168.2.9 -s 192.168.2.10 -s 192.168.2.11
shutdown_script=""
ssh_user=root
user=root
# notice that [server1] has been removed (done automatically by --remove_dead_master_conf)
[server2]
candidate_master=1
check_repl_delay=0
hostname=192.168.2.10
port=3306
[server3]
hostname=192.168.2.11
#no_master=1    # now commented out
port=3306
Save and exit.
[root@manger ~]# nohup masterha_manager --conf=/etc/masterha/app1.cnf --remove_dead_master_conf --ignore_last_failover < /dev/null > /var/log/masterha/app1/manager.log 2>&1 &
[1] 127904
[root@manger ~]# jobs -l
[1]+ 127904 Running                 nohup masterha_manager --conf=/etc/masterha/app1.cnf --remove_dead_master_conf --ignore_last_failover < /dev/null > /var/log/masterha/app1/manager.log 2>&1 &
[root@manger ~]# masterha_check_status --conf=/etc/masterha/app1.cnf
app1 (pid:127904) is running(0:PING_OK), master:192.168.2.10
6.5. Steps to bring the failed MySQL server back into the MHA environment
1. Reconfigure the failed server as a new slave of the current master
2. Restart MHA Manager
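Step 1 above means running CHANGE MASTER TO on the recovered old master, pointing it at the new master. A sketch that only builds and prints the statement (the log file/position values are hypothetical; take the real ones from the "All other slaves should start replication from here" line MHA writes in manager.log):

```shell
# Build (and here just print) the rejoin statement for the recovered server.
# Once verified, feed it to `mysql -uroot -p123.com` on the old master.
NEW_MASTER=192.168.2.10
LOG_FILE=mysql-bin.000003   # hypothetical; read the real value from manager.log
LOG_POS=154                 # hypothetical; read the real value from manager.log
stmt="change master to master_host='${NEW_MASTER}',master_user='master',master_password='123.com',master_log_file='${LOG_FILE}',master_log_pos=${LOG_POS}; start slave;"
echo "$stmt"
```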
Stop the running manager:
[root@manger ~]# masterha_stop --conf=/etc/masterha/app1.cnf
Stopped app1 successfully.
[1]+ Exit 1                  nohup masterha_manager --conf=/etc/masterha/app1.cnf --remove_dead_master_conf --ignore_last_failover < /dev/null > /var/log/masterha/app1/manager.log 2>&1
Or simply kill it:
[root@manger ~]# jobs -l
[1]+ 22602 Running                 nohup masterha_manager --conf=/etc/masterha/app1.cnf --remove_dead_master_conf --ignore_last_failover < /dev/null > /var/log/masterha/app1/manager.log 2>&1 &
[root@manger ~]# kill -9 22602
[root@manger ~]# kill -9 22602
-bash: kill: (22602) - No such process
[1]+ Killed                  nohup masterha_manager --conf=/etc/masterha/app1.cnf --remove_dead_master_conf --ignore_last_failover < /dev/null > /var/log/masterha/app1/manager.log 2>&1
3. Check the MHA status