MySQL AB (master-slave) replication, read/write splitting, and clustering

SecuritySensei

Article link: https://blog.csdn.net/weixin_36372146/article/details/113206526

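MySQL AB replication and read/write splitting

Chapter 4 Clustering

4.2 MHA

4.2.1 SSH mutual trust

4.2.1 Building the ABB (one master, two slaves) / AABB (two masters, multiple slaves) topology

4.2.2 Installing MHA (MHA_node on the other nodes)

4.2.4 MHA trust and environment checks

4.2.5 Startup and related log files

Chapter 5 Error collection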

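Chapter 1 MySQL AB configuration

For MySQL 5.5 and later, most of the steps are the same.

The main difference is that from 5.5 on, the options that start with master (such as master-host) are no longer supported in the configuration file.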

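Role   | IP address     | Replication user | Read/Write | MySQL version | Mycat version        | JDK version | OS version
-------+----------------+------------------+------------+---------------+----------------------+-------------+----------------
master | 192.168.13.189 | slave            | write      | mysql-8.0.13  |                      |             | CentOS 7.4.1708
master | 192.168.13.192 | copy             | write      | mysql-8.0.13  | Mycat-server-1.6.6.1 | 1.8.0_191   | CentOS 7.4.1708
slave  | 192.168.13.190 | slave            | read       | mysql-8.0.13  |                      |             | CentOS 7.4.1708
slave  | 192.168.13.191 | slave            | read       | mysql-8.0.13  |                      |             | CentOS 7.4.1708
slave  | 192.168.13.193 | copy             | read       | mysql-8.0.13  |                      |             | CentOS 7.4.1708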

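1.1 Master configuration

Master:

Add the following to /etc/my.cnf:

server-id = 1

log-bin=/home/mysql/logs/binlog/bin-log (if this raises an error, drop the path, as below)

log-bin=bin-log

max_binlog_size = 500M

binlog_cache_size = 128K

binlog-do-db = adb

binlog-ignore-db = mysql

log-slave-updates

expire_logs_days=2 (the option name is expire_logs_days; the misspelling expire_logs_day is rejected by MySQL 8.0 as an unknown variable, which is the likely cause of the startup error seen on CentOS 7.3. If it still fails, leave the option out.)

binlog_format="MIXED"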

------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

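server-id = 1 # Server ID. It must be unique within a replication group. Do not set it more than once in the config file; when an option is repeated, MySQL normally uses the last occurrence.

log-bin=/home/mysql/logs/binlog/bin-log # Enables the binary log and sets its directory and file-name prefix.

max_binlog_size = 500M # Maximum size of each binlog file; when a file reaches 500M a new one is started. A transaction is never split across two files, so a file can occasionally exceed this size.

binlog_cache_size = 128K # Size of the binlog cache.

binlog-do-db = adb # Database to replicate; for several databases, add one such line per database.

binlog-ignore-db = mysql # Database to exclude from replication; for several, add one such line per database.

log-slave-updates # Makes a slave write the changes it receives from its master into its own binlog. With log-bin alone, a slave logs only the changes made directly on it.

expire_logs_days=2 # Number of days to keep binlog files; not supported before MySQL 5.0.

binlog_format="MIXED" # Sets the binlog format to MIXED, which according to the author prevents duplicate primary keys.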

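Then restart MySQL: /etc/init.d/mysqld restart

Log in to MySQL: mysql -u root -p123456

Create the replication user:

Create a user for the slaves to connect with (in MySQL 8.0 the old GRANT ... IDENTIFIED BY form no longer works, so the account is created first and privileges are granted separately).

mysql> CREATE USER 'copy'@'%' IDENTIFIED WITH mysql_native_password BY 'tqw961110'; (a weak password may be rejected by the password policy; a stronger one such as XINyang3009@@ is accepted), or use the following:

mysql> CREATE USER 'slave'@'%' IDENTIFIED BY 'XINyang3009@@';

mysql> GRANT ALL ON *.* TO 'copy'@'%'; # MySQL 8.0 form; REPLICATION SLAVE alone is enough for replication. On MySQL 5.x the one-step grant all on *.* to 'copy'@'%' identified by "10jqka@123"; also works.

mysql> flush privileges; # reload the grant tables

mysql> show master status;

That completes the master setup.

[server_id must be different on every server, otherwise replication fails.]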
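The uniqueness rule for server_id can be checked before any server is restarted. The following is a minimal Python sketch, assuming each host's my.cnf text has already been collected into a dictionary; the function names `read_server_id` and `find_duplicate_ids` are hypothetical helpers, not part of MySQL.

```python
import configparser


def read_server_id(cnf_text):
    """Return the server id found in my.cnf-style text, or None if it is absent."""
    parser = configparser.ConfigParser(
        allow_no_value=True,             # bare flags such as log-slave-updates
        strict=False,                    # tolerate repeated options
        interpolation=None,              # treat % literally
        inline_comment_prefixes=("#",),  # strip trailing "# ..." comments
    )
    parser.read_string(cnf_text)
    for section in parser.sections():
        # MySQL accepts both spellings of the option name.
        for key in ("server-id", "server_id"):
            if parser.has_option(section, key):
                return int(parser.get(section, key))
    return None


def find_duplicate_ids(configs):
    """configs maps host -> my.cnf text; return {server_id: [hosts]} for ids used more than once."""
    seen = {}
    for host, text in configs.items():
        seen.setdefault(read_server_id(text), []).append(host)
    return {sid: hosts for sid, hosts in seen.items()
            if sid is not None and len(hosts) > 1}
```

An empty result means every host that declares a server id uses a different one.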

mysql> show master status;    [on 192.168.13.192]
+------------------+----------+--------------+------------------+-------------------+
| File             | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set |
+------------------+----------+--------------+------------------+-------------------+
+------------------+----------+--------------+------------------+-------------------+
1 row in set (0.00 sec)

mysql> show master status;    [on 192.168.13.189]
+------------------+----------+--------------+------------------+-------------------+
| File             | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set |
+------------------+----------+--------------+------------------+-------------------+
| mysql-bin.000006 |     3490 |              |                  |                   |
+------------------+----------+--------------+------------------+-------------------+
1 row in set (0.00 sec)

1.2 Slave configuration steps

Add the following to /etc/my.cnf:

server_id=2

replicate-do-db=database1    # name of the database to replicate

replicate-ignore-db=mysql    # database to ignore

Then restart MySQL: /etc/init.d/mysqld restart

Log in to MySQL: mysql -u root -p123456

Run the following to point the slave at the master:

mysql> change master to
    -> master_host='2.2.2.2',
    -> master_user='backup',
    -> master_password='123456',
    -> master_log_file='bin_log.000003',    # binlog file name reported by the master
    -> master_log_pos=120;                  # binlog position reported by the master
Query OK, 0 rows affected, 2 warnings (0.06 sec)

Start replication: start slave;

Stop replication: stop slave;

Finally, run start slave to begin replication.
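The File and Position values from show master status feed directly into change master to. As a rough illustration, the Python sketch below assembles that statement from those values; `build_change_master` is a hypothetical helper written for this article, and the password in the usage example is a placeholder.

```python
def build_change_master(host, user, password, log_file, log_pos, port=3306):
    """Assemble a CHANGE MASTER TO statement from the values shown by SHOW MASTER STATUS."""

    def quote(value):
        # Escape backslashes and single quotes so the value is a valid SQL string literal.
        return "'" + str(value).replace("\\", "\\\\").replace("'", "\\'") + "'"

    return (
        "CHANGE MASTER TO\n"
        f"  MASTER_HOST={quote(host)},\n"
        f"  MASTER_PORT={int(port)},\n"
        f"  MASTER_USER={quote(user)},\n"
        f"  MASTER_PASSWORD={quote(password)},\n"
        f"  MASTER_LOG_FILE={quote(log_file)},\n"
        f"  MASTER_LOG_POS={int(log_pos)};"
    )


# Values taken from the master status output for 192.168.13.189.
print(build_change_master("192.168.13.189", "slave", "placeholder-password",
                          "mysql-bin.000006", 3490))
```

Generating the statement this way avoids copying the file name and position by hand, which is where mismatches usually creep in.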

1.2.1 192.168.13.190

mysql> change master to

-> master_host='192.168.13.189',

-> master_user='slave',

-> master_password='XINyang3009',

-> master_log_file='mysql-bin.000006',

-> master_log_pos=3490;

Query OK, 0 rows affected, 2 warnings (0.06 sec)

mysql> start slave;

mysql> show slave status\G

*************************** 1. row ***************************

Slave_IO_State: Waiting for master to send event

Master_Host: 192.168.13.189

Master_User: slave

Master_Port: 3306

Connect_Retry: 60

Master_Log_File: mysql-bin.000006

Read_Master_Log_Pos: 3490

Relay_Log_File: ybb-test-mysql-2-relay-bin.000012

Relay_Log_Pos: 3704

Relay_Master_Log_File: mysql-bin.000006

Slave_IO_Running: Yes (must be Yes)

Slave_SQL_Running: Yes (must be Yes)

Replicate_Do_DB: mysql

1.2.2 192.168.13.191

mysql> show slave status\G

*************************** 1. row ***************************

Slave_IO_State: Waiting for master to send event

Master_Host: 192.168.13.189

Master_User: slave

Master_Port: 3306

Connect_Retry: 60

Master_Log_File: mysql-bin.000006

Read_Master_Log_Pos: 3490

Relay_Log_File: ybb-test-mysql-2-relay-bin.000012

Relay_Log_Pos: 3704

Relay_Master_Log_File: mysql-bin.000006

Slave_IO_Running: Yes (must be Yes)

Slave_SQL_Running: Yes (must be Yes)

Replicate_Do_DB: mysql

1.2.3 192.168.13.192

1.2.4 192.168.13.193

1.2.5 192.168.13.189

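At this point the two-master, three-slave setup is complete.

# Slave_IO_Running: the I/O thread connects to the master, reads the master's binlog, and writes it to the local relay log.

# Slave_SQL_Running: the SQL thread reads the local relay log and executes the SQL statements in it.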
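Checking the two thread states by eye gets tedious with several slaves. The sketch below, in Python, parses the text printed by show slave status\G and reports whether both threads are running; the function names are hypothetical, and the input is assumed to be the plain "Field: value" output shown earlier.

```python
def parse_slave_status(text):
    """Turn the 'Field: value' lines of SHOW SLAVE STATUS\\G output into a dict."""
    fields = {}
    for line in text.splitlines():
        if ":" not in line:
            continue  # skips blank lines and the '*** 1. row ***' separator
        key, _, value = line.partition(":")
        fields[key.strip()] = value.strip()
    return fields


def replication_healthy(fields):
    """Replication is considered healthy only when both slave threads report Yes."""
    return (fields.get("Slave_IO_Running") == "Yes"
            and fields.get("Slave_SQL_Running") == "Yes")
```

The text could come from running mysql -e 'show slave status\G' on each slave; how the output is collected is left open here.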

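Chapter 2 Read/write splitting

The read/write splitting middleware used here is mycat, deployed on 192.168.13.192.

2.1 Installing mycat

1. Install the JDK; mycat depends on a Java runtime.

2. Unpack Mycat-server-1.6.6.1-release-20181031195535-linux.tar.gz and update the environment variables in /etc/profile.

[root@ybb-test-mysql-4 conf]# cat /etc/profile

3. Enter the configuration directory: cd ./mycat/conf

(1) server.xml configures the users that may log in to mycat.

(2) schema.xml configures the real backend databases.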

2.1.1 server.xml

[root@ybb-test-mysql-4 conf]# cat server.xml

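(The XML markup of this listing was lost; only the following values survive, shown with the line numbers of the original listing.)

line 99: XINyang3009@@
line 100: TESTDB
line 114: user
line 115: TESTDB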

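Two users that may connect are configured here:

User 1: test, password test, with access to the TESTDB schema.

User 2: user, password user, with read-only access to the TESTDB schema.

Note that TESTDB does not have to be the real database name on your server. Any name works as long as it matches the schema name used in schema.xml below.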

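2.1.2 schema.xml

[root@ybb-test-mysql-4 conf]# cat schema.xml

(Most of the XML markup of this listing was lost. Surviving fragments: the dataHost attributes writeType="0" dbType="mysql" dbDriver="native" switchType="1" slaveThreshold="100", and the heartbeat statement select user().)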

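1.

TESTDB here is the schema name exposed to clients; it must match the schema name assigned to the users in server.xml. Setting dataNode="dn1" places the whole schema on dn1, so no sharding is performed.

2.

Only the database attribute needs changing here: db1 is the real database name on your MySQL server, so replace it with your own.

3.

Only three attributes need configuring here: balance="1", writeType="0", and switchType="1".

a. balance attribute: the load-balancing type for reads. There are currently four values:

1. balance="0": read/write splitting is disabled; all reads go to the currently available writeHost.

2. balance="1": all readHosts and standby writeHosts take part in load balancing SELECT statements. Put simply, in a dual-master, dual-slave setup (M1 -> S1, M2 -> S2, with M1 and M2 as each other's standby), M2, S1, and S2 all share the SELECT load under normal conditions.

3. balance="2": all reads are distributed randomly across the writeHost and the readHosts.

4. balance="3": all reads are distributed randomly to the readHosts of the writeHost; the writeHost itself carries no read load. Note that balance="3" exists only in version 1.4 and later, not in 1.3.

b. writeType attribute: the write distribution type. There are currently three values:

1. writeType="0": all writes go to the first configured writeHost; if it fails, writes switch to the second surviving writeHost. After a restart the switched-to host stays in effect; the switch is recorded in the file dnindex.properties.

2. writeType="1": all writes are sent randomly to the configured writeHosts.

3. writeType="2": not implemented.

c. switchType attribute

-1: no automatic switching

1: default; automatic switching

2: switch based on the MySQL replication status

4.

This part configures the IP address, port, user name, and password of our two read/write servers.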
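The four balance values are easier to compare side by side. The Python sketch below models only the descriptions given above, not Mycat's source code; `read_candidates`, `pick_read_host`, and the `topology` dictionary are hypothetical names, and for balance="3" the sketch assumes that "the readHosts of the writeHost" means those attached to the active writeHost.

```python
import random


def read_candidates(balance, topology, active_write):
    """Return the hosts eligible for SELECTs.

    topology maps each writeHost name to the list of its readHost names.
    """
    standby_writes = [w for w in topology if w != active_write]
    all_reads = [r for reads in topology.values() for r in reads]
    if balance == 0:
        return [active_write]                       # no read/write splitting
    if balance == 1:
        return standby_writes + all_reads           # everything except the active writer
    if balance == 2:
        return [active_write] + standby_writes + all_reads
    if balance == 3:
        return list(topology[active_write])         # assumption: active writer's readHosts only
    raise ValueError(f"unsupported balance value: {balance}")


def pick_read_host(balance, topology, active_write, rng=random):
    """Choose one eligible host at random, as the descriptions above suggest."""
    return rng.choice(read_candidates(balance, topology, active_write))


# Dual-master, dual-slave layout from the text: M1 -> S1, M2 -> S2, M1 active.
topology = {"M1": ["S1"], "M2": ["S2"]}
print(read_candidates(1, topology, "M1"))
```

With balance="1" and M1 active, the candidates are M2, S1, and S2, which matches the example in the text.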

2.2 Starting mycat

2.2.1 Starting mycat

In the bin directory, run ./mycat restart

[root@ybb-test-mysql-4 bin]# ps -aux|grep mycat

root 30319 0.0 0.0 17816 744 ? Sl 15:51 0:00 /usr/local/mysql/mycat/bin/./wrapper-linux-x86-64 /usr/local/mysql/mycat/conf/wrapper.conf wrapper.syslog.ident=mycat wrapper.pidfile=/usr/local/mysql/mycat/logs/mycat.pid wrapper.daemonize=TRUE wrapper.lockfile=/var/lock/subsys/mycat

root 30321 3.5 6.5 6927816 253020 ? Sl 15:51 0:03 java -DMYCAT_HOME=. -server -XX:MaxPermSize=64M -XX:+AggressiveOpts -XX:MaxDirectMemorySize=2G -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.port=1984 -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false -Xmx4G -Xms1G -Djava.library.path=lib -classpath lib/wrapper.jar:conf:lib/asm-4.0.jar:lib/commons-collections-3.2.1.jar:lib/commons-lang-2.6.jar:lib/curator-client-2.11.0.jar:lib/curator-framework-2.11.0.jar:lib/curator-recipes-2.11.0.jar:lib/disruptor-3.3.4.jar:lib/dom4j-1.6.1.jar:lib/druid-1.0.26.jar:lib/ehcache-core-2.6.11.jar:lib/fastjson-1.2.12.jar:lib/guava-19.0.jar:lib/hamcrest-core-1.3.jar:lib/hamcrest-library-1.3.jar:lib/jline-0.9.94.jar:lib/joda-time-2.9.3.jar:lib/jsr305-2.0.3.jar:lib/kryo-2.10.jar:lib/leveldb-0.7.jar:lib/leveldb-api-0.7.jar:lib/libwrapper-linux-ppc-64.so:lib/libwrapper-linux-x86-32.so:lib/libwrapper-linux-x86-64.so:lib/log4j-1.2-api-2.5.jar:lib/log4j-1.2.17.jar:lib/log4j-api-2.5.jar:lib/log4j-core-2.5.jar:lib/log4j-slf4j-impl-2.5.jar:lib/mapdb-1.0.7.jar:lib/minlog-1.2.jar:lib/mongo-java-driver-2.11.4.jar:lib/Mycat-server-1.6.6.1-release.jar:lib/mysql-binlog-connector-java-0.16.1.jar:lib/mysql-connector-java-5.1.35.jar:lib/netty-3.7.0.Final.jar:lib/netty-buffer-4.1.9.Final.jar:lib/netty-common-4.1.9.Final.jar:lib/objenesis-1.2.jar:lib/reflectasm-1.03.jar:lib/sequoiadb-driver-1.12.jar:lib/slf4j-api-1.6.1.jar:lib/univocity-parsers-2.2.1.jar:lib/velocity-1.7.jar:lib/wrapper.jar:lib/zookeeper-3.4.6.jar -Dwrapper.key=aBVwhEZHoe1K2Vpw -Dwrapper.port=32000 -Dwrapper.jvm.port.min=31000 -Dwrapper.jvm.port.max=31999 -Dwrapper.pid=30319 -Dwrapper.version=3.2.3 -Dwrapper.native_library=wrapper -Dwrapper.service=TRUE -Dwrapper.cpu.timeout=10 -Dwrapper.jvmid=1 org.tanukisoftware.wrapper.WrapperSimpleApp io.mycat.MycatStartup start

root 30458 0.0 0.0 112676 956 pts/2 S+ 15:53 0:00 grep --color=auto mycat

[root@ybb-test-mysql-4 bin]#netstat -anlpt|grep :*066

tcp6 0 0 :::8066 :::* LISTEN 30321/java

tcp6 0 0 :::9066 :::* LISTEN 30321/java

[root@ybb-test-mysql-4 bin]#

8066: data port

9066: management port

Command-line management logins go through the 9066 management port.

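2.2.2 mycat logs

[root@ybb-test-mysql-4 logs]# cd /usr/local/mysql/mycat/logs

wrapper log: mycat start, stop, and service-registration events are all recorded in this file. If mycat cannot start because the system environment is misconfigured or incomplete, check wrapper.log to find the specific cause.

mycat.log: the main mycat log. It records the buffers allocated at startup, data source connections, connection pools, dynamic class loading, and so on. Settings such as the number of files kept, file size, and character set are configured in log4j.xml. The file can be deleted while mycat is stopped; it is recreated on startup.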

2.3 Problems logging in to mycat

Command: mysql -uroot -pXINyang3009@@ -h192.168.13.192 -P9066

[root@ybb-test-mysql-4 conf]# mysql -uroot -pXINyang3009@@ -h192.168.13.192 -P9066

mysql: [Warning] Using a password on the command line interface can be insecure.

ERROR 1045 (HY000): Access denied for user 'root', because password is error

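Important: every MySQL instance in this experiment runs mysql-8.0.13. This failure cost me two or three days. I re-read the configuration above countless times before learning from other sources that MyCat still mainly targets MySQL 5.5, 5.6, and 5.7 and does not yet fully support MySQL 8; users have to make a series of changes to both MySQL 8 and MyCat. One of them concerns the login method. The traditional way to log in to a Mycat logical schema is mysql -uroot -p -h127.0.0.1 -P8066 -DTESTDB, which fails with a password error under MySQL 8 because MySQL 8 changed its default authentication plugin to caching_sha2_password, which MyCat does not support. Switching MySQL 8 to a different authentication plugin lets the login through, but since MySQL 8 is not fully supported, functionality is still significantly reduced. The mycat website has no detailed documentation for MySQL 8 yet; its existing documentation still applies to MySQL 5.x.

My earlier experiment succeeded because I had also installed MySQL 5.6 in the cloud and built the master-slave pair on 5.6; at that time I tested the connection with 5.6 rather than MySQL 8. This time the connection test with MySQL 8 kept failing, and only after about two more days did it occur to me that the version might be the problem. To check whether mycat was failing because the MySQL version was too new, I repeated the whole procedure on MySQL 5.6, which again took a good while.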

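1. Versions:

OS version | IP address | Read/Write | MySQL version | Mycat version
-----------+------------+------------+---------------+---------------------
CentOS 7.2 | 2.2.2.10   | read/write | 5.6.19        | Mycat-server-1.6.6.1

To reach the goal quickly, MySQL was installed on a single VM only (about 3 hours), with reads and writes both served by that host.

2. Install MySQL 5.6 (details omitted)

3. Configure server.xml and schema.xml

4. Start mycat

In the bin directory, run ./mycat restart

5. Log in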

[root@localhost conf]# mysql -uroot -pXINyang3009@@ -h2.2.2.10 -P9066

Warning: Using a password on the command line interface can be insecure.

Welcome to the MySQL monitor. Commands end with ; or \g.

Your MySQL connection id is 2

Server version: 5.6.29-mycat-1.6.6.1-release-20181031195535 MyCat Server (monitor)

Oracle is a registered trademark of Oracle Corporation and/or its

affiliates. Other names may be trademarks of their respective owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql>

Seeing the mycat version in the banner means the login succeeded.

6. View the current read/write data sources

mysql> show @@datasource;
+----------+--------+-------+----------+------+------+--------+------+------+---------+-----------+---
+----------+--------+-------+----------+------+------+--------+------+------+---------+-----------+---
| dn1 | write1 | mysql | 2.2.2.10 | 3306 | W | 0 | 10 | 1000 | 442 | 2 | 1 |
| dn1 | write1 | mysql | 2.2.2.10 | 3306 | R | 0 | 8 | 1000 | 436 | 0 | 0 |
+----------+--------+-------+----------+------+------+--------+------+------+---------+-----------+---
2 rows in set (0.02 sec)

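7. Key points

8. Summary

As shown above, the latest mycat does not support the latest MySQL. A connection may be possible, but until support is complete, many mycat features are likely to be limited.

The second experiment shows that the configuration used on MySQL 8 was not wrong; it failed only because mycat does not support the newest MySQL version, so it should work once mycat supports MySQL 8.

2.4 Summary

Because mycat does not yet fully support the latest MySQL, for read/write splitting I currently use mysql-proxy, which comes from MySQL itself. Its feature set is small, but as a stopgap during the transition to MySQL 8 it is a decent choice.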

Chapter 3 mysql-proxy

3.1 Installing mysql-proxy

1. Install the build environment from the local yum mirror:

yum install -y gcc* gcc-c++* autoconf* automake* zlib* libxml* ncurses-devel* libmcrypt* libtool* flex* pkgconfig* libevent* glib*

2. Download mysql-proxy-0.8.5-linux-glibc2.3-x86-64bit.tar

3. Unpack and install

4. Environment variables

[root@localhost ~]# vi .bash_profile

PATH=$PATH:$HOME/bin:/mysqlprox/proxy/bin

3.2 Configuring mysql-proxy

mkdir /home/mysql-proxy/logs    # log directory

mkdir /home/mysql-proxy/lua     # script directory

cd /usr/local/mysql-proxy       # mysql-proxy install directory

cp share/doc/mysql-proxy/rw-splitting.lua ./lua    # copy the read/write splitting script

vi /etc/mysql-proxy.cnf         # create the config file

[mysql-proxy]
# user that runs mysql-proxy
user=root
# account shared by the master and slave MySQL servers
admin-username=proxy
# password of that account
admin-password=123456
# IP and port mysql-proxy listens on; without a port, the default is 4040
proxy-address=192.168.11.31:4040
# backend slave to read from
proxy-read-only-backend-addresses=192.168.11.31
# backend master to write to
proxy-backend-addresses=192.168.11.34
# location of the read/write splitting script
proxy-lua-script=/usr/local/mysql-proxy/lua/rw-splitting.lua
# admin script
admin-lua-script=/usr/local/mysql-proxy/lua/admin.lua
log-file=/var/log/mysql-proxy.log
# log level
log-level=info
# run as a daemon
daemon=true
# try to restart mysql-proxy if it crashes
keepalive=true

(1) If startup reports (critical) Key file contains key "keepalive" which has a value that cannot be interpreted., write the two boolean options as:

daemon=1

keepalive=1

<1> Set the config file's permissions to 660

chmod 660 /etc/mysql-proxy.cnf

<2> Edit the read/write splitting script

vim /usr/local/mysql-proxy/lua/rw-splitting.lua

if not proxy.global.config.rwsplit

min_idle_connections = 1,    -- default 4: splitting starts only once there are more than 4 connections; changed to 1

max_idle_connections = 1,    -- default 8; changed to 1

3.3 Starting and checking mysql-proxy

Start: /usr/local/mysql-proxy/bin/mysql-proxy --defaults-file=/etc/mysql-proxy.cnf

Check: netstat -tupln | grep 4040

Read/write splitting is now fully set up.
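To make the idea of read/write splitting concrete, here is a deliberately simplified Python sketch of the routing decision. It is not the logic of rw-splitting.lua; `choose_backend` is a hypothetical function, and it only assumes that plain SELECT statements outside a transaction may go to a read-only backend while everything else goes to the master.

```python
def choose_backend(sql, in_transaction, master, replicas, counter=0):
    """Pick a backend address for one statement.

    SELECTs outside a transaction rotate over the replicas; all other
    statements, and anything inside a transaction, go to the master.
    """
    first_word = sql.split(None, 1)[0].upper() if sql.strip() else ""
    if first_word == "SELECT" and not in_transaction and replicas:
        return replicas[counter % len(replicas)]   # simple round-robin by counter
    return master


# Backends taken from the config above.
master = "192.168.11.34"
replicas = ["192.168.11.31"]
print(choose_backend("SELECT * FROM t", False, master, replicas))
print(choose_backend("INSERT INTO t VALUES (1)", False, master, replicas))
```

A real router has to handle more cases, for example SELECT ... FOR UPDATE, which this sketch would wrongly send to a replica.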

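Chapter 4 Clustering

MySQL clusters are currently built with MHA or keepalived, and both are managed through a VIP, which carries some risk when clustering in the cloud. From discussions with H3C cloud engineers, they do not recommend building any cluster in the cloud: clusters rely on heartbeat checks, which conflict with VM backups. While a VM is being backed up the heartbeat can stop and the cluster reports errors. The workaround is to avoid VM backups and back up by other means. China Economic Net (中国经济网) runs its clusters on VMs this way, without VM backups.

Other backup methods differ from VM backups:

(1) Other backup methods: only the specified data can be backed up, not the VM's own information.

(2) VM backup: backs up both the specified data and the VM's configuration, so the VM itself can be restored after a crash.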

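4.1 Versions

Role        | IP address     | Replication user | Read/Write | MySQL version | MHA version | OS version
------------+----------------+------------------+------------+---------------+-------------+-----------
master      | 192.168.13.189 | slave            | write      | mysql-8.0.13  | 0.54        | CentOS 7.4
master      | 192.168.13.190 | slave            | read       | mysql-8.0.13  | 0.54        | CentOS 7.4
slave       | 192.168.13.192 | slave            | read       | mysql-8.0.13  | 0.58        | CentOS 7.4
MHA_manager | 192.168.13.171 |                  |            | mysql-8.0.13  | 0.58        | CentOS 7.4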

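How MHA works (A = master, B = slave):

(1) It monitors the state of the A and B servers.

(2) It runs a full election (the B whose data is closest to the master wins).

(3) It promotes one B to be the new A.

(4) It preserves data integrity (by applying the differences).

Because AB replication is asynchronous, some data may not yet have been pulled into a B's relay log, leaving A and B inconsistent. How does MHA handle this?

(1) mha_manager copies A's current binlog to itself with scp.

(2) Once the new A has been elected (based on whose relay log is most recent), mha_manager compares the old A's binlog with the new A's relay log and applies the differences so that the new A matches the old A.

(3) Finally, mha_manager uses the old A's binlog to apply the differences on the other Bs in the replication group, keeping the data consistent.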
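The election step can be illustrated with a small Python sketch. It assumes each slave is described by the Master_Log_File and Read_Master_Log_Pos fields from show slave status, and it picks the slave that has read furthest into the master's binlog. This is a simplification for illustration, not MHA's actual code: it ignores candidate_master and other settings, and the function names are hypothetical.

```python
def binlog_coordinates(slave):
    """Return (binlog sequence number, position) so slaves can be compared."""
    file_name = slave["Master_Log_File"]            # e.g. mysql-bin.000006
    sequence = int(file_name.rsplit(".", 1)[1])     # numeric suffix of the binlog file
    return (sequence, int(slave["Read_Master_Log_Pos"]))


def pick_new_master(slaves):
    """The slave that has received the most events from the old master wins."""
    return max(slaves, key=binlog_coordinates)


def lagging_slaves(slaves):
    """Hosts that are behind the winner and therefore need the differences applied."""
    newest = binlog_coordinates(pick_new_master(slaves))
    return [s["host"] for s in slaves if binlog_coordinates(s) < newest]
```

Comparing the numeric file suffix before the position matters, because a small position in a newer binlog file is still ahead of a large position in an older one.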

4.2 MHA

Steps:

1) SSH mutual trust (SSH key authentication)

2) ABB/AABB topology

3) Install mha_manager and mha_node

4) Test

4.2.1 SSH mutual trust

(1) ssh-keygen -t rsa

ssh-copy-id -i ~/.ssh/id_rsa.pub root@192.168.13.189

ssh-copy-id -i ~/.ssh/id_rsa.pub root@192.168.13.190

ssh-copy-id -i ~/.ssh/id_rsa.pub root@192.168.13.192

ssh-copy-id -i ~/.ssh/id_rsa.pub root@192.168.13.171

(2) Run step (1) on each of the four hosts 192.168.13.189, 192.168.13.190, 192.168.13.192, and 192.168.13.171, so that they can all SSH to one another without a password.

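4.2.1 Building the ABB (one master, two slaves) / AABB (two masters, multiple slaves) topology

Omitted.

4.2.1 Installing MHA (management node, MHA_manager)

(1) Install the dependency packages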

[root@ybb-test-01 mha]# ls dependent1

perl-Config-Tiny-2.14-7.el7.noarch.rpm perl-Mail-Sender-0.8.23-1.el7.noarch.rpm perl-MIME-Types-1.38-2.el7.noarch.rpm

perl-Email-Date-Format-1.002-15.el7.noarch.rpm perl-Mail-Sendmail-0.79-21.el7.noarch.rpm perl-Parallel-ForkManager-1.18-2.el7.noarch.rpm

perl-Log-Dispatch-2.41-1.el7.1.noarch.rpm perl-MIME-Lite-3.030-1.el7.noarch.rpm perl-Params-Validate-1.08-4.el7.x86_64.rpm

[root@ybb-test-01 dependent1]# yum -y localinstall ./*

Loaded plugins: fastestmirror
Examining ./perl-Config-Tiny-2.14-7.el7.noarch.rpm: perl-Config-Tiny-2.14-7.el7.noarch
./perl-Config-Tiny-2.14-7.el7.noarch.rpm: does not update installed package.
Examining ./perl-Email-Date-Format-1.002-15.el7.noarch.rpm: perl-Email-Date-Format-1.002-15.el7.noarch
./perl-Email-Date-Format-1.002-15.el7.noarch.rpm: does not update installed package.
Examining ./perl-Log-Dispatch-2.41-1.el7.1.noarch.rpm: perl-Log-Dispatch-2.41-1.el7.1.noarch
./perl-Log-Dispatch-2.41-1.el7.1.noarch.rpm: does not update installed package.
Examining ./perl-Mail-Sender-0.8.23-1.el7.noarch.rpm: perl-Mail-Sender-0.8.23-1.el7.noarch
./perl-Mail-Sender-0.8.23-1.el7.noarch.rpm: does not update installed package.
Examining ./perl-Mail-Sendmail-0.79-21.el7.noarch.rpm: perl-Mail-Sendmail-0.79-21.el7.noarch
./perl-Mail-Sendmail-0.79-21.el7.noarch.rpm: does not update installed package.
Examining ./perl-MIME-Lite-3.030-1.el7.noarch.rpm: perl-MIME-Lite-3.030-1.el7.noarch
./perl-MIME-Lite-3.030-1.el7.noarch.rpm: does not update installed package.
Examining ./perl-MIME-Types-1.38-2.el7.noarch.rpm: perl-MIME-Types-1.38-2.el7.noarch
./perl-MIME-Types-1.38-2.el7.noarch.rpm: does not update installed package.
Examining ./perl-Parallel-ForkManager-1.18-2.el7.noarch.rpm: perl-Parallel-ForkManager-1.18-2.el7.noarch
./perl-Parallel-ForkManager-1.18-2.el7.noarch.rpm: does not update installed package.
Examining ./perl-Params-Validate-1.08-4.el7.x86_64.rpm: perl-Params-Validate-1.08-4.el7.x86_64
./perl-Params-Validate-1.08-4.el7.x86_64.rpm: does not update installed package.
Nothing to do

All of these packages were already installed, which is why yum reports that it does not update the installed package.

(2) Install the MHA packages

[root@ybb-test-02-6 mha]#rpm -ivh mha4mysql-node-0.58-0.el7.centos.noarch.rpm

Preparing... ################################# [100%]

package mha4mysql-node-0.58-0.el7.centos.noarch is already installed

[root@ybb-test-02-6 mha]# rpm -ivh mha4mysql-manager-0.58-0.el7.centos.noarch.rpm

Preparing... ################################# [100%]

package mha4mysql-manager-0.58-0.el7.centos.noarch is already installed

[root@ybb-test-02-6 mha]#

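4.2.2 Installing MHA (MHA_node on the other nodes)

192.168.13.189(mha_node)

192.168.13.190(mha_node)

192.168.13.192(mha_node)

Install mha4mysql-node-0.58-0.el7.centos.noarch.rpm on each mha_node. [Note: on either EL6 or EL7, the MHA node package may be an older version.]

MHA_node installation is now complete.

4.2.3 MHA configuration

(1) mkdir /etc/mha    (create the directory)

(2) vi /etc/mha/mha.cnf    (write the MHA configuration file)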

[root@ybb-test-02-6 mha]# cat mha.cnf

[server default]

#mysql_admin and password

user=root

password=XINyang3009@@

#mha work_dir and mha_log

manager_workdir=/etc/mha

manager_log=/etc/mha/manager.log

#ssh connection account

ssh_user=root

#AB copy account and password

repl_user=slave

repl_password=XINyang3009@@

ping_interval=1

#mysql server

[server1]

hostname=192.168.13.189

ssh_port=22

master_binlog_dir=/var/lib/mysql

candidate_master=1

[server2]

hostname=192.168.13.190

ssh_port=22

master_binlog_dir=/var/lib/mysql

candidate_master=1

[server4]

hostname=192.168.13.192

ssh_port=22

master_binlog_dir=/var/lib/mysql

candidate_master=1

[root@ybb-test-02-6 mha]#
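Before running the MHA checks it can help to confirm which servers the file actually declares. The Python sketch below reads an mha.cnf-style file with the standard configparser module and lists each server section with its hostname and candidate_master flag; `load_mha_servers` is a hypothetical helper, and it assumes the file uses the plain key=value layout shown above.

```python
import configparser


def load_mha_servers(cnf_text):
    """Return one dict per [serverN] section of an mha.cnf-style file."""
    parser = configparser.ConfigParser(interpolation=None)  # keep % and @ literal
    parser.read_string(cnf_text)
    servers = []
    for section in parser.sections():
        if section == "server default":
            continue  # global settings, not a MySQL server
        servers.append({
            "section": section,
            "hostname": parser.get(section, "hostname"),
            "candidate_master": parser.get(section, "candidate_master", fallback="0") == "1",
        })
    return servers


def duplicate_hostnames(servers):
    """Hostnames declared in more than one server section."""
    names = [s["hostname"] for s in servers]
    return sorted({n for n in names if names.count(n) > 1})
```

For the file above this yields three servers (server1, server2, server4), all marked as candidate masters, and no duplicate hostnames.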

4.2.4 MHA trust and environment checks

(1) SSH trust check: masterha_check_ssh --conf=/etc/mha/mha.cnf

[root@ybb-test-02-6 mha]# masterha_check_ssh --conf=/etc/mha/mha.cnf

Thu Feb 28 09:50:57 2019 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.

Thu Feb 28 09:50:57 2019 - [info] Reading application default configuration from /etc/mha/mha.cnf..

Thu Feb 28 09:50:57 2019 - [info] Reading server configuration from /etc/mha/mha.cnf..

Thu Feb 28 09:50:57 2019 - [info] Starting SSH connection tests..

Thu Feb 28 09:50:58 2019 - [debug]

Thu Feb 28 09:50:57 2019 - [debug] Connecting via SSH from root@192.168.13.189(192.168.13.189:22) to root@192.168.13.190(192.168.13.190:22)..

Thu Feb 28 09:50:58 2019 - [debug] ok.

Thu Feb 28 09:50:58 2019 - [debug] Connecting via SSH from root@192.168.13.189(192.168.13.189:22) to root@192.168.13.192(192.168.13.192:22)..

Thu Feb 28 09:50:58 2019 - [debug] ok.

Thu Feb 28 09:50:59 2019 - [debug]

Thu Feb 28 09:50:57 2019 - [debug] Connecting via SSH from root@192.168.13.190(192.168.13.190:22) to root@192.168.13.189(192.168.13.189:22)..

Thu Feb 28 09:50:58 2019 - [debug] ok.

Thu Feb 28 09:50:58 2019 - [debug] Connecting via SSH from root@192.168.13.190(192.168.13.190:22) to root@192.168.13.192(192.168.13.192:22)..

Thu Feb 28 09:50:59 2019 - [debug] ok.

Thu Feb 28 09:51:00 2019 - [debug]

Thu Feb 28 09:50:58 2019 - [debug] Connecting via SSH from root@192.168.13.192(192.168.13.192:22) to root@192.168.13.189(192.168.13.189:22)..

Thu Feb 28 09:50:59 2019 - [debug] ok.

Thu Feb 28 09:50:59 2019 - [debug] Connecting via SSH from root@192.168.13.192(192.168.13.192:22) to root@192.168.13.190(192.168.13.190:22)..

Thu Feb 28 09:51:00 2019 - [debug] ok.

Thu Feb 28 09:51:00 2019 - [info] All SSH connection tests passed successfully.

[root@ybb-test-02-6 mha]#

(2) Replication health check: masterha_check_repl --conf=/etc/mha/mha.cnf

[root@ybb-test-02-6 mha]# masterha_check_repl --conf=/etc/mha/mha.cnf

Thu Feb 28 09:53:58 2019 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.

Thu Feb 28 09:53:58 2019 - [info] Reading application default configuration from /etc/mha/mha.cnf..

Thu Feb 28 09:53:58 2019 - [info] Reading server configuration from /etc/mha/mha.cnf..

Thu Feb 28 09:53:58 2019 - [info] MHA::MasterMonitor version 0.58.

Thu Feb 28 09:53:59 2019 - [info] Multi-master configuration is detected. Current primary(writable) master is 192.168.13.190(192.168.13.190:3306)    [note: writable]

Thu Feb 28 09:53:59 2019 - [info] Master configurations are as below:

Master 192.168.13.190(192.168.13.190:3306), replicating from 192.168.13.189(192.168.13.189:3306)

Master 192.168.13.189(192.168.13.189:3306), replicating from 192.168.13.190(192.168.13.190:3306), read-only    [note: readable only; in dual-master mode only one master may be writable]

Thu Feb 28 09:53:59 2019 - [info] GTID failover mode = 0

Thu Feb 28 09:53:59 2019 - [info] Dead Servers:

Thu Feb 28 09:53:59 2019 - [info] Alive Servers:

Thu Feb 28 09:53:59 2019 - [info] 192.168.13.189(192.168.13.189:3306)

Thu Feb 28 09:53:59 2019 - [info] 192.168.13.190(192.168.13.190:3306)

Thu Feb 28 09:53:59 2019 - [info] 192.168.13.192(192.168.13.192:3306)

Thu Feb 28 09:53:59 2019 - [info] Alive Slaves:

Thu Feb 28 09:53:59 2019 - [info] 192.168.13.189(192.168.13.189:3306) Version=8.0.13 (oldest major version between slaves) log-bin:enabled

Thu Feb 28 09:53:59 2019 - [info] Replicating from 192.168.13.190(192.168.13.190:3306)

Thu Feb 28 09:53:59 2019 - [info] Primary candidate for the new Master (candidate_master is set)

Thu Feb 28 09:53:59 2019 - [info] 192.168.13.192(192.168.13.192:3306) Version=8.0.13 (oldest major version between slaves) log-bin:enabled

Thu Feb 28 09:53:59 2019 - [info] Replicating from 192.168.13.190(192.168.13.190:3306)

Thu Feb 28 09:53:59 2019 - [info] Primary candidate for the new Master (candidate_master is set)

Thu Feb 28 09:53:59 2019 - [info] Current Alive Master: 192.168.13.190(192.168.13.190:3306)

Thu Feb 28 09:53:59 2019 - [info] Checking slave configurations..

Thu Feb 28 09:53:59 2019 - [warning] relay_log_purge=0 is not set on slave 192.168.13.189(192.168.13.189:3306).

Thu Feb 28 09:53:59 2019 - [warning] relay_log_purge=0 is not set on slave 192.168.13.192(192.168.13.192:3306).

Thu Feb 28 09:53:59 2019 - [info] Checking replication filtering settings..

Thu Feb 28 09:53:59 2019 - [info] binlog_do_db= , binlog_ignore_db=

Thu Feb 28 09:53:59 2019 - [info] Replication filtering check ok.

Thu Feb 28 09:53:59 2019 - [info] GTID (with auto-pos) is not supported

Thu Feb 28 09:53:59 2019 - [info] Starting SSH connection tests..

Thu Feb 28 09:54:02 2019 - [info] All SSH connection tests passed successfully.

Thu Feb 28 09:54:02 2019 - [info] Checking MHA Node version..

Thu Feb 28 09:54:03 2019 - [info] Version check ok.

Thu Feb 28 09:54:03 2019 - [info] Checking SSH publickey authentication settings on the current master..

Thu Feb 28 09:54:04 2019 - [info] HealthCheck: SSH to 192.168.13.190 is reachable.

Thu Feb 28 09:54:04 2019 - [info] Master MHA Node version is 0.54.

Thu Feb 28 09:54:04 2019 - [info] Checking recovery script configurations on 192.168.13.190(192.168.13.190:3306)..

Thu Feb 28 09:54:04 2019 - [info] Executing command: save_binary_logs --command=test --start_pos=4 --binlog_dir=/var/lib/mysql --output_file=/var/tmp/save_binary_logs_test --manager_version=0.58 --start_file=on.000001

Thu Feb 28 09:54:04 2019 - [info] Connecting to root@192.168.13.190(192.168.13.190:22)..

Creating /var/tmp if not exists.. ok.

Checking output directory is accessible or not..

ok.

Binlog found at /var/lib/mysql, up to on.000001

Thu Feb 28 09:54:05 2019 - [info] Binlog setting check done.

Thu Feb 28 09:54:05 2019 - [info] Checking SSH publickey authentication and checking recovery script configurations on all alive slave servers..

Thu Feb 28 09:54:05 2019 - [info] Executing command : apply_diff_relay_logs --command=test --slave_user='root' --slave_host=192.168.13.189 --slave_ip=192.168.13.189 --slave_port=3306 --workdir=/var/tmp --target_version=8.0.13 --manager_version=0.58 --relay_dir=/var/lib/mysql --current_relay_log=ybb-test-mysql-1-relay-bin.000002 --slave_pass=xxx

Thu Feb 28 09:54:05 2019 - [info] Connecting to root@192.168.13.189(192.168.13.189:22)..

Checking slave recovery environment settings..

Relay log found at /var/lib/mysql, up to ybb-test-mysql-1-relay-bin.000002

Temporary relay log file is /var/lib/mysql/ybb-test-mysql-1-relay-bin.000002

Testing mysql connection and privileges..mysql: [Warning] Using a password on the command line interface can be insecure.

done.

Testing mysqlbinlog output.. done.

Cleaning up test file(s).. done.

Thu Feb 28 09:54:05 2019 - [info] Executing command : apply_diff_relay_logs --command=test --slave_user='root' --slave_host=192.168.13.192 --slave_ip=192.168.13.192 --slave_port=3306 --workdir=/var/tmp --target_version=8.0.13 --manager_version=0.58 --relay_dir=/var/lib/mysql --current_relay_log=ybb-test-mysql-4-relay-bin.000002 --slave_pass=xxx

Thu Feb 28 09:54:05 2019 - [info] Connecting to root@192.168.13.192(192.168.13.192:22)..

Checking slave recovery environment settings..

Relay log found at /var/lib/mysql, up to ybb-test-mysql-4-relay-bin.000002

Temporary relay log file is /var/lib/mysql/ybb-test-mysql-4-relay-bin.000002

Checking if super_read_only is defined and turned on.. not present or turned off, ignoring.

Testing mysql connection and privileges..

mysql: [Warning] Using a password on the command line interface can be insecure.

done.

Testing mysqlbinlog output.. done.

Cleaning up test file(s).. done.

Thu Feb 28 09:54:06 2019 - [info] Slaves settings check done.

Thu Feb 28 09:54:06 2019 - [info]

192.168.13.190(192.168.13.190:3306) (current master)

+--192.168.13.189(192.168.13.189:3306)

+--192.168.13.192(192.168.13.192:3306)

Thu Feb 28 09:54:06 2019 - [info] Checking replication health on 192.168.13.189..

Thu Feb 28 09:54:06 2019 - [info] ok.

Thu Feb 28 09:54:06 2019 - [info] Checking replication health on 192.168.13.192..

Thu Feb 28 09:54:06 2019 - [info] ok.

Thu Feb 28 09:54:06 2019 - [warning] master_ip_failover_script is not defined.

Thu Feb 28 09:54:06 2019 - [warning] shutdown_script is not defined.

Thu Feb 28 09:54:06 2019 - [info] Got exit code 0 (Not master dead).

MySQL Replication Health is OK.

[root@ybb-test-02-6 mha]#

4.2.5 Startup and related log files

(1) Start:

nohup masterha_manager --conf=/etc/mha/mha.cnf > /tmp/mha_manager.log < /dev/null 2>&1 &

[root@ybb-test-02-6 mha]# nohup masterha_manager --conf=/etc/mha/mha.cnf > /tmp/mha_manager.log < /dev/null 2>&1 &

[1] 6492

[root@ybb-test-02-6 mha]# jobs

[1]+ Running nohup masterha_manager --conf=/etc/mha/mha.cnf > /tmp/mha_manager.log < /dev/null 2>&1 &

[root@ybb-test-02-6 mha]#

(2) Log files

manager.log is MHA's startup log. It is created automatically once startup succeeds and records the startup sequence.

[root@ybb-test-02-6 mha]# tail -f manager.log

Thu Feb 28 10:05:57 2019 - [warning] master_ip_failover_script is not defined.

Thu Feb 28 10:05:57 2019 - [warning] shutdown_script is not defined.

Thu Feb 28 10:05:57 2019 - [info] Set master ping interval 1 seconds.

Thu Feb 28 10:05:57 2019 - [warning] secondary_check_script is not defined. It is highly recommended setting it to check master reachability from two or more routes.

Thu Feb 28 10:05:57 2019 - [info] Starting ping health check on 192.168.13.190(192.168.13.190:3306)..

Thu Feb 28 10:06:01 2019 - [warning] Got error when monitoring master: at /usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm line 489.

Thu Feb 28 10:06:01 2019 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln491] Target master's advisory lock is already held by someone. Please check whether you monitor the same master from multiple monitoring processes.

Thu Feb 28 10:06:01 2019 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln511] Error happened on health checking. at /usr/bin/masterha_manager line 50.

Thu Feb 28 10:06:01 2019 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln525] Error happened on monitoring servers.

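Thu Feb 28 10:06:01 2019 - [info] Got exit code 1 (Not master dead).    [author's note: these messages appear by default; as the log itself says, the advisory-lock error means the same master is being monitored by more than one monitoring process]

The full startup sequence follows: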

Thu Feb 28 10:06:41 2019 - [info] MHA::MasterMonitor version 0.58.

Thu Feb 28 10:06:41 2019 - [warning] /etc/mha/mha.master_status.health already exists. You might have killed manager with SIGKILL(-9), may run two or more monitoring process for the same application, or use the same working directory. Check for details, and consider setting --workdir separately.

Thu Feb 28 10:06:43 2019 - [info] Multi-master configuration is detected. Current primary(writable) master is 192.168.13.190(192.168.13.190:3306)

Thu Feb 28 10:06:43 2019 - [info] Master configurations are as below:

Master 192.168.13.190(192.168.13.190:3306), replicating from 192.168.13.189(192.168.13.189:3306)

Master 192.168.13.189(192.168.13.189:3306), replicating from 192.168.13.190(192.168.13.190:3306), read-only

Thu Feb 28 10:06:43 2019 - [info] GTID failover mode = 0

Thu Feb 28 10:06:43 2019 - [info] Dead Servers:

Thu Feb 28 10:06:43 2019 - [info] Alive Servers:

Thu Feb 28 10:06:43 2019 - [info] 192.168.13.189(192.168.13.189:3306)

Thu Feb 28 10:06:43 2019 - [info] 192.168.13.190(192.168.13.190:3306)

Thu Feb 28 10:06:43 2019 - [info] 192.168.13.192(192.168.13.192:3306)

Thu Feb 28 10:06:43 2019 - [info] Alive Slaves:

Thu Feb 28 10:06:43 2019 - [info] 192.168.13.189(192.168.13.189:3306) Version=8.0.13 (oldest major version between slaves) log-bin:enabled

Thu Feb 28 10:06:43 2019 - [info] Replicating from 192.168.13.190(192.168.13.190:3306)

Thu Feb 28 10:06:43 2019 - [info] Primary candidate for the new Master (candidate_master is set)

Thu Feb 28 10:06:43 2019 - [info] 192.168.13.192(192.168.13.192:3306) Version=8.0.13 (oldest major version between slaves) log-bin:enabled

Thu Feb 28 10:06:43 2019 - [info] Replicating from 192.168.13.190(192.168.13.190:3306)

Thu Feb 28 10:06:43 2019 - [info] Primary candidate for the new Master (candidate_master is set)

Thu Feb 28 10:06:43 2019 - [info] Current Alive Master: 192.168.13.190(192.168.13.190:3306)

Thu Feb 28 10:06:43 2019 - [info] Checking slave configurations..

Thu Feb 28 10:06:43 2019 - [warning] relay_log_purge=0 is not set on slave 192.168.13.189(192.168.13.189:3306).

Thu Feb 28 10:06:43 2019 - [warning] relay_log_purge=0 is not set on slave 192.168.13.192(192.168.13.192:3306).

Thu Feb 28 10:06:43 2019 - [info] Checking replication filtering settings..

Thu Feb 28 10:06:43 2019 - [info] binlog_do_db= , binlog_ignore_db=

Thu Feb 28 10:06:43 2019 - [info] Replication filtering check ok.

Thu Feb 28 10:06:43 2019 - [info] GTID (with auto-pos) is not supported

Thu Feb 28 10:06:43 2019 - [info] Starting SSH connection tests..

Thu Feb 28 10:06:46 2019 - [info] All SSH connection tests passed successfully.

Thu Feb 28 10:06:46 2019 - [info] Checking MHA Node version..

Thu Feb 28 10:06:47 2019 - [info] Version check ok.

Thu Feb 28 10:06:47 2019 - [info] Checking SSH publickey authentication settings on the current master..

Thu Feb 28 10:06:47 2019 - [info] HealthCheck: SSH to 192.168.13.190 is reachable.

Thu Feb 28 10:06:48 2019 - [info] Master MHA Node version is 0.54.

Thu Feb 28 10:06:48 2019 - [info] Checking recovery script configurations on 192.168.13.190(192.168.13.190:3306)..

Thu Feb 28 10:06:48 2019 - [info] Executing command: save_binary_logs --command=test --start_pos=4 --binlog_dir=/var/lib/mysql --output_file=/var/tmp/save_binary_logs_test --manager_version=0.58 --start_file=on.000001

Thu Feb 28 10:06:48 2019 - [info] Connecting to root@192.168.13.190(192.168.13.190:22)..

Creating /var/tmp if not exists.. ok.

Checking output directory is accessible or not..

ok.

Binlog found at /var/lib/mysql, up to on.000001

Thu Feb 28 10:06:48 2019 - [info] Binlog setting check done.

Thu Feb 28 10:06:48 2019 - [info] Checking SSH publickey authentication and checking recovery script configurations on all alive slave servers..

Thu Feb 28 10:06:48 2019 - [info] Executing command : apply_diff_relay_logs --command=test --slave_user='root' --slave_host=192.168.13.189 --slave_ip=192.168.13.189 --slave_port=3306 --workdir=/var/tmp --target_version=8.0.13 --manager_version=0.58 --relay_dir=/var/lib/mysql --current_relay_log=ybb-test-mysql-1-relay-bin.000002 --slave_pass=xxx

Thu Feb 28 10:06:48 2019 - [info] Connecting to root@192.168.13.189(192.168.13.189:22)..

Checking slave recovery environment settings..

Relay log found at /var/lib/mysql, up to ybb-test-mysql-1-relay-bin.000002

Temporary relay log file is /var/lib/mysql/ybb-test-mysql-1-relay-bin.000002

Testing mysql connection and privileges..mysql: [Warning] Using a password on the command line interface can be insecure.

done.

Testing mysqlbinlog output.. done.

Cleaning up test file(s).. done.

Thu Feb 28 10:06:49 2019 - [info] Executing command : apply_diff_relay_logs --command=test --slave_user='root' --slave_host=192.168.13.192 --slave_ip=192.168.13.192 --slave_port=3306 --workdir=/var/tmp --target_version=8.0.13 --manager_version=0.58 --relay_dir=/var/lib/mysql --current_relay_log=ybb-test-mysql-4-relay-bin.000002 --slave_pass=xxx

Thu Feb 28 10:06:49 2019 - [info] Connecting to root@192.168.13.192(192.168.13.192:22)..

Checking slave recovery environment settings..

Relay log found at /var/lib/mysql, up to ybb-test-mysql-4-relay-bin.000002

Temporary relay log file is /var/lib/mysql/ybb-test-mysql-4-relay-bin.000002

Checking if super_read_only is defined and turned on.. not present or turned off, ignoring.

Testing mysql connection and privileges..

mysql: [Warning] Using a password on the command line interface can be insecure.

done.

Testing mysqlbinlog output.. done.

Cleaning up test file(s).. done.

Thu Feb 28 10:06:50 2019 - [info] Slaves settings check done.

Thu Feb 28 10:06:50 2019 - [info]

192.168.13.190(192.168.13.190:3306) (current master)

+--192.168.13.189(192.168.13.189:3306)

+--192.168.13.192(192.168.13.192:3306)

Thu Feb 28 10:06:50 2019 - [warning] master_ip_failover_script is not defined.

Thu Feb 28 10:06:50 2019 - [warning] shutdown_script is not defined.

Thu Feb 28 10:06:50 2019 - [info] Set master ping interval 1 seconds.

Thu Feb 28 10:06:50 2019 - [warning] secondary_check_script is not defined. It is highly recommended setting it to check master reachability from two or more routes.

Thu Feb 28 10:06:50 2019 - [info] Starting ping health check on 192.168.13.190(192.168.13.190:3306)..

Thu Feb 28 10:06:54 2019 - [warning] Got error when monitoring master: at /usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm line 489.

Thu Feb 28 10:06:54 2019 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln491] Target master's advisory lock is already held by someone. Please check whether you monitor the same master from multiple monitoring processes.

Thu Feb 28 10:06:54 2019 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln511] Error happened on health checking. at /usr/bin/masterha_manager line 50.

Thu Feb 28 10:06:54 2019 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln525] Error happened on monitoring servers.

Thu Feb 28 10:06:54 2019 - [info] Got exit code 1(Not master dead).

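Startup succeeded. The advisory-lock error and exit code 1 at the end of this output most likely mean that another monitoring process was already watching 192.168.13.190 when this one was launched, as the error message itself suggests.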
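To avoid launching a second manager against the same master, a guard can be placed in front of the start command. This is only a minimal sketch: the function name `start_manager_once` is hypothetical, and it assumes that `masterha_check_status` exits with 0 when a manager is already running for the given configuration file.

```shell
# Start masterha_manager only if no manager is already monitoring this config.
# Assumption: masterha_check_status exits 0 when a manager is running.
start_manager_once() {
    conf="$1"      # e.g. /etc/mha/mha.cnf
    logfile="$2"   # e.g. /etc/mha/manager.log
    if masterha_check_status --conf="$conf" >/dev/null 2>&1; then
        echo "MHA manager already running for $conf, not starting a second one"
        return 0
    fi
    # No manager found: start one in the background and append to the log.
    nohup masterha_manager --conf="$conf" >>"$logfile" 2>&1 &
    echo "MHA manager started for $conf"
}
```

Usage: `start_manager_once /etc/mha/mha.cnf /etc/mha/manager.log`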

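4.2.6 Testing MHA

The purpose of MHA is that, when the master fails, the failed node is removed promptly and a new master is elected; the other slaves are then repointed to the new master so that service is not interrupted.

(1) Stop MySQL on the master (192.168.13.190).

(2) Check the MHA manager log, manager.log.

Only the key lines need attention: the detection of the dead master ("Master is down!"), the failover phases, and the final failover report.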

[root@ybb-test-02-6 mha]# tail -f manager.log

Thu Feb 28 10:14:54 2019 - [warning] Got error on MySQL select ping: 2013 (Lost connection to MySQL server during query)

Thu Feb 28 10:14:54 2019 - [info] Executing SSH check script: save_binary_logs --command=test --start_pos=4 --binlog_dir=/var/lib/mysql --output_file=/var/tmp/save_binary_logs_test --manager_version=0.58 --binlog_prefix=on

Thu Feb 28 10:14:55 2019 - [info] HealthCheck: SSH to 192.168.13.190 is reachable.

Thu Feb 28 10:14:55 2019 - [warning] Got error on MySQL connect: 2003 (Can't connect to MySQL server on '192.168.13.190' (111))

Thu Feb 28 10:14:55 2019 - [warning] Connection failed 2 time(s)..

Thu Feb 28 10:14:56 2019 - [warning] Got error on MySQL connect: 2003 (Can't connect to MySQL server on '192.168.13.190' (111))

Thu Feb 28 10:14:56 2019 - [warning] Connection failed 3 time(s)..

Thu Feb 28 10:14:57 2019 - [warning] Got error on MySQL connect: 2003 (Can't connect to MySQL server on '192.168.13.190' (111))

Thu Feb 28 10:14:57 2019 - [warning] Connection failed 4 time(s)..

Thu Feb 28 10:14:57 2019 - [warning] Master is not reachable from health checker!

Thu Feb 28 10:14:57 2019 - [warning] Master 192.168.13.190(192.168.13.190:3306) is not reachable!

Thu Feb 28 10:14:57 2019 - [warning] SSH is reachable.

Thu Feb 28 10:14:57 2019 - [info] Connecting to a master server failed. Reading configuration file /etc/masterha_default.cnf and /etc/mha/mha.cnf again, and trying to connect to all servers to check server status..

Thu Feb 28 10:14:57 2019 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.

Thu Feb 28 10:14:57 2019 - [info] Reading application default configuration from /etc/mha/mha.cnf..

Thu Feb 28 10:14:57 2019 - [info] Reading server configuration from /etc/mha/mha.cnf..

Thu Feb 28 10:14:58 2019 - [info] GTID failover mode = 0

Thu Feb 28 10:14:58 2019 - [info] Dead Servers:

Thu Feb 28 10:14:58 2019 - [info] 192.168.13.190(192.168.13.190:3306)

Thu Feb 28 10:14:58 2019 - [info] Alive Servers:

Thu Feb 28 10:14:58 2019 - [info] 192.168.13.189(192.168.13.189:3306)

Thu Feb 28 10:14:58 2019 - [info] 192.168.13.192(192.168.13.192:3306)

Thu Feb 28 10:14:58 2019 - [info] Alive Slaves:

Thu Feb 28 10:14:58 2019 - [info] 192.168.13.189(192.168.13.189:3306) Version=8.0.13 (oldest major version between slaves) log-bin:enabled

Thu Feb 28 10:14:58 2019 - [info] Replicating from 192.168.13.190(192.168.13.190:3306)

Thu Feb 28 10:14:58 2019 - [info] Primary candidate for the new Master (candidate_master is set)

Thu Feb 28 10:14:58 2019 - [info] 192.168.13.192(192.168.13.192:3306) Version=8.0.13 (oldest major version between slaves) log-bin:enabled

Thu Feb 28 10:14:58 2019 - [info] Replicating from 192.168.13.190(192.168.13.190:3306)

Thu Feb 28 10:14:58 2019 - [info] Primary candidate for the new Master (candidate_master is set)

Thu Feb 28 10:14:58 2019 - [info] Checking slave configurations..

Thu Feb 28 10:14:58 2019 - [warning] relay_log_purge=0 is not set on slave 192.168.13.189(192.168.13.189:3306).

Thu Feb 28 10:14:58 2019 - [warning] relay_log_purge=0 is not set on slave 192.168.13.192(192.168.13.192:3306).

Thu Feb 28 10:14:58 2019 - [info] Checking replication filtering settings..

Thu Feb 28 10:14:58 2019 - [info] Replication filtering check ok.

Thu Feb 28 10:14:58 2019 - [info] Master is down!

Thu Feb 28 10:14:58 2019 - [info] Terminating monitoring script.

Thu Feb 28 10:14:58 2019 - [info] Got exit code 20 (Master dead).

Thu Feb 28 10:14:58 2019 - [info] MHA::MasterFailover version 0.58.

Thu Feb 28 10:14:58 2019 - [info] Starting master failover.

Thu Feb 28 10:14:58 2019 - [info]

Thu Feb 28 10:14:58 2019 - [info] * Phase 1: Configuration Check Phase..

Thu Feb 28 10:14:58 2019 - [info]

Thu Feb 28 10:15:00 2019 - [info] GTID failover mode = 0

Thu Feb 28 10:15:00 2019 - [info] Dead Servers:

Thu Feb 28 10:15:00 2019 - [info] 192.168.13.190(192.168.13.190:3306)

Thu Feb 28 10:15:00 2019 - [info] Checking master reachability via MySQL(double check)...

Thu Feb 28 10:15:00 2019 - [info] ok.

Thu Feb 28 10:15:00 2019 - [info] Alive Servers:

Thu Feb 28 10:15:00 2019 - [info] 192.168.13.189(192.168.13.189:3306)

Thu Feb 28 10:15:00 2019 - [info] 192.168.13.192(192.168.13.192:3306)

Thu Feb 28 10:15:00 2019 - [info] Alive Slaves:

Thu Feb 28 10:15:00 2019 - [info] 192.168.13.189(192.168.13.189:3306) Version=8.0.13 (oldest major version between slaves) log-bin:enabled

Thu Feb 28 10:15:00 2019 - [info] Replicating from 192.168.13.190(192.168.13.190:3306)

Thu Feb 28 10:15:00 2019 - [info] Primary candidate for the new Master (candidate_master is set)

Thu Feb 28 10:15:00 2019 - [info] 192.168.13.192(192.168.13.192:3306) Version=8.0.13 (oldest major version between slaves) log-bin:enabled

Thu Feb 28 10:15:00 2019 - [info] Replicating from 192.168.13.190(192.168.13.190:3306)

Thu Feb 28 10:15:00 2019 - [info] Primary candidate for the new Master (candidate_master is set)

Thu Feb 28 10:15:00 2019 - [info] Starting Non-GTID based failover.

Thu Feb 28 10:15:00 2019 - [info]

Thu Feb 28 10:15:00 2019 - [info] ** Phase 1: Configuration Check Phase completed.

Thu Feb 28 10:15:00 2019 - [info]

Thu Feb 28 10:15:00 2019 - [info] * Phase 2: Dead Master Shutdown Phase..

Thu Feb 28 10:15:00 2019 - [info]

Thu Feb 28 10:15:00 2019 - [info] Forcing shutdown so that applications never connect to the current master..

Thu Feb 28 10:15:00 2019 - [warning] master_ip_failover_script is not set. Skipping invalidating dead master IP address.

Thu Feb 28 10:15:00 2019 - [warning] shutdown_script is not set. Skipping explicit shutting down of the dead master.

Thu Feb 28 10:15:01 2019 - [info] * Phase 2: Dead Master Shutdown Phase completed.

Thu Feb 28 10:15:01 2019 - [info]

Thu Feb 28 10:15:01 2019 - [info] * Phase 3: Master Recovery Phase..

Thu Feb 28 10:15:01 2019 - [info]

Thu Feb 28 10:15:01 2019 - [info] * Phase 3.1: Getting Latest Slaves Phase..

Thu Feb 28 10:15:01 2019 - [info]

Thu Feb 28 10:15:01 2019 - [info] The latest binary log file/position on all slaves is on.000001:661

Thu Feb 28 10:15:01 2019 - [info] Latest slaves (Slaves that received relay log files to the latest):

Thu Feb 28 10:15:01 2019 - [info] 192.168.13.189(192.168.13.189:3306) Version=8.0.13 (oldest major version between slaves) log-bin:enabled

Thu Feb 28 10:15:01 2019 - [info] Replicating from 192.168.13.190(192.168.13.190:3306)

Thu Feb 28 10:15:01 2019 - [info] Primary candidate for the new Master (candidate_master is set)

Thu Feb 28 10:15:01 2019 - [info] 192.168.13.192(192.168.13.192:3306) Version=8.0.13 (oldest major version between slaves) log-bin:enabled

Thu Feb 28 10:15:01 2019 - [info] Replicating from 192.168.13.190(192.168.13.190:3306)

Thu Feb 28 10:15:01 2019 - [info] Primary candidate for the new Master (candidate_master is set)

Thu Feb 28 10:15:01 2019 - [info] The oldest binary log file/position on all slaves is on.000001:661

Thu Feb 28 10:15:01 2019 - [info] Oldest slaves:

Thu Feb 28 10:15:01 2019 - [info] 192.168.13.189(192.168.13.189:3306) Version=8.0.13 (oldest major version between slaves) log-bin:enabled

Thu Feb 28 10:15:01 2019 - [info] Replicating from 192.168.13.190(192.168.13.190:3306)

Thu Feb 28 10:15:01 2019 - [info] Primary candidate for the new Master (candidate_master is set)

Thu Feb 28 10:15:01 2019 - [info] 192.168.13.192(192.168.13.192:3306) Version=8.0.13 (oldest major version between slaves) log-bin:enabled

Thu Feb 28 10:15:01 2019 - [info] Replicating from 192.168.13.190(192.168.13.190:3306)

Thu Feb 28 10:15:01 2019 - [info] Primary candidate for the new Master (candidate_master is set)

Thu Feb 28 10:15:01 2019 - [info]

Thu Feb 28 10:15:01 2019 - [info] * Phase 3.2: Saving Dead Master's Binlog Phase..

Thu Feb 28 10:15:01 2019 - [info]

Thu Feb 28 10:15:01 2019 - [info] Fetching dead master's binary logs..

Thu Feb 28 10:15:01 2019 - [info] Executing command on the dead master 192.168.13.190(192.168.13.190:3306): save_binary_logs --command=save --start_file=on.000001 --start_pos=661 --binlog_dir=/var/lib/mysql --output_file=/var/tmp/saved_master_binlog_from_192.168.13.190_3306_20190228101458.binlog --handle_raw_binlog=1 --disable_log_bin=0 --manager_version=0.58

Creating /var/tmp if not exists.. ok.

Concat binary/relay logs from on.000001 pos 661 to on.000001 EOF into /var/tmp/saved_master_binlog_from_192.168.13.190_3306_20190228101458.binlog ..

Dumping binlog format description event, from position 0 to 124.. ok.

Dumping effective binlog data from /var/lib/mysql/on.000001 position 661 to tail(684).. ok.

Concat succeeded.

Thu Feb 28 10:15:03 2019 - [info] scp from root@192.168.13.190:/var/tmp/saved_master_binlog_from_192.168.13.190_3306_20190228101458.binlog to local:/etc/mha/saved_master_binlog_from_192.168.13.190_3306_20190228101458.binlog succeeded.

Thu Feb 28 10:15:03 2019 - [info] HealthCheck: SSH to 192.168.13.189 is reachable.

Thu Feb 28 10:15:04 2019 - [info] HealthCheck: SSH to 192.168.13.192 is reachable.

Thu Feb 28 10:15:04 2019 - [info]

Thu Feb 28 10:15:04 2019 - [info] * Phase 3.3: Determining New Master Phase..

Thu Feb 28 10:15:04 2019 - [info]

Thu Feb 28 10:15:04 2019 - [info] Finding the latest slave that has all relay logs for recovering other slaves..

Thu Feb 28 10:15:04 2019 - [info] All slaves received relay logs to the same position. No need to resync each other.

Thu Feb 28 10:15:04 2019 - [info] Searching new master from slaves..

Thu Feb 28 10:15:04 2019 - [info] Candidate masters from the configuration file:

Thu Feb 28 10:15:04 2019 - [info] 192.168.13.189(192.168.13.189:3306) Version=8.0.13 (oldest major version between slaves) log-bin:enabled

Thu Feb 28 10:15:04 2019 - [info] Replicating from 192.168.13.190(192.168.13.190:3306)

Thu Feb 28 10:15:04 2019 - [info] Primary candidate for the new Master (candidate_master is set)

Thu Feb 28 10:15:04 2019 - [info] 192.168.13.192(192.168.13.192:3306) Version=8.0.13 (oldest major version between slaves) log-bin:enabled

Thu Feb 28 10:15:04 2019 - [info] Replicating from 192.168.13.190(192.168.13.190:3306)

Thu Feb 28 10:15:04 2019 - [info] Primary candidate for the new Master (candidate_master is set)

Thu Feb 28 10:15:04 2019 - [info] Non-candidate masters:

Thu Feb 28 10:15:04 2019 - [info] Searching from candidate_master slaves which have received the latest relay log events..

Thu Feb 28 10:15:04 2019 - [info] New master is 192.168.13.189(192.168.13.189:3306)

Thu Feb 28 10:15:04 2019 - [info] Starting master failover..

Thu Feb 28 10:15:04 2019 - [info]

From:

192.168.13.190(192.168.13.190:3306) (current master)

+--192.168.13.189(192.168.13.189:3306)

+--192.168.13.192(192.168.13.192:3306)

To:

192.168.13.189(192.168.13.189:3306) (new master)

+--192.168.13.192(192.168.13.192:3306)

Thu Feb 28 10:15:04 2019 - [info]

Thu Feb 28 10:15:04 2019 - [info] * Phase 3.4: New Master Diff Log Generation Phase..

Thu Feb 28 10:15:04 2019 - [info]

Thu Feb 28 10:15:04 2019 - [info] This server has all relay logs. No need to generate diff files from the latest slave.

Thu Feb 28 10:15:04 2019 - [info] Sending binlog..

Thu Feb 28 10:15:05 2019 - [info] scp from local:/etc/mha/saved_master_binlog_from_192.168.13.190_3306_20190228101458.binlog to root@192.168.13.189:/var/tmp/saved_master_binlog_from_192.168.13.190_3306_20190228101458.binlog succeeded.

Thu Feb 28 10:15:05 2019 - [info]

Thu Feb 28 10:15:05 2019 - [info] * Phase 3.5: Master Log Apply Phase..

Thu Feb 28 10:15:05 2019 - [info]

Thu Feb 28 10:15:05 2019 - [info] *NOTICE: If any error happens from this phase, manual recovery is needed.

Thu Feb 28 10:15:05 2019 - [info] Starting recovery on 192.168.13.189(192.168.13.189:3306)..

Thu Feb 28 10:15:05 2019 - [info] Generating diffs succeeded.

Thu Feb 28 10:15:05 2019 - [info] Waiting until all relay logs are applied.

Thu Feb 28 10:15:05 2019 - [info] done.

Thu Feb 28 10:15:05 2019 - [info] Getting slave status..

Thu Feb 28 10:15:05 2019 - [info] This slave(192.168.13.189)'s Exec_Master_Log_Pos equals to Read_Master_Log_Pos(on.000001:661). No need to recover from Exec_Master_Log_Pos.

Thu Feb 28 10:15:05 2019 - [info] Connecting to the target slave host 192.168.13.189, running recover script..

Thu Feb 28 10:15:05 2019 - [info] Executing command: apply_diff_relay_logs --command=apply --slave_user='root' --slave_host=192.168.13.189 --slave_ip=192.168.13.189 --slave_port=3306 --apply_files=/var/tmp/saved_master_binlog_from_192.168.13.190_3306_20190228101458.binlog --workdir=/var/tmp --target_version=8.0.13 --timestamp=20190228101458 --handle_raw_binlog=1 --disable_log_bin=0 --manager_version=0.58 --slave_pass=xxx

Thu Feb 28 10:15:05 2019 - [info]

Applying differential binary/relay log files /var/tmp/saved_master_binlog_from_192.168.13.190_3306_20190228101458.binlog on 192.168.13.189:3306. This may take long time...

Applying log files succeeded.

Thu Feb 28 10:15:05 2019 - [info] All relay logs were successfully applied.

Thu Feb 28 10:15:05 2019 - [info] Getting new master's binlog name and position..

Thu Feb 28 10:15:06 2019 - [info] mysql-bin.000009:654

Thu Feb 28 10:15:06 2019 - [info] All other slaves should start replication from here. Statement should be: CHANGE MASTER TO MASTER_HOST='192.168.13.189', MASTER_PORT=3306, MASTER_LOG_FILE='mysql-bin.000009', MASTER_LOG_POS=654, MASTER_USER='slave', MASTER_PASSWORD='xxx';

Thu Feb 28 10:15:06 2019 - [warning] master_ip_failover_script is not set. Skipping taking over new master IP address.

Thu Feb 28 10:15:06 2019 - [info] Setting read_only=0 on 192.168.13.189(192.168.13.189:3306)..

Thu Feb 28 10:15:06 2019 - [info] ok.

Thu Feb 28 10:15:06 2019 - [info] ** Finished master recovery successfully.

Thu Feb 28 10:15:06 2019 - [info] * Phase 3: Master Recovery Phase completed.

Thu Feb 28 10:15:06 2019 - [info]

Thu Feb 28 10:15:06 2019 - [info] * Phase 4: Slaves Recovery Phase..

Thu Feb 28 10:15:06 2019 - [info]

Thu Feb 28 10:15:06 2019 - [info] * Phase 4.1: Starting Parallel Slave Diff Log Generation Phase..

Thu Feb 28 10:15:06 2019 - [info]

Thu Feb 28 10:15:06 2019 - [info] -- Slave diff file generation on host 192.168.13.192(192.168.13.192:3306) started, pid: 11154. Check tmp log /etc/mha/192.168.13.192_3306_20190228101458.log if it takes time..

Thu Feb 28 10:15:07 2019 - [info]

Thu Feb 28 10:15:07 2019 - [info] Log messages from 192.168.13.192 ...

Thu Feb 28 10:15:07 2019 - [info]

Thu Feb 28 10:15:06 2019 - [info] This server has all relay logs. No need to generate diff files from the latest slave.

Thu Feb 28 10:15:07 2019 - [info] End of log messages from 192.168.13.192.

Thu Feb 28 10:15:07 2019 - [info] -- 192.168.13.192(192.168.13.192:3306) has the latest relay log events.

Thu Feb 28 10:15:07 2019 - [info] Generating relay diff files from the latest slave succeeded.

Thu Feb 28 10:15:07 2019 - [info]

Thu Feb 28 10:15:07 2019 - [info] * Phase 4.2: Starting Parallel Slave Log Apply Phase..

Thu Feb 28 10:15:07 2019 - [info]

Thu Feb 28 10:15:07 2019 - [info] -- Slave recovery on host 192.168.13.192(192.168.13.192:3306) started, pid: 11157. Check tmp log /etc/mha/192.168.13.192_3306_20190228101458.log if it takes time..

Thu Feb 28 10:15:09 2019 - [info]

Thu Feb 28 10:15:09 2019 - [info] Log messages from 192.168.13.192 ...

Thu Feb 28 10:15:09 2019 - [info]

Thu Feb 28 10:15:07 2019 - [info] Sending binlog..

Thu Feb 28 10:15:07 2019 - [info] scp from local:/etc/mha/saved_master_binlog_from_192.168.13.190_3306_20190228101458.binlog to root@192.168.13.192:/var/tmp/saved_master_binlog_from_192.168.13.190_3306_20190228101458.binlog succeeded.

Thu Feb 28 10:15:07 2019 - [info] Starting recovery on 192.168.13.192(192.168.13.192:3306)..

Thu Feb 28 10:15:07 2019 - [info] Generating diffs succeeded.

Thu Feb 28 10:15:07 2019 - [info] Waiting until all relay logs are applied.

Thu Feb 28 10:15:07 2019 - [info] done.

Thu Feb 28 10:15:07 2019 - [info] Getting slave status..

Thu Feb 28 10:15:07 2019 - [info] This slave(192.168.13.192)'s Exec_Master_Log_Pos equals to Read_Master_Log_Pos(on.000001:661). No need to recover from Exec_Master_Log_Pos.

Thu Feb 28 10:15:07 2019 - [info] Connecting to the target slave host 192.168.13.192, running recover script..

Thu Feb 28 10:15:07 2019 - [info] Executing command: apply_diff_relay_logs --command=apply --slave_user='root' --slave_host=192.168.13.192 --slave_ip=192.168.13.192 --slave_port=3306 --apply_files=/var/tmp/saved_master_binlog_from_192.168.13.190_3306_20190228101458.binlog --workdir=/var/tmp --target_version=8.0.13 --timestamp=20190228101458 --handle_raw_binlog=1 --disable_log_bin=0 --manager_version=0.58 --slave_pass=xxx

Thu Feb 28 10:15:08 2019 - [info]

Applying differential binary/relay log files /var/tmp/saved_master_binlog_from_192.168.13.190_3306_20190228101458.binlog on 192.168.13.192:3306. This may take long time...

Applying log files succeeded.

Thu Feb 28 10:15:08 2019 - [info] All relay logs were successfully applied.

Thu Feb 28 10:15:08 2019 - [info] Resetting slave 192.168.13.192(192.168.13.192:3306) and starting replication from the new master 192.168.13.189(192.168.13.189:3306)..

Thu Feb 28 10:15:08 2019 - [info] Executed CHANGE MASTER.

Thu Feb 28 10:15:08 2019 - [info] Slave started.

Thu Feb 28 10:15:09 2019 - [info] End of log messages from 192.168.13.192.

Thu Feb 28 10:15:09 2019 - [info] -- Slave recovery on host 192.168.13.192(192.168.13.192:3306) succeeded.

Thu Feb 28 10:15:09 2019 - [info] All new slave servers recovered successfully.

Thu Feb 28 10:15:09 2019 - [info]

Thu Feb 28 10:15:09 2019 - [info] * Phase 5: New master cleanup phase..

Thu Feb 28 10:15:09 2019 - [info]

Thu Feb 28 10:15:09 2019 - [info] Resetting slave info on the new master..

Thu Feb 28 10:15:09 2019 - [info] 192.168.13.189: Resetting slave info succeeded.

Thu Feb 28 10:15:09 2019 - [info] Master failover to 192.168.13.189(192.168.13.189:3306) completed successfully.

Thu Feb 28 10:15:09 2019 - [info]

----- Failover Report -----

mha: MySQL Master failover 192.168.13.190(192.168.13.190:3306) to 192.168.13.189(192.168.13.189:3306) succeeded

Master 192.168.13.190(192.168.13.190:3306) is down!

Check MHA Manager logs at ybb-test-02-6:/etc/mha/manager.log for details.

Started automated(non-interactive) failover.

The latest slave 192.168.13.189(192.168.13.189:3306) has all relay logs for recovery.

Selected 192.168.13.189(192.168.13.189:3306) as a new master.

192.168.13.189(192.168.13.189:3306): OK: Applying all logs succeeded.

192.168.13.192(192.168.13.192:3306): This host has the latest relay log events.

Generating relay diff files from the latest slave succeeded.

192.168.13.192(192.168.13.192:3306): OK: Applying all logs succeeded. Slave started, replicating from 192.168.13.189(192.168.13.189:3306)

192.168.13.189(192.168.13.189:3306): Resetting slave info succeeded.

Master failover to 192.168.13.189(192.168.13.189:3306) completed successfully.
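The manager.log output above is long, so it can help to print only its milestone lines. The following is a sketch, not part of MHA; the function name `mha_failover_summary` and the choice of patterns are assumptions based on the messages that appear in this log.

```shell
# Print only the milestone lines of an MHA failover from a manager log:
# master detection, phase headers, the chosen new master, and the final result.
mha_failover_summary() {
    grep -E 'Master is down!|\* Phase [0-9.]+: |New master is |completed successfully' "$1"
}
```

Usage: `mha_failover_summary /etc/mha/manager.log`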

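Check whether the master of MySQL on 192.168.13.192 has switched over successfully. In the output of show slave status\G, Master_Host should now be 192.168.13.189.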
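One way to run this check from the shell is sketched below. The helper name `replication_master_host` is hypothetical; it only parses the text that `SHOW SLAVE STATUS\G` prints.

```shell
# Extract the current replication source host from `SHOW SLAVE STATUS\G`
# output supplied on stdin.
replication_master_host() {
    awk -F': ' '$1 ~ /^[[:space:]]*Master_Host$/ { print $2 }'
}

# Usage (prompts for the password):
#   mysql -h 192.168.13.192 -uroot -p -e 'SHOW SLAVE STATUS\G' | replication_master_host
```

After the failover shown in the log, the expected result is 192.168.13.189.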

This shows that MHA has achieved its high-availability purpose.

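4.3 Cluster Summary

I. What about other solutions? Every approach has limits on the scenarios and scale it suits, as well as its own strengths and weaknesses.

1. First of all, I advise against read/write splitting. The reasons have been explained many times (added technical complexity, the risk of reading stale data, and so on), so I will make only one point: by my estimate 99.8% of business scenarios do not need read/write splitting. Good database design and optimization, plus properly sized and correctly configured hosts, are enough.

2. Keepalived + MySQL: split-brain is a real problem, and it cannot reliably detect whether mysqld has hung.

3. DRBD + Heartbeat + MySQL: it has the same split-brain problem and the same inability to reliably detect a hung mysqld. DRBD is also unnecessary, and adding it tends to cause problems.

4. MySQL Proxy: a good project, but the vendor abandoned it midway. Not recommended: it provides no high availability, only read/write splitting.

5. MySQL Cluster: the claim that the community edition does not support NDB is wrong. Commercial deployments are indeed few, mainly because of the workloads it targets. Its development was somewhat chaotic for a few years but is now back on track. It places high demands on the network.

6. MySQL + MHA: it can solve the split-brain problem but needs many IP addresses. It works for small clusters and becomes cumbersome for large ones. MySQL + MMM has many pitfalls, and with MHA available there is no need to adopt MMM.

Recommendations:

(1) With dual-master replication and no data sharding, choose MHA, Keepalived, or Heartbeat.

(2) With dual-master replication plus data sharding, consider Cobar.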

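II. MHA is a software package for automated master failover and slave promotion. It builds on standard MySQL replication (asynchronous or semi-synchronous) and consists of two parts: MHA Manager (the management node) and MHA Node (the data node).

1) MHA Manager can run on a dedicated machine and manage several master-slave clusters, or it can run on one of the slave nodes. It periodically probes the nodes in the cluster. When it detects that the master has failed, it automatically promotes the slave holding the most recent data to be the new master and repoints all other slaves to it. The failover is transparent to applications provided that client traffic is also redirected, for example by a master_ip_failover_script, which was not configured in the test run above.

2) MHA Node runs on every MySQL server. It speeds up failover by means of scripts that parse and purge logs.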
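The promotion rule described in 1), "promote the slave with the most recent data", can be illustrated with a small sketch. It is a simplification and not MHA's actual implementation: MHA also takes settings such as candidate_master into account. The function name `latest_slave` and the input format are assumptions made for this example.

```shell
# Pick the slave that has read furthest into the master's binary log.
# Input on stdin: one line per slave, "host master_log_file read_master_log_pos".
# Simplified illustration only; MHA's real selection considers more factors.
latest_slave() {
    # Sort by binlog file name, then numerically by position; the last line wins.
    sort -k2,2 -k3,3n | tail -n 1 | awk '{ print $1 }'
}
```

In the failover log above both slaves had reached on.000001:661, so either one qualified as the latest slave.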

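Chapter 5 Error Collection

5.1 Master-Slave Replication Errors

1. Last_IO_Error: Got fatal error 1236 from master when reading data from binary log: 'Could not find first log file name in binary log index file'

Fix: (1) stop slave; (2) reset slave; (3) start slave; if one pass does not help, wait a while and try again.

reset slave makes the slave forget its replication position. The statement is meant for a clean start: it clears the replication metadata (the master.info and relay-log.info files, or their table equivalents), deletes all relay log files, and starts a new relay log file. Replication must be stopped with stop slave before reset slave is run.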

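2. If these three commands do not help, use the following procedure:

(1) Delete all bin-log files [locate them with: find / -name "*bin-log*" -exec ls {} \;]

(2) Regenerate the bin-log files, then point the slave at the new file. The file name and position must match what show master status reports on the master:

change master to
master_log_file='bin-log.000001',
master_log_pos=155;

start slave;

(3) If the same error is still reported:

① stop slave;

② reset slave;

③ start slave;

(4) Done.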
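Error 1236 with "Could not find first log file name" usually means the slave asked for a binlog file that is not listed in the master's index, so it is safer to derive the file name and position from the master than to type them by hand. The sketch below builds the statement from `SHOW MASTER STATUS\G` output; the helper name `change_master_stmt` is hypothetical.

```shell
# Build a CHANGE MASTER TO statement from `SHOW MASTER STATUS\G` output on stdin,
# so the slave is pointed at a file and position that really exist on the master.
change_master_stmt() {
    awk -F': ' '
        $1 ~ /^[[:space:]]*File$/     { file = $2 }
        $1 ~ /^[[:space:]]*Position$/ { pos = $2 }
        END {
            # Fail if either value is missing from the input.
            if (file == "" || pos == "") exit 1
            printf "CHANGE MASTER TO MASTER_LOG_FILE='\''%s'\'', MASTER_LOG_POS=%s;\n", file, pos
        }'
}

# Usage (run against the master, then execute the printed statement on the slave):
#   mysql -h <master-host> -uroot -p -e 'SHOW MASTER STATUS\G' | change_master_stmt
```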

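3. If the error reports a connection timeout or a wrong user name or password:

Run mysql -ubackup -h 2.2.2.2 -p123456 on the slave to see whether it can connect to MySQL on the master.

If it connects: the user name and password are fine.

If it does not: the user name or password is wrong, or the master cannot be reached over the network.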
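The two failure causes can be told apart by the error code the mysql client prints: 1045 is an access-denied error and 2003 means the server could not be reached. A small sketch follows; the function name `classify_mysql_error` and its three output labels are assumptions made for this example.

```shell
# Classify the mysql client's error text (read from stdin) after a failed
# connection attempt. Prints "auth", "unreachable", or "other".
classify_mysql_error() {
    err="$(cat)"
    case "$err" in
        *"ERROR 1045"*) echo "auth" ;;         # wrong user/password or host not allowed
        *"ERROR 2003"*) echo "unreachable" ;;  # network, port, firewall, or mysqld down
        *)              echo "other" ;;
    esac
}

# Usage (only stderr is piped into the classifier):
#   mysql -ubackup -h 2.2.2.2 -p -e 'SELECT 1' 2>&1 >/dev/null | classify_mysql_error
```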

4. The temporary MySQL password cannot be found (the grep below returns nothing):

[root@ybb-test-02-6 mysql]# cat /var/log/mysqld.log |grep password

[root@ybb-test-02-6 mysql]#

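(1) Delete the data left behind by a previous MySQL installation (this step is crucial and is the root of the problem). Note that it removes every database stored on this host.

rm -rf /var/lib/mysql

(2) Restart the mysqld service: systemctl restart mysqld

(3) Check the MySQL log file again: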

[root@ybb-test-02-6 mysql]# cat /var/log/mysqld.log |grep password

2019-02-25T08:50:58.053021Z 5 [Note] [MY-010454] [Server] A temporary password is generated for root@localhost: X=0gw#AAQdx*
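To pull out just the password rather than the whole log line, the last field of the matching line can be printed. This is a sketch; the function name `temp_root_password` is hypothetical, and it assumes the log line has the format shown above.

```shell
# Print the most recent temporary root password recorded in the given mysqld log.
# Assumes the password is the last whitespace-separated field of the line.
temp_root_password() {
    grep 'temporary password' "$1" | tail -n 1 | awk '{ print $NF }'
}
```

Usage: `temp_root_password /var/log/mysqld.log`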

5. User slave does not exist or does not have REPLICATION SLAVE privilege! Other slaves can not start replication from this host.

The replication user slave was created on the master without being granted the required privilege. Fix:

GRANT REPLICATION SLAVE ON *.* TO 'slave'@'%';

flush privileges;
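Whether the grant took effect can be verified from the shell. Note that in MySQL 8.0 GRANT no longer creates a missing account, so the user must already exist (CREATE USER) for the statement above to succeed. The check below is a sketch, and the helper name `has_replication_grant` is hypothetical.

```shell
# Succeed if the `SHOW GRANTS` output on stdin includes REPLICATION SLAVE.
has_replication_grant() {
    grep -q 'REPLICATION SLAVE'
}

# Usage:
#   mysql -uroot -p -e "SHOW GRANTS FOR 'slave'@'%'" | has_replication_grant && echo "grant present"
```

Because any slave may be promoted during a failover, the same check is worth running on every node, not only on the current master.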

6.

At this point, wait a moment or check the status with show slave status\G. If no error is shown yet, it will appear after a few minutes, as follows:

Slave SQL for channel '': Could not execute Update_rows event on table mysql.user; Can't find record in 'user', Error_code: 1032; handler error HA_ERR_KEY_NOT_FOUND; the event's master log mysql-bin.000006, end_log_pos 1768, Error_code: MY-001032

Fix (this skips the failing event, so the slave's data may differ from the master's afterwards):

stop slave;

set global sql_slave_skip_counter=1;

start slave;
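Because skipping an event discards a change, it is worth confirming that the SQL thread really stopped on error 1032 before running the commands above. A minimal sketch, with hypothetical helper names, that reads `SHOW SLAVE STATUS\G` output:

```shell
# Print Last_SQL_Errno from `SHOW SLAVE STATUS\G` output on stdin
# (0 means the SQL thread has no error).
last_sql_errno() {
    awk -F': ' '$1 ~ /^[[:space:]]*Last_SQL_Errno$/ { print $2 }'
}

# Succeed only when the SQL thread stopped on error 1032 (row not found),
# which is the case shown above.
should_skip_1032() {
    [ "$(last_sql_errno)" = "1032" ]
}

# Usage:
#   mysql -uroot -p -e 'SHOW SLAVE STATUS\G' | should_skip_1032 && echo "safe to apply the skip"
```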

5.2 Cluster Errors

1. [error][/usr/share/perl5/vendor_perl/MHA/ServerManager.pm, ln781] Multi-master configuration is detected, but two or more masters are either writable (read-only is not set) or dead! Check configurations for details. Master configurations are as below:

MHA currently supports the one-master, multiple-slave model, meaning that only one master may accept writes and any other master must be read-only. When several masters coexist, all but one of them must therefore be set to read-only.

Fix: run mysql -uroot -p -e "set global read_only=1" on the other masters.
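When more than one extra master is involved, the same command can be looped over the hosts and the resulting value printed for confirmation. This is a sketch: the function name `set_read_only` is hypothetical, and it assumes the root password is supplied through ~/.my.cnf or a similar mechanism rather than on the command line.

```shell
# Set read_only=1 on every host given as an argument and print the resulting value.
# Assumption: credentials come from ~/.my.cnf, so no password is passed here.
set_read_only() {
    for host in "$@"; do
        # -N suppresses the column header so only the value (0 or 1) is printed.
        mysql -h "$host" -uroot -N -e "SET GLOBAL read_only=1; SELECT @@global.read_only;" \
            | sed "s/^/$host read_only=/"
    done
}
```

Usage: `set_read_only <other-master-host> [<other-master-host> ...]`. A value set with SET GLOBAL does not survive a restart, so read_only=1 can also be added to my.cnf on those hosts.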

2.

Mon Mar 4 15:59:33 2019 - [error][/usr/share/perl5/vendor_perl/MHA/Server.pm, ln398] 192.168.13.190(192.168.13.190:3306): User slave does not exist or does not have REPLICATION SLAVE privilege! Other slaves can not start replication from this host.

Mon Mar 4 15:59:33 2019 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln427] Error happened on checking configurations. at /usr/share/perl5/vendor_perl/MHA/ServerManager.pm line 1403.

Mon Mar 4 15:59:33 2019 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln525] Error happened on monitoring servers.

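This happens because the replication user slave does not exist on 192.168.13.190. Once the master dies, any slave may be promoted to master, so a slave that lacks the replication user slave causes replication from it to fail.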