Mysql的AB及读写
第4章 集群
4.2 MHA
4.2.1 ssh互信
4.2.1 ABB(一主二从)/AABB(二主多从)架构搭建
4.2.2 MHA安装(其他节点安装MHA_node)
4.2.4 MHA互信和环境检查
4.2.5 启动及相关日志文件
第5章 错误集锦
第1章 Mysql的AB配置
当MySQL的版本是5.5或高于5.5的时候,其中大部分的内容相似
主要是5.5之后不再支持master打头的参数
主|从
IP地址
允许同步的用户
读|写
mysql版本
Mycat版本
JDK版本
操作系统版本
master
192.168.13.189
slave
write
mysql-8.0.13
CentOS 7.4.1708
192.168.13.192
copy
write
mysql-8.0.13
Mycat-server-1.6.6.1
1.8.0_191
CentOS 7.4.1708
slave
192.168.13.190
slave
read
mysql-8.0.13
CentOS 7.4.1708
192.168.13.191
slave
read
mysql-8.0.13
CentOS 7.4.1708
192.168.13.193
copy
read
mysql-8.0.13
CentOS 7.4.1708
1.1master配置
Master:
在/etc/my.cnf 添加:
server-id = 1
log-bin=/home/mysql/logs/binlog/bin-log(若报错则不加路径,如下)
log-bin=bin-log
max_binlog_size = 500M
binlog_cache_size = 128K
binlog-do-db = adb
binlog-ignore-db = mysql
log-slave-updates
expire_logs_day=2(在centos7.3上易报错,若报错,可不加)
binlog_format="MIXED"
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
server-id = 1 #服务器标志号,注意在配置文件中不能出现多个这样的标识,如果出现多个的话mysql以第一个为准,一组主从中此标识号不能重复。
log-bin=/home/mysql/logs/binlog/bin-log #开启bin-log,并指定文件目录和文件名前缀。
max_binlog_size = 500M #每个bin-log最大大小,当此大小等于500M时会自动生成一个新的日志文件。一条记录不会写在2个日志文件中,所以有时日志文件会超过此大小。
binlog_cache_size = 128K #日志缓存大小
binlog-do-db = adb #需要同步的数据库名字,如果是多个,就以此格式在写一行即可。
binlog-ignore-db = mysql #不需要同步的数据库名字,如果是多个,就以此格式在写一行即可。
log-slave-updates #当Slave从Master数据库读取日志时更新新写入日志中,如果只启动log-bin而没有启动log-slave-updates则Slave只记录针对自己数据库操作的更新。
expire_logs_day=2 #设置bin-log日志文件保存的天数,此参数mysql5.0以下版本不支持。
binlog_format=”MIXED” #设置bin-log日志文件格式为:MIXED,可以防止主键重复。
接着重启MySQL:/etc/init.d/mysqld restart
再进入MySQL:mysql -u root -p123456
创建backup用户:
创建一个用于让从数据库连接的用户(以前那种方式不能用了)
mysql> CREATE USER 'copy'@'%' IDENTIFIED WITH mysql_native_password BY 'tqw961110';(密码强度可能不够会报错,XINyang3009@@)或下面一个
mysql>CREATE USER 'slave'@'%' IDENTIFIED BY 'XINyang3009@@';
mysql>grant all on *.* to 'copy'@'%' identified by "10jqka@123" #这个方式也可以
mysql>flush privileges #刷新授权表
mysql> show master status;
Master的操作就完了
【server_id千万不要一样,否则会报错】
mysql> show master status;【192.168.13.192】
+------------------+----------+--------------+------------------+-------------------+
| File | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set |
+------------------+----------+--------------+------------------+-------------------+
| mysql-bin.000005 | 610 | mysql | | |
+------------------+----------+--------------+------------------+-------------------+
1 row in set (0.00 sec)
mysql> show master status;【192.168.13.189】
+------------------+----------+--------------+------------------+-------------------+
| File | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set |
+------------------+----------+--------------+------------------+-------------------+
| mysql-bin.000006 | 3490 | | | |
+------------------+----------+--------------+------------------+-------------------+
1 row in set (0.00 sec)
1.2slave配置步骤
在/etc/my.cnf 添加:
server_id=2
replicate-do-db=database1 #要同步的数据库的名字
replicate-ignore-db=mysql #被忽略的数据库
接着重启MySQL:/etc/init.d/mysqld restart
再进入MySQL:mysql -u root -p123456
同步执行此操作:
mysql> change master to
-> master_host='2.2.2.2',
-> master_user='backup',
-> master_password='123456',
-> master_log_file='bin_log.000003', # master上记录的日志文件
-> master_log_pos=120; # master上记录的日志文件位置
Query OK, 0 rows affected, 2 warnings (0.06 sec)
同步开启:start slave;
同步关闭:stop slave;
最后开启同步 start slave即可
1.2.1 192.168.13.190
mysql> change master to
-> master_host='192.168.13.189',
-> master_user='slave',
-> master_password='XINyang3009',
-> master_log_file='mysql-bin.000006',
-> master_log_pos=3490;
Query OK, 0 rows affected, 2 warnings (0.06 sec)
mysql> start slave;
mysql> show slave status\G;
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: 192.168.13.189
Master_User: slave
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: mysql-bin.000006
Read_Master_Log_Pos: 3490
Relay_Log_File: ybb-test-mysql-2-relay-bin.000012
Relay_Log_Pos: 3704
Relay_Master_Log_File: mysql-bin.000006
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Replicate_Do_DB: mysql
Slave_IO_Running: Yes(必须是yes)
Slave_SQL_Running: Yes(必须是yes)
Replicate_Do_DB: mysql
1.2.2 192.168.13.191
mysql> show slave status\G;
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: 192.168.13.189
Master_User: slave
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: mysql-bin.000006
Read_Master_Log_Pos: 3490
Relay_Log_File: ybb-test-mysql-2-relay-bin.000012
Relay_Log_Pos: 3704
Relay_Master_Log_File: mysql-bin.000006
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Replicate_Do_DB: mysql
Slave_IO_Running: Yes(必须是yes)
Slave_SQL_Running: Yes(必须是yes)
Replicate_Do_DB: mysql
1.2.3 192.168.13.192
1.2.4 192.168.13.193
1.2.4 192.168.13.189
至此,二主三从搭建完成。
#Slave_IO_Running:连接到主库,并读取主库的日志到本地,生成本地日志文件
#Slave_SQL_Running:读取本地日志文件,并执行日志里的SQL命令。
第2章读写分离
在这里读写分离中间件用的是mycat,部署在192.168.13.192上
2.1安装mycat
1、安装JDK,mycat依赖Java环境
2、解压Mycat-server-1.6.6.1-release-20181031195535-linux.tar.gz,同时修改环境变量 /etc/profile
[root@ybb-test-mysql-4 conf]# cat /etc/profile
3、进入配置文件目录cd ./mycat/conf
(1)server.xml是登陆mycat的用户进行配置的配置文件
(2)schema.xml 对真实数据库进行配置的配置文件
2.1.1 server.xml
[root@ybb-test-mysql-4 conf]# cat server.xml
97
99 XINyang3009@@
100 TESTDB
101
102
103
111
112
113
114 user
115 TESTDB
116
117
这里配置了两个可以来连接的用户
用户1 test 密码test给予了此用户TESTDB数据库的权限
用户2 user 密码user给予了此用户TESTDB数据库的只读权限
注意这里的TESTDB 不一定是你数据库上的真实库名.可以任意指定.只要和接下来的schema.xml的配置文件中的库名统一即可
2.1.1 schema.xml
[root@ybb-test-mysql-4 conf]# cat schema.xml
writeType="0" dbType="mysql" dbDriver="native" switchType="1" slaveThreshold="100">
select user()
1、
这里TESTDB 就是我们对外声称的我们有数据库的名称 必须和server.xml中的用户指定的数据库名称一致,添加一个dataNode="dn1"是指定了我们这个库只在dn1上.没有进行分库
2、
这里只需要改database的名字db1就是你真实数据库服务上的数据库名,根据你自己的数据库名进行修改.
3、
这里只需要配置三个地方 balance="1"与writeType="0" ,switchType=”1”
a. balance 属性负载均衡类型,目前的取值有4种:
1. balance="0", 不开启读写分离机制,所有读操作都发送到当前可用的writeHost上。
2. balance="1",全部的readHost与stand by writeHost参与select语句的负载均衡,简单的说,当双主双从模式(M1 ->S1,M2->S2,并且M1与M2互为主备),正常情况下,M2,S1,S2都参与select语句的负载均衡。
3. balance="2",所有读操作都随机的在writeHost、readhost上分发。
4. balance="3", 所有读请求随机的分发到wiriterHost对应的readhost执行,writerHost不负担读压力,注意balance=3只在1.4及其以后版本有,1.3没有。
b. writeType 属性
负载均衡类型,目前的取值有 3 种:
1. writeType="0", 所有写操作发送到配置的第一个writeHost,第一个挂了切到还生存的第二个
writeHost,重新启动后已切换后的为准,切换记录在配置文件中:dnindex.properties .
2. writeType="1",所有写操作都随机的发送到配置的writeHost。
3. writeType="2",没实现。
c. switchType 属性
1: 表示不自动切换
1 :默认值,自动切换
2 :基于MySQL主从同步的状态决定是否切换
4、
这里是配置的我们的两台读写服务器IP地址访问端口和 访问用户的用户名和密码
2.2启动mycat
2.2.1 启动mycat
到bin目录下,执行./mycat restart
[root@ybb-test-mysql-4 bin]# ps -aux|grep mycat
root 30319 0.0 0.0 17816 744 ? Sl 15:51 0:00 /usr/local/mysql/mycat/bin/./wrapper-linux-x86-64 /usr/local/mysql/mycat/conf/wrapper.conf wrapper.syslog.ident=mycat wrapper.pidfile=/usr/local/mysql/mycat/logs/mycat.pid wrapper.daemonize=TRUE wrapper.lockfile=/var/lock/subsys/mycat
root 30321 3.5 6.5 6927816 253020 ? Sl 15:51 0:03 java -DMYCAT_HOME=. -server -XX:MaxPermSize=64M -XX:+AggressiveOpts -XX:MaxDirectMemorySize=2G -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.port=1984 -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false -Xmx4G -Xms1G -Djava.library.path=lib -classpath lib/wrapper.jar:conf:lib/asm-4.0.jar:lib/commons-collections-3.2.1.jar:lib/commons-lang-2.6.jar:lib/curator-client-2.11.0.jar:lib/curator-framework-2.11.0.jar:lib/curator-recipes-2.11.0.jar:lib/disruptor-3.3.4.jar:lib/dom4j-1.6.1.jar:lib/druid-1.0.26.jar:lib/ehcache-core-2.6.11.jar:lib/fastjson-1.2.12.jar:lib/guava-19.0.jar:lib/hamcrest-core-1.3.jar:lib/hamcrest-library-1.3.jar:lib/jline-0.9.94.jar:lib/joda-time-2.9.3.jar:lib/jsr305-2.0.3.jar:lib/kryo-2.10.jar:lib/leveldb-0.7.jar:lib/leveldb-api-0.7.jar:lib/libwrapper-linux-ppc-64.so:lib/libwrapper-linux-x86-32.so:lib/libwrapper-linux-x86-64.so:lib/log4j-1.2-api-2.5.jar:lib/log4j-1.2.17.jar:lib/log4j-api-2.5.jar:lib/log4j-core-2.5.jar:lib/log4j-slf4j-impl-2.5.jar:lib/mapdb-1.0.7.jar:lib/minlog-1.2.jar:lib/mongo-java-driver-2.11.4.jar:lib/Mycat-server-1.6.6.1-release.jar:lib/mysql-binlog-connector-java-0.16.1.jar:lib/mysql-connector-java-5.1.35.jar:lib/netty-3.7.0.Final.jar:lib/netty-buffer-4.1.9.Final.jar:lib/netty-common-4.1.9.Final.jar:lib/objenesis-1.2.jar:lib/reflectasm-1.03.jar:lib/sequoiadb-driver-1.12.jar:lib/slf4j-api-1.6.1.jar:lib/univocity-parsers-2.2.1.jar:lib/velocity-1.7.jar:lib/wrapper.jar:lib/zookeeper-3.4.6.jar -Dwrapper.key=aBVwhEZHoe1K2Vpw -Dwrapper.port=32000 -Dwrapper.jvm.port.min=31000 -Dwrapper.jvm.port.max=31999 -Dwrapper.pid=30319 -Dwrapper.version=3.2.3 -Dwrapper.native_library=wrapper -Dwrapper.service=TRUE -Dwrapper.cpu.timeout=10 -Dwrapper.jvmid=1 org.tanukisoftware.wrapper.WrapperSimpleApp io.mycat.MycatStartup start
root 30458 0.0 0.0 112676 956 pts/2 S+ 15:53 0:00 grep --color=auto mycat
[root@ybb-test-mysql-4 bin]#netstat -anlpt|grep :*066
tcp6 0 0 :::8066 :::* LISTEN 30321/java
tcp6 0 0 :::9066 :::* LISTEN 30321/java
[root@ybb-test-mysql-4 bin]#
8066:数据端口
9066:管理端口
命令行的登陆是通过 9066 管理端口来操作。
2.2.2 mycat日志
[root@ybb-test-mysql-4 logs]# cd /usr/local/mysql/mycat/logs
warpper 日志:mycat启动,停止,添加为服务等都会记录到此日志文件,如果系统环境配置错误或缺少配置时,导致Mycat无法启动,可以通过查看warrpper.log定位具体错误原因。
mycat.log:为mycat主要日志文件,记录了启动时分配的相关buffer信息,数据源连接信息,连接池,动态类加载信息等等在log4j.xml文件中进行相关配置,如保留个数,大小,字符集,日志文件大小等。非启动状态下可以删除,启动后会自动生成该日志文件
2.3登录mycat相关问题
命令:mysql -uroot -pXINyang3009@@ -h192.168.13.192 -P9066
[root@ybb-test-mysql-4 conf]# mysql -uroot -pXINyang3009@@ -h192.168.13.192 -P9066
mysql: [Warning] Using a password on the command line interface can be insecure.
ERROR 1045 (HY000): Access denied for user 'root', because password is error
特别注意:本次实验的mysql版本均为mysql-8.0.13,此故障折腾了我两三天,以上的配置查看了无数遍,后来查阅相关资料才知道目前MyCat仍主要面对MySql 5.5, 5.6, 5.7版,对最新的MySql 8尚未完全支持,需要用户对MySql 8和MyCat的配置进行一系列的修改。其中对mycat的登陆方式有改变,Mycat登录逻辑库的传统方式是:mysql -uroot -p -h127.0.0.1 -P8066 -DTESTDB,对于MySql 8,会报密码错误方式,这是由于Mysql 8的缺省加密方式已经改为caching_sha2_password,而MyCat对此尚不支持。当然可以通过修改MySQL8的加密方式使其通过,但未对MySQL8完全支持,所以会在功能上也会大打折扣,目前mycat官网并没有对MySQL8版本的详细文档,官网的文档目前对MySQL5.X依旧适用。
之前做实验之所以成功是因为我在云上也装了一台mysql 5.6,并且也是用5.6做的主从,当时做完之后没有用用mysql 8把版本做连接测试,用的是5.6版本,而此次用mysql8版本连接测试显示始终失败,还折腾两天左右,我才想到可能是版本的问题,为此我又在mysql5.6版本上重新做了一遍验证一下是否是由于版本太新导致mycat不支持从而出现的错误,于是又开始折腾很久。
1、版本简介:
OS版本
IP
读|写
MySQL版本
mycat版本
CentOS 7.2
2.2.2.10
读|写
5.6.19
Mycat-server-1.6.6.1
为尽快达到实验目的,只在一台虚机安装MySQL(时长约3h),并读写也在该主机上。
2、安装MySQL 5.6(详细过程略)
3、配置文件server.xml和schema.xml
4、mycat启动
到bin目录下,执行./mycat restart
5、登录
[root@localhost conf]# mysql -uroot -pXINyang3009@@ -h2.2.2.10 -P9066
Warning: Using a password on the command line interface can be insecure.
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 2
Server version: 5.6.29-mycat-1.6.6.1-release-20181031195535 MyCat Server (monitor)
Copyright (c) 2000, 2014, Oracle and/or its affiliates. All rights reserved.
Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective owners.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
mysql>
当出现mycat版本就说明登录成功。
6、查看当前可读写的入口
mysql> show @@datasource;
+----------+--------+-------+----------+------+------+--------+------+------+---------+-----------+---
| DATANODE | NAME | TYPE | HOST | PORT | W/R | ACTIVE | IDLE | SIZE | EXECUTE | READ_LOAD | WRITE_LOAD |
+----------+--------+-------+----------+------+------+--------+------+------+---------+-----------+---|dn1 | write1 | mysql | 2.2.2.10 | 3306 | W | 0 | 10 | 1000 | 442 | 2 | 1 |
| dn1 | write1 | mysql | 2.2.2.10 | 3306 | R | 0 | 8 | 1000 | 436 | 0 | 0 |
+----------+--------+-------+----------+------+------+--------+------+------+---------+-----------+---
2 rows in set (0.02 sec)
7、知识点总结
8、总结
由上可知,最新的mycat并不支持最新版本的mysql,也许能够连上去但在并未完全开放之前,相信很多mycat的功能会受限。
经过第二次试验可知,在mysql 8上线的配置并没有错,只不过是由于mycat的不支持最新的mysql版本导致,所以等mycat支持mysql 8的时候就可以生效了
2.4小结
由于mycat目前并不全面支持mysql最新版本,所以目前读写分离我选的是mysql官方推荐的mysql-proxy,功能虽小但基于mysql 8版本的过度期还是一款不错的产品。
第3章mysql-proxy
3.1 安装mysql-proxy
1、使用本地镜像安装相关环境:
yum install -y gcc* gcc-c++* autoconf* automake* zlib* libxml* ncurses-devel* libmcrypt* libtool* flex* pkgconfig* libevent* glib*
2、下载mysql-proxy-0.8.5-linux-glibc2.3-x86-64bit.tar
3、解压安装
4、环境变量
[root@localhost ~]# vi .bash_profile
PATH=$PATH:$HOME/bin:/mysqlprox/proxy/bin
3.2 配置mysql-proxy
mkdir /home/mysql-proxy/logs【日志文件位置】
mkdir /home/mysql-proxy/lua 【脚本位置】
cd /usr/local/mysql-proxy【mysql-proxy安装位置】
cp share/doc/mysql-proxy/rw-splitting.lua ./lua 【复制读写分离配置文件】
vi /etc/mysql-proxy.cnf【创建配置文件】
[mysql-proxy]
user=root 【运行mysql-proxy用户】
admin-username=proxy 【主从mysql共有的用户】
admin-password=123456 【用户的密码】
proxy-address=192.168.11.31:4040【mysql-proxy运行ip和端口,不加端口,默认4040】
proxy-read-only-backend-addresses=192.168.11.31 【指定后端从slave读取数据】
proxy-backend-addresses=192.168.11.34 【指定后端主master写入数据】
proxy-lua-script=/usr/local/mysql-proxy/lua/rw-splitting.lua 【指定读写分离配置文件位置】
admin-lua-script=/usr/local/mysql-proxy/lua/admin.lua 【指定管理脚本】
log-file=/var/log/mysql-proxy.log
log-level=info 【定义log日志级别】
daemon=true【以守护进程方式运行】
keepalive=true 【mysql-proxy崩溃时,尝试重启】
(1)(critical) Key file contains key “keepalive”which has a value that cannot be interpreted.
daemon=1
keepalive=1
<1> 给配置文件执行权限
chmod 660 /etc/mysql-porxy.cnf
<2> 修改读写分离配置文件
vim /usr/local/mysql-proxy/lua/rw-splitting.luaif not proxy.global.config.rwsplit
min_idle_connections = 1, 【默认超过4个连接数时,才开始读写分离,改为1】
max_idle_connections = 1, 【默认8,改为1】
3.3 启动mysql-proxy和查看
启动:/usr/local/mysql-proxy/bin/mysql-proxy --defaults-file=/etc/mysql-proxy.cnf
查看:netstat -tupln | grep 4000
这样一个读写分离就完全了
第4章 集群
目前Mysql集群用的MHA或keepalive的方式做的,但都通过VIP进行管理,若是在云上做集群会有一些隐患。和H3C云工程师进行交流了解,他们不建议我们在云上做任何的集群,因为集群有心跳检测,会与虚机的备份有冲突,当虚机进行备份时会造成心跳停止,导致集群报错,解决办法就是不使用虚机备份,使用其他方式进行备份方式。中国经济网就是采用虚机做集群但不采用虚机备份的方式。
其他备份方式与虚机备份又有区别:
(1)其他备份方式:只能备份指定的数据,不能备份虚机的相关信息。
(2)虚机备份:不仅能备份指定的数据还能备份虚机的相关配置信息,在虚机崩溃时还可以进行虚机恢复。
4.1 版本介绍
属性
IP地址
允许同步的用户
读|写
mysql版本
MHA版本
操作系统版本
master
192.168.13.189
slave
write
mysql-8.0.13
0.54
CentOS 7.4
192.168.13.190
slave
read
mysql-8.0.13
0.54
CentOS 7.4
slave
192.168.13.192
slave
read
mysql-8.0.13
0.58
CentOS 7.4
MHA_manager
192.168.13.171
mysql-8.0.13
0.58
CentOS 7.4
MHA实现机制:
(1)监控AB的状态
(2)完整的选举机制(看谁的数据跟master最接近)
(3)让一个B切换到新A
(4)保证数据的完整性(通过差异还原)
因为AB复制是异步复制,所以可能有一些数据尚没有被B拉到其relay_log中,即AB数据不一致,MHA是怎样解决这种情况的呢?
(1)mha_manager使用scp命令将A当前binlog拷贝到mha_manager
(2)待新A(选举:依据谁的relay_log新)产生后,mha_manager再拿着旧A的binlog和新的relay_log做比对,并进行差异还原以保证新A和旧A数据的一致性
(3)mha_manager最后拿着老A的binlog去找复制组中其他B做差异还原,保证数据的一致性
4.2 MHA
实验步骤:
1)ssh证书户信任(ssh的密钥验证)
2)ABB/AABB架构
3)安装mha_manager、mha_node
4)测试
4.2.1 ssh互信
(1)ssh-keygen -t rsa
ssh-copy-id -i ~/.ssh/id_rsa.pub root@192.168.13.189
ssh-copy-id -i ~/.ssh/id_rsa.pub root@192.168.13.190
ssh-copy-id -i ~/.ssh/id_rsa.pub root@192.168.13.192
ssh-copy-id -i ~/.ssh/id_rsa.pub root@192.168.13.171
(2)在192.168.13.189、192.168.13.190、192.168.13.192、192.168.13.171主机上将(1)都执行一遍即可,实现这四台主机ssh连接不需要密码
4.2.1 ABB(一主二从)/AABB(二主多从)架构搭建
略
4.2.1 MHA安装(管理节点MHA_manager)
(1)安装依赖包
[root@ybb-test-01 mha]# ls dependent1
perl-Config-Tiny-2.14-7.el7.noarch.rpm perl-Mail-Sender-0.8.23-1.el7.noarch.rpm perl-MIME-Types-1.38-2.el7.noarch.rpm
perl-Email-Date-Format-1.002-15.el7.noarch.rpm perl-Mail-Sendmail-0.79-21.el7.noarch.rpm perl-Parallel-ForkManager-1.18-2.el7.noarch.rpm
perl-Log-Dispatch-2.41-1.el7.1.noarch.rpm perl-MIME-Lite-3.030-1.el7.noarch.rpm perl-Params-Validate-1.08-4.el7.x86_64.rpm
[root@ybb-test-01 dependent1]# yum -y localinstall ./*
已加载插件:fastestmirror
正在检查 ./perl-Config-Tiny-2.14-7.el7.noarch.rpm: perl-Config-Tiny-2.14-7.el7.noarch
./perl-Config-Tiny-2.14-7.el7.noarch.rpm:不更新已安装的软件包。
正在检查 ./perl-Email-Date-Format-1.002-15.el7.noarch.rpm: perl-Email-Date-Format-1.002-15.el7.noarch
./perl-Email-Date-Format-1.002-15.el7.noarch.rpm:不更新已安装的软件包。
正在检查 ./perl-Log-Dispatch-2.41-1.el7.1.noarch.rpm: perl-Log-Dispatch-2.41-1.el7.1.noarch
./perl-Log-Dispatch-2.41-1.el7.1.noarch.rpm:不更新已安装的软件包。
正在检查 ./perl-Mail-Sender-0.8.23-1.el7.noarch.rpm: perl-Mail-Sender-0.8.23-1.el7.noarch
./perl-Mail-Sender-0.8.23-1.el7.noarch.rpm:不更新已安装的软件包。
正在检查 ./perl-Mail-Sendmail-0.79-21.el7.noarch.rpm: perl-Mail-Sendmail-0.79-21.el7.noarch
./perl-Mail-Sendmail-0.79-21.el7.noarch.rpm:不更新已安装的软件包。
正在检查 ./perl-MIME-Lite-3.030-1.el7.noarch.rpm: perl-MIME-Lite-3.030-1.el7.noarch
./perl-MIME-Lite-3.030-1.el7.noarch.rpm:不更新已安装的软件包。
正在检查 ./perl-MIME-Types-1.38-2.el7.noarch.rpm: perl-MIME-Types-1.38-2.el7.noarch
./perl-MIME-Types-1.38-2.el7.noarch.rpm:不更新已安装的软件包。
正在检查 ./perl-Parallel-ForkManager-1.18-2.el7.noarch.rpm: perl-Parallel-ForkManager-1.18-2.el7.noarch
./perl-Parallel-ForkManager-1.18-2.el7.noarch.rpm:不更新已安装的软件包。
正在检查 ./perl-Params-Validate-1.08-4.el7.x86_64.rpm: perl-Params-Validate-1.08-4.el7.x86_64
./perl-Params-Validate-1.08-4.el7.x86_64.rpm:不更新已安装的软件包。
无须任何处理
以上这些包均已安装过所以显示不更新已安装的软件包
(2)安装MHA相关组件包
[root@ybb-test-02-6 mha]#rpm -ivh mha4mysql-node-0.58-0.el7.centos.noarch.rpm
准备中... ################################# [100%]
软件包 mha4mysql-node-0.58-0.el7.centos.noarch 已经安装
[root@ybb-test-02-6 mha]# rpm -ivh mha4mysql-manager-0.58-0.el7.centos.noarch.rpm
准备中... ################################# [100%]
软件包 mha4mysql-manager-0.58-0.el7.centos.noarch 已经安装
[root@ybb-test-02-6 mha]#
4.2.2 MHA安装(其他节点安装MHA_node)
192.168.13.189(mha_node)
192.168.13.190(mha_node)
192.168.13.192(mha_node)
在mha_node安装mha4mysql-node-0.58-0.el7.centos.noarch.rpm,【注意:MHA_node版本不论6或7都可以使用低版本】
至此,MHA_node安装完成
4.2.3 MHA配置
(1)mkdir /etc/mha——创建目录
(2)vi /etc/mha/mha.cnf——编写MHA配置文件
[root@ybb-test-02-6 mha]# cat mha.cnf
[server default]
#mysql_admin and password
user=root
password=XINyang3009@@
#mha work_dir and mha_log
manager_workdir=/etc/mha
manager_log=/etc/mha/manager.log
#ssh connection account
ssh_user=root
#AB copy account and password
repl_user=slave
repl_password=XINyang3009@@
ping_interval=1
#mysql server
[server1]
hostname=192.168.13.189
ssh_port=22
master_binlog_dir=/var/lib/mysql
candidate_master=1
[server2]
hostname=192.168.13.190
ssh_port=22
master_binlog_dir=/var/lib/mysql
candidate_master=1
[server4]
hostname=192.168.13.192
ssh_port=22
master_binlog_dir=/var/lib/mysql
candidate_master=1
[root@ybb-test-02-6 mha]#
4.2.4 MHA互信和环境检查
(1)互信检查:masterha_check_ssh --conf=/etc/mha/mha.cnf
[root@ybb-test-02-6 mha]# masterha_check_ssh --conf=/etc/mha/mha.cnf
Thu Feb 28 09:50:57 2019 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Thu Feb 28 09:50:57 2019 - [info] Reading application default configuration from /etc/mha/mha.cnf..
Thu Feb 28 09:50:57 2019 - [info] Reading server configuration from /etc/mha/mha.cnf..
Thu Feb 28 09:50:57 2019 - [info] Starting SSH connection tests..
Thu Feb 28 09:50:58 2019 - [debug]
Thu Feb 28 09:50:57 2019 - [debug] Connecting via SSH from root@192.168.13.189(192.168.13.189:22) to root@192.168.13.190(192.168.13.190:22)..
Thu Feb 28 09:50:58 2019 - [debug] ok.
Thu Feb 28 09:50:58 2019 - [debug] Connecting via SSH from root@192.168.13.189(192.168.13.189:22) to root@192.168.13.192(192.168.13.192:22)..
Thu Feb 28 09:50:58 2019 - [debug] ok.
Thu Feb 28 09:50:59 2019 - [debug]
Thu Feb 28 09:50:57 2019 - [debug] Connecting via SSH from root@192.168.13.190(192.168.13.190:22) to root@192.168.13.189(192.168.13.189:22)..
Thu Feb 28 09:50:58 2019 - [debug] ok.
Thu Feb 28 09:50:58 2019 - [debug] Connecting via SSH from root@192.168.13.190(192.168.13.190:22) to root@192.168.13.192(192.168.13.192:22)..
Thu Feb 28 09:50:59 2019 - [debug] ok.
Thu Feb 28 09:51:00 2019 - [debug]
Thu Feb 28 09:50:58 2019 - [debug] Connecting via SSH from root@192.168.13.192(192.168.13.192:22) to root@192.168.13.189(192.168.13.189:22)..
Thu Feb 28 09:50:59 2019 - [debug] ok.
Thu Feb 28 09:50:59 2019 - [debug] Connecting via SSH from root@192.168.13.192(192.168.13.192:22) to root@192.168.13.190(192.168.13.190:22)..
Thu Feb 28 09:51:00 2019 - [debug] ok.
Thu Feb 28 09:51:00 2019 - [info] All SSH connection tests passed successfully.
[root@ybb-test-02-6 mha]#
(2)环境健康检查:masterha_check_repl --conf=/etc/mha/mha.cnf
[root@ybb-test-02-6 mha]# masterha_check_repl --conf=/etc/mha/mha.cnf
Thu Feb 28 09:53:58 2019 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Thu Feb 28 09:53:58 2019 - [info] Reading application default configuration from /etc/mha/mha.cnf..
Thu Feb 28 09:53:58 2019 - [info] Reading server configuration from /etc/mha/mha.cnf..
Thu Feb 28 09:53:58 2019 - [info] MHA::MasterMonitor version 0.58.
Thu Feb 28 09:53:59 2019 - [info] Multi-master configuration is detected. Current primary(writable) master is 192.168.13.190(192.168.13.190:3306)——可写
Thu Feb 28 09:53:59 2019 - [info] Master configurations are as below:
Master 192.168.13.190(192.168.13.190:3306), replicating from 192.168.13.189(192.168.13.189:3306)
Master 192.168.13.189(192.168.13.189:3306), replicating from 192.168.13.190(192.168.13.190:3306), read-only——可读(双主模式,只能有一个可写)
Thu Feb 28 09:53:59 2019 - [info] GTID failover mode = 0
Thu Feb 28 09:53:59 2019 - [info] Dead Servers:
Thu Feb 28 09:53:59 2019 - [info] Alive Servers:
Thu Feb 28 09:53:59 2019 - [info] 192.168.13.189(192.168.13.189:3306)
Thu Feb 28 09:53:59 2019 - [info] 192.168.13.190(192.168.13.190:3306)
Thu Feb 28 09:53:59 2019 - [info] 192.168.13.192(192.168.13.192:3306)
Thu Feb 28 09:53:59 2019 - [info] Alive Slaves:
Thu Feb 28 09:53:59 2019 - [info] 192.168.13.189(192.168.13.189:3306) Version=8.0.13 (oldest major version between slaves) log-bin:enabled
Thu Feb 28 09:53:59 2019 - [info] Replicating from 192.168.13.190(192.168.13.190:3306)
Thu Feb 28 09:53:59 2019 - [info] Primary candidate for the new Master (candidate_master is set)
Thu Feb 28 09:53:59 2019 - [info] 192.168.13.192(192.168.13.192:3306) Version=8.0.13 (oldest major version between slaves) log-bin:enabled
Thu Feb 28 09:53:59 2019 - [info] Replicating from 192.168.13.190(192.168.13.190:3306)
Thu Feb 28 09:53:59 2019 - [info] Primary candidate for the new Master (candidate_master is set)
Thu Feb 28 09:53:59 2019 - [info] Current Alive Master: 192.168.13.190(192.168.13.190:3306)
Thu Feb 28 09:53:59 2019 - [info] Checking slave configurations..
Thu Feb 28 09:53:59 2019 - [warning] relay_log_purge=0 is not set on slave 192.168.13.189(192.168.13.189:3306).
Thu Feb 28 09:53:59 2019 - [warning] relay_log_purge=0 is not set on slave 192.168.13.192(192.168.13.192:3306).
Thu Feb 28 09:53:59 2019 - [info] Checking replication filtering settings..
Thu Feb 28 09:53:59 2019 - [info] binlog_do_db= , binlog_ignore_db=
Thu Feb 28 09:53:59 2019 - [info] Replication filtering check ok.
Thu Feb 28 09:53:59 2019 - [info] GTID (with auto-pos) is not supported
Thu Feb 28 09:53:59 2019 - [info] Starting SSH connection tests..
Thu Feb 28 09:54:02 2019 - [info] All SSH connection tests passed successfully.
Thu Feb 28 09:54:02 2019 - [info] Checking MHA Node version..
Thu Feb 28 09:54:03 2019 - [info] Version check ok.
Thu Feb 28 09:54:03 2019 - [info] Checking SSH publickey authentication settings on the current master..
Thu Feb 28 09:54:04 2019 - [info] HealthCheck: SSH to 192.168.13.190 is reachable.
Thu Feb 28 09:54:04 2019 - [info] Master MHA Node version is 0.54.
Thu Feb 28 09:54:04 2019 - [info] Checking recovery script configurations on 192.168.13.190(192.168.13.190:3306)..
Thu Feb 28 09:54:04 2019 - [info] Executing command: save_binary_logs --command=test --start_pos=4 --binlog_dir=/var/lib/mysql --output_file=/var/tmp/save_binary_logs_test --manager_version=0.58 --start_file=on.000001
Thu Feb 28 09:54:04 2019 - [info] Connecting to root@192.168.13.190(192.168.13.190:22)..
Creating /var/tmp if not exists.. ok.
Checking output directory is accessible or not..
ok.
Binlog found at /var/lib/mysql, up to on.000001
Thu Feb 28 09:54:05 2019 - [info] Binlog setting check done.
Thu Feb 28 09:54:05 2019 - [info] Checking SSH publickey authentication and checking recovery script configurations on all alive slave servers..
Thu Feb 28 09:54:05 2019 - [info] Executing command : apply_diff_relay_logs --command=test --slave_user='root' --slave_host=192.168.13.189 --slave_ip=192.168.13.189 --slave_port=3306 --workdir=/var/tmp --target_version=8.0.13 --manager_version=0.58 --relay_dir=/var/lib/mysql --current_relay_log=ybb-test-mysql-1-relay-bin.000002 --slave_pass=xxx
Thu Feb 28 09:54:05 2019 - [info] Connecting to root@192.168.13.189(192.168.13.189:22)..
Checking slave recovery environment settings..
Relay log found at /var/lib/mysql, up to ybb-test-mysql-1-relay-bin.000002
Temporary relay log file is /var/lib/mysql/ybb-test-mysql-1-relay-bin.000002
Testing mysql connection and privileges..mysql: [Warning] Using a password on the command line interface can be insecure.
done.
Testing mysqlbinlog output.. done.
Cleaning up test file(s).. done.
Thu Feb 28 09:54:05 2019 - [info] Executing command : apply_diff_relay_logs --command=test --slave_user='root' --slave_host=192.168.13.192 --slave_ip=192.168.13.192 --slave_port=3306 --workdir=/var/tmp --target_version=8.0.13 --manager_version=0.58 --relay_dir=/var/lib/mysql --current_relay_log=ybb-test-mysql-4-relay-bin.000002 --slave_pass=xxx
Thu Feb 28 09:54:05 2019 - [info] Connecting to root@192.168.13.192(192.168.13.192:22)..
Checking slave recovery environment settings..
Relay log found at /var/lib/mysql, up to ybb-test-mysql-4-relay-bin.000002
Temporary relay log file is /var/lib/mysql/ybb-test-mysql-4-relay-bin.000002
Checking if super_read_only is defined and turned on.. not present or turned off, ignoring.
Testing mysql connection and privileges..
mysql: [Warning] Using a password on the command line interface can be insecure.
done.
Testing mysqlbinlog output.. done.
Cleaning up test file(s).. done.
Thu Feb 28 09:54:06 2019 - [info] Slaves settings check done.
Thu Feb 28 09:54:06 2019 - [info]
192.168.13.190(192.168.13.190:3306) (current master)
+--192.168.13.189(192.168.13.189:3306)
+--192.168.13.192(192.168.13.192:3306)
Thu Feb 28 09:54:06 2019 - [info] Checking replication health on 192.168.13.189..
Thu Feb 28 09:54:06 2019 - [info] ok.
Thu Feb 28 09:54:06 2019 - [info] Checking replication health on 192.168.13.192..
Thu Feb 28 09:54:06 2019 - [info] ok.
Thu Feb 28 09:54:06 2019 - [warning] master_ip_failover_script is not defined.
Thu Feb 28 09:54:06 2019 - [warning] shutdown_script is not defined.
Thu Feb 28 09:54:06 2019 - [info] Got exit code 0 (Not master dead).
MySQL Replication Health is OK.
[root@ybb-test-02-6 mha]#
4.2.5 启动及相关日志文件
(1)启动:
nohup masterha_manager --conf=/etc/mha/mha.cnf > /tmp/mha_manager.log < /dev/null 2>&1 &
[root@ybb-test-02-6 mha]#nohup masterha_manager --conf=/etc/mha/mha.cnf > /tmp/mha_manager.log &1 &
[1] 6492
[root@ybb-test-02-6 mha]# jobs
[1]+ 运行中nohup masterha_manager --conf=/etc/mha/mha.cnf > /tmp/mha_manager.log < /dev/null 2>&1 &
[root@ybb-test-02-6 mha]#
(2)日志文件
manager.log文件是MHA的启动日志文件,当启动成功会自动生成一个这样的文件并且记载了启动过程
[root@ybb-test-02-6 mha]# tail -f manager.log
Thu Feb 28 10:05:57 2019 - [warning] master_ip_failover_script is not defined.
Thu Feb 28 10:05:57 2019 - [warning] shutdown_script is not defined.
Thu Feb 28 10:05:57 2019 - [info] Set master ping interval 1 seconds.
Thu Feb 28 10:05:57 2019 - [warning] secondary_check_script is not defined. It is highly recommended setting it to check master reachability from two or more routes.
Thu Feb 28 10:05:57 2019 - [info] Starting ping health check on 192.168.13.190(192.168.13.190:3306)..
Thu Feb 28 10:06:01 2019 - [warning] Got error when monitoring master: at /usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm line 489.
Thu Feb 28 10:06:01 2019 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln491] Target master's advisory lock is already held by someone. Please check whether you monitor the same master from multiple monitoring processes.
Thu Feb 28 10:06:01 2019 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln511] Error happened on health checking. at /usr/bin/masterha_manager line 50.
Thu Feb 28 10:06:01 2019 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln525] Error happened on monitoring servers.
Thu Feb 28 10:06:01 2019 - [info] Got exit code 1 (Not master dead).——这些是默认自带的
下面是完整的启动流程:
Thu Feb 28 10:06:41 2019 - [info] MHA::MasterMonitor version 0.58.
Thu Feb 28 10:06:41 2019 - [warning] /etc/mha/mha.master_status.health already exists. You might have killed manager with SIGKILL(-9), may run two or more monitoring process for the same application, or use the same working directory. Check for details, and consider setting --workdir separately.
Thu Feb 28 10:06:43 2019 - [info] Multi-master configuration is detected. Current primary(writable) master is 192.168.13.190(192.168.13.190:3306)
Thu Feb 28 10:06:43 2019 - [info] Master configurations are as below:
Master 192.168.13.190(192.168.13.190:3306), replicating from 192.168.13.189(192.168.13.189:3306)
Master 192.168.13.189(192.168.13.189:3306), replicating from 192.168.13.190(192.168.13.190:3306), read-only
Thu Feb 28 10:06:43 2019 - [info] GTID failover mode = 0
Thu Feb 28 10:06:43 2019 - [info] Dead Servers:
Thu Feb 28 10:06:43 2019 - [info] Alive Servers:
Thu Feb 28 10:06:43 2019 - [info] 192.168.13.189(192.168.13.189:3306)
Thu Feb 28 10:06:43 2019 - [info] 192.168.13.190(192.168.13.190:3306)
Thu Feb 28 10:06:43 2019 - [info] 192.168.13.192(192.168.13.192:3306)
Thu Feb 28 10:06:43 2019 - [info] Alive Slaves:
Thu Feb 28 10:06:43 2019 - [info] 192.168.13.189(192.168.13.189:3306) Version=8.0.13 (oldest major version between slaves) log-bin:enabled
Thu Feb 28 10:06:43 2019 - [info] Replicating from 192.168.13.190(192.168.13.190:3306)
Thu Feb 28 10:06:43 2019 - [info] Primary candidate for the new Master (candidate_master is set)
Thu Feb 28 10:06:43 2019 - [info] 192.168.13.192(192.168.13.192:3306) Version=8.0.13 (oldest major version between slaves) log-bin:enabled
Thu Feb 28 10:06:43 2019 - [info] Replicating from 192.168.13.190(192.168.13.190:3306)
Thu Feb 28 10:06:43 2019 - [info] Primary candidate for the new Master (candidate_master is set)
Thu Feb 28 10:06:43 2019 - [info] Current Alive Master: 192.168.13.190(192.168.13.190:3306)
Thu Feb 28 10:06:43 2019 - [info] Checking slave configurations..
Thu Feb 28 10:06:43 2019 - [warning] relay_log_purge=0 is not set on slave 192.168.13.189(192.168.13.189:3306).
Thu Feb 28 10:06:43 2019 - [warning] relay_log_purge=0 is not set on slave 192.168.13.192(192.168.13.192:3306).
Thu Feb 28 10:06:43 2019 - [info] Checking replication filtering settings..
Thu Feb 28 10:06:43 2019 - [info] binlog_do_db= , binlog_ignore_db=
Thu Feb 28 10:06:43 2019 - [info] Replication filtering check ok.
Thu Feb 28 10:06:43 2019 - [info] GTID (with auto-pos) is not supported
Thu Feb 28 10:06:43 2019 - [info] Starting SSH connection tests..
Thu Feb 28 10:06:46 2019 - [info] All SSH connection tests passed successfully.
Thu Feb 28 10:06:46 2019 - [info] Checking MHA Node version..
Thu Feb 28 10:06:47 2019 - [info] Version check ok.
Thu Feb 28 10:06:47 2019 - [info] Checking SSH publickey authentication settings on the current master..
Thu Feb 28 10:06:47 2019 - [info] HealthCheck: SSH to 192.168.13.190 is reachable.
Thu Feb 28 10:06:48 2019 - [info] Master MHA Node version is 0.54.
Thu Feb 28 10:06:48 2019 - [info] Checking recovery script configurations on 192.168.13.190(192.168.13.190:3306)..
Thu Feb 28 10:06:48 2019 - [info] Executing command: save_binary_logs --command=test --start_pos=4 --binlog_dir=/var/lib/mysql --output_file=/var/tmp/save_binary_logs_test --manager_version=0.58 --start_file=on.000001
Thu Feb 28 10:06:48 2019 - [info] Connecting to root@192.168.13.190(192.168.13.190:22)..
Creating /var/tmp if not exists.. ok.
Checking output directory is accessible or not..
ok.
Binlog found at /var/lib/mysql, up to on.000001
Thu Feb 28 10:06:48 2019 - [info] Binlog setting check done.
Thu Feb 28 10:06:48 2019 - [info] Checking SSH publickey authentication and checking recovery script configurations on all alive slave servers..
Thu Feb 28 10:06:48 2019 - [info] Executing command : apply_diff_relay_logs --command=test --slave_user='root' --slave_host=192.168.13.189 --slave_ip=192.168.13.189 --slave_port=3306 --workdir=/var/tmp --target_version=8.0.13 --manager_version=0.58 --relay_dir=/var/lib/mysql --current_relay_log=ybb-test-mysql-1-relay-bin.000002 --slave_pass=xxx
Thu Feb 28 10:06:48 2019 - [info] Connecting to root@192.168.13.189(192.168.13.189:22)..
Checking slave recovery environment settings..
Relay log found at /var/lib/mysql, up to ybb-test-mysql-1-relay-bin.000002
Temporary relay log file is /var/lib/mysql/ybb-test-mysql-1-relay-bin.000002
Testing mysql connection and privileges..mysql: [Warning] Using a password on the command line interface can be insecure.
done.
Testing mysqlbinlog output.. done.
Cleaning up test file(s).. done.
Thu Feb 28 10:06:49 2019 - [info] Executing command : apply_diff_relay_logs --command=test --slave_user='root' --slave_host=192.168.13.192 --slave_ip=192.168.13.192 --slave_port=3306 --workdir=/var/tmp --target_version=8.0.13 --manager_version=0.58 --relay_dir=/var/lib/mysql --current_relay_log=ybb-test-mysql-4-relay-bin.000002 --slave_pass=xxx
Thu Feb 28 10:06:49 2019 - [info] Connecting to root@192.168.13.192(192.168.13.192:22)..
Checking slave recovery environment settings..
Relay log found at /var/lib/mysql, up to ybb-test-mysql-4-relay-bin.000002
Temporary relay log file is /var/lib/mysql/ybb-test-mysql-4-relay-bin.000002
Checking if super_read_only is defined and turned on.. not present or turned off, ignoring.
Testing mysql connection and privileges..
mysql: [Warning] Using a password on the command line interface can be insecure.
done.
Testing mysqlbinlog output.. done.
Cleaning up test file(s).. done.
Thu Feb 28 10:06:50 2019 - [info] Slaves settings check done.
Thu Feb 28 10:06:50 2019 - [info]
192.168.13.190(192.168.13.190:3306) (current master)
+--192.168.13.189(192.168.13.189:3306)
+--192.168.13.192(192.168.13.192:3306)
Thu Feb 28 10:06:50 2019 - [warning] master_ip_failover_script is not defined.
Thu Feb 28 10:06:50 2019 - [warning] shutdown_script is not defined.
Thu Feb 28 10:06:50 2019 - [info] Set master ping interval 1 seconds.
Thu Feb 28 10:06:50 2019 - [warning] secondary_check_script is not defined. It is highly recommended setting it to check master reachability from two or more routes.
Thu Feb 28 10:06:50 2019 - [info] Starting ping health check on 192.168.13.190(192.168.13.190:3306)..
Thu Feb 28 10:06:54 2019 - [warning] Got error when monitoring master: at /usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm line 489.
Thu Feb 28 10:06:54 2019 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln491] Target master's advisory lock is already held by someone. Please check whether you monitor the same master from multiple monitoring processes.
Thu Feb 28 10:06:54 2019 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln511] Error happened on health checking. at /usr/bin/masterha_manager line 50.
Thu Feb 28 10:06:54 2019 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln525] Error happened on monitoring servers.
Thu Feb 28 10:06:54 2019 - [info] Got exit code 1(Not master dead).
启动成功。
4.2.6 MHA测试
MHA其目的就是当master故障时能及时剔除掉故障并能推选出new_master从而使其他slave的master指向new_master,使业务不中断。
(1)停掉master(192.168.13.190)上的mysql
(2)查看MHA的进程日志manager.log
查看红色标记部分即可
[root@ybb-test-02-6 mha]# tail -f manager.log
Thu Feb 28 10:14:54 2019 - [warning] Got error on MySQL select ping: 2013 (Lost connection to MySQL server during query)
Thu Feb 28 10:14:54 2019 - [info] Executing SSH check script: save_binary_logs --command=test --start_pos=4 --binlog_dir=/var/lib/mysql --output_file=/var/tmp/save_binary_logs_test --manager_version=0.58 --binlog_prefix=on
Thu Feb 28 10:14:55 2019 - [info] HealthCheck: SSH to 192.168.13.190 is reachable.
Thu Feb 28 10:14:55 2019 - [warning] Got error on MySQL connect: 2003 (Can't connect to MySQL server on '192.168.13.190' (111))
Thu Feb 28 10:14:55 2019 - [warning] Connection failed 2 time(s)..
Thu Feb 28 10:14:56 2019 - [warning] Got error on MySQL connect: 2003 (Can't connect to MySQL server on '192.168.13.190' (111))
Thu Feb 28 10:14:56 2019 - [warning] Connection failed 3 time(s)..
Thu Feb 28 10:14:57 2019 - [warning] Got error on MySQL connect: 2003 (Can't connect to MySQL server on '192.168.13.190' (111))
Thu Feb 28 10:14:57 2019 - [warning] Connection failed 4 time(s)..
Thu Feb 28 10:14:57 2019 - [warning] Master is not reachable from health checker!
Thu Feb 28 10:14:57 2019 - [warning] Master 192.168.13.190(192.168.13.190:3306) is not reachable!
Thu Feb 28 10:14:57 2019 - [warning] SSH is reachable.
Thu Feb 28 10:14:57 2019 - [info] Connecting to a master server failed. Reading configuration file /etc/masterha_default.cnf and /etc/mha/mha.cnf again, and trying to connect to all servers to check server status..
Thu Feb 28 10:14:57 2019 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Thu Feb 28 10:14:57 2019 - [info] Reading application default configuration from /etc/mha/mha.cnf..
Thu Feb 28 10:14:57 2019 - [info] Reading server configuration from /etc/mha/mha.cnf..
Thu Feb 28 10:14:58 2019 - [info] GTID failover mode = 0
Thu Feb 28 10:14:58 2019 - [info] Dead Servers:
Thu Feb 28 10:14:58 2019 - [info] 192.168.13.190(192.168.13.190:3306)
Thu Feb 28 10:14:58 2019 - [info] Alive Servers:
Thu Feb 28 10:14:58 2019 - [info] 192.168.13.189(192.168.13.189:3306)
Thu Feb 28 10:14:58 2019 - [info] 192.168.13.192(192.168.13.192:3306)
Thu Feb 28 10:14:58 2019 - [info] Alive Slaves:
Thu Feb 28 10:14:58 2019 - [info] 192.168.13.189(192.168.13.189:3306) Version=8.0.13 (oldest major version between slaves) log-bin:enabled
Thu Feb 28 10:14:58 2019 - [info] Replicating from 192.168.13.190(192.168.13.190:3306)
Thu Feb 28 10:14:58 2019 - [info] Primary candidate for the new Master (candidate_master is set)
Thu Feb 28 10:14:58 2019 - [info] 192.168.13.192(192.168.13.192:3306) Version=8.0.13 (oldest major version between slaves) log-bin:enabled
Thu Feb 28 10:14:58 2019 - [info] Replicating from 192.168.13.190(192.168.13.190:3306)
Thu Feb 28 10:14:58 2019 - [info] Primary candidate for the new Master (candidate_master is set)
Thu Feb 28 10:14:58 2019 - [info] Checking slave configurations..
Thu Feb 28 10:14:58 2019 - [warning] relay_log_purge=0 is not set on slave 192.168.13.189(192.168.13.189:3306).
Thu Feb 28 10:14:58 2019 - [warning] relay_log_purge=0 is not set on slave 192.168.13.192(192.168.13.192:3306).
Thu Feb 28 10:14:58 2019 - [info] Checking replication filtering settings..
Thu Feb 28 10:14:58 2019 - [info] Replication filtering check ok.
Thu Feb 28 10:14:58 2019 - [info] Master is down!
Thu Feb 28 10:14:58 2019 - [info] Terminating monitoring script.
Thu Feb 28 10:14:58 2019 - [info] Got exit code 20 (Master dead).
Thu Feb 28 10:14:58 2019 - [info] MHA::MasterFailover version 0.58.
Thu Feb 28 10:14:58 2019 - [info] Starting master failover.
Thu Feb 28 10:14:58 2019 - [info]
Thu Feb 28 10:14:58 2019 - [info] * Phase 1: Configuration Check Phase..
Thu Feb 28 10:14:58 2019 - [info]
Thu Feb 28 10:15:00 2019 - [info] GTID failover mode = 0
Thu Feb 28 10:15:00 2019 - [info] Dead Servers:
Thu Feb 28 10:15:00 2019 - [info] 192.168.13.190(192.168.13.190:3306)
Thu Feb 28 10:15:00 2019 - [info] Checking master reachability via MySQL(double check)...
Thu Feb 28 10:15:00 2019 - [info] ok.
Thu Feb 28 10:15:00 2019 - [info] Alive Servers:
Thu Feb 28 10:15:00 2019 - [info] 192.168.13.189(192.168.13.189:3306)
Thu Feb 28 10:15:00 2019 - [info] 192.168.13.192(192.168.13.192:3306)
Thu Feb 28 10:15:00 2019 - [info] Alive Slaves:
Thu Feb 28 10:15:00 2019 - [info] 192.168.13.189(192.168.13.189:3306) Version=8.0.13 (oldest major version between slaves) log-bin:enabled
Thu Feb 28 10:15:00 2019 - [info] Replicating from 192.168.13.190(192.168.13.190:3306)
Thu Feb 28 10:15:00 2019 - [info] Primary candidate for the new Master (candidate_master is set)
Thu Feb 28 10:15:00 2019 - [info] 192.168.13.192(192.168.13.192:3306) Version=8.0.13 (oldest major version between slaves) log-bin:enabled
Thu Feb 28 10:15:00 2019 - [info] Replicating from 192.168.13.190(192.168.13.190:3306)
Thu Feb 28 10:15:00 2019 - [info] Primary candidate for the new Master (candidate_master is set)
Thu Feb 28 10:15:00 2019 - [info] Starting Non-GTID based failover.
Thu Feb 28 10:15:00 2019 - [info]
Thu Feb 28 10:15:00 2019 - [info] ** Phase 1: Configuration Check Phase completed.
Thu Feb 28 10:15:00 2019 - [info]
Thu Feb 28 10:15:00 2019 - [info] * Phase 2: Dead Master Shutdown Phase..
Thu Feb 28 10:15:00 2019 - [info]
Thu Feb 28 10:15:00 2019 - [info] Forcing shutdown so that applications never connect to the current master..
Thu Feb 28 10:15:00 2019 - [warning] master_ip_failover_script is not set. Skipping invalidating dead master IP address.
Thu Feb 28 10:15:00 2019 - [warning] shutdown_script is not set. Skipping explicit shutting down of the dead master.
Thu Feb 28 10:15:01 2019 - [info] * Phase 2: Dead Master Shutdown Phase completed.
Thu Feb 28 10:15:01 2019 - [info]
Thu Feb 28 10:15:01 2019 - [info] * Phase 3: Master Recovery Phase..
Thu Feb 28 10:15:01 2019 - [info]
Thu Feb 28 10:15:01 2019 - [info] * Phase 3.1: Getting Latest Slaves Phase..
Thu Feb 28 10:15:01 2019 - [info]
Thu Feb 28 10:15:01 2019 - [info] The latest binary log file/position on all slaves is on.000001:661
Thu Feb 28 10:15:01 2019 - [info] Latest slaves (Slaves that received relay log files to the latest):
Thu Feb 28 10:15:01 2019 - [info] 192.168.13.189(192.168.13.189:3306) Version=8.0.13 (oldest major version between slaves) log-bin:enabled
Thu Feb 28 10:15:01 2019 - [info] Replicating from 192.168.13.190(192.168.13.190:3306)
Thu Feb 28 10:15:01 2019 - [info] Primary candidate for the new Master (candidate_master is set)
Thu Feb 28 10:15:01 2019 - [info] 192.168.13.192(192.168.13.192:3306) Version=8.0.13 (oldest major version between slaves) log-bin:enabled
Thu Feb 28 10:15:01 2019 - [info] Replicating from 192.168.13.190(192.168.13.190:3306)
Thu Feb 28 10:15:01 2019 - [info] Primary candidate for the new Master (candidate_master is set)
Thu Feb 28 10:15:01 2019 - [info] The oldest binary log file/position on all slaves is on.000001:661
Thu Feb 28 10:15:01 2019 - [info] Oldest slaves:
Thu Feb 28 10:15:01 2019 - [info] 192.168.13.189(192.168.13.189:3306) Version=8.0.13 (oldest major version between slaves) log-bin:enabled
Thu Feb 28 10:15:01 2019 - [info] Replicating from 192.168.13.190(192.168.13.190:3306)
Thu Feb 28 10:15:01 2019 - [info] Primary candidate for the new Master (candidate_master is set)
Thu Feb 28 10:15:01 2019 - [info] 192.168.13.192(192.168.13.192:3306) Version=8.0.13 (oldest major version between slaves) log-bin:enabled
Thu Feb 28 10:15:01 2019 - [info] Replicating from 192.168.13.190(192.168.13.190:3306)
Thu Feb 28 10:15:01 2019 - [info] Primary candidate for the new Master (candidate_master is set)
Thu Feb 28 10:15:01 2019 - [info]
Thu Feb 28 10:15:01 2019 - [info] * Phase 3.2: Saving Dead Master's Binlog Phase..
Thu Feb 28 10:15:01 2019 - [info]
Thu Feb 28 10:15:01 2019 - [info] Fetching dead master's binary logs..
Thu Feb 28 10:15:01 2019 - [info] Executing command on the dead master 192.168.13.190(192.168.13.190:3306): save_binary_logs --command=save --start_file=on.000001 --start_pos=661 --binlog_dir=/var/lib/mysql --output_file=/var/tmp/saved_master_binlog_from_192.168.13.190_3306_20190228101458.binlog --handle_raw_binlog=1 --disable_log_bin=0 --manager_version=0.58
Creating /var/tmp if not exists.. ok.
Concat binary/relay logs from on.000001 pos 661 to on.000001 EOF into /var/tmp/saved_master_binlog_from_192.168.13.190_3306_20190228101458.binlog ..
Dumping binlog format description event, from position 0 to 124.. ok.
Dumping effective binlog data from /var/lib/mysql/on.000001 position 661 to tail(684).. ok.
Concat succeeded.
Thu Feb 28 10:15:03 2019 - [info] scp from root@192.168.13.190:/var/tmp/saved_master_binlog_from_192.168.13.190_3306_20190228101458.binlog to local:/etc/mha/saved_master_binlog_from_192.168.13.190_3306_20190228101458.binlog succeeded.
Thu Feb 28 10:15:03 2019 - [info] HealthCheck: SSH to 192.168.13.189 is reachable.
Thu Feb 28 10:15:04 2019 - [info] HealthCheck: SSH to 192.168.13.192 is reachable.
Thu Feb 28 10:15:04 2019 - [info]
Thu Feb 28 10:15:04 2019 - [info] * Phase 3.3: Determining New Master Phase..
Thu Feb 28 10:15:04 2019 - [info]
Thu Feb 28 10:15:04 2019 - [info] Finding the latest slave that has all relay logs for recovering other slaves..
Thu Feb 28 10:15:04 2019 - [info] All slaves received relay logs to the same position. No need to resync each other.
Thu Feb 28 10:15:04 2019 - [info] Searching new master from slaves..
Thu Feb 28 10:15:04 2019 - [info] Candidate masters from the configuration file:
Thu Feb 28 10:15:04 2019 - [info] 192.168.13.189(192.168.13.189:3306) Version=8.0.13 (oldest major version between slaves) log-bin:enabled
Thu Feb 28 10:15:04 2019 - [info] Replicating from 192.168.13.190(192.168.13.190:3306)
Thu Feb 28 10:15:04 2019 - [info] Primary candidate for the new Master (candidate_master is set)
Thu Feb 28 10:15:04 2019 - [info] 192.168.13.192(192.168.13.192:3306) Version=8.0.13 (oldest major version between slaves) log-bin:enabled
Thu Feb 28 10:15:04 2019 - [info] Replicating from 192.168.13.190(192.168.13.190:3306)
Thu Feb 28 10:15:04 2019 - [info] Primary candidate for the new Master (candidate_master is set)
Thu Feb 28 10:15:04 2019 - [info] Non-candidate masters:
Thu Feb 28 10:15:04 2019 - [info] Searching from candidate_master slaves which have received the latest relay log events..
Thu Feb 28 10:15:04 2019 - [info] New master is 192.168.13.189(192.168.13.189:3306)
Thu Feb 28 10:15:04 2019 - [info] Starting master failover..
Thu Feb 28 10:15:04 2019 - [info]
From:
192.168.13.190(192.168.13.190:3306) (current master)
+--192.168.13.189(192.168.13.189:3306)
+--192.168.13.192(192.168.13.192:3306)
To:
192.168.13.189(192.168.13.189:3306) (new master)
+--192.168.13.192(192.168.13.192:3306)
Thu Feb 28 10:15:04 2019 - [info]
Thu Feb 28 10:15:04 2019 - [info] * Phase 3.4: New Master Diff Log Generation Phase..
Thu Feb 28 10:15:04 2019 - [info]
Thu Feb 28 10:15:04 2019 - [info] This server has all relay logs. No need to generate diff files from the latest slave.
Thu Feb 28 10:15:04 2019 - [info] Sending binlog..
Thu Feb 28 10:15:05 2019 - [info] scp from local:/etc/mha/saved_master_binlog_from_192.168.13.190_3306_20190228101458.binlog to root@192.168.13.189:/var/tmp/saved_master_binlog_from_192.168.13.190_3306_20190228101458.binlog succeeded.
Thu Feb 28 10:15:05 2019 - [info]
Thu Feb 28 10:15:05 2019 - [info] * Phase 3.5: Master Log Apply Phase..
Thu Feb 28 10:15:05 2019 - [info]
Thu Feb 28 10:15:05 2019 - [info] *NOTICE: If any error happens from this phase, manual recovery is needed.
Thu Feb 28 10:15:05 2019 - [info] Starting recovery on 192.168.13.189(192.168.13.189:3306)..
Thu Feb 28 10:15:05 2019 - [info] Generating diffs succeeded.
Thu Feb 28 10:15:05 2019 - [info] Waiting until all relay logs are applied.
Thu Feb 28 10:15:05 2019 - [info] done.
Thu Feb 28 10:15:05 2019 - [info] Getting slave status..
Thu Feb 28 10:15:05 2019 - [info] This slave(192.168.13.189)'s Exec_Master_Log_Pos equals to Read_Master_Log_Pos(on.000001:661). No need to recover from Exec_Master_Log_Pos.
Thu Feb 28 10:15:05 2019 - [info] Connecting to the target slave host 192.168.13.189, running recover script..
Thu Feb 28 10:15:05 2019 - [info] Executing command: apply_diff_relay_logs --command=apply --slave_user='root' --slave_host=192.168.13.189 --slave_ip=192.168.13.189 --slave_port=3306 --apply_files=/var/tmp/saved_master_binlog_from_192.168.13.190_3306_20190228101458.binlog --workdir=/var/tmp --target_version=8.0.13 --timestamp=20190228101458 --handle_raw_binlog=1 --disable_log_bin=0 --manager_version=0.58 --slave_pass=xxx
Thu Feb 28 10:15:05 2019 - [info]
Applying differential binary/relay log files /var/tmp/saved_master_binlog_from_192.168.13.190_3306_20190228101458.binlog on 192.168.13.189:3306. This may take long time...
Applying log files succeeded.
Thu Feb 28 10:15:05 2019 - [info] All relay logs were successfully applied.
Thu Feb 28 10:15:05 2019 - [info] Getting new master's binlog name and position..
Thu Feb 28 10:15:06 2019 - [info] mysql-bin.000009:654
Thu Feb 28 10:15:06 2019 - [info] All other slaves should start replication from here. Statement should be: CHANGE MASTER TO MASTER_HOST='192.168.13.189', MASTER_PORT=3306, MASTER_LOG_FILE='mysql-bin.000009', MASTER_LOG_POS=654, MASTER_USER='slave', MASTER_PASSWORD='xxx';
Thu Feb 28 10:15:06 2019 - [warning] master_ip_failover_script is not set. Skipping taking over new master IP address.
Thu Feb 28 10:15:06 2019 - [info] Setting read_only=0 on 192.168.13.189(192.168.13.189:3306)..
Thu Feb 28 10:15:06 2019 - [info] ok.
Thu Feb 28 10:15:06 2019 - [info] ** Finished master recovery successfully.
Thu Feb 28 10:15:06 2019 - [info] * Phase 3: Master Recovery Phase completed.
Thu Feb 28 10:15:06 2019 - [info]
Thu Feb 28 10:15:06 2019 - [info] * Phase 4: Slaves Recovery Phase..
Thu Feb 28 10:15:06 2019 - [info]
Thu Feb 28 10:15:06 2019 - [info] * Phase 4.1: Starting Parallel Slave Diff Log Generation Phase..
Thu Feb 28 10:15:06 2019 - [info]
Thu Feb 28 10:15:06 2019 - [info] -- Slave diff file generation on host 192.168.13.192(192.168.13.192:3306) started, pid: 11154. Check tmp log /etc/mha/192.168.13.192_3306_20190228101458.log if it takes time..
Thu Feb 28 10:15:07 2019 - [info]
Thu Feb 28 10:15:07 2019 - [info] Log messages from 192.168.13.192 ...
Thu Feb 28 10:15:07 2019 - [info]
Thu Feb 28 10:15:06 2019 - [info] This server has all relay logs. No need to generate diff files from the latest slave.
Thu Feb 28 10:15:07 2019 - [info] End of log messages from 192.168.13.192.
Thu Feb 28 10:15:07 2019 - [info] -- 192.168.13.192(192.168.13.192:3306) has the latest relay log events.
Thu Feb 28 10:15:07 2019 - [info] Generating relay diff files from the latest slave succeeded.
Thu Feb 28 10:15:07 2019 - [info]
Thu Feb 28 10:15:07 2019 - [info] * Phase 4.2: Starting Parallel Slave Log Apply Phase..
Thu Feb 28 10:15:07 2019 - [info]
Thu Feb 28 10:15:07 2019 - [info] -- Slave recovery on host 192.168.13.192(192.168.13.192:3306) started, pid: 11157. Check tmp log /etc/mha/192.168.13.192_3306_20190228101458.log if it takes time..
Thu Feb 28 10:15:09 2019 - [info]
Thu Feb 28 10:15:09 2019 - [info] Log messages from 192.168.13.192 ...
Thu Feb 28 10:15:09 2019 - [info]
Thu Feb 28 10:15:07 2019 - [info] Sending binlog..
Thu Feb 28 10:15:07 2019 - [info] scp from local:/etc/mha/saved_master_binlog_from_192.168.13.190_3306_20190228101458.binlog to root@192.168.13.192:/var/tmp/saved_master_binlog_from_192.168.13.190_3306_20190228101458.binlog succeeded.
Thu Feb 28 10:15:07 2019 - [info] Starting recovery on 192.168.13.192(192.168.13.192:3306)..
Thu Feb 28 10:15:07 2019 - [info] Generating diffs succeeded.
Thu Feb 28 10:15:07 2019 - [info] Waiting until all relay logs are applied.
Thu Feb 28 10:15:07 2019 - [info] done.
Thu Feb 28 10:15:07 2019 - [info] Getting slave status..
Thu Feb 28 10:15:07 2019 - [info] This slave(192.168.13.192)'s Exec_Master_Log_Pos equals to Read_Master_Log_Pos(on.000001:661). No need to recover from Exec_Master_Log_Pos.
Thu Feb 28 10:15:07 2019 - [info] Connecting to the target slave host 192.168.13.192, running recover script..
Thu Feb 28 10:15:07 2019 - [info] Executing command: apply_diff_relay_logs --command=apply --slave_user='root' --slave_host=192.168.13.192 --slave_ip=192.168.13.192 --slave_port=3306 --apply_files=/var/tmp/saved_master_binlog_from_192.168.13.190_3306_20190228101458.binlog --workdir=/var/tmp --target_version=8.0.13 --timestamp=20190228101458 --handle_raw_binlog=1 --disable_log_bin=0 --manager_version=0.58 --slave_pass=xxx
Thu Feb 28 10:15:08 2019 - [info]
Applying differential binary/relay log files /var/tmp/saved_master_binlog_from_192.168.13.190_3306_20190228101458.binlog on 192.168.13.192:3306. This may take long time...
Applying log files succeeded.
Thu Feb 28 10:15:08 2019 - [info] All relay logs were successfully applied.
Thu Feb 28 10:15:08 2019 - [info] Resetting slave 192.168.13.192(192.168.13.192:3306) and starting replication from the new master 192.168.13.189(192.168.13.189:3306)..
Thu Feb 28 10:15:08 2019 - [info] Executed CHANGE MASTER.
Thu Feb 28 10:15:08 2019 - [info] Slave started.
Thu Feb 28 10:15:09 2019 - [info] End of log messages from 192.168.13.192.
Thu Feb 28 10:15:09 2019 - [info] -- Slave recovery on host 192.168.13.192(192.168.13.192:3306) succeeded.
Thu Feb 28 10:15:09 2019 - [info] All new slave servers recovered successfully.
Thu Feb 28 10:15:09 2019 - [info]
Thu Feb 28 10:15:09 2019 - [info] * Phase 5: New master cleanup phase..
Thu Feb 28 10:15:09 2019 - [info]
Thu Feb 28 10:15:09 2019 - [info] Resetting slave info on the new master..
Thu Feb 28 10:15:09 2019 - [info] 192.168.13.189: Resetting slave info succeeded.
Thu Feb 28 10:15:09 2019 - [info] Master failover to 192.168.13.189(192.168.13.189:3306) completed successfully.
Thu Feb 28 10:15:09 2019 - [info]
----- Failover Report -----
mha: MySQL Master failover 192.168.13.190(192.168.13.190:3306) to 192.168.13.189(192.168.13.189:3306) succeeded
Master 192.168.13.190(192.168.13.190:3306) is down!
Check MHA Manager logs at ybb-test-02-6:/etc/mha/manager.log for details.
Started automated(non-interactive) failover.
The latest slave 192.168.13.189(192.168.13.189:3306) has all relay logs for recovery.
Selected 192.168.13.189(192.168.13.189:3306) as a new master.
192.168.13.189(192.168.13.189:3306): OK: Applying all logs succeeded.
192.168.13.192(192.168.13.192:3306): This host has the latest relay log events.
Generating relay diff files from the latest slave succeeded.
192.168.13.192(192.168.13.192:3306): OK: Applying all logs succeeded. Slave started, replicating from 192.168.13.189(192.168.13.189:3306)
192.168.13.189(192.168.13.189:3306): Resetting slave info succeeded.
Master failover to 192.168.13.189(192.168.13.189:3306) completed successfully.
查看192.168.13.192上mysql的master是不是已经成功转变了:
由此可见,MHA已经起到高可用的目的了
4.3 集群小结
一、或者,其他方案? 不管哪种方案都是有其场景限制 或说 规模限制,以及优缺点的。
1、首先反对大家做读写分离,关于这方面的原因解释太多次数(增加技术复杂度、可能导致读到落后的数据等),只说一点:99.8%的业务场景没有必要做读写分离,只要做好数据库设计优化 和配置合适正确的主机即可。
2、Keepalived+MySQL --确实有脑裂的问题,还无法做到准确判断mysqld是否HANG的情况;
3、DRBD+Heartbeat+MySQL --同样有脑裂的问题,还无法做到准确判断mysqld是否HANG的情况,且DRDB是不需要的,增加反而会出问题;
3、MySQL Proxy --不错的项目,可惜官方半途夭折了,不建议用,无法高可用,是一个写分离;
4、MySQL Cluster --社区版本不支持NDB是错误的言论,商用案例确实不多,主要是跟其业务场景要求有关系、这几年发展有点乱不过现在已经上正规了、对网络要求高;
5、MySQL + MHA --可以解决脑裂的问题,需要的IP多,小集群是可以的,但是管理大的就麻烦,其次MySQL + MMM的话且坑很多,有MHA就没必要采用MMM建议:
(1)若是双主复制的模式,不用做数据拆分,那么就可以选择MHA或Keepalive或heartbeat
(2)若是双主复制,还做了数据的拆分,则可以考虑采用Cobar;
二、 MHA是自动的master故障转移和Slave提升的软件包.它是基于标准的MySQL复制(异步/半同步)。该软件由两部分组成:MHA Manager(管理节点)和MHA Node(数据节点)。
1)MHA Manager可以单独部署在一台独立的机器上管理多个master-slave集群,也可以部署在一台slave节点上。MHA Manager会定时探测集群中的node节点,当发现master出现故障的时候,它可以自动将具有最新数据的slave提升为新的master,然后将所有其它的slave导向新的master上.整个故障转移过程对应用程序是透明的。
2)MHA Node运行在每台MySQL服务器上,它通过监控具备解析和清理logs功能的脚本来加快故障转移的。
第5章错误集锦
5.1 主从报错
1、Last_IO_Error: Got fatal error 1236 from master when reading data from binary log: 'Could not find first log file name in binary log index file'
解决办法:(1)stop slave;(2)reset slave; (3)start slave;若一遍不行,隔一会儿再试一下
reset slave 将使slave忘记主从复制关系的位置信息。该语句将被用于干净的启动,它删除master.info文件和relay-log.info文件以及所有的relay log文件并重新启用一个新的relaylog文件。使用reset slave之前必须使用stop slave命令将复制进程停止。
2、若执行此3条命令不行,则使用以下方法
(1)删除所有bin-log日志【find / -name“*bin-log*” -exec ls {} \;】
(2)重新生成bin-log日志文件
change master to
-> master_log_file='bin_log.000001',
-> master_log_pos=155;
Start slave;
(3)若还是报和之前一样的错误,则:
① stop slave;
② reset slave;
③ start slave;
(4)完成
3、若提示连接超时或用户名、密码错误:
mysql -ubackup -h 2.2.2.2 -p123456,看看在从上能不能连上主mysql。
能:用户名+密码没问题
不能:用户名或密码错误
4、找不到mysql临时密码
[root@ybb-test-02-6 mysql]# cat /var/log/mysqld.log |grep password
[root@ybb-test-02-6 mysql]#
(1)删除原来安装过的mysql残留的数据(这一步非常重要,问题就出在这)
rm -rf /var/lib/mysql
(3)重启mysqld服务systemctl restart mysqld
(4)查看mysql日志文件
[root@ybb-test-02-6 mysql]# cat /var/log/mysqld.log |grep password
2019-02-25T08:50:58.053021Z 5 [Note] [MY-010454] [Server] A temporary password is generated for root@localhost: X=0gw#AAQdx*
5、User slave does not exist or does not have REPLICATION SLAVE privilege! Other slaves can not start replication from this host.
在master创建slave同步用户时未赋予权限,解决办法:
GRANT REPLICATION SLAVE ON *.* TO 'slave'@'%';
flush privileges;
6、
这个时候先等一会儿或查看日志,show slave status\G;若没有错误等一会儿错误就会报出来(几分钟),报错如下:
Slave SQL for channel '': Could not execute Update_rows event on table mysql.user; Can't find record in 'user', Error_code: 1032; handler error HA_ERR_KEY_NOT_FOUND; the event's master log mysql-bin.000006, end_log_pos 1768, Error_code: MY-001032
解决办法:
stop slave;
set global sql_slave_skip_counter=1;
start slave;
5.2 集群报错
1、[error][/usr/share/perl5/vendor_perl/MHA/ServerManager.pm, ln781] Multi-master configuration is detected, but two or more masters are either writable (read-only is not set) or dead! Check configurations for details. Master configurations are as below:
由于MHA目前支持一主多从模式也就是只支持一个master写,其他master只能读的模式,所以当多个master并存时,只能设置其他master为read-only模式即可
解决办法:在其他master执行mysql -uroot -p -e "set global read_only=1"即可
2、
Mon Mar 4 15:59:33 2019 - [error][/usr/share/perl5/vendor_perl/MHA/Server.pm, ln398] 192.168.13.190(192.168.13.190:3306): User slave does not exist or does not have REPLICATION SLAVE privilege! Other slaves can not start replication from this host.
Mon Mar 4 15:59:33 2019 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln427] Error happened on checking configurations. at /usr/share/perl5/vendor_perl/MHA/ServerManager.pm line 1403.
Mon Mar 4 15:59:33 2019 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln525] Error happened on monitoring servers.
这是因为从192.168.13.190上面同步用户slave不存在导致,因为当master挂了之后,任何从都有可能成为master,所以若从没有同步用户slave会导致同步失败。