MySQL 主从复制的形式
- 一主一从
- 主主复制
- 一主多从
- 多主一从
- 级联复制
一主一从
节点规划
主机 | ip | 角色 |
---|---|---|
yfm01 | 192.168.199.101 | MASTER |
yfm02 | 192.168.199.102 | SLAVE |
MYSQL | 5.7.21 | |
CENTOS | CentOS Linux release 7.9.2009 (Core) |
查看mysql版本
select version();
查看centos版本
cat /etc/centos-release
192.168.199.101,/etc/my.cnf
[mysqld]
#开启二进制日志
log-bin=mysql-bin
#设置server-id,建议使用ip最后3位
server-id=101
192.168.199.102,/etc/my.cnf
[mysqld]
#开启中继日志
relay-log=mysql-relay
#设置server-id,建议使用ip最后3位
server-id=102
2台服务器mysql服务重启
在master建立账户并且授权slave
mysql> GRANT REPLICATION SLAVE ON *.* TO 'root'@'192.168.199.102' IDENTIFIED BY '123456';
Query OK, 0 rows affected, 1 warning (0.00 sec)
mysql> flush privileges;
Query OK, 0 rows affected (0.00 sec)
查看master状态
mysql> show master status;
+------------------+----------+--------------+------------------+-------------------+
| File | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set |
+------------------+----------+--------------+------------------+-------------------+
| mysql-bin.000001 | 603 | | | |
+------------------+----------+--------------+------------------+-------------------+
1 row in set (0.00 sec)
告知从服务器二进制文件名与位置
mysql> CHANGE MASTER TO master_host = '192.168.199.101',master_user = 'root',master_password = '123456',master_log_file = 'mysql-bin.000001',master_log_pos = 603;
Query OK, 0 rows affected, 2 warnings (0.00 sec)
mysql> FLUSH PRIVILEGES;
Query OK, 0 rows affected (0.00 sec)
开启复制
mysql> start slave;
Query OK, 0 rows affected (0.00 sec)
查看主从复制是否配置成功
mysql> SHOW SLAVE STATUS\G
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: 192.168.199.101
Master_User: root
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: mysql-bin.000001
Read_Master_Log_Pos: 603
Relay_Log_File: mysql-relay.000002
Relay_Log_Pos: 320
Relay_Master_Log_File: mysql-bin.000001
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
当看到Slave_IO_State:Waiting for master ot send event 、Slave_IO_Running: YES、Slave_SQL_Running: YES才表明状态正常
验证主从复制是否成功
在Master中创建数据库并创建数据表并插入一条数据
mysql> create database test;
Query OK, 1 row affected (0.00 sec)
mysql> use test;
Database changed
mysql> create table tab1(id int auto_increment,name varchar(10),primary key(id));
Query OK, 0 rows affected (0.00 sec)
mysql> insert into tab1(id,name) values (1,'why');
Query OK, 1 row affected (0.03 sec)
在slave中查询这条数据
mysql> show databases;
+--------------------+
| Database |
+--------------------+
| information_schema |
| mysql |
| performance_schema |
| sys |
| test |
+--------------------+
5 rows in set (0.00 sec)
mysql> use test
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A
Database changed
mysql> select * from test;
ERROR 1146 (42S02): Table 'test.test' doesn't exist
mysql> select * from tab1;
+----+------+
| id | name |
+----+------+
| 1 | why |
+----+------+
1 row in set (0.00 sec)
常用语句
show master status: 查看master的状态,尤其是当前的日志及位置
show slave status 查看slave的状态
reset slave 重置slave状态
start slave 启动slave状态
stop slave 暂停slave状态
mycat读写分离
实现MYSQL读写分离可以基于第三方插件,也可以通过开发修改代码实现,具体实现的读写分离的常见方式有如下四种:
- Amoeba读写分离;
- MySQL-Proxy读写分离;
- Mycat读写分离;
- 基于程序读写分离(效率很高,实施难度大,开发改代码);
节点规划
主机名 | ip | 角色 |
---|---|---|
yfm01 | 192.168.199.101 | 写数据库 |
yfm02 | 192.168.199.102 | 从数据库 |
yfm03 | 192.168.199.103 | myacat |
下载mycat安装包
mkdir -p /data/mycat
cd /data/mycat && wget http://dl.mycat.org.cn/1.6.7.4/Mycat-server-1.6.7.4-test-20191113141017-linux.tar.gz
tar -zxvf Mycat-server-1.6.7.4-test-20191113141017-linux.tar.gz
[root@yfm03 mycat]# groupadd mycat
[root@yfm03 mycat]# useradd -r -g mycat mycat
[root@yfm03 mycat]# chown -R mycat.mycat /data/mycat/mycat
设置环境变量
echo 'export MYCAT_HOME=/data/mycat/mycat' >> /etc/profile
echo 'export PATH=$MYCAT_HOME/bin:$PATH' >> /etc/profile
source /etc/profile
修改日志级别
vi /data/mycat/mycat/conf/log4j2.xml
添加 <AsyncLogger name="io.mycat" level="debug"...> 这个节点,表示对 io.mycat 包的日志级别为 debug
<Loggers>
<AsyncLogger name="io.mycat" level="debug" includeLocation="true" additivity="false">
<!--<AppenderRef ref="Console"/>-->
<AppenderRef ref="RollingFile"/>
</AsyncLogger>
<asyncRoot level="info" includeLocation="true">
<!--<AppenderRef ref="Console" />-->
<AppenderRef ref="RollingFile"/>
</asyncRoot>
</Loggers>
修改server.xml、shcema.xml详细配置在文章末尾
启动mycat
mycat start
检查mycat是否成功启动
ps -ef|grep mycat
查看mycat日志
[root@yfm03 logs]# cat NFO | jvm 1 | 2021/01/22 10:10:01 | Caused by: io.mycat.config.util.ConfigException: Illegal table conf : table [ TAB1 ] rule function [ rang-long ] partition size : 3 > table datanode size : 1, please make sure table datanode size = function partition size
可以看到mycat启动失败了
修改/data/mycat/mycat/conf/rule.xml
<function name="mod-long" class="io.mycat.route.function.PartitionByMod">
<!-- how many data nodes -->
<property name="count">1</property>
</function>
修改/data/mycat/mycat/conf/autopartition-long.txt
#默认是三个,我们需要删除最后两个,不然就会报错,说节点少了
#K=1000条记录,M=10000条记录,那么下面三个配置就是0~500万的记录会存在数据库节点1的表中,500万~1000万会存在节点2的表中
# range start-end ,data node index
# K=1000,M=10000.
0-500M=0
# 500M-1000M=1
# 1000M-1500M=2
ERROR | wrapper | 2021/01/22 10:38:40 | Startup failed: Timed out waiting for a signal from the JVM.
ERROR | wrapper | 2021/01/22 10:38:40 | JVM did not exit on request, terminated
修改/data/mycat/mycat/conf/wrapper.cof
添加wrapper.startup.timeout=300
测试
在yfm01上连接mycat
mysql -uroot -p123456 -P8066 -h192.168.199.103
测试读:初始时master、slave,test库中tab1表有一条why的数据,直接在slave库中将该只修改为why3
[root@yfm02 ~]# mysql -umycat -P8066 -h192.168.199.103 -p
Enter password:
mysql> select * from tab1;
+----+------+
| id | name |
+----+------+
| 1 | why3 |
+----+------+
1 row in set (0.06 sec)
可以看到通过mycat查询出来的值为why3,主库此时值为why
查看mycat日志cat /data/mycat/mycat/logs/mycat.log
2021-01-22 15:50:12.399 DEBUG [$_NIOREACTOR-0-RW] (io.mycat.server.NonBlockingSession.releaseConnection(NonBlockingSession.java:386)) - release connection MySQLConnection@567426715 [id=12, lastTime=1611301812387, user=root, schema=test, old shema=test, borrowed=true, fromSlaveDB=true, threadId=1390, charset=utf8, txIsolation=3, autocommit=true, attachment=dn1{SELECT *
FROM tab1
LIMIT 100}, respHandler=SingleNodeHandler [node=dn1{SELECT *
FROM tab1
LIMIT 100}, packetId=6], host=192.168.199.102, port=3306, statusSync=null, writeQueue=0, modifiedSQLExecuted=false]
读操作都被分派到readhost
测试写:插入一条数据看从库是否同步过去
查看mycat日志cat /data/mycat/mycat/logs/mycat.log
2021-01-22 15:48:00.593 DEBUG [$_NIOREACTOR-0-RW] (io.mycat.backend.datasource.PhysicalDBNode.getConnection(PhysicalDBNode.java:133)) - rrs.getRunOnSlave() default
2021-01-22 15:48:00.601 DEBUG [$_NIOREACTOR-0-RW] (io.mycat.server.NonBlockingSession.releaseConnection(NonBlockingSession.java:386)) - release connection MySQLConnection@1449597107 [id=7, lastTime=1611301680585, user=root, schema=test, old shema=test, borrowed=true, fromSlaveDB=false, threadId=1580, charset=utf8, txIsolation=3, autocommit=true, attachment=dn1{INSERT INTO tab1 (id, name)
VALUES ('2', 'yfm02')}, respHandler=SingleNodeHandler [node=dn1{INSERT INTO tab1 (id, name)
VALUES ('2', 'yfm02')}, packetId=1], host=192.168.199.101, port=3306, statusSync=null, writeQueue=0, modifiedSQLExecuted=true]
可以看到写入操作执行在master,且slave也已经正常同步
server.xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- - - Licensed under the Apache License, Version 2.0 (the "License");
- you may not use this file except in compliance with the License. - You
may obtain a copy of the License at - - http://www.apache.org/licenses/LICENSE-2.0
- - Unless required by applicable law or agreed to in writing, software -
distributed under the License is distributed on an "AS IS" BASIS, - WITHOUT
WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. - See the
License for the specific language governing permissions and - limitations
under the License. -->
<!DOCTYPE mycat:server SYSTEM "server.dtd">
<mycat:server xmlns:mycat="http://io.mycat/">
<system>
<property name="nonePasswordLogin">0</property> <!-- 0为需要密码登陆、1为不需要密码登陆 ,默认为0,设置为1则需要指定默认账户-->
<property name="useHandshakeV10">1</property>
<property name="useSqlStat">0</property> <!-- 1为开启实时统计、0为关闭 -->
<property name="useGlobleTableCheck">0</property> <!-- 1为开启全加班一致性检测、0为关闭 -->
<property name="sqlExecuteTimeout">300</property> <!-- SQL 执行超时 单位:秒-->
<property name="sequnceHandlerType">5</property>
<!--<property name="sequnceHandlerPattern">(?:(\s*next\s+value\s+for\s*MYCATSEQ_(\w+))(,|\)|\s)*)+</property>-->
<!--必须带有MYCATSEQ_或者 mycatseq_进入序列匹配流程 注意MYCATSEQ_有空格的情况-->
<property name="sequnceHandlerPattern">(?:(\s*next\s+value\s+for\s*MYCATSEQ_(\w+))(,|\)|\s)*)+</property>
<property name="subqueryRelationshipCheck">false</property> <!-- 子查询中存在关联查询的情况下,检查关联字段中是否有分片字段 .默认 false -->
<property name="sequenceHanlderClass">io.mycat.route.sequence.handler.HttpIncrSequenceHandler</property>
<!-- <property name="useCompression">1</property>--> <!--1为开启mysql压缩协议-->
<!-- <property name="fakeMySQLVersion">5.6.20</property>--> <!--设置模拟的MySQL版本号-->
<!-- <property name="processorBufferChunk">40960</property> -->
<!--
<property name="processors">1</property>
<property name="processorExecutor">32</property>
-->
<!--默认为type 0: DirectByteBufferPool | type 1 ByteBufferArena | type 2 NettyBufferPool -->
<property name="processorBufferPoolType">0</property>
<!--默认是65535 64K 用于sql解析时最大文本长度 -->
<!--<property name="maxStringLiteralLength">65535</property>-->
<!--<property name="sequnceHandlerType">0</property>-->
<!--<property name="backSocketNoDelay">1</property>-->
<!--<property name="frontSocketNoDelay">1</property>-->
<!--<property name="processorExecutor">16</property>-->
<!--
<property name="serverPort">8066</property> <property name="managerPort">9066</property>
<property name="idleTimeout">300000</property> <property name="bindIp">0.0.0.0</property>
<property name="dataNodeIdleCheckPeriod">300000</property> 5 * 60 * 1000L; //连接空闲检查
<property name="frontWriteQueueSize">4096</property> <property name="processors">32</property> -->
<!--分布式事务开关,0为不过滤分布式事务,1为过滤分布式事务(如果分布式事务内只涉及全局表,则不过滤),2为不过滤分布式事务,但是记录分布式事务日志-->
<property name="handleDistributedTransactions">0</property>
<!--
off heap for merge/order/group/limit 1开启 0关闭
-->
<property name="useOffHeapForMerge">0</property>
<!--
单位为m
-->
<property name="memoryPageSize">64k</property>
<!--
单位为k
-->
<property name="spillsFileBufferSize">1k</property>
<property name="useStreamOutput">0</property>
<!--
单位为m
-->
<property name="systemReserveMemorySize">384m</property>
<!--是否采用zookeeper协调切换 -->
<property name="useZKSwitch">false</property>
<!-- XA Recovery Log日志路径 -->
<!--<property name="XARecoveryLogBaseDir">./</property>-->
<!-- XA Recovery Log日志名称 -->
<!--<property name="XARecoveryLogBaseName">tmlog</property>-->
<!--如果为 true的话 严格遵守隔离级别,不会在仅仅只有select语句的时候在事务中切换连接-->
<property name="strictTxIsolation">false</property>
<property name="useZKSwitch">true</property>
</system>
<!-- 全局SQL防火墙设置 -->
<!--白名单可以使用通配符%或着*-->
<!--例如<host host="127.0.0.*" user="root"/>-->
<!--例如<host host="127.0.*" user="root"/>-->
<!--例如<host host="127.*" user="root"/>-->
<!--例如<host host="1*7.*" user="root"/>-->
<!--这些配置情况下对于127.0.0.1都能以root账户登录-->
<!--
<firewall>
<whitehost>
<host host="1*7.0.0.*" user="root"/>
</whitehost>
<blacklist check="false">
</blacklist>
</firewall>
-->
<user name="mycat" defaultAccount="true">
<property name="password">mycat</property>
<property name="schemas">testDB</property>
<property name="defaultSchema">testDB</property>
<!--No MyCAT Database selected 错误前会尝试使用该schema作为schema,不设置则为null,报错 -->
<!-- 表级 DML 权限设置 -->
<!--
<privileges check="false">
<schema name="TESTDB" dml="0110" >
<table name="tb01" dml="0000"></table>
<table name="tb02" dml="1111"></table>
</schema>
</privileges>
-->
</user>
<user name="user">
<property name="password">user</property>
<property name="schemas">testDB</property>
<property name="readOnly">true</property>
<property name="defaultSchema">testDB</property>
</user>
</mycat:server>
schema.xml
<?xml version="1.0"?>
<!DOCTYPE mycat:schema SYSTEM "schema.dtd">
<mycat:schema xmlns:mycat="http://io.mycat/">
<schema name="testDB" checkSQLschema="true" sqlMaxLimit="100">
<!-- auto sharding by id (long) -->
<!--splitTableNames 启用<table name 属性使用逗号分割配置多个表-->
<table name="tab1" dataNode="dn1" rule="auto-sharding-long" splitTableNames ="true"/>
<!-- <table name="oc_call" primaryKey="ID" dataNode="dn1$0-743" rule="latest-month-calldate"
/> -->
</schema>
<!-- <dataNode name="dn1$0-743" dataHost="localhost1" database="db$0-743"
/> -->
<dataNode name="dn1" dataHost="localhost1" database="test" />
<!--<dataNode name="dn4" dataHost="sequoiadb1" database="SAMPLE" />
<dataNode name="jdbc_dn1" dataHost="jdbchost" database="db1" />
<dataNode name="jdbc_dn2" dataHost="jdbchost" database="db2" />
<dataNode name="jdbc_dn3" dataHost="jdbchost" database="db3" /> -->
<dataHost name="localhost1" maxCon="1000" minCon="10" balance="1"
writeType="0" dbType="mysql" dbDriver="native" switchType="1" slaveThreshold="100">
<heartbeat>select user()</heartbeat>
<!-- can have multi write hosts -->
<writeHost host="hostM1" url="192.168.199.101:3306" user="root"
password="123456">
<readHost host="hostS1" url="192.168.199.102:3306" user="root" password="123456" />
</writeHost>
<!-- <writeHost host="hostM2" url="localhost:3316" user="root" password="123456"/> -->
</dataHost>
<!--
<dataHost name="sequoiadb1" maxCon="1000" minCon="1" balance="0" dbType="sequoiadb" dbDriver="jdbc">
<heartbeat> </heartbeat>
<writeHost host="hostM1" url="sequoiadb://1426587161.dbaas.sequoialab.net:11920/SAMPLE" user="jifeng" password="jifeng"></writeHost>
</dataHost>
<dataHost name="oracle1" maxCon="1000" minCon="1" balance="0" writeType="0" dbType="oracle" dbDriver="jdbc"> <heartbeat>select 1 from dual</heartbeat>
<connectionInitSql>alter session set nls_date_format='yyyy-mm-dd hh24:mi:ss'</connectionInitSql>
<writeHost host="hostM1" url="jdbc:oracle:thin:@127.0.0.1:1521:nange" user="base" password="123456" > </writeHost> </dataHost>
<dataHost name="jdbchost" maxCon="1000" minCon="1" balance="0" writeType="0" dbType="mongodb" dbDriver="jdbc">
<heartbeat>select user()</heartbeat>
<writeHost host="hostM" url="mongodb://192.168.0.99/test" user="admin" password="123456" ></writeHost> </dataHost>
<dataHost name="sparksql" maxCon="1000" minCon="1" balance="0" dbType="spark" dbDriver="jdbc">
<heartbeat> </heartbeat>
<writeHost host="hostM1" url="jdbc:hive2://feng01:10000" user="jifeng" password="jifeng"></writeHost> </dataHost> -->
<!-- <dataHost name="jdbchost" maxCon="1000" minCon="10" balance="0" dbType="mysql"
dbDriver="jdbc"> <heartbeat>select user()</heartbeat> <writeHost host="hostM1"
url="jdbc:mysql://localhost:3306" user="root" password="123456"> </writeHost>
</dataHost> -->
</mycat:schema>
schema配置解析:
writeType:负载均衡类型,目前的取值有 3 种
writeType="0", 所有写操作发送到配置的第一个 writeHost,第一个挂了切到还生存的第二个writeHost,重新启动后已切换后的为准,切换记录在配置文件中:dnindex.properties .
writeType="1",所有写操作都随机的发送到配置的 writeHost。
writeType=“2",没实现。
balance:属性负载均衡类型,目前的取值有 4 种:
balance="0", 不开启读写分离机制,所有读操作都发送到当前可用的writeHost 上。
balance="1",全部的 readHost 与 stand by writeHost 参与 select 语句的负载均衡,简单的说,当双主双从模式(M1 ->S1 , M2->S2,并且 M1 与 M2 互为主备),正常情况下, M2,S1,S2 都参与 select 语句的负载均衡。
balance="2",所有读操作都随机的在 writeHost、 readhost 上分发。
balance="3", 所有读请求随机的分发到 wiriterHost 对应的 readhost 执行,writerHost 不负担读压力,如果从服务器都挂了,过一段延迟之后,自动读主服务器。注意 balance=3 只在 1.4 及其以后版本有, 1.3 没有。
switchType
-1 表示不自动切换
1 默认值,自动切换。
2 基于MySQL 主从同步的状态决定是否切换 心跳语句为 show slave status
3 基于MySQLgalarycluster的切换机制(适合集群)(1.4.1)心跳语句为 show status like ‘wsrep%’
总结:
常见坑位1:此配置在主机宕机后,msyql数据库就瘫痪了。
解决:可以把其中一台从机分配成writeHost,再第一个writeHost(主机)宕机后自动切换至第二个writeHost。
常见坑位2:此时出现了新的问题,读数据可以,写数据也可以。在把第一个wirteHost启动后,新添加的数据并没有保存到主机中而是报存到了第二个writeHost(从机中)。
解决:官网表示讨论了很久,建议重新手工配置主从复制关系。原从配成现主,原主作为现从加入到集群中。
参考:https://blog.csdn.net/zh15732621679/article/details/78816338
这里配置了互为主从
主主复制
节点规划
主机 | ip | 角色 |
---|---|---|
yfm01 | 192.168.199.101 | MASTER、SLAVE |
yfm02 | 192.168.199.102 | SLAVE、MASTER |
MySQL | 5.7.21 | |
CentOS | CentOS Linux release 7.9.2009 (Core) |
192.168.199.101,/etc/my.cnf
[mysqld]
server-id=101
log-bin=mysql-bin
auto-increment-increment=2
auto-increment-offset=1
log-slave-updates
192.168.199.102,/etc/my.cnf
[mysqld]
server-id=102
log-bin=mysql-bin
auto-increment-increment=2
auto-increment-offset=2
log-slave-updates
2台服务器mysql服务重启
添加主从同步账户
在yfm01:
mysql> GRANT REPLICATION SLAVE ON *.* TO 'root'@'192.168.199.102' IDENTIFIED BY '123456';
Query OK, 0 rows affected, 1 warning (0.00 sec)
mysql> flush privileges;
Query OK, 0 rows affected (0.00 sec)
在yfm02:
mysql> GRANT REPLICATION SLAVE ON *.* TO 'root'@'192.168.199.101' IDENTIFIED BY '123456';
Query OK, 0 rows affected, 1 warning (0.00 sec)
mysql> flush privileges;
Query OK, 0 rows affected (0.00 sec)
查看master状态
在yfm01:
mysql> show master status;
+------------------+----------+--------------+------------------+-------------------+
| File | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set |
+------------------+----------+--------------+------------------+-------------------+
| mysql-bin.000003 | 154 | | | |
+------------------+----------+--------------+------------------+-------------------+
1 row in set (0.00 sec)
在yfm02:
mysql> show master status;
+------------------+----------+--------------+------------------+-------------------+
| File | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set |
+------------------+----------+--------------+------------------+-------------------+
| mysql-bin.000001 | 624 | | | |
+------------------+----------+--------------+------------------+-------------------+
1 row in set (0.00 sec)
配置同步信息:
在yfm01上:
mysql> change master to master_host='192.168.199.102',master_port=3306,master_user='root',master_password='123456',master_log_file='mysql-bin.000001',master_log_pos=624;
Query OK, 0 rows affected, 2 warnings (0.01 sec)
mysql> start slave;
Query OK, 0 rows affected (0.01 sec)
mysql> show slave status\G;
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: 192.168.199.102
Master_User: root
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: mysql-bin.000001
Read_Master_Log_Pos: 624
Relay_Log_File: yfm01-relay-bin.000002
Relay_Log_Pos: 320
Relay_Master_Log_File: mysql-bin.000001
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Replicate_Do_DB:
当看到Slave_IO_State:Waiting for master ot send event 、Slave_IO_Running: YES、Slave_SQL_Running: YES才表明状态正常
在yfm02上:
mysql> change master to master_host='192.168.199.101',master_port=3306,master_user='root',master_password='123456',master_log_file='mysql-bin.000003',master_log_pos=154;
出现报错start slave;ERROR 1872 (HY000): Slave failed to initialize relay log info structure from the repository
查看mysql日志
2021-01-21T22:15:40.930581Z 0 [ERROR] Failed to open the relay log './mysql-relay.000005' (relay_log_pos 367).
2021-01-21T22:15:40.930587Z 0 [ERROR] Could not find target log file mentioned in relay log info in the index file './yfm02-relay-bin.index' during relay log initialization.
2021-01-21T22:15:40.930715Z 0 [ERROR] Slave: Failed to initialize the master info structure for channel ''; its record may still be present in 'mysql.slave_master_info' table, consider deleting it.
通过查找资料需要执行reset slave;
mysql> reset slave;
Query OK, 0 rows affected (0.01 sec)
mysql> change master to master_host='192.168.199.101',master_port=3306,master_user='root',master_password='123456',master_log_file='mysql-bin.000003',master_log_pos=154;
Query OK, 0 rows affected, 2 warnings (0.01 sec)
mysql> start slave;
Query OK, 0 rows affected (0.01 sec)
mysql> show slave status\G
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: 192.168.199.101
Master_User: root
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: mysql-bin.000003
Read_Master_Log_Pos: 154
Relay_Log_File: yfm02-relay-bin.000002
Relay_Log_Pos: 320
Relay_Master_Log_File: mysql-bin.000003
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Replicate_Do_DB:
验证测试
分别在yfm01、yfm02上新增或者修改数据看能否同步即可,这里省略了
常用语句
show master status: 查看master的状态,尤其是当前的日志及位置
show slave status 查看slave的状态
reset slave 重置slave状态
start slave 启动slave状态
stop slave 暂停slave状态
MySQL的高可用方案一般有如下几种:
keepalived+双主,MHA,PXC,MMM,Heartbeat+DRBD等,比较常用的是keepalived+双主,MHA和PXC。
keepalived主从 + mysql双主
节点规划
主机 | ip | 角色 |
---|---|---|
yfm01 | 192.168.199.101 | MASTER |
yfm02 | 192.168.199.102 | BACKUP |
keepalived | keepalived-1.3.5-19.el7.x86_64 | |
MySQL | 5.7.21 | |
CentOS | CentOS Linux release 7.9.2009 (Core) |
查看keepalived版本 rpm -qa|grep keepalived
keepalived和mysql的安装参考之前的文章
yfm01,/etc/keepalived/keepalived.conf(下面配置中没有使用lvs的负载均衡功能,所以不需要配置虚拟服务器virtual server)
! Configuration File for keepalived
global_defs {
router_id MASTER
}
vrrp_script chk_mysql_port { #检测mysql服务是否在运行。有很多方式,比如进程,用脚本检测等等
script "/etc/keepalived/chk_mysql.sh" #这里通过脚本监测
interval 2 #脚本执行间隔,每2s检测一次
weight -5 #脚本结果导致的优先级变更,检测失败(脚本返回非0)则优先级 -5
fall 2 #检测连续2次失败才算确定是真失败。会用weight减少优先级(1-255之间)
rise 1 #检测1次成功就算成功。但不修改优先级
}
vrrp_instance VI_1 {
state MASTER
interface ens33 #指定虚拟ip的网卡接口
virtual_router_id 51 #路由器标识,MASTER和BACKUP必须是一致的
priority 101 #定义优先级,数字越大,优先级越高,在同一个vrrp_instance下,MASTER的优先级必须大于BACKUP的优先级。这样MASTER故障恢复后,就可以将VIP资源再次抢回来
advert_int 1
authentication {
auth_type PASS
auth_pass 1111
}
virtual_ipaddress {
192.168.199.99
}
track_script {
chk_mysql_port
}
}
yfm02,/etc/keepalived/keepalived.conf(nopreempt不设置)
! Configuration File for keepalived
global_defs {
router_id BACKUP
}
vrrp_script chk_mysql_port { #检测mysql服务是否在运行。有很多方式,比如进程,用脚本检测等等
script "/etc/keepalived/chk_mysql.sh" #这里通过脚本监测
interval 2 #脚本执行间隔,每2s检测一次
weight -5 #脚本结果导致的优先级变更,检测失败(脚本返回非0)则优先级 -5
fall 2 #检测连续2次失败才算确定是真失败。会用weight减少优先级(1-255之间)
rise 1 #检测1次成功就算成功。但不修改优先级
}
vrrp_instance VI_1 {
state BACKUP
interface ens33 #指定虚拟ip的网卡接口
virtual_router_id 51 #路由器标识,MASTER和BACKUP必须是一致的
priority 101 #定义优先级,数字越大,优先级越高,在同一个vrrp_instance下,MASTER的优先级必须大于BACKUP的优先级。这样MASTER故障恢复后,就可以将VIP资源再次抢回来
advert_int 1
authentication {
auth_type PASS
auth_pass 1111
}
virtual_ipaddress {
192.168.199.99
}
track_script {
chk_mysql_port
}
}
/etc/keepalived/chk_mysql.sh
#!/bin/bash
counter=$(netstat -na|grep "LISTEN"|grep "3306"|wc -l)
if [ "${counter}" -eq 0 ]; then
/etc/init.d/keepalived stop
fi
chmod +x /etc/keepalived/chk_mysql.sh
验证测试
通过Mysql客户端通过VIP连接,看是否连接成功
[root@yfm01 data]# mysql -h192.168.199.99 -uroot -p
Enter password:
mysql> show databases;
+--------------------+
| Database |
+--------------------+
| information_schema |
| mysql |
| performance_schema |
| sys |
| test |
+--------------------+
5 rows in set (0.01 sec)
mysql> use test;
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A
Database changed
mysql> show tables;
+----------------+
| Tables_in_test |
+----------------+
| tab1 |
+----------------+
1 row in set (0.00 sec)
mysql> select * from tab1;
+----+-------+
| id | name |
+----+-------+
| 1 | why |
| 2 | yfm02 |
| 3 | yfm04 |
| 5 | yfm05 |
+----+-------+
4 rows in set (0.00 sec)
默认情况下,vip是在master1上的。使用"ip addr"命令查看vip切换情况
[root@yfm01 data]# ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: ens33: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
link/ether 00:0c:29:7b:30:82 brd ff:ff:ff:ff:ff:ff
inet 192.168.199.101/24 brd 192.168.199.255 scope global noprefixroute ens33
valid_lft forever preferred_lft forever
inet 192.168.199.99/32 scope global ens33
valid_lft forever preferred_lft forever
inet6 fe80::be4f:8aab:e347:6e66/64 scope link tentative noprefixroute dadfailed
valid_lft forever preferred_lft forever
inet6 fe80::fca9:7d90:fd6:d63a/64 scope link noprefixroute
valid_lft forever preferred_lft forever
inet6 fe80::bf02:a10:c932:6672/64 scope link tentative noprefixroute dadfailed
valid_lft forever preferred_lft forever
停止master1机器上的mysql服务,根据配置中的脚本,mysql服务停了,keepalived也会停,从而vip资源将会切换到master2机器上。(mysql服务没有起来的时候,keepalived服务也无法顺利启动!)
[root@yfm01 keepalived]# service mysql stop
Shutting down MySQL........... SUCCESS!
[root@yfm01 keepalived]# ps -ef|grep keepalived
root 8967 6853 0 21:14 pts/2 00:00:00 grep --color=auto keepalived
[root@yfm01 keepalived]# ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: ens33: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
link/ether 00:0c:29:7b:30:82 brd ff:ff:ff:ff:ff:ff
inet 192.168.199.101/24 brd 192.168.199.255 scope global noprefixroute ens33
valid_lft forever preferred_lft forever
inet6 fe80::be4f:8aab:e347:6e66/64 scope link tentative noprefixroute dadfailed
valid_lft forever preferred_lft forever
inet6 fe80::fca9:7d90:fd6:d63a/64 scope link noprefixroute
valid_lft forever preferred_lft forever
inet6 fe80::bf02:a10:c932:6672/64 scope link tentative noprefixroute dadfailed
valid_lft forever preferred_lft forever
如上结果,发现32位子网掩码的vip没有了,说明此时vip资源已不在master1机器上了
再次启动master1的mysql和keepalived服务,会发现vip资源再次从yfm022机器上切换回来了
[root@yfm01 keepalived]# service mysql start
Starting MySQL. SUCCESS!
[root@yfm01 keepalived]# systemctl start keepalived
[root@yfm01 keepalived]# ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: ens33: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
link/ether 00:0c:29:7b:30:82 brd ff:ff:ff:ff:ff:ff
inet 192.168.199.101/24 brd 192.168.199.255 scope global noprefixroute ens33
valid_lft forever preferred_lft forever
inet 192.168.199.99/32 scope global ens33
valid_lft forever preferred_lft forever
inet6 fe80::be4f:8aab:e347:6e66/64 scope link tentative noprefixroute dadfailed
valid_lft forever preferred_lft forever
inet6 fe80::fca9:7d90:fd6:d63a/64 scope link noprefixroute
valid_lft forever preferred_lft forever
inet6 fe80::bf02:a10:c932:6672/64 scope link tentative noprefixroute dadfailed
valid_lft forever preferred_lft forever
优化chk_mysql.sh
上面的mysql监测脚本有点过于简单且粗暴,即脚本一旦监测到Master的mysql服务关闭,就立刻把keepalived服务关闭,从而实现vip转移!
下面对该脚本进行优化,优化后,当监测到Master的mysql服务关闭后,就会将vip切换到Backup上(但此时Master的keepalived服务不会被暴力kill)
当Master的mysql服务恢复后,就会再次将VIP资源切回来!
#!/bin/bash
MYSQL=/usr/local/mysql/bin/mysql
MYSQL_HOST=localhost
MYSQL_USER=root
MYSQL_PASSWORD=123456
CHECK_TIME=3
#mysql is working MYSQL_OK is 1 , mysql down MYSQL_OK is 0
MYSQL_OK=1
function check_mysql_helth (){
$MYSQL -h $MYSQL_HOST -u $MYSQL_USER -p${MYSQL_PASSWORD} -e "show status;" >/dev/null 2>&1
if [ $? = 0 ] ;then
MYSQL_OK=1
else
MYSQL_OK=0
fi
return $MYSQL_OK
}
while [ $CHECK_TIME -ne 0 ]
do
let "CHECK_TIME -= 1"
check_mysql_helth
if [ $MYSQL_OK = 1 ] ; then
CHECK_TIME=0
exit 0
fi
if [ $MYSQL_OK -eq 0 ] && [ $CHECK_TIME -eq 0 ]
then
pkill keepalived
exit 1
fi
sleep 1
done
问题:
问题1:keepalived切换时长:如果关闭keepalived,vip切换在3s以内;如果通过判断优先级,大概在7,8秒(检测脚本时长+确认收到高优先级时长+发送arp通告时长)。
问题2:keepalived切换主从延迟及中断是否需要判断?
问题3:双主单活的读写权限是否要控制下,切换为standby只有读权限?
MHA
节点规划
主机 | ip | 安装软件 | 角色 |
---|---|---|---|
yfm01 | 192.168.199.101 | MySQL、MHA-NODE、MHA-MANAGER | MASTER |
yfm02 | 192.168.199.102 | MySQL、MHA-NODE | SLAVE1 |
yfm04 | 192.168.199.104 | MySQL、MHA-NODE | SLAVE2 |
VIP | 192.168.199.99 | ||
MySQL | 5.7.21 | ||
CentOS | CentOS Linux release 7.9.2009 (Core) |
需要现部署一套一主两从的数据库,可以参考之前的文章
因为之前搭建的yfm01、yfm02主从环境,后面又改成了yfm01、yfm02主主环境,当想把它们重新改为主从环境时,在修改完/etc/my.cnf后,执行主从同步的过程中,主从同步失败了;
这里演示下同步失败后的解决办法
备份yfm01数据
mysqldump -uroot -p --all-databases > mysql.bak.sql
将mysql.bak.sql拷贝到yfm02
scp root@192.168.199.101:/data/mysql/mysql.bak.sql .
停止yfm02从库同步状态
stop slave;
导入数据备份
source /data/mysql/mysql.bak.sql
设置从库同步
CHANGE MASTER TO master_host = '192.168.199.101',master_user = 'root',master_password = '123456',master_log_file = 'mysql-bin.000008',master_log_pos = 970;
重新开启同步
start slave;
半同步配置
yfm01
mysql> install plugin rpl_semi_sync_master soname 'semisync_master.so';
Query OK, 0 rows affected (0.00 sec)
/etc/my.cnf增加如下配置
[mysqld]
rpl_semi_sync_master_enabled = 1
rpl_semi_sync_master_timeout = 1000 # 1 second
yfm02、yfm04
mysql> install plugin rpl_semi_sync_slave soname 'semisync_slave.so';
Query OK, 0 rows affected (0.00 sec)
mysql> install plugin rpl_semi_sync_master soname 'semisync_master.so';
Query OK, 0 rows affected (0.01 sec)
查看是否成功加载
mysql> show plugins;
+----------------------------+----------+--------------------+--------------------+---------+
| Name | Status | Type | Library | License |
+----------------------------+----------+--------------------+--------------------
| rpl_semi_sync_slave | ACTIVE | REPLICATION | semisync_slave.so | GPL |
| rpl_semi_sync_master | ACTIVE | REPLICATION | semisync_master.so | GPL |
+----------------------------+----------+--------------------+--------------------+---------+
46 rows in set (0.00 sec)
/etc/my.cnf增加如下配置
[mysqld]
rpl_semi_sync_slave_enabled = 1
然后重启数据库
重启yfm02、yfm03上的IO线程
mysql> STOP SLAVE IO_THREAD;
Query OK, 0 rows affected (0.00 sec)
mysql> START SLAVE IO_THREAD;
Query OK, 0 rows affected (0.00 sec)
如果没有重启,则默认还是异步复制,重启后,slave会在master上注册为半同步复制的slave角色。
这时候yfm01,/var/log/mysql.log会显示如下日志
2021-01-27T03:43:55.555691Z 7 [Note] While initializing dump thread for slave with UUID <c964d57e-5f25-11eb-b6b9-000c29462169>, found a zombie dump thread with the same UUID. Master is killing the zombie dump thread(3).
2021-01-27T03:43:55.555740Z 7 [Note] Start binlog_dump to master_thread_id(7) slave_server(104), pos(mysql-bin.000002, 362)
2021-01-27T03:43:55.555747Z 7 [Note] Start semi-sync binlog_dump to slave (server_id: 104), pos(mysql-bin.000002, 362)
2021-01-27T03:43:55.555802Z 3 [Note] Stop semi-sync binlog_dump to slave (server_id: 104)
查看半同步是否在运行
MASTER
mysql> show status like 'Rpl_semi_sync_master_status';
+-----------------------------+-------+
| Variable_name | Value |
+-----------------------------+-------+
| Rpl_semi_sync_master_status | ON |
+-----------------------------+-------+
1 row in set (0.00 sec)
SLAVE
mysql> show status like 'Rpl_semi_sync_slave_status';
+----------------------------+-------+
| Variable_name | Value |
+----------------------------+-------+
| Rpl_semi_sync_slave_status | ON |
+----------------------------+-------+
1 row in set (0.00 sec)
yfm01,/etc/my.cnf
# For advice on how to change settings please see
# http://dev.mysql.com/doc/refman/5.7/en/server-configuration-defaults.html
[mysqld]
#
# Remove leading # and set to the amount of RAM for the most important data
# cache in MySQL. Start at 70% of total RAM for dedicated server, else 10%.
# innodb_buffer_pool_size = 128M
#
# Remove leading # to turn on a very important data integrity option: logging
# changes to the binary log between backups.
# log_bin
#
# Remove leading # to set options mainly useful for reporting servers.
# The server defaults are faster for transactions and fast SELECTs.
# Adjust sizes as needed, experiment to find the optimal values.
# join_buffer_size = 128M
# sort_buffer_size = 2M
# read_rnd_buffer_size = 2M
datadir=/var/lib/mysql
socket=/var/lib/mysql/mysql.sock
#开启二进制日志
log-bin=mysql-bin
#设置server-id,建议使用ip最后3位
server-id=101
log_slave_updates = 1 # 从库通过binlog更新的数据写进从库二进制日志中,必加,否则切换后可能丢失数据
skip-name-resolve # 建议加上 非必须项
plugin-load=rpl_semi_sync_master=semisync_master.so
rpl_semi_sync_master_enabled = 1
rpl_semi_sync_master_timeout = 1000 # 1 second
# Disabling symbolic-links is recommended to prevent assorted security risks
symbolic-links=0
log-error=/var/log/mysqld.log
pid-file=/var/run/mysqld/mysqld.pid
yfm02,/etc/my.cnf
# For advice on how to change settings please see
# http://dev.mysql.com/doc/refman/5.7/en/server-configuration-defaults.html
[mysqld]
#
# Remove leading # and set to the amount of RAM for the most important data
# cache in MySQL. Start at 70% of total RAM for dedicated server, else 10%.
# innodb_buffer_pool_size = 128M
#
# Remove leading # to turn on a very important data integrity option: logging
# changes to the binary log between backups.
# log_bin
#
# Remove leading # to set options mainly useful for reporting servers.
# The server defaults are faster for transactions and fast SELECTs.
# Adjust sizes as needed, experiment to find the optimal values.
# join_buffer_size = 128M
# sort_buffer_size = 2M
# read_rnd_buffer_size = 2M
datadir=/var/lib/mysql
socket=/var/lib/mysql/mysql.sock
#开启中继日志
relay-log=mysql-relay
#设置server-id,建议使用ip最后3位
server-id=102
relay_log_purge = 0 # 关闭自动清理中继日志
log_slave_updates = 1 # 从库通过binlog更新的数据写进从库二进制日志中,必加,否则切换后可能丢失数据
skip-name-resolve # 建议加上 非必须项
plugin-load = "rpl_semi_sync_master=semisync_master.so;rpl_semi_sync_slave=semisync_slave.so"
rpl-semi-sync-master-enabled = 1
rpl-semi-sync-slave-enabled = 1
# Disabling symbolic-links is recommended to prevent assorted security risks
symbolic-links=0
log-error=/var/log/mysqld.log
pid-file=/var/run/mysqld/mysqld.pid
配置SSH免登录
ssh-keygen #生成秘钥文件
ssh-copy-id root@192.168.199.101 #将本机公钥复制到远程机器公钥
关闭“ARP查询”功能,不然ping本机ip很慢
echo "1" >/proc/sys/net/ipv4/conf/lo/arp_ignore
echo "2" >/proc/sys/net/ipv4/conf/lo/arp_announce
echo "1" >/proc/sys/net/ipv4/conf/all/arp_ignore
echo "2" >/proc/sys/net/ipv4/conf/all/arp_announce
/sbin/sysctl -p >/dev/null 2>&1
不然ssh root@192.168.199.101很慢,这也可以解决上面的问题
vi /etc/hosts
192.168.199.101 127.0.0.1
安装MHA依赖
yum install perl-ExtUtils-CBuilder perl-ExtUtils-MakeMaker perl-CPAN perl-DBD-MySQL perl-Config-Tiny perl-Log-Dispatch perl-Parallel-ForkManager perl-Time-HiRes perl-DBI mysql-libs perl-Email-Date-Format perl-File-Remove perl-Mail-Sender perl-Mail-Sendmail perl-MIME-Lite perl-MIME-Types perl-Module-Install perl-Module-ScanDeps perl-YAML -y
下载压缩包
mkdir -p /data/mha && cd /data/mha
wget https://github.com/yoshinorim/mha4mysql-node/releases/download/v0.58/mha4mysql-node-0.58.tar.gz
wget https://github.com/yoshinorim/mha4mysql-manager/releases/download/v0.58/mha4mysql-manager-0.58.tar.gz
全部节点部署mha4mysql-node
tar -zxvf mha4mysql-node-0.58.tar.gz
cd mha4mysql-node-0.58/
perl Makefile.PL
make && make install
#执行完成后会得到四个工具
[root@yfm01 mha4mysql-node-0.58]# ls -al /usr/local/bin/
total 13888
-r-xr-xr-x 1 root root 17639 Jan 26 04:00 apply_diff_relay_logs
-r-xr-xr-x 1 root root 4807 Jan 26 04:00 filter_mysqlbinlog
-r-xr-xr-x 1 root root 8337 Jan 26 04:00 purge_relay_logs
-r-xr-xr-x 1 root root 7525 Jan 26 04:00 save_binary_logs
yfm01节点部署mha4mysql-manager
tar -zxvf mha4mysql-manager-0.58.tar.gz
cd mha4mysql-manager-0.58/
perl Makefile.PL
make && make install
#执行完成后会得到manager工具
[root@yfm01 mha4mysql-manager-0.58]# ls -al /usr/local/bin/ | grep masterha
-r-xr-xr-x 1 root root 1995 Jan 26 04:04 masterha_check_repl
-r-xr-xr-x 1 root root 1779 Jan 26 04:04 masterha_check_ssh
-r-xr-xr-x 1 root root 1865 Jan 26 04:04 masterha_check_status
-r-xr-xr-x 1 root root 3201 Jan 26 04:04 masterha_conf_host
-r-xr-xr-x 1 root root 2517 Jan 26 04:04 masterha_manager
-r-xr-xr-x 1 root root 2165 Jan 26 04:04 masterha_master_monitor
-r-xr-xr-x 1 root root 2373 Jan 26 04:04 masterha_master_switch
-r-xr-xr-x 1 root root 5172 Jan 26 04:04 masterha_secondary_check
-r-xr-xr-x 1 root root 1739 Jan 26 04:04 masterha_stop
yfm01节点配置mha
编写配置文件
mkdir /usr/local/mha && cd /usr/local/mha
vi /usr/local/mha/mha.cnf
[server default]
#设置日志位置
manager_log=/usr/local/mha/manager.log
#设置工作目录
manager_workdir=/usr/local/mha
#设置ip自动飘移脚本路径
master_ip_failover_script="/usr/local/mha/scripts/master_ip_failover"
#设置ip手动飘移脚本路径
master_ip_online_change_script="/usr/local/mha/scripts/master_ip_online_change"
#设置监控mysql的用户
user=root
#设置监控用户对应密码
password=123456
#设置ping包发送时间间隔
ping_interval=1
#设置远程mysql切换时binlog保存路径
remote_workdir=/usr/local/mha
#设置主从复制用户
repl_user=root
#主从复制用户密码
repl_password=123456
#发生漂移时报警脚本路径
report_script=""
#检查MySQL主服务器的可用性
secondary_check_script=/usr/local/bin/masterha_secondary_check -s 192.168.199.101 -s 192.168.199.102 -s 192.168.199.104
#发生故障后关闭故障主机的脚本路径
shutdown_script=""
#ssh的用户
ssh_user=root
[server1]
hostname=192.168.199.101
port=3306
[server2]
hostname=192.168.199.102
port=3306
[server3]
hostname=192.168.199.104
port=3306
#设置候选master
candidate_master=1
#设置当slave的relay_logs落后master达到100M时不会选择此slave作为新的master
check_repl_delay=0
编写飘移脚本
mkdir /usr/local/mha/scripts && cd /usr/local/mha/scripts
vi /usr/local/mha/scripts/master_ip_failover
#!/usr/bin/env perl
use strict;
use warnings FATAL => 'all';
use Getopt::Long;
my (
$command, $ssh_user, $orig_master_host, $orig_master_ip,
$orig_master_port, $new_master_host, $new_master_ip, $new_master_port
);
my $vip = '192.168.199.99/24'; # Virtual IP
my $key = "1";
my $int = "ens33";
my $ssh_start_vip = "/sbin/ifconfig $int:$key $vip";
my $ssh_stop_vip = "/sbin/ifconfig $int:$key down";
my $arp_effect = "/sbin/arping -Uq -s192.168.199.99 -I $int 192.168.199.2 -c 3"; # Virtual IP and gatway
GetOptions(
'command=s' => \$command,
'ssh_user=s' => \$ssh_user,
'orig_master_host=s' => \$orig_master_host,
'orig_master_ip=s' => \$orig_master_ip,
'orig_master_port=i' => \$orig_master_port,
'new_master_host=s' => \$new_master_host,
'new_master_ip=s' => \$new_master_ip,
'new_master_port=i' => \$new_master_port,
);
exit &main();
sub main {
print "\n\nIN SCRIPT TEST====$ssh_stop_vip==$ssh_start_vip===\n\n";
if ( $command eq "stop" || $command eq "stopssh" ) {
# $orig_master_host, $orig_master_ip, $orig_master_port are passed.
# If you manage master ip address at global catalog database,
# invalidate orig_master_ip here.
my $exit_code = 1;
eval {
print "Disabling the VIP on old master: $orig_master_host \n";
&stop_vip();
$exit_code = 0;
};
if ($@) {
warn "Got Error: $@\n";
exit $exit_code;
}
exit $exit_code;
}
elsif ( $command eq "start" ) {
# all arguments are passed.
# If you manage master ip address at global catalog database,
# activate new_master_ip here.
# You can also grant write access (create user, set read_only=0, etc) here.
my $exit_code = 10;
eval {
print "Enabling the VIP - $vip on the new master - $new_master_host \n";
&start_vip();
$exit_code = 0;
};
if ($@) {
warn $@;
exit $exit_code;
}
exit $exit_code;
}
elsif ( $command eq "status" ) {
print "Checking the Status of the script.. OK \n";
#`ssh $ssh_user\@cluster1 \" $ssh_start_vip \"`;
exit 0;
}
else {
&usage();
exit 1;
}
}
sub start_vip() {
`ssh $ssh_user\@$new_master_host \" $ssh_start_vip \"`;
}
sub stop_vip() {
return 0 unless ($ssh_user);
`ssh $ssh_user\@$orig_master_host \" $ssh_stop_vip \"`;
}
sub usage {
print
"Usage: master_ip_failover --command=start|stop|stopssh|status --orig_master_host=host --orig_master_ip=ip --orig_master_port=port --new_master_host=host --new_master_ip=ip --new_master_port=port\n";
}
编写手动ip飘移脚本master_ip_online_change
vi /usr/local/mha/scripts/master_ip_online_change
#!/bin/bash
source /root/.bash_profile
vip=`echo '192.168.199.99/24'` #设置VIP
key=`echo '1'`
command=`echo "$1" | awk -F = '{print $2}'`
orig_master_host=`echo "$2" | awk -F = '{print $2}'`
new_master_host=`echo "$7" | awk -F = '{print $2}'`
orig_master_ssh_user=`echo "${12}" | awk -F = '{print $2}'`
new_master_ssh_user=`echo "${13}" | awk -F = '{print $2}'`
#要求服务的网卡识别名一样(本机为ens33)
stop_vip=`echo "ssh root@$orig_master_host /usr/sbin/ifconfig ens33:$key down"`
start_vip=`echo "ssh root@$new_master_host /usr/sbin/ifconfig ens33:$key $vip"`
if [ $command = 'stop' ]
then
echo -e "\n\n\n****************************\n"
echo -e "Disabled thi VIP - $vip on old master: $orig_master_host \n"
$stop_vip
if [ $? -eq 0 ]
then
echo "Disabled the VIP successfully"
else
echo "Disabled the VIP failed"
fi
echo -e "***************************\n\n\n"
fi
if [ $command = 'start' -o $command = 'status' ]
then
echo -e "\n\n\n*************************\n"
echo -e "Enabling the VIP - $vip on new master: $new_master_host \n"
$start_vip
if [ $? -eq 0 ]
then
echo "Enabled the VIP successfully"
else
echo "Enabled the VIP failed"
fi
echo -e "***************************\n\n\n"
fi
为脚本添加可执行权限
chmod +x master_ip_failover
chmod +x master_ip_online_change
启动MHA集群
验证SSH状态
masterha_check_ssh --conf=/usr/local/mha/mha.cnf
验证数据库集群状态
masterha_check_repl --conf=/usr/local/mha/mha.cnf
[root@yfm01 scripts]# masterha_check_repl --conf=/usr/local/mha/mha.cnf
Wed Jan 27 19:34:25 2021 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Wed Jan 27 19:34:25 2021 - [info] Reading application default configuration from /usr/local/mha/mha.cnf..
Wed Jan 27 19:34:25 2021 - [info] Reading server configuration from /usr/local/mha/mha.cnf..
Wed Jan 27 19:34:25 2021 - [info] MHA::MasterMonitor version 0.58.
Wed Jan 27 19:34:25 2021 - [error][/usr/local/share/perl5/MHA/ServerManager.pm, ln301] install_driver(mysql) failed: Attempt to reload DBD/mysql.pm aborted.
Compilation failed in require at (eval 37) line 3.
at /usr/local/share/perl5/MHA/DBHelper.pm line 208.
at /usr/local/share/perl5/MHA/Server.pm line 166.
Wed Jan 27 19:34:25 2021 - [error][/usr/local/share/perl5/MHA/ServerManager.pm, ln301] install_driver(mysql) failed: Attempt to reload DBD/mysql.pm aborted.
Compilation failed in require at (eval 37) line 3.
at /usr/local/share/perl5/MHA/DBHelper.pm line 208.
at /usr/local/share/perl5/MHA/Server.pm line 166.
Wed Jan 27 19:34:25 2021 - [error][/usr/local/share/perl5/MHA/ServerManager.pm, ln301] install_driver(mysql) failed: Attempt to reload DBD/mysql.pm aborted.
Compilation failed in require at (eval 37) line 3.
at /usr/local/share/perl5/MHA/DBHelper.pm line 208.
at /usr/local/share/perl5/MHA/Server.pm line 166.
Wed Jan 27 19:34:26 2021 - [error][/usr/local/share/perl5/MHA/ServerManager.pm, ln309] Got fatal error, stopping operations
Wed Jan 27 19:34:26 2021 - [error][/usr/local/share/perl5/MHA/MasterMonitor.pm, ln427] Error happened on checking configurations. at /usr/local/share/perl5/MHA/MasterMonitor.pm line 329.
Wed Jan 27 19:34:26 2021 - [error][/usr/local/share/perl5/MHA/MasterMonitor.pm, ln525] Error happened on monitoring servers.
Wed Jan 27 19:34:26 2021 - [info] Got exit code 1 (Not master dead).
MySQL Replication Health is NOT OK!
解决办法(本人下载时翻墙了)
yum install -y cpan
cpan -D DBI
cpan DBD::mysql
[root@yfm01 scripts]# cpan DBD::mysql
Reading '/root/.cpan/Metadata'
Database was generated on Wed, 27 Jan 2021 13:55:40 GMT
Running install for module 'DBD::mysql'
Running make for D/DV/DVEEDEN/DBD-mysql-4.050.tar.gz
Checksum for /root/.cpan/sources/authors/id/D/DV/DVEEDEN/DBD-mysql-4.050.tar.gz ok
CPAN.pm: Building D/DV/DVEEDEN/DBD-mysql-4.050.tar.gz
Can't exec "mysql_config": No such file or directory at Makefile.PL line 89.
Cannot find the file 'mysql_config'! Your execution PATH doesn't seem
not contain the path to mysql_config. Resorting to guessed values!
PLEASE NOTE:
For 'make test' to run properly, you must ensure that the
database user 'root' can connect to your MySQL server
and has the proper privileges that these tests require such
as 'drop table', 'create table', 'drop procedure', 'create procedure'
as well as others.
mysql> grant all privileges on test.* to 'root'@'localhost' identified by 's3kr1t';
You can also optionally set the user to run 'make test' with:
perl Makefile.PL --testuser=username
Can't exec "mysql_config": No such file or directory at Makefile.PL line 603.
Can't find mysql_config. Use --mysql_config option to specify where mysql_config is located
Failed to determine directory of mysql.h. Use
perl Makefile.PL --cflags=-I<dir>
to set this directory. For details see DBD::mysql::INSTALL,
section "C Compiler flags" or type
perl Makefile.PL --help
Warning: No success on command[/usr/bin/perl Makefile.PL]
DVEEDEN/DBD-mysql-4.050.tar.gz
/usr/bin/perl Makefile.PL -- NOT OK
Running make test
Make had some problems, won't test
Running make install
Make had some problems, won't install
检查是否包含mysql_config
which mysql_config
whereis mysql_config
yum install -y mysql-devel
然后再次执行cpan DBD::mysql
Wed Jan 27 20:25:42 2021 - [info] Executing command : apply_diff_relay_logs --command=test --slave_user='root' --slave_host=192.168.199.102 --slave_ip=192.168.199.102 --slave_port=3306 --workdir=/usr/local/mha --target_version=5.7.33-log --manager_version=0.58 --relay_log_info=/var/lib/mysql/relay-log.info --relay_dir=/var/lib/mysql/ --slave_pass=xxx
Wed Jan 27 20:25:42 2021 - [info] Connecting to root@192.168.199.102(192.168.199.102:22)..
Checking slave recovery environment settings..
Opening /var/lib/mysql/relay-log.info ... ok.
Relay log found at /var/lib/mysql, up to mysql-relay.000009
Temporary relay log file is /var/lib/mysql/mysql-relay.000009
Checking if super_read_only is defined and turned on.. not present or turned off, ignoring.
Testing mysql connection and privileges..
mysql: [Warning] Using a password on the command line interface can be insecure.
ERROR 1227 (42000) at line 1: Access denied; you need (at least one of) the SUPER privilege(s) for this operation
mysql command failed with rc 1:0!
at /usr/local/bin/apply_diff_relay_logs line 404.
main::check() called at /usr/local/bin/apply_diff_relay_logs line 536
eval {...} called at /usr/local/bin/apply_diff_relay_logs line 514
main::main() called at /usr/local/bin/apply_diff_relay_logs line 121
yfm01
update user set Super_priv='Y' where User='root'
flush privileges
mysql: [Warning] Using a password on the command line interface can be insecure.
ERROR 1142 (42000) at line 1: CREATE command denied to user 'root'@'192.168.199.102' for table 'apply_diff_relay_logs_test'
mysql command failed with rc 1:0!
at /usr/local/bin/apply_diff_relay_logs line 404.
main::check() called at /usr/local/bin/apply_diff_relay_logs line 536
eval {...} called at /usr/local/bin/apply_diff_relay_logs line 514
main::main() called at /usr/local/bin/apply_diff_relay_logs line 121
yfm01
GRANT ALL PRIVILEGES ON *.* TO 'root'@'192.168.199.102'
GRANT ALL PRIVILEGES ON *.* TO 'root'@'192.168.199.104'
yfm01节点 添加虚拟ip
ifconfig ens33:1 192.168.199.99/24
删除虚拟ip
ip addr del 192.168.199.99/24 dev ens33
启动mha进程
nohup masterha_manager --conf=/usr/local/mha/mha.cnf > /tmp/mha_manager.log < /dev/null 2>&1 &
检查MHA集群运行状态
[root@yfm01 scripts]# masterha_check_status --conf=/usr/local/mha/mha.cnf
mha (pid:44432) is running(0:PING_OK), master:192.168.199.101
验证测试
我这里搞了一番之后,现在yfm04是master了
模拟master宕机
systemctl stop mysqld
VIP 会漂移到新的MSATER上
[root@yfm04 ~]# ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: ens33: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
link/ether 00:0c:29:46:21:69 brd ff:ff:ff:ff:ff:ff
inet 192.168.199.104/24 brd 192.168.199.255 scope global noprefixroute ens33
valid_lft forever preferred_lft forever
inet 192.168.199.99/24 brd 192.168.199.255 scope global secondary ens33:1
valid_lft forever preferred_lft forever
inet6 fe80::fca9:7d90:fd6:d63a/64 scope link tentative noprefixroute dadfailed
valid_lft forever preferred_lft forever
inet6 fe80::be4f:8aab:e347:6e66/64 scope link tentative noprefixroute dadfailed
valid_lft forever preferred_lft forever
inet6 fe80::bf02:a10:c932:6672/64 scope link tentative noprefixroute dadfailed
valid_lft forever preferred_lft forever
[root@yfm04 ~]# systemctl stop mysqld
[root@yfm04 ~]# ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: ens33: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
link/ether 00:0c:29:46:21:69 brd ff:ff:ff:ff:ff:ff
inet 192.168.199.104/24 brd 192.168.199.255 scope global noprefixroute ens33
valid_lft forever preferred_lft forever
inet6 fe80::fca9:7d90:fd6:d63a/64 scope link tentative noprefixroute dadfailed
valid_lft forever preferred_lft forever
inet6 fe80::be4f:8aab:e347:6e66/64 scope link tentative noprefixroute dadfailed
valid_lft forever preferred_lft forever
inet6 fe80::bf02:a10:c932:6672/64 scope link tentative noprefixroute dadfailed
valid_lft forever preferred_lft forever
切换过程需要注意的几个问题
1、切换过程会自动把read_only关闭
2.切换之后需要删除手工删除/usr/local/mha/mha.failover.complete,才能进行第二次测试
3、一旦发生切换管理进程将会退出,无法进行再次测试,需将故障数据库加入到MHA环境中来
4.原主节点重新加入到MHA时只能设置为slave
mysql> reset slave;
Query OK, 0 rows affected (0.00 sec)
mysql> CHANGE MASTER TO master_host = '192.168.199.104',master_user = 'root',master_password = '123456',master_log_file = 'mysql-bin.000007',master_log_pos = 154;
Query OK, 0 rows affected, 2 warnings (0.01 sec)
mysql> start slave;
5.关于ip地址的接管有几种方式,这里采用的是MHA自动调用IP别名的方式,好处是在能够保证数据库状态与业务IP切换的一致性。启动管理节点之后 VIP会自动别名到当前主节点上,Keepalived也只能做到对3306的健康检查,但是做不到比如像MySQL复制中的Slave-SQL、Slave-IO进程的检查,容易出现对切换的误判。
6.注意:二级从服务器需要将log_slave_updates打开
7.手工切换需要先定义好master_ip_online_change_script脚本,不然只会切换mysql,IP地址不会绑定上去,可以根据模板来配置该脚本
8.通过设置no_master=1可以让某一个节点永远不成为新的主节点
在线切换master
masterha_master_switch --conf=/usr/local/mha/mha.cnf --master_state=alive --new_master_host=192.168.199.104 --orig_master_is_new_slave --running_updates_limit=10000 --interactive=0
Sat Jan 30 14:28:25 2021 - [error][/usr/local/share/perl5/MHA/MasterRotate.pm, ln143] Getting advisory lock failed on the current master. MHA Monitor runs on the current master. Stop MHA Manager/Monitor and try again.
Sat Jan 30 14:28:25 2021 - [error][/usr/local/share/perl5/MHA/ManagerUtil.pm, ln177] Got ERROR: at /usr/local/bin/masterha_master_switch line 53.
masterha_stop --conf=/usr/local/mha/mha.cnf
Sat Jan 30 14:33:37 2021 - [info] ok.
Sat Jan 30 14:33:37 2021 - [info] 192.168.199.104 can be new master.
Sat Jan 30 14:33:37 2021 - [info]
From:
192.168.199.101(192.168.199.101:3306) (current master)
+--192.168.199.102(192.168.199.102:3306)
+--192.168.199.104(192.168.199.104:3306)
To:
192.168.199.104(192.168.199.104:3306) (new master)
+--192.168.199.102(192.168.199.102:3306)
+--192.168.199.101(192.168.199.101:3306)
Sat Jan 30 14:33:37 2021 - [info] Checking whether 192.168.199.104(192.168.199.104:3306) is ok for the new master..
Sat Jan 30 14:33:37 2021 - [info] ok.
Sat Jan 30 14:33:37 2021 - [info] 192.168.199.101(192.168.199.101:3306): SHOW SLAVE STATUS returned empty result. To check replication filtering rules, temporarily executing CHANGE MASTER to a dummy host.