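MySQL master-master replication works on the same principle as master-slave replication; the only difference is that each node acts as both master and slave.
Environment
Operating system: CentOS 7, 64-bit
MySQL version: MySQL 5.6.33
Node 1 IP: 192.168.1.205, hostname: edu-mysql-01
Node 2 IP: 192.168.1.206, hostname: edu-mysql-02
VIP (virtual IP): 192.168.1.207
In the rest of this article, "VIP" means 192.168.1.207, "node 1" or "205" means 192.168.1.205, and "node 2" or "206" means 192.168.1.206.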
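MySQL high availability here relies on Keepalived, which health-checks the service on multiple nodes and handles failover. Clients connect to a VIP, and Keepalived moves that VIP to a node where the service is actually available, which is what makes the service highly available. Before reading on, you need MySQL and Keepalived installed and MySQL master-master replication configured; my earlier articles cover each step:
1> "MySQL 5.7 Installation and Configuration (YUM)"
2> "MySQL Master-Master Data Synchronization"
3> "Keepalived Installation and Configuration"
This article configures Keepalived in non-preemptive mode.
Official MySQL replication documentation: http://dev.mysql.com/doc/refman/5.6/en/replication.html
Notes:
1> The operating system version and architecture (32/64-bit) should be the same on both servers.
2> The master and slave must run the same MySQL version.
3> The master and slave must start with identical data, ideally empty databases; otherwise back up one node and restore it on the other first.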
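Master-master configuration
Before configuring, install MySQL by following "MySQL 5.7 Installation and Configuration (YUM)" (this article demonstrates 5.6, so change the yum repository in that article to the 5.6 one).
1. Security configuration
1> Firewall
Open the MySQL port (3306 by default):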
shell> vim /etc/sysconfig/iptables
-A INPUT -m state --state NEW -m tcp -p tcp --dport 3306 -j ACCEPT
shell> service iptables restart
Or stop the firewall entirely:
shell> service iptables stop
2> Disable SELinux
shell> vi /etc/selinux/config
SELINUX=disabled
Set SELINUX to disabled; the change takes effect after a reboot.
2. Node 1 configuration (192.168.1.205)
2.1 Add the replication settings
shell> vim /etc/my.cnf
Add the following options to the [mysqld] section:
[mysqld]
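# Server ID; must be unique, commonly derived from the host's IP
server_id=205
# Replication filter: databases whose changes are not written to the binary log (the mysql system database is usually not replicated)
binlog-ignore-db=mysql
# Enable binary logging; the name is arbitrary but should be meaningful (for example, the project name)
log-bin=edu-mysql-bin
# Memory allocated per session to cache binary log events during a transaction
binlog_cache_size=1M
# Replication format (mixed, statement, or row; the default is statement)
binlog_format=mixed
# Number of days after which binary logs are purged automatically. The default, 0, means never.
expire_logs_days=7
## Skip all errors, or specific error types, hit during replication so that the slave does not stop.
## For example, error 1062 is a duplicate key, and error 1032 is caused by the master and slave data being out of sync
slave_skip_errors=1062
# Relay log used when this server acts as a slave
relay_log=edu-mysql-relay-bin
relay-log-index=relay-bin.index
# log_slave_updates makes the slave write replicated events to its own binary log
log_slave_updates=1
# Auto-increment rules, to avoid duplicate IDs when both nodes accept writes
auto_increment_increment=2 # step (each new ID increases by 2)
auto_increment_offset=1 # starting offset (begins at 1), yields odd IDs
replicate-ignore-db=mysql # do not replicate the master's mysql database
skip-slave-start # do not start the replication threads together with the server; useful in production when the master settings must be changed first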
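To see why these two auto-increment settings keep the nodes from generating the same key, the sequences can be reproduced with plain shell arithmetic. This is only an illustrative sketch: the function name `auto_ids` is made up for this example, and it models the simple case where the generated IDs are offset, offset + increment, offset + 2 x increment, and so on.

```shell
#!/bin/sh
# Illustrative only: auto_ids is a hypothetical helper, not a MySQL tool.
# Prints the first COUNT values for a given auto_increment_offset and
# auto_increment_increment.
auto_ids() {
    offset=$1
    increment=$2
    count=$3
    i=0
    out=""
    while [ "$i" -lt "$count" ]; do
        out="$out$((offset + i * increment)) "
        i=$((i + 1))
    done
    # Strip the trailing space before printing.
    echo "${out% }"
}

echo "node 1 (offset=1): $(auto_ids 1 2 5)"
echo "node 2 (offset=2): $(auto_ids 2 2 5)"
```

Node 1 only ever produces odd IDs and node 2 only even ones, so rows inserted on both nodes at the same time cannot collide on an auto-increment key.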
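2.2 Master configuration
# Restart the service first
shell> service mysqld restart
# Log in to MySQL
shell> mysql -uroot -p
# Create the replication user and grant the required privileges (repl may only connect from 192.168.1.206)
mysql> grant replication slave, replication client on *.* to 'repl'@'192.168.1.206' identified by 'root123456';
# Reload the grant tables
mysql> flush privileges;
# Note the binlog File (log file) and Position (offset) values; the slave needs them
mysql> show master status;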
+----------------------+----------+--------------+------------------+-------------------+
| File                 | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set |
+----------------------+----------+--------------+------------------+-------------------+
| edu-mysql-bin.000001 |      120 |              | mysql            |                   |
+----------------------+----------+--------------+------------------+-------------------+
1 row in set (0.00 sec)
2.3 Slave configuration
# master_user and master_password: the user and password created by the grant replication slave... statement run on 206
# master_log_file and master_log_pos: the File and Position values returned by show master status; on 206 (see step 3.2)
mysql> change master to master_host='192.168.1.206',master_user='repl', master_password='root123456', master_port=3306, master_log_file='edu-mysql-bin.000001', master_log_pos=439, master_connect_retry=30;
# Check this node's status as a slave
mysql> show slave status\G
*************************** 1. row ***************************
Slave_IO_State:
Master_Host: 192.168.1.206
Master_User: repl
Master_Port: 3306
Connect_Retry: 30
Master_Log_File: edu-mysql-bin.000001
Read_Master_Log_Pos: 439
Relay_Log_File: edu-mysql-relay-bin.000001
Relay_Log_Pos: 4
Relay_Master_Log_File: edu-mysql-bin.000001
Slave_IO_Running: No
Slave_SQL_Running: No
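# ... remaining fields omitted
The slave has not been started yet, so Slave_IO_State is empty and Slave_IO_Running and Slave_SQL_Running are both No, meaning neither thread is running.
3. Node 2 configuration (192.168.1.206)
3.1 Add the replication settings
shell> vim /etc/my.cnf
Add the following options to the [mysqld] section: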
[mysqld]
server_id=206
binlog-ignore-db=mysql
log-bin=edu-mysql-bin
binlog_cache_size=1M
binlog_format=mixed
expire_logs_days=7
slave_skip_errors=1062
relay_log=edu-mysql-relay-bin
relay-log-index=relay-bin.index
log_slave_updates=1
# IDs start at 2, yielding even numbers
auto_increment_increment=2
auto_increment_offset=2
replicate-ignore-db=mysql
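skip-slave-start # do not start the replication threads together with the server; useful in production when the master settings must be changed first
3.2 Master configuration
# Restart the service first
shell> service mysqld restart
# Log in to MySQL
shell> mysql -uroot -p
# Create the replication user and grant the required privileges (repl may only connect from 192.168.1.205)
mysql> grant replication slave, replication client on *.* to 'repl'@'192.168.1.205' identified by 'root123456';
# Reload the grant tables
mysql> flush privileges;
# Note the binlog File (log file) and Position (offset) values; the slave needs them
mysql> show master status;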
+----------------------+----------+--------------+------------------+-------------------+
| File                 | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set |
+----------------------+----------+--------------+------------------+-------------------+
| edu-mysql-bin.000001 |      439 |              | mysql            |                   |
+----------------------+----------+--------------+------------------+-------------------+
1 row in set (0.00 sec)
At this point the slave on node 1 (205) can be started.
3.3 Slave configuration
# master_log_file and master_log_pos: the File and Position values returned by show master status; on node 205
mysql> change master to master_host='192.168.1.205',master_user='repl', master_password='root123456', master_port=3306, master_log_file='edu-mysql-bin.000001', master_log_pos=120, master_connect_retry=30;
Query OK, 0 rows affected, 2 warnings (0.02 sec)
mysql> show slave status\G
*************************** 1. row ***************************
Slave_IO_State:
Master_Host: 192.168.1.205
Master_User: repl
Master_Port: 3306
Connect_Retry: 30
Master_Log_File: edu-mysql-bin.000001
Read_Master_Log_Pos: 120
Relay_Log_File: edu-mysql-relay-bin.000001
Relay_Log_Pos: 4
Relay_Master_Log_File: edu-mysql-bin.000001
Slave_IO_Running: No
Slave_SQL_Running: No
Replicate_Do_DB:
# ... remaining fields omitted
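The change master to statements above copy File and Position by hand from the other node's show master status output, which is easy to get wrong. As a rough sketch, the statement could instead be generated from the vertical (\G) output. The helper name `build_change_master` is hypothetical, the port and retry values simply mirror the ones used in this article, and the sample input is made up to match the table in step 2.2.

```shell
#!/bin/sh
# Hypothetical helper: build_change_master is not part of MySQL.
# Reads the output of "show master status\G" on stdin and prints a
# CHANGE MASTER TO statement. Arguments: master host, user, password.
build_change_master() {
    awk -v host="$1" -v user="$2" -v pass="$3" -v q="'" '
        $1 == "File:"     { file = $2 }
        $1 == "Position:" { pos = $2 }
        END {
            # Fail instead of printing a statement with missing coordinates.
            if (file == "" || pos == "") exit 1
            s = "change master to master_host=" q host q
            s = s ", master_user=" q user q
            s = s ", master_password=" q pass q
            s = s ", master_port=3306"
            s = s ", master_log_file=" q file q
            s = s ", master_log_pos=" pos
            s = s ", master_connect_retry=30;"
            print s
        }'
}

# Sample input shaped like the \G output on node 1 (values from step 2.2).
sample='             File: edu-mysql-bin.000001
         Position: 120
 Binlog_Ignore_DB: mysql'
printf '%s\n' "$sample" | build_change_master 192.168.1.205 repl root123456
```

On a live system the input would come from something like `mysql -uroot -p -e 'show master status\G'` run on the master, and the printed statement would then be executed on the slave.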
4. Start the slave on both nodes
# On node 1 (205):
mysql> start slave;
Query OK, 0 rows affected (0.01 sec)
mysql> show slave status\G
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: 192.168.1.206
Master_User: repl
Master_Port: 3306
Connect_Retry: 30
Master_Log_File: edu-mysql-bin.000001
Read_Master_Log_Pos: 439
Relay_Log_File: edu-mysql-relay-bin.000002
Relay_Log_Pos: 287
Relay_Master_Log_File: edu-mysql-bin.000001
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Replicate_Do_DB:
... remaining fields omitted
# On node 2 (206):
mysql> start slave;
Query OK, 0 rows affected (0.01 sec)
mysql> show slave status\G
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: 192.168.1.205
Master_User: repl
Master_Port: 3306
Connect_Retry: 30
Master_Log_File: edu-mysql-bin.000001
Read_Master_Log_Pos: 439
Relay_Log_File: edu-mysql-relay-bin.000002
Relay_Log_Pos: 287
Relay_Master_Log_File: edu-mysql-bin.000001
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Replicate_Do_DB:
... remaining fields omitted
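The two fields that matter in the output above are Slave_IO_Running and Slave_SQL_Running; replication is only working when both are Yes. A minimal sketch of an automated check follows. The helper name `replication_ok` is an assumption made for this example, and the input lines in the demo are hand-written stand-ins for real show slave status\G output.

```shell
#!/bin/sh
# Hypothetical helper: replication_ok is not a MySQL command.
# Reads "show slave status\G" output on stdin and succeeds only when
# both the I/O thread and the SQL thread report "Yes".
replication_ok() {
    awk '
        $1 == "Slave_IO_Running:"  { io = $2 }
        $1 == "Slave_SQL_Running:" { sql = $2 }
        END { exit !(io == "Yes" && sql == "Yes") }'
}

# Demo with hand-written input instead of a live server.
printf 'Slave_IO_Running: Yes\nSlave_SQL_Running: Yes\n' | replication_ok \
    && echo "replication healthy"
printf 'Slave_IO_Running: No\nSlave_SQL_Running: No\n' | replication_ok \
    || echo "replication not running"
```

Against a real node, the input could be piped from `mysql -uroot -p -e 'show slave status\G'`.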
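5. Verification
# Log in to 205, then create a database, a table, and one row
shell> mysql -u root -p
mysql> create database if not exists mydb default character set utf8 collate utf8_general_ci;
mysql> use mydb;
mysql> create table user (id int, username varchar(30), password varchar(30));
mysql> insert into user values (1, 'yangxin', '123456');
# The following steps are performed on node 206
# 1. Log in to 206 and list all databases; mydb should be present
# 2. Switch to mydb; the user table should exist and contain one row
# 3. Insert a row into mydb.user on 206 and check that it is replicated to 205
mysql> insert into user values (2,'yangxin2','123456');
The full process is shown in the figure below: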
Implementing HA
Node 1 configuration
shell> vi /etc/keepalived/keepalived.conf
! Configuration File for keepalived
global_defs {
    router_id db-01
}
vrrp_instance VI_1 {
    state BACKUP # both nodes start as BACKUP; priority decides which one becomes MASTER
    interface enp0s3
    virtual_router_id 51
    priority 100
    advert_int 1
    nopreempt # non-preemptive mode
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    # virtual IP pool
    virtual_ipaddress {
        192.168.1.207
    }
}
virtual_server 192.168.1.207 3306 {
    delay_loop 2
    lb_algo wrr
    lb_kind DR
    persistence_timeout 60
    protocol TCP
    real_server 192.168.1.205 3306 {
        weight 3
        notify_down /etc/keepalived/mysql.sh # script run once the MySQL service is detected as down
        TCP_CHECK {
            connect_timeout 10 # MySQL connection timeout (seconds)
            nb_get_retry 3 # number of retries after a failed connection
            delay_before_retry 3 # wait 3 seconds before each retry
            connect_port 3306
        }
    }
}
Contents of /etc/keepalived/mysql.sh:
#!/bin/sh
pkill keepalived
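When the MySQL service stops, Keepalived runs /etc/keepalived/mysql.sh, which stops Keepalived on the current node. The VIP then moves to the other node, which keeps the service available.
Node 2 configuration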
shell> vi /etc/keepalived/keepalived.conf
! Configuration File for keepalived
global_defs {
    router_id db-02
}
vrrp_instance VI_1 {
    state BACKUP
    interface enp0s3
    virtual_router_id 51
    priority 90
    advert_int 1
    nopreempt
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    virtual_ipaddress {
        192.168.1.207
    }
}
virtual_server 192.168.1.207 3306 {
    delay_loop 2
    lb_algo wrr
    lb_kind DR
    persistence_timeout 60
    protocol TCP
    real_server 192.168.1.206 3306 {
        weight 3
        notify_down /etc/keepalived/mysql.sh
        TCP_CHECK {
            connect_timeout 10
            nb_get_retry 3
            delay_before_retry 3
            connect_port 3306
        }
    }
}
Contents of /etc/keepalived/mysql.sh:
#!/bin/sh
pkill keepalived
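Start the MySQL and Keepalived services
shell> service mysqld start
shell> service keepalived start
Note: start MySQL first, then Keepalived. Once Keepalived is running it connects to MySQL to check that the service is available;
if the check still fails after the configured retries, the notify_down script kills the keepalived process.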
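Connection test
Note: connect to the VIP, not directly to a real MySQL server.
First check which server currently holds the VIP:
As the figure above shows, the VIP is currently on node 1 (192.168.1.205).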
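The same check can be made from a terminal by looking for the VIP among the addresses bound to the interface. The sketch below assumes the interface is enp0s3, as in the Keepalived configuration; the helper name `holds_vip` and the sample `ip` output are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: holds_vip is not part of Keepalived.
# Reads "ip -4 addr show" output on stdin and succeeds if the given
# address is bound to the interface.
holds_vip() {
    awk -v vip="$1" '
        $1 == "inet" {
            # $2 looks like 192.168.1.207/32; compare the part before "/".
            split($2, parts, "/")
            if (parts[1] == vip) found = 1
        }
        END { exit !found }'
}

# On a real node: ip -4 addr show dev enp0s3 | holds_vip 192.168.1.207
# Demo with made-up output shaped like what node 1 might print:
sample='2: enp0s3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500
    inet 192.168.1.205/24 brd 192.168.1.255 scope global enp0s3
    inet 192.168.1.207/32 scope global enp0s3'
if printf '%s\n' "$sample" | holds_vip 192.168.1.207; then
    echo "this node holds the VIP"
else
    echo "the VIP is on the other node"
fi
```

Running the check on both nodes should report the VIP on exactly one of them.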
shell> mysql -u root -proot -h 192.168.1.207
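Note: to connect to MySQL through a specific IP, an account must first be granted remote access with the appropriate user, password, and host. For example: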
grant all privileges on *.* to 'root'@'%' identified by 'root' with grant option;
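Failover test
The connection test above showed the VIP on 192.168.1.205, so a connection through the VIP (192.168.1.207) actually reaches 192.168.1.205. Stop the MySQL service on 205 and watch whether the VIP moves to 206.
The Keepalived failover messages can be seen in /var/log/messages.
Node 1 log: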
Sep 25 16:00:32 edu-mysql-01 Keepalived_healthcheckers[3517]: TCP connection to [192.168.1.205]:3306 failed.
Sep 25 16:00:35 edu-mysql-01 Keepalived_healthcheckers[3517]: TCP connection to [192.168.1.205]:3306 failed.
Sep 25 16:00:35 edu-mysql-01 Keepalived_healthcheckers[3517]: Check on service [192.168.1.205]:3306 failed after 1 retry.
Sep 25 16:00:35 edu-mysql-01 Keepalived_healthcheckers[3517]: Removing service [192.168.1.205]:3306 from VS [192.168.1.207]:3306
Sep 25 16:00:35 edu-mysql-01 Keepalived_healthcheckers[3517]: Executing [/etc/keepalived/mysql.sh] for service [192.168.1.205]:3306 in VS [192.168.1.207]:3306
Sep 25 16:00:35 edu-mysql-01 Keepalived_healthcheckers[3517]: Lost quorum 1-0=1 > 0 for VS [192.168.1.207]:3306
Sep 25 16:00:35 edu-mysql-01 kernel: IPVS: __ip_vs_del_service: enter
Sep 25 16:00:35 edu-mysql-01 Keepalived[3515]: Stopping
Sep 25 16:00:35 edu-mysql-01 Keepalived_healthcheckers[3517]: Stopped
Sep 25 16:00:35 edu-mysql-01 Keepalived_vrrp[3518]: VRRP_Instance(VI_1) sent 0 priority
Sep 25 16:00:35 edu-mysql-01 Keepalived_vrrp[3518]: VRRP_Instance(VI_1) removing protocol VIPs.
Sep 25 16:00:36 edu-mysql-01 Keepalived_vrrp[3518]: Stopped
Sep 25 16:00:36 edu-mysql-01 Keepalived[3515]: Stopped Keepalived v1.2.23 (09/12,2016)
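The first three lines are Keepalived's health check against MySQL: the TCP connection fails, the retry 3 seconds later (delay_before_retry) fails as well, and the check is marked as failed.
Keepalived then removes the service from the LVS real-server list (line 4) and runs /etc/keepalived/mysql.sh (line 5),
which stops the Keepalived service and gives up the VIP.
Node 2 log: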
Sep 25 16:00:36 edu-mysql-02 Keepalived_vrrp[3457]: VRRP_Instance(VI_1) Transition to MASTER STATE
Sep 25 16:00:37 edu-mysql-02 Keepalived_vrrp[3457]: VRRP_Instance(VI_1) Entering MASTER STATE
Sep 25 16:00:37 edu-mysql-02 Keepalived_vrrp[3457]: VRRP_Instance(VI_1) setting protocol VIPs.
Sep 25 16:00:37 edu-mysql-02 Keepalived_vrrp[3457]: Sending gratuitous ARP on enp0s3 for 192.168.1.207
Sep 25 16:00:37 edu-mysql-02 Keepalived_vrrp[3457]: VRRP_Instance(VI_1) Sending/queueing gratuitous ARPs on enp0s3 for 192.168.1.207
Sep 25 16:00:37 edu-mysql-02 Keepalived_vrrp[3457]: Sending gratuitous ARP on enp0s3 for 192.168.1.207
Sep 25 16:00:37 edu-mysql-02 Keepalived_vrrp[3457]: Sending gratuitous ARP on enp0s3 for 192.168.1.207
Sep 25 16:00:37 edu-mysql-02 Keepalived_vrrp[3457]: Sending gratuitous ARP on enp0s3 for 192.168.1.207
Sep 25 16:00:37 edu-mysql-02 Keepalived_vrrp[3457]: Sending gratuitous ARP on enp0s3 for 192.168.1.207
Sep 25 16:00:37 edu-mysql-02 Keepalived_healthcheckers[3456]: Netlink reflector reports IP 192.168.1.207 added
Sep 25 16:00:42 edu-mysql-02 Keepalived_vrrp[3457]: Sending gratuitous ARP on enp0s3 for 192.168.1.207
Sep 25 16:00:42 edu-mysql-02 Keepalived_vrrp[3457]: VRRP_Instance(VI_1) Sending/queueing gratuitous ARPs on enp0s3 for 192.168.1.207
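Node 2 was in the BACKUP state. Once it learned that node 1 had gone down, it transitioned to MASTER (line 1) and then bound the VIP (192.168.1.207) to the enp0s3 interface (lines 3 to 5).
Check the VIP after failover
As the figure above shows, the VIP has moved from 205 to 206.
A MySQL client that was already connected through the VIP will see its first SQL statement fail, because the old connection has been dropped. On the second statement it reconnects through the VIP to the new real MySQL server. The figure below shows this behaviour: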
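Scripts that talk to the VIP can absorb that first failed statement by retrying. The following is a minimal sketch of such a wrapper; `retry` and `flaky` are names invented for this example, and `flaky` merely simulates a command that fails once and then succeeds, the way a statement issued right after failover would.

```shell
#!/bin/sh
# Hypothetical helper: retry is not part of the mysql client.
# Usage: retry MAX DELAY command [args...]
# Runs the command up to MAX times, sleeping DELAY seconds between attempts.
retry() {
    max=$1
    delay=$2
    shift 2
    attempt=1
    while :; do
        "$@" && return 0
        [ "$attempt" -ge "$max" ] && return 1
        attempt=$((attempt + 1))
        sleep "$delay"
    done
}

# Stand-in for a statement that fails on the first call after failover.
calls=0
flaky() {
    calls=$((calls + 1))
    [ "$calls" -ge 2 ]
}

retry 3 0 flaky && echo "succeeded after $calls attempts"
```

With a real client the call might look like `retry 3 1 mysql -u root -proot -h 192.168.1.207 -e 'select 1'`. Retrying blindly is only safe for statements that can be repeated without side effects.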