我们通常说的双机热备是指两台机器都在运行,但并不是两台机器都同时在提供服务。当提供服务的一台出现故障的时候,另外一台会马上自动接管并且提供服务,而且切换的时间非常短。MySQL双主复制,即互为Master-Slave(只有一个Master提供写操作),可以实现数据库服务器的热备,但是一个Master宕机后不能实现动态切换。使用Keepalived,可以通过虚拟IP,实现双主对外的统一接口以及自动检查、失败切换机制,从而实现MySQL数据库的高可用方案。之前梳理了Mysql主从/主主同步,下面说下Mysql+keeoalived双主热备高可用方案的实施。
Keepalived看名字就知道,保持存活,在网络里面就是保持在线了,也就是所谓的高可用或热备,用来防止单点故障(单点故障是指一旦某一点出现故障就会导
整个系统架构的不可用)的发生,那说到keepalived不得不说的一个协议不是VRRP协议,可以说这个协议就是keepalived实现的基础。
1)Keepalived的工作原理是VRRP(Virtual Router Redundancy Protocol)虚拟路由冗余协议。在VRRP中有两组重要的概念:VRRP路由器和虚拟路由器,主控路由器和备份路由器。
2)VRRP路由器是指运行VRRP的路由器,是物理实体,虚拟路由器是指VRRP协议创建的,是逻辑概念。一组VRRP路由器协同工作,共同构成一台虚拟路由器。
Vrrp中存在着一种选举机制,用以选出提供服务的路由即主控路由,其他的则成了备份路由。当主控路由失效后,备份路由中会重新选举出一个主控路由,来继
续工作,来保障不间断服务。
过多内容在这里就不做详细介绍了,下面详细记录下Mysql+Keepalived双主热备的高可用方案的操作记录
1)先实施Master->Slave的主主同步。主主是数据双向同步,主从是数据单向同步。一般情况下,主库宕机后,需要手动将连接切换到从库上。(但是用keepalived就可以自动切换)
2)再结合Keepalived的使用,通过VIP实现Mysql双主对外连接的统一接口。即客户端通过Vip连接数据库;当其中一台宕机后,VIP会漂移到另一台上,这个过程对于客户端的数据连接来说几乎无感觉,从而实现高可用。
Centos7.5版本
dgp:192.168.110.7 安装mysql和keepalived
dgs:192.168.110.8 安装mysql和keepalived
vip:192.168.110.9
mysql版本:MySQL 5.7.21-log
keepalived版本:keepalived-2.0.18.tar.gz
要实现主主同步,可以先实现主从同步,即dgp->dgs的主从同步,然后dgs->dgp的主从同步.
这样,双方就完成了主主同步。
注意下面几点:
1)要保证同步服务期间之间的网络联通。即能相互ping通,能使用对方授权信息连接到对方数据库(防火墙开放3306端口)。
2)关闭selinux。
3)同步前,双方数据库中需要同步的数据要保持一致。这样,同步环境实现后,再次更新的数据就会如期同步了。
4) keepalived必须使用root编译安装
一、Mysql主主同步环境部署
---------------dgp服务器操作记录---------------
在my.cnf文件的[mysqld]配置区域添加下面内容:
[root@dgp keepalived-2.0.18]# cat /etc/my.cnf
[mysqld]
datadir=/var/lib/mysql
socket=/var/lib/mysql/mysql.sock
symbolic-links=0
log-error=/var/log/mysqld.log
pid-file=/var/run/mysqld/mysqld.pid
server-id = 1
log-bin = mysql-bin
sync_binlog = 1
binlog_checksum = none
binlog_format = mixed
auto-increment-increment = 2
auto-increment-offset = 1
slave-skip-errors = all
#skip-grant-tables
数据同步授权(iptables防火墙开启3306端口)这样I/O线程就可以以这个用户的身份连接到主服务器,并且读取它的二进制日志。
mysql> grant replication slave,replication client on *.* to slave@'192.168.%' identified by "slave#123";
Query OK, 0 rows affected (0.00 sec)
mysql> flush privileges;
Query OK, 0 rows affected (0.00 sec)
最好将库锁住,仅仅允许读,以保证数据一致性;待主主同步环境部署后再解锁;
锁住后,就不能往表里写数据,但是重启mysql服务后就会自动解锁!
mysql> flush tables with read lock; //注意该参数设置后,如果自己同步对方数据,同步前一定要记得先解锁!
Query OK, 0 rows affected (0.00 sec)
查看下log bin日志和pos值位置
mysql> show master status;
+------------------+----------+--------------+------------------+-------------------+
| File | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set |
+------------------+----------+--------------+------------------+-------------------+
| mysql-bin.000006 | 150 | | | |
+------------------+----------+--------------+------------------+-------------------+
1 row in set (0.00 sec)
---------------dgs服务器操作记录---------------
在my.cnf文件的[mysqld]配置区域添加下面内容:
[root@dgs src]# cat /etc/my.cnf
[mysqld]
datadir=/var/lib/mysql
socket=/var/lib/mysql/mysql.sock
# Disabling symbolic-links is recommended to prevent assorted security risks
symbolic-links=0
log-error=/var/log/mysqld.log
pid-file=/var/run/mysqld/mysqld.pid
#主2节点
server-id = 2
log-bin = mysql-bin
sync_binlog = 1
binlog_checksum = none
binlog_format = mixed
auto-increment-increment = 2
auto-increment-offset = 2
slave-skip-errors = all
[root@master2 ~]# /etc/init.d/mysql restart
Shutting down MySQL.. SUCCESS!
Starting MySQL.. SUCCESS!
mysql> grant replication slave,replication client on *.* to slave@'192.168.%' identified by "slave#123";
Query OK, 0 rows affected (0.00 sec)
mysql> flush privileges;
Query OK, 0 rows affected (0.00 sec)
mysql> flush tables with read lock;
Query OK, 0 rows affected (0.00 sec)
mysql> show master status;
+------------------+----------+--------------+------------------+-------------------+
| File | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set |
+------------------+----------+--------------+------------------+-------------------+
| mysql-bin.000001 | 2846 | | | |
+------------------+----------+--------------+------------------+-------------------+
1 row in set (0.00 sec)
---------------dgp服务器做同步操作---------------
mysql> unlock tables; //先解锁,将对方数据同步到自己的数据库中
mysql> slave stop;
mysql> CHANGE MASTER TO MASTER_HOST='192.168.110.8', MASTER_USER='slave', MASTER_PASSWORD='Slave#123', MASTER_PORT=3306, MASTER_LOG_FILE='mysql-bin.000001', MASTER_LOG_POS=2846, MASTER_CONNECT_RETRY=10;
Query OK, 0 rows affected, 2 warnings (0.01 sec)
mysql> start slave;
Query OK, 0 rows affected (0.01 sec)
查看同步状态,如下出现两个“Yes”,表明同步成功!
这样,dgp就和dgs实现了主从同步,即dgp同步dgs的数据。
---------------dgs服务器做同步操作---------------
mysql> unlock tables; //先解锁,将对方数据同步到自己的数据库中
mysql> slave stop;
mysql> CHANGE MASTER TO MASTER_HOST='192.168.110.7', MASTER_USER='slave', MASTER_PASSWORD='slave123', MASTER_PORT=3306, MASTER_LOG_FILE='mysql-bin.000006', MASTER_LOG_POS=150, MASTER_CONNECT_RETRY=10;
Query OK, 0 rows affected, 2 warnings (0.06 sec)
mysql> start slave;
Query OK, 0 rows affected (0.01 sec)
这样,dgs就和dgp实现了主从同步,即dgp也同步dgs的数据。
以上表明双方已经实现了mysql主主同步。
当运行一段时间后,要是发现同步有问题,比如只能单向同步,双向同步失效。可以重新执行下上面的change master同步操作,只不过这样同步后,只能同步在此之后的更新数据。下面开始进行数据验证:
(略略略)验证过程略,可通过互相创建库,创建表,创建索引,插入数据,更新数据等一系列操作相互验证。
至此,Mysql主主同步环境已经实现。
二、配置Mysql+Keepalived故障转移的高可用环境
1)安装keepalived并将其配置成系统服务。master1和master2两台机器上同样进行如下操作:
tar -zxvf keepalived-2.0.18.tar.gz
[root@dgp keepalived-2.0.18]# cd keepalived-2.0.18
[root@dgp keepalived-2.0.18]# ./configure --prefix=/usr/local/keepalived
[root@dgp keepalived-2.0.18]# make && make install
[root@dgp keepalived-2.0.18]# cp keepalived/etc/init.d/keepalived /etc/rc.d/init.d/
[root@dgp keepalived-2.0.18]# cp /usr/local/keepalived/etc/sysconfig/keepalived /etc/sysconfig/
[root@dgp keepalived-2.0.18]# mkdir /etc/keepalived
[root@dgp keepalived-2.0.18]# cp /usr/local/keepalived/etc/keepalived/keepalived.conf /etc/keepalived/
[root@dgp keepalived-2.0.18]# cp /usr/local/keepalived/sbin/keepalived /usr/sbin/
[root@dgp keepalived-2.0.18]# echo "/etc/init.d/keepalived start" >> /etc/rc.local
[root@dgp keepalived-2.0.18]# cp /etc/keepalived/keepalived.conf /etc/keepalived/keepalived.conf.bak
vi /etc/keepalived/keepalived.conf
! Configuration File for keepalived
global_defs {
notification_email {
ops@wangshibo.cn
tech@wangshibo.cn
}
notification_email_from ops@wangshibo.cn
smtp_server 127.0.0.1
smtp_connect_timeout 30
router_id MASTER-HA
}
vrrp_script chk_mysql_port { #检测mysql服务是否在运行。有很多方式,比如进程,用脚本检测等等
script "/root/chk_mysql.sh" #这里通过脚本监测
interval 2 #脚本执行间隔,每2s检测一次
weight -5 #脚本结果导致的优先级变更,检测失败(脚本返回非0)则优先级 -5
fall 2 #检测连续2次失败才算确定是真失败。会用weight减少优先级(1-255之间)
rise 1 #检测1次成功就算成功。但不修改优先级
}
vrrp_instance VI_1 {
state MASTER
interface eth1 #指定虚拟ip的网卡接口
mcast_src_ip 192.168.110.7
virtual_router_id 51 #路由器标识,MASTER和BACKUP必须是一致的
priority 101 #定义优先级,数字越大,优先级越高,在同一个vrrp_instance下,MASTER的优先级必须大于BACKUP的优先级。这样MASTER故障恢复后,就可以将VIP资源再次抢回来
advert_int 1
authentication {
auth_type PASS
auth_pass 1111
}
virtual_ipaddress {
192.168.110.9
}
track_script {
chk_mysql_port
}
}
编写切换脚本。KeepAlived做心跳检测,如果dgp的MySQL服务挂了(3306端口挂了),那么它就会选择自杀。Slave的KeepAlived通过心跳检测发现这个情况,就会将VIP的请求接管
[root@dgp keepalived-2.0.18]# vi /root/chk_mysql.sh
#!/bin/bash
#下面的mysql监测脚本有点过于简单且粗暴,即脚本一旦监测到Master的mysql服务关闭,就立刻把keepalived服务关闭,从而实现vip转移!
#counter=$(netstat -na|grep "LISTEN"|grep "3306"|wc -l)
#if [ "${counter}" -eq 0 ]; then
# /etc/init.d/keepalived stop
#fi
#下面对该脚本进行优化,优化后,当监测到Master的mysql服务关闭后,就会将vip切换到Backup上(但此时Master的keepalived服务不会被暴力kill)
#当Master的mysql服务恢复后,就会再次将VIP资源切回来!
MYSQL=/usr/bin/mysql
MYSQL_HOST=192.168.110.7
MYSQL_USER=root
MYSQL_PASSWORD=Infore#123
CHECK_TIME=3
#mysql is working MYSQL_OK is 1 , mysql down MYSQL_OK is 0
MYSQL_OK=1
function check_mysql_helth (){
$MYSQL -h $MYSQL_HOST -u $MYSQL_USER -p${MYSQL_PASSWORD} -e "show status;" >/dev/null 2>&1
if [ $? = 0 ] ;then
MYSQL_OK=1
else
MYSQL_OK=0
fi
return $MYSQL_OK
}
while [ $CHECK_TIME -ne 0 ]
do
let "CHECK_TIME -= 1"
check_mysql_helth
if [ $MYSQL_OK = 1 ] ; then
CHECK_TIME=0
exit 0
fi
if [ $MYSQL_OK -eq 0 ] && [ $CHECK_TIME -eq 0 ]
then
pkill keepalived
exit 1
fi
sleep 1
done
[root@dgp keepalived-2.0.18]# chmod 755 /root/chk_mysql.sh
启动keepalived服务
[root@dgp keepalived-2.0.18]# /etc/init.d/keepalived start
正在启动 keepalived: [确定]
[root@dgp keepalived-2.0.18]# ps -ef|grep keepalived
root 9230 4086 0 14:23 pts/1 00:00:00 grep keepalived
root 31213 1 0 11:41 ? 00:00:00 keepalived -D
root 31214 31213 0 11:41 ? 00:00:02 keepalived -D
dgs机器上的keepalived配置。dgs机器上的keepalived.conf文件只修改priority为90、nopreempt不设置、real_server设置本地IP。
[root@dgs keepalived-2.0.18]# vi /etc/keepalived/keepalived.conf
! Configuration File for keepalived
global_defs {
notification_email {
ops@wangshibo.cn
tech@wangshibo.cn
}
notification_email_from ops@wangshibo.cn
smtp_server 127.0.0.1
smtp_connect_timeout 30
router_id MASTER-HA
}
vrrp_script chk_mysql_port { #检测mysql服务是否在运行。有很多方式,比如进程,用脚本检测等等
script "/root/chk_mysql.sh" #这里通过脚本监测
interval 2 #脚本执行间隔,每2s检测一次
weight -5 #脚本结果导致的优先级变更,检测失败(脚本返回非0)则优先级 -5
fall 2 #检测连续2次失败才算确定是真失败。会用weight减少优先级(1-255之间)
rise 1 #检测1次成功就算成功。但不修改优先级
}
vrrp_instance VI_1 {
state BACKUP
interface enp0s3 #指定虚拟ip的网卡接口
mcast_src_ip 192.168.110.8
virtual_router_id 51 #路由器标识,MASTER和BACKUP必须是一致的
priority 99 #定义优先级,数字越大,优先级越高,在同一个vrrp_instance下,MASTER的优先级必须大于BACKUP的优先级。这样MASTER故障恢复后,就可以将VIP资源再次抢回来
advert_int 1
authentication {
auth_type PASS
auth_pass 1111
}
virtual_ipaddress {
192.168.110.9
}
track_script {
chk_mysql_port
}
}
[root@dgs keepalived-2.0.18]# cat /root/chk_mysql.sh
#!/bin/bash
#下面的mysql监测脚本有点过于简单且粗暴,即脚本一旦监测到Master的mysql服务关闭,就立刻把keepalived服务关闭,从而实现vip转移!
#counter=$(netstat -na|grep "LISTEN"|grep "3306"|wc -l)
#if [ "${counter}" -eq 0 ]; then
# /etc/init.d/keepalived stop
#fi
#下面对该脚本进行优化,优化后,当监测到Master的mysql服务关闭后,就会将vip切换到Backup上(但此时Master的keepalived服务不会被暴力kill)
#当Master的mysql服务恢复后,就会再次将VIP资源切回来!
MYSQL=/usr/bin/mysql
MYSQL_HOST=192.168.110.8
MYSQL_USER=root
MYSQL_PASSWORD=Infore#123
CHECK_TIME=3
#mysql is working MYSQL_OK is 1 , mysql down MYSQL_OK is 0
MYSQL_OK=1
function check_mysql_helth (){
$MYSQL -h $MYSQL_HOST -u $MYSQL_USER -p${MYSQL_PASSWORD} -e "show status;" >/dev/null 2>&1
if [ $? = 0 ] ;then
MYSQL_OK=1
else
MYSQL_OK=0
fi
return $MYSQL_OK
}
while [ $CHECK_TIME -ne 0 ]
do
let "CHECK_TIME -= 1"
check_mysql_helth
if [ $MYSQL_OK = 1 ] ; then
CHECK_TIME=0
exit 0
fi
if [ $MYSQL_OK -eq 0 ] && [ $CHECK_TIME -eq 0 ]
then
pkill keepalived
exit 1
fi
sleep 1
done
[root@dgs ~]# chmod 755 /root/chk_mysql.sh
[root@dgs ~]# /etc/init.d/keepalived start
正在启动 keepalived: [确定]
[root@dgs keepalived-2.0.18]# ps -ef|grep keepalived
root 26969 1 0 10:50 ? 00:00:00 /usr/local/keepalived/sbin/keepalived -D
root 26970 26969 0 10:50 ? 00:00:03 /usr/local/keepalived/sbin/keepalived -D
至此,MySQL+keepalived 双主配置完成
三、相互验证
[root@dgp ~]# ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
link/ether 08:00:27:86:8e:3f brd ff:ff:ff:ff:ff:ff
inet 192.168.110.7/24 brd 192.168.110.255 scope global eth1
inet 192.168.110.9/32 scope global eth1
inet6 fe80::a00:27ff:fe86:8e3f/64 scope link
valid_lft forever preferred_lft forever
[root@dgs ~]# ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: enp0s3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
link/ether 08:00:27:e7:1b:7c brd ff:ff:ff:ff:ff:ff
inet 192.168.110.8/24 brd 192.168.110.255 scope global noprefixroute enp0s3
valid_lft forever preferred_lft forever
inet6 fe80::f36b:da05:2965:447e/64 scope link noprefixroute
valid_lft forever preferred_lft forever
模拟主节点一dgp MySQL服务挂掉
[root@dgp ~]# service mysqld status
mysqld (pid 30983) 正在运行...
[root@dgp ~]# service mysqld stop
停止 mysqld: [确定]
查看vip漂移
[root@dgp ~]# ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
link/ether 08:00:27:86:8e:3f brd ff:ff:ff:ff:ff:ff
inet 192.168.110.7/24 brd 192.168.110.255 scope global eth1
inet6 fe80::a00:27ff:fe86:8e3f/64 scope link
valid_lft forever preferred_lft forever
[root@dgs ~]# ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: enp0s3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
link/ether 08:00:27:e7:1b:7c brd ff:ff:ff:ff:ff:ff
inet 192.168.110.8/24 brd 192.168.110.255 scope global noprefixroute enp0s3
valid_lft forever preferred_lft forever
inet 192.168.110.9/32 scope global enp0s3
valid_lft forever preferred_lft forever
inet6 fe80::f36b:da05:2965:447e/64 scope link noprefixroute
valid_lft forever preferred_lft forever
启动dgp MySQL数据库服务
[root@dgp ~]# service mysqld start
正在启动 mysqld: [确定]
[root@dgp ~]# ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
link/ether 08:00:27:86:8e:3f brd ff:ff:ff:ff:ff:ff
inet 192.168.110.7/24 brd 192.168.110.255 scope global eth1
inet 192.168.110.9/32 scope global eth1
inet6 fe80::a00:27ff:fe86:8e3f/64 scope link
valid_lft forever preferred_lft forever
[root@dgp ~]# mysql -uroot -pInfore#123
mysql: [Warning] Using a password on the command line interface can be insecure.
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 39
Server version: 5.7.17-log MySQL Community Server (GPL)
Copyright (c) 2000, 2016, Oracle and/or its affiliates. All rights reserved.
Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
mysql> show slave status \G;
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: 192.168.110.8
Master_User: slave
Master_Port: 3306
Connect_Retry: 10
Master_Log_File: mysql-bin.000001
Read_Master_Log_Pos: 2846
Relay_Log_File: dgp-relay-bin.000008
Relay_Log_Pos: 312
Relay_Master_Log_File: mysql-bin.000001
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
验证完毕,客户端始终都是连接vip 192.168.110.9操作数据库。