我们选择主主互热备做生产环境MYSQL高可用方案,该方案的优点是在压力大的时候,仍然可以通过设置负载均衡来做业务分发。而且最重要的是配置简单,两份配置都一样,改个IP就可以就喜欢简单的东西。
目前考虑负债均衡分发可能有同步过程中引起数据不一致的问题,我们使用主主机制做数据即时同步,用KeepAliveD做数据库监控及故障自动切换。
实现:
1.数据库主从设置:
MYSQL安装完成后,mysql的配置修改为:
写道
[mysqld]
datadir=/usr/local/mysql/data
socket=/var/lib/mysql/mysql.sock
max_connection=1000
log-bin=mysql-bin
server-id=2
//这块两台服务器需要设置成不同
binlog-do-db=CU
binlog-do-db=FAE
binlog-do-db=HA
//它是用于主从同步时,给master用的配置参数。来说明需要复制那些库给从库。因为有时主从不是要把全部的DB都复制给从库。
binlog-ignore-db=mysql
//安全起见,无论如何不从主同步系统DB
replicate-do-db=CU
replicate-do-db=FAE
replicate-do-db=HA
//从主读取的库
replicate-ignore-db=mysql
//安全起见,无论如何不从主同步系统DB
log-slave-updates
#slave-skip-errors=all
#auto_increment_increment=2
#auto_increment_offset=2
//不是用负载均衡模式,用的是主主模式,所以这块不需要设置
[mysql.server]
user=mysql
basedir=/usr/local/mysql
[client]
socket=/var/lib/mysql/mysql.sock
datadir=/usr/local/mysql/data
socket=/var/lib/mysql/mysql.sock
max_connection=1000
log-bin=mysql-bin
server-id=2
//这块两台服务器需要设置成不同
binlog-do-db=CU
binlog-do-db=FAE
binlog-do-db=HA
//它是用于主从同步时,给master用的配置参数。来说明需要复制那些库给从库。因为有时主从不是要把全部的DB都复制给从库。
binlog-ignore-db=mysql
//安全起见,无论如何不从主同步系统DB
replicate-do-db=CU
replicate-do-db=FAE
replicate-do-db=HA
//从主读取的库
replicate-ignore-db=mysql
//安全起见,无论如何不从主同步系统DB
log-slave-updates
#slave-skip-errors=all
#auto_increment_increment=2
#auto_increment_offset=2
//不是用负载均衡模式,用的是主主模式,所以这块不需要设置
[mysql.server]
user=mysql
basedir=/usr/local/mysql
[client]
socket=/var/lib/mysql/mysql.sock
配置好之后,在两台机器上分别给从建立账号,登录两台Mysql,执行以下语句:
写道
CHANGE MASTER TO
MASTER_HOST='192.***.***.***',
MASTER_USER='repl_user',
MASTER_PASSWORD='hj3****’;
MASTER_HOST='192.***.***.***',
MASTER_USER='repl_user',
MASTER_PASSWORD='hj3****’;
注意的一点就是,使用sqlyog,一定要使用12.09以上的版本,否者主从同步会出现脏数据,因为sqlyog的老版本在修改数据库的时候,某些时候不会产生binlog,导致这些修改数据不会同步。
2.故障检测和主从切换。
我们使用keepalived和shell脚本搞定。
原理:两台数据库服务器上同时部署Keepalived。Keepalived所有节点会互相通信,互报平安。一旦抢到主的机器的Keepalived没有心跳了,从机器的Keepalived会把主的IP抢过来。Keepalived会定期调用shell脚本检测本机mysql是否可用。一旦本机mysq不可用,Keepalived会通知其他机器的Keepalived抢主的IP来实现故障切换。
Keepalived的配置:
写道
! Configuration File for keepalived
#Keepalived组的名称,各个组之间互相不影响
global_defs {
router_id cu_ha_router
}
#检测MYSQL的实例
vrrp_instance mysql {
state BACKUP # 主也配置为SLAVE
interface eth0 #绑定虚拟IP的网络接口
virtual_router_id 101 #VRRP组名,两个节点的设置必须一样,以指明各个节点属于同一VRRP组
priority 150 #节点的优先级(1-254之间),备用节点必须比主节点优先级低
nopreempt # 不抢占,注意加上
advert_int 1 #组播信息发送间隔,两个节点设置必须一样
authentication { #设置验证信息,两个节点必须一致
auth_type PASS
auth_pass xmotor
}
#一定要注意这块子网掩码的计算,写错了可能会影响整台的网络访问
virtual_ipaddress {
172.*.*.201/20
}
track_script {
chk_mysql
}
#我们使用的是亚马逊云,不能使用组播模式(advertisement),所以必须使用单播模式。单播模式需要增加以下配置。我们做过测试,单播模式支持三台或以上的机器。
unicast_src_ip 172.31.2.1 #localIp
unicast_peer {
172.31.2.124 #Resource-02
#172.31.0.207 #Management
}
#以下四个脚本分别在机器变成主、从、失败、宕机的时候执行的脚本,目前我们只记录一条日志
notify_master "/home/keepalived/scripts/mysql_be_master.sh"
notify_backup "/home/keepalived/scripts/mysql_be_slave.sh"
notify_fault "/home/keepalived/scripts/mysql_fault.sh"
notify_stop "/home/keepalived/scripts/mysql_stop.sh"
}
#检测MYSQL函数
vrrp_script chk_mysql
{
script "/home/keepalived/scripts/mysql_chk.sh eth0 root root"
interval 2
timeout 2
fall 3
}
vrrp_script chk_redis
{
script "/home/keepalived/scripts/redis_check.sh 6391 eth0"
interval 2
timeout 2
fall 3
}
vrrp_script chk_mongo
{
script "/home/keepalived/scripts/mongo_chk.sh eth0"
interval 2
timeout 2
fall 3
}
vrrp_instance redis {
state BACKUP
interface eth0 #绑定虚拟IP的网络接口
virtual_router_id 102 #VRRP组名,两个节点的设置必须一样,以指明各个节点属于同一VRRP组
priority 100 #节点的优先级(1-254之间),备用节点必须比主节点优先级低
advert_int 1 #组播信息发送间隔,两个节点设置必须一样
authentication { #设置验证信息,两个节点必须一致
auth_type PASS
auth_pass xmotor
}
virtual_ipaddress {
172.*.*.202/20
}
track_script {
chk_redis
}
unicast_src_ip 172.*.*.1 #localIp
unicast_peer {
172.*.*.124 #Resource-02
#172.*.*.207 #Management
}
notify_master "/home/keepalived/scripts/redis_be_master.sh 6391"
notify_backup "/home/keepalived/scripts/redis_be_slave.sh 172.*.*.* 6391"
notify_fault "/home/keepalived/scripts/redis_fault.sh 6391"
notify_stop "/home/keepalived/scripts/redis_stop.sh 6391"
}
vrrp_instance mongodb {
state BACKUP
interface eth0 #绑定虚拟IP的网络接口
virtual_router_id 103 #VRRP组名,两个节点的设置必须一样,以指明各个节点属于同一VRRP组
priority 100 #节点的优先级(1-254之间),备用节点必须比主节点优先级低
advert_int 1 #组播信息发送间隔,两个节点设置必须一样
authentication { #设置验证信息,两个节点必须一致
auth_type PASS
auth_pass xmotor
}
unicast_src_ip 172.*.*.1 #localIp
unicast_peer {
172.*.*.124 #Resource-02
#172.*.*.207 #Management
}
virtual_ipaddress {
172.*.*.203/20
}
track_script {
chk_mongo
}
notify_master "/home/keepalived/scripts/mongo_be_master.sh"
notify_backup "/home/keepalived/scripts/mongo_be_slave.sh"
notify_fault "/home/keepalived/scripts/mongo_fault.sh"
notify_stop "/home/keepalived/scripts/mongo_stop.sh"
}
#Keepalived组的名称,各个组之间互相不影响
global_defs {
router_id cu_ha_router
}
#检测MYSQL的实例
vrrp_instance mysql {
state BACKUP # 主也配置为SLAVE
interface eth0 #绑定虚拟IP的网络接口
virtual_router_id 101 #VRRP组名,两个节点的设置必须一样,以指明各个节点属于同一VRRP组
priority 150 #节点的优先级(1-254之间),备用节点必须比主节点优先级低
nopreempt # 不抢占,注意加上
advert_int 1 #组播信息发送间隔,两个节点设置必须一样
authentication { #设置验证信息,两个节点必须一致
auth_type PASS
auth_pass xmotor
}
#一定要注意这块子网掩码的计算,写错了可能会影响整台的网络访问
virtual_ipaddress {
172.*.*.201/20
}
track_script {
chk_mysql
}
#我们使用的是亚马逊云,不能使用组播模式(advertisement),所以必须使用单播模式。单播模式需要增加以下配置。我们做过测试,单播模式支持三台或以上的机器。
unicast_src_ip 172.31.2.1 #localIp
unicast_peer {
172.31.2.124 #Resource-02
#172.31.0.207 #Management
}
#以下四个脚本分别在机器变成主、从、失败、宕机的时候执行的脚本,目前我们只记录一条日志
notify_master "/home/keepalived/scripts/mysql_be_master.sh"
notify_backup "/home/keepalived/scripts/mysql_be_slave.sh"
notify_fault "/home/keepalived/scripts/mysql_fault.sh"
notify_stop "/home/keepalived/scripts/mysql_stop.sh"
}
#检测MYSQL函数
vrrp_script chk_mysql
{
script "/home/keepalived/scripts/mysql_chk.sh eth0 root root"
interval 2
timeout 2
fall 3
}
vrrp_script chk_redis
{
script "/home/keepalived/scripts/redis_check.sh 6391 eth0"
interval 2
timeout 2
fall 3
}
vrrp_script chk_mongo
{
script "/home/keepalived/scripts/mongo_chk.sh eth0"
interval 2
timeout 2
fall 3
}
vrrp_instance redis {
state BACKUP
interface eth0 #绑定虚拟IP的网络接口
virtual_router_id 102 #VRRP组名,两个节点的设置必须一样,以指明各个节点属于同一VRRP组
priority 100 #节点的优先级(1-254之间),备用节点必须比主节点优先级低
advert_int 1 #组播信息发送间隔,两个节点设置必须一样
authentication { #设置验证信息,两个节点必须一致
auth_type PASS
auth_pass xmotor
}
virtual_ipaddress {
172.*.*.202/20
}
track_script {
chk_redis
}
unicast_src_ip 172.*.*.1 #localIp
unicast_peer {
172.*.*.124 #Resource-02
#172.*.*.207 #Management
}
notify_master "/home/keepalived/scripts/redis_be_master.sh 6391"
notify_backup "/home/keepalived/scripts/redis_be_slave.sh 172.*.*.* 6391"
notify_fault "/home/keepalived/scripts/redis_fault.sh 6391"
notify_stop "/home/keepalived/scripts/redis_stop.sh 6391"
}
vrrp_instance mongodb {
state BACKUP
interface eth0 #绑定虚拟IP的网络接口
virtual_router_id 103 #VRRP组名,两个节点的设置必须一样,以指明各个节点属于同一VRRP组
priority 100 #节点的优先级(1-254之间),备用节点必须比主节点优先级低
advert_int 1 #组播信息发送间隔,两个节点设置必须一样
authentication { #设置验证信息,两个节点必须一致
auth_type PASS
auth_pass xmotor
}
unicast_src_ip 172.*.*.1 #localIp
unicast_peer {
172.*.*.124 #Resource-02
#172.*.*.207 #Management
}
virtual_ipaddress {
172.*.*.203/20
}
track_script {
chk_mongo
}
notify_master "/home/keepalived/scripts/mongo_be_master.sh"
notify_backup "/home/keepalived/scripts/mongo_be_slave.sh"
notify_fault "/home/keepalived/scripts/mongo_fault.sh"
notify_stop "/home/keepalived/scripts/mongo_stop.sh"
}
下面是检测mysql的脚本,可用返回0,不可用返回1。
写道
#!/bin/bash
#define mysql variable
NETBOARD=$1
MYSQL_USER=$2
MYSQL_PWD=$3
LOGFILE="/var/log/keepalived/mysql/mysql-check.log"
# 看3306端口是否开这
mysql_alive=`netstat -nl | awk 'NR>2{if ($4 ~ /.*:3306/) {print "Yes";exit 0}}'`
#记录日志
echo "[CHECK]" >> $LOGFILE
date >> $LOGFILE
# 取出网关的IP地址
gate=`route -n | grep $NETBOARD | grep UG | awk '{print $2}'`
# 查看网关是否通畅
ping -w 1 -c 1 $gate >/dev/null
ret=$?
#连不上网关,记录日志,返回1
if [ $ret != 0 ];then
echo "Network Gate is not connected" >> $LOGFILE 2>&1
exit 1
fi
Slave_IO_Running=""
Slave_SQL_Running=""
# 如果MYSQL端口是通的,登录mysql,执行mysql语句,检查mysql状态
if [ "$mysql_alive" == "Yes" ];then
Slave_IO_Running=`mysql -u${MYSQL_USER} -p${MYSQL_PWD} -e"show slave status\G" | grep "Slave_IO_Running:" | awk '{print $2}'`
Slave_SQL_Running=`mysql -u${MYSQL_USER} -p${MYSQL_PWD} -e"show slave status\G" | grep "Slave_SQL_Running:" | awk '{print $2}'`
if [ "$Slave_IO_Running" != "" -a "$Slave_SQL_Running" != "" ];then
echo "Success, Slave_IO_Running = $Slave_IO_Running, Slave_SQL_Running = $Slave_SQL_Running" >> $LOGFILE 2>&1
exit 0
else
mysql -u${MYSQL_USER} -p${MYSQL_PWD} -e"show slave status\G"
getSlaveStatusResult=$?
echo "Failed, Reason: $getSlaveStatusResult" >> $LOGFILE 2>&1
exit 1
fi
else
echo "Mysql Down" >> $LOGFILE 2>&1
exit 1
fi
#define mysql variable
NETBOARD=$1
MYSQL_USER=$2
MYSQL_PWD=$3
LOGFILE="/var/log/keepalived/mysql/mysql-check.log"
# 看3306端口是否开这
mysql_alive=`netstat -nl | awk 'NR>2{if ($4 ~ /.*:3306/) {print "Yes";exit 0}}'`
#记录日志
echo "[CHECK]" >> $LOGFILE
date >> $LOGFILE
# 取出网关的IP地址
gate=`route -n | grep $NETBOARD | grep UG | awk '{print $2}'`
# 查看网关是否通畅
ping -w 1 -c 1 $gate >/dev/null
ret=$?
#连不上网关,记录日志,返回1
if [ $ret != 0 ];then
echo "Network Gate is not connected" >> $LOGFILE 2>&1
exit 1
fi
Slave_IO_Running=""
Slave_SQL_Running=""
# 如果MYSQL端口是通的,登录mysql,执行mysql语句,检查mysql状态
if [ "$mysql_alive" == "Yes" ];then
Slave_IO_Running=`mysql -u${MYSQL_USER} -p${MYSQL_PWD} -e"show slave status\G" | grep "Slave_IO_Running:" | awk '{print $2}'`
Slave_SQL_Running=`mysql -u${MYSQL_USER} -p${MYSQL_PWD} -e"show slave status\G" | grep "Slave_SQL_Running:" | awk '{print $2}'`
if [ "$Slave_IO_Running" != "" -a "$Slave_SQL_Running" != "" ];then
echo "Success, Slave_IO_Running = $Slave_IO_Running, Slave_SQL_Running = $Slave_SQL_Running" >> $LOGFILE 2>&1
exit 0
else
mysql -u${MYSQL_USER} -p${MYSQL_PWD} -e"show slave status\G"
getSlaveStatusResult=$?
echo "Failed, Reason: $getSlaveStatusResult" >> $LOGFILE 2>&1
exit 1
fi
else
echo "Mysql Down" >> $LOGFILE 2>&1
exit 1
fi