测试服务器环境均为Centos7,测试结构如下图:
mysql-master:192.168.1.212 ------> mha_node1
mysql-master-bak:192.168.1.213 ------>mha_node2
mysql-slave:192.168.1.211 ------>mha_manager
环境部署:
三台机器安装mariadb,并配置好主从
分别在mysql中执行grant all on *.* to root@'%' identified by '123';
配置服务器间面密码登陆
cat ssh.sh
#!/bin/bash
HOSTS=("192.168.1.211" "192.168.1.212" )
if ! [ $# -eq 1 ]; then
echo "$0 --key"
echo "$0 \"COMMAND\""
exit 1
fi
if [ "$1" = "--key" ]; then
[ -f ~/.ssh/id_rsa ] || ssh-keygen -t rsa -f ~/.ssh/id_rsa -N ''
[ -f ~/.ssh/id_rsa.pub ] && cat ~/.ssh/id_rsa.pub > ~/.ssh/authorized_keys && chmod 600 ~/.ssh/authorized_keys
for host in ${HOSTS[*]}; do
# 因为其它节点上都没有~/.ssh这么一个目录, 还得远程创建, 这导致分发密钥时要输两次密码。。。
ssh root@$host "[ -d ~/.ssh ] || mkdir ~/.ssh"
scp -p ~/.ssh/{id_rsa,authorized_keys} root@$host:~/.ssh/
done
else
for host in ${HOSTS[*]}; do
ssh root@$host $1
done
fi
配置文件如下:(保证server_id不同,slave开启readonly)
[mysqld]
datadir=/var/lib/mysql
socket=/var/lib/mysql/mysql.sock
symbolic-links=0
character-set-server=utf8
collation-server=utf8_general_ci
default-storage-engine=InnoDB
innodb-file-per-table=TRUE
skip-name-resolve=TRUE
log_bin=master-binlog
server_id=212
[mysqld_safe]
log-error=/var/log/mariadb/mariadb.log
pid-file=/var/run/mariadb/mariadb.pid
!includedir /etc/my.cnf.d
测试主从同步ok。。开始安装mha
rpm -ivh epel-release-latest-7.noarch.rpm(忘记在哪下载的yum源了)
yum install -y perl-DBD-MySQL perl-DBI cpan perl-Config perl-Log-Dispatch perl-Parallel-ForkManager perl-Time-HiRes perl perl-Config-Tiny
下载mha
git clone https://github.com/kevin-hao/mha-manager.git
git clone https://github.com/kevin-hao/mha-node.git
解压编译:
perl Makefile.PL
make&&make install
编辑mha配置文件
mkdir /etc/masterha/
vim /etc/masterha/app3.conf
[server default]
user=root
password=123
manager_workdir=/data/masterha/app1
manager_log=/data/masterha/app1/manager.log
remote_workdir=/data/masterha/app1
ssh_user=root
repl_user=repl
repl_password=123
ping_interval=1
#master_ip_failover_script= /etc/masterha/master_ip_failover
#shutdown_script= /etc/masterha/power_manager
#report_script= /etc/masterha/send_report
#master_ip_online_change_script= /etc/masterha/master_ip_online_change
[server1]
hostname=192.168.1.211
no_master=1
[server2]
hostname=192.168.1.212
candidate_master=1
[server3]
hostname=192.168.1.213
candidate_master=1
~
测试mha:
masterha_check_ssh --conf /etc/masterha/app1.conf
masterha_check_repl --conf /etc/masterha/app1.conf
采取failover脚本模式进行故障切换:(取消配置文件中master_ip_failover_script注释)
master_ip_failover脚本如下:
#!/usr/bin/env php
$longopts = array(
'command:',
'ssh_user:',
'orig_master_host:',
'orig_master_ip:',
'orig_master_port:',
'new_master_host::',
'new_master_ip::',
'new_master_port::', );
$options = getopt(null, $longopts);
if ($options['command'] == 'start') {
$params = array(
'ip' => $options['new_master_ip'],
'port' => $options['new_master_port'],
);
$string = '<?php return ' . var_export($params, true) . '; ?>';
file_put_contents('config.php', $string, LOCK_EX); }
exit(0);
?>
启动mha-manager:
[root@mysql-slave ~]# nohup masterha_manager --conf=/etc/masterha/app3.conf &>> /data/masterha/app1/manager.log &
mysql故障切换:
停止212上的mariadb服务,发现211上mha-manager已经停止,登陆slave查看master已经更改为213.
回复212时,需要手动change master 作为213的slave,然后在启动mha-manager之前需要删除/data/masterha/app1下面的app3.failover.complete 才可完成下次故障切换。。。
采取keepalived方式进行切换:
需要在212和213上安装keepalived
下载最新keepalived解压安装
./configure --prefix=/usr/local/keepalived
make&&make install
编辑配置文件: vim /usr/local/keepalived/etc/keepalived/keepalived.conf
#Configuration File for keepalived
global_defs {
notification_email {
xxxx@qq.com
}
notification_email_from xxx@xxxx.com
smtp_server smtp.exmail.qq.com
smtp_connect_timeout 10
}
vrrp_script check_mysql {
script "/home/check_mysql.sh" (mysql停止后杀掉keepalived进程的脚本)
interval 2
weight 2
}
vrrp_instance vrrptest {
state BACKUP
interface eno16777736
virtual_router_id 51
priority 220
advert_int 1
## notify /etc/keepalived/alert.sh
authentication {
auth_type PASS
auth_pass 1234
}
track_script {
check_mysql
}
virtual_ipaddress {
192.168.1.240
}
}
启动keepalived
/usr/local/keepalived/sbin/keepalived -f /usr/local/keepalived/etc/keepalived/keepalived.conf
同样测试停掉212mariadb,故障也可以自动切换到213。。。
两种方法我自己觉得都很别扭,每次故障切换都要删除app3.failover.complete才可进行后续故障切换,使用failover脚本的话 ,数据库ip地址总是要切换。。。使用keepalived的话 ,每次切换需要修改配置priority的数值 ,防止vip指向恢复后的slave上。。。切换时如果数据一直在写入的话,新恢复的机器手动指定master时会少这段时间的数据,倘若master进行操作这段数据,master-bak会报错,如何处理。。。本人也是新手,最近刚刚实验这些。。