MHA环境搭建
一、安装前的准备
- 至少3台虚拟机(服务器),且已安装了CentOS 7(或其他版本)
- 检查系统时间,并更新
# 查看当前系统时间
date -R
# 更新时间,与互联网时间一致
install ntp ntpdate
二、开始安装Maria DB(mysql)
1、卸载系统自带的MySQL和Maria DB(centos7 开始系统默认安装的是mariadb)
检查mysql和maria
rpm -qa | grep mysql
存在的话卸载,没有则进行下一步
rpm -e –-nodeps mariadb-libs-5.5.65-1.el7.x86_64
2、安装MariaDB
创建Maria的资源文件
vi /etc/yum.repos.d/MariaDB.repo
文件内容到此网站中复制(内容下文也贴了)
https://downloads.mariadb.org/mariadb/repositories/#mirror=tuna
# MariaDB 10.3 CentOS repository list - created 2020-09-30 07:22 UTC
# http://downloads.mariadb.org/mariadb/repositories/
[mariadb]
name = MariaDB
baseurl = http://yum.mariadb.org/10.3/centos7-amd64
gpgkey=https://yum.mariadb.org/RPM-GPG-KEY-MariaDB
gpgcheck=1
使用yum 命令进行在线安装
yum -y install MariaDB-server MariaDB-client
等待自动安装完成即可(会显示 Complete!
)
3、启动并初始化数据库
启动数据库
systemctl start mariadb
设置开机自启
systemctl enable mariadb
初始化
mysql_secure_installation
- 根据提示配置,(设置密码完了,就一直回车就行)
Set root password? [Y/n] <– 是否设置root用户密码,输入y并回车或直接回车
New password: <– 设置root用户的密码
Re-enter new password: <– 再输入一次你设置的密码
Remove anonymous users? [Y/n] <– 是否删除匿名用户, 回车
Disallow root login remotely? [Y/n] <–是否禁止root远程登录 , 回车,
Remove test database and access to it? [Y/n] <– 是否删除test数据库,回车
Reload privilege tables now? [Y/n] <– 是否重新加载权限表,回车
配置数据库
- 进入
/etc/my.cnf.d/
目录,复制内容到sever.cnf
中
[mysqld]
init_connect='SET collation_connection = utf8_unicode_ci'
init_connect='SET NAMES utf8'
character-set-server=utf8
collation-server=utf8_unicode_ci
skip-character-set-client-handshake
- 创建文件
client.cnf
vi /etc/my.cnf.d/client.cnf
- 复制以下内容(此步骤可省略,之后启动MHA可能会报错)
[client]
default-character-set=utf8
- 编辑文件
mysql-clients.cnf
vi /etc/my.cnf.d/mysql-clients.cnf
- 在
mysqld
下添加以下内容
default-character-set=utf8
- 配置完成,重启数据库
systemctl restart mariadb
4、设置远程连接权限
进入数据库
mysql -u root -p password
设置远程访问权限
grant all privileges on *.* to username@'%' identified by 'password';
-配置完成,可远程连接试试,若连接不上,可能是防火墙的问题,可参照以下链接进行排除
https://blog.csdn.net/je_rry/article/details/108866568
三、ssh免密互通配置
配置ssh免密(下面操作在所有节点上操作,管理节点发给各个节点,其它节点之间互发)
ssh-keygen
ssh-copy-id 192.168.3.101
测试免密是否成功(需要进入
.ssh/
目录下,此处可以不测试,可以在之后的manager中测试互通是否成功)
scp authorized_keys root@192.168.3.101:~/.ssh/
四、搭建MHA
将mha4mysql-node-0.57-0.el7.noarch.rpm 和 mha4mysql-manager-0.57-0.el7.noarch.rpm上传到服务器中,所有节点修只需要安装node,管理节点两个都需要安装
安装依赖(安装顺序如下)
yum install epel-release -y
yum -y install perl-DBD-MySQL perl-Config-Tiny perl-Log-Dispatch perl-Parallel-ForkManager
- 每个节点先安装
node
rpm -ivh mha4mysql-node-0.57-0.el7.noarch.rpm
- 检查是否安装成功
ll /usr/bin/{app*,filter*,purge*,save*}
如下结果代表成功
- 在管理节点上安装manager
[root@manager ~]# rpm -ivh mha4mysql-manager-0.57-0.el7.noarch.rpm
搭建主从环境
- 配置master(创建需要同步的数据库,并创建复制用的用户以及监控用的用户)
#进入master的数据库
mysql -u root -p password
#创建一个数据库(用于测试)
create database test_db;
#创建复制用户
grant replication slave on *.* to 'slave'@'192.168.%.%' identified by '123456'
#创建监视用户
grant all privileges on *.* to 'root'@'192.168.%.%' identified by 'root';
#刷新权限
flush privileges;
- 修改master配置文件,重启mysqld
[root@master ~]# vi /etc/my.cnf
[mysqld]
server-id=1
log-bin=mysql-bin-master
binlog-do-db=test_db
[root@master ~]# systemctl restart mysqld
- 查看master状态(position的值之后会用到)
show master status\G;
*************************** 1. row ***************************
File: mysql-bin-master.000001
Position: 335
Binlog_Do_DB: test_db
Binlog_Ignore_DB:
Executed_Gtid_Set:
- 配置slave,备用master
[root@slave~]# mysql -uroot -proot
mysql> create database test_db;
Query OK, 1 row affected (0.00 sec)
mysql> grant replication slave on *.* to 'slave'@'192.168.%.%' identified by '123456'; //复制用的用户
Query OK, 0 rows affected, 1 warning (0.05 sec)
mysql> grant all privileges on *.* to 'root'@'192.168.%.%' identified by 'root'; //监控用的用户
Query OK, 0 rows affected, 1 warning (0.01 sec)
mysql> flush privileges; //刷新权限表
Query OK, 0 rows affected (0.01 sec)
- 修改slave的配置文件,重启mysqld
[root@slave ~]# vi /etc/my.cnf
[mysqld]
server-id=2
log-bin=mysql-bin-slave1
binlog-do-db=test_db
log-slave-updates=1
[root@slave ~]# systemctl restart mysqld
- 在slave服务器中修改master信息使其指向主服务器,并查看slave状态
[root@slave ~]# mysql -uroot -proot
mysql> stop slave;
Query OK, 0 rows affected, 1 warning (0.00 sec)
mysql> change master to master_host='192.168.3.101',
-> master_user='slave',
-> master_password='123456',
-> master_log_file='mysql-bin-master.000001',
-> master_log_pos=335;
Query OK, 0 rows affected (0.08 sec)
mysql> start slave;
Query OK, 0 rows affected (0.02 sec)
mysql> show slave status\G
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: 192.168.3.101
Master_User: slave
Master_Port: 3306
Connect_Retry: 60
...
- 设置为只读,并关闭中转日志自动清理
mysql> set global read_only=1; //从表设置为只读,备用master的只读不能写入配置文件,因为有可能会升为master
Query OK, 0 rows affected (0.00 sec)
mysql> set global relay_log_purge=off; //从表关闭中转日志自动清理,因为MHA从库恢复过程依赖中转日志相关信息
Query OK, 0 rows affected (0.00 sec)
配置MHA
- 创建工作目录
[root@manager ~]# mkdir /etc/masterha
[root@manager ~]# mkdir -p /var/log/masterha/app1
在/usr/local/bin目录下创建脚本master_ip_failover(故障切换脚本,切换master的VIP地址),并授可执行权限
/var/log/masterha/app1
创建脚本
vi /usr/local/bin/master_ip_failover
将以下内容添加至脚本中
#!/usr/bin/env perl
use strict;
use warnings FATAL =>'all';
use Getopt::Long;
my (
$command, $ssh_user, $orig_master_host, $orig_master_ip,
$orig_master_port, $new_master_host, $new_master_ip, $new_master_port
);
my $vip = '192.168.3.101/24'; #这里需要改,漂移的VIP,vip的值需要是当前网段的空闲地址
my $key = "1";
my $ssh_start_vip = "/sbin/ifconfig $vip ens33:$key $vip"; #如果服务器不支持ifconfig命令,使用yum install net-tools 命令安装就支持ifconfig了
my $ssh_stop_vip = "/sbin/ifconfig $vip ens33:$key down";
my $exit_code = 0;
GetOptions(
'command=s' => \$command,
'ssh_user=s' => \$ssh_user,
'orig_master_host=s' => \$orig_master_host,
'orig_master_ip=s' => \$orig_master_ip,
'orig_master_port=i' => \$orig_master_port,
'new_master_host=s' => \$new_master_host,
'new_master_ip=s' => \$new_master_ip,
'new_master_port=i' => \$new_master_port,
);
exit &main();
sub main {
#print "\n\nIN SCRIPT TEST====$ssh_stop_vip==$ssh_start_vip===\n\n";
if ( $command eq "stop" || $command eq "stopssh" ) {
# $orig_master_host, $orig_master_ip, $orig_master_port are passed.
# If you manage master ip address at global catalog database,
# invalidate orig_master_ip here.
my $exit_code = 1;
eval {
print "\n\n\n***************************************************************\n";
print "Disabling the VIP - $vip on old master: $orig_master_host\n";
print "***************************************************************\n\n\n\n";
&stop_vip();
$exit_code = 0;
};
if ($@) {
warn "Got Error: $@\n";
exit $exit_code;
}
exit $exit_code;
}
elsif ( $command eq "start" ) {
# all arguments are passed.
# If you manage master ip address at global catalog database,
# activate new_master_ip here.
# You can also grant write access (create user, set read_only=0, etc) here.
my $exit_code = 10;
eval {
print "\n\n\n***************************************************************\n";
print "Enabling the VIP - $vip on new master: $new_master_host \n";
print "***************************************************************\n\n\n\n";
&start_vip();
$exit_code = 0;
};
if ($@) {
warn $@;
exit $exit_code;
}
exit $exit_code;
}
elsif ( $command eq "status" ) {
print "Checking the Status of the script.. OK \n";
`ssh $ssh_user\@$orig_master_host \" $ssh_start_vip \"`;
exit 0;
}
else {
&usage();
exit 1;
}
}
# A simple system call that enable the VIP on the new master
sub start_vip() {
`ssh $ssh_user\@$new_master_host \" $ssh_start_vip \"`;
}
# A simple system call that disable the VIP on the old_master
sub stop_vip() {
`ssh $ssh_user\@$orig_master_host \" $ssh_stop_vip \"`;
}
sub usage {
print "Usage: master_ip_failover –command=start|stop|stopssh|status –orig_master_host=host –orig_master_ip=ip
–orig_master_port=port –new_master_host=host –new_master_ip=ip –new_master_port=port\n";
}
添加权限
chmod +x /usr/local/bin/master_ip_failover
- 在/etc/masterha下创建配置文件app1.cnf
[root@manager ~]# cat /etc/masterha/app1.cnf
[server default]
manager_workdir=/var/log/masterha/app1
manager_log=/var/log/masterha/app1/manager.log
master_binlog_dir=/var/lib/mysql
master_ip_failover_script=/usr/local/bin/master_ip_failover
master_ip_online_change_script=/usr/local/bin/master_ip_online_change
user=root
password=root
ping_interval=1
remote_workdir=/tmp
repl_user=slave
repl_password=123456
report_script=/usr/local/send_report
shutdown_script=""
ssh_user=root
[server1]
hostname=192.168.3.101
port=3306
[server2]
hostname=192.168.3.102
port=3306
candidate_master=1
check_repl_delay=0
检查管理节点到所有Node节点的ssh连接状态
masterha_check_ssh --conf=/etc/masterha/app1.cnf
显示[info] All SSH connection tests passed successfully.
代表成功
- 检查复制环境
masterha_check_repl --conf=/etc/masterha/app1.cnf
显示MySQL Replication Health is OK.
表示成功
- 检查管理节点的状态
masterha_check_status --conf=/etc/masterha/app1.cnf
app1 is stopped(2:NOT_RUNNING).
开启管理节点监控
[root@manager ~]# nohup masterha_manager --conf=/etc/masterha/app1.cnf --remove_dead_master_conf --ignore_last_failover < /dev/null > /var/log/masterha/app1/manager.log 2>&1 &
[1]18909
[root@manager ~]# masterha_check_status --conf=/etc/masterha/app1.cnf
app1 (pid:18909) is running(0:PING_OK), master:192.168.3.101
[root@manager ~]# masterha_stop --conf=/etc/masterha/app1.cnf
Stopped app1 successfully.
配置完成
六、测试
打开navicat (其他连接工具也可以)连接master和slave 的数据库
- 检查在master下指定主从关系的数据库中添加表或表中数据是否会自动同步
在master 服务器下添加命令
ifconfig eno1:1 192.x.x.x (eno1为网卡名称,192.x.x.x 为vip,vip需要设置当前网段下的空闲ip)
- 检查是否添加成功
# ifconfig
- 在master中停止mysql服务
systemctl stop mysql
在manager节点中查看日志
vi /var/log/masterha/app1/manager.log
看到如下结果,则切换成功
- 查看 /etc/masterha/app1.cnf
cat /etc/masterha/app1.cnf
发现里面的 server1
已经被删除了
如果遇到这个不成功
Failed to deactivate master IP with return code 1:0
将 /etc/masterha/app1.cnf
中master_ip_failover_script=/usr/local/bin/master_ip_failover
master_ip_online_change_script=/usr/local/bin/master_ip_online_change删除后重新测试
七、宕机邮件发送功能添加
1、在 /etc/masterha/app1.cnf 中添加或修改邮件发送脚本的路径 report_script=/usr/local/send_report
2、安装依赖
rpm -ivh perl-Mail-Sender-0.8.16-1.el5.pp.src.rpm
如果提示缺少mock的依赖,则安装mock
yum install mock
useradd -s /sbin/nologin mockbuild
3、创建脚本,添加如下代码(修改对应的数据,当主库宕机的时候就会自动发送邮件到指定的邮箱了)
# vi /usr/local/send_report
#!/usr/bin/perl
# Copyright (C) 2011 DeNA Co.,Ltd.
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, write to the Free Software
# Foundation, Inc.,
# 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
## Note: This is a sample script and is not complete. Modify the script based on your environment.
use strict;
use warnings FATAL => 'all';
use Mail::Sender;
use Getopt::Long;
#new_master_host and new_slave_hosts are set only when recovering master succeeded
my ( $dead_master_host, $new_master_host, $new_slave_hosts, $subject, $body );
my $smtp='smtp.163.com';
my $mail_from='email@163.com';
my $mail_user='email@163.com';
my $mail_pass='第三方登录授权码';
#my $mail_to=['to1@qq.com','to2@qq.com'];
my $mail_to='1017045280@qq.com';
GetOptions(
'orig_master_host=s' => \$dead_master_host,
'new_master_host=s' => \$new_master_host,
'new_slave_hosts=s' => \$new_slave_hosts,
'subject=s' => \$subject,
'body=s' => \$body,
);
# Do whatever you want here
mailToContacts($smtp,$mail_from,$mail_user,$mail_pass,$mail_to,$subject,$body);
sub mailToContacts {
my ($smtp, $mail_from, $mail_user, $mail_pass, $mail_to, $subject, $msg ) = @_;
open my $DEBUG, ">/var/log/masterha/mail.log" #log的存放路径
or die "Can't open the debug file:$!\n";
my $sender = new Mail::Sender {
ctype => 'text/plain;charset=utf-8',
encoding => 'utf-8',
smtp => $smtp,
from => $mail_from,
auth => 'LOGIN',
TLS_allowed => '0',
authid => $mail_user,
authpwd => $mail_pass,
to => $mail_to,
subject => $subject,
debug => $DEBUG
};
$sender->MailMsg(
{
msg => $msg,
debug => $DEBUG
}
) or print $Mail::Sender::Error;
return 1;
}
exit 0;
本文参考了网上现有的部分部署步骤,经过本人实际操作记录下来的完整部署过程,若有不当之处,欢迎留言