-
背景
客户有一个即将上线的互联网产品平台,后台采用的mysql数据库,未来数据量100G+,需要采用MHA主从环境
基于MHA实现主从复制,采用GTID+无损同步复制技术,双主多从
5台机器
monitor 监控复制组 需要两个ip,而且一个要能通外网
master 写
slave1 读(备用master)
slave2
slave3
-
配置无损GTID主从
五台机器配置hostname
[root@deendb ~]# cat /etc/hosts
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.0.50 deendb
192.168.0.51 deendb02
192.168.0.52 deendb03
192.168.0.53 deendb04
192.168.0.54 deendb05
192.168.0.55 deendb06
192.168.0.56 deendb07
192.168.0.57 deendb08
192.168.0.61 deenfz01
192.168.0.62 deenfz02
192.168.0.71 deenm01
192.168.0.72 deenm02
配置文件,无损GTID
主库第一台配置下面这个
log_bin=/mysql/log/3306/binlog/mysqldb-binlog
log_bin_index=/mysql/log/3306/binlog/mysqldb-binlog.index
binlog_format=row
binlog_rows_query_log_events=on
skip_name_resolve = on
expire_logs_days = 7
innodb_support_xa =1
binlog_cache_size = 1M
max_binlog_size = 2048M
log_bin_trust_function_creators = 1
innodb_flush_log_at_trx_commit =1
sync_binlog = 1
transaction-isolation = read-committed
gtid_mode = on
enforce_gtid_consistency = 1
log-slave-updates = 1
binlog_gtid_simple_recovery=1
#--------------------------
#slave parameter
#--------------------------
#relay_log = /mysql/log/3306/relaylog/mysqldb-relay.log
#read_only=1
#slave-parallel-type=LOGICAL_CLOCK
#slave-parallel-workers=4
#master_info_repository=TABLE
#relay_log_info_repository=TABLE
#relay_log_recovery=1
#slave_skip_errors = ddl_exist_errors
#slave_preserve_commit_order=1
#------------------------------------
#semi sync replication settings
#------------------------------------
plugin_dir=/mysql/app/mysql/lib/plugin/
plugin_load ="rpl_semi_sync_master=semisync_master.so;rpl_semi_sync_slave=semisync_slave.so"
loose_rpl_semi_sync_master_enabled = 1
loose_rpl_semi_sync_slave_enabled = 1
loose_rpl_semi_sync_master_timeout = 5000
rpl_semi_sync_master_wait_point = AFTER_SYNC
rpl_semi_sync_master_wait_for_slave_count = 1
后面三台备库加这个
#slave parameter
#------------------------------------
relay_log = /mysql/log/3306/relaylog/mysqldb-relay.log
log-slave-updates = 1
read_only=1
slave-parallel-type=LOGICAL_CLOCK
slave-parallel-workers=4
master_info_repository=table
relay_log_info_repository=table
relay_log_recovery=1
slave_skip_errors = ddl_exist_errors
slave_preserve_commit_order=1
主库
create user 'repuser'@'%' identified by 'repuser123';
grant replication slave on *.* to 'repuser'@'%';
flush privileges;
select user,host from mysql.user;
三台从库
stop slave;
reset master;
reset slave;
change master to
master_host='192.168.0.51',
master_port=3306,
master_user='repuser',
master_password='repuser123',
master_auto_position=1;
start slave;
show slave status \G
show variables like '%semi_sync%';
show global status like 'rpl_semi%';
测试:
create database itpuxdb2;
create table itpuxdb2.itpuxbak21 (id int,name varchar(40));
insert into itpuxdb2.itpuxbak21 values(1,'itpux211'),(2,'itpux212'),(3,'itpux213'),(4,'itpux214'),(5,'itpux215');
commit;
select * from itpuxdb2.itpuxbak21;
show databases;
-
安装MHA
在节点 52,53,54,51 安装MHA node所需perl模块,(DBD:mysql),并下载MHA软件包
cp /mysql/app/mysql/lib/libmysqlclient.so.20 /lib64/
mount /dev/cdrom /mnt
yum install perl-DBD-MySQL -y
yum install -y perl-devel perl-CPAN
安装node包
scp 192.168.0.150:/opt/mha4mysql-node-0.58.tar.gz /soft/
cd /soft/
tar xvf mha4mysql-node-0.58.tar.gz
cd mha4mysql-node-0.58
perl Makefile.PL
make && make install
安装完成将产生 4 个文件如下:
ls -lsa /usr/local/bin
echo "export PATH=\$PATH:/usr/local/bin" >> ~/.bash_profile
source ~/.bash_profile
monitor 节点
mount /dev/cdrom /mnt
###rpm -ivh epel-release-7-11.noarch.rpm 阿里云的yum源 这里要联网,上面配置完成后,生成其他repo
### - Log::Dispatch ...missing.
###- Parallel::ForkManager ...missing.
yum install -y perl-devel perl-CPAN perl-DBD-MySQL
yum install perl-DBD-MySQL perl-Config-Tiny perl-Log-Dispatch perl-Parallel-ForkManager perl-Time-HiRes -y
#邮件发送的包
yum install perl-Email-Date-Format perl-Mail-Sender perl-Mail-Send-Loop perl-Mail-Sendmail perl-MIME-Lite perl-MIME-Types perl-Params-Validate -y
#########上面这些都是mha4mysql-manager-0.58 所需 perl的依赖
cd /opt/
tar zxvf Config-Tiny-2.23.tgz
cd Config-Tiny-2.23
perl Makefile.PL
make && make install
cd /opt/
tar xvf mha4mysql-node-0.58.tar.gz
cd mha4mysql-node-0.58
perl Makefile.PL
make && make install
cd /opt/
tar xvf mha4mysql-manager-0.58.tar.gz
cd mha4mysql-manager-0.58
perl Makefile.PL
make && make install
图解:这7个加载好了,才能说明是正常的,不能是miss
echo "export PATH=\$PATH:/usr/local/bin" >> ~/.bash_profile
source ~/.bash_profile
[root@itpuxdb1 scripts]# pwd
ls -lsa /opt/mha4mysql-manager-0.58/samples/scripts
ls -lsa /usr/local/bin
cp /opt/mha4mysql-manager-0.58/samples/scripts/* /usr/local/bin/
图解:这些是配置好后的脚本
-
快速配置 SSH 登录无密码验证
ssh-keygen -t rsa
ssh-copy-id -i ~/.ssh/id_rsa.pub root@192.168.1.138
互信做过,但报错ERROR: ECDSA host key for 192.168.1.140 has changed and you have requested strict checking.
执行它就好了
ssh-keygen -R 192.168.1.139
ssh deendb02 date
ssh deendb03 date
ssh deendb04 date
ssh deendb05 date
ssh deendb date
-
MHA 配置(monitor)
monitor这台机器上
mkdir -p /etc/masterha
这里的名字app1.cnf在生产环境一定要规范 比如:mysql_ydpt_app1.cnf *****app2.cnf
cp /opt/mha4mysql-manager-0.58/samples/conf/app1.cnf /etc/masterha/app1.cnf
vi /etc/masterha/app1.cnf
[server default]
manager_workdir=/var/log/masterha/app1
manager_log=/var/log/masterha/app1/manager.log
master_binlog_dir=/mysql/log/3306/binlog
master_ip_failover_script= /usr/local/bin/master_ip_failover
master_ip_online_change_script= /usr/local/bin/master_ip_online_change
password=root
user=root
ping_interval=1
remote_workdir=/tmp
repl_password=repuser123
repl_user=repuser
report_script=/usr/local/bin/send_report
secondary_check_script= /usr/local/bin/masterha_secondary_check -s deendb03 -s deendb04 -s deendb05
shutdown_script=""
ssh_user=root
[server1]
hostname=192.168.0.51
port=3306
[server2]
hostname=192.168.0.52
port=3306
candidate_master=1
check_repl_delay=0
[server3]
hostname=192.168.0.53
port=3306
[server4]
hostname=192.168.0.54
port=3306
-
设置定期清理relay脚本
四个数据库节点执行,关闭 清除中继日志 ,MHA依赖中继日志
mysql -uroot -proot -e 'set global relay_log_purge=0'
对relay_log的控制,自己写个脚本
touch /usr/local/bin/purge_relay_log.sh
vi /usr/local/bin/purge_relay_log.sh
#!/bin/bash
user=root
passwd=root
port=3306
socket=/tmp/mysql.sock
log_dir='/var/log/masterha/log'
work_dir='/mysql/log/3306/relaylog'
purge='/usr/local/bin/purge_relay_logs'
if [ ! -d $log_dir ]
then
mkdir $log_dir -p
fi
$purge --user=$user --password=$passwd -S $socket --host=localhost --disable_relay_log_purge --port=$port --workdir=$work_dir >> $log_dir/purge_relay_logs.log 2>&1
4 数据库机器:
scp 192.168.0.51:/usr/local/bin/purge_relay_log.sh /usr/local/bin
chmod +x /usr/local/bin/purge_relay_log.sh
crontab -e
0 12 * * * /bin/bash /usr/local/bin/purge_relay_log.sh
#试一下看看会不会报错(不抱任何信息是正常)
/usr/local/bin/purge_relay_log.sh
cd /var/log/masterha/log
more purge_relay_logs.log
这里看到有报错,不能找到 mysql.sock位置
-
检查 都在0.150上面,monitor那台上执行
检查 MHA Manger 到所有 MHA Node 的 SSH 连接状态:
masterha_check_ssh --conf=/etc/masterha/app1.cnf
检查整个复制环境状况。
masterha_check_repl --conf=/etc/masterha/app1.cnf
图解:master_ip_failover_script is notdefined.
这里先把 app1.cnf 里面
#master_ip_failover_script= /usr/local/bin/master_ip_failover 这个选项屏蔽才可以通过。
检查 MHA Manager 的状态
masterha_check_status --conf=/etc/masterha/app1.cnf
app1 is stopped(2:NOT_RUNNING).
手动启动
mkdir -p /var/log/masterha/app1
chmod -R 777 /var/log/masterha/app1
nohup masterha_manager --conf=/etc/masterha/app1.cnf --remove_dead_master_conf --ignore_last_failover < /dev/null > /var/log/masterha/app1/manager.log 2>&1 &
masterha_check_status --conf=/etc/masterha/app1.cnf
启动参数讲解:
remove_dead_master_conf
manager_log
ignore_last_failover #如果MHA检测到主库连发生宕机,两次宕机间隔不足8小时,则不会发生切换,加这个参数可以避免这种情况
查看日志:
tail -100f /var/log/masterha/app1/manager.log
图解:这是正常的
-
monitor上 配置四个脚本,vip改掉,网卡改掉(这是手动脚本,下一个讲keepalive方式)
master上面添加网卡0.60: 可以把下面这条命令加到开机自启的命令行下,但是切换之后,要ping下这个地址
/sbin/ifconfig ens34:1 192.168.0.60/24
在三台备机上执行,mysql -uroot -proot -h 192.168.0.60
mysql> show variables like "hostname";
monitor上操作
vi /usr/local/bin/master_ip_failover
--------------------------------------
#!/usr/bin/env perl
# Copyright (C) 2011 DeNA Co.,Ltd.
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, write to the Free Software
# Foundation, Inc.,
# 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
## Note: This is a sample script and is not complete. Modify the script based on your environment.
use strict;
use warnings FATAL => 'all';
use Getopt::Long;
use MHA::DBHelper;
my (
$command, $ssh_user, $orig_master_host,
$orig_master_ip, $orig_master_port, $new_master_host,
$new_master_ip, $new_master_port, $new_master_user,
$new_master_password
);
my $vip = '192.168.0.60/24';
my $key = "2";
my $ssh_start_vip = "/sbin/ifconfig ens34:$key $vip/24";
my $ssh_stop_vip = "/sbin/ifconfig ens34:$key down";
my $ssh_send_garp = "/sbin/arping -U $vip -I eth0 -c 1";
GetOptions(
'command=s' => \$command,
'ssh_user=s' => \$ssh_user,
'orig_master_host=s' => \$orig_master_host,
'orig_master_ip=s' => \$orig_master_ip,
'orig_master_port=i' => \$orig_master_port,
'new_master_host=s' => \$new_master_host,
'new_master_ip=s' => \$new_master_ip,
'new_master_port=i' => \$new_master_port,
'new_master_user=s' => \$new_master_user,
'new_master_password=s' => \$new_master_password,
);
exit &main();
sub main {
if ( $command eq "stop" || $command eq "stopssh" ) {
# $orig_master_host, $orig_master_ip, $orig_master_port are passed.
# If you manage master ip address at global catalog database,
# invalidate orig_master_ip here.
my $exit_code = 1;
eval {
print "Disabling the VIP an old master: $orig_master_host \n";
&stop_vip();
$exit_code = 0;
};
if ($@) {
warn "Got Error: $@\n";
exit $exit_code;
}
exit $exit_code;
}
elsif ( $command eq "start" ) {
# all arguments are passed.
# If you manage master ip address at global catalog database,
# activate new_master_ip here.
# You can also grant write access (create user, set read_only=0, etc) here.
my $exit_code = 10;
eval {
my $new_master_handler = new MHA::DBHelper();
# args: hostname, port, user, password, raise_error_or_not
$new_master_handler->connect( $new_master_ip, $new_master_port,
$new_master_user, $new_master_password, 1 );
## Set read_only=0 on the new master
$new_master_handler->disable_log_bin_local();
print "Set read_only=0 on the new master.\n";
$new_master_handler->disable_read_only();
## Creating an app user on the new master
# print "Creating app user on the new master..\n";
# FIXME_xxx_create_user( $new_master_handler->{dbh} );
$new_master_handler->enable_log_bin_local();
$new_master_handler->disconnect();
print "Enabling the VIP $vip on the new master: $new_master_host \n";
&start_vip();
$exit_code = 0;
};
if ($@) {
warn $@;
# If you want to continue failover, exit 10.
exit $exit_code;
}
exit $exit_code;
}
elsif ( $command eq "status" ) {
# do nothing
exit 0;
}
else {
&usage();
exit 1;
}
}
sub start_vip(){
`ssh $ssh_user\@$new_master_host \" $ssh_start_vip \"`;
`ssh $ssh_user\@$new_master_host \" $ssh_send_garp \"`;
}
sub stop_vip(){
return 0 unless ($ssh_user);
`ssh $ssh_user\@$orig_master_host \" $ssh_stop_vip \"`;
}
sub usage {
"Usage: master_ip_failover --command=start|stop|stopssh|status --orig_master_host=host --orig_master_ip=ip --orig_master_port=port --new_master_host=host --new_master_ip=ip --new_master_port=port\n";
}
vi /etc/masterha/app1.cnf #解开注释
masterha_check_repl --conf=/etc/masterha/app1.cnf
状态 OK
masterha_stop --conf=/etc/masterha/app1.cnf
ps -ef|grep manager
vi /usr/local/bin/manager_start #启动manager的脚本
#!/bin/bash
nohup masterha_manager --conf=/etc/masterha/app1.cnf --remove_dead_master_conf --ignore_last_failover < /dev/null > /var/log/masterha/app1/manager.log 2>&1 &
chmod u+x /usr/local/bin/manager_start
vi master_ip_online_change
#!/usr/bin/env perl
# Copyright (C) 2011 DeNA Co.,Ltd.## This program is free software; you can redistribute it and/or modify# it under the terms of the GNU General Public License as published by# the Free Software Foundation; either version 2 of the License, or# (at your option) any later version.## This program is distributed in the hope that it will be useful,# but WITHOUT ANY WARRANTY; without even the implied warranty of# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the# GNU General Public License for more details.## You should have received a copy of the GNU General Public License# along with this program; if not, write to the Free Software# Foundation, Inc.,# 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
## Note: This is a sample script and is not complete. Modify the script based on your environment.
use strict;use warnings FATAL => 'all';
use Getopt::Long;use MHA::DBHelper;use MHA::NodeUtil;use Time::HiRes qw( sleep gettimeofday tv_interval );use Data::Dumper;
my $_tstart;my $_running_interval = 0.1;
my $vip = '192.168.0.60';my $key = "2";
my $ssh_start_vip = "/sbin/ifconfig ens34:$key $vip/24";
my $ssh_stop_vip = "/sbin/ifconfig ens34:$key down";
my $ssh_send_garp = "/sbin/arping -U $vip -I ens34 -c 1";
my $orig_master_ssh_port = 22;
my $new_master_ssh_port = 22;
my $ssh_user = "root";
my $new_master_password='root';
my $orig_master_password='root';
my (
$command, $orig_master_is_new_slave, $orig_master_host,
$orig_master_ip, $orig_master_port, $orig_master_user,
$orig_master_password, $orig_master_ssh_user, $new_master_host,
$new_master_ip, $new_master_port, $new_master_user,
$new_master_password, $new_master_ssh_user,);GetOptions(
'command=s' => \$command,
'orig_master_is_new_slave' => \$orig_master_is_new_slave,
'orig_master_host=s' => \$orig_master_host,
'orig_master_ip=s' => \$orig_master_ip,
'orig_master_port=i' => \$orig_master_port,
'orig_master_user=s' => \$orig_master_user,
'orig_master_password=s' => \$orig_master_password,
'orig_master_ssh_user=s' => \$orig_master_ssh_user,
'new_master_host=s' => \$new_master_host,
'new_master_ip=s' => \$new_master_ip,
'new_master_port=i' => \$new_master_port,
'new_master_user=s' => \$new_master_user,
'new_master_password=s' => \$new_master_password,
'new_master_ssh_user=s' => \$new_master_ssh_user,);
exit &main();
sub start_vip(){
`ssh $new_master_ssh_user\@$new_master_host \" $ssh_start_vip \"`;
`ssh $new_master_ssh_user\@$new_master_host \" $ssh_send_garp \"`;}
sub stop_vip(){
`ssh $orig_master_ssh_user\@$orig_master_host \" $ssh_stop_vip \"`;}
sub current_time_us {
my ( $sec, $microsec ) = gettimeofday();
my $curdate = localtime($sec);
return $curdate . " " . sprintf( "%06d", $microsec );}
sub sleep_until {
my $elapsed = tv_interval($_tstart);
if ( $_running_interval > $elapsed ) {
sleep( $_running_interval - $elapsed );
}}
sub get_threads_util {
my $dbh = shift;
my $my_connection_id = shift;
my $running_time_threshold = shift;
my $type = shift;
$running_time_threshold = 0 unless ($running_time_threshold);
$type = 0 unless ($type);
my @threads;
my $sth = $dbh->prepare("SHOW PROCESSLIST");
$sth->execute();
while ( my $ref = $sth->fetchrow_hashref() ) {
my $id = $ref->{Id};
my $user = $ref->{User};
my $host = $ref->{Host};
my $command = $ref->{Command};
my $state = $ref->{State};
my $query_time = $ref->{Time};
my $info = $ref->{Info};
$info =~ s/^\s*(.*?)\s*$/$1/ if defined($info);
next if ( $my_connection_id == $id );
next if ( defined($query_time) && $query_time < $running_time_threshold );
next if ( defined($command) && $command eq "Binlog Dump" );
next if ( defined($user) && $user eq "system user" );
next
if ( defined($command)
&& $command eq "Sleep"
&& defined($query_time)
&& $query_time >= 1 );
if ( $type >= 1 ) {
next if ( defined($command) && $command eq "Sleep" );
next if ( defined($command) && $command eq "Connect" );
}
if ( $type >= 2 ) {
next if ( defined($info) && $info =~ m/^select/i );
next if ( defined($info) && $info =~ m/^show/i );
}
push @threads, $ref;
}
return @threads;}
sub main {
if ( $command eq "stop" ) {
## Gracefully killing connections on the current master
# 1. Set read_only= 1 on the new master
# 2. DROP USER so that no app user can establish new connections
# 3. Set read_only= 1 on the current master
# 4. Kill current queries
# * Any database access failure will result in script die.
my $exit_code = 1;
eval {
## Setting read_only=1 on the new master (to avoid accident)
my $new_master_handler = new MHA::DBHelper();
# args: hostname, port, user, password, raise_error(die_on_error)_or_not
$new_master_handler->connect( $new_master_ip, $new_master_port,
$new_master_user, $new_master_password, 1 );
print current_time_us() . " Set read_only on the new master.. ";
$new_master_handler->enable_read_only();
if ( $new_master_handler->is_read_only() ) {
print "ok.\n";
}
else {
die "Failed!\n";
}
$new_master_handler->disconnect();
# Connecting to the orig master, die if any database error happens
my $orig_master_handler = new MHA::DBHelper();
$orig_master_handler->connect( $orig_master_ip, $orig_master_port,
$orig_master_user, $orig_master_password, 1 );
## Drop application user so that nobody can connect. Disabling per-session binlog beforehand
$orig_master_handler->disable_log_bin_local();
# print current_time_us() . " Drpping app user on the orig master..\n";
#drop_app_user($orig_master_handler);
## Waiting for N * 100 milliseconds so that current connections can exit
my $time_until_read_only = 15;
$_tstart = [gettimeofday];
my @threads = get_threads_util( $orig_master_handler->{dbh},
$orig_master_handler->{connection_id} );
while ( $time_until_read_only > 0 && $#threads >= 0 ) {
if ( $time_until_read_only % 5 == 0 ) {
printf
"%s Waiting all running %d threads are disconnected.. (max %d milliseconds)\n",
current_time_us(), $#threads + 1, $time_until_read_only * 100;
if ( $#threads < 5 ) {
print Data::Dumper->new( [$_] )->Indent(0)->Terse(1)->Dump . "\n"
foreach (@threads);
}
}
sleep_until();
$_tstart = [gettimeofday];
$time_until_read_only--;
@threads = get_threads_util( $orig_master_handler->{dbh},
$orig_master_handler->{connection_id} );
}
## Setting read_only=1 on the current master so that nobody(except SUPER) can write
print current_time_us() . " Set read_only=1 on the orig master.. ";
$orig_master_handler->enable_read_only();
if ( $orig_master_handler->is_read_only() ) {
print "ok.\n";
}
else {
die "Failed!\n";
}
## Waiting for M * 100 milliseconds so that current update queries can complete
my $time_until_kill_threads = 5;
@threads = get_threads_util( $orig_master_handler->{dbh},
$orig_master_handler->{connection_id} );
while ( $time_until_kill_threads > 0 && $#threads >= 0 ) {
if ( $time_until_kill_threads % 5 == 0 ) {
printf
"%s Waiting all running %d queries are disconnected.. (max %d milliseconds)\n",
current_time_us(), $#threads + 1, $time_until_kill_threads * 100;
if ( $#threads < 5 ) {
print Data::Dumper->new( [$_] )->Indent(0)->Terse(1)->Dump . "\n"
foreach (@threads);
}
}
sleep_until();
$_tstart = [gettimeofday];
$time_until_kill_threads--;
@threads = get_threads_util( $orig_master_handler->{dbh},
$orig_master_handler->{connection_id} );
}
## Terminating all threads
print current_time_us() . " Killing all application threads..\n";
$orig_master_handler->kill_threads(@threads) if ( $#threads >= 0 );
print current_time_us() . " done.\n";
$orig_master_handler->enable_log_bin_local();
$orig_master_handler->disconnect();
## Droping the VIP
print "Disabling the VIP an old master: $orig_master_host \n";
&stop_vip();
## After finishing the script, MHA executes FLUSH TABLES WITH READ LOCK
$exit_code = 0;
};
if ($@) {
warn "Got Error: $@\n";
exit $exit_code;
}
exit $exit_code;
}
elsif ( $command eq "start" ) {
## Activating master ip on the new master
# 1. Create app user with write privileges
# 2. Moving backup script if needed
# 3. Register new master's ip to the catalog database
# We don't return error even though activating updatable accounts/ip failed so that we don't interrupt slaves' recovery.# If exit code is 0 or 10, MHA does not abort
my $exit_code = 10;
eval {
my $new_master_handler = new MHA::DBHelper();
# args: hostname, port, user, password, raise_error_or_not
$new_master_handler->connect( $new_master_ip, $new_master_port,
$new_master_user, $new_master_password, 1 );
## Set read_only=0 on the new master
$new_master_handler->disable_log_bin_local();
print current_time_us() . " Set read_only=0 on the new master.\n";
$new_master_handler->disable_read_only();
## Creating an app user on the new master
#print current_time_us() . " Creating app user on the new master..\n";
# create_app_user($new_master_handler);
print "Enabling the VIP $vip on the new master: $new_master_host \n";
&start_vip();
$new_master_handler->enable_log_bin_local();
$new_master_handler->disconnect();
## Update master ip on the catalog database, etc
$exit_code = 0;
};
if ($@) {
warn "Got Error: $@\n";
exit $exit_code;
}
exit $exit_code;
}
elsif ( $command eq "status" ) {
# do nothing
exit 0;
}
else {
&usage();
exit 1;
}}
sub usage {
print"Usage: master_ip_online_change --command=start|stop|status --orig_master_host=host --orig_master_ip=ip --orig_master_port=port --new_master_host=host --new_master_ip=ip --new_master_port=port\n";
die;}
vi masterha_secondary_check # 不需要修改
#!/bin/env perl
# Copyright (C) 2011 DeNA Co.,Ltd.
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, write to the Free Software
# Foundation, Inc.,
# 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
use strict;
use warnings FATAL => 'all';
use English qw(-no_match_vars);
use Getopt::Long;
use Pod::Usage;
use MHA::ManagerConst;
my @monitoring_servers;
my (
$help, $version, $ssh_user, $ssh_port,
$ssh_options, $master_host, $master_ip, $master_port,
$master_user, $master_password, $ping_type
);
my $timeout = 5;
$| = 1;
GetOptions(
'help' => \$help,
'version' => \$version,
'secondary_host=s' => \@monitoring_servers,
'user=s' => \$ssh_user,
'port=s' => \$ssh_port,
'options=s' => \$ssh_options,
'master_host=s' => \$master_host,
'master_ip=s' => \$master_ip,
'master_port=i' => \$master_port,
'master_user=s' => \$master_user,
'master_password=s' => \$master_password,
'ping_type=s' => \$ping_type,
'timeout=i' => \$timeout,
);
if ($version) {
print "masterha_secondary_check version $MHA::ManagerConst::VERSION.\n";
exit 0;
}
if ($help) {
pod2usage(0);
}
unless ($master_host) {
pod2usage(1);
}
}
local $SIG{INT} = $SIG{HUP} = $SIG{QUIT} = $SIG{TERM} = \&exit_by_signal;
$ssh_user = "root" unless ($ssh_user);
$ssh_port = 22 unless ($ssh_port);
$master_port = 3306 unless ($master_port);
if ($ssh_options) {
$MHA::ManagerConst::SSH_OPT_CHECK = $ssh_options;
}
$MHA::ManagerConst::SSH_OPT_CHECK =~ s/VAR_CONNECT_TIMEOUT/$timeout/;
# 3: master is reachable from at least one of monitoring servers
my $exit_code = 0;
foreach my $monitoring_server (@monitoring_servers) {
my $ssh_user_host = $ssh_user . '@' . $monitoring_server;
my $command =
"ssh $MHA::ManagerConst::SSH_OPT_CHECK -p $ssh_port $ssh_user_host \"perl -e "
. "\\\"use IO::Socket::INET; my \\\\\\\$sock = IO::Socket::INET->new"
. "(PeerAddr => \\\\\\\"$master_host\\\\\\\", PeerPort=> $master_port, "
. "Proto =>'tcp', Timeout => $timeout); if(\\\\\\\$sock) { close(\\\\\\\$sock); "
. "exit 3; } exit 0;\\\" \"";
my $ret = system($command);
$ret = $ret >> 8;
if ( $ret == 0 ) {
next;
}
if ( $ret == 3 ) {
if ( defined $ping_type
&& $ping_type eq $MHA::ManagerConst::PING_TYPE_INSERT )
{
my $ret_insert;
my $command_insert =
"ssh $MHA::ManagerConst::SSH_OPT_CHECK -p $ssh_port $ssh_user_host \'"
. "/usr/bin/mysql -u$master_user -p$master_password -h$master_host "
. "-e \"CREATE DATABASE IF NOT EXISTS infra; "
my $sigalrm_timeout = 3;
eval {
local $SIG{ALRM} = sub {
die "timeout.\n";
};
alarm $sigalrm_timeout;
$ret_insert = system($command_insert);
$ret_insert = $ret_insert >> 8;
alarm 0;
}
}
print "Master is reachable from $monitoring_server!\n";
}
else {
print "Monitoring server $monitoring_server is NOT reachable!\n";
$exit_code = 2;
last;
}
}
exit $exit_code;
# ############################################################################
# Documentation
# ############################################################################
=pod
=head1 NAME
masterha_secondary_check - Checking master availability from additional network routes
=head1 SYNOPSIS
masterha_secondary_check -s secondary_host1 -s secondary_host2 .. --user=ssh_username --master_host=host --master_ip=ip --master_port=port
See online reference (http://code.google.com/p/mysql-master-ha/wiki/Parameters#secondary_check_script) for details.
=head1 DESCRIPTION
See online reference (http://code.google.com/p/mysql-master-ha/wiki/Parameters#secondary_check_script) for details.
#!/usr/bin/perl
# Copyright (C) 2011 DeNA Co.,Ltd.## This program is free software; you can redistribute it and/or modify# it under the terms of the GNU General Public License as published by# the Free Software Foundation; either version 2 of the License, or# (at your option) any later version.## This program is distributed in the hope that it will be useful,# but WITHOUT ANY WARRANTY; without even the implied warranty of# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the# GNU General Public License for more details.## You should have received a copy of the GNU General Public License# along with this program; if not, write to the Free Software# Foundation, Inc.,# 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
## Note: This is a sample script and is not complete. Modify the script based on your environment.
use strict;use warnings FATAL => 'all';use Mail::Sender;use Getopt::Long;
#new_master_host and new_slave_hosts are set only when recovering master succeededmy ( $dead_master_host, $new_master_host, $new_slave_hosts, $subject, $body );my $smtp='smtp.ym.163.com';my $mail_from='a@163.com';my $mail_user='a@163.com';my $mail_pass='xxxxx';my $mail_to=['b@163.com'];GetOptions(
'orig_master_host=s' => \$dead_master_host,
'new_master_host=s' => \$new_master_host,
'new_slave_hosts=s' => \$new_slave_hosts,
'subject=s' => \$subject,
'body=s' => \$body,);
mailToContacts($smtp,$mail_from,$mail_user,$mail_pass,$mail_to,$subject,$body);
sub mailToContacts {
my ( $smtp, $mail_from, $user, $passwd, $mail_to, $subject, $msg ) = @_;
open my $DEBUG, "> /tmp/monitormail.log"
or die "Can't open the debug file:$!\n";
my $sender = new Mail::Sender {
ctype => 'text/plain; charset=utf-8',
encoding => 'utf-8',
smtp => $smtp,
from => $mail_from,
auth => 'LOGIN',
TLS_allowed => '0',
authid => $user,
authpwd => $passwd,
to => $mail_to,
subject => $subject,
debug => $DEBUG
};
$sender->MailMsg(
{ msg => $msg,
debug => $DEBUG
}
) or print $Mail::Sender::Error;
return 1;}
# Do whatever you want here
exit 0;
-
failover 自动切换测试与检查相关信息
切换的6个步骤
1)配置文件检查
2)宕机master处理,VIP删除,在这个阶段处理
3)复制中继日志给最新的从库
4)含有最新的数据的slave库
5)应用第三部分保存下来的二进制日志
6)将最新数据的从库 提升为 master
7)将其他slave连接到新的master
然后手工停止 192.168.0.51 的 mysql,然后查看日志:
systemctl stop mysqld
################ tail -300f /var/log/masterha/app1/manager.log写成脚本,manager_log###################
vi /usr/local/bin/manager_log
tail -300f /var/log/masterha/app1/manager.log
chmod u+x /usr/local/bin/manager_log
manager_log
####################################################
同时再看看 192.168.0.52/53/54 的 mysql 状态:
mysql -uroot -proot
show slave status\G
然后记得修改新主库的参数 my.cnf:持久化,否则重启出问题
#read_only=1
show variables like 'read_only';
主从数据测试。
create table itpuxdb2.itpuxbak22 (id int,name varchar(40));
insert into itpuxdb2.itpuxbak22 values(1,'itpux211'),(2,'itpux212'),(3,'itpux213'),(4,'itpux214'),(5,'itpux215');
commit;
select * from itpuxdb2.itpuxbak22;
备库检查:
select * from itpuxdb2.itpuxbak22;
注意点 :
1. 切换完之后 发现 MHA Manager 监控程序会自动死掉,解决方式:
这里我们用 shell 脚本的方式去执行就不会发生监控程序死掉的情况
vi /usr/local/bin/manager_start
#!/usr/bin/bash
nohup /usr/local/bin/masterha_manager --conf=/etc/masterha/app1.cnf --remove_dead_master_conf --ignore_last_failover < /dev/null >/var/log/masterha/app1/manager.log 2>&1 &
chmod u+x /usr/local/bin/manager_start
-- 专用脚本
vi /usr/local/bin/manager_status_check
#!/bin/bash
while :
do
MGECHECK=`ps -ef|grep masterha_manager |egrep -v grep| wc -l`
if [ $MGECHECK -eq 0 ];then
/usr/local/bin/manager_start
else
echo "MHA manager start"
fi
sleep 5
done
chmod u+x /usr/local/bin/manager_status_check
nohup /usr/local/bin/manager_status_check &
写入/etc/rc.d/rc.local 开机自动启动
echo "nohup /usr/local/bin/manager_status_check &" >> /etc/rc.d/rc.local
2. 当你修复完死掉的 master 想重新加入先有的两节点 MHA 也是可以的
旧 Master :
记得修改旧主库 51 的参数 my.cnf,打开从库相关的参数,要重启 52的参数要开启主库的参数要打开,备库的参数关闭
(新主库 52 参数需要把从库相关的参数注释掉)
#read_only=1
mysql -uroot -proot
旧 master 直接 change master to:
change master to
master_host='192.168.0.52',
master_port=3306,
master_user='repuser',
master_password='repuser123',
master_auto_position=1;
start slave;
show variables like 'read_only';
看是否会补全数据,发现数据补全了,加入复制没问题。
mysql> select * from itpuxdb2.itpuxbak21;
+------+----------+
| id | name |
+------+----------+
| 1 | itpux211 |
| 2 | itpux212 |
| 3 | itpux213 |
| 4 | itpux214 |
| 5 | itpux215 |
+------+----------+
5 rows in set (0.01 sec)
----------------- 如果配置文件中 51 掉了,需要修改 app1.cnf 把 192.168.1.51 补上 ,并将下面两个地方修改好,此时把0.51当作HA的备机,以便下次宕机,能够切换
[server1]
hostname=192.168.0.52
port=3306
[server2]
hostname=192.168.0.51
port=3306
candidate_master=1
check_repl_delay=0
secondary_check_script= /usr/local/bin/masterha_secondary_check -s deendb02 -s deendb04 -s deendb05
(这个参数,只要加一个备库HA即可)
重启监控程序并查看 MHA 状态:
masterha_stop --conf=/etc/masterha/app1.cnf
ps -ef|grep manager
/usr/local/bin/manager_start
----------------------------------------------------------------
masterha_check_repl --conf=/etc/masterha/app1.cnf
再看日志,是不是有 4 台服务器在运行了:
/usr/local/bin/manager_log
192.168.1.52(192.168.1.52:3306) (current master)
+--192.168.1.51(192.168.1.51:3306)
+--192.168.1.53(192.168.1.53:3306)
+--192.168.1.54(192.168.1.54:3306)
-
手工正常切换测试与检查相关信息
在线切换步骤如下:
01.先停止监控程序
ps -ef|grep manager_status_check
kill -9 2384
masterha_stop --conf=/etc/masterha/app1.cnf
02.修改 master_ip_online_change 脚本如下:
vi /usr/local/bin/master_ip_online_change
my $vip = '192.168.1.60/24';
my $key = '1';
my $ssh_start_vip = "/sbin/ifconfig ens34:$key $vip";
my $ssh_stop_vip = "/sbin/ifconfig ens34:$key down";
my $ssh_user = "root";
my $new_master_password='root';
my $orig_master_password='root';
执行切换:(在线切换)(切换之后要my.cnf的参数文件修改掉)(192.168.0.60这个IP要测试mysql连不连的上)
masterha_master_switch --conf=/etc/masterha/app1.cnf --master_state=alive --new_master_host=192.168.0.51 --new_master_port=3306 --orig_master_is_new_slave
--running_updates_limit=10000
-- 如果是宕机切换(故障切换)
masterha_master_switch --conf=/etc/masterha/app1.cnf --master_state=alive --dead_master_host=192.168.1.52 --dead_master_port=3306 --new_master_host=192.168.1.51 --new_master_port=3306 --ignore_last_failover
查看切换后各机器的状态:
mysql -uroot -proot
show slave status\G
主从数据测试。
create table itpuxdb2.itpuxbak23 (id int,name varchar(40));
insert into itpuxdb2.itpuxbak23 values(1,'itpux231'),(2,'itpux232'),(3,'itpux233'),(4,'itpux234'),(5,'itpux235');
commit;
select * from itpuxdb2.itpuxbak23;
备库检查:
select * from itpuxdb2.itpuxbak23;
重启监控程序并查看 MHA 状态:
masterha_stop --conf=/etc/masterha/app1.cnf
/usr/local/bin/manager_start
masterha_check_repl --conf=/etc/masterha/app1.cnf
nohup /usr/local/bin/manager_status_check &
再看日志,是不是有 4 台服务器在运行了:
tail -300f /var/log/masterha/app1/manager.log (写了脚本的话就执行这个/usr/local/bin/manager_log)