MHA

本文详细介绍了如何在Linux环境下配置MySQL Master High Availability (MHA) 主从复制,使用GTID进行无损同步,并通过MHA Manager实现故障切换。涉及的步骤包括设置hostname、配置主从复制、安装MHA软件、配置SSH无密码登录、配置MHA Monitor以及测试和监控。此外,还涵盖了手动和自动切换测试,确保在主库故障时能平滑过渡。
摘要由CSDN通过智能技术生成
  1. 背景

客户有一个即将上线的互联网产品平台,后台采用的mysql数据库,未来数据量100G+,需要采用MHA主从环境

 

基于MHA实现主从复制,采用GTID+无损同步复制技术,双主多从

 

5台机器

monitor      监控复制组          需要两个ip,而且一个要能通外网

master        写

slave1         读(备用master)

slave2

slave3

 

 

  1. 配置无损GTID主从

 

五台机器配置hostname

 

[root@deendb ~]# cat /etc/hosts

127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4

::1         localhost localhost.localdomain localhost6 localhost6.localdomain6

192.168.0.50 deendb

192.168.0.51 deendb02

192.168.0.52 deendb03

192.168.0.53 deendb04

192.168.0.54 deendb05

192.168.0.55 deendb06

192.168.0.56 deendb07

192.168.0.57 deendb08

192.168.0.61 deenfz01

192.168.0.62 deenfz02

192.168.0.71 deenm01

192.168.0.72 deenm02

 

 

 

配置文件,无损GTID

主库第一台配置下面这个

log_bin=/mysql/log/3306/binlog/mysqldb-binlog

log_bin_index=/mysql/log/3306/binlog/mysqldb-binlog.index

binlog_format=row

binlog_rows_query_log_events=on

 

skip_name_resolve = on

expire_logs_days = 7

innodb_support_xa =1

binlog_cache_size = 1M

max_binlog_size = 2048M

log_bin_trust_function_creators = 1

innodb_flush_log_at_trx_commit =1

sync_binlog = 1

transaction-isolation = read-committed

gtid_mode = on

enforce_gtid_consistency = 1

log-slave-updates = 1

binlog_gtid_simple_recovery=1

#--------------------------

#slave parameter

#--------------------------

#relay_log = /mysql/log/3306/relaylog/mysqldb-relay.log

#read_only=1

#slave-parallel-type=LOGICAL_CLOCK

#slave-parallel-workers=4

#master_info_repository=TABLE

#relay_log_info_repository=TABLE

#relay_log_recovery=1

#slave_skip_errors = ddl_exist_errors

#slave_preserve_commit_order=1

#------------------------------------

#semi sync replication settings

#------------------------------------

plugin_dir=/mysql/app/mysql/lib/plugin/

plugin_load ="rpl_semi_sync_master=semisync_master.so;rpl_semi_sync_slave=semisync_slave.so"

loose_rpl_semi_sync_master_enabled = 1

loose_rpl_semi_sync_slave_enabled = 1

loose_rpl_semi_sync_master_timeout = 5000

rpl_semi_sync_master_wait_point = AFTER_SYNC

rpl_semi_sync_master_wait_for_slave_count = 1

后面三台备库加这个

#slave parameter

#------------------------------------

relay_log = /mysql/log/3306/relaylog/mysqldb-relay.log

log-slave-updates = 1

read_only=1

slave-parallel-type=LOGICAL_CLOCK

slave-parallel-workers=4

master_info_repository=table

relay_log_info_repository=table

relay_log_recovery=1

slave_skip_errors = ddl_exist_errors

slave_preserve_commit_order=1

主库

create user 'repuser'@'%' identified by 'repuser123';

grant replication slave on *.* to 'repuser'@'%';

flush privileges;

select user,host from mysql.user;

三台从库

stop slave;

reset master;

reset slave;

change master to

master_host='192.168.0.51',

master_port=3306,

master_user='repuser',

master_password='repuser123',

master_auto_position=1;

start slave;

show slave status \G

show variables like '%semi_sync%';

show global status like 'rpl_semi%';

测试:

create database itpuxdb2;

create table itpuxdb2.itpuxbak21 (id int,name varchar(40));

insert into itpuxdb2.itpuxbak21 values(1,'itpux211'),(2,'itpux212'),(3,'itpux213'),(4,'itpux214'),(5,'itpux215');

commit;

select * from itpuxdb2.itpuxbak21;

show databases;

  1. 安装MHA

在节点 52,53,54,51 安装MHA node所需perl模块,(DBD:mysql),并下载MHA软件包

cp /mysql/app/mysql/lib/libmysqlclient.so.20 /lib64/

mount /dev/cdrom /mnt

yum install perl-DBD-MySQL -y

yum install -y perl-devel perl-CPAN

 

安装node包

scp 192.168.0.150:/opt/mha4mysql-node-0.58.tar.gz /soft/

cd /soft/

tar xvf mha4mysql-node-0.58.tar.gz

cd mha4mysql-node-0.58

perl Makefile.PL

make && make install

安装完成将产生 4 个文件如下:

ls -lsa /usr/local/bin

echo "export PATH=\$PATH:/usr/local/bin" >> ~/.bash_profile

source ~/.bash_profile

monitor 节点

mount /dev/cdrom /mnt

###rpm -ivh  epel-release-7-11.noarch.rpm                阿里云的yum源   这里要联网,上面配置完成后,生成其他repo

### - Log::Dispatch         ...missing.

###- Parallel::ForkManager ...missing.

yum install -y perl-devel perl-CPAN perl-DBD-MySQL

yum install perl-DBD-MySQL perl-Config-Tiny perl-Log-Dispatch perl-Parallel-ForkManager perl-Time-HiRes -y

#邮件发送的包

yum install  perl-Email-Date-Format  perl-Mail-Sender  perl-Mail-Send-Loop  perl-Mail-Sendmail  perl-MIME-Lite  perl-MIME-Types  perl-Params-Validate -y 

 

#########上面这些都是mha4mysql-manager-0.58  所需 perl的依赖

cd /opt/

tar zxvf Config-Tiny-2.23.tgz

cd Config-Tiny-2.23

perl Makefile.PL

make && make install

cd /opt/

tar xvf mha4mysql-node-0.58.tar.gz

cd mha4mysql-node-0.58

perl Makefile.PL

make && make install

cd /opt/

tar xvf mha4mysql-manager-0.58.tar.gz

cd mha4mysql-manager-0.58

perl Makefile.PL

make && make install

图解:这7个加载好了,才能说明是正常的,不能是miss

echo "export PATH=\$PATH:/usr/local/bin" >> ~/.bash_profile

source ~/.bash_profile

[root@itpuxdb1 scripts]# pwd

ls -lsa /opt/mha4mysql-manager-0.58/samples/scripts

ls -lsa /usr/local/bin

cp /opt/mha4mysql-manager-0.58/samples/scripts/* /usr/local/bin/

图解:这些是配置好后的脚本

  1. 快速配置 SSH 登录无密码验证

ssh-keygen -t rsa

ssh-copy-id -i ~/.ssh/id_rsa.pub root@192.168.1.138

 

互信做过,但报错ERROR: ECDSA host key for 192.168.1.140 has changed and you have requested strict checking.

执行它就好了

ssh-keygen -R 192.168.1.139  

 

ssh deendb02 date

ssh deendb03 date

ssh deendb04 date

ssh deendb05 date

ssh deendb date

  1. MHA 配置(monitor)

monitor这台机器上

mkdir -p /etc/masterha

这里的名字app1.cnf在生产环境一定要规范  比如:mysql_ydpt_app1.cnf       *****app2.cnf

cp /opt/mha4mysql-manager-0.58/samples/conf/app1.cnf /etc/masterha/app1.cnf

 vi /etc/masterha/app1.cnf

[server default]

manager_workdir=/var/log/masterha/app1

manager_log=/var/log/masterha/app1/manager.log

master_binlog_dir=/mysql/log/3306/binlog

master_ip_failover_script= /usr/local/bin/master_ip_failover

master_ip_online_change_script= /usr/local/bin/master_ip_online_change

password=root

user=root

ping_interval=1

remote_workdir=/tmp

repl_password=repuser123

repl_user=repuser

report_script=/usr/local/bin/send_report

secondary_check_script= /usr/local/bin/masterha_secondary_check -s deendb03 -s deendb04 -s deendb05

shutdown_script=""

ssh_user=root

[server1]

hostname=192.168.0.51

port=3306

[server2]

hostname=192.168.0.52

port=3306

candidate_master=1

check_repl_delay=0

[server3]

hostname=192.168.0.53

port=3306

[server4]

hostname=192.168.0.54

port=3306

  1. 设置定期清理relay脚本

四个数据库节点执行,关闭 清除中继日志 ,MHA依赖中继日志

mysql -uroot -proot -e 'set global relay_log_purge=0'

对relay_log的控制,自己写个脚本

 

touch /usr/local/bin/purge_relay_log.sh

vi /usr/local/bin/purge_relay_log.sh

#!/bin/bash

user=root

passwd=root

port=3306

socket=/tmp/mysql.sock

log_dir='/var/log/masterha/log'

work_dir='/mysql/log/3306/relaylog'

purge='/usr/local/bin/purge_relay_logs'

if [ ! -d $log_dir ]

then

mkdir $log_dir -p

fi

$purge --user=$user --password=$passwd -S $socket --host=localhost --disable_relay_log_purge --port=$port --workdir=$work_dir >> $log_dir/purge_relay_logs.log 2>&1

 

4 数据库机器:

scp 192.168.0.51:/usr/local/bin/purge_relay_log.sh /usr/local/bin

chmod +x /usr/local/bin/purge_relay_log.sh

crontab -e

0 12 * * * /bin/bash /usr/local/bin/purge_relay_log.sh

#试一下看看会不会报错(不抱任何信息是正常)         

/usr/local/bin/purge_relay_log.sh

cd /var/log/masterha/log

more purge_relay_logs.log

这里看到有报错,不能找到 mysql.sock位置

 

 

 

  1. 检查    都在0.150上面,monitor那台上执行

检查 MHA Manger 到所有 MHA Node 的 SSH 连接状态:

masterha_check_ssh --conf=/etc/masterha/app1.cnf

检查整个复制环境状况。

masterha_check_repl --conf=/etc/masterha/app1.cnf

图解:master_ip_failover_script is notdefined.

这里先把 app1.cnf 里面

#master_ip_failover_script= /usr/local/bin/master_ip_failover         这个选项屏蔽才可以通过。

检查 MHA Manager 的状态

masterha_check_status --conf=/etc/masterha/app1.cnf

app1 is stopped(2:NOT_RUNNING).

手动启动

mkdir -p /var/log/masterha/app1

chmod -R 777 /var/log/masterha/app1

nohup masterha_manager --conf=/etc/masterha/app1.cnf --remove_dead_master_conf --ignore_last_failover < /dev/null > /var/log/masterha/app1/manager.log 2>&1 &

masterha_check_status --conf=/etc/masterha/app1.cnf

启动参数讲解:

remove_dead_master_conf

manager_log

ignore_last_failover       #如果MHA检测到主库连发生宕机,两次宕机间隔不足8小时,则不会发生切换,加这个参数可以避免这种情况

查看日志:

tail -100f /var/log/masterha/app1/manager.log

图解:这是正常的

 

  1. monitor上  配置四个脚本,vip改掉,网卡改掉(这是手动脚本,下一个讲keepalive方式)

master上面添加网卡0.60:     可以把下面这条命令加到开机自启的命令行下,但是切换之后,要ping下这个地址

/sbin/ifconfig ens34:1 192.168.0.60/24

在三台备机上执行,mysql -uroot -proot -h 192.168.0.60     

mysql> show variables like "hostname";

 

monitor上操作

vi /usr/local/bin/master_ip_failover   

--------------------------------------

#!/usr/bin/env perl

 

#  Copyright (C) 2011 DeNA Co.,Ltd.

#

#  This program is free software; you can redistribute it and/or modify

#  it under the terms of the GNU General Public License as published by

#  the Free Software Foundation; either version 2 of the License, or

#  (at your option) any later version.

#

#  This program is distributed in the hope that it will be useful,

#  but WITHOUT ANY WARRANTY; without even the implied warranty of

#  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the

#  GNU General Public License for more details.

#

#  You should have received a copy of the GNU General Public License

#   along with this program; if not, write to the Free Software

#  Foundation, Inc.,

#  51 Franklin Street, Fifth Floor, Boston, MA  02110-1301  USA

 

## Note: This is a sample script and is not complete. Modify the script based on your environment.

 

use strict;

use warnings FATAL => 'all';

 

use Getopt::Long;

use MHA::DBHelper;

my (

  $command,        $ssh_user,         $orig_master_host,

  $orig_master_ip, $orig_master_port, $new_master_host,

  $new_master_ip,  $new_master_port,  $new_master_user,

  $new_master_password

);

 

my $vip = '192.168.0.60/24';

my $key = "2";

my $ssh_start_vip = "/sbin/ifconfig ens34:$key $vip/24";

my $ssh_stop_vip = "/sbin/ifconfig ens34:$key down";

my $ssh_send_garp = "/sbin/arping -U $vip -I eth0 -c 1";

 

 

GetOptions(

  'command=s'             => \$command,

  'ssh_user=s'            => \$ssh_user,

  'orig_master_host=s'    => \$orig_master_host,

  'orig_master_ip=s'      => \$orig_master_ip,

  'orig_master_port=i'    => \$orig_master_port,

  'new_master_host=s'     => \$new_master_host,

  'new_master_ip=s'       => \$new_master_ip,

  'new_master_port=i'     => \$new_master_port,

  'new_master_user=s'     => \$new_master_user,

  'new_master_password=s' => \$new_master_password,

);

 

exit &main();

 

sub main {

  if ( $command eq "stop" || $command eq "stopssh" ) {

 

    # $orig_master_host, $orig_master_ip, $orig_master_port are passed.

    # If you manage master ip address at global catalog database,

    # invalidate orig_master_ip here.

    my $exit_code = 1;

    eval {

      print "Disabling the VIP an old master: $orig_master_host \n";

      &stop_vip();

      $exit_code = 0;

    };

    if ($@) {

      warn "Got Error: $@\n";

      exit $exit_code;

    }

    exit $exit_code;

  }

  elsif ( $command eq "start" ) {

 

    # all arguments are passed.

    # If you manage master ip address at global catalog database,

    # activate new_master_ip here.

    # You can also grant write access (create user, set read_only=0, etc) here.

    my $exit_code = 10;

    eval {

      

      my $new_master_handler = new MHA::DBHelper();

 

      # args: hostname, port, user, password, raise_error_or_not

      $new_master_handler->connect( $new_master_ip, $new_master_port,

        $new_master_user, $new_master_password, 1 );

 

      ## Set read_only=0 on the new master

      $new_master_handler->disable_log_bin_local();

      print "Set read_only=0 on the new master.\n";

      $new_master_handler->disable_read_only();

 

      ## Creating an app user on the new master

      # print "Creating app user on the new master..\n";

      # FIXME_xxx_create_user( $new_master_handler->{dbh} );

      $new_master_handler->enable_log_bin_local();

      $new_master_handler->disconnect();

 

      print "Enabling the VIP $vip on the new master: $new_master_host \n";

      &start_vip();

      $exit_code = 0;

    };

    if ($@) {

      warn $@;

 

      # If you want to continue failover, exit 10.

      exit $exit_code;

    }

    exit $exit_code;

  }

  elsif ( $command eq "status" ) {

 

    # do nothing

    exit 0;

  }

  else {

    &usage();

    exit 1;

  }

}

 

sub start_vip(){

    `ssh $ssh_user\@$new_master_host \" $ssh_start_vip \"`;

    `ssh $ssh_user\@$new_master_host \" $ssh_send_garp \"`;

}

 

sub stop_vip(){

    return 0  unless  ($ssh_user);

    `ssh $ssh_user\@$orig_master_host \" $ssh_stop_vip \"`;

}

 

sub usage {

  print

"Usage: master_ip_failover --command=start|stop|stopssh|status --orig_master_host=host --orig_master_ip=ip --orig_master_port=port --new_master_host=host --new_master_ip=ip --new_master_port=port\n";

}

 

vi /etc/masterha/app1.cnf     #解开注释

masterha_check_repl --conf=/etc/masterha/app1.cnf

状态 OK

masterha_stop --conf=/etc/masterha/app1.cnf

ps -ef|grep manager

vi /usr/local/bin/manager_start       #启动manager的脚本

#!/bin/bash

nohup masterha_manager --conf=/etc/masterha/app1.cnf --remove_dead_master_conf --ignore_last_failover < /dev/null > /var/log/masterha/app1/manager.log 2>&1 &

chmod u+x /usr/local/bin/manager_start

vi master_ip_online_change

#!/usr/bin/env perl

 

# Copyright (C) 2011 DeNA Co.,Ltd.## This program is free software; you can redistribute it and/or modify# it under the terms of the GNU General Public License as published by# the Free Software Foundation; either version 2 of the License, or# (at your option) any later version.## This program is distributed in the hope that it will be useful,# but WITHOUT ANY WARRANTY; without even the implied warranty of# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the# GNU General Public License for more details.## You should have received a copy of the GNU General Public License# along with this program; if not, write to the Free Software# Foundation, Inc.,# 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA

 

## Note: This is a sample script and is not complete. Modify the script based on your environment.

 

use strict;use warnings FATAL => 'all';

 

use Getopt::Long;use MHA::DBHelper;use MHA::NodeUtil;use Time::HiRes qw( sleep gettimeofday tv_interval );use Data::Dumper;

 

my $_tstart;my $_running_interval = 0.1;

 

my $vip = '192.168.0.60';my $key = "2";

my $ssh_start_vip = "/sbin/ifconfig ens34:$key $vip/24";

my $ssh_stop_vip = "/sbin/ifconfig ens34:$key down";

my $ssh_send_garp = "/sbin/arping -U $vip -I ens34 -c 1";

my $orig_master_ssh_port = 22;

my $new_master_ssh_port = 22;

my $ssh_user = "root";

my $new_master_password='root';

my $orig_master_password='root';

 

 

my (

$command, $orig_master_is_new_slave, $orig_master_host,

$orig_master_ip, $orig_master_port, $orig_master_user,

$orig_master_password, $orig_master_ssh_user, $new_master_host,

$new_master_ip, $new_master_port, $new_master_user,

$new_master_password, $new_master_ssh_user,);GetOptions(

'command=s' => \$command,

'orig_master_is_new_slave' => \$orig_master_is_new_slave,

'orig_master_host=s' => \$orig_master_host,

'orig_master_ip=s' => \$orig_master_ip,

'orig_master_port=i' => \$orig_master_port,

'orig_master_user=s' => \$orig_master_user,

'orig_master_password=s' => \$orig_master_password,

'orig_master_ssh_user=s' => \$orig_master_ssh_user,

'new_master_host=s' => \$new_master_host,

'new_master_ip=s' => \$new_master_ip,

'new_master_port=i' => \$new_master_port,

'new_master_user=s' => \$new_master_user,

'new_master_password=s' => \$new_master_password,

'new_master_ssh_user=s' => \$new_master_ssh_user,);

 

exit &main();

 

sub start_vip(){

`ssh $new_master_ssh_user\@$new_master_host \" $ssh_start_vip \"`;

`ssh $new_master_ssh_user\@$new_master_host \" $ssh_send_garp \"`;}

 

sub stop_vip(){

`ssh $orig_master_ssh_user\@$orig_master_host \" $ssh_stop_vip \"`;}

 

 

sub current_time_us {

my ( $sec, $microsec ) = gettimeofday();

my $curdate = localtime($sec);

return $curdate . " " . sprintf( "%06d", $microsec );}

 

sub sleep_until {

my $elapsed = tv_interval($_tstart);

if ( $_running_interval > $elapsed ) {

sleep( $_running_interval - $elapsed );

}}

 

sub get_threads_util {

my $dbh = shift;

my $my_connection_id = shift;

my $running_time_threshold = shift;

my $type = shift;

$running_time_threshold = 0 unless ($running_time_threshold);

$type = 0 unless ($type);

my @threads;

 

my $sth = $dbh->prepare("SHOW PROCESSLIST");

$sth->execute();

 

while ( my $ref = $sth->fetchrow_hashref() ) {

my $id = $ref->{Id};

my $user = $ref->{User};

my $host = $ref->{Host};

my $command = $ref->{Command};

my $state = $ref->{State};

my $query_time = $ref->{Time};

my $info = $ref->{Info};

$info =~ s/^\s*(.*?)\s*$/$1/ if defined($info);

next if ( $my_connection_id == $id );

next if ( defined($query_time) && $query_time < $running_time_threshold );

next if ( defined($command) && $command eq "Binlog Dump" );

next if ( defined($user) && $user eq "system user" );

next

if ( defined($command)

&& $command eq "Sleep"

&& defined($query_time)

&& $query_time >= 1 );

 

if ( $type >= 1 ) {

next if ( defined($command) && $command eq "Sleep" );

next if ( defined($command) && $command eq "Connect" );

}

 

if ( $type >= 2 ) {

next if ( defined($info) && $info =~ m/^select/i );

next if ( defined($info) && $info =~ m/^show/i );

}

 

push @threads, $ref;

}

return @threads;}

 

sub main {

if ( $command eq "stop" ) {

## Gracefully killing connections on the current master

# 1. Set read_only= 1 on the new master

# 2. DROP USER so that no app user can establish new connections

# 3. Set read_only= 1 on the current master

# 4. Kill current queries

# * Any database access failure will result in script die.

my $exit_code = 1;

eval {

## Setting read_only=1 on the new master (to avoid accident)

my $new_master_handler = new MHA::DBHelper();

 

# args: hostname, port, user, password, raise_error(die_on_error)_or_not

$new_master_handler->connect( $new_master_ip, $new_master_port,

$new_master_user, $new_master_password, 1 );

print current_time_us() . " Set read_only on the new master.. ";

$new_master_handler->enable_read_only();

if ( $new_master_handler->is_read_only() ) {

print "ok.\n";

}

else {

die "Failed!\n";

}

$new_master_handler->disconnect();

 

# Connecting to the orig master, die if any database error happens

my $orig_master_handler = new MHA::DBHelper();

$orig_master_handler->connect( $orig_master_ip, $orig_master_port,

$orig_master_user, $orig_master_password, 1 );

 

## Drop application user so that nobody can connect. Disabling per-session binlog beforehand

$orig_master_handler->disable_log_bin_local();

# print current_time_us() . " Drpping app user on the orig master..\n";

#drop_app_user($orig_master_handler);

 

## Waiting for N * 100 milliseconds so that current connections can exit

my $time_until_read_only = 15;

$_tstart = [gettimeofday];

my @threads = get_threads_util( $orig_master_handler->{dbh},

$orig_master_handler->{connection_id} );

while ( $time_until_read_only > 0 && $#threads >= 0 ) {

if ( $time_until_read_only % 5 == 0 ) {

printf

"%s Waiting all running %d threads are disconnected.. (max %d milliseconds)\n",

current_time_us(), $#threads + 1, $time_until_read_only * 100;

if ( $#threads < 5 ) {

print Data::Dumper->new( [$_] )->Indent(0)->Terse(1)->Dump . "\n"

foreach (@threads);

}

}

sleep_until();

$_tstart = [gettimeofday];

$time_until_read_only--;

@threads = get_threads_util( $orig_master_handler->{dbh},

$orig_master_handler->{connection_id} );

}

 

## Setting read_only=1 on the current master so that nobody(except SUPER) can write

print current_time_us() . " Set read_only=1 on the orig master.. ";

$orig_master_handler->enable_read_only();

if ( $orig_master_handler->is_read_only() ) {

print "ok.\n";

}

else {

die "Failed!\n";

}

 

## Waiting for M * 100 milliseconds so that current update queries can complete

my $time_until_kill_threads = 5;

@threads = get_threads_util( $orig_master_handler->{dbh},

$orig_master_handler->{connection_id} );

while ( $time_until_kill_threads > 0 && $#threads >= 0 ) {

if ( $time_until_kill_threads % 5 == 0 ) {

printf

"%s Waiting all running %d queries are disconnected.. (max %d milliseconds)\n",

current_time_us(), $#threads + 1, $time_until_kill_threads * 100;

if ( $#threads < 5 ) {

print Data::Dumper->new( [$_] )->Indent(0)->Terse(1)->Dump . "\n"

foreach (@threads);

}

}

sleep_until();

$_tstart = [gettimeofday];

$time_until_kill_threads--;

@threads = get_threads_util( $orig_master_handler->{dbh},

$orig_master_handler->{connection_id} );

}

 

## Terminating all threads

print current_time_us() . " Killing all application threads..\n";

$orig_master_handler->kill_threads(@threads) if ( $#threads >= 0 );

print current_time_us() . " done.\n";

$orig_master_handler->enable_log_bin_local();

$orig_master_handler->disconnect();

 

## Droping the VIP

print "Disabling the VIP an old master: $orig_master_host \n";

&stop_vip();

 

## After finishing the script, MHA executes FLUSH TABLES WITH READ LOCK

$exit_code = 0;

};

if ($@) {

warn "Got Error: $@\n";

exit $exit_code;

}

exit $exit_code;

}

elsif ( $command eq "start" ) {

## Activating master ip on the new master

# 1. Create app user with write privileges

# 2. Moving backup script if needed

# 3. Register new master's ip to the catalog database

 

# We don't return error even though activating updatable accounts/ip failed so that we don't interrupt slaves' recovery.# If exit code is 0 or 10, MHA does not abort

my $exit_code = 10;

eval {

my $new_master_handler = new MHA::DBHelper();

 

# args: hostname, port, user, password, raise_error_or_not

$new_master_handler->connect( $new_master_ip, $new_master_port,

$new_master_user, $new_master_password, 1 );

 

## Set read_only=0 on the new master

$new_master_handler->disable_log_bin_local();

print current_time_us() . " Set read_only=0 on the new master.\n";

$new_master_handler->disable_read_only();

 

## Creating an app user on the new master

#print current_time_us() . " Creating app user on the new master..\n";

# create_app_user($new_master_handler);

print "Enabling the VIP $vip on the new master: $new_master_host \n";

&start_vip();

$new_master_handler->enable_log_bin_local();

$new_master_handler->disconnect();

 

## Update master ip on the catalog database, etc

$exit_code = 0;

};

if ($@) {

warn "Got Error: $@\n";

exit $exit_code;

}

exit $exit_code;

}

elsif ( $command eq "status" ) {

 

# do nothing

exit 0;

}

else {

&usage();

exit 1;

}}

 

sub usage {

print"Usage: master_ip_online_change --command=start|stop|status --orig_master_host=host --orig_master_ip=ip --orig_master_port=port --new_master_host=host --new_master_ip=ip --new_master_port=port\n";

die;}

vi masterha_secondary_check     # 不需要修改

#!/bin/env perl

 

#  Copyright (C) 2011 DeNA Co.,Ltd.

#

#  This program is free software; you can redistribute it and/or modify

#  it under the terms of the GNU General Public License as published by

#  the Free Software Foundation; either version 2 of the License, or

#  (at your option) any later version.

#

#  This program is distributed in the hope that it will be useful,

#  but WITHOUT ANY WARRANTY; without even the implied warranty of

#  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the

#  GNU General Public License for more details.

#

#  You should have received a copy of the GNU General Public License

#   along with this program; if not, write to the Free Software

#  Foundation, Inc.,

#  51 Franklin Street, Fifth Floor, Boston, MA  02110-1301  USA

 

use strict;

use warnings FATAL => 'all';

 

use English qw(-no_match_vars);

use Getopt::Long;

use Pod::Usage;

use MHA::ManagerConst;

 

my @monitoring_servers;

my (

  $help,        $version,         $ssh_user,  $ssh_port,

  $ssh_options, $master_host,     $master_ip, $master_port,

  $master_user, $master_password, $ping_type

);

my $timeout = 5;

 

$| = 1;

GetOptions(

  'help'              => \$help,

  'version'           => \$version,

  'secondary_host=s'  => \@monitoring_servers,

  'user=s'            => \$ssh_user,

  'port=s'            => \$ssh_port,

  'options=s'         => \$ssh_options,

  'master_host=s'     => \$master_host,

  'master_ip=s'       => \$master_ip,

  'master_port=i'     => \$master_port,

  'master_user=s'     => \$master_user,

  'master_password=s' => \$master_password,

  'ping_type=s'       => \$ping_type,

  'timeout=i'         => \$timeout,

);

 

if ($version) {

  print "masterha_secondary_check version $MHA::ManagerConst::VERSION.\n";

  exit 0;

}

 

if ($help) {

  pod2usage(0);

}

 

unless ($master_host) {

  pod2usage(1);

}

 

}

local $SIG{INT} = $SIG{HUP} = $SIG{QUIT} = $SIG{TERM} = \&exit_by_signal;

 

$ssh_user    = "root" unless ($ssh_user);

$ssh_port    = 22     unless ($ssh_port);

$master_port = 3306   unless ($master_port);

 

if ($ssh_options) {

  $MHA::ManagerConst::SSH_OPT_CHECK = $ssh_options;

}

$MHA::ManagerConst::SSH_OPT_CHECK =~ s/VAR_CONNECT_TIMEOUT/$timeout/;

# 3: master is reachable from at least one of monitoring servers

my $exit_code = 0;

 

foreach my $monitoring_server (@monitoring_servers) {

  my $ssh_user_host = $ssh_user . '@' . $monitoring_server;

  my $command =

"ssh $MHA::ManagerConst::SSH_OPT_CHECK -p $ssh_port $ssh_user_host \"perl -e "

    . "\\\"use IO::Socket::INET; my \\\\\\\$sock = IO::Socket::INET->new"

    . "(PeerAddr => \\\\\\\"$master_host\\\\\\\", PeerPort=> $master_port, "

    . "Proto =>'tcp', Timeout => $timeout); if(\\\\\\\$sock) { close(\\\\\\\$sock); "

    . "exit 3; } exit 0;\\\" \"";

  my $ret = system($command);

  $ret = $ret >> 8;

  if ( $ret == 0 ) {

    print

    next;

  }

  if ( $ret == 3 ) {

    if ( defined $ping_type

      && $ping_type eq $MHA::ManagerConst::PING_TYPE_INSERT )

    {

      my $ret_insert;

      my $command_insert =

          "ssh $MHA::ManagerConst::SSH_OPT_CHECK -p $ssh_port $ssh_user_host \'"

        . "/usr/bin/mysql -u$master_user -p$master_password -h$master_host "

        . "-e \"CREATE DATABASE IF NOT EXISTS infra; "

      my $sigalrm_timeout = 3;

      eval {

        local $SIG{ALRM} = sub {

          die "timeout.\n";

        };

        alarm $sigalrm_timeout;

        $ret_insert = system($command_insert);

        $ret_insert = $ret_insert >> 8;

        alarm 0;

        print

      }

    }

    print "Master is reachable from $monitoring_server!\n";

  }

  else {

    print "Monitoring server $monitoring_server is NOT reachable!\n";

    $exit_code = 2;

    last;

  }

}

 

exit $exit_code;

 

# ############################################################################

# Documentation

# ############################################################################

 

=pod

 

=head1 NAME

 

masterha_secondary_check - Checking master availability from additional network routes

 

=head1 SYNOPSIS

 

masterha_secondary_check -s secondary_host1 -s secondary_host2 .. --user=ssh_username --master_host=host --master_ip=ip --master_port=port

 

See online reference (http://code.google.com/p/mysql-master-ha/wiki/Parameters#secondary_check_script) for details.

 

=head1 DESCRIPTION

 

See online reference (http://code.google.com/p/mysql-master-ha/wiki/Parameters#secondary_check_script) for details.

#!/usr/bin/perl

 

# Copyright (C) 2011 DeNA Co.,Ltd.## This program is free software; you can redistribute it and/or modify# it under the terms of the GNU General Public License as published by# the Free Software Foundation; either version 2 of the License, or# (at your option) any later version.## This program is distributed in the hope that it will be useful,# but WITHOUT ANY WARRANTY; without even the implied warranty of# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the# GNU General Public License for more details.## You should have received a copy of the GNU General Public License# along with this program; if not, write to the Free Software# Foundation, Inc.,# 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA

 

## Note: This is a sample script and is not complete. Modify the script based on your environment.

 

use strict;use warnings FATAL => 'all';use Mail::Sender;use Getopt::Long;

 

#new_master_host and new_slave_hosts are set only when recovering master succeededmy ( $dead_master_host, $new_master_host, $new_slave_hosts, $subject, $body );my $smtp='smtp.ym.163.com';my $mail_from='a@163.com';my $mail_user='a@163.com';my $mail_pass='xxxxx';my $mail_to=['b@163.com'];GetOptions(

'orig_master_host=s' => \$dead_master_host,

'new_master_host=s' => \$new_master_host,

'new_slave_hosts=s' => \$new_slave_hosts,

'subject=s' => \$subject,

'body=s' => \$body,);

mailToContacts($smtp,$mail_from,$mail_user,$mail_pass,$mail_to,$subject,$body);

 

sub mailToContacts {

my ( $smtp, $mail_from, $user, $passwd, $mail_to, $subject, $msg ) = @_;

open my $DEBUG, "> /tmp/monitormail.log"

or die "Can't open the debug file:$!\n";

my $sender = new Mail::Sender {

ctype => 'text/plain; charset=utf-8',

encoding => 'utf-8',

smtp => $smtp,

from => $mail_from,

auth => 'LOGIN',

TLS_allowed => '0',

authid => $user,

authpwd => $passwd,

to => $mail_to,

subject => $subject,

debug => $DEBUG

};

 

$sender->MailMsg(

{ msg => $msg,

debug => $DEBUG

}

) or print $Mail::Sender::Error;

return 1;}

# Do whatever you want here

exit 0;

  1. failover 自动切换测试与检查相关信息

切换的6个步骤

1)配置文件检查

2)宕机master处理,VIP删除,在这个阶段处理

3)复制中继日志给最新的从库

4)含有最新的数据的slave库

5)应用第三部分保存下来的二进制日志

6)将最新数据的从库 提升为 master

7)将其他slave连接到新的master

 

然后手工停止 192.168.0.51 的 mysql,然后查看日志:

systemctl stop mysqld

################ tail -300f /var/log/masterha/app1/manager.log写成脚本,manager_log###################

vi /usr/local/bin/manager_log

tail -300f /var/log/masterha/app1/manager.log

chmod u+x /usr/local/bin/manager_log

manager_log

####################################################

同时再看看 192.168.0.52/53/54 的 mysql 状态:

mysql -uroot -proot

show slave status\G

然后记得修改新主库的参数 my.cnf:持久化,否则重启出问题

#read_only=1

show variables like 'read_only';

 

主从数据测试。

create table itpuxdb2.itpuxbak22 (id int,name varchar(40));

insert into itpuxdb2.itpuxbak22 values(1,'itpux211'),(2,'itpux212'),(3,'itpux213'),(4,'itpux214'),(5,'itpux215');

commit;

select * from itpuxdb2.itpuxbak22;

备库检查:

select * from itpuxdb2.itpuxbak22;

注意点 :

1. 切换完之后 发现 MHA Manager 监控程序会自动死掉,解决方式:

这里我们用 shell 脚本的方式去执行就不会发生监控程序死掉的情况

vi /usr/local/bin/manager_start

#!/usr/bin/bash

nohup /usr/local/bin/masterha_manager --conf=/etc/masterha/app1.cnf --remove_dead_master_conf --ignore_last_failover < /dev/null >/var/log/masterha/app1/manager.log 2>&1 &

chmod u+x /usr/local/bin/manager_start

-- 专用脚本

vi /usr/local/bin/manager_status_check

#!/bin/bash

while :

do

MGECHECK=`ps -ef|grep masterha_manager |egrep -v grep| wc -l`

if [ $MGECHECK -eq 0 ];then

/usr/local/bin/manager_start

else

echo "MHA manager start"

fi

sleep 5

done

 

chmod u+x /usr/local/bin/manager_status_check

nohup /usr/local/bin/manager_status_check &

写入/etc/rc.d/rc.local 开机自动启动

echo "nohup /usr/local/bin/manager_status_check &" >> /etc/rc.d/rc.local

2. 当你修复完死掉的 master 想重新加入先有的两节点 MHA 也是可以的

旧 Master :

记得修改旧主库 51 的参数 my.cnf,打开从库相关的参数,要重启       52的参数要开启主库的参数要打开,备库的参数关闭

(新主库 52 参数需要把从库相关的参数注释掉)

#read_only=1

mysql -uroot -proot

旧 master 直接 change master to:

change master to

master_host='192.168.0.52',

master_port=3306,

master_user='repuser',

master_password='repuser123',

master_auto_position=1;

start slave;

 

show variables like 'read_only';

看是否会补全数据,发现数据补全了,加入复制没问题。

mysql> select * from itpuxdb2.itpuxbak21;

+------+----------+

| id | name |

+------+----------+

| 1 | itpux211 |

| 2 | itpux212 |

| 3 | itpux213 |

| 4 | itpux214 |

| 5 | itpux215 |

+------+----------+

5 rows in set (0.01 sec)

----------------- 如果配置文件中 51 掉了,需要修改 app1.cnf 把 192.168.1.51 补上 ,并将下面两个地方修改好,此时把0.51当作HA的备机,以便下次宕机,能够切换

[server1]

hostname=192.168.0.52

port=3306

[server2]

hostname=192.168.0.51

port=3306

candidate_master=1

check_repl_delay=0

secondary_check_script= /usr/local/bin/masterha_secondary_check -s deendb02 -s deendb04 -s deendb05

(这个参数,只要加一个备库HA即可)

 

重启监控程序并查看 MHA 状态:

masterha_stop --conf=/etc/masterha/app1.cnf

ps -ef|grep manager

/usr/local/bin/manager_start

----------------------------------------------------------------

masterha_check_repl --conf=/etc/masterha/app1.cnf

再看日志,是不是有 4 台服务器在运行了:

/usr/local/bin/manager_log

192.168.1.52(192.168.1.52:3306) (current master)

+--192.168.1.51(192.168.1.51:3306)

+--192.168.1.53(192.168.1.53:3306)

+--192.168.1.54(192.168.1.54:3306)

  1. 手工正常切换测试与检查相关信息

在线切换步骤如下:

 

01.先停止监控程序

ps -ef|grep manager_status_check

kill -9 2384

masterha_stop --conf=/etc/masterha/app1.cnf

02.修改 master_ip_online_change 脚本如下:

vi /usr/local/bin/master_ip_online_change

my $vip = '192.168.1.60/24';

my $key = '1';

my $ssh_start_vip = "/sbin/ifconfig ens34:$key $vip";

my $ssh_stop_vip = "/sbin/ifconfig ens34:$key down";

my $ssh_user = "root";

my $new_master_password='root';

my $orig_master_password='root';

执行切换:(在线切换)(切换之后要my.cnf的参数文件修改掉)(192.168.0.60这个IP要测试mysql连不连的上)

masterha_master_switch --conf=/etc/masterha/app1.cnf --master_state=alive --new_master_host=192.168.0.51 --new_master_port=3306 --orig_master_is_new_slave

--running_updates_limit=10000

-- 如果是宕机切换(故障切换)

masterha_master_switch --conf=/etc/masterha/app1.cnf --master_state=alive --dead_master_host=192.168.1.52 --dead_master_port=3306 --new_master_host=192.168.1.51 --new_master_port=3306 --ignore_last_failover

 

查看切换后各机器的状态:

mysql -uroot -proot

show slave status\G

主从数据测试。

create table itpuxdb2.itpuxbak23 (id int,name varchar(40));

insert into itpuxdb2.itpuxbak23 values(1,'itpux231'),(2,'itpux232'),(3,'itpux233'),(4,'itpux234'),(5,'itpux235');

commit;

select * from itpuxdb2.itpuxbak23;

备库检查:

select * from itpuxdb2.itpuxbak23;

重启监控程序并查看 MHA 状态:

masterha_stop --conf=/etc/masterha/app1.cnf

/usr/local/bin/manager_start

masterha_check_repl --conf=/etc/masterha/app1.cnf

nohup /usr/local/bin/manager_status_check &

再看日志,是不是有 4 台服务器在运行了:

tail -300f /var/log/masterha/app1/manager.log            (写了脚本的话就执行这个/usr/local/bin/manager_log)

 

 

  • 0
    点赞
  • 0
    收藏
    觉得还不错? 一键收藏
  • 0
    评论
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值