MHA High-Availability Deployment

1  Overview

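        MHA (Master High Availability) provides automated master failover and slave promotion for MySQL replication environments. A typical deployment uses one master and two slaves, preferably with semi-synchronous replication to reduce the risk of data loss. It removes the master as a single point of failure: when the master crashes, MHA can complete the failover automatically, typically within about 10 to 30 seconds, which keeps the service reliably available.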

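2  How It Works

        ① MHA is commonly paired with semi-synchronous replication: the master returns the commit to the client as soon as at least one slave has acknowledged receiving the transaction (that is, has written it to its relay log);

        ② If the master crashes, MHA identifies the slave with the most recent events, applies the differences to the other slaves, promotes one slave to be the new master, and points the remaining slaves at the new master so that replication continues.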
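
As a rough illustration of the first part of step ② (finding the most up-to-date slave), the sketch below picks the slave that has read furthest into the master's binlog. The input format (one "host binlog_file read_position" line per slave) and the function name pick_latest_slave are assumptions made for this example; MHA performs this comparison internally and does not expose such a command.

```shell
#!/bin/sh
# Hypothetical helper: read "host binlog_file read_pos" lines on stdin and
# print the host that is furthest ahead (highest file name, then position).
pick_latest_slave() {
    sort -k2,2r -k3,3nr | head -n 1 | awk '{print $1}'
}

# Sample data, made up for illustration (not taken from a real failover).
pick_latest_slave <<'EOF'
192.168.10.16 mysql-bin.000001 1837
192.168.10.17 mysql-bin.000001 1500
EOF
```

With the sample data, 192.168.10.16 is selected because it has read up to the higher position in the same binlog file.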

3  MHA Environment

Role          Hostname      IP address
master        mha15         192.168.10.15
slave1        mha16         192.168.10.16
slave2        mha17         192.168.10.17
MHA manager   mhamaster18   192.168.10.18

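4  Deployment Steps

4.1 Deploy MySQL 8.0 on Master, Slave1, and Slave2

Configure the three servers as one master with two slaves (steps omitted; see any MySQL replication setup guide).

Create the following users on all database instances:

# Replication user for the slaves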
create user 'myslave'@'%' identified by '123456';
grant replication slave,replication client on *.* to 'myslave'@'%';

# User for the MHA manager
create user 'mha'@'%' identified by '123456';
grant all on *.* to 'mha'@'%';

4.2 Check the binlog position on the master

mysql> show master status;
+------------------+----------+--------------+------------------+------------------------------------------+
| File             | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set                        |
+------------------+----------+--------------+------------------+------------------------------------------+
| mysql-bin.000001 |     1837 |              |                  | 7cb7532a-052d-11ef-9f84-000c2962bd90:1-7 |
+------------------+----------+--------------+------------------+------------------------------------------+
1 row in set (0.01 sec)

4.3 Run the replication commands on the slave nodes

change master to master_host='192.168.10.15',master_user='myslave',master_port=3306,master_password='123456',master_log_file='mysql-bin.000001',master_log_pos=1837;

start slave;
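
The file name and position used in the statement above come from the SHOW MASTER STATUS output in 4.2. A minimal sketch of extracting them from the table output and assembling the statement follows. The function names parse_master_status and build_change_master are hypothetical, and the password is read from an assumed environment variable REPL_PASSWORD instead of being hard-coded.

```shell
#!/bin/sh
# Hypothetical helper: print "file position" from the tabular output of
# SHOW MASTER STATUS supplied on stdin (the data row is the only row whose
# second column is purely numeric).
parse_master_status() {
    awk -F'|' '$3 ~ /^ *[0-9]+ *$/ {
        gsub(/ /, "", $2); gsub(/ /, "", $3); print $2, $3; exit
    }'
}

# Hypothetical helper: $1 = master host, $2 = binlog file, $3 = position.
build_change_master() {
    printf "change master to master_host='%s',master_user='myslave',master_port=3306,master_password='%s',master_log_file='%s',master_log_pos=%s;\n" \
        "$1" "${REPL_PASSWORD:-CHANGE_ME}" "$2" "$3"
}

# Usage with a shortened copy of the output from 4.2.
set -- $(printf '%s\n' \
    '| File             | Position |' \
    '| mysql-bin.000001 |     1837 |' | parse_master_status)
build_change_master 192.168.10.15 "$1" "$2"
```

The generated statement can then be pasted into the mysql client on each slave; nothing in this sketch connects to a database.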

4.4 Check the replication status on the slave nodes

mysql> start slave;
Query OK, 0 rows affected, 1 warning (0.05 sec)

mysql> show slave status \G;
*************************** 1. row ***************************
               Slave_IO_State: Waiting for source to send event
                  Master_Host: 192.168.10.15
                  Master_User: myslave
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: mysql-bin.000001
          Read_Master_Log_Pos: 1837
               Relay_Log_File: relay-log.000002
                Relay_Log_Pos: 326
        Relay_Master_Log_File: mysql-bin.000001
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
              Replicate_Do_DB: 
          Replicate_Ignore_DB: 
           Replicate_Do_Table: 
       Replicate_Ignore_Table: 
      Replicate_Wild_Do_Table: 
  Replicate_Wild_Ignore_Table: 
                   Last_Errno: 0
                   Last_Error: 
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 1837
              Relay_Log_Space: 530
              Until_Condition: None
               Until_Log_File: 
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File: 
           Master_SSL_CA_Path: 
              Master_SSL_Cert: 
            Master_SSL_Cipher: 
               Master_SSL_Key: 
        Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error: 
               Last_SQL_Errno: 0
               Last_SQL_Error: 
  Replicate_Ignore_Server_Ids: 
             Master_Server_Id: 203148915
                  Master_UUID: 7cb7532a-052d-11ef-9f84-000c2962bd90
             Master_Info_File: /app/data/data/master.info
                    SQL_Delay: 0
          SQL_Remaining_Delay: NULL
      Slave_SQL_Running_State: Replica has read all relay log; waiting for more updates
           Master_Retry_Count: 86400
                  Master_Bind: 
      Last_IO_Error_Timestamp: 
     Last_SQL_Error_Timestamp: 
               Master_SSL_Crl: 
           Master_SSL_Crlpath: 
           Retrieved_Gtid_Set: 
            Executed_Gtid_Set: 6f4c3822-052d-11ef-b559-000c29926dfd:1-7
                Auto_Position: 0
         Replicate_Rewrite_DB: 
                 Channel_Name: 
           Master_TLS_Version: 
       Master_public_key_path: 
        Get_master_public_key: 0
            Network_Namespace: 
1 row in set, 1 warning (0.00 sec)

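ERROR: 
No query specified

(This error is harmless: \G already terminates the statement, so the trailing ";" is parsed as an empty query.)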
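
In the status output, the two fields that matter most are Slave_IO_Running and Slave_SQL_Running; both must be Yes. A small sketch of checking them from a script follows, assuming the \G-formatted output is piped in on stdin. The function name replication_ok is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if both replication threads report "Yes"
# in the "show slave status\G" output supplied on stdin.
replication_ok() {
    awk -F': *' '
        $1 ~ /^ *Slave_IO_Running$/  { io = $2;  sub(/[ \t]+$/, "", io) }
        $1 ~ /^ *Slave_SQL_Running$/ { sql = $2; sub(/[ \t]+$/, "", sql) }
        END { exit !(io == "Yes" && sql == "Yes") }
    '
}

# Usage with two lines copied from the output above. On a real slave the
# input would come from something like: mysql -e 'show slave status\G'
printf '%s\n' \
    '             Slave_IO_Running: Yes' \
    '            Slave_SQL_Running: Yes' | replication_ok && echo "replication threads are running"
```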

4.5 Set the slave nodes to read-only mode

set global read_only=1;

4.6 Test data replication

# Run on the master
create database test;
use test;
create table test123(id int);
insert into test123 values(1);

# Verify the result on the slaves
select * from test.test123;

4.7 Install the MHA software

① On all four servers, install the packages that MHA depends on, starting with the EPEL repository

yum install epel-release --nogpgcheck -y

yum install -y perl-DBD-MySQL \
perl-Config-Tiny \
perl-Log-Dispatch \
perl-Parallel-ForkManager \
perl-ExtUtils-CBuilder \
perl-ExtUtils-MakeMaker \
perl-CPAN

② Install the MHA node component on all four servers

Download the release tarballs from:
https://github.com/yoshinorim/mha4mysql-node
https://github.com/yoshinorim/mha4mysql-manager

mv mha4mysql-* /app
cd /app
tar -zxf mha4mysql-node-0.58.tar.gz
cd mha4mysql-node-0.58
perl Makefile.PL
make && make install

③ Install the MHA manager component (only the manager node strictly needs it; here it was installed on all four servers)

cd /app
tar -zxf mha4mysql-manager-0.58.tar.gz
cd mha4mysql-manager-0.58
perl Makefile.PL
make && make install

Files present after installation:

[root@mha15 soft]# cd /usr/local/bin/
[root@mha15 bin]# ll
total 88
-r-xr-xr-x. 1 root root 17639 Apr 28 04:10 apply_diff_relay_logs
-r-xr-xr-x. 1 root root  4807 Apr 28 04:10 filter_mysqlbinlog
-r-xr-xr-x. 1 root root  1995 Apr 28 04:14 masterha_check_repl
-r-xr-xr-x. 1 root root  1779 Apr 28 04:14 masterha_check_ssh
-r-xr-xr-x. 1 root root  1865 Apr 28 04:14 masterha_check_status
-r-xr-xr-x. 1 root root  3201 Apr 28 04:14 masterha_conf_host
-r-xr-xr-x. 1 root root  2517 Apr 28 04:14 masterha_manager
-r-xr-xr-x. 1 root root  2165 Apr 28 04:14 masterha_master_monitor
-r-xr-xr-x. 1 root root  2373 Apr 28 04:14 masterha_master_switch
-r-xr-xr-x. 1 root root  5172 Apr 28 04:14 masterha_secondary_check
-r-xr-xr-x. 1 root root  1739 Apr 28 04:14 masterha_stop
-r-xr-xr-x. 1 root root  8337 Apr 28 04:10 purge_relay_logs
-r-xr-xr-x. 1 root root  7525 Apr 28 04:10 save_binary_logs

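4.8 Configure passwordless SSH authentication on all servers

① On the manager node, set up passwordless SSH to all database nodes
ssh-keygen -t rsa     # press Enter at every prompt
ssh-copy-id 192.168.10.15
ssh-copy-id 192.168.10.16
ssh-copy-id 192.168.10.17
② On the master node, set up passwordless SSH to both slave nodes
ssh-keygen -t rsa     # press Enter at every prompt
ssh-copy-id 192.168.10.16
ssh-copy-id 192.168.10.17
③ On the slave1 node, set up passwordless SSH to the master and slave2
ssh-keygen -t rsa     # press Enter at every prompt
ssh-copy-id 192.168.10.15
ssh-copy-id 192.168.10.17
④ On the slave2 node, set up passwordless SSH to the master and slave1
ssh-keygen -t rsa     # press Enter at every prompt
ssh-copy-id 192.168.10.15
ssh-copy-id 192.168.10.16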
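
The four lists above follow one rule: every node copies its key to each database node other than itself. The sketch below only prints the ssh-copy-id commands a given node would need (a dry run that contacts no host). The function name ssh_copy_commands and the idea of driving it from a host list are assumptions of this example.

```shell
#!/bin/sh
# Database node IPs from the environment table in section 3.
DB_NODES="192.168.10.15 192.168.10.16 192.168.10.17"

# Hypothetical helper: print the ssh-copy-id commands to run on node $1.
# The manager is not in DB_NODES, so it gets all three targets; a database
# node gets the two other database nodes.
ssh_copy_commands() {
    for target in $DB_NODES; do
        [ "$target" = "$1" ] || echo "ssh-copy-id $target"
    done
}

ssh_copy_commands 192.168.10.18   # manager: three commands
ssh_copy_commands 192.168.10.16   # slave1: two commands
```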

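4.9 Configure MHA on the manager node

① On the manager node, copy the sample scripts to /usr/local/bin
cp -rp /app/mha4mysql-manager-0.58/samples/scripts/  /usr/local/bin
② On the manager node, copy the scripts that manage the VIP during failover to /usr/local/bin
cp /usr/local/bin/scripts/master_ip_failover /usr/local/bin/
cp /usr/local/bin/scripts/master_ip_online_change /usr/local/bin/
③ Edit the failover script as follows (delete the existing content, paste the script below, and adjust the VIP parameters)
vim /usr/local/bin/master_ip_failover

Contents of the modified VIP failover script: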

#!/usr/bin/env perl

use strict;
use warnings FATAL => 'all';

use Getopt::Long;
use MHA::DBHelper;

my (
  $command,        $ssh_user,         $orig_master_host,
  $orig_master_ip, $orig_master_port, $new_master_host,
  $new_master_ip,  $new_master_port,  $new_master_user,
  $new_master_password
);

############## Adjust this section to match your environment
my $vip = '192.168.10.100/24';
my $ifdev = 'ens33';
my $key = '1';
my $ssh_start_vip = "/sbin/ifconfig $ifdev:$key $vip";
my $ssh_stop_vip = "/sbin/ifconfig $ifdev:$key  down";

##############

GetOptions(
  'command=s'             => \$command,
  'ssh_user=s'            => \$ssh_user,
  'orig_master_host=s'    => \$orig_master_host,
  'orig_master_ip=s'      => \$orig_master_ip,
  'orig_master_port=i'    => \$orig_master_port,
  'new_master_host=s'     => \$new_master_host,
  'new_master_ip=s'       => \$new_master_ip,
  'new_master_port=i'     => \$new_master_port,
  'new_master_user=s'     => \$new_master_user,
  'new_master_password=s' => \$new_master_password,
);

exit &main();

sub main {
  print "\n\nIN SCRIPT TEST====$ssh_stop_vip==$ssh_start_vip===\n\n";
  if ( $command eq "stop" || $command eq "stopssh" ) {

    my $exit_code = 1;
    eval {

      print "Disabling the VIP on old master: $orig_master_host \n";
      &stop_vip();
      # updating global catalog, etc
      $exit_code = 0;
    };
    if ($@) {
      warn "Got Error: $@\n";
      exit $exit_code;
    }
    exit $exit_code;
  }
  elsif ( $command eq "start" ) {

    my $exit_code = 10;
    eval {

      print "Enabling the VIP - $vip on the new master - $new_master_host \n";
      &start_vip();
      $exit_code = 0;
    };
    if ($@) {
      warn $@;

      # If you want to continue failover, exit 10.
      exit $exit_code;
    }
    exit $exit_code;
  }
  elsif ( $command eq "status" ) {

    print "Checking the Status of the script.. OK \n";
    # do nothing
    exit 0;
  }
  else {
    &usage();
    exit 1;
  }
}
sub start_vip() {

    `ssh $ssh_user\@$new_master_host \" $ssh_start_vip \"`;

}

sub stop_vip() {

     return 0  unless  ($ssh_user);

    `ssh $ssh_user\@$orig_master_host \" $ssh_stop_vip \"`;

}


sub usage {
  print
"Usage: master_ip_failover --command=start|stop|stopssh|status --orig_master_host=host --orig_master_ip=ip --orig_master_port=port --new_master_host=host --new_master_ip=ip --new_master_port=port\n";
}
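
On systems where net-tools is not installed, the same VIP operations can be expressed with the iproute2 ip command instead of ifconfig. The sketch below only builds and prints the command strings and does not change any interface. The function names are hypothetical, the values mirror $vip, $ifdev, and $key from the script above, and this variant has not been tested with MHA.

```shell
#!/bin/sh
# Same values as $vip, $ifdev and $key in master_ip_failover.
VIP="192.168.10.100/24"
IFDEV="ens33"
KEY="1"

# Hypothetical helpers that print iproute2 counterparts of the two ifconfig
# commands ($ssh_start_vip and $ssh_stop_vip).
vip_start_cmd() { echo "ip addr add $VIP dev $IFDEV label $IFDEV:$KEY"; }
vip_stop_cmd()  { echo "ip addr del $VIP dev $IFDEV"; }

vip_start_cmd
vip_stop_cmd
```

Keeping the label (ens33:1) means the address still shows up under the alias name in ifconfig output, which matches what the rest of this article expects to see.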
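④ Create the MHA configuration directory and copy the sample configuration file; app1.cnf is used here to manage the MySQL node servers
mkdir -p /etc/masterha
mkdir -p /var/log/masterha/app1   # log/work directory referenced in app1.cnf; create it if it does not exist yet
cp /app/mha4mysql-manager-0.58/samples/conf/app1.cnf /etc/masterha
# Delete the existing content, paste the configuration below, and adjust the node IP addresses
vim /etc/masterha/app1.cnf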

[server default]
manager_log=/var/log/masterha/app1/manager.log
manager_workdir=/var/log/masterha/app1
master_binlog_dir=/app/data/data
master_ip_failover_script=/usr/local/bin/master_ip_failover
master_ip_online_change_script=/usr/local/bin/master_ip_online_change
ssh_user=root
user=mha
password=123456
ping_interval=1
remote_workdir=/tmp
repl_password=123456
repl_user=myslave
secondary_check_script=/usr/local/bin/masterha_secondary_check -s 192.168.10.16 -s 192.168.10.17
# The slaves are used for a secondary check of the master
shutdown_script=""
[server1]
hostname=192.168.10.15
# Master server
port=3306
[server2]
candidate_master=1
check_repl_delay=0
hostname=192.168.10.16
# Candidate master
port=3306
[server3]
hostname=192.168.10.17
# Slave 2
port=3306
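
Before running the MHA checks, it can help to confirm which hosts the file actually defines. The sketch below lists each [serverN] section with its hostname and whether it is marked as a candidate master. The function name list_mha_servers is hypothetical, and the parser handles only the simple key=value layout shown above.

```shell
#!/bin/sh
# Hypothetical helper: print "section hostname candidate|-" for every
# [serverN] section of the MHA configuration file given as $1.
list_mha_servers() {
    awk -F'=' '
        function emit() {
            if (sec != "") print sec, host, (cand == 1 ? "candidate" : "-")
        }
        /^\[server[0-9]+\]/ {
            emit()
            sec = substr($0, 2, index($0, "]") - 2); host = ""; cand = 0
            next
        }
        /^\[/ { emit(); sec = ""; next }      # e.g. [server default]
        sec != "" && $1 == "hostname"         { gsub(/[ \t]/, "", $2); host = $2 }
        sec != "" && $1 == "candidate_master" { gsub(/[ \t]/, "", $2); cand = $2 }
        END { emit() }
    ' "$1"
}

# Usage against a temporary copy of part of the configuration above.
cfg=$(mktemp)
cat > "$cfg" <<'EOF'
[server default]
user=mha
[server1]
hostname=192.168.10.15
port=3306
[server2]
candidate_master=1
hostname=192.168.10.16
port=3306
EOF
list_mha_servers "$cfg"
rm -f "$cfg"
```

On the manager node the same function can be pointed at /etc/masterha/app1.cnf.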

4.10 Manually bring up the virtual IP on the master node the first time

Install the ifconfig tool:
yum -y install net-tools

Add the virtual IP manually:
/sbin/ifconfig ens33:1 192.168.10.100/24

4.11 Test passwordless SSH authentication from the manager node

masterha_check_ssh --conf=/etc/masterha/app1.cnf

4.12 Check MySQL master-slave replication from the manager node

[root@mhamaster masterha]# masterha_check_repl --conf=/etc/masterha/app1.cnf
Sun Apr 28 21:48:24 2024 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Sun Apr 28 21:48:24 2024 - [info] Reading application default configuration from /etc/masterha/app1.cnf..
Sun Apr 28 21:48:24 2024 - [info] Reading server configuration from /etc/masterha/app1.cnf..
Sun Apr 28 21:48:24 2024 - [info] MHA::MasterMonitor version 0.58.
Sun Apr 28 21:48:25 2024 - [info] GTID failover mode = 1
Sun Apr 28 21:48:25 2024 - [info] Dead Servers:
Sun Apr 28 21:48:25 2024 - [info] Alive Servers:
Sun Apr 28 21:48:25 2024 - [info]   192.168.10.15(192.168.10.15:3306)
Sun Apr 28 21:48:25 2024 - [info]   192.168.10.16(192.168.10.16:3306)
Sun Apr 28 21:48:25 2024 - [info]   192.168.10.17(192.168.10.17:3306)
Sun Apr 28 21:48:25 2024 - [info] Alive Slaves:
Sun Apr 28 21:48:25 2024 - [info]   192.168.10.16(192.168.10.16:3306)  Version=8.0.34 (oldest major version between slaves) log-bin:enabled
Sun Apr 28 21:48:25 2024 - [info]     GTID ON
Sun Apr 28 21:48:25 2024 - [info]     Replicating from 192.168.10.15(192.168.10.15:3306)
Sun Apr 28 21:48:25 2024 - [info]     Primary candidate for the new Master (candidate_master is set)
Sun Apr 28 21:48:25 2024 - [info]   192.168.10.17(192.168.10.17:3306)  Version=8.0.34 (oldest major version between slaves) log-bin:enabled
Sun Apr 28 21:48:25 2024 - [info]     GTID ON
Sun Apr 28 21:48:25 2024 - [info]     Replicating from 192.168.10.15(192.168.10.15:3306)
Sun Apr 28 21:48:25 2024 - [info] Current Alive Master: 192.168.10.15(192.168.10.15:3306)
Sun Apr 28 21:48:25 2024 - [info] Checking slave configurations..
Sun Apr 28 21:48:25 2024 - [info] Checking replication filtering settings..
Sun Apr 28 21:48:25 2024 - [info]  binlog_do_db= , binlog_ignore_db= 
Sun Apr 28 21:48:25 2024 - [info]  Replication filtering check ok.
Sun Apr 28 21:48:25 2024 - [info] GTID (with auto-pos) is supported. Skipping all SSH and Node package checking.
Sun Apr 28 21:48:25 2024 - [info] Checking SSH publickey authentication settings on the current master..
Sun Apr 28 21:48:26 2024 - [info] HealthCheck: SSH to 192.168.10.15 is reachable.
Sun Apr 28 21:48:26 2024 - [info] 
192.168.10.15(192.168.10.15:3306) (current master)
 +--192.168.10.16(192.168.10.16:3306)
 +--192.168.10.17(192.168.10.17:3306)

Sun Apr 28 21:48:26 2024 - [info] Checking replication health on 192.168.10.16..
Sun Apr 28 21:48:26 2024 - [info]  ok.
Sun Apr 28 21:48:26 2024 - [info] Checking replication health on 192.168.10.17..
Sun Apr 28 21:48:26 2024 - [info]  ok.
Sun Apr 28 21:48:26 2024 - [info] Checking master_ip_failover_script status:
Sun Apr 28 21:48:26 2024 - [info]   /usr/local/bin/master_ip_failover --command=status --ssh_user=root --orig_master_host=192.168.10.15 --orig_master_ip=192.168.10.15 --orig_master_port=3306 


IN SCRIPT TEST====/sbin/ifconfig ens33:1 down==/sbin/ifconfig ens33:1 192.168.10.100===

Checking the Status of the script.. OK 
Sun Apr 28 21:48:26 2024 - [info]  OK.
Sun Apr 28 21:48:26 2024 - [warning] shutdown_script is not defined.
Sun Apr 28 21:48:26 2024 - [info] Got exit code 0 (Not master dead).

MySQL Replication Health is OK.
[root@mhamaster masterha]#
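
For scripted deployments, the line to look for in this output is the final "MySQL Replication Health is OK." message. A small sketch follows, assuming the command output is piped in on stdin; repl_health_ok is a hypothetical name, and matching on the message text is a choice made for this example.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the masterha_check_repl output on stdin
# contains the success line shown in the transcript above.
repl_health_ok() {
    grep -q '^MySQL Replication Health is OK\.$'
}

# Usage with sample lines; on the manager node the input would come from
#   masterha_check_repl --conf=/etc/masterha/app1.cnf 2>&1
printf '%s\n' \
    'Sun Apr 28 21:48:26 2024 - [info] Got exit code 0 (Not master dead).' \
    '' \
    'MySQL Replication Health is OK.' | repl_health_ok && echo "safe to start masterha_manager"
```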

4.13 Start MHA on the manager node

nohup masterha_manager --conf=/etc/masterha/app1.cnf --remove_dead_master_conf --ignore_last_failover < /dev/null > /var/log/masterha/app1/manager.log 2>&1 &


[root@mhamaster masterha]# nohup masterha_manager --conf=/etc/masterha/app1.cnf --remove_dead_master_conf --ignore_last_failover < /dev/null > /var/log/masterha/app1/manager.log 2>&1 &
[1] 13812
[root@mhamaster masterha]#

4.14 Check the MHA status on the manager node

[root@mhamaster masterha]# masterha_check_status -conf=/etc/masterha/app1.cnf
app1 (pid:13812) is running(0:PING_OK), master:192.168.10.15
[root@mhamaster masterha]#
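
The status line carries two useful pieces of information: the monitor state (PING_OK) and the current master. A minimal sketch of extracting both for use in a monitoring script is shown below; the function name parse_mha_status is hypothetical, and the pattern is written against the exact line format shown above.

```shell
#!/bin/sh
# Hypothetical helper: print "STATE MASTER" from a masterha_check_status
# line supplied on stdin; prints nothing if the line does not match.
parse_mha_status() {
    sed -n 's/^.* is running([0-9]*:\([A-Z_]*\)), master:\(.*\)$/\1 \2/p'
}

echo 'app1 (pid:13812) is running(0:PING_OK), master:192.168.10.15' | parse_mha_status
```

For the sample line this prints the state followed by the master address, separated by a space.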

4.15 Check the MHA log on the manager node

cat /var/log/masterha/app1/manager.log | grep "current master"


[root@mhamaster masterha]# cat /var/log/masterha/app1/manager.log | grep "current master"
Sun Apr 28 23:03:08 2024 - [info] Checking SSH publickey authentication settings on the current master..
192.168.10.15(192.168.10.15:3306) (current master)
[root@mhamaster masterha]# 

4.16 Check the VIP address on the master node

[root@mha15 data]# ifconfig
ens33: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 192.168.10.15  netmask 255.255.255.0  broadcast 192.168.10.255
        inet6 fe80::6b4c:646e:e677:d4ac  prefixlen 64  scopeid 0x20<link>
        inet6 fe80::32ad:80bf:c2e0:633f  prefixlen 64  scopeid 0x20<link>
        inet6 fe80::b6cb:27dd:5b53:f57f  prefixlen 64  scopeid 0x20<link>
        ether 00:0c:29:62:bd:90  txqueuelen 1000  (Ethernet)
        RX packets 740602  bytes 1019317019 (972.0 MiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 75618  bytes 8632318 (8.2 MiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

ens33:1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 192.168.10.100  netmask 255.255.255.0  broadcast 192.168.10.255
        ether 00:0c:29:62:bd:90  txqueuelen 1000  (Ethernet)

lo: flags=73<UP,LOOPBACK,RUNNING>  mtu 65536
        inet 127.0.0.1  netmask 255.0.0.0
        inet6 ::1  prefixlen 128  scopeid 0x10<host>
        loop  txqueuelen 1000  (Local Loopback)
        RX packets 737  bytes 73994 (72.2 KiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 737  bytes 73994 (72.2 KiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

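4.17 Stop the MHA service on the manager node

masterha_stop --conf=/etc/masterha/app1.cnf
Alternatively, stop it by killing the manager process by its PID.

A follow-up article will cover MHA failure simulation and recovery; feel free to follow and share!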
