一、实验环境
四台虚拟机 server_lue5:172.25.6.7 MHA的Manager server_lue3 :172.25.6.5 主机master server_lue2 :172.25.6.4 从机slave server_lue4 :172.25.6.6 从机slave 首先,后三台主机实验环境清空,重新初始化, /etc/init.d/mysqld stop cd /data/mysql rm -rf * mysqld --initialize --user=mysql 没有编译mysql的主机通过lue3主机远程传输/usr/local/mysql然后参考之前博客mysql数据库安装
二、主从复制的配置
-
主机lue3的配置
1.修改/etc/my.cnf文件 [root@lue3 mysql]# cat /etc/my.cnf [mysqld] basedir=/usr/local/mysql datadir=/data/mysql socket=/data/mysql/mysql.sock server-id=1 gtid_mode=ON enforce-gtid-consistency=ON default_authentication_plugin=mysql_native_password # # include all files from the config directory # !includedir /etc/my.cnf.d 2.初始化, [root@lue3 mysql]#/etc/init.d/mysqld stop [root@lue3 mysql]#pwd /data/mysql [root@lue3 mysql]# rm -rf * [root@lue3 mysql]# ls [root@lue3 mysql]# mysqld --initialize --user=mysql #初始化 2020-08-20T01:52:16.382767Z 0 [System] [MY-013169] [Server] /usr/local/mysql/bin/mysqld (mysqld 8.0.21) initializing of server in progress as process 28860 2020-08-20T01:52:16.392786Z 1 [System] [MY-013576] [InnoDB] InnoDB initialization has started. 2020-08-20T01:52:17.099404Z 1 [System] [MY-013577] [InnoDB] InnoDB initialization has ended. 2020-08-20T01:52:18.708445Z 6 [Note] [MY-010454] [Server] A temporary password is generated for root@localhost: 8eIIi&?KLFvy [root@lue3 mysql]# /etc/init.d/mysqld start #开启mysql Starting MySQL.Logging to '/data/mysql/lue3.westos.com.err'. SUCCESS! 3.登陆mysql数据库创建用户 [root@lue3 ~]# mysql -p Enter password: Welcome to the MySQL monitor. Commands end with ; or \g. Your MySQL connection id is 12 Server version: 8.0.21 Source distribution Copyright (c) 2000, 2020, Oracle and/or its affiliates. All rights reserved. Oracle is a registered trademark of Oracle Corporation and/or its affiliates. Other names may be trademarks of their respective owners. Type 'help;' or '\h' for help. Type '\c' to clear the current input statement. mysql> set sql_log_bin=0; Query OK, 0 rows affected (0.00 sec) mysql> alter user root@localhost identified by 'westos'; #修改密码 Query OK, 0 rows affected (0.01 sec) mysql> set sql_log_bin=1; Query OK, 0 rows affected (0.00 sec) mysql> create user repl@'%' identified by 'westos'; #创建用户 Query OK, 0 rows affected (0.01 sec) mysql> grant replication slave on *.* to repl@'%'; #给权限 Query OK, 0 rows affected (0.01 sec) mysql>
-
lue2和lue4从机的配置
1./etc/my.cnf文件的配置 [root@lue2 ~]# cat /etc/my.cnf [mysqld] basedir=/usr/local/mysql datadir=/data/mysql socket=/data/mysql/mysql.sock server-id=2 #lue4的server_id改为3 gtid_mode=ON enforce-gtid-consistency=ON default_authentication_plugin=mysql_native_password # # include all files from the config directory # !includedir /etc/my.cnf.d 2.初始化 [root@lue2 mysql]# /etc/init.d/mysqld stop Shutting down MySQL.... SUCCESS! [root@lue2 mysql]# vim /etc/my.cnf [root@lue2 mysql]# pwd /data/mysql [root@lue2 mysql]# rm -rf * [root@lue2 mysql]# ls [root@lue2 mysql]# ls [root@lue2 mysql]# mysqld --initialize --user=mysql 2020-08-20T01:57:03.950961Z 0 [System] [MY-013169] [Server] /usr/local/mysql/bin/mysqld (mysqld 8.0.21) initializing of server in progress as process 5155 2020-08-20T01:57:03.961356Z 1 [System] [MY-013576] [InnoDB] InnoDB initialization has started. 2020-08-20T01:57:05.224378Z 1 [System] [MY-013577] [InnoDB] InnoDB initialization has ended. 2020-08-20T01:57:06.818661Z 6 [Note] [MY-010454] [Server] A temporary password is generated for root@localhost: !SwlaLrKN7)y [root@lue2 mysql]# /etc/init.d/mysqld start Starting MySQL.Logging to '/data/mysql/lue2.westos.com.err'. . SUCCESS! 3.从库的配置 [root@lue2 mysql]# mysql -p Enter password: Welcome to the MySQL monitor. Commands end with ; or \g. Your MySQL connection id is 9 Server version: 8.0.21 Copyright (c) 2000, 2020, Oracle and/or its affiliates. All rights reserved. Oracle is a registered trademark of Oracle Corporation and/or its affiliates. Other names may be trademarks of their respective owners. Type 'help;' or '\h' for help. Type '\c' to clear the current input statement. mysql> set sql_log_bin=0; Query OK, 0 rows affected (0.00 sec) mysql> alter user root@localhost identified by 'westos'; Query OK, 0 rows affected (0.01 sec) mysql> set sql_log_bin=1; Query OK, 0 rows affected (0.00 sec) mysql> mysql> change master to_host='172.25.6.5',master_user='repl',master_password='westos',master_auto_position=1; Query OK, 0 rows affected, 2 warnings (0.02 sec) mysql> start slave; Query OK, 0 rows affected (0.01 sec) mysql> show slave status\G; *************************** 1. row *************************** Slave_IO_State: Waiting for master to send event Master_Host: 172.25.6.5 Master_User: repl Master_Port: 3306 Connect_Retry: 60 Master_Log_File: binlog.000001 Read_Master_Log_Pos: 659 Relay_Log_File: lue2-relay-bin.000002 Relay_Log_Pos: 868 Relay_Master_Log_File: binlog.000001 Slave_IO_Running: Yes #IO开启 Slave_SQL_Running: Yes #SQL开启 Replicate_Do_DB: Replicate_Ignore_DB: Replicate_Do_Table: Replicate_Ignore_Table:
![]()
-
测试
在主库新建一个表,在从库也有表的出现
lue2从库有westos表
lue4从库有westos表
三、MHA高可用的配置
MHA的故障切换过程
1.配置文件检查阶段,这个阶段会检查整个集群配置文件的配置
2. 宕机的master处理,包括虚拟ip摘除操作,
3. 复制dead master和最新slave相差的relay log,并保存到MHA Manager具体目录下
4. 识别含有最新更新的 Slave;
5. 应用从 Master 保存的二进制日志事件;
6. 提升一个 Slave 为新的 Master;使其他的 Slave 连接新的 Master 进行复制;
-
准备基于ssh互信通信环境
MHA 集群中的各节点彼此之间均需要基于ssh互信通信,以实现远程控制及数据管理功能。
简单起见,可在Manager节点生成密钥对,并设置其可远程连接本地主机后,将私钥文件及authorized_keys 文件复制给余下的所有节点即可。
[root@lue5 mha]# ssh-keygen Generating public/private rsa key pair. Enter file in which to save the key (/root/.ssh/id_rsa): Enter passphrase (empty for no passphrase): Enter same passphrase again: Your identification has been saved in /root/.ssh/id_rsa. Your public key has been saved in /root/.ssh/id_rsa.pub. The key fingerprint is: SHA256:fUk35sS+2CkC2F0Mbl9kwXY3IqPERsc+c9UzPuHYhd0 root@lue5.westos.com The key's randomart image is: +---[RSA 2048]----+ | o.o. .++o| | =.* =+*E| | o = *.#.B| | o = B @ * | | . S o B o .| | . . o o | | . o + | | . . | | | +----[SHA256]-----+ [root@lue5 mha]# cd [root@lue5 ~]# ssh-copy-id 172.25.6.7 /usr/bin/ssh-copy-id: INFO: Source of key(s) to be installed: "/root/.ssh/id_rsa.pub" The authenticity of host '172.25.6.7 (172.25.6.7)' can't be established. ECDSA key fingerprint is SHA256:IrJMgKJDk0vNmrgAdRXpRFC61wQnqeRmL75ATccJH0g. ECDSA key fingerprint is MD5:54:42:f8:ca:39:c7:70:e8:2c:35:90:64:bf:20:da:29. Are you sure you want to continue connecting (yes/no)? yes /usr/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed /usr/bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys root@172.25.6.7's password: Number of key(s) added: 1 Now try logging into the machine, with: "ssh '172.25.6.7'" and check to make sure that only the key(s) you wanted were added. [root@lue5 ~]# ls .ssh/ authorized_keys id_rsa id_rsa.pub known_hosts [root@lue5 ~]# scp -r .ssh/ lue3: root@lue3's password: known_hosts 100% 703 662.0KB/s 00:00 id_rsa 100% 1679 1.8MB/s 00:00 id_rsa.pub 100% 402 432.0KB/s 00:00 authorized_keys 100% 402 769.8KB/s 00:00 [root@lue5 ~]# scp -r .ssh/ lue4: root@lue4's password: known_hosts 100% 703 651.9KB/s 00:00 id_rsa 100% 1679 2.0MB/s 00:00 id_rsa.pub 100% 402 653.0KB/s 00:00 authorized_keys 100% 402 304.3KB/s 00:00 [root@lue5 ~]# scp -r .ssh/ lue2: root@lue2's password: known_hosts 100% 703 523.4KB/s 00:00 id_rsa 100% 1679 1.5MB/s 00:00 id_rsa.pub 100% 402 318.2KB/s 00:00 authorized_keys 100% 402 389.2KB/s 00:00 [root@lue5 ~]# ssh root@172.25.6.5 Last login: Wed Aug 19 23:14:04 2020 from foundation6.ilt.example.com [root@lue3 ~]# logout Connection to 172.25.6.5 closed.
-
安装MHA
1.在Manager即lue5主机上 MHA官方下载rpm格式的程序包 mha4mysql-manager-0.58-0.el7.centos.noarch.rpm mha4mysql-manager-0.58.tar.gz mha4mysql-node-0.58-0.el7.centos.noarch.rpm perl-Config-Tiny-2.14-7.el7.noarch.rpm perl-Email-Date-Format-1.002-15.el7.noarch.rpm perl-Log-Dispatch-2.41-1.el7.1.noarch.rpm perl-Mail-Sender-0.8.23-1.el7.noarch.rpm perl-Mail-Sendmail-0.79-21.el7.noarch.rpm perl-MIME-Lite-3.030-1.el7.noarch.rpm perl-MIME-Types-1.38-2.el7.noarch.rpm perl-Net-Telnet-3.03-19.el7.noarch.rpm perl-Parallel-ForkManager-1.18-2.el7.noarch.rpm yum install -y *.rpm tar zxf mha4mysql-manager-0.58.tar.gz 2.在master(lue3)、salve(lue2和lue4) [root@lue5 MHA-7]# scp mha4mysql-node-0.58-0.el7.centos.noarch.rpm lue3:/root root@lue3's password: mha4mysql-node-0.58-0.el7.centos.noarch.rpm 100% 35KB 26.1MB/s 00:00 [root@lue5 MHA-7]# scp mha4mysql-node-0.58-0.el7.centos.noarch.rpm lue2:/root root@lue2's password: mha4mysql-node-0.58-0.el7.centos.noarch.rpm 100% 35KB 24.8MB/s 00:00 [root@lue5 MHA-7]# scp mha4mysql-node-0.58-0.el7.centos.noarch.rpm lue4:/root root@lue4's password: mha4mysql-node-0.58-0.el7.centos.noarch.rpm 100% 35KB 17.9MB/s 00:00 yum install -y mha4mysql-node-0.58-0.el7.centos.noarch.rpm 在master和slave主机中做
![]()
-
初始化 MHA
Manager 节点需要为每个监控 master/slave 集群提供一个专用的配置文件,而所有的master/slave 集群也可以共享全局
[root@lue5 MHA-7]# cd mha4mysql-manager-0.58 [root@lue5 mha4mysql-manager-0.58]# ls AUTHORS COPYING lib MANIFEST README samples tests bin debian Makefile.PL MANIFEST.SKIP rpm t [root@lue5 mha4mysql-manager-0.58]# cd samples [root@lue5 samples]# ls conf scripts [root@lue5 samples]# cd conf [root@lue5 conf]# ls app1.cnf masterha_default.cnf [root@lue5 conf]# mkdir /etc/mha [root@lue5 conf]# cat masterha_default.cnf app1.cnf > /etc/mha/app.cnf [root@lue5 conf]# cd /etc/mha [root@lue5 mha]# ls app.cnf [root@lue5 mha]# mkdir /etc/mha/app1 [root@lue5 mha]# cat app.cnf [server default] user=root #root用于管理数据库的账号 password=westos ssh_user=root master_binlog_dir= /data/mysql remote_workdir=/tmp secondary_check_script= masterha_secondary_check -s 172.25.6.4 -s 172.25.6.6 ping_interval=3 #检测主服务器健康状态的时间间隔,单位为秒 #master_ip_failover_script=/etc/mha/app1/master_ip_failover #shutdown_script= /script/masterha/power_manager #report_script= /script/masterha/send_report #master_ip_online_change_script= /etc/mha/app1/master_ip_online_change manager_workdir=/etc/mha/app1 manager_log=/etc/mha/app1/manager.log [server1] hostname=172.25.6.5 candidate_master=1 [server2] hostname=172.25.6.4 candidate_master=1 #1表示主服务宕机的时候,该节点可以提升为主节点 check_repl_delay=0 [server3] hostname=172.25.6.6 no_master=1 #如果不提升为主节点,则为no_master=1
由于还需要一个mysql账号,用于给予mha管理mysql,所以,可以直接在主服务器上创建账号,并授予其权限,由于已建立了主从复制关系,所以,无需在其它从节点上重复创建账号:
[root@lue3 ~]# mysql -p Enter password: Welcome to the MySQL monitor. Commands end with ; or \g. Your MySQL connection id is 12 Server version: 8.0.21 Source distribution Copyright (c) 2000, 2020, Oracle and/or its affiliates. All rights reserved. Oracle is a registered trademark of Oracle Corporation and/or its affiliates. Other names may be trademarks of their respective owners. Type 'help;' or '\h' for help. Type '\c' to clear the current input statement. mysql> create user root@'%' identified by 'westos'; Query OK, 0 rows affected (0.01 sec) mysql> grant all on *.* to root@'%'; Query OK, 0 rows affected (0.01 sec)
测试: 在lue5中yum install -y mariadb mysql -h 172.25.6.5 -uroot -pwestos
检测各节点间ssh互信通信配置时候OK:[root@lue5 mha]# yum install -y mariadb [root@lue5 mha]# mysql -h 172.25.6.5 -uroot -pwestos Welcome to the MariaDB monitor. Commands end with ; or \g. Your MySQL connection id is 13 Server version: 8.0.21 Source distribution Copyright (c) 2000, 2018, Oracle, MariaDB Corporation Ab and others. Type 'help;' or '\h' for help. Type '\c' to clear the current input statement. MySQL [(none)]> exit
[root@lue5 ~]# masterha_check_ssh --conf=/etc/mha/app.cnf
检测成功,最后会提示 All SSH connection tests passed successfully.
检查管理Mysql复制集群的连接配置参数是否OK:
[root@lue5 ~]#masterha_check_repl --conf=/etc/mha/app.cnf
检测成功,最后会提示 Health is OK
[root@lue5 ~]# nohup masterha_manager --conf=/etc/mha/app.cnf & #启动MHA
[1] 16713
[root@lue5 ~]# nohup: ignoring input and appending output to ‘nohup.out’
[root@lue5 ~]# masterha_check_status --conf=/etc/mha/app.cnf #启动成功后,可通过如下命令查看master节点的状态
app (pid:16713) is running(0:PING_OK), master:172.25.6.5[root@lue5 ~]# masterha_stop –conf=/etc/master/app1.cnf 停止MHA
四、测试故障转移
-
手动切换
1.在master节点关闭mysql [root@lue3 ~]# /etc/init.d/mysqld stop [root@lue5 app1]# /etc/mha/app1/master_ip_failover --command=status --orig_master_host=lue3 --orig_master_ip=172.25.6.5 --orig_master_port=3306 --new_master_host=lue2 --new_master_ip=172.25.6.4 --new_master_port=3306 2.查看日志/etc/mha/app1/manager.log (less /etc/mha/app1/manager.log)
由日志文件可看出,172.25.6.4(lue2)已经成为新的master,而lue4复制的指向也改为lue2了
在节点2(lue4)上查看slave状态,已经表明slave停止了,显然成为了master
在节点3上查看slave状态,master_server_id已经改为2了,Master_host已经变成了lue2的主机地址
再手动开启原master(lue3),作为slave加入集群
[root@lue3 ~]# /etc/init.d/mysqld start Starting MySQL. SUCCESS! mysql> change master to master_host='172.25.6.4',master_user='repl',master_password='westos'; Query OK, 0 rows affected, 2 warnings (0.04 sec) mysql> start slave Query OK, 0 rows affected, 2 warnings (0.04 sec) mysql> show slave status\G *************************** 1. row *************************** Slave_IO_State: Master_Host: 172.25.6.4 Master_User: repl Master_Port: 3306 Connect_Retry: 60 Master_Log_File: Read_Master_Log_Pos: 4 Relay_Log_File: lue3-relay-bin.000001 Relay_Log_Pos: 4 Relay_Master_Log_File:
![]()
这时会在/etc/mha目录下生成一个app1.failover.complete文件,是来记录failover情况的,再进行failover时必须先把这个文件删除,不然不会failover
-
配置脚本和vip漂移
首先清理/etc/mhaapp1下的app1.failover.complete
开启mha manager
[root@lue5 app1]# rm -rf app.failover.complete [root@lue5 mha]# nohup masterha_manager --conf=/etc/mha/app.cnf & [1] 19133 [root@lue5 mha]# nohup: ignoring input and appending output to ‘nohup.out’
编辑master_ip_failover 和 master_ip_online_change 两个脚本,配置vip
[root@lue5 mha]# cd /root/MHA-7/mha4mysql-manager-0.58/samples/scripts/ [root@lue5 scripts]# mv master_ip_online_change master_ip_failover /etc/mha/app1 [root@lue5 scripts]# cd /etc/mha/app1 [root@lue5 app1]# vim master_ip_failover #!/usr/bin/env perl use strict; use warnings FATAL => 'all'; use Getopt::Long; my ( $command, $ssh_user, $orig_master_host, $orig_master_ip, $orig_master_port, $new_master_host, $new_master_ip, $new_master_port ); my $vip = '172.25.6.100/24'; my $ssh_start_vip = "/sbin/ip addr add $vip dev eth0"; #添加 my $ssh_stop_vip = "/sbin/ip addr del $vip dev eth0"; #添加 [root@lue5 app1]# vim master_ip_online_change #!/usr/bin/env perl use strict; use warnings FATAL =>'all'; use Getopt::Long; my $vip = '172.25.6.100/24'; # Virtual IP my $ssh_start_vip = "/sbin/ip addr add $vip dev eth0"; my $ssh_stop_vip = "/sbin/ip addr del $vip dev eth0"; my $exit_code = 0; my ( $command, $orig_master_is_new_slave, $orig_master_host, $orig_master_ip, $orig_master_port, $orig_master_user, $orig_master_password, $orig_master_ssh_user, $new_master_host, $new_master_ip, $new_master_port, $new_master_user, $new_master_password, $new_master_ssh_user, );
目前lue2是master,所以先给lue2添加vip
[root@lue2 ~]# ip addr add 172.25.6.100 dev eth0
然后在lue2上关闭mysql,模拟故障
查看日志/etc/mha/app1/manager.log
vip切换成功