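I recently finished studying MHA. After the initial setup, failover worked without problems, but because one node used a different port and a different path from the other two nodes, a failover eventually failed.
When troubleshooting earlier, I only searched for the keyword error and did not notice that the real error message had already been printed among the info lines. Through repeated verification, I also gained a deeper understanding of the MHA configuration file.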
master_binlog_dir=/usr/local/mysql/data
master_ip_failover_script=/usr/local/bin/master_ip_failover
Because the binlog path was set differently on the individual nodes, the replication status check ultimately failed.
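Conclusion: when setting up MHA, it is best to keep the binlog path identical on all nodes. If the paths differ, master_binlog_dir in the MHA configuration file on the manager node has to be changed in advance to the binlog path of the newly promoted master. If every node uses the same binlog path, then after a slave is promoted to master, the binary logs are still found under the configured path and no change is needed.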
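One way to catch this before a failover is to compare the directory each node really writes its binlogs to with the master_binlog_dir the manager would use for that node. The following is only a minimal sketch under several assumptions: the helper names effective_binlog_dirs and mismatched_nodes are made up for illustration, the log_bin_basename values are assumed to have been collected by hand (for example with SHOW VARIABLES LIKE 'log_bin_basename' on each node), master_binlog_dir is assumed to be settable under [server default] or per server as a comma-separated list, and the paths in ACTUAL_BASENAMES are placeholders, not the real paths of this cluster.

```python
import configparser
import posixpath

# Assumption: the directories MHA falls back to when master_binlog_dir is unset.
ASSUMED_DEFAULT_DIRS = "/var/lib/mysql,/var/log/mysql"

# Shape of /etc/mha/app1.cnf as used in this post (only the relevant keys).
SAMPLE_CNF = """
[server default]
master_binlog_dir=/usr/local/mysql/data

[server1]
hostname=192.168.20.22

[server2]
hostname=192.168.20.20

[server3]
hostname=192.168.20.23
port=3309
"""

# Placeholder values: what log_bin_basename might return on each node.
ACTUAL_BASENAMES = {
    "192.168.20.22": "/usr/local/mysql/data/mysql-bin",
    "192.168.20.20": "/usr/local/mysql/data/mysql-bin",
    "192.168.20.23": "/placeholder/m3309/mysql-bin",
}


def effective_binlog_dirs(cnf_text):
    """Map each server's hostname to the list of binlog directories configured for it."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(cnf_text)
    shared = ASSUMED_DEFAULT_DIRS
    if parser.has_section("server default"):
        shared = parser["server default"].get("master_binlog_dir", fallback=shared)
    dirs_by_host = {}
    for name in parser.sections():
        if name == "server default":
            continue
        section = parser[name]
        host = section.get("hostname", fallback=name)
        # A per-server value overrides the shared one.
        raw = section.get("master_binlog_dir", fallback=shared)
        dirs_by_host[host] = [d.strip().rstrip("/") for d in raw.split(",") if d.strip()]
    return dirs_by_host


def mismatched_nodes(cnf_text, actual_basenames):
    """Return hosts whose real binlog directory is not among the configured ones."""
    configured = effective_binlog_dirs(cnf_text)
    bad = []
    for host, basename in actual_basenames.items():
        actual_dir = posixpath.dirname(basename).rstrip("/")
        if actual_dir not in configured.get(host, []):
            bad.append(host)
    return sorted(bad)


if __name__ == "__main__":
    print(mismatched_nodes(SAMPLE_CNF, ACTUAL_BASENAMES))  # ['192.168.20.23']
```

Any host this reports would fail the binlog check once it becomes master, so either its binlog path or its master_binlog_dir entry should be adjusted first.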
[root@moban m3309]# masterha_check_repl --conf=/etc/mha/app1.cnf
Sun Aug 16 12:14:13 2020 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Sun Aug 16 12:14:13 2020 - [info] Reading application default configuration from /etc/mha/app1.cnf..
Sun Aug 16 12:14:13 2020 - [info] Reading server configuration from /etc/mha/app1.cnf..
Sun Aug 16 12:14:13 2020 - [info] MHA::MasterMonitor version 0.58.
Sun Aug 16 12:14:14 2020 - [info] GTID failover mode = 0
Sun Aug 16 12:14:14 2020 - [info] Dead Servers:
Sun Aug 16 12:14:14 2020 - [info] Alive Servers:
Sun Aug 16 12:14:14 2020 - [info] 192.168.20.22(192.168.20.22:3306)
Sun Aug 16 12:14:14 2020 - [info] 192.168.20.20(192.168.20.20:3306)
Sun Aug 16 12:14:14 2020 - [info] 192.168.20.23(192.168.20.23:3309)
Sun Aug 16 12:14:14 2020 - [info] Alive Slaves:
Sun Aug 16 12:14:14 2020 - [info] 192.168.20.22(192.168.20.22:3306) Version=5.7.30-log (oldest major version between slaves) log-bin:enabled
Sun Aug 16 12:14:14 2020 - [info] GTID ON
Sun Aug 16 12:14:14 2020 - [info] Replicating from 192.168.20.23(192.168.20.23:3309)
Sun Aug 16 12:14:14 2020 - [info] 192.168.20.20(192.168.20.20:3306) Version=5.7.30-log (oldest major version between slaves) log-bin:enabled
Sun Aug 16 12:14:14 2020 - [info] GTID ON
Sun Aug 16 12:14:14 2020 - [info] Replicating from 192.168.20.23(192.168.20.23:3309)
Sun Aug 16 12:14:14 2020 - [info] Current Alive Master: 192.168.20.23(192.168.20.23:3309)
Sun Aug 16 12:14:14 2020 - [info] Checking slave configurations..
Sun Aug 16 12:14:14 2020 - [info] read_only=1 is not set on slave 192.168.20.22(192.168.20.22:3306).
Sun Aug 16 12:14:14 2020 - [info] Checking replication filtering settings..
Sun Aug 16 12:14:14 2020 - [info] binlog_do_db= , binlog_ignore_db=
Sun Aug 16 12:14:14 2020 - [info] Replication filtering check ok.
Sun Aug 16 12:14:14 2020 - [info] GTID (with auto-pos) is not supported
Sun Aug 16 12:14:14 2020 - [info] Starting SSH connection tests..
Sun Aug 16 12:14:16 2020 - [info] All SSH connection tests passed successfully.
Sun Aug 16 12:14:16 2020 - [info] Checking MHA Node version..
Sun Aug 16 12:14:17 2020 - [info] Version check ok.
Sun Aug 16 12:14:17 2020 - [info] Checking SSH publickey authentication settings on the current master..
Sun Aug 16 12:14:17 2020 - [info] HealthCheck: SSH to 192.168.20.23 is reachable.
Sun Aug 16 12:14:17 2020 - [info] Checking recovery script configurations on 192.168.20.23(192.168.20.23:3309)..
Sun Aug 16 12:14:17 2020 - [info] Executing command: save_binary_logs --command=test --start_pos=4 --binlog_dir=/usr/local/mysql/data --output_file=/var/tmp/save_binary_logs_test --manager_version=0.58 --start_file=mysql-bin.000002
Sun Aug 16 12:14:17 2020 - [info] Connecting to root@192.168.20.23(192.168.20.23:22)..
Failed to save binary log: Binlog not found from /usr/local/mysql/data! If you got this error at MHA Manager, please set "master_binlog_dir=/path/to/binlog_directory_of_the_master" correctly in the MHA Manager's configuration file and try again.
at /usr/bin/save_binary_logs line 123.
eval {...} called at /usr/bin/save_binary_logs line 70
main::main() called at /usr/bin/save_binary_logs line 66
Sun Aug 16 12:14:17 2020 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln161] Binlog setting check failed!
Sun Aug 16 12:14:17 2020 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln408] Master configuration failed.
Sun Aug 16 12:14:17 2020 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln427] Error happened on checking configurations. at /usr/bin/masterha_check_repl line 48.
Sun Aug 16 12:14:17 2020 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln525] Error happened on monitoring servers.
Sun Aug 16 12:14:17 2020 - [info] Got exit code 1 (Not master dead).
MySQL Replication Health is NOT OK!
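In the output above, the actual cause ("Binlog not found from /usr/local/mysql/data!") sits on a line that carries no [error] tag at all, because it is the raw output of save_binary_logs; searching only for error skips it. A small filter that prints every line without a timestamp-and-level prefix makes such lines stand out. This is only a sketch: the helper name untagged_lines is made up, and the timestamp pattern is inferred from the log shown here rather than from any MHA specification.

```python
import re

# Lines written by the MHA logger look like:
#   "Sun Aug 16 12:14:17 2020 - [info] ..."
TAGGED = re.compile(r"^\w{3} \w{3} +\d{1,2} \d{2}:\d{2}:\d{2} \d{4} - \[(\w+)\]")

SAMPLE_LOG = """\
Sun Aug 16 12:14:17 2020 - [info] Connecting to root@192.168.20.23(192.168.20.23:22)..
Failed to save binary log: Binlog not found from /usr/local/mysql/data! If you got this error at MHA Manager, please set "master_binlog_dir=/path/to/binlog_directory_of_the_master" correctly in the MHA Manager's configuration file and try again.
at /usr/bin/save_binary_logs line 123.
Sun Aug 16 12:14:17 2020 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln161] Binlog setting check failed!
Sun Aug 16 12:14:17 2020 - [info] Got exit code 1 (Not master dead).
MySQL Replication Health is NOT OK!
"""


def untagged_lines(log_text):
    """Return non-empty lines that lack the '<timestamp> - [level]' prefix."""
    return [
        line.strip()
        for line in log_text.splitlines()
        if line.strip() and not TAGGED.match(line)
    ]


if __name__ == "__main__":
    for line in untagged_lines(SAMPLE_LOG):
        print(line)
```

Run against the full masterha_check_repl output, this leaves only the save_binary_logs message, its Perl stack trace, and the final health summary.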