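The passwordless SSH check failed. In my case the cause was that passwordless SSH had been set up for a regular user, while the MHA files had been created as root. The fix was to change the owner of every MHA-related file and directory to that regular user, after which the passwordless check succeeded. (It was most likely a permissions problem.)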
[hado@aproxy ~]$ masterha_check_repl --conf=/etc/mha/app1.cnf
Sun Jun 13 20:29:27 2021 - [info] Reading default configuration from /etc/masterha_default.cnf..
Sun Jun 13 20:29:27 2021 - [info] Reading application default configuration from /etc/mha/app1.cnf..
Sun Jun 13 20:29:27 2021 - [info] Reading server configuration from /etc/mha/app1.cnf..
Sun Jun 13 20:29:27 2021 - [info] MHA::MasterMonitor version 0.58.
Sun Jun 13 20:29:29 2021 - [info] GTID failover mode = 0
Sun Jun 13 20:29:29 2021 - [info] Dead Servers:
Sun Jun 13 20:29:29 2021 - [info] Alive Servers:
Sun Jun 13 20:29:29 2021 - [info] 192.168.72.90(192.168.72.90:3306)
Sun Jun 13 20:29:29 2021 - [info] 192.168.72.91(192.168.72.91:3306)
Sun Jun 13 20:29:29 2021 - [info] 192.168.72.92(192.168.72.92:3306)
Sun Jun 13 20:29:29 2021 - [info] Alive Slaves:
Sun Jun 13 20:29:29 2021 - [info] 192.168.72.91(192.168.72.91:3306) Version=5.7.28-log (oldest major version between slaves) log-bin:enabled
Sun Jun 13 20:29:29 2021 - [info] Replicating from 192.168.72.90(192.168.72.90:3306)
Sun Jun 13 20:29:29 2021 - [info] Primary candidate for the new Master (candidate_master is set)
Sun Jun 13 20:29:29 2021 - [info] 192.168.72.92(192.168.72.92:3306) Version=5.7.28-log (oldest major version between slaves) log-bin:enabled
Sun Jun 13 20:29:29 2021 - [info] Replicating from 192.168.72.90(192.168.72.90:3306)
Sun Jun 13 20:29:29 2021 - [info] Primary candidate for the new Master (candidate_master is set)
Sun Jun 13 20:29:29 2021 - [info] Current Alive Master: 192.168.72.90(192.168.72.90:3306)
Sun Jun 13 20:29:29 2021 - [info] Checking slave configurations..
Sun Jun 13 20:29:29 2021 - [info] Checking replication filtering settings..
Sun Jun 13 20:29:29 2021 - [info] binlog_do_db= , binlog_ignore_db= information_schema,mysql,performance_schema,sys
Sun Jun 13 20:29:29 2021 - [info] Replication filtering check ok.
Sun Jun 13 20:29:29 2021 - [info] GTID (with auto-pos) is not supported
Sun Jun 13 20:29:29 2021 - [info] Starting SSH connection tests..
Sun Jun 13 20:29:31 2021 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln427] Error happened on checking configurations. SSH Configuration Check Failed!
at /usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm line 372.
Sun Jun 13 20:29:31 2021 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln525] Error happened on monitoring servers.
Sun Jun 13 20:29:31 2021 - [info] Got exit code 1 (Not master dead).
MySQL Replication Health is NOT OK!
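To isolate the SSH part of this failure, the per-host check can be scripted. The following is only a sketch: it assumes the hado user and the three hosts from the log above, and check_ssh and the SSH_BIN override are hypothetical names introduced here, not part of MHA. BatchMode=yes makes ssh fail instead of prompting for a password, which is the condition a passwordless setup has to meet. MHA tests SSH between the nodes, so it is worth running this on each node as the same user that runs masterha_check_repl.

```shell
#!/bin/sh
# Sketch: test non-interactive SSH from this node to every MHA node.
# SSH_USER and HOSTS default to the values seen in the log; adjust as needed.
SSH_USER="${SSH_USER:-hado}"
HOSTS="${HOSTS:-192.168.72.90 192.168.72.91 192.168.72.92}"
SSH_BIN="${SSH_BIN:-ssh}"   # hypothetical override, handy for dry runs

check_ssh() {
  ssh_failed=0
  for ssh_host in $HOSTS; do
    # BatchMode=yes: fail instead of asking for a password
    if "$SSH_BIN" -o BatchMode=yes -o ConnectTimeout=5 \
         "${SSH_USER}@${ssh_host}" true 2>/dev/null; then
      echo "OK ${SSH_USER}@${ssh_host}"
    else
      echo "FAIL ${SSH_USER}@${ssh_host}"
      ssh_failed=1
    fi
  done
  # Non-zero if at least one host could not be reached without a password
  return "$ssh_failed"
}
```

Calling check_ssh prints one OK or FAIL line per host. MHA Manager also ships masterha_check_ssh --conf=/etc/mha/app1.cnf, which runs only the SSH connection tests.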
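Troubleshooting steps:
1. The error message points at passwordless SSH not being set up correctly within the cluster. I carefully rechecked passwordless SSH on all of my machines and found nothing wrong (the keys had been added with cat id_rsa.pub >> authorized_keys).
2. Since I had been running most things through sudo, I switched to root, changed the passwordless SSH user to root, and ran the check command as root. That passed without problems.
3. After that I suspected that a permissions problem was what kept passwordless login from working.
4. Following that idea, I changed the ownership of the files I had created.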
Note:
sudo chown -R hado:hado /etc/masterha_default.cnf
-------------------------------------
Note:
The mha directory is owned by root.
sudo chown -R hado:hado /etc/mha
Change the owner to hado.
------------------------------
The log directory is owned by root as well.
sudo chown -R hado:hado /var/log/mha_manager
Change the owner to hado.
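The three chown commands above can be double-checked with a small ownership report. This is a sketch under assumptions: the paths and the hado user are the ones used in these notes, and check_mha_owner is a hypothetical helper name. It relies on find's ! -user test to list anything still owned by someone else.

```shell
#!/bin/sh
# Sketch: report MHA paths that are missing or not owned by the expected user.
MHA_USER="${MHA_USER:-hado}"
MHA_PATHS="${MHA_PATHS:-/etc/masterha_default.cnf /etc/mha /var/log/mha_manager}"

check_mha_owner() {
  own_bad=0
  for own_path in $MHA_PATHS; do
    if [ ! -e "$own_path" ]; then
      echo "MISSING $own_path"
      own_bad=1
      continue
    fi
    # List everything under the path that is NOT owned by MHA_USER.
    if ! own_wrong=$(find "$own_path" ! -user "$MHA_USER" -print); then
      # find itself failed (for example, an unreadable subdirectory)
      echo "ERROR $own_path"
      own_bad=1
    elif [ -n "$own_wrong" ]; then
      echo "WRONG_OWNER $own_path"
      echo "$own_wrong" | sed 's/^/  /'
      own_bad=1
    else
      echo "OK $own_path"
    fi
  done
  return "$own_bad"
}
```

Running check_mha_owner on the manager node should print OK for all three paths once the chown commands have been applied; any WRONG_OWNER entry lists the files that still need fixing.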
Then a new problem appeared: a path could not be read.
[hado@aproxy ~]$ masterha_check_repl --conf=/etc/mha/app1.cnf
......
Sun Jun 13 20:38:30 2021 - [info] Executing command: save_binary_logs --command=test --start_pos=4 --binlog_dir=/var/lib/mysql --output_file=/var/tmp/save_binary_logs_test --manager_version=0.58 --start_file=mysql-bin.000003
Sun Jun 13 20:38:30 2021 - [info] Connecting to hado@192.168.72.90(192.168.72.90:22)..
Failed to save binary log: readdir() attempted on invalid dirhandle $dir at /usr/share/perl5/vendor_perl/MHA/BinlogManager.pm line 271.
Sun Jun 13 20:38:30 2021 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln161] Binlog setting check failed!
Sun Jun 13 20:38:30 2021 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln408] Master configuration failed.
Sun Jun 13 20:38:30 2021 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln427] Error happened on checking configurations. at /usr/bin/masterha_check_repl line 48.
Sun Jun 13 20:38:30 2021 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln525] Error happened on monitoring servers.
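Sun Jun 13 20:38:30 2021 - [info] Got exit code 1 (Not master dead).
Note: this happens on 90, that is, on the master. The hado user cannot read MySQL's binary logs there, so hado has to be added to the same group as mysql.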
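sudo usermod -aG mysql hado
Run this on the master (in fact, it has to be run on all three machines). The -a flag appends the group; without it, usermod -G replaces hado's existing supplementary groups.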
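Whether the group change actually took effect can be verified before rerunning masterha_check_repl. The sketch below assumes the mysql group and the /var/lib/mysql directory from the logs; has_group and check_mysql_access are hypothetical helper names, and the second one needs sudo rights to run a command as the target user.

```shell
#!/bin/sh
# has_group GROUP "LIST": succeed if GROUP is one of the
# space-separated names in LIST.
has_group() {
  for hg_name in $2; do
    [ "$hg_name" = "$1" ] && return 0
  done
  return 1
}

# check_mysql_access USER DIR: check group membership, then try to
# read DIR as USER.
check_mysql_access() {
  if ! has_group mysql "$(id -nG "$1")"; then
    echo "$1 is not in group mysql"
    return 1
  fi
  # sudo -u starts a new process for USER, so it should run with the
  # user's groups as they are defined now.
  if sudo -u "$1" sh -c 'test -r "$1" && test -x "$1"' sh "$2"; then
    echo "$1 can read $2"
  else
    echo "$1 cannot read $2"
    return 1
  fi
}
```

For example, check_mysql_access hado /var/lib/mysql on each of the three machines. Group changes only apply to new login sessions, so an already open shell for hado has to log in again to see the new group.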
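Then a "could not open" problem appeared (hado still has to be in the same group as mysql, so run the command above on this machine as well).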
Sun Jun 13 21:03:09 2021 - [info] Connecting to hado@192.168.72.91(192.168.72.91:22)..
Checking slave recovery environment settings..
Opening /var/lib/mysql/relay-log.info ...Could not open relay-log-info file /var/lib/mysql/relay-log.info.
at /usr/bin/apply_diff_relay_logs line 347.
Sun Jun 13 21:03:09 2021 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln208] Slaves settings check failed!
Sun Jun 13 21:03