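1. masterha_check_ssh --conf=/etc/mha/app1.cnf
When this check fails, I'd say about 90% of the time it is an SSH connectivity problem between the nodes. Make sure every node can reach every other node over SSH with key-based (passwordless) login!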
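A quick way to confirm key-based login before running masterha_check_ssh is sketched below. The node list comes from the logs later in this post; the root login, the timeout, and the function names are my own assumptions, so adjust them to your setup.

```shell
#!/bin/sh
# Node list taken from the logs in this post; replace with your own hosts.
NODES="103.75.1.22 103.75.1.23 103.75.1.24"

# check_ssh HOST: succeeds only if login works without any password prompt.
# BatchMode=yes makes ssh fail instead of asking for a password.
check_ssh() {
    ssh -o BatchMode=yes -o ConnectTimeout=5 "root@$1" true 2>/dev/null
}

# check_all: print one result line per node.
check_all() {
    for host in $NODES; do
        if check_ssh "$host"; then
            echo "$host: ok"
        else
            echo "$host: FAILED (try: ssh-copy-id root@$host)"
        fi
    done
}
```

Call check_all on every node, not only on the manager, because MHA needs each node to reach all the others.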
2. masterha_check_repl --conf=/etc/mha/app1.cnf
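(1) Error output:
The message says, roughly, that the replication user (copyuser) has no privileges on one of the nodes. The fix is to create that user on every node. If replication is already running, remember to run stop slave; on the node first, and then create the user on each node.
With MHA, it seems that binary logs and relay logs need to be enabled on all databases, the grants should be identical everywhere, and the configuration files should be basically the same. I think that with those prerequisites in place, installing and running MHA should not hit too many problems. For now, though, I can't be sure this is the correct approach.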
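The statements involved can be sketched as a small helper that prints the SQL for one node. The '%' host pattern, the password placeholder, and the function name are assumptions for illustration, not what I actually ran; the syntax is for MySQL 5.7, the version shown in the logs.

```shell
#!/bin/sh
# repl_user_sql USER PASSWORD: print the statements to run on one node.
# The '%' host pattern is an assumption; narrow it to your replication subnet.
repl_user_sql() {
    cat <<EOF
STOP SLAVE;
CREATE USER IF NOT EXISTS '$1'@'%' IDENTIFIED BY '$2';
GRANT REPLICATION SLAVE ON *.* TO '$1'@'%';
EOF
}
```

On each node this could be piped into the client, for example repl_user_sql copyuser 'change-me' | mysql -uroot -p, after which replication has to be started again on the slaves.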
(2) Error output:
Tue Apr 30 09:26:44 2019 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Tue Apr 30 09:26:44 2019 - [info] Reading application default configuration from /etc/mha/app1.cnf..
Tue Apr 30 09:26:44 2019 - [info] Reading server configuration from /etc/mha/app1.cnf..
Tue Apr 30 09:26:44 2019 - [info] MHA::MasterMonitor version 0.56.
Tue Apr 30 09:26:45 2019 - [info] GTID failover mode = 0
Tue Apr 30 09:26:45 2019 - [info] Dead Servers:
Tue Apr 30 09:26:45 2019 - [info] Alive Servers:
Tue Apr 30 09:26:45 2019 - [info] 103.75.1.22(103.75.1.22:3306)
Tue Apr 30 09:26:45 2019 - [info] 103.75.1.23(103.75.1.23:3306)
Tue Apr 30 09:26:45 2019 - [info] 103.75.1.24(103.75.1.24:3306)
Tue Apr 30 09:26:45 2019 - [info] Alive Slaves:
Tue Apr 30 09:26:45 2019 - [info] 103.75.1.23(103.75.1.23:3306) Version=5.7.25-log (oldest major version between slaves) log-bin:enabled
Tue Apr 30 09:26:45 2019 - [info] Replicating from 103.75.1.22(103.75.1.22:3306)
Tue Apr 30 09:26:45 2019 - [info] Primary candidate for the new Master (candidate_master is set)
Tue Apr 30 09:26:45 2019 - [info] 103.75.1.24(103.75.1.24:3306) Version=5.7.25-log (oldest major version between slaves) log-bin:enabled
Tue Apr 30 09:26:45 2019 - [info] Replicating from 103.75.1.22(103.75.1.22:3306)
Tue Apr 30 09:26:45 2019 - [info] Current Alive Master: 103.75.1.22(103.75.1.22:3306)
Tue Apr 30 09:26:45 2019 - [info] Checking slave configurations..
Tue Apr 30 09:26:45 2019 - [info] read_only=1 is not set on slave 103.75.1.24(103.75.1.24:3306).
Tue Apr 30 09:26:45 2019 - [info] Checking replication filtering settings..
Tue Apr 30 09:26:45 2019 - [info] binlog_do_db= , binlog_ignore_db=
Tue Apr 30 09:26:45 2019 - [info] Replication filtering check ok.
Tue Apr 30 09:26:45 2019 - [info] GTID (with auto-pos) is not supported
Tue Apr 30 09:26:45 2019 - [info] Starting SSH connection tests..
Tue Apr 30 09:26:53 2019 - [info] All SSH connection tests passed successfully.
Tue Apr 30 09:26:53 2019 - [info] Checking MHA Node version..
Tue Apr 30 09:26:57 2019 - [info] Version check ok.
Tue Apr 30 09:26:57 2019 - [info] Checking SSH publickey authentication settings on the current master..
Tue Apr 30 09:26:58 2019 - [info] HealthCheck: SSH to 103.75.1.22 is reachable.
Tue Apr 30 09:26:59 2019 - [info] Master MHA Node version is 0.56.
Tue Apr 30 09:26:59 2019 - [info] Checking recovery script configurations on 103.75.1.22(103.75.1.22:3306)..
Tue Apr 30 09:26:59 2019 - [info] Executing command: save_binary_logs --command=test --start_pos=4 --binlog_dir=/data --output_file=/data/mastermha/app1//save_binary_logs_test --manager_version=0.56 --start_file=master-bin.000008
Tue Apr 30 09:26:59 2019 - [info] Connecting to root@103.75.1.22(103.75.1.22:22)..
Failed to save binary log: Binlog not found from /data! If you got this error at MHA Manager, please set "master_binlog_dir=/path/to/binlog_directory_of_the_master" correctly in the MHA Manager's configuration file and try again.
 at /usr/bin/save_binary_logs line 123
 eval {...} called at /usr/bin/save_binary_logs line 70
 main::main() called at /usr/bin/save_binary_logs line 66
Tue Apr 30 09:27:00 2019 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln158] Binlog setting check failed!
Tue Apr 30 09:27:00 2019 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln405] Master configuration failed.
Tue Apr 30 09:27:00 2019 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln424] Error happened on checking configurations. at /usr/bin/masterha_check_repl line 48
Tue Apr 30 09:27:00 2019 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln523] Error happened on monitoring servers.
Tue Apr 30 09:27:00 2019 - [info] Got exit code 1 (Not master dead).
MySQL Replication Health is NOT OK!
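Fix:
If you have manually defined the path of the binary log files, you must specify master_binlog_dir='<directory containing the binary logs>' in the MHA configuration file. In my case the value was simply wrong, so I commented out the master_binlog_dir=/data line in app1.cnf with #. MHA then fell back to its default search path (/var/lib/mysql,/var/log/mysql in the successful run further down), which is where my binlogs actually are.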
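If you are not sure where the master keeps its binlogs, the directory can be derived from the server's log_bin_basename variable. The helper below is only a sketch: the function name is hypothetical, and it assumes the mysql client on the master can log in without extra options.

```shell
#!/bin/sh
# binlog_dir_line BASENAME: print the master_binlog_dir setting that matches
# a given log_bin_basename (for example /var/lib/mysql/master-bin).
binlog_dir_line() {
    printf 'master_binlog_dir=%s\n' "$(dirname "$1")"
}

# On the master, something like this feeds in the real value:
#   binlog_dir_line "$(mysql -N -B -e 'SELECT @@log_bin_basename')"
```

With the basename /var/lib/mysql/master-bin this prints master_binlog_dir=/var/lib/mysql, which can go into the MHA configuration file as it is.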
(3) Error output:
Tue Apr 30 10:04:21 2019 - [info] Checking replication health on 103.75.1.23..
Tue Apr 30 10:04:21 2019 - [info] ok.
Tue Apr 30 10:04:21 2019 - [info] Checking replication health on 103.75.1.24..
Tue Apr 30 10:04:21 2019 - [info] ok.
Tue Apr 30 10:04:21 2019 - [warning] master_ip_failover_script is not defined.
Tue Apr 30 10:04:21 2019 - [warning] shutdown_script is not defined.
Tue Apr 30 10:04:21 2019 - [info] Got exit code 0 (Not master dead).
MySQL Replication Health is OK.
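These warnings appear at the very end of the check and mean that the two scripts are not defined. With them undefined, starting the manager directly just hung for me. The fix is to add master_ip_failover_script='<path to the script file>' to the app1.cnf configuration file.
Script link: http://control.blog.sina.com.cn/admin/article/article_edit.php?blog_id=b4fca5310102yan0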
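To see at a glance which of the two parameters a configuration file sets, a small grep-based check like the following can help. It is a sketch with a hypothetical function name; it only looks for uncommented key=value lines and knows nothing else about the MHA file format.

```shell
#!/bin/sh
# report_scripts CNF: report whether each script parameter has an
# uncommented "key = value" line in the given MHA configuration file.
report_scripts() {
    for key in master_ip_failover_script shutdown_script; do
        if grep -Eq "^[[:space:]]*$key[[:space:]]*=" "$1"; then
            echo "$key: defined"
        else
            echo "$key: not defined"
        fi
    done
}
```

Usage would be report_scripts /etc/mha/app1.cnf, run on the manager node.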
(4) Error output:
103.75.1.22(103.75.1.22:3306) (current master)
 +--103.75.1.23(103.75.1.23:3306)
 +--103.75.1.24(103.75.1.24:3306)
Tue Apr 30 10:44:55 2019 - [info] Checking replication health on 103.75.1.23..
Tue Apr 30 10:44:55 2019 - [info] ok.
Tue Apr 30 10:44:55 2019 - [info] Checking replication health on 103.75.1.24..
Tue Apr 30 10:44:55 2019 - [info] ok.
Tue Apr 30 10:44:55 2019 - [info] Checking master_ip_failover_script status:
Tue Apr 30 10:44:55 2019 - [info] /data/mastermha/app1/master_ip_failover --command=status --ssh_user=root --orig_master_host=103.75.1.22 --orig_master_ip=103.75.1.22 --orig_master_port=3306
Tue Apr 30 10:44:55 2019 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln424] Error happened on checking configurations. Can't exec "/data/mastermha/app1/master_ip_failover": Permission denied at /usr/share/perl5/vendor_perl/MHA/ManagerUtil.pm line 68.
Tue Apr 30 10:44:55 2019 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln523] Error happened on monitoring servers.
Tue Apr 30 10:44:55 2019 - [info] Got exit code 1 (Not master dead).
MySQL Replication Health is NOT OK!
Tue Apr 30 10:44:55 2019 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln226] Failed to get master_ip_failover_script status with return code 1:0.
Tue Apr 30 10:44:55 2019 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln424] Error happened on checking configurations. at /usr/bin/masterha_check_repl line 48
Tue Apr 30 10:44:55 2019 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln523] Error happened on monitoring servers.
Tue Apr 30 10:44:55 2019 - [info] Got exit code 1 (Not master dead).
MySQL Replication Health is NOT OK!
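I dug through a lot of material for this error. I kept assuming that my master_ip_failover script itself was broken. It wasn't: the script simply had no execute permission, which is exactly what "Permission denied" in the log is saying.
Fix: grant the permission! chmod +x /data/mastermha/app1/master_ip_failover
Running the check again, the problem was gone!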
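The same check can be done before masterha_check_repl ever complains. The helper below is a sketch with a hypothetical name; it tests the execute bit and adds it only when it is missing.

```shell
#!/bin/sh
# ensure_executable FILE: add the execute bit if FILE exists but lacks it.
ensure_executable() {
    if [ ! -f "$1" ]; then
        echo "$1: missing" >&2
        return 1
    fi
    if [ -x "$1" ]; then
        echo "$1: already executable"
    else
        chmod +x "$1" && echo "$1: added execute permission"
    fi
}
```

For the script in this post that would be ensure_executable /data/mastermha/app1/master_ip_failover on the manager node.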
Here is the output of the successful run:
[root@localhost ~]# chmod +x /data/mastermha/app1/master_ip_failover
[root@localhost ~]# masterha_check_repl --conf=/etc/mha/app1.cnf
Tue Apr 30 10:51:59 2019 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Tue Apr 30 10:51:59 2019 - [info] Reading application default configuration from /etc/mha/app1.cnf..
Tue Apr 30 10:51:59 2019 - [info] Reading server configuration from /etc/mha/app1.cnf..
Tue Apr 30 10:51:59 2019 - [info] MHA::MasterMonitor version 0.56.
Tue Apr 30 10:52:00 2019 - [info] GTID failover mode = 0
Tue Apr 30 10:52:00 2019 - [info] Dead Servers:
Tue Apr 30 10:52:00 2019 - [info] Alive Servers:
Tue Apr 30 10:52:00 2019 - [info] 103.75.1.22(103.75.1.22:3306)
Tue Apr 30 10:52:00 2019 - [info] 103.75.1.23(103.75.1.23:3306)
Tue Apr 30 10:52:00 2019 - [info] 103.75.1.24(103.75.1.24:3306)
Tue Apr 30 10:52:00 2019 - [info] Alive Slaves:
Tue Apr 30 10:52:00 2019 - [info] 103.75.1.23(103.75.1.23:3306) Version=5.7.25-log (oldest major version between slaves) log-bin:enabled
Tue Apr 30 10:52:00 2019 - [info] Replicating from 103.75.1.22(103.75.1.22:3306)
Tue Apr 30 10:52:00 2019 - [info] Primary candidate for the new Master (candidate_master is set)
Tue Apr 30 10:52:00 2019 - [info] 103.75.1.24(103.75.1.24:3306) Version=5.7.25-log (oldest major version between slaves) log-bin:enabled
Tue Apr 30 10:52:00 2019 - [info] Replicating from 103.75.1.22(103.75.1.22:3306)
Tue Apr 30 10:52:00 2019 - [info] Current Alive Master: 103.75.1.22(103.75.1.22:3306)
Tue Apr 30 10:52:00 2019 - [info] Checking slave configurations..
Tue Apr 30 10:52:00 2019 - [info] read_only=1 is not set on slave 103.75.1.24(103.75.1.24:3306).
Tue Apr 30 10:52:00 2019 - [info] Checking replication filtering settings..
Tue Apr 30 10:52:00 2019 - [info] binlog_do_db= , binlog_ignore_db=
Tue Apr 30 10:52:00 2019 - [info] Replication filtering check ok.
Tue Apr 30 10:52:00 2019 - [info] GTID (with auto-pos) is not supported
Tue Apr 30 10:52:00 2019 - [info] Starting SSH connection tests..
Tue Apr 30 10:52:07 2019 - [info] All SSH connection tests passed successfully.
Tue Apr 30 10:52:07 2019 - [info] Checking MHA Node version..
Tue Apr 30 10:52:11 2019 - [info] Version check ok.
Tue Apr 30 10:52:11 2019 - [info] Checking SSH publickey authentication settings on the current master..
Tue Apr 30 10:52:12 2019 - [info] HealthCheck: SSH to 103.75.1.22 is reachable.
Tue Apr 30 10:52:14 2019 - [info] Master MHA Node version is 0.56.
Tue Apr 30 10:52:14 2019 - [info] Checking recovery script configurations on 103.75.1.22(103.75.1.22:3306)..
Tue Apr 30 10:52:14 2019 - [info] Executing command: save_binary_logs --command=test --start_pos=4 --binlog_dir=/var/lib/mysql,/var/log/mysql --output_file=/data/mastermha/app1//save_binary_logs_test --manager_version=0.56 --start_file=master-bin.000008
Tue Apr 30 10:52:14 2019 - [info] Connecting to root@103.75.1.22(103.75.1.22:22)..
 Creating /data/mastermha/app1 if not exists.. ok.
 Checking output directory is accessible or not..
 ok.
 Binlog found at /var/lib/mysql, up to master-bin.000008
Tue Apr 30 10:52:16 2019 - [info] Binlog setting check done.
Tue Apr 30 10:52:16 2019 - [info] Checking SSH publickey authentication and checking recovery script configurations on all alive slave servers..
Tue Apr 30 10:52:16 2019 - [info] Executing command : apply_diff_relay_logs --command=test --slave_user='mhauser' --slave_host=103.75.1.23 --slave_ip=103.75.1.23 --slave_port=3306 --workdir=/data/mastermha/app1/ --target_version=5.7.25-log --manager_version=0.56 --relay_log_info=/var/lib/mysql/relay-log.info --relay_dir=/var/lib/mysql/ --slave_pass=xxx
Tue Apr 30 10:52:16 2019 - [info] Connecting to root@103.75.1.23(103.75.1.23:22)..
 Checking slave recovery environment settings..
 Opening /var/lib/mysql/relay-log.info ... ok.
 Relay log found at /var/lib/mysql, up to relay-log.000005
 Temporary relay log file is /var/lib/mysql/relay-log.000005
 Testing mysql connection and privileges..mysql: [Warning] Using a password on the command line interface can be insecure.
 done.
 Testing mysqlbinlog output.. done.
 Cleaning up test file(s).. done.
Tue Apr 30 10:52:17 2019 - [info] Executing command : apply_diff_relay_logs --command=test --slave_user='mhauser' --slave_host=103.75.1.24 --slave_ip=103.75.1.24 --slave_port=3306 --workdir=/data/mastermha/app1/ --target_version=5.7.25-log --manager_version=0.56 --relay_log_info=/var/lib/mysql/relay-log.info --relay_dir=/var/lib/mysql/ --slave_pass=xxx
Tue Apr 30 10:52:17 2019 - [info] Connecting to root@103.75.1.24(103.75.1.24:22)..
 Checking slave recovery environment settings..
 Opening /var/lib/mysql/relay-log.info ... ok.
 Relay log found at /var/lib/mysql, up to relay-log.000006
 Temporary relay log file is /var/lib/mysql/relay-log.000006
 Testing mysql connection and privileges..mysql: [Warning] Using a password on the command line interface can be insecure.
 done.
 Testing mysqlbinlog output.. done.
 Cleaning up test file(s).. done.
Tue Apr 30 10:52:19 2019 - [info] Slaves settings check done.
Tue Apr 30 10:52:19 2019 - [info]
103.75.1.22(103.75.1.22:3306) (current master)
 +--103.75.1.23(103.75.1.23:3306)
 +--103.75.1.24(103.75.1.24:3306)
Tue Apr 30 10:52:19 2019 - [info] Checking replication health on 103.75.1.23..
Tue Apr 30 10:52:19 2019 - [info] ok.
Tue Apr 30 10:52:19 2019 - [info] Checking replication health on 103.75.1.24..
Tue Apr 30 10:52:19 2019 - [info] ok.
Tue Apr 30 10:52:19 2019 - [info] Checking master_ip_failover_script status:
Tue Apr 30 10:52:19 2019 - [info] /data/mastermha/app1/master_ip_failover --command=status --ssh_user=root --orig_master_host=103.75.1.22 --orig_master_ip=103.75.1.22 --orig_master_port=3306
IN SCRIPT TEST====/sbin/ifconfig bond1:1 down==/sbin/ifconfig bond1:1 103.75.1.30/26===
Checking the Status of the script.. OK
SIOCSIFADDR: No such device
SIOCSIFNETMASK: No such device
SIOCGIFADDR: No such device
SIOCSIFBROADCAST: No such device
bond1:1: unknown interface: No such device
Tue Apr 30 10:52:21 2019 - [info] OK.
Tue Apr 30 10:52:21 2019 - [warning] shutdown_script is not defined.
Tue Apr 30 10:52:21 2019 - [info] Got exit code 0 (Not master dead).
MySQL Replication Health is OK.
3. masterha_manager --conf=/etc/mha/app1.cnf
I got stuck here. After looking things up, I found that the manager has to be started in a different way:
[root@localhost ~]# nohup masterha_manager --conf=/etc/mha/app1.cnf > /data/mastermha/app1/manager.log 2>&1 &
[1] 2190
The line above is the startup command. It needs the configuration file and a log file to write to.
[root@localhost ~]# masterha_check_status --conf=/etc/mha/app1.cnf
app1 monitoring program is now on initialization phase(10:INITIALIZING_MONITOR). Wait for a while and try checking again.
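Checking the status right away reports that the monitor is still initializing. Wait a little while and run it again, and you will see that it has started successfully: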
[root@localhost ~]# masterha_check_status --conf=/etc/mha/app1.cnf
app1 (pid:2190) is running(0:PING_OK), master:103.75.1.22
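Instead of re-running the status command by hand, the wait can be scripted. The loop below is a sketch: the function name and the MAX_TRIES and INTERVAL variables are my own, and it simply greps the status output for the PING_OK string shown above.

```shell
#!/bin/sh
# wait_for_ping_ok CMD [ARGS...]: re-run a status command until its output
# contains PING_OK. Gives up after MAX_TRIES attempts (default 10), sleeping
# INTERVAL seconds (default 3) between attempts.
wait_for_ping_ok() {
    tries=0
    while [ "$tries" -lt "${MAX_TRIES:-10}" ]; do
        if "$@" | grep -q 'PING_OK'; then
            return 0
        fi
        tries=$((tries + 1))
        sleep "${INTERVAL:-3}"
    done
    return 1
}
```

It would be called as wait_for_ping_ok masterha_check_status --conf=/etc/mha/app1.cnf, and the return code tells you whether the manager came up in time.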