MySQL Cluster and MHA: Building an MHA Cluster for MySQL

Environment

manager  192.168.137.141

master1  192.168.137.144

master2  192.168.137.145

slave    192.168.137.141 (same host as the manager)

vip      192.168.137.199

Deployment

MySQL is already installed on all three machines.

Packages (Baidu Cloud): https://pan.baidu.com/s/1an3QjoFFdqcjo5-KWRCShw password: wsq9

On 192.168.137.141 (manager and slave), install:

mha4mysql-manager-0.55-0.el6.noarch.rpm

mha4mysql-node-0.54-0.el6.noarch.rpm

On 192.168.137.144 and 192.168.137.145, install:

mha4mysql-node-0.54-0.el6.noarch.rpm

Dependencies

Link: https://pan.baidu.com/s/1mTcoBiUsvATQMkM3bXJKrA password: rpvq

yum install perl perl-ExtUtils-CBuilder perl-ExtUtils-MakeMaker perl-CPAN perl-DBD-MySQL

Install these on every machine.

Main module 1: set up passwordless SSH login before configuring the cluster

192.168.137.141 --> 192.168.137.144

192.168.137.141 --> 192.168.137.145

192.168.137.144 --> 192.168.137.141

192.168.137.144 --> 192.168.137.145

192.168.137.145 --> 192.168.137.141

192.168.137.145 --> 192.168.137.144

Six one-way trusts in total.
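The six trusts listed above can be generated with a loop instead of typing them by hand. This is a minimal sketch, assuming root logins, the default key path `~/.ssh/id_rsa`, and that `ssh-keygen` and `ssh-copy-id` are installed. The function name `print_ssh_setup` is made up for illustration. It only prints the commands so they can be reviewed first.

```bash
#!/usr/bin/env bash
# Hosts from the environment section of this article.
HOSTS=(192.168.137.141 192.168.137.144 192.168.137.145)

# Print the key-distribution commands to run ON the given host.
# Nothing is executed here; review the output, then run it on that machine.
print_ssh_setup() {
  local self="$1" peer
  echo "# run on ${self}"
  echo "[ -f ~/.ssh/id_rsa ] || ssh-keygen -t rsa -N '' -f ~/.ssh/id_rsa"
  for peer in "${HOSTS[@]}"; do
    [ "$peer" = "$self" ] && continue
    echo "ssh-copy-id root@${peer}"
  done
}

for h in "${HOSTS[@]}"; do
  print_ssh_setup "$h"
done
```

Each host gets two `ssh-copy-id` lines, six in total. When the manager shares a machine with a database node, copying the key to the local address as well may be needed; the ssh entry in the error section covers that case.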

Main module 2: configure master-slave replication for the cluster

master:192.168.137.144

slave:192.168.137.145

master:192.168.137.144

slave:192.168.137.141
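The replication layout above (144 as master, 145 and 141 as slaves) comes down to a few SQL statements. Here is a rough sketch that only prints them, assuming the `sync`/`sync` account used later in app1.cnf. The `'192.168.137.%'` host pattern and the function names `master_sql` and `slave_sql` are my own assumptions. The log file and position are placeholders and must be taken from `SHOW MASTER STATUS` on the real master.

```bash
#!/usr/bin/env bash
# Replication account used throughout this article (repl_user/repl_password).
REPL_USER=sync
REPL_PASS=sync

# SQL to run once on the master: create the replication account.
# The '192.168.137.%' host pattern is an assumption; narrow it as needed.
master_sql() {
  cat <<EOF
CREATE USER IF NOT EXISTS '${REPL_USER}'@'192.168.137.%' IDENTIFIED BY '${REPL_PASS}';
GRANT REPLICATION SLAVE ON *.* TO '${REPL_USER}'@'192.168.137.%';
EOF
}

# SQL to run on each slave. File and position must come from
# SHOW MASTER STATUS on the master; the values passed below are placeholders.
slave_sql() {
  local master_host="$1" log_file="$2" log_pos="$3"
  cat <<EOF
CHANGE MASTER TO MASTER_HOST='${master_host}', MASTER_USER='${REPL_USER}', MASTER_PASSWORD='${REPL_PASS}', MASTER_LOG_FILE='${log_file}', MASTER_LOG_POS=${log_pos};
START SLAVE;
EOF
}

master_sql
slave_sql 192.168.137.144 mysql-bin.000002 154
```

The output can be piped into the `mysql` client on the relevant host. Since any candidate master may be promoted, the same replication account should exist on 145 as well.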

Main module 3: manager configuration

mkdir -p /etc/masterha

cd /etc/masterha

vim app1.cnf

app1.cnf

[server default]

user=root

password=lizhenghua

manager_workdir=/etc/masterha

manager_log=/etc/masterha/logs/manager.log

remote_workdir=/etc/masterha

ssh_user=root

repl_user=sync

repl_password=sync

ping_interval=3

#master_ip_online_change_script=/etc/masterha/script/master_ip_online_change

master_ip_failover_script=/etc/masterha/script/master_ip_failover

#report_script=/etc/masterha/script/sendMail_report

[server1]

hostname=192.168.137.144

port=3306

master_binlog_dir=/usr/local/mysql/data/

candidate_master=1

[server2]

hostname=192.168.137.145

port=3306

master_binlog_dir=/usr/local/mysql/data/

candidate_master=1

[server3]

hostname=192.168.137.141

port=3306

master_binlog_dir=/usr/local/mysql/data/

candidate_master=1
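Before running the checks, it may help to confirm that the paths named in app1.cnf really exist. The sketch below reads the file with `sed`. The helper names `list_mha_hosts` and `get_mha_param` are made up for illustration. I am assuming that the directory for `manager_log` has to exist beforehand and that the failover script must be executable.

```bash
#!/usr/bin/env bash
# List the hostname= values declared in an MHA application config.
list_mha_hosts() {
  local conf="$1"
  sed -n 's/^[[:space:]]*hostname[[:space:]]*=[[:space:]]*//p' "$conf"
}

# Print the value of one key from the file (first match wins).
# Commented-out lines (starting with #) are ignored by the anchored pattern.
get_mha_param() {
  local conf="$1" key="$2"
  sed -n "s/^[[:space:]]*${key}[[:space:]]*=[[:space:]]*//p" "$conf" | head -n 1
}

CONF="${1:-/etc/masterha/app1.cnf}"
if [ -r "$CONF" ]; then
  echo "hosts: $(list_mha_hosts "$CONF" | tr '\n' ' ')"
  log="$(get_mha_param "$CONF" manager_log)"
  # Assumption: the manager cannot create a missing log directory itself.
  [ -d "$(dirname "$log")" ] || echo "missing log directory: $(dirname "$log")"
  script="$(get_mha_param "$CONF" master_ip_failover_script)"
  [ -z "$script" ] || [ -x "$script" ] || echo "failover script not executable: $script"
fi
```

With the config above, a missing directory would be fixed by `mkdir -p /etc/masterha/logs /etc/masterha/script`, and a non-executable script by `chmod +x` on it.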

Sub-module 1: check SSH

[Screenshot: masterha_check_ssh output]

masterha_check_ssh --conf=/etc/masterha/app1.cnf

Sub-module 2: check replication

[Screenshot: masterha_check_repl output]

masterha_check_repl --conf=/etc/masterha/app1.cnf

Starting the manager

vim start.sh

#!/usr/bin/bash

nohup perl /root/perl5/bin/masterha_manager --conf=/etc/masterha/app1.cnf &
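Once the manager is running in the background, the MHA manager package also provides `masterha_check_status` and `masterha_stop` to query and stop it. Below is a small sketch with two wrappers, assuming those tools are on `PATH`. In the setup above they may live under /root/perl5/bin instead. The wrapper names are my own.

```bash
#!/usr/bin/env bash
# Config path used by the check commands above.
CONF=/etc/masterha/app1.cnf

# Thin wrappers around the MHA manager tools.
mha_status() { masterha_check_status --conf="$CONF"; }
mha_stop()   { masterha_stop --conf="$CONF"; }

# Only query the manager when the tool is actually on PATH.
if command -v masterha_check_status >/dev/null 2>&1; then
  mha_status || echo "manager is not running for $CONF"
else
  echo "masterha_check_status not found in PATH"
fi
```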

Functional test

On 192.168.137.141 (slave):

mysql> show slave status \G

*************************** 1. row ***************************

Slave_IO_State: Waiting for master to send event

Master_Host: 192.168.137.144

Master_User: sync

Master_Port: 3306

Connect_Retry: 60

Master_Log_File: mysql-bin.000003

Read_Master_Log_Pos: 154

Relay_Log_File: mysql-relay-bin.000004

Relay_Log_Pos: 367

Relay_Master_Log_File: mysql-bin.000003

Slave_IO_Running: Yes

Slave_SQL_Running: Yes

Replicate_Do_DB:

Replicate_Ignore_DB:

Replicate_Do_Table:

Replicate_Ignore_Table:

Replicate_Wild_Do_Table:

Replicate_Wild_Ignore_Table:

Last_Errno: 0

Last_Error:

Skip_Counter: 0

Exec_Master_Log_Pos: 154

Relay_Log_Space: 740

Until_Condition: None

Until_Log_File:

Until_Log_Pos: 0

Master_SSL_Allowed: No

Master_SSL_CA_File:

Master_SSL_CA_Path:

Master_SSL_Cert:

Master_SSL_Cipher:

Master_SSL_Key:

Seconds_Behind_Master: 0

Master_SSL_Verify_Server_Cert: No

Last_IO_Errno: 0

Last_IO_Error:

Last_SQL_Errno: 0

Last_SQL_Error:

Replicate_Ignore_Server_Ids:

Master_Server_Id: 144

Master_UUID: 9abbdca0-424e-11e8-a71a-000c29deb434

Master_Info_File: /usr/local/mysql/data/master.info

SQL_Delay: 0

SQL_Remaining_Delay: NULL

Slave_SQL_Running_State: Slave has read all relay log; waiting for more updates

Master_Retry_Count: 86400

Master_Bind:

Last_IO_Error_Timestamp:

Last_SQL_Error_Timestamp:

Master_SSL_Crl:

Master_SSL_Crlpath:

Retrieved_Gtid_Set:

Executed_Gtid_Set:

Auto_Position: 0

Replicate_Rewrite_DB:

Channel_Name:

Master_TLS_Version:

1 row in set (0.00 sec)

Kill MySQL on 192.168.137.144; the slave on 141 then switches to the new master.

mysql> show slave status \G

*************************** 1. row ***************************

Slave_IO_State: Waiting for master to send event

Master_Host: 192.168.137.145

Master_User: sync

Master_Port: 3306

Connect_Retry: 60

Master_Log_File: mysql-bin.000003

Read_Master_Log_Pos: 154

Relay_Log_File: mysql-relay-bin.000002

Relay_Log_Pos: 320

Relay_Master_Log_File: mysql-bin.000003

Slave_IO_Running: Yes

Slave_SQL_Running: Yes

Replicate_Do_DB:

Replicate_Ignore_DB:

Replicate_Do_Table:

Replicate_Ignore_Table:

Replicate_Wild_Do_Table:

Replicate_Wild_Ignore_Table:

Last_Errno: 0

Last_Error:

Skip_Counter: 0

Exec_Master_Log_Pos: 154

Relay_Log_Space: 527

Until_Condition: None

Until_Log_File:

Until_Log_Pos: 0

Master_SSL_Allowed: No

Master_SSL_CA_File:

Master_SSL_CA_Path:

Master_SSL_Cert:

Master_SSL_Cipher:

Master_SSL_Key:

Seconds_Behind_Master: 0

Master_SSL_Verify_Server_Cert: No

Last_IO_Errno: 0

Last_IO_Error:

Last_SQL_Errno: 0

Last_SQL_Error:

Replicate_Ignore_Server_Ids:

Master_Server_Id: 145

Master_UUID: 9abbdca0-424e-11e8-a71a-000c29deb433

Master_Info_File: /usr/local/mysql/data/master.info

SQL_Delay: 0

SQL_Remaining_Delay: NULL

Slave_SQL_Running_State: Slave has read all relay log; waiting for more updates

Master_Retry_Count: 86400

Master_Bind:

Last_IO_Error_Timestamp:

Last_SQL_Error_Timestamp:

Master_SSL_Crl:

Master_SSL_Crlpath:

Retrieved_Gtid_Set:

Executed_Gtid_Set:

Auto_Position: 0

Replicate_Rewrite_DB:

Channel_Name:

Master_TLS_Version:

1 row in set (0.00 sec)
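In these long dumps only three fields matter for the failover test: `Master_Host`, `Slave_IO_Running` and `Slave_SQL_Running`. Here is a small sketch that pulls them out of `SHOW SLAVE STATUS\G` text read on stdin. The function name `summarize_slave_status` is made up for illustration.

```bash
#!/usr/bin/env bash
# Read "SHOW SLAVE STATUS\G" text on stdin and print a one-line summary.
# Exit status is 0 only when both replication threads report Yes.
summarize_slave_status() {
  awk -F': *' '
    { gsub(/^[ \t]+/, "", $1) }
    $1 == "Master_Host"       { host = $2 }
    $1 == "Slave_IO_Running"  { io = $2 }
    $1 == "Slave_SQL_Running" { sql = $2 }
    END {
      printf "master=%s io=%s sql=%s\n", host, io, sql
      exit !(io == "Yes" && sql == "Yes")
    }'
}

# Demo with three lines taken from the output after the switch.
summarize_slave_status <<'EOF'
             Master_Host: 192.168.137.145
        Slave_IO_Running: Yes
       Slave_SQL_Running: Yes
EOF
```

On a real slave the input would come from something like `mysql -uroot -p -e 'SHOW SLAVE STATUS\G' | summarize_slave_status`.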

Log on the manager

[Screenshot: manager log showing the failover]

Switchover succeeded!

Main module 4: VIP

master_ip_failover

#!/usr/bin/env perl

use strict;

use warnings FATAL => 'all';

use Getopt::Long;

my (

$command, $ssh_user, $orig_master_host, $orig_master_ip,

$orig_master_port, $new_master_host, $new_master_ip, $new_master_port

);

my $vip = '192.168.137.199/24';

my $key = '0';

my $ssh_start_vip = "/usr/sbin/ifconfig ens33:$key $vip netmask 255.255.255.0 up";

my $ssh_stop_vip = "/usr/sbin/ifconfig ens33:$key down";

GetOptions(

'command=s' => \$command,

'ssh_user=s' => \$ssh_user,

'orig_master_host=s' => \$orig_master_host,

'orig_master_ip=s' => \$orig_master_ip,

'orig_master_port=i' => \$orig_master_port,

'new_master_host=s' => \$new_master_host,

'new_master_ip=s' => \$new_master_ip,

'new_master_port=i' => \$new_master_port,

);

exit &main();

sub main {

print "\n\nIN SCRIPT TEST====$ssh_stop_vip==$ssh_start_vip===\n\n";

if ( $command eq "stop" || $command eq "stopssh" ) {

my $exit_code = 1;

eval {

print "Disabling the VIP on old master: $orig_master_host \n";

&stop_vip();

$exit_code = 0;

};

if ($@) {

warn "Got Error: $@\n";

exit $exit_code;

}

exit $exit_code;

}

elsif ( $command eq "start" ) {

my $exit_code = 10;

eval {

print "Enabling the VIP - $vip on the new master - $new_master_host \n";

&start_vip();

$exit_code = 0;

};

if ($@) {

warn $@;

exit $exit_code;

}

exit $exit_code;

}

elsif ( $command eq "status" ) {

print "Checking the Status of the script.. OK \n";

exit 0;

}

else {

&usage();

exit 1;

}

}

sub start_vip() {

`ssh $ssh_user\@$new_master_host \" $ssh_start_vip \"`;

}

sub stop_vip() {

return 0 unless ($ssh_user);

`ssh $ssh_user\@$orig_master_host \" $ssh_stop_vip \"`;

}

sub usage {

print

"Usage: master_ip_failover --command=start|stop|stopssh|status --orig_master_host=host --orig_master_ip=ip

--orig_master_port=port --new_master_host=host --new_master_ip=ip --new_master_port=port\n";

}

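Testing VIP failover

Step 1: start the manager

[root@localhost masterha]# ./start.sh

[root@localhost masterha]# nohup: appending output to 'nohup.out'

Make sure it has actually started (if the SSH and replication checks both passed, it normally will).

For the contents of start.sh, see above.

Step 2: stop MySQL on the master (192.168.137.144)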

[root@localhost masterha]# ssh root@192.168.137.144 "service mysql stop"

Shutting down MySQL............ SUCCESS!

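Check manager.log:

[Screenshot: manager.log during the failover]

On the original master1 (192.168.137.144), the VIP has been taken down:

[Screenshot: interfaces on 192.168.137.144, VIP absent]

The VIP has moved to the new master, master2 (192.168.137.145):

[Screenshot: interfaces on 192.168.137.145, VIP present]

The VIP switchover succeeded!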
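The same check can be done from the command line by looking for the VIP in the interface list. This is a minimal sketch, assuming the `ip` tool is available on the database hosts. The function name `has_vip` is my own. It reads the output of `ip -o -4 addr show` on stdin.

```bash
#!/usr/bin/env bash
# VIP from the environment section.
VIP=192.168.137.199

# Succeeds when the text on stdin (output of `ip -o -4 addr show`)
# contains the VIP as an assigned address.
has_vip() {
  grep -qF "inet ${VIP}/"
}

# Demo on a sample line; on a real host, feed it live output instead:
#   ssh root@192.168.137.145 "ip -o -4 addr show" | has_vip && echo "VIP on 145"
sample='2: ens33    inet 192.168.137.199/24 brd 192.168.137.255 scope global secondary ens33:0'
if printf '%s\n' "$sample" | has_vip; then
  echo "VIP ${VIP} is present"
fi
```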

Now check replication as well. Since MySQL on master1 (192.168.137.144) is down, the slave should be replicating from the new master, master2 (192.168.137.145).

Before the failover:

mysql> show slave status \G

*************************** 1. row ***************************

Slave_IO_State: Waiting for master to send event

Master_Host: 192.168.137.144

Master_User: sync

Master_Port: 3306

Connect_Retry: 60

Master_Log_File: mysql-bin.000016

Read_Master_Log_Pos: 154

Relay_Log_File: mysql-relay-bin.000002

Relay_Log_Pos: 320

Relay_Master_Log_File: mysql-bin.000016

Slave_IO_Running: Yes

Slave_SQL_Running: Yes

Replicate_Do_DB:

Replicate_Ignore_DB:

Replicate_Do_Table:

Replicate_Ignore_Table:

Replicate_Wild_Do_Table:

Replicate_Wild_Ignore_Table:

Last_Errno: 0

Last_Error:

Skip_Counter: 0

Exec_Master_Log_Pos: 154

Relay_Log_Space: 527

Until_Condition: None

Until_Log_File:

Until_Log_Pos: 0

Master_SSL_Allowed: No

Master_SSL_CA_File:

Master_SSL_CA_Path:

Master_SSL_Cert:

Master_SSL_Cipher:

Master_SSL_Key:

Seconds_Behind_Master: 0

Master_SSL_Verify_Server_Cert: No

Last_IO_Errno: 0

Last_IO_Error:

Last_SQL_Errno: 0

Last_SQL_Error:

Replicate_Ignore_Server_Ids:

Master_Server_Id: 144

Master_UUID: 9abbdca0-424e-11e8-a71a-000c29deb434

Master_Info_File: /usr/local/mysql/data/master.info

SQL_Delay: 0

SQL_Remaining_Delay: NULL

Slave_SQL_Running_State: Slave has read all relay log; waiting for more updates

Master_Retry_Count: 86400

Master_Bind:

Last_IO_Error_Timestamp:

Last_SQL_Error_Timestamp:

Master_SSL_Crl:

Master_SSL_Crlpath:

Retrieved_Gtid_Set:

Executed_Gtid_Set:

Auto_Position: 0

Replicate_Rewrite_DB:

Channel_Name:

Master_TLS_Version:

1 row in set (0.00 sec)

After the failover:

mysql> show slave status \G

*************************** 1. row ***************************

Slave_IO_State: Waiting for master to send event

Master_Host: 192.168.137.145

Master_User: sync

Master_Port: 3306

Connect_Retry: 60

Master_Log_File: mysql-bin.000005

Read_Master_Log_Pos: 154

Relay_Log_File: mysql-relay-bin.000002

Relay_Log_Pos: 320

Relay_Master_Log_File: mysql-bin.000005

Slave_IO_Running: Yes

Slave_SQL_Running: Yes

Replicate_Do_DB:

Replicate_Ignore_DB:

Replicate_Do_Table:

Replicate_Ignore_Table:

Replicate_Wild_Do_Table:

Replicate_Wild_Ignore_Table:

Last_Errno: 0

Last_Error:

Skip_Counter: 0

Exec_Master_Log_Pos: 154

Relay_Log_Space: 527

Until_Condition: None

Until_Log_File:

Until_Log_Pos: 0

Master_SSL_Allowed: No

Master_SSL_CA_File:

Master_SSL_CA_Path:

Master_SSL_Cert:

Master_SSL_Cipher:

Master_SSL_Key:

Seconds_Behind_Master: 0

Master_SSL_Verify_Server_Cert: No

Last_IO_Errno: 0

Last_IO_Error:

Last_SQL_Errno: 0

Last_SQL_Error:

Replicate_Ignore_Server_Ids:

Master_Server_Id: 145

Master_UUID: 9abbdca0-424e-11e8-a71a-000c29deb433

Master_Info_File: /usr/local/mysql/data/master.info

SQL_Delay: 0

SQL_Remaining_Delay: NULL

Slave_SQL_Running_State: Slave has read all relay log; waiting for more updates

Master_Retry_Count: 86400

Master_Bind:

Last_IO_Error_Timestamp:

Last_SQL_Error_Timestamp:

Master_SSL_Crl:

Master_SSL_Crlpath:

Retrieved_Gtid_Set:

Executed_Gtid_Set:

Auto_Position: 0

Replicate_Rewrite_DB:

Channel_Name:

Master_TLS_Version:

1 row in set (0.00 sec)

VIP failover works and replication works: the MHA setup is complete!

Errors encountered:

SSH

The SSH test reports an error

[root@localhost app]# masterha_check_ssh --conf=/etc/masterha/app/app1.cnf

Thu May 17 01:04:32 2018 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.

Thu May 17 01:04:32 2018 - [info] Reading application default configurations from /etc/masterha/app/app1.cnf..

Thu May 17 01:04:32 2018 - [info] Reading server configurations from /etc/masterha/app/app1.cnf..

Thu May 17 01:04:32 2018 - [info] Starting SSH connection tests..

Thu May 17 01:04:33 2018 - [error][/usr/share/perl5/vendor_perl/MHA/SSHCheck.pm, ln63]

Thu May 17 01:04:32 2018 - [debug] Connecting via SSH from root@192.168.137.144(192.168.137.144:22) to root@192.168.137.145(192.168.137.145:22)..

Permission denied (publickey,gssapi-keyex,gssapi-with-mic,password).

Thu May 17 01:04:32 2018 - [error][/usr/share/perl5/vendor_perl/MHA/SSHCheck.pm, ln111] SSH connection from root@192.168.137.144(192.168.137.144:22) to root@192.168.137.145(192.168.137.145:22) failed!

Thu May 17 01:04:33 2018 - [error][/usr/share/perl5/vendor_perl/MHA/SSHCheck.pm, ln63]

Thu May 17 01:04:32 2018 - [debug] Connecting via SSH from root@192.168.137.145(192.168.137.145:22) to root@192.168.137.144(192.168.137.144:22)..

Permission denied (publickey,gssapi-keyex,gssapi-with-mic,password).

Thu May 17 01:04:33 2018 - [error][/usr/share/perl5/vendor_perl/MHA/SSHCheck.pm, ln111] SSH connection from root@192.168.137.145(192.168.137.145:22) to root@192.168.137.144(192.168.137.144:22) failed!

Bizarre copy of ARRAY in scalar assignment at /usr/share/perl5/vendor_perl/Carp.pm line 182.

Solution

Set up passwordless SSH between all three machines (two targets per machine, six in total).

Test again

[root@localhost app]# masterha_check_ssh --conf=/etc/masterha/app/app1.cnf

Thu May 17 01:12:33 2018 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.

Thu May 17 01:12:33 2018 - [info] Reading application default configurations from /etc/masterha/app/app1.cnf..

Thu May 17 01:12:33 2018 - [info] Reading server configurations from /etc/masterha/app/app1.cnf..

Thu May 17 01:12:33 2018 - [info] Starting SSH connection tests..

Thu May 17 01:12:34 2018 - [debug]

Thu May 17 01:12:33 2018 - [debug] Connecting via SSH from root@192.168.137.144(192.168.137.144:22) to root@192.168.137.145(192.168.137.145:22)..

Thu May 17 01:12:34 2018 - [debug] ok.

Thu May 17 01:12:35 2018 - [debug]

Thu May 17 01:12:34 2018 - [debug] Connecting via SSH from root@192.168.137.145(192.168.137.145:22) to root@192.168.137.144(192.168.137.144:22)..

Thu May 17 01:12:35 2018 - [debug] ok.

Thu May 17 01:12:35 2018 - [info] All SSH connection tests passed successfully.

Replication 1

Master-master replication error

mysql> show slave status \G

*************************** 1. row ***************************

Slave_IO_State:

Master_Host: 192.168.137.145

Master_User: sync

Master_Port: 3306

Connect_Retry: 60

Master_Log_File: mysql-bin.000001

Read_Master_Log_Pos: 336

Relay_Log_File: localhost-relay-bin.000001

Relay_Log_Pos: 4

Relay_Master_Log_File: mysql-bin.000001

Slave_IO_Running: No

Slave_SQL_Running: Yes

Replicate_Do_DB:

Replicate_Ignore_DB:

Replicate_Do_Table:

Replicate_Ignore_Table:

Replicate_Wild_Do_Table:

Replicate_Wild_Ignore_Table:

Last_Errno: 0

Last_Error:

Skip_Counter: 0

Exec_Master_Log_Pos: 336

Relay_Log_Space: 154

Until_Condition: None

Until_Log_File:

Until_Log_Pos: 0

Master_SSL_Allowed: No

Master_SSL_CA_File:

Master_SSL_CA_Path:

Master_SSL_Cert:

Master_SSL_Cipher:

Master_SSL_Key:

Seconds_Behind_Master: NULL

Master_SSL_Verify_Server_Cert: No

Last_IO_Errno: 1593

Last_IO_Error: Fatal error: The slave I/O thread stops because master and slave have equal MySQL server UUIDs; these UUIDs must be different for replication to work.

Last_SQL_Errno: 0

Last_SQL_Error:

Replicate_Ignore_Server_Ids:

Master_Server_Id: 145

Master_UUID:

Master_Info_File: /usr/local/mysql/data/master.info

SQL_Delay: 0

SQL_Remaining_Delay: NULL

Slave_SQL_Running_State: Slave has read all relay log; waiting for more updates

Master_Retry_Count: 86400

Master_Bind:

Last_IO_Error_Timestamp: 180517 01:37:32

Last_SQL_Error_Timestamp:

Master_SSL_Crl:

Master_SSL_Crlpath:

Retrieved_Gtid_Set:

Executed_Gtid_Set:

Auto_Position: 0

Replicate_Rewrite_DB:

Channel_Name:

Master_TLS_Version:

1 row in set (0.00 sec)

Solution:

[root@localhost .ssh]# cd /usr/local/mysql/data/

[root@localhost data]# vim auto.cnf

[auto]

server-uuid=9abbdca0-424e-11e8-a71a-000c29deb434

Save and exit. (The two MySQL instances just need different UUIDs.)
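Hand-editing the UUID works, but MySQL also writes a fresh auto.cnf at startup when the file is missing. The sketch below reads the current value. The function name `read_server_uuid` is made up for illustration. The regeneration step is left as a comment because it restarts the server.

```bash
#!/usr/bin/env bash
# Print the server-uuid value stored in a MySQL auto.cnf file.
read_server_uuid() {
  sed -n 's/^server-uuid[[:space:]]*=[[:space:]]*//p' "$1"
}

# Data directory used in this article.
AUTO_CNF=/usr/local/mysql/data/auto.cnf
if [ -r "$AUTO_CNF" ]; then
  echo "local server-uuid: $(read_server_uuid "$AUTO_CNF")"
fi

# To get a fresh UUID instead of hand-editing one, the file can be moved
# aside and MySQL restarted (run on the cloned host only):
#   mv "$AUTO_CNF" "$AUTO_CNF.bak" && service mysql restart
```

Running it on both hosts and comparing the two values shows whether the machines were cloned with the same UUID.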

Replication 2

Error 3: replication problem

[root@localhost ~]# masterha_check_repl --conf=/etc/masterha/app1.cnf

Thu May 17 21:32:15 2018 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.

Thu May 17 21:32:15 2018 - [info] Reading application default configurations from /etc/masterha/app1.cnf..

Thu May 17 21:32:15 2018 - [info] Reading server configurations from /etc/masterha/app1.cnf..

Thu May 17 21:32:15 2018 - [info] MHA::MasterMonitor version 0.55.

Thu May 17 21:32:15 2018 - [warning] SQL Thread is stopped(no error) on 192.168.137.144(192.168.137.144:3306)

Thu May 17 21:32:15 2018 - [error][/root/perl5/lib/perl5/MHA/ServerManager.pm, ln732] Multi-master configuration is detected, but two or more masters are either writable (read-only is not set) or dead! Check configurations for details. Master configurations are as below:

Master 192.168.137.145(192.168.137.145:3306), replicating from 192.168.137.144(192.168.137.144:3306)

Master 192.168.137.144(192.168.137.144:3306), replicating from 192.168.137.145(192.168.137.145:3306)

Thu May 17 21:32:15 2018 - [error][/root/perl5/lib/perl5/MHA/MasterMonitor.pm, ln386] Error happend on checking configurations. at /root/perl5/lib/perl5/MHA/MasterMonitor.pm line 300.

Thu May 17 21:32:15 2018 - [error][/root/perl5/lib/perl5/MHA/MasterMonitor.pm, ln482] Error happened on monitoring servers.

Thu May 17 21:32:15 2018 - [info] Got exit code 1 (Not master dead).

MySQL Replication Health is NOT OK!

[root@localhost ~]#

Solution:

Set the following in my.cnf on every MySQL server (mainly the standby):

relay-log=/usr/local/mysql/binlog/mysql-relay-bin

mkdir -p /usr/local/mysql/binlog

chown -R mysql:mysql /usr/local/mysql/binlog

Restart MySQL.

Reconfigure replication:

CHANGE MASTER TO MASTER_HOST='192.168.137.144',MASTER_USER='sync',MASTER_PASSWORD='sync',MASTER_LOG_FILE='mysql-bin.000002',MASTER_LOG_POS=154;

Result:

[root@localhost ~]# masterha_check_repl --conf=/etc/masterha/app1.cnf

Thu May 17 21:58:11 2018 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.

Thu May 17 21:58:11 2018 - [info] Reading application default configurations from /etc/masterha/app1.cnf..

Thu May 17 21:58:11 2018 - [info] Reading server configurations from /etc/masterha/app1.cnf..

Thu May 17 21:58:11 2018 - [info] MHA::MasterMonitor version 0.55.

Thu May 17 21:58:11 2018 - [info] Dead Servers:

Thu May 17 21:58:11 2018 - [info] Alive Servers:

Thu May 17 21:58:11 2018 - [info] 192.168.137.144(192.168.137.144:3306)

Thu May 17 21:58:11 2018 - [info] 192.168.137.145(192.168.137.145:3306)

Thu May 17 21:58:11 2018 - [info] Alive Slaves:

Thu May 17 21:58:11 2018 - [info] 192.168.137.145(192.168.137.145:3306) Version=5.7.21-log (oldest major version between slaves) log-bin:enabled

Thu May 17 21:58:11 2018 - [info] Replicating from 192.168.137.144(192.168.137.144:3306)

Thu May 17 21:58:11 2018 - [info] Primary candidate for the new Master (candidate_master is set)

Thu May 17 21:58:11 2018 - [info] Current Alive Master: 192.168.137.144(192.168.137.144:3306)

Thu May 17 21:58:11 2018 - [info] Checking slave configurations..

Thu May 17 21:58:11 2018 - [info] read_only=1 is not set on slave 192.168.137.145(192.168.137.145:3306).

Thu May 17 21:58:11 2018 - [warning] relay_log_purge=0 is not set on slave 192.168.137.145(192.168.137.145:3306).

Thu May 17 21:58:11 2018 - [info] Checking replication filtering settings..

Thu May 17 21:58:11 2018 - [info] binlog_do_db= , binlog_ignore_db=

Thu May 17 21:58:11 2018 - [info] Replication filtering check ok.

Thu May 17 21:58:11 2018 - [info] Starting SSH connection tests..

Thu May 17 21:58:12 2018 - [info] All SSH connection tests passed successfully.

Thu May 17 21:58:12 2018 - [info] Checking MHA Node version..

Thu May 17 21:58:13 2018 - [info] Version check ok.

Thu May 17 21:58:13 2018 - [info] Checking SSH publickey authentication settings on the current master..

Thu May 17 21:58:13 2018 - [info] HealthCheck: SSH to 192.168.137.144 is reachable.

Thu May 17 21:58:14 2018 - [info] Master MHA Node version is 0.54.

Thu May 17 21:58:14 2018 - [info] Checking recovery script configurations on the current master..

Thu May 17 21:58:14 2018 - [info] Executing command: save_binary_logs --command=test --start_pos=4 --binlog_dir=/usr/local/mysql/data/ --output_file=/masterha/save_binary_logs_test --manager_version=0.55 --start_file=mysql-bin.000002

Thu May 17 21:58:14 2018 - [info] Connecting to root@192.168.137.144(192.168.137.144)..

Creating /masterha if not exists.. Creating directory /masterha.. done.

ok.

Checking output directory is accessible or not..

ok.

Binlog found at /usr/local/mysql/data/, up to mysql-bin.000002

Thu May 17 21:58:14 2018 - [info] Master setting check done.

Thu May 17 21:58:14 2018 - [info] Checking SSH publickey authentication and checking recovery script configurations on all alive slave servers..

Thu May 17 21:58:14 2018 - [info] Executing command : apply_diff_relay_logs --command=test --slave_user='root' --slave_host=192.168.137.145 --slave_ip=192.168.137.145 --slave_port=3306 --workdir=/masterha --target_version=5.7.21-log --manager_version=0.55 --relay_log_info=/usr/local/mysql/data/relay-log.info --relay_dir=/usr/local/mysql/data/ --slave_pass=xxx

Thu May 17 21:58:14 2018 - [info] Connecting to root@192.168.137.145(192.168.137.145:22)..

Creating directory /masterha.. done.

Checking slave recovery environment settings..

Opening /usr/local/mysql/data/relay-log.info ... ok.

Relay log found at /usr/local/mysql/binlog, up to mysql-relay-bin.000002

Temporary relay log file is /usr/local/mysql/binlog/mysql-relay-bin.000002

Testing mysql connection and privileges..mysql: [Warning] Using a password on the command line interface can be insecure.

done.

Testing mysqlbinlog output.. done.

Cleaning up test file(s).. done.

Thu May 17 21:58:15 2018 - [info] Slaves settings check done.

Thu May 17 21:58:15 2018 - [info]

192.168.137.144 (current master)

+--192.168.137.145

Thu May 17 21:58:15 2018 - [info] Checking replication health on 192.168.137.145..

Thu May 17 21:58:15 2018 - [info] ok.

Thu May 17 21:58:15 2018 - [warning] master_ip_failover_script is not defined.

Thu May 17 21:58:15 2018 - [warning] shutdown_script is not defined.

Thu May 17 21:58:15 2018 - [info] Got exit code 0 (Not master dead).

MySQL Replication Health is OK.

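VIP

After master1 was restored following the VIP test, restarting the manager produced an error:

[Screenshot: manager error message]

Fix: delete the file named in the error message.

[Screenshot: manager starting normally after the file was deleted]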

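ssh

This was my second time building the cluster, and this error came up on the second build.

The manager is a slightly special case: while testing SSH I hit a problem that cost me a lot of time.

[Screenshot: masterha_check_ssh failure]

master: 148

master2: 149

slave: 150

Why does this happen? The check reports that SSH from 148 --> 149 fails, yet logging in manually over SSH works without a password.

Then it occurred to me that the manager and the slave share one machine here.

ssh-copy-id root@192.168.137.148

192.168.137.148 is the local machine, so this sets up passwordless login to itself.

After that the test passed cleanly.

[Screenshot: masterha_check_ssh passing]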

mysqlbinlog

As follows:

[root@localhost masterha]# masterha_check_repl --conf=/etc/masterha/app1.cnf

Fri Jul 20 23:39:47 2018 - [info] Checking recovery script configurations on 192.168.137.148(192.168.137.148:3306)..

Fri Jul 20 23:39:47 2018 - [info] Executing command: save_binary_logs --command=test --start_pos=4 --binlog_dir=/opt/mysql/data/ --output_file=/etc/masterha/save_binary_logs_test --manager_version=0.58 --start_file=mysql-bin.000001

Fri Jul 20 23:39:47 2018 - [info] Connecting to root@192.168.137.148(192.168.137.148:22)..

Creating /etc/masterha if not exists.. ok.

Checking output directory is accessible or not..

ok.

Binlog found at /opt/mysql/data/, up to mysql-bin.000001

Fri Jul 20 23:39:47 2018 - [info] Binlog setting check done.

Fri Jul 20 23:39:47 2018 - [info] Checking SSH publickey authentication and checking recovery script configurations on all alive slave servers..

Fri Jul 20 23:39:47 2018 - [info] Executing command : apply_diff_relay_logs --command=test --slave_user='root' --slave_host=192.168.137.149 --slave_ip=192.168.137.149 --slave_port=3306 --workdir=/etc/masterha --target_version=5.7.22-log --manager_version=0.58 --relay_log_info=/opt/mysql/data/relay-log.info --relay_dir=/opt/mysql/data/ --slave_pass=xxx

Fri Jul 20 23:39:47 2018 - [info] Connecting to root@192.168.137.149(192.168.137.149:22)..

Can't exec "mysqlbinlog": No such file or directory at /usr/share/perl5/vendor_perl/MHA/BinlogManager.pm line 106.

mysqlbinlog version command failed with rc 1:0, please verify PATH, LD_LIBRARY_PATH, and client options

at /usr/bin/apply_diff_relay_logs line 532.

Fri Jul 20 23:39:48 2018 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln208] Slaves settings check failed!

Fri Jul 20 23:39:48 2018 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln416] Slave configuration failed.

Fri Jul 20 23:39:48 2018 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln427] Error happened on checking configurations. at /usr/bin/masterha_check_repl line 48.

Fri Jul 20 23:39:48 2018 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln525] Error happened on monitoring servers.

Fri Jul 20 23:39:48 2018 - [info] Got exit code 1 (Not master dead).

MySQL Replication Health is NOT OK!

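This error means the environment was not set up: MHA cannot find the mysqlbinlog command. It is easy to fix:

ln -s /opt/mysql/bin/mysqlbinlog /usr/bin/mysqlbinlog

The first argument is the mysqlbinlog binary in your MySQL bin directory; the symlink puts it into the global /usr/bin.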
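The node scripts are run over non-interactive SSH, so a `PATH` entry set only in a login profile may not apply there. That is probably why the symlink into /usr/bin helps. Below is a quick sketch for spotting missing client tools before running the replication check. The function name `missing_commands` is my own, and the list of tools to check is an assumption based on the logs above.

```bash
#!/usr/bin/env bash
# Print each of the given commands that cannot be found on PATH.
missing_commands() {
  local cmd
  for cmd in "$@"; do
    command -v "$cmd" >/dev/null 2>&1 || echo "$cmd"
  done
}

# Tools that the MHA checks in this article were seen calling.
for cmd in $(missing_commands mysql mysqlbinlog); do
  echo "not on PATH: ${cmd} (consider: ln -s /opt/mysql/bin/${cmd} /usr/bin/${cmd})"
done
```

To test the same environment MHA sees, the check can be run remotely, for example `ssh root@192.168.137.149 'command -v mysqlbinlog'`.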

Slave IO

As follows:

Fri Jul 20 23:48:07 2018 - [info] Checking replication health on 192.168.137.149..

Fri Jul 20 23:48:07 2018 - [info] ok.

Fri Jul 20 23:48:07 2018 - [info] Checking replication health on 192.168.137.150..

Fri Jul 20 23:48:07 2018 - [info] ok.

Fri Jul 20 23:48:07 2018 - [info] Checking replication health on 192.168.137.148..

Fri Jul 20 23:48:07 2018 - [error][/usr/share/perl5/vendor_perl/MHA/Server.pm, ln490] Slave IO thread is not running on 192.168.137.148(192.168.137.148:3306)

Fri Jul 20 23:48:07 2018 - [error][/usr/share/perl5/vendor_perl/MHA/ServerManager.pm, ln1526] failed!

Fri Jul 20 23:48:07 2018 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln427] Error happened on checking configurations. at /usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm line 420.

Fri Jul 20 23:48:07 2018 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln525] Error happened on monitoring servers.

Fri Jul 20 23:48:07 2018 - [info] Got exit code 1 (Not master dead).

MySQL Replication Health is NOT OK!

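This error was purely from typing too fast: I accidentally ran the slave's replication command (the one below) on the master. My bad.

CHANGE MASTER TO MASTER_HOST='192.168.137.148',MASTER_USER='sync',MASTER_PASSWORD='lizhenghua',MASTER_LOG_FILE='mysql-bin.000001',MASTER_LOG_POS=1121;

The fix is simply to remove that slave configuration from the master: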

mysql>stop slave;

mysql>reset slave all;
