Server information:
test-db-1: 192.168.151.178 (master)
test-db-2: 192.168.147.128 (slave)
test-db-3: 192.168.155.178 (slave)
test-maxscale: 192.168.145.236 (MHA + MaxScale)
Basic setup -- hosts
$ vim /etc/hosts
192.168.151.178 test-db-1
192.168.147.128 test-db-2
192.168.155.178 test-db-3
192.168.145.236 test-maxscale
Basic setup -- passwordless SSH
Note: run on all 4 machines.
ssh-keygen -t rsa
ssh-copy-id test-db-1
ssh-copy-id test-db-2
ssh-copy-id test-db-3
ssh-copy-id test-maxscale
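The ssh-copy-id calls can be scripted as a loop; the sketch below only prints the commands (a dry run, not part of the original procedure), so it is safe to run anywhere -- pipe its output to sh on each node to actually push the keys.

```shell
# Loop form of the key distribution above. Prints one ssh-copy-id command per
# node; pipe to sh to execute them for real.
HOSTS="test-db-1 test-db-2 test-db-3 test-maxscale"
for h in $HOSTS; do
  echo "ssh-copy-id $h"
done
```
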
Step 1.0 Install MySQL 5.7
Note: run on test-db-1 through test-db-3.
# Download & install
$ rpm -ivh http://dev.mysql.com/get/mysql57-community-release-el7-8.noarch.rpm
$ yum install mysql-community-server
$ service mysqld start
# Log in to mysql; MySQL 5.7 forces a root password reset before anything else
mysql> use mysql;
ERROR 1820 (HY000): You must reset your password using ALTER USER statement before executing this statement.
# Set the password
mysql> alter user 'root'@'localhost' identified by 'root';
Query OK, 0 rows affected (0.00 sec)
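MySQL 5.7 generates a one-time root password on first start and writes it to mysqld.log; a small helper (a sketch, using the CentOS 7 RPM's default log path) extracts it so you can log in before the reset:

```shell
# Extract the generated root password from the MySQL 5.7 error log.
mysql_temp_password() {
  # 5.7 logs a line like:
  # "[Note] A temporary password is generated for root@localhost: <pw>"
  grep 'temporary password' "${1:-/var/log/mysqld.log}" | awk '{print $NF}'
}
# Usage on the db hosts:
#   mysql -uroot -p"$(mysql_temp_password)"
```
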
Step 1.1 Configure replication -- set server_id & log_bin
#test-db-1
$ vim /etc/my.cnf
[mysqld]
server_id=1
log_bin=test-db-1
# test-db-2
$ vim /etc/my.cnf
[mysqld]
server_id=2
log_bin=test-db-2
# test-db-3
$ vim /etc/my.cnf
[mysqld]
server_id=3
# After setting server_id & log_bin, restart mysql
$ service mysqld restart
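After the restart, it is worth confirming on each node that the settings took effect; this is standard MySQL introspection, not shown in the original walkthrough:

```sql
-- Run on each node: server_id must be unique per host, and log_bin should be
-- ON on any host that may ever be promoted to master.
SHOW VARIABLES LIKE 'server_id';
SHOW VARIABLES LIKE 'log_bin';
```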
Step 1.2 Configure replication -- create the replication user & attach the slaves
# Create the user used to replicate data
mysql> grant replication slave on *.* to 'repl'@'192.168.147.128' identified by 'p4Mp*1z_&kv';
Query OK, 0 rows affected, 1 warning (0.00 sec)
mysql> grant replication slave on *.* to 'repl'@'192.168.155.178' identified by 'p4Mp*1z_&kv';
Query OK, 0 rows affected, 1 warning (0.01 sec)
# Attach the slaves to the master: run on test-db-2 & test-db-3
mysql> CHANGE MASTER TO MASTER_HOST='192.168.151.178', MASTER_PORT=3306,
MASTER_LOG_FILE='test-db-1.000002', MASTER_LOG_POS=154, MASTER_USER='repl',
MASTER_PASSWORD='p4Mp*1z_&kv';
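The MASTER_LOG_FILE / MASTER_LOG_POS pair above has to come from the master; the walkthrough skips that step, but it is the standard query:

```sql
-- On test-db-1: the File and Position columns of this output are what
-- CHANGE MASTER TO on each slave must reference.
SHOW MASTER STATUS;
```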
# Start replication, then check slave status
mysql> start slave;
Query OK, 0 rows affected (0.01 sec)
mysql> show slave status\G
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: 192.168.151.178
Master_User: repl
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: test-db-1.000002
Read_Master_Log_Pos: 154
Relay_Log_File: test-db-3-relay-bin.000002
Relay_Log_Pos: 320
Relay_Master_Log_File: test-db-1.000002
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 0
Last_Error:
Skip_Counter: 0
Exec_Master_Log_Pos: 154
Relay_Log_Space: 531
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 0
Last_SQL_Error:
Replicate_Ignore_Server_Ids:
Master_Server_Id: 1
Master_UUID: 328842a9-67d8-11e9-ae88-f23c91a3d9a1
Master_Info_File: /var/lib/mysql/master.info
SQL_Delay: 0
SQL_Remaining_Delay: NULL
Slave_SQL_Running_State: Slave has read all relay log; waiting for more updates
Master_Retry_Count: 86400
Master_Bind:
Last_IO_Error_Timestamp:
Last_SQL_Error_Timestamp:
Master_SSL_Crl:
Master_SSL_Crlpath:
Retrieved_Gtid_Set:
Executed_Gtid_Set:
Auto_Position: 0
Replicate_Rewrite_DB:
Channel_Name:
Master_TLS_Version:
1 row in set (0.00 sec)
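Eyeballing the \G output works, but replication health boils down to two fields; a scripted check (a sketch, not part of the original) makes it usable in monitoring:

```shell
# Replication is healthy only if both Slave_IO_Running and Slave_SQL_Running
# report Yes in the SHOW SLAVE STATUS output.
slave_ok() {
  # $1 = output of: mysql -e 'SHOW SLAVE STATUS\G'
  printf '%s\n' "$1" | awk '
    /Slave_IO_Running:/  { io  = $2 }
    /Slave_SQL_Running:/ { sql = $2 }
    END { exit !(io == "Yes" && sql == "Yes") }'
}
# Usage:
#   slave_ok "$(mysql -uroot -proot -e 'SHOW SLAVE STATUS\G')" && echo "replication OK"
```
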
Step 2.0 Install MHA
Note: install node & manager on test-maxscale; install node on test-db-1 through test-db-3.
# MHA needs several Perl dependencies that the stock system repos don't carry,
# so enable the EPEL third-party repository first and then install them.
# See https://fedoraproject.org/wiki/EPEL for the EPEL packages.
$ yum install https://dl.fedoraproject.org/pub/epel/epel-release-latest-7.noarch.rpm
$ yum install perl-DBD-MySQL perl-Config-Tiny perl-Log-Dispatch perl-Parallel-ForkManager perl-Time-HiRes -y
# After installing the dependencies, list the configured repos
$ yum repolist
repo id repo name status
base/7/x86_64 CentOS-7 - Base 10,019
*epel/x86_64 Extra Packages for Enterprise Linux 7 - x86_64 13,082
extras/7/x86_64 CentOS-7 - Extras 386
updates/7/x86_64 CentOS-7 - Updates 1,580
repolist: 25,067
Note: epel now appears in the repo list, so the yum command above can resolve MHA's dependencies.
# Next, install the node & manager packages
Download: https://code.google.com/p/mysql-master-ha/
$ yum localinstall -y mha4mysql-node-0.57-0.el7.noarch.rpm
$ yum localinstall -y mha4mysql-manager-0.57-0.el7.noarch.rpm
Note: install node before manager, otherwise the manager install fails.
# The Manager package ships the following tools:
masterha_check_ssh        check MHA's SSH configuration
masterha_check_repl       check MySQL replication health
masterha_manager          start MHA
masterha_check_status     report the current MHA run state
masterha_master_monitor   detect whether the master is down
masterha_master_switch    control failover (automatic or manual)
masterha_conf_host        add or remove configured server entries
# The Node package (normally invoked by MHA Manager scripts, no manual use needed) ships:
save_binary_logs          save and copy the master's binary logs
apply_diff_relay_logs     identify differential relay-log events and apply them to the other slaves
filter_mysqlbinlog        strip unnecessary ROLLBACK events (no longer used by MHA)
purge_relay_logs          purge relay logs (without blocking the SQL thread)
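Because MHA wants relay_log_purge=0 on the slaves (see the warnings in the check output later), relay logs accumulate; a common companion is a cron entry for the purge_relay_logs node tool. The schedule and log path below are illustrative, not from the original:

```
# /etc/crontab on each slave (schedule and log path are examples)
0 4 * * * root /usr/bin/purge_relay_logs --user=mha --password='p4Mp*1z_&kv' --disable_relay_log_purge >> /var/log/purge_relay_logs.log 2>&1
```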
Step 2.1 Configure MHA
Note: this configuration is done on test-maxscale (run the GRANT on the master; it replicates to the slaves).
# Create the MHA management user in mysql
mysql> GRANT ALL PRIVILEGES ON *.* TO 'mha'@'%' identified by 'p4Mp*1z_&kv';
# Create a working directory
$ mkdir /mha
# Edit the configuration file
$ vim /etc/masterha_default.cnf
[server default]
# the mysql management account granted above
user=mha
password=p4Mp*1z_&kv
manager_log=/mha/manager.log
remote_workdir=/mha
# account used for passwordless SSH
ssh_user=root
# mysql replication account, used to ship binary logs between master and slaves
repl_user=repl
repl_password=p4Mp*1z_&kv
# ping interval used to detect whether the master is alive
ping_interval=1
[server1]
hostname=test-db-1
master_binlog_dir=/var/lib/mysql
# candidate: promote this host first when the master goes down
candidate_master=1
[server2]
hostname=test-db-2n
master_binlog_dir=/var/lib/mysql
candidate_master=1
check_repl_delay=0
[server3]
hostname=test-db-3
master_binlog_dir=/var/lib/mysql
# must never become master
no_master=1
Step 2.3 Verify mutual SSH trust -- on test-maxscale
$ masterha_check_ssh --conf=/etc/masterha_default.cnf
Fri Apr 26 07:57:48 2019 - [info] Reading default configuration from /etc/masterha_default.cnf..
Fri Apr 26 07:57:48 2019 - [info] Reading application default configuration from /etc/masterha_default.cnf..
Fri Apr 26 07:57:48 2019 - [info] Reading server configuration from /etc/masterha_default.cnf..
Fri Apr 26 07:57:48 2019 - [info] Starting SSH connection tests..
Fri Apr 26 07:57:51 2019 - [debug]
Fri Apr 26 07:57:48 2019 - [debug] Connecting via SSH from root@test-db-1(192.168.151.178:22) to root@test-db-2(192.168.137.26:22)..
Fri Apr 26 07:57:49 2019 - [debug] ok.
Fri Apr 26 07:57:49 2019 - [debug] Connecting via SSH from root@test-db-1(192.168.151.178:22) to root@test-db-3(192.168.155.178:22)..
Fri Apr 26 07:57:50 2019 - [debug] ok.
Fri Apr 26 07:57:51 2019 - [debug]
Fri Apr 26 07:57:49 2019 - [debug] Connecting via SSH from root@test-db-2(192.168.137.26:22) to root@test-db-1(192.168.151.178:22)..
Warning: Permanently added '192.168.151.178' (ECDSA) to the list of known hosts.
Fri Apr 26 07:57:50 2019 - [debug] ok.
Fri Apr 26 07:57:50 2019 - [debug] Connecting via SSH from root@test-db-2(192.168.137.26:22) to root@test-db-3(192.168.155.178:22)..
Fri Apr 26 07:57:51 2019 - [debug] ok.
Fri Apr 26 07:57:52 2019 - [debug]
Fri Apr 26 07:57:49 2019 - [debug] Connecting via SSH from root@test-db-3(192.168.155.178:22) to root@test-db-1(192.168.151.178:22)..
Fri Apr 26 07:57:50 2019 - [debug] ok.
Fri Apr 26 07:57:50 2019 - [debug] Connecting via SSH from root@test-db-3(192.168.155.178:22) to root@test-db-2(192.168.137.26:22)..
Fri Apr 26 07:57:51 2019 - [debug] ok.
Fri Apr 26 07:57:52 2019 - [info] All SSH connection tests passed successfully.
Step 2.4 SSH checks pass; next, check MySQL replication
$ masterha_check_repl --conf=/etc/masterha_default.cnf
Fri Apr 26 08:01:39 2019 - [info] Reading default configuration from /etc/masterha_default.cnf..
Fri Apr 26 08:01:39 2019 - [info] Reading application default configuration from /etc/masterha_default.cnf..
Fri Apr 26 08:01:39 2019 - [info] Reading server configuration from /etc/masterha_default.cnf..
Fri Apr 26 08:01:39 2019 - [info] MHA::MasterMonitor version 0.57.
Fri Apr 26 08:01:40 2019 - [info] GTID failover mode = 0
Fri Apr 26 08:01:40 2019 - [info] Dead Servers:
Fri Apr 26 08:01:40 2019 - [info] Alive Servers:
Fri Apr 26 08:01:40 2019 - [info] test-db-1(192.168.151.178:3306)
Fri Apr 26 08:01:40 2019 - [info] test-db-2(192.168.147.128:3306)
Fri Apr 26 08:01:40 2019 - [info] test-db-3(192.168.155.178:3306)
Fri Apr 26 08:01:40 2019 - [info] Alive Slaves:
Fri Apr 26 08:01:40 2019 - [info] test-db-2(192.168.147.128:3306) Version=5.7.26 (oldest major version between slaves) log-bin:disabled
Fri Apr 26 08:01:40 2019 - [info] Replicating from 192.168.151.178(192.168.151.178:3306)
Fri Apr 26 08:01:40 2019 - [info] Primary candidate for the new Master (candidate_master is set)
Fri Apr 26 08:01:40 2019 - [info] test-db-3(192.168.155.178:3306) Version=5.7.26 (oldest major version between slaves) log-bin:disabled
Fri Apr 26 08:01:40 2019 - [info] Replicating from 192.168.151.178(192.168.151.178:3306)
Fri Apr 26 08:01:40 2019 - [info] Not candidate for the new Master (no_master is set)
Fri Apr 26 08:01:40 2019 - [info] Current Alive Master: test-db-1(192.168.151.178:3306)
Fri Apr 26 08:01:40 2019 - [info] Checking slave configurations..
Fri Apr 26 08:01:40 2019 - [info] read_only=1 is not set on slave test-db-2(192.168.137.26:3306).
Fri Apr 26 08:01:40 2019 - [warning] relay_log_purge=0 is not set on slave test-db-2(192.168.137.26:3306).
Fri Apr 26 08:01:40 2019 - [warning] log-bin is not set on slave test-db-2(192.168.147.128:3306). This host cannot be a master.
Fri Apr 26 08:01:40 2019 - [info] read_only=1 is not set on slave test-db-3(192.168.155.178:3306).
Fri Apr 26 08:01:40 2019 - [warning] relay_log_purge=0 is not set on slave test-db-3(192.168.155.178:3306).
Fri Apr 26 08:01:40 2019 - [warning] log-bin is not set on slave test-db-3(192.168.155.178:3306). This host cannot be a master.
Fri Apr 26 08:01:40 2019 - [info] Checking replication filtering settings..
Fri Apr 26 08:01:40 2019 - [info] binlog_do_db= , binlog_ignore_db=
Fri Apr 26 08:01:40 2019 - [info] Replication filtering check ok.
Fri Apr 26 08:01:40 2019 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln364] None of slaves can be master. Check failover configuration file or log-bin settings in my.cnf
Fri Apr 26 08:01:40 2019 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln427] Error happened on checking configurations. at /usr/bin/masterha_check_repl line 48.
Fri Apr 26 08:01:40 2019 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln525] Error happened on monitoring servers.
Fri Apr 26 08:01:40 2019 - [info] Got exit code 1 (Not master dead).
MySQL Replication Health is NOT OK!
# Fixing the error
The failure comes from the slave nodes' /etc/my.cnf missing the log_bin setting; the fix is to add log_bin=XXX to every slave's my.cnf and restart mysqld.
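The fix, scripted as a tiny helper (an illustration; it assumes [mysqld] is the last section of the file, as in the snippets above, so a plain append lands in the right section):

```shell
# Append a log_bin setting to a my.cnf. $1 = cnf path, $2 = binlog base name.
enable_binlog() {
  printf 'log_bin=%s\n' "$2" >> "$1"
}
# On test-db-3:
#   enable_binlog /etc/my.cnf test-db-3 && service mysqld restart
```
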
# Re-run the check; it now passes
$ masterha_check_repl --conf=/etc/masterha_default.cnf
Fri Apr 26 08:53:38 2019 - [info] Reading default configuration from /etc/masterha_default.cnf..
Fri Apr 26 08:53:38 2019 - [info] Reading application default configuration from /etc/masterha_default.cnf..
Fri Apr 26 08:53:38 2019 - [info] Reading server configuration from /etc/masterha_default.cnf..
Fri Apr 26 08:53:38 2019 - [info] MHA::MasterMonitor version 0.57.
Fri Apr 26 08:53:39 2019 - [info] GTID failover mode = 0
Fri Apr 26 08:53:39 2019 - [info] Dead Servers:
Fri Apr 26 08:53:39 2019 - [info] Alive Servers:
Fri Apr 26 08:53:39 2019 - [info] test-db-1(192.168.151.178:3306)
Fri Apr 26 08:53:39 2019 - [info] test-db-2(192.168.147.128:3306)
Fri Apr 26 08:53:39 2019 - [info] test-db-3(192.168.155.178:3306)
Fri Apr 26 08:53:39 2019 - [info] Alive Slaves:
Fri Apr 26 08:53:39 2019 - [info] test-db-2(192.168.147.128:3306) Version=5.7.26-log (oldest major version between slaves) log-bin:enabled
Fri Apr 26 08:53:39 2019 - [info] Replicating from 192.168.151.178(192.168.151.178:3306)
Fri Apr 26 08:53:39 2019 - [info] Primary candidate for the new Master (candidate_master is set)
Fri Apr 26 08:53:39 2019 - [info] test-db-3(192.168.155.178:3306) Version=5.7.26-log (oldest major version between slaves) log-bin:enabled
Fri Apr 26 08:53:39 2019 - [info] Replicating from 192.168.151.178(192.168.151.178:3306)
Fri Apr 26 08:53:39 2019 - [info] Not candidate for the new Master (no_master is set)
Fri Apr 26 08:53:39 2019 - [info] Current Alive Master: test-db-1(192.168.151.178:3306)
Fri Apr 26 08:53:39 2019 - [info] Checking slave configurations..
Fri Apr 26 08:53:39 2019 - [info] read_only=1 is not set on slave test-db-2(192.168.137.26:3306).
Fri Apr 26 08:53:39 2019 - [info] read_only=1 is not set on slave test-db-3(192.168.155.178:3306).
Fri Apr 26 08:53:39 2019 - [info] Checking replication filtering settings..
Fri Apr 26 08:53:39 2019 - [info] binlog_do_db= , binlog_ignore_db=
Fri Apr 26 08:53:39 2019 - [info] Replication filtering check ok.
Fri Apr 26 08:53:39 2019 - [info] GTID (with auto-pos) is not supported
Fri Apr 26 08:53:39 2019 - [info] Starting SSH connection tests..
Fri Apr 26 08:53:43 2019 - [info] All SSH connection tests passed successfully.
Fri Apr 26 08:53:43 2019 - [info] Checking MHA Node version..
Fri Apr 26 08:53:44 2019 - [info] Version check ok.
Fri Apr 26 08:53:44 2019 - [info] Checking SSH publickey authentication settings on the current master..
Fri Apr 26 08:53:44 2019 - [info] HealthCheck: SSH to test-db-1 is reachable.
Fri Apr 26 08:53:45 2019 - [info] Master MHA Node version is 0.57.
Fri Apr 26 08:53:45 2019 - [info] Checking recovery script configurations on test-db-1(192.168.151.178:3306)..
Fri Apr 26 08:53:45 2019 - [info] Executing command: save_binary_logs --command=test --start_pos=4 --binlog_dir=/var/lib/mysql --output_file=/mha/save_binary_logs_test --manager_version=0.57 --start_file=test-db-1.000003
Fri Apr 26 08:53:45 2019 - [info] Connecting to root@192.168.151.178(test-db-1:22)..
Creating /mha if not exists.. Creating directory /mha.. done.
ok.
Checking output directory is accessible or not..
ok.
Binlog found at /var/lib/mysql, up to test-db-1.000003
Fri Apr 26 08:53:45 2019 - [info] Binlog setting check done.
Fri Apr 26 08:53:45 2019 - [info] Checking SSH publickey authentication and checking recovery script configurations on all alive slave servers..
Fri Apr 26 08:53:45 2019 - [info] Executing command : apply_diff_relay_logs --command=test --slave_user='mha' --slave_host=test-db-2 --slave_ip=192.168.137.26 --slave_port=3306 --workdir=/mha --target_version=5.7.26-log --manager_version=0.57 --relay_log_info=/var/lib/mysql/relay-log.info --relay_dir=/var/lib/mysql/ --slave_pass=xxx
Fri Apr 26 08:53:45 2019 - [info] Connecting to root@192.168.137.26(test-db-2:22)..
Creating directory /mha.. done.
Checking slave recovery environment settings..
Opening /var/lib/mysql/relay-log.info ... ok.
Relay log found at /var/lib/mysql, up to test-db-2-relay-bin.000009
Temporary relay log file is /var/lib/mysql/test-db-2-relay-bin.000009
Testing mysql connection and privileges..mysql: [Warning] Using a password on the command line interface can be insecure.
done.
Testing mysqlbinlog output.. done.
Cleaning up test file(s).. done.
Fri Apr 26 08:53:46 2019 - [info] Executing command : apply_diff_relay_logs --command=test --slave_user='mha' --slave_host=test-db-3 --slave_ip=192.168.155.178 --slave_port=3306 --workdir=/mha --target_version=5.7.26-log --manager_version=0.57 --relay_log_info=/var/lib/mysql/relay-log.info --relay_dir=/var/lib/mysql/ --slave_pass=xxx
Fri Apr 26 08:53:46 2019 - [info] Connecting to root@192.168.155.178(test-db-3:22)..
Creating directory /mha.. done.
Checking slave recovery environment settings..
Opening /var/lib/mysql/relay-log.info ... ok.
Relay log found at /var/lib/mysql, up to test-db-3-relay-bin.000009
Temporary relay log file is /var/lib/mysql/test-db-3-relay-bin.000009
Testing mysql connection and privileges..mysql: [Warning] Using a password on the command line interface can be insecure.
done.
Testing mysqlbinlog output.. done.
Cleaning up test file(s).. done.
Fri Apr 26 08:53:47 2019 - [info] Slaves settings check done.
Fri Apr 26 08:53:47 2019 - [info]
test-db-1(192.168.151.178:3306) (current master)
+--test-db-2(192.168.147.128:3306)
+--test-db-3(192.168.155.178:3306)
Fri Apr 26 08:53:47 2019 - [info] Checking replication health on test-db-2..
Fri Apr 26 08:53:47 2019 - [info] ok.
Fri Apr 26 08:53:47 2019 - [info] Checking replication health on test-db-3..
Fri Apr 26 08:53:47 2019 - [info] ok.
Fri Apr 26 08:53:47 2019 - [warning] master_ip_failover_script is not defined.
Fri Apr 26 08:53:47 2019 - [warning] shutdown_script is not defined.
Fri Apr 26 08:53:47 2019 - [info] Got exit code 0 (Not master dead).
MySQL Replication Health is OK.
Step 2.5 Start the MHA manager
$ masterha_manager --conf=/etc/masterha_default.cnf &
[1] 19435
[root@test-maxscale ~]# Fri Apr 26 08:58:37 2019 - [info] Reading default configuration from /etc/masterha_default.cnf..
Fri Apr 26 08:58:37 2019 - [info] Reading application default configuration from /etc/masterha_default.cnf..
Fri Apr 26 08:58:37 2019 - [info] Reading server configuration from /etc/masterha_default.cnf..
# Check the log; everything looks normal
$ cat /mha/manager.log
------------------------------ (output truncated) ------------------------------
Fri Apr 26 08:58:45 2019 - [info] Slaves settings check done.
Fri Apr 26 08:58:45 2019 - [info]
test-db-1(192.168.151.178:3306) (current master)
+--test-db-2(192.168.137.26:3306)
+--test-db-3(192.168.155.178:3306)
Fri Apr 26 08:58:45 2019 - [warning] master_ip_failover_script is not defined.
Fri Apr 26 08:58:45 2019 - [warning] shutdown_script is not defined.
Fri Apr 26 08:58:45 2019 - [info] Set master ping interval 1 seconds.
Fri Apr 26 08:58:45 2019 - [warning] secondary_check_script is not defined. It is highly recommended setting it to check master reachability from two or more routes.
Fri Apr 26 08:58:45 2019 - [info] Starting ping health check on test-db-1(192.168.151.178:3306)..
Fri Apr 26 08:58:45 2019 - [info] Ping(SELECT) succeeded, waiting until MySQL doesn't respond..
Step 3.0 Install MaxScale
# Download & install
$ wget https://downloads.mariadb.com/MaxScale/2.1.11/centos/7/x86_64/maxscale-2.1.11-1.centos.7.x86_64.rpm
$ rpm -ivh maxscale-2.1.11-1.centos.7.x86_64.rpm
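The [MySQL Monitor] section of the configuration below references a scalemon account that the walkthrough never creates. A sketch, run on the master so it replicates to the slaves (the privilege is what MaxScale's mysqlmon monitor documents as required; verify against your version):

```sql
-- Monitor account referenced by the MaxScale config below.
GRANT REPLICATION CLIENT ON *.* TO 'scalemon'@'%' IDENTIFIED BY 'p4Mp*1z_&kv';
```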
Step 3.1 Configure MaxScale
$ vim /etc/maxscale.cnf
[maxscale]
threads=auto
ms_timestamp=1
syslog=1
maxlog=1
log_to_shm=1
log_warning=1
log_notice=1
#log_info=1
#log_debug=1
log_augmentation=1
logdir=/usr/local/maxscale/logs/trace/
datadir=/usr/local/maxscale/data/
cachedir=/usr/local/maxscale/cache/
piddir=/usr/local/maxscale/tmp/
[test-db-1]
type=server
address=192.168.151.178
port=3306
protocol=MySQLBackend
server_weight=2
[test-db-2]
type=server
address=192.168.147.128
port=3306
protocol=MySQLBackend
server_weight=5
[test-db-3]
type=server
address=192.168.155.178
port=3306
protocol=MySQLBackend
server_weight=5
[MySQL Monitor]
type=monitor
module=mysqlmon
servers=test-db-1,test-db-2,test-db-3
user=scalemon
passwd=p4Mp*1z_&kv
monitor_interval=10000
detect_stale_master=true
[Read-Write Service]
type=service
router=readwritesplit
servers=test-db-1,test-db-2,test-db-3
user=mha
passwd=p4Mp*1z_&kv
max_slave_connections=100%
router_options=disable_sescmd_history=false,max_sescmd_history=1500,slave_selection_criteria=LEAST_GLOBAL_CONNECTIONS,master_accept_reads=true
filters=Hint
[MaxAdmin Service]
type=service
router=cli
[Read-Write Listener]
type=listener
service=Read-Write Service
protocol=MySQLClient
port=3308
[MaxAdmin Listener]
type=listener
service=MaxAdmin Service
protocol=maxscaled
socket=default
port=6603
[Hint]
type=filter
module=hintfilter
[MyCache]
type=filter
module=cache
storage=storage_inmemory
soft_ttl=300
hard_ttl=450
cached_data=shared
max_size=100Mi
selects=verify_cacheable
rules=/etc/maxscale_cache_rules.json
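The [MyCache] filter points at /etc/maxscale_cache_rules.json, which the walkthrough never shows. A minimal rules file, following the store/use schema of MaxScale's cache filter documentation, might look like this (the table name is a placeholder, not from the original):

```json
{
    "store": [
        { "attribute": "table", "op": "=", "value": "testdb.hot_table" }
    ]
}
```

Note that as configured above, MyCache is defined but not referenced by any service's filters= line (only Hint is), so it takes no effect until added there.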
Step 3.2 Start MaxScale
$ service maxscale start
Redirecting to /bin/systemctl start maxscale.service
Job for maxscale.service failed because the control process exited with error code. See "systemctl status maxscale.service" and "journalctl -xe" for details.
# maxscale failed to start
$ journalctl -xe
--
-- Unit session-67.scope has finished starting up.
--
-- The start-up result is done.
Apr 26 10:20:01 test-maxscale CROND[28374]: (root) CMD (/usr/lib64/sa/sa1 1 1)
Apr 26 10:27:30 test-maxscale polkitd[13370]: Registered Authentication Agent for unix-process:29202:2913329 (system bus name :1.1
Apr 26 10:27:30 test-maxscale systemd[1]: Starting MariaDB MaxScale Database Proxy...
-- Subject: Unit maxscale.service has begun start-up
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- Unit maxscale.service has begun starting up.
Apr 26 10:27:30 test-maxscale maxscale[29222]: (print_log_n_stderr): Can't access '/usr/local/maxscale/logs/trace/'.
Apr 26 10:27:30 test-maxscale maxscale[29222]: (print_log_n_stderr): Opening file '/usr/local/maxscale/tmp/' for writing failed. (
Apr 26 10:27:30 test-maxscale maxscale[29220]: * Error: Can't access '/usr/local/maxscale/logs/trace/'.
Apr 26 10:27:30 test-maxscale maxscale[29220]: * Error: Opening file '/usr/local/maxscale/tmp/' for writing failed. (Permission de
Apr 26 10:27:30 test-maxscale maxscale[29220]: * Error: MaxScale doesn't have write permission to '/usr/local/maxscale/tmp/'.
Apr 26 10:27:30 test-maxscale maxscale[29220]: * Error: Error: Failed to pre-parse configuration file /etc/maxscale.cnf. Error on
Apr 26 10:27:30 test-maxscale maxscale[29222]: (print_log_n_stderr): MaxScale doesn't have write permission to '/usr/local/maxscal
Apr 26 10:27:30 test-maxscale maxscale[29222]: (print_log_n_stderr): Error: Failed to pre-parse configuration file /etc/maxscale.c
Apr 26 10:27:30 test-maxscale systemd[1]: maxscale.service: control process exited, code=exited status=1
Apr 26 10:27:30 test-maxscale systemd[1]: Failed to start MariaDB MaxScale Database Proxy.
-- Subject: Unit maxscale.service has failed
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- Unit maxscale.service has failed.
--
-- The result is failed.
Apr 26 10:27:30 test-maxscale systemd[1]: Unit maxscale.service entered failed state.
Apr 26 10:27:30 test-maxscale systemd[1]: maxscale.service failed.
Apr 26 10:27:30 test-maxscale polkitd[13370]: Unregistered Authentication Agent for unix-process:29202:2913329 (system bus name :1
Note: the journal above shows this is a permissions problem.
Workaround used here: start maxscale directly from the command line:
$ maxscale --config=/etc/maxscale.cnf
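Running the binary by hand sidesteps the permission problem rather than fixing it. A sketch of a fix that keeps the systemd unit usable: create the directories named in maxscale.cnf and hand them to the service account ("maxscale" as the account name is an assumption; check your unit file and package docs):

```shell
# Pre-create MaxScale's log/data/cache/pid directories and chown them to the
# service account if it exists. $1 = base directory from maxscale.cnf.
prepare_maxscale_dirs() {
  base="$1"
  mkdir -p "$base/logs/trace" "$base/data" "$base/cache" "$base/tmp"
  if id maxscale >/dev/null 2>&1; then
    chown -R maxscale:maxscale "$base"
  fi
}
# prepare_maxscale_dirs /usr/local/maxscale && systemctl restart maxscale
```
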
# Started successfully
$ maxadmin
MaxScale> list servers
Servers.
-------------------+-----------------+-------+-------------+--------------------
Server | Address | Port | Connections | Status
-------------------+-----------------+-------+-------------+--------------------
test-db-1 | 192.168.151.178 | 3306 | 0 | Master, Running
test-db-2 | 192.168.147.128 | 3306 | 0 | Slave, Running
test-db-3 | 192.168.155.178 | 3306 | 0 | Slave, Running
-------------------+-----------------+-------+-------------+--------------------
Step 3.4 Put a server into maintenance mode
MaxScale> set server test-db-2 maintenance
MaxScale> list servers
Servers.
-------------------+-----------------+-------+-------------+--------------------
Server | Address | Port | Connections | Status
-------------------+-----------------+-------+-------------+--------------------
test-db-1 | 192.168.151.178 | 3306 | 0 | Master, Running
test-db-2 | 192.168.147.128 | 3306 | 0 | Maintenance, Slave, Running
test-db-3 | 192.168.155.178 | 3306 | 0 | Slave, Running
-------------------+-----------------+-------+-------------+--------------------
Step 3.5 Clear the maintenance state
MaxScale> clear server test-db-2 maintenance
MaxScale> list servers
Servers.
-------------------+-----------------+-------+-------------+--------------------
Server | Address | Port | Connections | Status
-------------------+-----------------+-------+-------------+--------------------
test-db-1 | 192.168.151.178 | 3306 | 0 | Master, Running
test-db-2 | 192.168.147.128 | 3306 | 0 | Slave, Running
test-db-3 | 192.168.155.178 | 3306 | 0 | Slave, Running
-------------------+-----------------+-------+-------------+--------------------
Step 4.1 Switch the MySQL master -- check whether masterha_manager is currently running
$ masterha_check_status --conf=/etc/masterha_default.cnf
masterha_default is stopped(2:NOT_RUNNING).
Step 4.2 Switch the MySQL master -- before the switch
mysql> show slave hosts;
+-----------+------+------+-----------+--------------------------------------+
| Server_id | Host | Port | Master_id | Slave_UUID |
+-----------+------+------+-----------+--------------------------------------+
| 3 | | 3306 | 1 | 40c0f1a0-67d8-11e9-b2e7-f23c91a3d9d9 |
| 2 | | 3306 | 1 | 34f99aa3-67d8-11e9-aceb-f23c91a3d915 |
+-----------+------+------+-----------+--------------------------------------+
2 rows in set (0.00 sec)
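The empty Host column above is expected: slaves only report a hostname if report_host is set. It is cosmetic, but if you want names in SHOW SLAVE HOSTS, an addition like the following on each slave (restart required; the value shown is for test-db-2) fills it in:

```ini
[mysqld]
report_host=test-db-2
```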
Step 4.3 Perform the online switch
# Make test-db-2 the master
# Run on test-maxscale
$ /mha/scripts/master_ip_failover --command=status --ssh_user=root --orig_master_host=192.168.147.128 --orig_master_ip=192.168.147.128 --orig_master_port=3306
# Run on test-db-1
mysql> start slave;
# Run on test-db-2
mysql> stop slave;
# Run on test-db-3
mysql> CHANGE MASTER TO MASTER_HOST='192.168.147.128', MASTER_PORT=3306, MASTER_LOG_FILE='test-db-2.000004', MASTER_LOG_POS=578, MASTER_USER='repl', MASTER_PASSWORD='p4Mp*1z_&kv';
mysql> start slave;
# The master is now test-db-2
MaxScale> list servers
Servers.
-------------------+-----------------+-------+-------------+--------------------
Server | Address | Port | Connections | Status
-------------------+-----------------+-------+-------------+--------------------
test-db-1 | 192.168.151.178 | 3306 | 1 | Slave, Running
test-db-2 | 192.168.147.128 | 3306 | 1 | Master, Running
test-db-3 | 192.168.155.178 | 3306 | 0 | Slave, Running
-------------------+-----------------+-------+-------------+--------------------
Step 5.1 Fully automatic master switch -- before the switch
MaxScale> list servers
Servers.
-------------------+-----------------+-------+-------------+--------------------
Server | Address | Port | Connections | Status
-------------------+-----------------+-------+-------------+--------------------
test-db-1 | 192.168.151.178 | 3306 | 0 | Master, Running
test-db-2 | 192.168.147.128 | 3306 | 0 | Slave, Running
test-db-3 | 192.168.155.178 | 3306 | 0 | Slave, Running
-------------------+-----------------+-------+-------------+--------------------
Step 5.2 Fully automatic master switch -- interactive mode
$ masterha_master_switch --master_state=alive --conf=/etc/masterha_default.cnf --new_master_host=test-db-2n --new_master_port=3306
Thu May 2 08:32:31 2019 - [info] MHA::MasterRotate version 0.57.
Thu May 2 08:32:31 2019 - [info] Starting online master switch..
Thu May 2 08:32:31 2019 - [info]
Thu May 2 08:32:31 2019 - [info] * Phase 1: Configuration Check Phase..
Thu May 2 08:32:31 2019 - [info]
Thu May 2 08:32:31 2019 - [info] Reading default configuration from /etc/masterha_default.cnf..
Thu May 2 08:32:31 2019 - [info] Reading application default configuration from /etc/masterha_default.cnf..
Thu May 2 08:32:31 2019 - [info] Reading server configuration from /etc/masterha_default.cnf..
Thu May 2 08:32:33 2019 - [warning] SQL Thread is stopped(no error) on test-db-1(192.168.151.178:3306)
Thu May 2 08:32:33 2019 - [info] Multi-master configuration is detected. Current primary(writable) master is test-db-1(192.168.151.178:3306)
Thu May 2 08:32:33 2019 - [info] Master configurations are as below:
Master test-db-2n(192.168.147.128:3306), replicating from 192.168.151.178(192.168.151.178:3306), read-only
Master test-db-1(192.168.151.178:3306), replicating from 192.168.147.128(192.168.147.128:3306)
Thu May 2 08:32:33 2019 - [info] GTID failover mode = 0
Thu May 2 08:32:33 2019 - [info] Current Alive Master: test-db-1(192.168.151.178:3306)
Thu May 2 08:32:33 2019 - [info] Alive Slaves:
Thu May 2 08:32:33 2019 - [info] test-db-2n(192.168.147.128:3306) Version=5.7.26-log (oldest major version between slaves) log-bin:enabled
Thu May 2 08:32:33 2019 - [info] Replicating from 192.168.151.178(192.168.151.178:3306)
Thu May 2 08:32:33 2019 - [info] Primary candidate for the new Master (candidate_master is set)
Thu May 2 08:32:33 2019 - [info] test-db-3(192.168.155.178:3306) Version=5.7.26-log (oldest major version between slaves) log-bin:enabled
Thu May 2 08:32:33 2019 - [info] Replicating from 192.168.151.178(192.168.151.178:3306)
Thu May 2 08:32:33 2019 - [info] Not candidate for the new Master (no_master is set)
It is better to execute FLUSH NO_WRITE_TO_BINLOG TABLES on the master before switching. Is it ok to execute on test-db-1(192.168.151.178:3306)? (YES/no): YES
Thu May 2 08:32:35 2019 - [info] Executing FLUSH NO_WRITE_TO_BINLOG TABLES. This may take long time..
Thu May 2 08:32:35 2019 - [info] ok.
Thu May 2 08:32:35 2019 - [info] Checking MHA is not monitoring or doing failover..
Thu May 2 08:32:35 2019 - [info] Checking replication health on test-db-2n..
Thu May 2 08:32:35 2019 - [info] ok.
Thu May 2 08:32:35 2019 - [info] Checking replication health on test-db-3..
Thu May 2 08:32:35 2019 - [info] ok.
Thu May 2 08:32:35 2019 - [info] test-db-2n can be new master.
Thu May 2 08:32:35 2019 - [info]
From:
test-db-1(192.168.151.178:3306) (current master)
+--test-db-2n(192.168.147.128:3306)
+--test-db-3(192.168.155.178:3306)
To:
test-db-2n(192.168.147.128:3306) (new master)
+--test-db-3(192.168.155.178:3306)
Starting master switch from test-db-1(192.168.151.178:3306) to test-db-2n(192.168.147.128:3306)? (yes/NO): yes
Thu May 2 08:32:43 2019 - [info] Checking whether test-db-2n(192.168.147.128:3306) is ok for the new master..
Thu May 2 08:32:43 2019 - [info] ok.
Thu May 2 08:32:43 2019 - [info] ** Phase 1: Configuration Check Phase completed.
Thu May 2 08:32:43 2019 - [info]
Thu May 2 08:32:43 2019 - [info] * Phase 2: Rejecting updates Phase..
Thu May 2 08:32:43 2019 - [info]
master_ip_online_change_script is not defined. If you do not disable writes on the current master manually, applications keep writing on the current master. Is it ok to proceed? (yes/NO): yes
Thu May 2 08:32:47 2019 - [info] Locking all tables on the orig master to reject updates from everybody (including root):
Thu May 2 08:32:47 2019 - [info] Executing FLUSH TABLES WITH READ LOCK..
Thu May 2 08:32:47 2019 - [info] ok.
Thu May 2 08:32:47 2019 - [info] Orig master binlog:pos is test-db-1.000005:883.
Thu May 2 08:32:47 2019 - [info] Waiting to execute all relay logs on test-db-2n(192.168.147.128:3306)..
Thu May 2 08:32:47 2019 - [info] master_pos_wait(test-db-1.000005:883) completed on test-db-2n(192.168.147.128:3306). Executed 0 events.
Thu May 2 08:32:47 2019 - [info] done.
Thu May 2 08:32:47 2019 - [info] Getting new master's binlog name and position..
Thu May 2 08:32:47 2019 - [info] test-db-2n.000001:154
Thu May 2 08:32:47 2019 - [info] All other slaves should start replication from here. Statement should be: CHANGE MASTER TO MASTER_HOST='test-db-2n or 192.168.147.128', MASTER_PORT=3306, MASTER_LOG_FILE='test-db-2n.000001', MASTER_LOG_POS=154, MASTER_USER='repl', MASTER_PASSWORD='xxx';
Thu May 2 08:32:47 2019 - [info] Setting read_only=0 on test-db-2n(192.168.147.128:3306)..
Thu May 2 08:32:47 2019 - [info] ok.
Thu May 2 08:32:47 2019 - [info]
Thu May 2 08:32:47 2019 - [info] * Switching slaves in parallel..
Thu May 2 08:32:47 2019 - [info]
Thu May 2 08:32:47 2019 - [info] -- Slave switch on host test-db-3(192.168.155.178:3306) started, pid: 15130
Thu May 2 08:32:47 2019 - [info]
Thu May 2 08:32:48 2019 - [info] Log messages from test-db-3 ...
Thu May 2 08:32:48 2019 - [info]
Thu May 2 08:32:47 2019 - [info] Waiting to execute all relay logs on test-db-3(192.168.155.178:3306)..
Thu May 2 08:32:47 2019 - [info] master_pos_wait(test-db-1.000005:883) completed on test-db-3(192.168.155.178:3306). Executed 0 events.
Thu May 2 08:32:47 2019 - [info] done.
Thu May 2 08:32:47 2019 - [info] Resetting slave test-db-3(192.168.155.178:3306) and starting replication from the new master test-db-2n(192.168.147.128:3306)..
Thu May 2 08:32:47 2019 - [info] Executed CHANGE MASTER.
Thu May 2 08:32:47 2019 - [info] Slave started.
Thu May 2 08:32:48 2019 - [info] End of log messages from test-db-3 ...
Thu May 2 08:32:48 2019 - [info]
Thu May 2 08:32:48 2019 - [info] -- Slave switch on host test-db-3(192.168.155.178:3306) succeeded.
Thu May 2 08:32:48 2019 - [info] Unlocking all tables on the orig master:
Thu May 2 08:32:48 2019 - [info] Executing UNLOCK TABLES..
Thu May 2 08:32:48 2019 - [info] ok.
Thu May 2 08:32:48 2019 - [info] All new slave servers switched successfully.
Thu May 2 08:32:48 2019 - [info]
Thu May 2 08:32:48 2019 - [info] * Phase 5: New master cleanup phase..
Thu May 2 08:32:48 2019 - [info]
Thu May 2 08:32:48 2019 - [info] test-db-2: Resetting slave info succeeded.
Thu May 2 08:32:48 2019 - [info] Switching master to test-db-2(192.168.147.128:3306) completed successfully.
Step 5.3 Fully automated master switchover--after the switch
Note: test-db-1 must be re-pointed to the new master manually with CHANGE MASTER TO; otherwise MaxScale will only show it with a Running status (without the Slave role).
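The switchover log above prints the exact CHANGE MASTER statement to use (with the password masked as 'xxx'). Re-pointing test-db-1 might look like the sketch below, using the repl account created in Step 1.2; the binlog file and position must be taken from your own log output, and since Step 1.2 only granted repl to the slave IPs, a matching grant for test-db-1's IP (192.168.151.178) may be needed first.

```sql
-- Run on test-db-1 (sketch only; confirm the file/position against the
-- "All other slaves should start replication from here" line in your log).
STOP SLAVE;
CHANGE MASTER TO
  MASTER_HOST='192.168.147.128',
  MASTER_PORT=3306,
  MASTER_LOG_FILE='test-db-2.000001',
  MASTER_LOG_POS=154,
  MASTER_USER='repl',
  MASTER_PASSWORD='p4Mp*1z_&kv';
START SLAVE;
```

Afterwards, `SHOW SLAVE STATUS\G` on test-db-1 should report both IO and SQL threads as running, and MaxScale should list it as "Slave, Running".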
MaxScale> list servers
Servers.
-------------------+-----------------+-------+-------------+--------------------
Server | Address | Port | Connections | Status
-------------------+-----------------+-------+-------------+--------------------
test-db-1 | 192.168.151.178 | 3306 | 0 | Slave, Running
test-db-2 | 192.168.147.128 | 3306 | 0 | Master, Running
test-db-3 | 192.168.155.178 | 3306 | 0 | Slave, Running
-------------------+-----------------+-------+-------------+--------------------
Step 5.4 Fully automated master switchover--non-interactive mode
# Non-interactive mode:
$ masterha_master_switch --master_state=alive --conf=/etc/masterha_default.cnf --new_master_host=test-db-1 --new_master_port=3306 --interactive=0
Thu May 2 08:42:35 2019 - [info] MHA::MasterRotate version 0.57.
Thu May 2 08:42:35 2019 - [info] Starting online master switch..
Thu May 2 08:42:35 2019 - [info]
Thu May 2 08:42:35 2019 - [info] * Phase 1: Configuration Check Phase..
Thu May 2 08:42:35 2019 - [info]
Thu May 2 08:42:35 2019 - [info] Reading default configuration from /etc/masterha_default.cnf..
Thu May 2 08:42:35 2019 - [info] Reading application default configuration from /etc/masterha_default.cnf..
Thu May 2 08:42:35 2019 - [info] Reading server configuration from /etc/masterha_default.cnf..
Thu May 2 08:42:36 2019 - [info] GTID failover mode = 0
Thu May 2 08:42:36 2019 - [info] Current Alive Master: test-db-2 (192.168.147.128:3306)
Thu May 2 08:42:36 2019 - [info] Alive Slaves:
Thu May 2 08:42:36 2019 - [info] test-db-1(192.168.151.178:3306) Version=5.7.26-log (oldest major version between slaves) log-bin:enabled
Thu May 2 08:42:36 2019 - [info] Replicating from 192.168.147.128(192.168.147.128:3306)
Thu May 2 08:42:36 2019 - [info] Primary candidate for the new Master (candidate_master is set)
Thu May 2 08:42:36 2019 - [info] test-db-3(192.168.155.178:3306) Version=5.7.26-log (oldest major version between slaves) log-bin:enabled
Thu May 2 08:42:36 2019 - [info] Replicating from 192.168.147.128(192.168.147.128:3306)
Thu May 2 08:42:36 2019 - [info] Not candidate for the new Master (no_master is set)
Thu May 2 08:42:36 2019 - [info] Executing FLUSH NO_WRITE_TO_BINLOG TABLES. This may take long time..
Thu May 2 08:42:36 2019 - [info] ok.
Thu May 2 08:42:36 2019 - [info] Checking MHA is not monitoring or doing failover..
Thu May 2 08:42:36 2019 - [info] Checking replication health on test-db-1..
Thu May 2 08:42:36 2019 - [info] ok.
Thu May 2 08:42:36 2019 - [info] Checking replication health on test-db-3..
Thu May 2 08:42:36 2019 - [info] ok.
Thu May 2 08:42:36 2019 - [info] test-db-1 can be new master.
Thu May 2 08:42:36 2019 - [info]
From:
test-db-2 (192.168.147.128:3306) (current master)
+--test-db-1(192.168.151.178:3306)
+--test-db-3(192.168.155.178:3306)
To:
test-db-1(192.168.151.178:3306) (new master)
+--test-db-3(192.168.155.178:3306)
Thu May 2 08:42:36 2019 - [info] Checking whether test-db-1(192.168.151.178:3306) is ok for the new master..
Thu May 2 08:42:36 2019 - [info] ok.
Thu May 2 08:42:36 2019 - [info] ** Phase 1: Configuration Check Phase completed.
Thu May 2 08:42:36 2019 - [info]
Thu May 2 08:42:36 2019 - [info] * Phase 2: Rejecting updates Phase..
Thu May 2 08:42:36 2019 - [info]
Thu May 2 08:42:36 2019 - [warning] master_ip_online_change_script is not defined. Skipping disabling writes on the current master.
Thu May 2 08:42:36 2019 - [info] Locking all tables on the orig master to reject updates from everybody (including root):
Thu May 2 08:42:36 2019 - [info] Executing FLUSH TABLES WITH READ LOCK..
Thu May 2 08:42:36 2019 - [info] ok.
Thu May 2 08:42:36 2019 - [info] Orig master binlog:pos is test-db-2.000001:154.
Thu May 2 08:42:36 2019 - [info] Waiting to execute all relay logs on test-db-1(192.168.151.178:3306)..
Thu May 2 08:42:36 2019 - [info] master_pos_wait(test-db-2.000001:154) completed on test-db-1(192.168.151.178:3306). Executed 0 events.
Thu May 2 08:42:36 2019 - [info] done.
Thu May 2 08:42:36 2019 - [info] Getting new master's binlog name and position..
Thu May 2 08:42:36 2019 - [info] test-db-1.000005:883
Thu May 2 08:42:36 2019 - [info] All other slaves should start replication from here. Statement should be: CHANGE MASTER TO MASTER_HOST='test-db-1 or 192.168.151.178', MASTER_PORT=3306, MASTER_LOG_FILE='test-db-1.000005', MASTER_LOG_POS=883, MASTER_USER='repl', MASTER_PASSWORD='xxx';
Thu May 2 08:42:36 2019 - [info]
Thu May 2 08:42:36 2019 - [info] * Switching slaves in parallel..
Thu May 2 08:42:36 2019 - [info]
Thu May 2 08:42:36 2019 - [info] -- Slave switch on host test-db-3(192.168.155.178:3306) started, pid: 15691
Thu May 2 08:42:36 2019 - [info]
Thu May 2 08:42:37 2019 - [info] Log messages from test-db-3 ...
Thu May 2 08:42:37 2019 - [info]
Thu May 2 08:42:36 2019 - [info] Waiting to execute all relay logs on test-db-3(192.168.155.178:3306)..
Thu May 2 08:42:36 2019 - [info] master_pos_wait(test-db-2.000001:154) completed on test-db-3(192.168.155.178:3306). Executed 0 events.
Thu May 2 08:42:36 2019 - [info] done.
Thu May 2 08:42:36 2019 - [info] Resetting slave test-db-3(192.168.155.178:3306) and starting replication from the new master test-db-1(192.168.151.178:3306)..
Thu May 2 08:42:36 2019 - [info] Executed CHANGE MASTER.
Thu May 2 08:42:36 2019 - [info] Slave started.
Thu May 2 08:42:37 2019 - [info] End of log messages from test-db-3 ...
Thu May 2 08:42:37 2019 - [info]
Thu May 2 08:42:37 2019 - [info] -- Slave switch on host test-db-3(192.168.155.178:3306) succeeded.
Thu May 2 08:42:37 2019 - [info] Unlocking all tables on the orig master:
Thu May 2 08:42:37 2019 - [info] Executing UNLOCK TABLES..
Thu May 2 08:42:37 2019 - [info] ok.
Thu May 2 08:42:37 2019 - [info] All new slave servers switched successfully.
Thu May 2 08:42:37 2019 - [info]
Thu May 2 08:42:37 2019 - [info] * Phase 5: New master cleanup phase..
Thu May 2 08:42:37 2019 - [info]
Thu May 2 08:42:37 2019 - [info] test-db-1: Resetting slave info succeeded.
Thu May 2 08:42:37 2019 - [info] Switching master to test-db-1(192.168.151.178:3306) completed successfully.
Step 5.5 Fully automated master switchover--after the switch
MaxScale> list servers
Servers.
-------------------+-----------------+-------+-------------+--------------------
Server | Address | Port | Connections | Status
-------------------+-----------------+-------+-------------+--------------------
test-db-1 | 192.168.151.178 | 3306 | 0 | Master, Running
test-db-2 | 192.168.147.128 | 3306 | 0 | Slave, Running
test-db-3 | 192.168.155.178 | 3306 | 0 | Slave, Running
-------------------+-----------------+-------+-------------+--------------------
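After a switchover it is handy to confirm programmatically which node MaxScale currently reports as master. A minimal, hypothetical helper (assumes the pipe-separated column layout shown in the `list servers` tables above, with exactly one row whose Status contains "Master"):

```shell
#!/bin/sh
# current_master: read a MaxScale "list servers" table on stdin and print
# the hostname of the row whose Status column contains "Master".
current_master() {
  awk -F'|' '/Master/ { gsub(/ /, "", $1); print $1 }'
}

# Example, piping in the server rows from the Step 5.5 table above:
printf '%s\n' \
  'test-db-1 | 192.168.151.178 | 3306 | 0 | Master, Running' \
  'test-db-2 | 192.168.147.128 | 3306 | 0 | Slave, Running' \
  'test-db-3 | 192.168.155.178 | 3306 | 0 | Slave, Running' \
  | current_master   # prints: test-db-1
```

In practice the input would come from the MaxScale admin client rather than printf; the parsing itself only depends on the column layout shown above.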