Environment
vip 192.168.1.101
slave 192.168.1.16 5.7.17 3306
master 192.168.1.135 5.7.17 3306
proxysql 192.168.1.16 (for convenience, ProxySQL runs on the .16 node)
I. Setting up MHA
1. Install the MHA software. First install the EPEL repository (on both machines).
rpm -ivh http://dl.fedoraproject.org/pub/epel/6/x86_64/epel-release-6-8.noarch.rpm
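2. Install the Perl dependencies (on both machines)
yum install perl-DBD-MySQL
yum install perl-Config-Tiny
yum install perl-Log-Dispatch
yum install perl-Parallel-ForkManager
3. Install the MHA packages (installing both the node and manager packages on both machines is recommended, since it makes switching easier)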
rpm -ivh mha4mysql-node-0.56-0.el6.noarch.rpm
rpm -ivh mha4mysql-manager-0.56-0.el6.noarch.rpm
4. Set up SSH trust between the machines
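The original gives no commands for this step. Below is a minimal sketch, assuming root logins (as in ssh_user=root further down) and the two hosts from the environment list. It only prints the ssh-keygen and ssh-copy-id commands to run on each node and does not execute anything; the key path is an assumption.

```shell
#!/bin/bash
# Hosts from the environment list above; adjust for your own setup.
HOSTS="192.168.1.16 192.168.1.135"

# Print the commands that set up key-based root SSH to every host.
# Run the printed commands on each node in turn; nothing is executed here.
print_ssh_trust_commands() {
    echo "ssh-keygen -t rsa -N '' -f ~/.ssh/id_rsa"
    for host in $HOSTS; do
        echo "ssh-copy-id -i ~/.ssh/id_rsa.pub root@$host"
    done
}

print_ssh_trust_commands
```

Each node also copies its key to itself, because MHA tests SSH between every pair of configured servers.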
5. Grant privileges
GRANT ALL PRIVILEGES ON *.* TO 'zhuch'@'%' IDENTIFIED BY "zhuch";
GRANT REPLICATION SLAVE ON *.* TO 'slave'@'%' IDENTIFIED BY "oracle";
6. Create the application directory
mkdir /etc/masterha
Copy the following files into /etc/masterha:
[root@mysql3 masterha]# ls -l
total 32
-rw-r--r--. 1 root root 509 Feb 10 02:29 app1.conf
-rw-r--r--. 1 root root 55 Feb 10 03:15 drop_vip.sh
-rw-r--r--. 1 root root 57 Feb 10 03:15 init_vip.sh
-rw-r--r--. 1 root root 354 Feb 10 02:25 masterha_default.conf
-rwxr-xr-x. 1 root root 3978 Feb 10 03:16 master_ip_failover
-rwxr-xr-x. 1 root root 10390 Feb 10 03:17 master_ip_online_change
app1.conf: the MHA application configuration file (a sample ships in the unpacked software package, but here we simply create and edit a new one)
[root@mysql3 masterha]# cat app1.conf
[server default]
# MHA manager working directory
manager_workdir = /var/log/masterha/app1
manager_log = /var/log/masterha/app1/app1.log
remote_workdir = /var/log/masterha/app1
[server1]
hostname=192.168.1.16
master_binlog_dir = /data/mysql/mysql3306/logs
candidate_master = 1
# ignore replication delay when choosing the new master, so that a lagging slave does not stall the failover when the master fails
check_repl_delay = 0
[server3]
hostname=192.168.1.135
master_binlog_dir=/data/mysql/mysql3306/logs
candidate_master = 1
check_repl_delay = 0
drop_vip.sh: removes the VIP
[root@mysql3 masterha]# cat drop_vip.sh
vip="192.168.1.101/24"
/sbin/ip addr del $vip dev eth0
init_vip.sh: binds the VIP
[root@mysql3 masterha]# cat init_vip.sh
vip="192.168.1.101/24"
/sbin/ip addr add $vip dev eth0
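Running the add command a second time fails because the address is already bound. A small guard avoids that. This is only a sketch: has_vip and ensure_vip are hypothetical helper names, and the interface is assumed to be eth0 as in the scripts above.

```shell
#!/bin/bash
VIP="192.168.1.101"
DEV="eth0"

# Succeeds if the `ip addr` output read from stdin lists the given address.
has_vip() {
    grep -qF "inet $1/"
}

# Bind the VIP only when it is not already present on the interface.
# Must be called as root; it is defined here but not invoked.
ensure_vip() {
    if ip -4 addr show dev "$DEV" 2>/dev/null | has_vip "$VIP"; then
        echo "$VIP already bound on $DEV"
    else
        /sbin/ip addr add "$VIP/24" dev "$DEV"
    fi
}
```

Calling ensure_vip on the current master then has the same effect as init_vip.sh, but can be repeated safely.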
masterha_default.conf: the global configuration file
[root@mysql3 masterha]# cat masterha_default.conf
[server default]
# MySQL user and password
user=zhuch
password=zhuch
# system SSH user
ssh_user=root
# replication user
repl_user=slave
repl_password=oracle
# monitoring
ping_interval=1
#shutdown_script=""
# scripts called on switchover
master_ip_failover_script= /etc/masterha/master_ip_failover
master_ip_online_change_script= /etc/masterha/master_ip_online_change
master_ip_failover: the automatic failover script
[root@mysql3 masterha]# cat master_ip_failover
#!/usr/bin/env perl
# Copyright (C) 2011 DeNA Co.,Ltd.
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, write to the Free Software
# Foundation, Inc.,
# 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
## Note: This is a sample script and is not complete. Modify the script based on your environment.
use strict;
use warnings FATAL => 'all';
use Getopt::Long;
use MHA::DBHelper;

# VIP for this group of servers
my $vip = "192.168.1.101";
my $if  = "eth0";
my (
  $command,        $ssh_user,         $orig_master_host,
  $orig_master_ip, $orig_master_port, $new_master_host,
  $new_master_ip,  $new_master_port,  $new_master_user,
  $new_master_password
);
GetOptions(
  'command=s'             => \$command,
  'ssh_user=s'            => \$ssh_user,
  'orig_master_host=s'    => \$orig_master_host,
  'orig_master_ip=s'      => \$orig_master_ip,
  'orig_master_port=i'    => \$orig_master_port,
  'new_master_host=s'     => \$new_master_host,
  'new_master_ip=s'       => \$new_master_ip,
  'new_master_port=i'     => \$new_master_port,
  'new_master_user=s'     => \$new_master_user,
  'new_master_password=s' => \$new_master_password,
);

# Remove the VIP from the old master, then bind it on the new master
sub add_vip {
  my $output1 = `ssh -o ConnectTimeout=15 -o ConnectionAttempts=3 $orig_master_host /sbin/ip addr del $vip/24 dev $if`;
  my $output2 = `ssh -o ConnectTimeout=15 -o ConnectionAttempts=3 $new_master_host /sbin/ip addr add $vip/24 dev $if`;
}

exit &main();

sub main {
  if ( $command eq "stop" || $command eq "stopssh" ) {
    # $orig_master_host, $orig_master_ip, $orig_master_port are passed.
    # If you manage master ip address at global catalog database,
    # invalidate orig_master_ip here.
    my $exit_code = 1;
    eval {
      # updating global catalog, etc
      $exit_code = 0;
    };
    if ($@) {
      warn "Got Error: $@\n";
      exit $exit_code;
    }
    exit $exit_code;
  }
  elsif ( $command eq "start" ) {
    # all arguments are passed.
    # If you manage master ip address at global catalog database,
    # activate new_master_ip here.
    # You can also grant write access (create user, set read_only=0, etc) here.
    my $exit_code = 10;
    eval {
      my $new_master_handler = new MHA::DBHelper();
      # args: hostname, port, user, password, raise_error_or_not
      $new_master_handler->connect( $new_master_ip, $new_master_port,
        $new_master_user, $new_master_password, 1 );
      ## Set read_only=0 on the new master
      $new_master_handler->disable_log_bin_local();
      print "Set read_only=0 on the new master.\n";
      $new_master_handler->disable_read_only();
      ## Creating an app user on the new master
      #print "Creating app user on the new master..\n";
      #FIXME_xxx_create_user( $new_master_handler->{dbh} );
      $new_master_handler->enable_log_bin_local();
      $new_master_handler->disconnect();
      ## Update master ip on the catalog database, etc
      &add_vip();
      $exit_code = 0;
    };
    if ($@) {
      warn $@;
      # If you want to continue failover, exit 10.
      exit $exit_code;
    }
    exit $exit_code;
  }
  elsif ( $command eq "status" ) {
    # do nothing
    exit 0;
  }
  else {
    &usage();
    exit 1;
  }
}

sub usage {
  print
"Usage: master_ip_failover --command=start|stop|stopssh|status --orig_master_host=host --orig_master_ip=ip --orig_master_port=port --new_master_host=host --new_master_ip=ip --new_master_port=port\n";
}
master_ip_online_change: the script for manual (online) switchover
[root@mysql3 masterha]# cat master_ip_online_change
#!/usr/bin/env perl
# Copyright (C) 2011 DeNA Co.,Ltd.
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, write to the Free Software
# Foundation, Inc.,
# 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
## Note: This is a sample script and is not complete. Modify the script based on your environment.
use strict;
use warnings FATAL => 'all';
use Getopt::Long;
use MHA::DBHelper;
use MHA::NodeUtil;
use Time::HiRes qw( sleep gettimeofday tv_interval );
use Data::Dumper;

my $_tstart;
my $_running_interval = 0.1;

# VIP definition
my $vip = "192.168.1.101";
my $if  = "eth0";
my (
  $command,              $orig_master_is_new_slave, $orig_master_host,
  $orig_master_ip,       $orig_master_port,         $orig_master_user,
  $orig_master_password, $orig_master_ssh_user,     $new_master_host,
  $new_master_ip,        $new_master_port,          $new_master_user,
  $new_master_password,  $new_master_ssh_user,
);
GetOptions(
  'command=s'                => \$command,
  'orig_master_is_new_slave' => \$orig_master_is_new_slave,
  'orig_master_host=s'       => \$orig_master_host,
  'orig_master_ip=s'         => \$orig_master_ip,
  'orig_master_port=i'       => \$orig_master_port,
  'orig_master_user=s'       => \$orig_master_user,
  'orig_master_password=s'   => \$orig_master_password,
  'orig_master_ssh_user=s'   => \$orig_master_ssh_user,
  'new_master_host=s'        => \$new_master_host,
  'new_master_ip=s'          => \$new_master_ip,
  'new_master_port=i'        => \$new_master_port,
  'new_master_user=s'        => \$new_master_user,
  'new_master_password=s'    => \$new_master_password,
  'new_master_ssh_user=s'    => \$new_master_ssh_user,
);

exit &main();

sub drop_vip {
  my $output = `ssh -o ConnectTimeout=15 -o ConnectionAttempts=3 $orig_master_host /sbin/ip addr del $vip/24 dev $if`;
  # kill all remaining connections in MySQL
  # FIXME
}

sub add_vip {
  my $output = `ssh -o ConnectTimeout=15 -o ConnectionAttempts=3 $new_master_host /sbin/ip addr add $vip/24 dev $if`;
}

sub current_time_us {
  my ( $sec, $microsec ) = gettimeofday();
  my $curdate = localtime($sec);
  return $curdate . " " . sprintf( "%06d", $microsec );
}

sub sleep_until {
  my $elapsed = tv_interval($_tstart);
  if ( $_running_interval > $elapsed ) {
    sleep( $_running_interval - $elapsed );
  }
}

sub get_threads_util {
  my $dbh                    = shift;
  my $my_connection_id       = shift;
  my $running_time_threshold = shift;
  my $type                   = shift;
  $running_time_threshold = 0 unless ($running_time_threshold);
  $type                   = 0 unless ($type);
  my @threads;
  my $sth = $dbh->prepare("SHOW PROCESSLIST");
  $sth->execute();
  while ( my $ref = $sth->fetchrow_hashref() ) {
    my $id         = $ref->{Id};
    my $user       = $ref->{User};
    my $host       = $ref->{Host};
    my $command    = $ref->{Command};
    my $state      = $ref->{State};
    my $query_time = $ref->{Time};
    my $info       = $ref->{Info};
    $info =~ s/^\s*(.*?)\s*$/$1/ if defined($info);
    next if ( $my_connection_id == $id );
    next if ( defined($query_time) && $query_time < $running_time_threshold );
    next if ( defined($command) && $command eq "Binlog Dump" );
    next if ( defined($user) && $user eq "system user" );
    next
      if ( defined($command)
      && $command eq "Sleep"
      && defined($query_time)
      && $query_time >= 1 );
    if ( $type >= 1 ) {
      next if ( defined($command) && $command eq "Sleep" );
      next if ( defined($command) && $command eq "Connect" );
    }
    if ( $type >= 2 ) {
      next if ( defined($info) && $info =~ m/^select/i );
      next if ( defined($info) && $info =~ m/^show/i );
    }
    push @threads, $ref;
  }
  return @threads;
}

sub main {
  if ( $command eq "stop" ) {
    ## Gracefully killing connections on the current master
    # 1. Set read_only= 1 on the new master
    # 2. DROP USER so that no app user can establish new connections
    # 3. Set read_only= 1 on the current master
    # 4. Kill current queries
    # * Any database access failure will result in script die.
    my $exit_code = 1;
    eval {
      ## Setting read_only=1 on the new master (to avoid accident)
      my $new_master_handler = new MHA::DBHelper();
      # args: hostname, port, user, password, raise_error(die_on_error)_or_not
      $new_master_handler->connect( $new_master_ip, $new_master_port,
        $new_master_user, $new_master_password, 1 );
      print current_time_us() . "Set read_only on the new master..";
      $new_master_handler->enable_read_only();
      if ( $new_master_handler->is_read_only() ) {
        print "ok.\n";
      }
      else {
        die "Failed!\n";
      }
      $new_master_handler->disconnect();

      # Connecting to the orig master, die if any database error happens
      my $orig_master_handler = new MHA::DBHelper();
      $orig_master_handler->connect( $orig_master_ip, $orig_master_port,
        $orig_master_user, $orig_master_password, 1 );

      ## Drop application user so that nobody can connect. Disabling per-session binlog beforehand
      $orig_master_handler->disable_log_bin_local();
      #print current_time_us() . " Drpping app user on the orig master..\n";
      print current_time_us() . "drop vip $vip..\n";
      #drop_app_user($orig_master_handler);
      &drop_vip();

      ## Waiting for N * 100 milliseconds so that current connections can exit
      my $time_until_read_only = 15;
      $_tstart = [gettimeofday];
      my @threads = get_threads_util( $orig_master_handler->{dbh},
        $orig_master_handler->{connection_id} );
      while ( $time_until_read_only > 0 && $#threads >= 0 ) {
        if ( $time_until_read_only % 5 == 0 ) {
          printf
"%s Waiting all running %d threads are disconnected.. (max %d milliseconds)\n",
            current_time_us(), $#threads + 1, $time_until_read_only * 100;
          if ( $#threads < 5 ) {
            print Data::Dumper->new( [$_] )->Indent(0)->Terse(1)->Dump . "\n"
              foreach (@threads);
          }
        }
        sleep_until();
        $_tstart = [gettimeofday];
        $time_until_read_only--;
        @threads = get_threads_util( $orig_master_handler->{dbh},
          $orig_master_handler->{connection_id} );
      }

      ## Setting read_only=1 on the current master so that nobody(except SUPER) can write
      print current_time_us() . "Set read_only=1 on the orig master..";
      $orig_master_handler->enable_read_only();
      if ( $orig_master_handler->is_read_only() ) {
        print "ok.\n";
      }
      else {
        die "Failed!\n";
      }

      ## Waiting for M * 100 milliseconds so that current update queries can complete
      my $time_until_kill_threads = 5;
      @threads = get_threads_util( $orig_master_handler->{dbh},
        $orig_master_handler->{connection_id} );
      while ( $time_until_kill_threads > 0 && $#threads >= 0 ) {
        if ( $time_until_kill_threads % 5 == 0 ) {
          printf
"%s Waiting all running %d queries are disconnected.. (max %d milliseconds)\n",
            current_time_us(), $#threads + 1, $time_until_kill_threads * 100;
          if ( $#threads < 5 ) {
            print Data::Dumper->new( [$_] )->Indent(0)->Terse(1)->Dump . "\n"
              foreach (@threads);
          }
        }
        sleep_until();
        $_tstart = [gettimeofday];
        $time_until_kill_threads--;
        @threads = get_threads_util( $orig_master_handler->{dbh},
          $orig_master_handler->{connection_id} );
      }

      ## Terminating all threads
      print current_time_us() . "Killing all application threads..\n";
      $orig_master_handler->kill_threads(@threads) if ( $#threads >= 0 );
      print current_time_us() . "done.\n";
      $orig_master_handler->enable_log_bin_local();
      $orig_master_handler->disconnect();

      ## After finishing the script, MHA executes FLUSH TABLES WITH READ LOCK
      $exit_code = 0;
    };
    if ($@) {
      warn "Got Error: $@\n";
      exit $exit_code;
    }
    exit $exit_code;
  }
  elsif ( $command eq "start" ) {
    ## Activating master ip on the new master
    # 1. Create app user with write privileges
    # 2. Moving backup script if needed
    # 3. Register new master's ip to the catalog database
    # We don't return error even though activating updatable accounts/ip failed so that we don't interrupt slaves' recovery.
    # If exit code is 0 or 10, MHA does not abort
    my $exit_code = 10;
    eval {
      my $new_master_handler = new MHA::DBHelper();
      # args: hostname, port, user, password, raise_error_or_not
      $new_master_handler->connect( $new_master_ip, $new_master_port,
        $new_master_user, $new_master_password, 1 );

      ## Set read_only=0 on the new master
      $new_master_handler->disable_log_bin_local();
      print current_time_us() . "Set read_only=0 on the new master.\n";
      $new_master_handler->disable_read_only();

      ## Creating an app user on the new master
      #print current_time_us() . " Creating app user on the new master..\n";
      print current_time_us() . "Add vip $vip on $if..\n";
      #create_app_user($new_master_handler);
      &add_vip();
      $new_master_handler->enable_log_bin_local();
      $new_master_handler->disconnect();

      ## Update master ip on the catalog database, etc
      $exit_code = 0;
    };
    if ($@) {
      warn "Got Error: $@\n";
      exit $exit_code;
    }
    exit $exit_code;
  }
  elsif ( $command eq "status" ) {
    # do nothing
    exit 0;
  }
  else {
    &usage();
    exit 1;
  }
}

sub usage {
  print
"Usage: master_ip_online_change --command=start|stop|status --orig_master_host=host --orig_master_ip=ip --orig_master_port=port --new_master_host=host --new_master_ip=ip --new_master_port=port\n";
  die;
}
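For context, master_ip_online_change is the script MHA calls during a planned switchover while the master is still alive. The sketch below only assembles and prints such a masterha_master_switch command line from the config paths used in this article. The helper name online_switch_args is hypothetical, and the target host is just the slave from the environment list.

```shell
#!/bin/bash
GLOBAL_CONF=/etc/masterha/masterha_default.conf
APP_CONF=/etc/masterha/app1.conf

# Build the argument list for a planned switchover to the given host.
# --orig_master_is_new_slave makes the old master a slave of the new one.
online_switch_args() {
    echo "--global_conf=$GLOBAL_CONF --conf=$APP_CONF" \
         "--master_state=alive --new_master_host=$1" \
         "--orig_master_is_new_slave"
}

# Print the command instead of running it.
echo "masterha_master_switch $(online_switch_args 192.168.1.16)"
```

As far as I know, an online switch expects masterha_manager to be stopped for that application first, so check that before running the printed command.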
7. Bind the VIP on the master (run the script)
sh init_vip.sh
8. Check that SSH works
[root@mysql2 opt]# masterha_check_ssh --global_conf=/etc/masterha/masterha_default.conf --conf=/etc/masterha/app1.conf
Sat Feb 10 22:00:34 2018 - [info] Reading default configuration from /etc/masterha/masterha_default.conf..
Sat Feb 10 22:00:34 2018 - [info] Reading application default configuration from /etc/masterha/app1.conf..
Sat Feb 10 22:00:34 2018 - [info] Reading server configuration from /etc/masterha/app1.conf..
Sat Feb 10 22:00:34 2018 - [info] Starting SSH connection tests..
Sat Feb 10 22:00:36 2018 - [debug]
Sat Feb 10 22:00:35 2018 - [debug] Connecting via SSH from root@192.168.1.135(192.168.1.135:22) to root@192.168.1.16(192.168.1.16:22)..
Sat Feb 10 22:00:36 2018 - [debug] ok.
Sat Feb 10 22:00:41 2018 - [debug]
Sat Feb 10 22:00:34 2018 - [debug] Connecting via SSH from root@192.168.1.16(192.168.1.16:22) to root@192.168.1.135(192.168.1.135:22)..
Sat Feb 10 22:00:41 2018 - [debug] ok.
Sat Feb 10 22:00:41 2018 - [info] All SSH connection tests passed successfully.
9. Check that master-slave replication is healthy
[root@mysql2 opt]# masterha_check_repl --global_conf=/etc/masterha/masterha_default.conf --conf=/etc/masterha/app1.conf
Sat Feb 10 22:26:50 2018 - [info] Reading default configuration from /etc/masterha/masterha_default.conf..
Sat Feb 10 22:26:50 2018 - [info] Reading application default configuration from /etc/masterha/app1.conf..
Sat Feb 10 22:26:50 2018 - [info] Reading server configuration from /etc/masterha/app1.conf..
Sat Feb 10 22:26:50 2018 - [info] MHA::MasterMonitor version 0.56.
Sat Feb 10 22:26:50 2018 - [info] GTID failover mode = 1
Sat Feb 10 22:26:50 2018 - [info] Dead Servers:
Sat Feb 10 22:26:50 2018 - [info] Alive Servers:
Sat Feb 10 22:26:50 2018 - [info] 192.168.1.16(192.168.1.16:3306)
Sat Feb 10 22:26:50 2018 - [info] 192.168.1.135(192.168.1.135:3306)
Sat Feb 10 22:26:50 2018 - [info] Alive Slaves:
Sat Feb 10 22:26:50 2018 - [info] 192.168.1.16(192.168.1.16:3306) Version=5.7.17-log (oldest major version between slaves) log-bin:enabled
Sat Feb 10 22:26:50 2018 - [info] GTID ON
Sat Feb 10 22:26:50 2018 - [info] Replicating from 192.168.1.135(192.168.1.135:3306)
Sat Feb 10 22:26:50 2018 - [info] Primary candidate for the new Master (candidate_master is set)
Sat Feb 10 22:26:50 2018 - [info] Current Alive Master: 192.168.1.135(192.168.1.135:3306)
Sat Feb 10 22:26:50 2018 - [info] Checking slave configurations..
Sat Feb 10 22:26:50 2018 - [info] read_only=1 is not set on slave 192.168.1.16(192.168.1.16:3306).
Sat Feb 10 22:26:50 2018 - [info] Checking replication filtering settings..
Sat Feb 10 22:26:50 2018 - [info] binlog_do_db= , binlog_ignore_db=
Sat Feb 10 22:26:50 2018 - [info] Replication filtering check ok.
Sat Feb 10 22:26:50 2018 - [info] GTID (with auto-pos) is supported. Skipping all SSH and Node package checking.
Sat Feb 10 22:26:50 2018 - [info] Checking SSH publickey authentication settings on the current master..
Sat Feb 10 22:26:51 2018 - [info] HealthCheck: SSH to 192.168.1.135 is reachable.
Sat Feb 10 22:26:51 2018 - [info]
192.168.1.135(192.168.1.135:3306) (current master)
 +--192.168.1.16(192.168.1.16:3306)
Sat Feb 10 22:26:51 2018 - [info] Checking replication health on 192.168.1.16..
Sat Feb 10 22:26:51 2018 - [info] ok.
Sat Feb 10 22:26:51 2018 - [info] Checking master_ip_failover_script status:
Sat Feb 10 22:26:51 2018 - [info] /etc/masterha/master_ip_failover --command=status --ssh_user=root --orig_master_host=192.168.1.135 --orig_master_ip=192.168.1.135 --orig_master_port=3306
Sat Feb 10 22:26:51 2018 - [info] OK.
Sat Feb 10 22:26:51 2018 - [warning] shutdown_script is not defined.
Sat Feb 10 22:26:51 2018 - [info] Got exit code 0 (Not master dead).
MySQL Replication Health is OK.
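10. On the slave, set relay_log_purge=0 and read_only=1 (read-only)
set global relay_log_purge=0;
set global read_only=1;
The retained relay logs may be needed when MHA applies differential relay log events to other slaves. With one master and one slave, as here, this setting is not strictly necessary. If relay_log_purge=0 is set, the relay logs on the slave can pile up; the purge_relay_logs command that ships with MHA can delete them on a schedule.
It can be wrapped in a script and run periodically, for example: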
#!/bin/bash
user=zhuch
passwd=zhuch
port=3306
log_dir='/etc/masterha/log'
work_dir='/etc/masterha/relay_log_node'
purge='/usr/bin/purge_relay_logs'
# mkdir -p does nothing if the directories already exist
mkdir -p "$log_dir" "$work_dir"
"$purge" --user=$user --password=$passwd --disable_relay_log_purge --port=$port --workdir=$work_dir >> $log_dir/purge_relay_logs.log 2>&1
At this point MHA is essentially set up: when the master goes down, it switches over to the slave and the VIP moves to the slave as well.
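Automatic failover only happens while the MHA manager is monitoring the master, and the steps above do not show how to start it. Below is a minimal sketch, assuming the config paths from this article. The log file path and the helper name manager_args are assumptions. The script only prints the command when masterha_manager is not installed.

```shell
#!/bin/bash
GLOBAL_CONF=/etc/masterha/masterha_default.conf
APP_CONF=/etc/masterha/app1.conf
LOG=/var/log/masterha/app1/manager.log   # assumed log location

# Arguments shared by the MHA manager commands in this article.
manager_args() {
    echo "--global_conf=$GLOBAL_CONF --conf=$APP_CONF"
}

if command -v masterha_manager >/dev/null 2>&1; then
    mkdir -p "$(dirname "$LOG")"
    # Run the monitor in the background and keep its output.
    nohup masterha_manager $(manager_args) >> "$LOG" 2>&1 &
    sleep 2
    # Report whether the manager is running for this application.
    masterha_check_status --conf="$APP_CONF"
else
    echo "masterha_manager not found; would run: masterha_manager $(manager_args)"
fi
```

The manager exits after it completes a failover, so it has to be started again once the topology has been repaired.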
II. Installing and configuring ProxySQL
1. Installation
Download: https://www.percona.com/downloads/proxysql/
rpm -ivh proxysql-1.4.3-1-centos67.x86_64.rpm
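2. Configuration. Log in to the ProxySQL admin interface and add the MySQL master and slave. The master goes into the writer group, i.e. hostgroup_id 100; the slave serves reads and goes into hostgroup 1000.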
mysql -uadmin -padmin -P6032 -h127.0.0.1
Note, however, that here I set the writer node directly to the VIP 192.168.1.101:
insert into mysql_servers(hostgroup_id,hostname,port,weight,max_connections,max_replication_lag,comment) values(100,'192.168.1.101',3306,1,1000,10,'vip');
insert into mysql_servers(hostgroup_id,hostname,port,weight,max_connections,max_replication_lag,comment) values(1000,'192.168.1.16',3306,1,1000,10,'slave');
admin@ 23:16: [(none)]> select * from mysql_servers;
+--------------+---------------+------+--------+--------+-------------+-----------------+---------------------+---------+----------------+---------------+
| hostgroup_id | hostname | port | status | weight | compression | max_connections | max_replication_lag | use_ssl | max_latency_ms | comment |
+--------------+---------------+------+--------+--------+-------------+-----------------+---------------------+---------+----------------+---------------+
| 100 | 192.168.1.101 | 3306 | ONLINE | 1 | 0 | 1000 | 10 | 0 | 0 | test proxysql |
| 1000 | 192.168.1.16 | 3306 | ONLINE | 1 | 0 | 1000 | 10 | 0 | 0 | test proxysql |
+--------------+---------------+------+--------+--------+-------------+-----------------+---------------------+---------+----------------+---------------+
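3. Configure the MySQL users used for the backends. They must already exist on the backend MySQL servers (135 and 16): one is the monitoring account, the other the application account.
GRANT ALL PRIVILEGES ON *.* TO 'proxysql'@'192.168.1.16' identified by 'proxysql';
GRANT ALL PRIVILEGES ON *.* TO 'sbuser'@'%' identified by 'sbuser';
After adding them on the backend MySQL servers, configure ProxySQL. Note that default_hostgroup must match the writer hostgroup defined above: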
insert into mysql_users(username,password,active,default_hostgroup,transaction_persistent) values('sbuser','sbuser',1,100,1);
admin@ 23:37: [(none)]> select * from mysql_users;
+----------+-------------------------------------------+--------+---------+-------------------+----------------+---------------+------------------------+--------------+---------+----------+-----------------+
| username | password | active | use_ssl | default_hostgroup | default_schema | schema_locked | transaction_persistent | fast_forward | backend | frontend | max_connections |
+----------+-------------------------------------------+--------+---------+-------------------+----------------+---------------+------------------------+--------------+---------+----------+-----------------+
| sbuser | sbuser | 1 | 0 | 100 | | 0 | 1 | 0 | 1 | 1 | 10000 |
+----------+-------------------------------------------+--------+---------+-------------------+----------------+---------------+------------------------+--------------+---------+----------+-----------------+
4. Configure the health-check (monitor) account
admin@ 23:37: [(none)]>set mysql-monitor_username='proxysql';
admin@ 23:37: [(none)]>set mysql-monitor_password='proxysql';
-- Apply to runtime
load mysql servers to runtime;
load mysql users to runtime;
load mysql variables to runtime;
-- Persist to disk
save mysql servers to disk;
save mysql users to disk;
save mysql variables to disk;
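Once the monitor account is loaded, it is worth checking that ProxySQL can actually connect to and ping the backends. The sketch below prints two queries against ProxySQL's monitor log tables. The helper name monitor_check_sql and the LIMIT of 4 are arbitrary choices, and the admin credentials in the comment are the ones used in this article.

```shell
#!/bin/bash
# Print admin-interface queries that show the most recent monitor results.
# A NULL connect_error / ping_error means the check succeeded.
monitor_check_sql() {
    cat <<'SQL'
SELECT hostname, port, connect_error FROM monitor.mysql_server_connect_log ORDER BY time_start_us DESC LIMIT 4;
SELECT hostname, port, ping_error FROM monitor.mysql_server_ping_log ORDER BY time_start_us DESC LIMIT 4;
SQL
}

monitor_check_sql
# To run the queries against ProxySQL:
#   monitor_check_sql | mysql -uadmin -padmin -h127.0.0.1 -P6032
```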
If mysql_users was populated with plaintext passwords, saving the users from runtime back to memory at this point replaces them with hashed values:
save mysql users to mem;
admin@ 23:39: [(none)]> select * from mysql_users;
+----------+-------------------------------------------+--------+---------+-------------------+----------------+---------------+------------------------+--------------+---------+----------+-----------------+
| username | password | active | use_ssl | default_hostgroup | default_schema | schema_locked | transaction_persistent | fast_forward | backend | frontend | max_connections |
+----------+-------------------------------------------+--------+---------+-------------------+----------------+---------------+------------------------+--------------+---------+----------+-----------------+
| sbuser | *CA96E56547F43610DDE9EB7B12B4EF4C51CDDFFC | 1 | 0 | 100 | | 0 | 1 | 0 | 1 | 1 | 10000 |
+----------+-------------------------------------------+--------+---------+-------------------+----------------+---------------+------------------------+--------------+---------+----------+-----------------+
5. Configure query routing rules
-- Route to the master (M)
admin@127.0.0.1 : (none) 04:58:11>INSERT INTO mysql_query_rules(active,match_pattern,destination_hostgroup,apply) VALUES(1,'^SELECT.*FOR UPDATE$',100,1);
Query OK, 1 row affected (0.00 sec)
-- Route to the slave (S)
admin@127.0.0.1 : (none) 05:08:17>INSERT INTO mysql_query_rules(active,match_pattern,destination_hostgroup,apply) VALUES(1,'^SELECT',1000,1);
Query OK, 1 row affected (0.00 sec)
admin@127.0.0.1 : (none) 05:09:37>load mysql query rules to runtime;
Query OK, 0 rows affected (0.00 sec)
admin@127.0.0.1 : (none) 05:09:57>save mysql query rules to disk;
Query OK, 0 rows affected (0.00 sec)
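To make the effect of the two rules concrete, here is a rough shell imitation of the routing decision. It is not how ProxySQL is implemented. It assumes the rules are evaluated in insertion order, that matching is case-insensitive (ProxySQL's default), and that anything unmatched falls back to default_hostgroup 100 of sbuser. The function name route_query is hypothetical.

```shell
#!/bin/bash
# Imitate the two query rules: return the hostgroup a query would be sent to.
route_query() {
    local q="$1"
    if printf '%s' "$q" | grep -qiE '^SELECT.*FOR UPDATE$'; then
        echo 100      # locking reads go to the writer
    elif printf '%s' "$q" | grep -qiE '^SELECT'; then
        echo 1000     # plain reads go to the reader hostgroup
    else
        echo 100      # no rule matched: default_hostgroup of sbuser
    fi
}

for q in "select * from a1" "SELECT id FROM a1 FOR UPDATE" "insert into a1 values(1)"; do
    echo "$(route_query "$q")  $q"
done
```

One consequence worth noticing: statements that do not start with SELECT, such as show databases, are treated as writes and go to hostgroup 100.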
6. Connect to port 6033 and test read/write splitting
[root@mysql2 sysbench]# mysql -usbuser -psbuser -P6033 -h192.168.1.16
sbuser@ 23:59: [(none)]> show databases;
+--------------------+
| Database |
+--------------------+
| information_schema |
| mysql |
| performance_schema |
| sys |
| z1_email |
| z1_exchange |
| z1_relation |
+--------------------+
7 rows in set (0.03 sec)
sbuser@ 00:02: [(none)]> use z1_email;
Database changed, 2 warnings
sbuser@ 00:02: [z1_email]> insert into a1 values(134);
Query OK, 1 row affected (0.01 sec)
sbuser@ 00:03: [z1_email]> insert into a1 values(146);
Query OK, 1 row affected (0.01 sec)
sbuser@ 00:03: [z1_email]> insert into a1 values(157);
Query OK, 1 row affected (0.02 sec)
sbuser@ 00:03: [z1_email]> selet * from a1;
ERROR 1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'selet * from a1' at line 1
sbuser@ 00:03: [z1_email]> select * from a1;
+------+
| id |
+------+
| 1 |
| 2 |
| 12 |
| 13 |
| 14 |
| 111 |
| 222 |
| 333 |
| 250 |
| 5 |
| 6 |
| 7 |
| 8 |
| 9 |
| 10 |
| 11 |
| 12 |
| 13 |
| 14 |
| 15 |
| 15 |
| 15 |
| 16 |
| 123 |
| 124 |
| 17 |
| 1000 |
| 1001 |
| 1002 |
| 1003 |
| 1003 |
| 1004 |
| 1004 |
| 134 |
| 146 |
| 157 |
+------+
36 rows in set (0.00 sec)
Checking through the admin interface on port 6032 confirms that reads and writes are indeed being split:
admin@ 00:10: [(none)]> select * from stats_mysql_query_digest;
+-----------+--------------------+----------+--------------------+--------------------------+------------+------------+------------+----------+----------+----------+
| hostgroup | schemaname | username | digest | digest_text | count_star | first_seen | last_seen | sum_time | min_time | max_time |
+-----------+--------------------+----------+--------------------+--------------------------+------------+------------+------------+----------+----------+----------+
| 1000 | z1_email | sbuser | 0xB17CC7AAA7E39A4A | select * from a1 | 1 | 1518278606 | 1518278606 | 2123 | 2123 | 2123 |
| 100 | z1_email | sbuser | 0x496C8B86BBC0D398 | insert into a1 values(?) | 3 | 1518278580 | 1518278588 | 30478 | 6373 | 16671 |
| 1000 | information_schema | sbuser | 0x620B328FE9D6D71A | SELECT DATABASE() | 1 | 1518278568 | 1518278568 | 508 | 508 | 508 |
| 100 | information_schema | sbuser | 0x02033E45904D3DF0 | show databases | 1 | 1518278563 | 1518278563 | 30233 | 30233 | 30233 |
+-----------+--------------------+----------+--------------------+--------------------------+------------+------------+------------+----------+----------+----------+
4 rows in set (0.00 sec)
III. Testing
1. Simulating a master crash
Analysis: how writes through ProxySQL behave after the master goes down
When the master fails, a manual MHA failover moves the VIP to the slave 192.168.1.16, so 192.168.1.16 then holds the VIP 192.168.1.101.
admin@ 00:17: [(none)]> select hostgroup_id,hostname,port,status,weight from runtime_mysql_servers;
+--------------+---------------+------+--------+--------+
| hostgroup_id | hostname | port | status | weight |
+--------------+---------------+------+--------+--------+
| 100 | 192.168.1.101 | 3306 | ONLINE | 1 |
| 1000 | 192.168.1.16 | 3306 | ONLINE | 1 |
+--------------+---------------+------+--------+--------+
2 rows in set (0.00 sec)
As shown above, in mysql_servers the writer hostname is 192.168.1.101 and the reader is 192.168.1.16. Does that mean writes work right away once we switch over manually after the master dies? Let's test it.
Simulate a master crash on the master node:
[root@mysql3 masterha]# ps -ef | grep mysql
mysql 2020 65360 0 Feb10 pts/1 00:00:58 mysqld --defaults-file=/etc/my.cnf
root 5356 65360 0 00:43 pts/1 00:00:00 grep mysql
[root@mysql3 masterha]# kill -9 2020
Then try a write through the application port 6033. It fails with a timeout:
sbuser@ 00:57: [z1_email]> insert into a1 values(158);
ERROR 9001 (HY000): Max connect timeout reached while reaching hostgroup 100 after 10001ms
Then try a read through port 6033. It also fails with a timeout (this is odd, since reads should in principle still work):
sbuser@ 18:59: [z1_email]> select * from a1;
ERROR 9001 (HY000): Max connect timeout reached while reaching hostgroup 1000 after 10000ms
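The test did not investigate why reads fail too. One thing worth checking, and this is only a guess, is whether ProxySQL has taken 192.168.1.16 out of rotation, for example because max_replication_lag=10 is set and replication from the dead master is broken. The sketch below prints two admin queries for that check. The helper name read_failure_sql is hypothetical.

```shell
#!/bin/bash
# Print admin queries showing the reader's runtime status and its
# most recent replication-lag checks as recorded by the monitor module.
read_failure_sql() {
    local host="$1"
    cat <<SQL
SELECT hostgroup_id, hostname, status FROM runtime_mysql_servers WHERE hostname = '$host';
SELECT hostname, repl_lag, error FROM monitor.mysql_server_replication_lag_log WHERE hostname = '$host' ORDER BY time_start_us DESC LIMIT 4;
SQL
}

read_failure_sql 192.168.1.16
# To run the queries against ProxySQL:
#   read_failure_sql 192.168.1.16 | mysql -uadmin -padmin -h127.0.0.1 -P6032
```

A status of SHUNNED in the first result would support this guess; ONLINE would point elsewhere.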
Now perform the manual switchover:
masterha_master_switch --global_conf=/etc/masterha/masterha_default.conf --conf=/etc/masterha/app1.conf --dead_master_host=192.168.1.135 --master_state=dead --new_master_host=192.168.1.16 --ignore_last_failover
The switchover is complete, and the VIP has moved to 192.168.1.16:
[root@mysql2 masterha]# ip addr show
1: lo: mtu 65536 qdisc noqueue state UNKNOWN
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: eth0: mtu 1500 qdisc pfifo_fast state UNKNOWN qlen 1000
link/ether 00:0c:29:92:bf:e3 brd ff:ff:ff:ff:ff:ff
inet 192.168.1.16/24 brd 192.168.1.255 scope global eth0
inet 192.168.1.101/24 scope global secondary eth0
inet6 fe80::20c:29ff:fe92:bfe3/64 scope link
valid_lft forever preferred_lft forever
Now inserts and reads through the application port 6033 work again:
sbuser@ 19:08: [z1_email]> select * from a1;
+------+
| id |
+------+
| 1 |
| 2 |
| 12 |
| 13 |
| 14 |
| 111 |
| 222 |
| 333 |
| 250 |
| 5 |
| 6 |
| 7 |
| 8 |
| 9 |
| 10 |
| 11 |
| 12 |
| 13 |
| 14 |
| 15 |
| 15 |
| 15 |
| 16 |
| 123 |
| 124 |
| 17 |
| 1000 |
| 1001 |
| 1002 |
| 1003 |
| 1003 |
| 1004 |
| 1004 |
| 134 |
| 146 |
| 157 |
+------+
36 rows in set (0.00 sec)
sbuser@ 19:08: [z1_email]> insert into a1 values(1590);
Query OK, 1 row affected (0.00 sec)
Once the old master has been brought back up, point it at the new master with CHANGE MASTER TO (replication then still has to be started with START SLAVE):
root@ 19:21: [(none)]> change master to master_host='192.168.1.16',
-> master_user='slave',
-> master_password='oracle',
-> master_auto_position=1;
Replication status is OK:
root@ 19:51: [(none)]> show slave status\G
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: 192.168.1.16
Master_User: slave
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: mysql-bin.000018
Read_Master_Log_Pos: 2231
Relay_Log_File: mysql3-relay-bin.000002
Relay_Log_Pos: 675
Relay_Master_Log_File: mysql-bin.000018
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 0
Last_Error:
Skip_Counter: 0
Exec_Master_Log_Pos: 2231
Relay_Log_Space: 883
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 0
Last_SQL_Error:
Replicate_Ignore_Server_Ids:
Master_Server_Id: 330616
Master_UUID: 25aa2017-083b-11e8-b78a-000c2992bfe3
Master_Info_File: mysql.slave_master_info
SQL_Delay: 0
SQL_Remaining_Delay: NULL
Slave_SQL_Running_State: Slave has read all relay log; waiting for more updates
Master_Retry_Count: 86400
Master_Bind:
Last_IO_Error_Timestamp:
Last_SQL_Error_Timestamp:
Master_SSL_Crl:
Master_SSL_Crlpath:
Retrieved_Gtid_Set: 25aa2017-083b-11e8-b78a-000c2992bfe3:59554
Executed_Gtid_Set: 25aa2017-083b-11e8-b78a-000c2992bfe3:1-59554,
7af79590-0840-11e8-ac17-000c29459399:1-10
Auto_Position: 1
Replicate_Rewrite_DB:
Channel_Name:
Master_TLS_Version:
1 row in set (0.00 sec)
Looking at the admin port again, only 192.168.1.16 and the VIP 192.168.1.101 are listed, and the VIP has already moved to the 192.168.1.16 machine:
[root@mysql2 opt]# mysql -uadmin -padmin -P6032 -h127.0.0.1
mysql: [Warning] Using a password on the command line interface can be insecure.
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 19
Server version: 5.5.30 (ProxySQL Admin Module)
Copyright (c) 2000, 2016, Oracle and/or its affiliates. All rights reserved.
Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
admin@ 19:53: [(none)]> select hostgroup_id,hostname,port,status,weight from runtime_mysql_servers;
+--------------+---------------+------+--------+--------+
| hostgroup_id | hostname | port | status | weight |
+--------------+---------------+------+--------+--------+
| 100 | 192.168.1.101 | 3306 | ONLINE | 1 |
| 1000 | 192.168.1.16 | 3306 | ONLINE | 1 |
+--------------+---------------+------+--------+--------+
2 rows in set (0.00 sec)
Next, add 192.168.1.135 to the reader hostgroup; here I give it a weight of 9.
insert into mysql_servers(hostgroup_id,hostname,port,weight,max_connections,max_replication_lag,comment) values(1000,'192.168.1.135',3306,9,1000,10,'test proxysql');
admin@ 19:59: [(none)]> load mysql servers to runtime;
Query OK, 0 rows affected (0.01 sec)
admin@ 19:59: [(none)]> save mysql servers to disk;
Query OK, 0 rows affected (0.05 sec)
Looking at runtime_mysql_servers, roughly nine out of ten reads are now routed to 192.168.1.135 and one in ten to 192.168.1.16, while all writes go to 192.168.1.16 (because the VIP 192.168.1.101 is currently on .16).
admin@ 19:59: [(none)]> select hostgroup_id,hostname,port,status,weight from runtime_mysql_servers;
+--------------+---------------+------+--------+--------+
| hostgroup_id | hostname      | port | status | weight |
+--------------+---------------+------+--------+--------+
| 100          | 192.168.1.101 | 3306 | ONLINE | 1      |
| 1000         | 192.168.1.135 | 3306 | ONLINE | 9      |
| 1000         | 192.168.1.16  | 3306 | ONLINE | 1      |
+--------------+---------------+------+--------+--------+
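The nine-in-ten figure follows directly from the weights in the table. A small Python sketch of that arithmetic, assuming a server's expected share is its weight divided by the total weight of the ONLINE servers in the same hostgroup (the function name `expected_share` is made up for this illustration; real routing also depends on connection limits and replication lag, so this is only the long-run expectation):

```python
from collections import defaultdict

# (hostgroup_id, hostname, status, weight) rows as shown in runtime_mysql_servers.
servers = [
    (100, "192.168.1.101", "ONLINE", 1),
    (1000, "192.168.1.135", "ONLINE", 9),
    (1000, "192.168.1.16", "ONLINE", 1),
]


def expected_share(rows):
    """Return {hostgroup_id: {hostname: fraction}} over ONLINE servers only."""
    totals = defaultdict(int)
    for hostgroup, _host, status, weight in rows:
        if status == "ONLINE":
            totals[hostgroup] += weight
    shares = defaultdict(dict)
    for hostgroup, host, status, weight in rows:
        if status == "ONLINE":
            shares[hostgroup][host] = weight / totals[hostgroup]
    return dict(shares)


shares = expected_share(servers)
print(shares[1000])  # reader hostgroup
print(shares[100])   # writer hostgroup (the VIP)
```

Hostgroup 1000 has a total weight of 10, so .135 gets 9/10 and .16 gets 1/10; hostgroup 100 contains only the VIP, so it takes every write.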
The master is now 192.168.1.16. What happens if it goes down? Will reads and writes through ProxySQL still be affected?
Let's simulate a master crash once more; this time the master is 192.168.1.16.
[root@mysql2 opt]# ps -ef |grep mysql
root      2976 21612  0 19:50 pts/4    00:00:00 mysql -uroot -px xxxx
root      2983  6583  0 19:53 pts/3    00:00:00 mysql -uadmin -px xxx -P6032 -h127.0.0.1
root      3369 15620  0 22:09 pts/1    00:00:00 grep mysql
mysql    28714 15620  0 Feb10 pts/1    00:01:51 mysqld --defaults-file=/etc/my.cnf
root     31851 21524  0 Feb10 pts/0    00:00:00 mysql -usbuser -px xxxx -P6033 -h192.168.1.16
[root@mysql2 opt]# kill -9 28714
A read through ProxySQL's client port 6033 now times out:
sbuser@ 20:35: [z1_email]> select * from a1;
ERROR 9001 (HY000): Max connect timeout reached while reaching hostgroup 1000 after 10000ms
A write through port 6033 times out as well:
sbuser@ 22:17: [z1_email]> insert into a1 values(1591);
ERROR 9001 (HY000): Max connect timeout reached while reaching hostgroup 100 after 10001ms
At this point we run a manual failover with MHA:
masterha_master_switch --global_conf=/etc/masterha/masterha_default.conf --conf=/etc/masterha/app1.conf --dead_master_host=192.168.1.16 --master_state=dead --new_master_host=192.168.1.135 --ignore_last_failover
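Because this command is typed by hand while the master is down, it is easy to swap the two host arguments. A minimal Python sketch that assembles the same argument list from the dead and new master addresses; `build_failover_command` is a hypothetical helper, and it uses only the flags and config paths that appear in the command above:

```python
import shlex


def build_failover_command(dead_master, new_master,
                           global_conf="/etc/masterha/masterha_default.conf",
                           conf="/etc/masterha/app1.conf"):
    """Return the argument list for a manual failover of a dead master."""
    if dead_master == new_master:
        raise ValueError("new master must differ from the dead master")
    return [
        "masterha_master_switch",
        f"--global_conf={global_conf}",
        f"--conf={conf}",
        f"--dead_master_host={dead_master}",
        "--master_state=dead",
        f"--new_master_host={new_master}",
        "--ignore_last_failover",
    ]


cmd = build_failover_command("192.168.1.16", "192.168.1.135")
# Print the command for review; nothing is executed here.
print(shlex.join(cmd))
```

On the MHA manager host the list could then be passed to `subprocess.run(cmd, check=True)`; the sketch only prints it so it can be checked first.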
The VIP has now moved to 192.168.1.135. Let's look at ProxySQL's admin port 6032:
admin@ 20:35: [(none)]> select hostgroup_id,hostname,port,status,weight from runtime_mysql_servers;
+--------------+---------------+------+---------+--------+
| hostgroup_id | hostname      | port | status  | weight |
+--------------+---------------+------+---------+--------+
| 100          | 192.168.1.101 | 3306 | SHUNNED | 1      |
| 1000         | 192.168.1.135 | 3306 | ONLINE  | 9      |
| 1000         | 192.168.1.16  | 3306 | SHUNNED | 1      |
+--------------+---------------+------+---------+--------+
Now go back to ProxySQL's port 6033 and see whether reads work, since 192.168.1.135 is still ONLINE:
sbuser@ 22:22: [z1_email]> select * from a1;
+------+
| id   |
+------+
|    1 |
|    2 |
|   12 |
|   13 |
|   14 |
|  111 |
|  222 |
|  333 |
|  250 |
|    5 |
|    6 |
|    7 |
|    8 |
|    9 |
|   10 |
|   11 |
|   12 |
|   13 |
|   14 |
|   15 |
|   15 |
|   15 |
|   16 |
|  123 |
|  124 |
|   17 |
| 1000 |
| 1001 |
| 1002 |
| 1003 |
| 1003 |
| 1004 |
| 1004 |
|  134 |
|  146 |
|  157 |
| 1590 |
+------+
37 rows in set (0.00 sec)
So reads work. The VIP has moved to 192.168.1.135 as well, so can we write?
sbuser@ 22:23: [z1_email]> insert into a1 values(1591);
Query OK, 1 row affected (0.14 sec)
Writes work too. Back on the admin port 6032, the VIP 192.168.1.101 has, surprisingly, gone back to ONLINE (hmm...)
admin@ 22:21: [(none)]> select hostgroup_id,hostname,port,status,weight from runtime_mysql_servers;
+--------------+---------------+------+---------+--------+
| hostgroup_id | hostname      | port | status  | weight |
+--------------+---------------+------+---------+--------+
| 100          | 192.168.1.101 | 3306 | ONLINE  | 1      |
| 1000         | 192.168.1.135 | 3306 | ONLINE  | 9      |
| 1000         | 192.168.1.16  | 3306 | SHUNNED | 1      |
+--------------+---------------+------+---------+--------+
3 rows in set (0.01 sec)
So my guess is that ProxySQL did not immediately pick up the fact that the VIP had moved and kept showing SHUNNED. This does not affect usage; only the displayed status lags behind.
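One possible explanation for this, stated as an assumption rather than something verified on this setup: a SHUNNED backend is only re-evaluated when a client query actually needs its hostgroup, after a recovery interval has passed (ProxySQL has a `mysql-shun_recovery_time_sec` variable for that interval). The toy Python model below is not ProxySQL code; the class `Backend`, its methods, and the 10-second interval are invented purely to illustrate why the admin table could keep showing SHUNNED until the first write went through.

```python
class Backend:
    """Toy model of a backend whose SHUNNED state is only re-checked on use."""

    def __init__(self, name, recovery_sec=10):
        self.name = name
        self.status = "ONLINE"
        self.shunned_at = None
        self.recovery_sec = recovery_sec  # assumed value, for illustration only

    def shun(self, now):
        self.status = "SHUNNED"
        self.shunned_at = now

    def pick_for_query(self, now, reachable):
        """Called when a client query needs this backend."""
        if self.status == "SHUNNED" and now - self.shunned_at >= self.recovery_sec:
            # Recovery window has passed: give the backend another chance.
            self.status = "ONLINE"
        if self.status == "ONLINE" and not reachable:
            self.shun(now)
        return self.status == "ONLINE"


vip = Backend("192.168.1.101")
vip.shun(now=0)                   # old master killed, VIP unreachable
status_before_query = vip.status  # what the admin port would show with no traffic
served = vip.pick_for_query(now=60, reachable=True)  # first write after failover
print(status_before_query, "->", vip.status, served)
```

In this model nothing changes the status while no query arrives, which matches the sequence seen above: SHUNNED in the admin table, a successful insert, then ONLINE.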
Finally, bring 192.168.1.16 back up and point it at the new master 192.168.1.135 with CHANGE MASTER TO:
[root@mysql2 masterha]# mysqld --defaults-file=/etc/my.cnf &
[2] 3490
[root@mysql2 masterha]# mysql -uroot -poracle
mysql: [Warning] Using a password on the command line interface can be insecure.
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 3
Server version: 5.7.17-log MySQL Community Server (GPL)
Copyright (c) 2000, 2016, Oracle and/or its affiliates. All rights reserved.
Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
root@22:28: [(none)]> change master to master_host='192.168.1.135',
    -> master_user='slave',
    -> master_password='oracle',
    -> master_auto_position=1;
Query OK, 0 rows affected, 2 warnings (0.07 sec)
root@22:35: [(none)]> start slave;
Query OK, 0 rows affected (0.02 sec)
root@22:36: [(none)]> show slave status\G;
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: 192.168.1.135
Master_User: slave
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: mysql-bin.000016
Read_Master_Log_Pos: 743
Relay_Log_File: mysql2-relay-bin.000002
Relay_Log_Pos: 675
Relay_Master_Log_File: mysql-bin.000016
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
........
Check ProxySQL's admin port 6032 again; 192.168.1.16 is still shown as SHUNNED.
admin@ 22:24: [(none)]> select hostgroup_id,hostname,port,status,weight from runtime_mysql_servers;
+--------------+---------------+------+---------+--------+
| hostgroup_id | hostname      | port | status  | weight |
+--------------+---------------+------+---------+--------+
| 100          | 192.168.1.101 | 3306 | ONLINE  | 1      |
| 1000         | 192.168.1.135 | 3306 | ONLINE  | 9      |
| 1000         | 192.168.1.16  | 3306 | SHUNNED | 1      |
+--------------+---------------+------+---------+--------+
3 rows in set (0.01 sec)
Run one query through ProxySQL's client port 6033:
sbuser@ 22:23: [z1_email]> select * from a1;
+------+
| id   |
+------+
|    1 |
|    2 |
|   12 |
|   13 |
|   14 |
|  111 |
|  222 |
|  333 |
|  250 |
|    5 |
|    6 |
|    7 |
|    8 |
|    9 |
|   10 |
|   11 |
|   12 |
|   13 |
|   14 |
|   15 |
|   15 |
|   15 |
|   16 |
|  123 |
|  124 |
|   17 |
| 1000 |
| 1001 |
| 1002 |
| 1003 |
| 1003 |
| 1004 |
| 1004 |
|  134 |
|  146 |
|  157 |
| 1590 |
| 1591 |
+------+
38 rows in set (0.00 sec)
Check the admin port 6032 once more; every server now shows ONLINE.
admin@ 22:37: [(none)]> select hostgroup_id,hostname,port,status,weight from runtime_mysql_servers;
+--------------+---------------+------+--------+--------+
| hostgroup_id | hostname      | port | status | weight |
+--------------+---------------+------+--------+--------+
| 100          | 192.168.1.101 | 3306 | ONLINE | 1      |
| 1000         | 192.168.1.135 | 3306 | ONLINE | 9      |
| 1000         | 192.168.1.16  | 3306 | ONLINE | 1      |
+--------------+---------------+------+--------+--------+
3 rows in set (0.01 sec)
To sum up:
MHA + ProxySQL gives both high availability and read/write splitting. When the master dies, MHA fails over to the slave. Because the VIP moves with the master, the writer node in ProxySQL is configured as the VIP,
so writes always land on the current master: whichever machine holds the VIP is the master.
And with a ProxySQL layout like the one below, it does not matter which machine goes down: once the switchover has been done, neither reads nor writes are affected.
admin@ 22:37: [(none)]> select hostgroup_id,hostname,port,status,weight from runtime_mysql_servers;
+--------------+---------------+------+--------+--------+
| hostgroup_id | hostname      | port | status | weight |
+--------------+---------------+------+--------+--------+
| 100          | 192.168.1.101 | 3306 | ONLINE | 1      |
| 1000         | 192.168.1.135 | 3306 | ONLINE | 9      |
| 1000         | 192.168.1.16  | 3306 | ONLINE | 1      |
+--------------+---------------+------+--------+--------+
3 rows in set (0.01 sec)
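The claim that either machine can fail without losing reads or writes can be walked through with a small Python sketch. It is a simplified model under stated assumptions, not MHA or ProxySQL logic: writes go to whichever host holds the VIP (hostgroup 100), reads go to any live host in hostgroup 1000, and the switchover simply moves the VIP to a surviving host. The function names `route` and `fail_over` are hypothetical.

```python
def route(kind, vip_holder, alive, readers):
    """Return the hosts that can serve a query of the given kind.

    "write" goes to whoever holds the VIP (hostgroup 100);
    "read" goes to any live host in the reader hostgroup (1000).
    """
    if kind == "write":
        return [vip_holder] if vip_holder in alive else []
    return [host for host in readers if host in alive]


def fail_over(vip_holder, alive, candidates):
    """Keep the VIP where it is if that host is alive, else move it."""
    if vip_holder in alive:
        return vip_holder
    for host in candidates:
        if host in alive:
            return host
    return None


hosts = ["192.168.1.16", "192.168.1.135"]
results = {}
for dead in hosts:
    alive = {h for h in hosts if h != dead}
    # Start from the final state above: 192.168.1.135 is master and holds the VIP.
    vip_holder = fail_over("192.168.1.135", alive, hosts)
    results[dead] = (route("write", vip_holder, alive, hosts),
                     route("read", vip_holder, alive, hosts))
print(results)
```

In both cases the surviving host ends up serving reads and writes, which is the behaviour observed in the two failover tests above. With only two nodes there is no redundancy left after one failure until the dead node is restored and re-attached as a slave.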