基于DRBD的Active/Passive架构
- DRBD Active/Passive architecture overview
Note: in an Active/Standby architecture, only the Active host ever serves traffic; the Standby host provides no service at all (not even MySQL reads).
- Environment
node1:
- IP: 192.168.1.189
- HostName: centos189
node2:
- IP: 192.168.1.193
- HostName: centos193
Virtual IP (VIP):
- IP: 192.168.1.198
Network and server setup
- Time synchronization
# ntpdate cn.pool.ntp.org
- SELinux
Set SELINUX to permissive or disabled:
[root@centos193 ~]# cat /etc/sysconfig/selinux
# This file controls the state of SELinux on the system.
# SELINUX= can take one of these three values:
#     enforcing - SELinux security policy is enforced.
#     permissive - SELinux prints warnings instead of enforcing.
#     disabled - No SELinux policy is loaded.
SELINUX=disabled
# SELINUXTYPE= can take one of these two values:
#     targeted - Targeted processes are protected,
#     mls - Multi Level Security protection.
SELINUXTYPE=targeted
- iptables firewall
For simplicity, the iptables firewall is disabled here:
# service iptables stop
iptables: Flushing firewall rules:                         [  OK  ]
iptables: Setting chains to policy ACCEPT: filter          [  OK  ]
iptables: Unloading modules:                               [  OK  ]
# chkconfig iptables off
Note: in a real deployment there is no need to disable the firewall; just open the relevant ports (DRBD: 7788-7789, Corosync: 3999-4000).
- Set the machine hostname
[root@centos193 ~]# cat /etc/sysconfig/network
NETWORKING=yes
HOSTNAME=centos193
[root@centos193 ~]# source /etc/sysconfig/network
[root@centos193 ~]# hostname $HOSTNAME
- Add the hostnames to /etc/hosts on every machine
[root@centos193 ~]# cat /etc/hosts
...
192.168.1.189 node1.zrwm.com centos189
192.168.1.193 node2.zrwm.com centos193
Recommendation: do not use an external DNS server (it would become one more point of failure); put these mappings in /etc/hosts on each machine instead.
- Set up passwordless SSH access between the machines
[root@centos193 ~]# ssh-keygen -t rsa -b 1024
[root@centos193 ~]# ssh-copy-id root@192.168.1.189
[root@centos189 ~]# ssh-keygen -t rsa -b 1024
[root@centos189 ~]# ssh-copy-id root@192.168.1.193
Installing and configuring DRBD
- Download and install DRBD
# wget -c http://elrepo.org/linux/elrepo/el6/x86_64/RPMS/drbd84-utils-8.4.2-1.el6.elrepo.x86_64.rpm
# wget -c http://elrepo.org/linux/elrepo/el6/x86_64/RPMS/kmod-drbd84-8.4.2-1.el6_3.elrepo.x86_64.rpm
# rpm -ivh *.rpm
warning: drbd84-utils-8.4.2-1.el6.elrepo.x86_64.rpm: Header V4 DSA/SHA1 Signature, key ID baadae52: NOKEY
Preparing...                ########################################### [100%]
   1:drbd84-utils           ########################################### [ 50%]
   2:kmod-drbd84            ########################################### [100%]
Working. This may take some time ...
Done.
- Configure DRBD
- Generate a SHA-1 value to use as the shared secret
[root@centos189 drbd]# sha1sum /etc/drbd.conf 8a6c5f3c21b84c66049456d34b4c4980468bcfb3 /etc/drbd.conf
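Hashing an arbitrary file, as above, is just a convenient way to obtain a hard-to-guess token; any 40-character hex string works equally well as the shared-secret. As a sketch (the helper name gen_secret is ours), the token could also be generated directly from /dev/urandom:

```shell
# Generate a 40-hex-character token to use as the DRBD shared-secret.
# Any hard-to-guess string works; this just mimics the width of a SHA-1 digest.
gen_secret() {
    head -c 20 /dev/urandom | od -An -tx1 | tr -d ' \n'
}

gen_secret
```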
- Create and edit the resource configuration file /etc/drbd.d/dbcluster.res
[root@centos189 drbd.d]# cat /etc/drbd.d/dbcluster.res
resource dbcluster {
    protocol C;
    net {
        cram-hmac-alg sha1;
        shared-secret "8a6c5f3c21b84c66049456d34b4c4980468bcfb3";
        after-sb-0pri discard-zero-changes;
        after-sb-1pri discard-secondary;
        after-sb-2pri disconnect;
        rr-conflict disconnect;
    }
    device /dev/drbd0;
    disk /dev/sdb1;
    meta-disk internal;
    on centos189 {
        address 192.168.1.189:7789;
    }
    on centos193 {
        address 192.168.1.193:7789;
    }
}
Parameters used in the configuration above:
RESOURCE: the resource name
PROTOCOL: protocol "C" means synchronous replication; a write is considered complete only once the remote node has confirmed it.
NET: both nodes use the same SHA-1 shared secret
after-sb-0pri: when a split brain occurs and no data has changed, the two nodes simply reconnect
after-sb-1pri: if data has changed, discard the secondary's data and resynchronize from the primary
after-sb-2pri: if the previous options are impossible, disconnect the nodes; in that case the split brain must be resolved manually
rr-conflict: if the settings above cannot be applied and DRBD has a role conflict, automatically disconnect the nodes
DEVICE: the virtual block device
DISK: the physical disk device
META-DISK: the metadata is stored on the same disk (sdb1)
ON: the nodes that make up the cluster
- Copy the DRBD configuration to the other node:
[root@centos189 drbd]# scp /etc/drbd.d/dbcluster.res root@192.168.1.193:/etc/drbd.d/
- Create the resource and file system
- Create the partition (leave it unformatted)
Create a partition on node1 and node2:
# fdisk /dev/sdb
WARNING: DOS-compatible mode is deprecated. It's strongly recommended to
switch off the mode (command 'c') and change display units to sectors (command 'u').

Command (m for help): n
Command action
   e   extended
   p   primary partition (1-4)
p
Partition number (1-4): 1
First cylinder (1-1044, default 1):
Using default value 1
Last cylinder, +cylinders or +size{K,M,G} (1-1044, default 1044): +8096M

Command (m for help): w
The partition table has been altered!

Calling ioctl() to re-read partition table.
Syncing disks.
- Create the metadata for the resource (dbcluster)
Node1:
[root@centos189 drbd]# drbdadm create-md dbcluster
md_offset 8496676864
al_offset 8496644096
bm_offset 8496381952

Found ext3 filesystem
    7911980 kB data area apparently used
    8297248 kB left usable by current configuration

Even though it looks like this would place the new meta data into
unused space, you still need to confirm, as this is only a guess.

Do you want to proceed?
[need to type 'yes' to confirm] yes

Writing meta data...
initializing activity log
NOT initializing bitmap
New drbd meta data block successfully created.
success
Node2:
[root@centos193 ~]# drbdadm create-md dbcluster
md_offset 8496676864
al_offset 8496644096
bm_offset 8496381952

Found ext3 filesystem
    7911980 kB data area apparently used
    8297248 kB left usable by current configuration

Even though it looks like this would place the new meta data into
unused space, you still need to confirm, as this is only a guess.

Do you want to proceed?
[need to type 'yes' to confirm] yes

Writing meta data...
initializing activity log
NOT initializing bitmap
New drbd meta data block successfully created.
success
- Activate the resource
- First make sure the drbd kernel module is loaded
Check whether it is loaded:
# lsmod | grep drbd
If it is not, load it:
# modprobe drbd
# lsmod | grep drbd
drbd                  317261  0
libcrc32c               1246  1 drbd
- Bring up the DRBD resource:
[root@centos189 drbd]# drbdadm up dbcluster
[root@centos193 drbd]# drbdadm up dbcluster
Check the DRBD status:
Node1:
[root@centos189 drbd]# /etc/init.d/drbd status
drbd driver loaded OK; device status:
version: 8.4.2 (api:1/proto:86-101)
GIT-hash: 7ad5f850d711223713d6dcadc3dd48860321070c build by dag@Build64R6, 2012-09-06 08:16:10
m:res        cs         ro                   ds                         p  mounted  fstype
0:dbcluster  Connected  Secondary/Secondary  Inconsistent/Inconsistent  C
Node2:
[root@centos193 drbd]# /etc/init.d/drbd status
drbd driver loaded OK; device status:
version: 8.4.2 (api:1/proto:86-101)
GIT-hash: 7ad5f850d711223713d6dcadc3dd48860321070c build by dag@Build64R6, 2012-09-06 08:16:10
m:res        cs         ro                   ds                         p  mounted  fstype
0:dbcluster  Connected  Secondary/Secondary  Inconsistent/Inconsistent  C
As this output shows, the DRBD service is running on both machines, but neither one is the "primary" host yet, so the resource (the block device) cannot be accessed.
- Start the initial synchronization
Run this on the primary node only (node1 here):
[root@centos189 drbd]# drbdadm -- --overwrite-data-of-peer primary dbcluster
Check the synchronization progress:
[root@centos189 drbd.d]# cat /proc/drbd
version: 8.4.2 (api:1/proto:86-101)
GIT-hash: 7ad5f850d711223713d6dcadc3dd48860321070c build by dag@Build64R6, 2012-09-06 08:16:10
 0: cs:SyncSource ro:Primary/Secondary ds:UpToDate/Inconsistent C r---n-
    ns:4347852 nr:0 dw:0 dr:4349592 al:0 bm:265 lo:0 pe:2 ua:2 ap:0 ep:1 wo:f oos:3951392
        [=========>..........] sync'ed: 52.5% (3856/8100)M
        finish: 0:01:26 speed: 45,852 (46,232) K/sec
[root@centos189 drbd.d]# cat /proc/drbd
version: 8.4.2 (api:1/proto:86-101)
GIT-hash: 7ad5f850d711223713d6dcadc3dd48860321070c build by dag@Build64R6, 2012-09-06 08:16:10
 0: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r-----
    ns:8297248 nr:0 dw:0 dr:8297912 al:0 bm:507 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
Notes on the fields in the output:
cs (connection state): network connection state
ro (roles): roles of the nodes (the local node's role is shown first)
ds (disk states): state of the disks
replication protocol: A, B or C (this setup uses C)
When the status shows "cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate", the synchronization has finished.
The DRBD status can also be viewed like this:
[root@centos193 drbd]# drbd-overview
  0:dbcluster/0  Connected Secondary/Primary UpToDate/UpToDate C r-----
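For scripting, the "synchronization finished" condition above can be checked mechanically. A minimal sketch that inspects a /proc/drbd resource line; the helper name drbd_synced is ours, not a DRBD tool:

```shell
# Return "yes" when a /proc/drbd resource line shows a connected,
# fully synchronized pair (both disks UpToDate); "no" otherwise.
drbd_synced() {
    case "$1" in
        *cs:Connected*ds:UpToDate/UpToDate*) echo "yes" ;;
        *) echo "no" ;;
    esac
}

drbd_synced "0: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r-----"      # yes
drbd_synced "0: cs:SyncSource ro:Primary/Secondary ds:UpToDate/Inconsistent C r---n-" # no
```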
- Create the file system
Create the file system on the primary node (Node1):
[root@centos189 drbd]# mkfs -t ext4 /dev/drbd0
mke2fs 1.41.12 (17-May-2010)
Filesystem label=
OS type: Linux
Block size=4096 (log=2)
Fragment size=4096 (log=2)
Stride=0 blocks, Stripe width=0 blocks
519168 inodes, 2074312 blocks
103715 blocks (5.00%) reserved for the super user
First data block=0
Maximum filesystem blocks=2126512128
64 block groups
32768 blocks per group, 32768 fragments per group
8112 inodes per group
Superblock backups stored on blocks:
        32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632

Writing inode tables: done
Creating journal (32768 blocks): done
Writing superblocks and filesystem accounting information: done

This filesystem will be automatically checked every 39 mounts or
180 days, whichever comes first.  Use tune2fs -c or -i to override.
Note: there is no need to do the same on the secondary node (Node2); DRBD takes care of synchronizing the raw disk data.
Also, the DRBD device does not need to be mounted on either machine (except temporarily, to install MySQL), because the cluster management software handles that. Make sure the replicated file system is only ever mounted on the active primary server.
Installing and configuring MySQL
- Install MySQL 5.6
- Create the mysql group/user (Node1, Node2)
# groupadd mysql
# useradd -g mysql mysql
- Install MySQL (Node1, Node2)
# yum -y install gcc-c++ ncurses-devel cmake
# wget -c http://dev.mysql.com/get/Downloads/MySQL-5.6/mysql-5.6.10.tar.gz/from/http://cdn.mysql.com/
# tar zxvf mysql-5.6.10.tar.gz
# cd mysql-5.6.10
# cmake . -DCMAKE_INSTALL_PREFIX=/usr/local/mysql
# make && make install
- Create the DRBD partition mount point (Node1, Node2)
# mkdir /var/lib/mysql_drbd
# mkdir /var/lib/mysql
# chown mysql:mysql -R /var/lib/mysql_drbd
# chown mysql:mysql -R /var/lib/mysql
- Initialize the MySQL database
- Before initializing, temporarily mount the DRBD file system on the primary node (Node1):
[root@centos189 ~]# mount /dev/drbd0 /var/lib/mysql_drbd/
- Initialization (Node1):
[root@centos189 mysql]# cd /usr/local/mysql
[root@centos189 mysql]# mkdir /var/lib/mysql_drbd/data
[root@centos189 mysql]# chown -R mysql:mysql /var/lib/mysql_drbd/data
[root@centos189 mysql]# chown -R mysql:mysql .
[root@centos189 mysql]# scripts/mysql_install_db --datadir=/var/lib/mysql_drbd/data --user=mysql
- After initialization:
[root@centos189 mysql]# cp support-files/mysql.server /etc/init.d/mysql
[root@centos189 mysql]# mv support-files/my-default.cnf /etc/my.cnf
[root@centos189 mysql]# chown mysql /etc/my.cnf
[root@centos189 mysql]# chmod 644 /etc/my.cnf
[root@centos189 mysql]# chown -R root .
[root@centos189 mysql]# cd /var/lib/mysql_drbd
[root@centos189 mysql_drbd]# chmod -R uog+rw *
[root@centos189 mysql_drbd]# chown -R mysql data
- Configure MySQL (Node1):
[root@centos189 mysql_drbd]# cat /etc/my.cnf
#
# /etc/my.cnf
#
[client]
port = 3306
socket = /var/lib/mysql/mysql.sock

[mysqld]
port = 3306
socket = /var/lib/mysql/mysql.sock
datadir = /var/lib/mysql_drbd/data
user = mysql
#memlock = 1
#table_open_cache = 3072
#table_definition_cache = 1024
max_heap_table_size = 64M
tmp_table_size = 64M

# Connections
max_connections = 505
max_user_connections = 500
max_allowed_packet = 16M
thread_cache_size = 32

# Buffers
sort_buffer_size = 8M
join_buffer_size = 8M
read_buffer_size = 2M
read_rnd_buffer_size = 16M

# Query Cache
#query_cache_size = 64M

# InnoDB
#innodb_buffer_pool_size = 1G
#innodb_data_file_path = ibdata1:2G:autoextend
#innodb_log_file_size = 128M
#innodb_log_files_in_group = 2

# MyISAM
myisam_recover = backup,force

# Logging
#general-log = 0
#general_log_file = /var/lib/mysql/mysql_general.log
log_warnings = 2
log_error = /var/lib/mysql/mysql_error.log
#slow_query_log = 1
#slow_query_log_file = /var/lib/mysql/mysql_slow.log
#long_query_time = 0.5
#log_queries_not_using_indexes = 1
#min_examined_row_limit = 20

# Binary Log / Replication
server_id = 1
log-bin = mysql-bin
binlog_cache_size = 1M
#sync_binlog = 8
binlog_format = row
expire_logs_days = 7
max_binlog_size = 128M

[mysqldump]
quick
max_allowed_packet = 16M

[mysql]
no_auto_rehash

[myisamchk]
#key_buffer = 512M
#sort_buffer_size = 512M
read_buffer = 8M
write_buffer = 8M

[mysqld_safe]
open-files-limit = 8192
pid-file = /var/lib/mysql/mysql.pid
- Test MySQL on the primary node (Node1)
[root@centos189 mysql_drbd]# /usr/local/mysql/bin/mysqld_safe --user=mysql > /dev/null &
[root@centos189 mysql_drbd]# mysql -uroot -p
Enter password:
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 1
Server version: 5.6.10-log Source distribution

Copyright (c) 2000, 2013, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> use test;
Database changed
mysql> show tables;
Empty set (0.10 sec)

mysql> create table tbl (a int);
Query OK, 0 rows affected (3.80 sec)

mysql> insert into tbl values (1), (2);
Query OK, 2 rows affected (0.25 sec)
Records: 2  Duplicates: 0  Warnings: 0

mysql> quit;
Bye
[root@centos189 mysql_drbd]# /usr/local/mysql/bin/mysqladmin -uroot -p shutdown
Enter password:
[1]+  Done                    /usr/local/mysql/bin/mysqld_safe --user=mysql > /dev/null
- Unmount the DRBD file system on node Node1
[root@centos189 ~]# umount /var/lib/mysql_drbd
[root@centos189 ~]# drbdadm secondary dbcluster
- Mount the DRBD file system on node Node2
[root@centos193 ~]# drbdadm primary dbcluster
[root@centos193 ~]# mount /dev/drbd0 /var/lib/mysql_drbd
[root@centos193 ~]# ll /var/lib/mysql_drbd/
total 20
drwxrwxrwx 5 mysql mysql  4096 Mar 12 09:30 data
drwxrw-rw- 2 mysql mysql 16384 Mar 10 07:49 lost+found
- Configure and test MySQL on node Node2
[root@centos193 ~]# scp centos189:/etc/my.cnf /etc/my.cnf
[root@centos193 ~]# chown mysql /etc/my.cnf
[root@centos193 ~]# chmod 644 /etc/my.cnf
[root@centos193 ~]# cd /usr/local/mysql/
[root@centos193 mysql]# cp support-files/mysql.server /etc/init.d/mysql
[root@centos193 mysql]# chown -R root:mysql .
Test MySQL:
[root@centos193 mysql]# /usr/local/mysql/bin/mysqld_safe --user=mysql > /dev/null &
[1] 15864
[root@centos193 mysql]# mysql -uroot -p
Enter password:
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 1
Server version: 5.6.10-log Source distribution

Copyright (c) 2000, 2013, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> use test;
Database changed
mysql> select * from tbl;
+------+
| a    |
+------+
|    1 |
|    2 |
+------+
2 rows in set (0.26 sec)

mysql> quit
Bye
[root@centos193 mysql]# /usr/local/mysql/bin/mysqladmin -uroot -p shutdown
Enter password:
[1]+  Done                    /usr/local/mysql/bin/mysqld_safe --user=mysql > /dev/null
- Unmount the DRBD file system on Node2 and hand it over to the cluster manager, Pacemaker
[root@centos193 mysql]# umount /var/lib/mysql_drbd
[root@centos193 mysql]# drbdadm secondary dbcluster
[root@centos193 mysql]# drbd-overview
  0:dbcluster/0  Connected Secondary/Secondary UpToDate/UpToDate C r-----
Corosync and Pacemaker
- Install the software
Install corosync (on both nodes):
# yum -y install corosync corosynclib-devel
Install pacemaker:
# yum -y install libtool-ltdl-devel libuuid-devel libxslt-devel libqb libqb-devel
# yum -y install glib2-devel bzip2-devel libxml2-devel docbook-dtds.noarch
# yum -y install pacemaker resource-agents pacemaker-libs-devel
Install cluster-glue:
# yum install cluster-glue cluster-glue-libs-devel
Install crmsh:
[root@centos189 pacemaker]# wget -c http://hg.savannah.gnu.org/hgweb/crmsh/archive/tip.tar.gz
[root@centos189 pacemaker]# tar zxvf tip.tar.gz
[root@centos189 pacemaker]# cd crmsh-6cf4ba4f2568/
[root@centos189 crmsh-6cf4ba4f2568]# ./autogen.sh
[root@centos189 crmsh-6cf4ba4f2568]# ./configure
[root@centos189 crmsh-6cf4ba4f2568]# make && make install
- Configure corosync
- Corosync key
- Generate the key used for secure communication between the nodes:
[root@centos189 ~]# corosync-keygen
Corosync Cluster Engine Authentication key generator.
Gathering 1024 bits for key from /dev/random.
Press keys on your keyboard to generate entropy.
Writing corosync key to /etc/corosync/authkey.
- Copy authkey to the other node (keep its permissions at 400):
[root@centos189 ~]# scp /etc/corosync/authkey centos193:/etc/corosync/
[root@centos189 ~]# ll /etc/corosync/authkey
-r-------- 1 root root 128 Mar 10 13:56 /etc/corosync/authkey
[root@centos193 ~]# ll /etc/corosync/authkey
-r-------- 1 root root 128 Mar 10 14:02 /etc/corosync/authkey
- Corosync configuration:
- Edit /etc/corosync/corosync.conf:
[root@centos189 ~]# cp /etc/corosync/corosync.conf.example /etc/corosync/corosync.conf
[root@centos189 ~]# vi /etc/corosync/corosync.conf
[root@centos189 ~]# cat /etc/corosync/corosync.conf
# Please read the corosync.conf.5 manual page
compatibility: whitetank

aisexec {
    user: root
    group: root
}

totem {
    version: 2
    secauth: off
    threads: 0
    interface {
        ringnumber: 0
        bindnetaddr: 192.168.1.0
        mcastaddr: 226.94.1.1
        mcastport: 4000
        ttl: 1
    }
}

logging {
    fileline: off
    to_stderr: no
    to_logfile: yes
    to_syslog: yes
    logfile: /var/log/cluster/corosync.log
    debug: off
    timestamp: on
    logger_subsys {
        subsys: AMF
        debug: off
    }
}

amf {
    mode: disabled
}
Note: Corosync uses two UDP ports, one for sending (4000) and one for receiving (3999). Whatever port N is configured in the file, the other port is N-1.
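So a firewall rule must cover both the configured port and the one below it. A trivial sketch of the N / N-1 relationship (the helper name corosync_ports is ours):

```shell
# Given the mcastport from corosync.conf, print the UDP port range to open
# in the firewall (the receive port is always mcastport - 1).
corosync_ports() {
    echo "$(( $1 - 1 ))-$1"
}

corosync_ports 4000   # prints 3999-4000
```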
- Create and edit /etc/corosync/service.d/pcmk to add the "pacemaker" service
[root@centos189 ~]# cat /etc/corosync/service.d/pcmk
service {
    # Load the Pacemaker Cluster Resource Manager
    name: pacemaker
    ver: 1
}
Copy both configuration files to the other node:
[root@centos189 ~]# scp /etc/corosync/corosync.conf centos193:/etc/corosync/corosync.conf
[root@centos189 ~]# scp /etc/corosync/service.d/pcmk centos193:/etc/corosync/service.d/pcmk
- Start corosync and Pacemaker
- Start corosync on each node and check it.
[root@centos189 ~]# /etc/init.d/corosync start
Starting Corosync Cluster Engine (corosync):               [  OK  ]
[root@centos189 ~]# /etc/init.d/corosync status
corosync (pid 24831) is running...
[root@centos189 ~]# corosync-cfgtool -s
Printing ring status.
Local node ID -1123964736
RING ID 0
        id      = 192.168.1.189
        status  = ring 0 active with no faults
[root@centos193 ~]# /etc/init.d/corosync start
Starting Corosync Cluster Engine (corosync):               [  OK  ]
[root@centos193 ~]# /etc/init.d/corosync status
corosync (pid 19251) is running...
[root@centos193 ~]# corosync-objctl | grep members
runtime.totem.pg.mrp.srp.members.-1123964736.ip=r(0) ip(192.168.1.189)
runtime.totem.pg.mrp.srp.members.-1123964736.join_count=1
runtime.totem.pg.mrp.srp.members.-1123964736.status=joined
runtime.totem.pg.mrp.srp.members.-1056855872.ip=r(0) ip(192.168.1.193)
runtime.totem.pg.mrp.srp.members.-1056855872.join_count=1
runtime.totem.pg.mrp.srp.members.-1056855872.status=joined
- Before starting Pacemaker, check the logs for errors
# cat /var/log/cluster/corosync.log
# tail -f /var/log/messages
- Start Pacemaker on each node:
[root@centos189 corosync]# /etc/init.d/pacemaker start
Starting Pacemaker Cluster Manager:                        [  OK  ]
[root@centos189 corosync]# /etc/init.d/pacemaker status
pacemakerd (pid 24895) is running...
[root@centos193 ~]# /etc/init.d/pacemaker start
Starting Pacemaker Cluster Manager:                        [  OK  ]
[root@centos193 ~]# /etc/init.d/pacemaker status
pacemakerd (pid 19417) is running...
- Check the cluster status:
[root@centos189 ~]# crm_mon -1
Last updated: Sun Mar 10 15:06:52 2013
Last change: Sun Mar 10 14:59:27 2013 via crmd on centos189
Stack: classic openais (with plugin)
Current DC: centos189 - partition with quorum
Version: 1.1.8-7.el6-394e906
2 Nodes configured, 2 expected votes
0 Resources configured.

Online: [ centos189 centos193 ]
Resource configuration
- Configure resources and constraints
- Set default properties
View the existing configuration:
[root@centos189 ~]# crm configure show
node centos189
node centos193
property $id="cib-bootstrap-options" \
        dc-version="1.1.8-7.el6-394e906" \
        cluster-infrastructure="classic openais (with plugin)" \
        expected-quorum-votes="2"
Verify that the configuration is valid:
[root@centos189 ~]# crm_verify -L -V
   error: unpack_resources:     Resource start-up disabled since no STONITH resources have been defined
   error: unpack_resources:     Either configure some or disable STONITH with the stonith-enabled option
   error: unpack_resources:     NOTE: Clusters with shared data need STONITH to ensure data integrity
Errors found during check: config not valid
  -V may provide more details
Disable STONITH to clear these errors:
[root@centos189 ~]# crm configure property stonith-enabled=false
[root@centos189 ~]# crm_verify -L
Tell the cluster to ignore quorum:
[root@centos189 ~]# crm configure property no-quorum-policy=ignore
Prevent resources from moving back after recovery:
[root@centos189 ~]# crm configure rsc_defaults resource-stickiness=100
Set the default operation timeout:
[root@centos189 www]# crm configure property default-action-timeout="180s"
Make start failures non-fatal by default:
[root@centos189 www]# crm configure property start-failure-is-fatal="false"
- Configure the DRBD resource
- Stop DRBD before configuring it:
[root@centos189 ~]# /etc/init.d/drbd stop
[root@centos193 ~]# /etc/init.d/drbd stop
- Define the DRBD resource:
[root@centos189 www]# crm configure
crm(live)configure# primitive p_drbd_mysql ocf:linbit:drbd params \
> drbd_resource="dbcluster" op monitor interval="15s" op start timeout="240s" \
> op stop timeout="100s"
- Configure the DRBD master/slave relationship (only one Master node):
crm(live)configure# ms ms_drbd_mysql p_drbd_mysql meta master-max="1" \
> master-node-max="1" clone-max="2" clone-node-max="1" notify="true"
- Configure the file system resource and its mount point:
crm(live)configure# primitive p_fs_mysql ocf:heartbeat:Filesystem \
> params device="/dev/drbd0" directory="/var/lib/mysql_drbd/" fstype="ext4"
- Configure the VIP resource:
crm(live)configure# primitive p_ip_mysql ocf:heartbeat:IPaddr2 params \
> ip="192.168.1.198" cidr_netmask="24" op monitor interval="30s"
- Configure the MySQL resource
Using the LSB agent (used in this article):
crm(live)configure# primitive p_mysql lsb:mysql \
> op monitor interval="20s" timeout="30s" \
> op start interval="0" timeout="180s" \
> op stop interval="0" timeout="240s"
Or using the OCF agent:
crm(live)configure# primitive p_mysql ocf:heartbeat:mysql params \
> binary="/usr/local/mysql/bin/mysqld_safe" config="/etc/my.cnf" \
> user="mysql" group="mysql" log="/var/lib/mysql/mysql_error.log" \
> pid="/var/lib/mysql/mysql.pid" socket="/var/lib/mysql/mysql.sock" \
> datadir="/var/lib/mysql_drbd/data" \
> op monitor interval="60s" timeout="60s" \
> op start timeout="180s" op stop timeout="240s"
- Resource group and constraints
A "group" ensures that DRBD, MySQL and the VIP run on the same node (the Master) and fixes the order in which the resources start and stop.
Start order: p_fs_mysql -> p_ip_mysql -> p_mysql
Stop order: p_mysql -> p_ip_mysql -> p_fs_mysql
crm(live)configure# group g_mysql p_fs_mysql p_ip_mysql p_mysql
The group g_mysql must always run on the Master node:
crm(live)configure# colocation c_mysql_on_drbd inf: g_mysql ms_drbd_mysql:Master
MySQL must always start after the DRBD Master has been promoted:
crm(live)configure# order o_drbd_before_mysql inf: ms_drbd_mysql:promote g_mysql:start
- Verify and commit the configuration
crm(live)configure# verify
crm(live)configure# commit
crm(live)configure# quit
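The interactive crm commands above can also be collected into one file and loaded in a single transaction, which makes the configuration reproducible on a rebuilt node. A sketch using the same resource names as this article (the file name cluster.crm is our choice; load it with `crm configure load update cluster.crm`):

```
# cluster.crm -- DRBD + MySQL + VIP resources for this cluster
primitive p_drbd_mysql ocf:linbit:drbd \
    params drbd_resource="dbcluster" \
    op monitor interval="15s" op start timeout="240s" op stop timeout="100s"
ms ms_drbd_mysql p_drbd_mysql \
    meta master-max="1" master-node-max="1" clone-max="2" clone-node-max="1" notify="true"
primitive p_fs_mysql ocf:heartbeat:Filesystem \
    params device="/dev/drbd0" directory="/var/lib/mysql_drbd/" fstype="ext4"
primitive p_ip_mysql ocf:heartbeat:IPaddr2 \
    params ip="192.168.1.198" cidr_netmask="24" op monitor interval="30s"
primitive p_mysql lsb:mysql \
    op monitor interval="20s" timeout="30s" \
    op start interval="0" timeout="180s" \
    op stop interval="0" timeout="240s"
group g_mysql p_fs_mysql p_ip_mysql p_mysql
colocation c_mysql_on_drbd inf: g_mysql ms_drbd_mysql:Master
order o_drbd_before_mysql inf: ms_drbd_mysql:promote g_mysql:start
```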
- Check the cluster status and test failover
Check the status:
[root@centos189 mysql]# crm_mon -1r
Last updated: Wed Mar 13 11:24:44 2013
Last change: Wed Mar 13 11:24:04 2013 via crm_attribute on centos193
Stack: classic openais (with plugin)
Current DC: centos189 - partition with quorum
Version: 1.1.8-7.el6-394e906
2 Nodes configured, 2 expected votes
5 Resources configured.

Online: [ centos189 centos193 ]

Full list of resources:

 Master/Slave Set: ms_drbd_mysql [p_drbd_mysql]
     Masters: [ centos189 ]
     Slaves: [ centos193 ]
 Resource Group: g_mysql
     p_fs_mysql (ocf::heartbeat:Filesystem):    Started centos189
     p_ip_mysql (ocf::heartbeat:IPaddr2):       Started centos189
     p_mysql    (lsb:mysql):                    Started centos189
Failover test:
Put Node1 into standby:
[root@centos189 ~]# crm node standby
Check the cluster status after a few minutes (on a successful failover it looks like this):
[root@centos189 ~]# crm status
Last updated: Wed Mar 13 11:29:41 2013
Last change: Wed Mar 13 11:26:46 2013 via crm_attribute on centos189
Stack: classic openais (with plugin)
Current DC: centos189 - partition with quorum
Version: 1.1.8-7.el6-394e906
2 Nodes configured, 2 expected votes
5 Resources configured.

Node centos189: standby
Online: [ centos193 ]

 Master/Slave Set: ms_drbd_mysql [p_drbd_mysql]
     Masters: [ centos193 ]
     Stopped: [ p_drbd_mysql:1 ]
 Resource Group: g_mysql
     p_fs_mysql (ocf::heartbeat:Filesystem):    Started centos193
     p_ip_mysql (ocf::heartbeat:IPaddr2):       Started centos193
     p_mysql    (lsb:mysql):                    Started centos193
Bring Node1 (centos189) back online:
[root@centos189 mysql]# crm node online
[root@centos189 mysql]# crm status
Last updated: Wed Mar 13 11:32:49 2013
Last change: Wed Mar 13 11:31:23 2013 via crm_attribute on centos189
Stack: classic openais (with plugin)
Current DC: centos189 - partition with quorum
Version: 1.1.8-7.el6-394e906
2 Nodes configured, 2 expected votes
5 Resources configured.

Online: [ centos189 centos193 ]

 Master/Slave Set: ms_drbd_mysql [p_drbd_mysql]
     Masters: [ centos193 ]
     Slaves: [ centos189 ]
 Resource Group: g_mysql
     p_fs_mysql (ocf::heartbeat:Filesystem):    Started centos193
     p_ip_mysql (ocf::heartbeat:IPaddr2):       Started centos193
     p_mysql    (lsb:mysql):                    Started centos193
Demoting the Master on network failure
- Avoid a split brain caused by network failure
Have Pacemaker ping an independent network host (such as the network router); when a host loses its network (is isolated), prevent it from being the DRBD master.
[root@centos189 ~]# crm configure
crm(live)configure# primitive p_ping ocf:pacemaker:ping params name="ping" \
> multiplier="1000" host_list="192.168.1.1" op monitor interval="15s" timeout="60s" \
> start timeout="60s"
Since both hosts need to run ping to check their own network connectivity, create a clone (cl_ping) so the ping resource runs on every host in the cluster:
crm(live)configure# clone cl_ping p_ping meta interleave="true"
Tell Pacemaker what to do with the ping results:
crm(live)configure# location l_drbd_master_on_ping ms_drbd_mysql rule $role="Master" \
> -inf: not_defined ping or ping number:lte 0
The rule above means: when a host has no ping attribute, or cannot reach at least one of the listed hosts, it gets a preference score of negative infinity (-inf),
so the location constraint (l_drbd_master_on_ping) keeps the DRBD master role away from that host. Verify and commit the configuration:
crm(live)configure# verify
WARNING: p_drbd_mysql: action monitor not advertised in meta-data, it may not be supported by the RA
crm(live)configure# commit
crm(live)configure# quit
Check that the ping resource is running:
[root@centos189 ~]# crm_mon -1
Last updated: Thu Mar 14 01:02:14 2013
Last change: Thu Mar 14 01:01:20 2013 via cibadmin on centos189
Stack: classic openais (with plugin)
Current DC: centos189 - partition with quorum
Version: 1.1.8-7.el6-394e906
2 Nodes configured, 2 expected votes
7 Resources configured.

Online: [ centos189 centos193 ]

 Master/Slave Set: ms_drbd_mysql [p_drbd_mysql]
     Masters: [ centos193 ]
     Slaves: [ centos189 ]
 Resource Group: g_mysql
     p_fs_mysql (ocf::heartbeat:Filesystem):    Started centos193
     p_ip_mysql (ocf::heartbeat:IPaddr2):       Started centos193
     p_mysql    (lsb:mysql):                    Started centos193
 Clone Set: cl_ping [p_ping]
     Started: [ centos189 centos193 ]
- Network-failure test
- Stop the network service on the current Master:
[root@centos193 ~]# service network stop
On the other node, the resources are now stopped:
[root@centos189 ~]# crm resource status
 Master/Slave Set: ms_drbd_mysql [p_drbd_mysql]
     Slaves: [ centos189 ]
     Stopped: [ p_drbd_mysql:1 ]
 Resource Group: g_mysql
     p_fs_mysql (ocf::heartbeat:Filesystem):    Stopped
     p_ip_mysql (ocf::heartbeat:IPaddr2):       Stopped
     p_mysql    (lsb:mysql):                    Stopped
 Clone Set: cl_ping [p_ping]
     Started: [ centos189 ]
     Stopped: [ p_ping:1 ]
- Restore the network service on the Master:
[root@centos193 ~]# service network start
[root@centos189 ~]# crm resource status
 Master/Slave Set: ms_drbd_mysql [p_drbd_mysql]
     Masters: [ centos193 ]
     Slaves: [ centos189 ]
 Resource Group: g_mysql
     p_fs_mysql (ocf::heartbeat:Filesystem):    Started
     p_ip_mysql (ocf::heartbeat:IPaddr2):       Started
     p_mysql    (lsb:mysql):                    Started
 Clone Set: cl_ping [p_ping]
     Started: [ centos189 centos193 ]
[root@centos189 ~]# crm status
Last updated: Thu Mar 14 01:09:51 2013
Last change: Thu Mar 14 01:09:49 2013 via crmd on centos189
Stack: classic openais (with plugin)
Current DC: centos193 - partition with quorum
Version: 1.1.8-7.el6-394e906
2 Nodes configured, 2 expected votes
7 Resources configured.

Online: [ centos189 centos193 ]

 Master/Slave Set: ms_drbd_mysql [p_drbd_mysql]
     Masters: [ centos193 ]
     Slaves: [ centos189 ]
 Resource Group: g_mysql
     p_fs_mysql (ocf::heartbeat:Filesystem):    Started centos193
     p_ip_mysql (ocf::heartbeat:IPaddr2):       Started centos193
     p_mysql    (lsb:mysql):                    Started centos193
 Clone Set: cl_ping [p_ping]
     Started: [ centos189 centos193 ]
Boot-time service settings
- Configure which services start at boot
Since DRBD, MySQL and the related services are now managed by Pacemaker, their own autostart entries must be turned off, while Corosync and Pacemaker must start with the system.
[root@centos189 ~]# chkconfig drbd off
[root@centos189 ~]# chkconfig mysql off
[root@centos189 ~]# chkconfig corosync on
[root@centos189 ~]# chkconfig pacemaker on
[root@centos193 ~]# chkconfig drbd off
[root@centos193 ~]# chkconfig mysql off
[root@centos193 ~]# chkconfig corosync on
[root@centos193 ~]# chkconfig pacemaker on
Resolving a split brain manually
- Recovering from a split brain
Even in a DRBD Active/Standby design, the data on the two hosts can diverge for various reasons. When that happens,
DRBD drops the connection between the two hosts (their relationship can be inspected with /etc/init.d/drbd status or drbd-overview).
If the log (/var/log/messages) confirms that the DRBD disconnection was caused by a split brain, identify the host holding the correct data and then have DRBD resynchronize.
- Check the DRBD host states and the log:
[root@centos189 ~]# cat /proc/drbd
version: 8.4.2 (api:1/proto:86-101)
GIT-hash: 7ad5f850d711223713d6dcadc3dd48860321070c build by dag@Build64R6, 2012-09-06 08:16:10
 0: cs:StandAlone ro:Primary/Unknown ds:UpToDate/DUnknown   r-----
    ns:32948 nr:0 dw:4 dr:34009 al:1 bm:9 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
[root@centos189 ~]# cat /var/log/messages | grep Split-Brain
Mar 14 21:11:48 node1 kernel: block drbd0: Split-Brain detected but unresolved, dropping connection!
[root@centos193 drbd.d]# cat /proc/drbd
version: 8.4.2 (api:1/proto:86-101)
GIT-hash: 7ad5f850d711223713d6dcadc3dd48860321070c build by dag@Build64R6, 2012-09-06 08:16:10
 0: cs:WFConnection ro:Secondary/Unknown ds:UpToDate/DUnknown C r-----
    ns:0 nr:32948 dw:32948 dr:0 al:0 bm:9 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
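The symptom pattern above (StandAlone with the peer's disk state shown as DUnknown) can also be spotted from a script. A minimal sketch; the helper name split_brain_suspect is ours, this is only a heuristic, and the kernel log message remains the authoritative signal:

```shell
# Flag a /proc/drbd resource line that looks like the local side of a
# split brain: the node runs StandAlone and no longer knows its peer's
# disk state (DUnknown).
split_brain_suspect() {
    case "$1" in
        *cs:StandAlone*DUnknown*) echo "suspect" ;;
        *) echo "ok" ;;
    esac
}

split_brain_suspect "0: cs:StandAlone ro:Primary/Unknown ds:UpToDate/DUnknown   r-----"   # suspect
split_brain_suspect "0: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r-----"  # ok
```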
- Resolve the split brain manually:
Here the host with the "good" data is centos189 and the host with the "bad" data is centos193. On the bad-data host centos193:
[root@centos193 ~]# drbdadm disconnect dbcluster
[root@centos193 ~]# cat /proc/drbd
version: 8.4.2 (api:1/proto:86-101)
GIT-hash: 7ad5f850d711223713d6dcadc3dd48860321070c build by dag@Build64R6, 2012-09-06 08:16:10
 0: cs:StandAlone ro:Secondary/Unknown ds:UpToDate/DUnknown   r-----
    ns:0 nr:32948 dw:32948 dr:0 al:0 bm:9 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
[root@centos193 ~]# drbdadm secondary dbcluster
[root@centos193 ~]# drbdadm connect --discard-my-data dbcluster
On the good-data host centos189 (if its cs: state below is WFConnection, the following steps are unnecessary):
[root@centos189 ~]# cat /proc/drbd
version: 8.4.2 (api:1/proto:86-101)
GIT-hash: 7ad5f850d711223713d6dcadc3dd48860321070c build by dag@Build64R6, 2012-09-06 08:16:10
 0: cs:StandAlone ro:Primary/Unknown ds:UpToDate/DUnknown   r-----
    ns:32948 nr:0 dw:4 dr:34009 al:1 bm:9 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
[root@centos189 ~]# drbdadm connect dbcluster
[root@centos189 ~]# /etc/init.d/drbd status
drbd driver loaded OK; device status:
version: 8.4.2 (api:1/proto:86-101)
GIT-hash: 7ad5f850d711223713d6dcadc3dd48860321070c build by dag@Build64R6, 2012-09-06 08:16:10
m:res        cs         ro                 ds                 p  mounted              fstype
0:dbcluster  Connected  Primary/Secondary  UpToDate/UpToDate  C  /var/lib/mysql_drbd  ext4
Please credit the source when reposting: http://www.51itstudy.com/30152.html
From the ITPUB blog: http://blog.itpub.net/14431099/viewspace-1316638/; please credit the source when reposting.