基于DRBD的Active/Passive架构
- DRBD Active/Passive architecture overview
Note: in an Active/Standby architecture, only the Active host ever serves traffic; the Standby host provides no service at all (not even MySQL reads).
- Environment
node1:
- IP: 192.168.1.189
- HostName: centos189
node2:
- IP: 192.168.1.193
- HostName: centos193
Virtual IP (VIP):
- IP: 192.168.1.198
Network and server setup
- Time synchronization
# ntpdate cn.pool.ntp.org
- SELinux
Set SELINUX to permissive or disabled:
[root@centos193 ~]# cat /etc/sysconfig/selinux
# This file controls the state of SELinux on the system.
# SELINUX= can take one of these three values:
#     enforcing - SELinux security policy is enforced.
#     permissive - SELinux prints warnings instead of enforcing.
#     disabled - No SELinux policy is loaded.
SELINUX=disabled
# SELINUXTYPE= can take one of these two values:
#     targeted - Targeted processes are protected,
#     mls - Multi Level Security protection.
SELINUXTYPE=targeted
- iptables firewall
For simplicity, the iptables firewall is disabled here:
# service iptables stop
iptables: Flushing firewall rules:                         [  OK  ]
iptables: Setting chains to policy ACCEPT: filter          [  OK  ]
iptables: Unloading modules:                               [  OK  ]
# chkconfig iptables off
Note: in a real deployment there is no need to disable the firewall; just open the relevant ports (DRBD: 7788-7789, Corosync: 3999-4000).
- Set the machine hostname
[root@centos193 ~]# cat /etc/sysconfig/network
NETWORKING=yes
HOSTNAME=centos193
[root@centos193 ~]# source /etc/sysconfig/network
[root@centos193 ~]# hostname $HOSTNAME
- Add the hostnames to /etc/hosts on every machine
[root@centos193 ~]# cat /etc/hosts
...
192.168.1.189 node1.zrwm.com centos189
192.168.1.193 node2.zrwm.com centos193
Recommendation: do not use an external DNS server (it would become one more point of failure); put these mappings in /etc/hosts on each machine instead.
- Set up passwordless SSH access between the machines
[root@centos193 ~]# ssh-keygen -t rsa -b 1024
[root@centos193 ~]# ssh-copy-id root@192.168.1.189
[root@centos189 ~]# ssh-keygen -t rsa -b 1024
[root@centos189 ~]# ssh-copy-id root@192.168.1.193
Installing and configuring DRBD
- Download and install DRBD
# wget -c http://elrepo.org/linux/elrepo/el6/x86_64/RPMS/drbd84-utils-8.4.2-1.el6.elrepo.x86_64.rpm
# wget -c http://elrepo.org/linux/elrepo/el6/x86_64/RPMS/kmod-drbd84-8.4.2-1.el6_3.elrepo.x86_64.rpm
# rpm -ivh *.rpm
warning: drbd84-utils-8.4.2-1.el6.elrepo.x86_64.rpm: Header V4 DSA/SHA1 Signature, key ID baadae52: NOKEY
Preparing...                ########################################### [100%]
   1:drbd84-utils           ########################################### [ 50%]
   2:kmod-drbd84            ########################################### [100%]
Working. This may take some time ...
Done.
- Configure DRBD
- Generate a SHA-1 value to use as the shared secret
[root@centos189 drbd]# sha1sum /etc/drbd.conf 8a6c5f3c21b84c66049456d34b4c4980468bcfb3 /etc/drbd.conf
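Hashing an arbitrary file, as above, is just a convenient way to obtain a hard-to-guess token; any 40-character hex string works equally well as the shared-secret. As a sketch (the helper name gen_secret is ours), the token could also be generated directly from /dev/urandom:

```shell
# Generate a 40-hex-character token to use as the DRBD shared-secret.
# Any hard-to-guess string works; this just mimics the width of a SHA-1 digest.
gen_secret() {
    head -c 20 /dev/urandom | od -An -tx1 | tr -d ' \n'
}

gen_secret
```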
- Create and edit the resource configuration file /etc/drbd.d/dbcluster.res
[root@centos189 drbd.d]# cat /etc/drbd.d/dbcluster.res
resource dbcluster {
    protocol C;
    net {
        cram-hmac-alg sha1;
        shared-secret "8a6c5f3c21b84c66049456d34b4c4980468bcfb3";
        after-sb-0pri discard-zero-changes;
        after-sb-1pri discard-secondary;
        after-sb-2pri disconnect;
        rr-conflict disconnect;
    }
    device /dev/drbd0;
    disk /dev/sdb1;
    meta-disk internal;
    on centos189 {
        address 192.168.1.189:7789;
    }
    on centos193 {
        address 192.168.1.193:7789;
    }
}
Parameters used in the configuration above:
RESOURCE: the resource name
PROTOCOL: protocol "C" means synchronous replication; a write is considered complete only once the remote node has confirmed it.
NET: both nodes use the same SHA-1 shared secret
after-sb-0pri: when a split brain occurs and no data has changed, the two nodes simply reconnect
after-sb-1pri: if data has changed, discard the secondary's data and resynchronize from the primary
after-sb-2pri: if the previous options are impossible, disconnect the nodes; in that case the split brain must be resolved manually
rr-conflict: if the settings above cannot be applied and DRBD has a role conflict, automatically disconnect the nodes
DEVICE: the virtual block device
DISK: the physical disk device
META-DISK: the metadata is stored on the same disk (sdb1)
ON: the nodes that make up the cluster
- Copy the DRBD configuration to the other node:
[root@centos189 drbd]# scp /etc/drbd.d/dbcluster.res root@192.168.1.193:/etc/drbd.d/
- Create the resource and file system
- Create the partition (leave it unformatted)
Create a partition on node1 and node2:
# fdisk /dev/sdb
WARNING: DOS-compatible mode is deprecated. It's strongly recommended to
switch off the mode (command 'c') and change display units to sectors (command 'u').

Command (m for help): n
Command action
   e   extended
   p   primary partition (1-4)
p
Partition number (1-4): 1
First cylinder (1-1044, default 1):
Using default value 1
Last cylinder, +cylinders or +size{K,M,G} (1-1044, default 1044): +8096M

Command (m for help): w
The partition table has been altered!

Calling ioctl() to re-read partition table.
Syncing disks.
- Create the metadata for the resource (dbcluster)
Node1:
[root@centos189 drbd]# drbdadm create-md dbcluster
md_offset 8496676864
al_offset 8496644096
bm_offset 8496381952

Found ext3 filesystem
    7911980 kB data area apparently used
    8297248 kB left usable by current configuration

Even though it looks like this would place the new meta data into
unused space, you still need to confirm, as this is only a guess.

Do you want to proceed?
[need to type 'yes' to confirm] yes

Writing meta data...
initializing activity log
NOT initializing bitmap
New drbd meta data block successfully created.
success
Node2:
[root@centos193 ~]# drbdadm create-md dbcluster
md_offset 8496676864
al_offset 8496644096
bm_offset 8496381952

Found ext3 filesystem
    7911980 kB data area apparently used
    8297248 kB left usable by current configuration

Even though it looks like this would place the new meta data into
unused space, you still need to confirm, as this is only a guess.

Do you want to proceed?
[need to type 'yes' to confirm] yes

Writing meta data...
initializing activity log
NOT initializing bitmap
New drbd meta data block successfully created.
success
- Activate the resource
- First make sure the drbd kernel module is loaded
Check whether it is loaded:
# lsmod | grep drbd
If it is not, load it:
# modprobe drbd
# lsmod | grep drbd
drbd                  317261  0
libcrc32c               1246  1 drbd
- Bring up the DRBD resource:
[root@centos189 drbd]# drbdadm up dbcluster
[root@centos193 drbd]# drbdadm up dbcluster
Check the DRBD status:
Node1:
[root@centos189 drbd]# /etc/init.d/drbd status
drbd driver loaded OK; device status:
version: 8.4.2 (api:1/proto:86-101)
GIT-hash: 7ad5f850d711223713d6dcadc3dd48860321070c build by dag@Build64R6, 2012-09-06 08:16:10
m:res        cs         ro                   ds                         p  mounted  fstype
0:dbcluster  Connected  Secondary/Secondary  Inconsistent/Inconsistent  C
Node2:
[root@centos193 drbd]# /etc/init.d/drbd status
drbd driver loaded OK; device status:
version: 8.4.2 (api:1/proto:86-101)
GIT-hash: 7ad5f850d711223713d6dcadc3dd48860321070c build by dag@Build64R6, 2012-09-06 08:16:10
m:res        cs         ro                   ds                         p  mounted  fstype
0:dbcluster  Connected  Secondary/Secondary  Inconsistent/Inconsistent  C
As this output shows, the DRBD service is running on both machines, but neither one is the "primary" host yet, so the resource (the block device) cannot be accessed.
- Start the initial synchronization
Run this on the primary node only (node1 here):
[root@centos189 drbd]# drbdadm -- --overwrite-data-of-peer primary dbcluster
Check the synchronization progress:
[root@centos189 drbd.d]# cat /proc/drbd
version: 8.4.2 (api:1/proto:86-101)
GIT-hash: 7ad5f850d711223713d6dcadc3dd48860321070c build by dag@Build64R6, 2012-09-06 08:16:10
 0: cs:SyncSource ro:Primary/Secondary ds:UpToDate/Inconsistent C r---n-
    ns:4347852 nr:0 dw:0 dr:4349592 al:0 bm:265 lo:0 pe:2 ua:2 ap:0 ep:1 wo:f oos:3951392
        [=========>..........] sync'ed: 52.5% (3856/8100)M
        finish: 0:01:26 speed: 45,852 (46,232) K/sec
[root@centos189 drbd.d]# cat /proc/drbd
version: 8.4.2 (api:1/proto:86-101)
GIT-hash: 7ad5f850d711223713d6dcadc3dd48860321070c build by dag@Build64R6, 2012-09-06 08:16:10
 0: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r-----
    ns:8297248 nr:0 dw:0 dr:8297912 al:0 bm:507 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
Notes on the fields in the output:
cs (connection state): network connection state
ro (roles): roles of the nodes (the local node's role is shown first)
ds (disk states): state of the disks
replication protocol: A, B or C (this setup uses C)
When the status shows "cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate", the synchronization has finished.
The DRBD status can also be viewed like this:
[root@centos193 drbd]# drbd-overview
  0:dbcluster/0  Connected Secondary/Primary UpToDate/UpToDate C r-----
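For scripting, the "synchronization finished" condition above can be checked mechanically. A minimal sketch that inspects a /proc/drbd resource line; the helper name drbd_synced is ours, not a DRBD tool:

```shell
# Return "yes" when a /proc/drbd resource line shows a connected,
# fully synchronized pair (both disks UpToDate); "no" otherwise.
drbd_synced() {
    case "$1" in
        *cs:Connected*ds:UpToDate/UpToDate*) echo "yes" ;;
        *) echo "no" ;;
    esac
}

drbd_synced "0: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r-----"      # yes
drbd_synced "0: cs:SyncSource ro:Primary/Secondary ds:UpToDate/Inconsistent C r---n-" # no
```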
- Create the file system
Create the file system on the primary node (Node1):
[root@centos189 drbd]# mkfs -t ext4 /dev/drbd0
mke2fs 1.41.12 (17-May-2010)
Filesystem label=
OS type: Linux
Block size=4096 (log=2)
Fragment size=4096 (log=2)
Stride=0 blocks, Stripe width=0 blocks
519168 inodes, 2074312 blocks
103715 blocks (5.00%) reserved for the super user
First data block=0
Maximum filesystem blocks=2126512128
64 block groups
32768 blocks per group, 32768 fragments per group
8112 inodes per group
Superblock backups stored on blocks:
        32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632

Writing inode tables: done
Creating journal (32768 blocks): done
Writing superblocks and filesystem accounting information: done

This filesystem will be automatically checked every 39 mounts or
180 days, whichever comes first.  Use tune2fs -c or -i to override.
Note: there is no need to do the same on the secondary node (Node2); DRBD takes care of synchronizing the raw disk data.
Also, the DRBD device does not need to be mounted on either machine (except temporarily, to install MySQL), because the cluster management software handles that. Make sure the replicated file system is only ever mounted on the active primary server.
Installing and configuring MySQL
- Install MySQL 5.6
- Create the mysql group/user (Node1, Node2)
# groupadd mysql
# useradd -g mysql mysql
- Install MySQL (Node1, Node2)
# yum -y install gcc-c++ ncurses-devel cmake
# wget -c http://dev.mysql.com/get/Downloads/MySQL-5.6/mysql-5.6.10.tar.gz/from/http://cdn.mysql.com/
# tar zxvf mysql-5.6.10.tar.gz
# cd mysql-5.6.10
# cmake . -DCMAKE_INSTALL_PREFIX=/usr/local/mysql
# make && make install
- Create the DRBD partition mount point (Node1, Node2)
# mkdir /var/lib/mysql_drbd
# mkdir /var/lib/mysql
# chown mysql:mysql -R /var/lib/mysql_drbd
# chown mysql:mysql -R /var/lib/mysql
- Initialize the MySQL database
- Before initializing, temporarily mount the DRBD file system on the primary node (Node1):
[root@centos189 ~]# mount /dev/drbd0 /var/lib/mysql_drbd/
- Initialization (Node1):
[root@centos189 mysql]# cd /usr/local/mysql
[root@centos189 mysql]# mkdir /var/lib/mysql_drbd/data
[root@centos189 mysql]# chown -R mysql:mysql /var/lib/mysql_drbd/data
[root@centos189 mysql]# chown -R mysql:mysql .
[root@centos189 mysql]# scripts/mysql_install_db --datadir=/var/lib/mysql_drbd/data --user=mysql
- After initialization:
[root@centos189 mysql]# cp support-files/mysql.server /etc/init.d/mysql
[root@centos189 mysql]# mv support-files/my-default.cnf /etc/my.cnf
[root@centos189 mysql]# chown mysql /etc/my.cnf
[root@centos189 mysql]# chmod 644 /etc/my.cnf
[root@centos189 mysql]# chown -R root .
[root@centos189 mysql]# cd /var/lib/mysql_drbd
[root@centos189 mysql_drbd]# chmod -R uog+rw *
[root@centos189 mysql_drbd]# chown -R mysql data
- Configure MySQL (Node1):
[root@centos189 mysql_drbd]# cat /etc/my.cnf
#
# /etc/my.cnf
#
[client]
port = 3306
socket = /var/lib/mysql/mysql.sock

[mysqld]
port = 3306
socket = /var/lib/mysql/mysql.sock
datadir = /var/lib/mysql_drbd/data
user = mysql
#memlock = 1
#table_open_cache = 3072
#table_definition_cache = 1024
max_heap_table_size = 64M
tmp_table_size = 64M

# Connections
max_connections = 505
max_user_connections = 500
max_allowed_packet = 16M
thread_cache_size = 32

# Buffers
sort_buffer_size = 8M
join_buffer_size = 8M
read_buffer_size = 2M
read_rnd_buffer_size = 16M

# Query Cache
#query_cache_size = 64M

# InnoDB
#innodb_buffer_pool_size = 1G
#innodb_data_file_path = ibdata1:2G:autoextend
#innodb_log_file_size = 128M
#innodb_log_files_in_group = 2

# MyISAM
myisam_recover = backup,force

# Logging
#general-log = 0
#general_log_file = /var/lib/mysql/mysql_general.log
log_warnings = 2
log_error = /var/lib/mysql/mysql_error.log
#slow_query_log = 1
#slow_query_log_file = /var/lib/mysql/mysql_slow.log
#long_query_time = 0.5
#log_queries_not_using_indexes = 1
#min_examined_row_limit = 20

# Binary Log / Replication
server_id = 1
log-bin = mysql-bin
binlog_cache_size = 1M
#sync_binlog = 8
binlog_format = row
expire_logs_days = 7
max_binlog_size = 128M

[mysqldump]
quick
max_allowed_packet = 16M

[mysql]
no_auto_rehash

[myisamchk]
#key_buffer = 512M
#sort_buffer_size = 512M
read_buffer = 8M
write_buffer = 8M

[mysqld_safe]
open-files-limit = 8192
pid-file = /var/lib/mysql/mysql.pid
- Test MySQL on the primary node (Node1)
[root@centos189 mysql_drbd]# /usr/local/mysql/bin/mysqld_safe --user=mysql > /dev/null &
[root@centos189 mysql_drbd]# mysql -uroot -p
Enter password:
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 1
Server version: 5.6.10-log Source distribution

Copyright (c) 2000, 2013, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> use test;
Database changed
mysql> show tables;
Empty set (0.10 sec)

mysql> create table tbl (a int);
Query OK, 0 rows affected (3.80 sec)

mysql> insert into tbl values (1), (2);
Query OK, 2 rows affected (0.25 sec)
Records: 2  Duplicates: 0  Warnings: 0

mysql> quit;
Bye
[root@centos189 mysql_drbd]# /usr/local/mysql/bin/mysqladmin -uroot -p shutdown
Enter password:
[1]+  Done                    /usr/local/mysql/bin/mysqld_safe --user=mysql > /dev/null
- Unmount the DRBD file system on node Node1
[root@centos189 ~]# umount /var/lib/mysql_drbd
[root@centos189 ~]# drbdadm secondary dbcluster
- Mount the DRBD file system on node Node2
[root@centos193 ~]# drbdadm primary dbcluster
[root@centos193 ~]# mount /dev/drbd0 /var/lib/mysql_drbd
[root@centos193 ~]# ll /var/lib/mysql_drbd/
total 20
drwxrwxrwx 5 mysql mysql  4096 Mar 12 09:30 data
drwxrw-rw- 2 mysql mysql 16384 Mar 10 07:49 lost+found
- Configure and test MySQL on node Node2
[root@centos193 ~]# scp centos189:/etc/my.cnf /etc/my.cnf
[root@centos193 ~]# chown mysql /etc/my.cnf
[root@centos193 ~]# chmod 644 /etc/my.cnf
[root@centos193 ~]# cd /usr/local/mysql/
[root@centos193 mysql]# cp support-files/mysql.server /etc/init.d/mysql
[root@centos193 mysql]# chown -R root:mysql .
Test MySQL:
[root@centos193 mysql]# /usr/local/mysql/bin/mysqld_safe --user=mysql > /dev/null &
[1] 15864
[root@centos193 mysql]# mysql -uroot -p
Enter password:
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 1
Server version: 5.6.10-log Source distribution

Copyright (c) 2000, 2013, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> use test;
Database changed
mysql> select * from tbl;
+------+
| a    |
+------+
|    1 |
|    2 |
+------+
2 rows in set (0.26 sec)

mysql> quit
Bye
[root@centos193 mysql]# /usr/local/mysql/bin/mysqladmin -uroot -p shutdown
Enter password:
[1]+  Done                    /usr/local/mysql/bin/mysqld_safe --user=mysql > /dev/null
- Unmount the DRBD file system on Node2 and hand it over to the cluster manager, Pacemaker
[root@centos193 mysql]# umount /var/lib/mysql_drbd
[root@centos193 mysql]# drbdadm secondary dbcluster
[root@centos193 mysql]# drbd-overview
  0:dbcluster/0  Connected Secondary/Secondary UpToDate/UpToDate C r-----
Corosync and Pacemaker
- Install the software
Install corosync (on both nodes):
# yum -y install corosync corosynclib-devel
Install pacemaker:
# yum -y install libtool-ltdl-devel libuuid-devel libxslt-devel libqb libqb-devel
# yum -y install glib2-devel bzip2-devel libxml2-devel docbook-dtds.noarch
# yum -y install pacemaker resource-agents pacemaker-libs-devel
Install cluster-glue:
# yum install cluster-glue cluster-glue-libs-devel
Install crmsh:
[root@centos189 pacemaker]# wget -c http://hg.savannah.gnu.org/hgweb/crmsh/archive/tip.tar.gz
[root@centos189 pacemaker]# tar zxvf tip.tar.gz
[root@centos189 pacemaker]# cd crmsh-6cf4ba4f2568/
[root@centos189 crmsh-6cf4ba4f2568]# ./autogen.sh
[root@centos189 crmsh-6cf4ba4f2568]# ./configure
[root@centos189 crmsh-6cf4ba4f2568]# make && make install
- Configure corosync
- Corosync key
- Generate the key used for secure communication between the nodes:
[root@centos189 ~]# corosync-keygen
Corosync Cluster Engine Authentication key generator.
Gathering 1024 bits for key from /dev/random.
Press keys on your keyboard to generate entropy.
Writing corosync key to /etc/corosync/authkey.
- Copy authkey to the other node (keep its permissions at 400):
[root@centos189 ~]# scp /etc/corosync/authkey centos193:/etc/corosync/
[root@centos189 ~]# ll /etc/corosync/authkey
-r-------- 1 root root 128 Mar 10 13:56 /etc/corosync/authkey
[root@centos193 ~]# ll /etc/corosync/authkey
-r-------- 1 root root 128 Mar 10 14:02 /etc/corosync/authkey
- Corosync configuration:
- Edit /etc/corosync/corosync.conf:
[root@centos189 ~]# cp /etc/corosync/corosync.conf.example /etc/corosync/corosync.conf
[root@centos189 ~]# vi /etc/corosync/corosync.conf
[root@centos189 ~]# cat /etc/corosync/corosync.conf
# Please read the corosync.conf.5 manual page
compatibility: whitetank

aisexec {
    user: root
    group: root
}

totem {
    version: 2
    secauth: off
    threads: 0
    interface {
        ringnumber: 0
        bindnetaddr: 192.168.1.0
        mcastaddr: 226.94.1.1
        mcastport: 4000
        ttl: 1
    }
}

logging {
    fileline: off
    to_stderr: no
    to_logfile: yes
    to_syslog: yes
    logfile: /var/log/cluster/corosync.log
    debug: off
    timestamp: on
    logger_subsys {
        subsys: AMF
        debug: off
    }
}

amf {
    mode: disabled
}
Note: Corosync uses two UDP ports, one for sending (4000) and one for receiving (3999). Whatever port N is configured in the file, the other port is N-1.
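So a firewall rule must cover both the configured port and the one below it. A trivial sketch of the N / N-1 relationship (the helper name corosync_ports is ours):

```shell
# Given the mcastport from corosync.conf, print the UDP port range to open
# in the firewall (the receive port is always mcastport - 1).
corosync_ports() {
    echo "$(( $1 - 1 ))-$1"
}

corosync_ports 4000   # prints 3999-4000
```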
- Create and edit /etc/corosync/service.d/pcmk to add the "pacemaker" service
[root@centos189 ~]# cat /etc/corosync/service.d/pcmk
service {
    # Load the Pacemaker Cluster Resource Manager
    name: pacemaker
    ver: 1
}
Copy both configuration files to the other node:
[root@centos189 ~]# scp /etc/corosync/corosync.conf centos193:/etc/corosync/corosync.conf
[root@centos189 ~]# scp /etc/corosync/service.d/pcmk centos193:/etc/corosync/service.d/pcmk
- Start corosync and Pacemaker
- Start corosync on each node and check it.
[root@centos189 ~]# /etc/init.d/corosync start
Starting Corosync Cluster Engine (corosync):               [  OK  ]
[root@centos189 ~]# /etc/init.d/corosync status
corosync (pid 24831) is running...
[root@centos189 ~]# corosync-cfgtool -s
Printing ring status.
Local node ID -1123964736
RING ID 0
        id      = 192.168.1.189
        status  = ring 0 active with no faults
[root@centos193 ~]# /etc/init.d/corosync start
Starting Corosync Cluster Engine (corosync):               [  OK  ]
[root@centos193 ~]# /etc/init.d/corosync status
corosync (pid 19251) is running...
[root@centos193 ~]# corosync-objctl | grep members
runtime.totem.pg.mrp.srp.members.-1123964736.ip=r(0) ip(192.168.1.189)
runtime.totem.pg.mrp.srp.members.-1123964736.join_count=1
runtime.totem.pg.mrp.srp.members.-1123964736.status=joined
runtime.totem.pg.mrp.srp.members.-1056855872.ip=r(0) ip(192.168.1.193)
runtime.totem.pg.mrp.srp.members.-1056855872.join_count=1
runtime.totem.pg.mrp.srp.members.-1056855872.status=joined
- Before starting Pacemaker, check the logs for errors
# cat /var/log/cluster/corosync.log
# tail -f /var/log/messages
- Start Pacemaker on each node:
[root@centos189 corosync]# /etc/init.d/pacemaker start
Starting Pacemaker Cluster Manager:                        [  OK  ]
[root@centos189 corosync]# /etc/init.d/pacemaker status
pacemakerd (pid 24895) is running...
[root@centos193 ~]# /etc/init.d/pacemaker start
Starting Pacemaker Cluster Manager:                        [  OK  ]
[root@centos193 ~]# /etc/init.d/pacemaker status
pacemakerd (pid 19417) is running...
- Check the cluster status:
[root@centos189 ~]# crm_mon -1
Last updated: Sun Mar 10 15:06:52 2013
Last change: Sun Mar 10 14:59:27 2013 via crmd on centos189
Stack: classic openais (with plugin)
Current DC: centos189 - partition with quorum
Version: 1.1.8-7.el6-394e906
2 Nodes configured, 2 expected votes
0 Resources configured.

Online: [ centos189 centos193 ]
Resource configuration
- Configure resources and constraints
- Set default properties
View the existing configuration:
[root@centos189 ~]# crm configure show
node centos189
node centos193
property $id="cib-bootstrap-options" \
        dc-version="1.1.8-7.el6-394e906" \
        cluster-infrastructure="classic openais (with plugin)" \
        expected-quorum-votes="2"
Verify that the configuration is valid:
[root@centos189 ~]# crm_verify -L -V
   error: unpack_resources:     Resource start-up disabled since no STONITH resources have been defined
   error: unpack_resources:     Either configure some or disable STONITH with the stonith-enabled option
   error: unpack_resources:     NOTE: Clusters with shared data need STONITH to ensure data integrity
Errors found during check: config not valid
  -V may provide more details
Disable STONITH to clear these errors:
[root@centos189 ~]# crm configure property stonith-enabled=false
[root@centos189 ~]# crm_verify -L
Tell the cluster to ignore quorum:
[root@centos189 ~]# crm configure property no-quorum-policy=ignore
Prevent resources from moving back after recovery:
[root@centos189 ~]# crm configure rsc_defaults resource-stickiness=100
Set the default operation timeout:
[root@centos189 www]# crm configure property default-action-timeout="180s"
Make start failures non-fatal by default:
[root@centos189 www]# crm configure property start-failure-is-fatal="false"
- Configure the DRBD resource
- Stop DRBD before configuring it:
[root@centos189 ~]# /etc/init.d/drbd stop
[root@centos193 ~]# /etc/init.d/drbd stop
- Define the DRBD resource:
[root@centos189 www]# crm configure
crm(live)configure# primitive p_drbd_mysql ocf:linbit:drbd params \
> drbd_resource="dbcluster" op monitor interval="15s" op start timeout="240s" \
> op stop timeout="100s"
- Configure the DRBD master/slave relationship (only one Master node):
crm(live)configure# ms ms_drbd_mysql p_drbd_mysql meta master-max="1" \
> master-node-max="1" clone-max="2" clone-node-max="1" notify="true"
- Configure the file system resource and its mount point:
crm(live)configure# primitive p_fs_mysql ocf:heartbeat:Filesystem \
> params device="/dev/drbd0" directory="/var/lib/mysql_drbd/" fstype="ext4"
- Configure the VIP resource:
crm(live)configure# primitive p_ip_mysql ocf:heartbeat:IPaddr2 params \
> ip="192.168.1.198" cidr_netmask="24" op monitor interval="30s"
- Configure the MySQL resource
Using the LSB agent (used in this article):
crm(live)configure# primitive p_mysql lsb:mysql \
> op monitor interval="20s" timeout="30s" \
> op start interval="0" timeout="180s" \
> op stop interval="0" timeout="240s"
Or using the OCF agent:
crm(live)configure# primitive p_mysql ocf:heartbeat:mysql params \
> binary="/usr/local/mysql/bin/mysqld_safe" config="/etc/my.cnf" \
> user="mysql" group="mysql" log="/var/lib/mysql/mysql_error.log" \
> pid="/var/lib/mysql/mysql.pid" socket="/var/lib/mysql/mysql.sock" \
> datadir="/var/lib/mysql_drbd/data" \
> op monitor interval="60s" timeout="60s" \
> op start timeout="180s" op stop timeout="240s"
- Resource group and constraints
A "group" ensures that DRBD, MySQL and the VIP run on the same node (the Master) and fixes the order in which the resources start and stop.
Start order: p_fs_mysql -> p_ip_mysql -> p_mysql
Stop order: p_mysql -> p_ip_mysql -> p_fs_mysql
crm(live)configure# group g_mysql p_fs_mysql p_ip_mysql p_mysql
The group g_mysql must always run on the Master node:
crm(live)configure# colocation c_mysql_on_drbd inf: g_mysql ms_drbd_mysql:Master
MySQL must always start after the DRBD Master has been promoted:
crm(live)configure# order o_drbd_before_mysql inf: ms_drbd_mysql:promote g_mysql:start
- Verify and commit the configuration
crm(live)configure# verify
crm(live)configure# commit
crm(live)configure# quit
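The interactive crm commands above can also be collected into one file and loaded in a single transaction, which makes the configuration reproducible on a rebuilt node. A sketch using the same resource names as this article (the file name cluster.crm is our choice; load it with `crm configure load update cluster.crm`):

```
# cluster.crm -- DRBD + MySQL + VIP resources for this cluster
primitive p_drbd_mysql ocf:linbit:drbd \
    params drbd_resource="dbcluster" \
    op monitor interval="15s" op start timeout="240s" op stop timeout="100s"
ms ms_drbd_mysql p_drbd_mysql \
    meta master-max="1" master-node-max="1" clone-max="2" clone-node-max="1" notify="true"
primitive p_fs_mysql ocf:heartbeat:Filesystem \
    params device="/dev/drbd0" directory="/var/lib/mysql_drbd/" fstype="ext4"
primitive p_ip_mysql ocf:heartbeat:IPaddr2 \
    params ip="192.168.1.198" cidr_netmask="24" op monitor interval="30s"
primitive p_mysql lsb:mysql \
    op monitor interval="20s" timeout="30s" \
    op start interval="0" timeout="180s" \
    op stop interval="0" timeout="240s"
group g_mysql p_fs_mysql p_ip_mysql p_mysql
colocation c_mysql_on_drbd inf: g_mysql ms_drbd_mysql:Master
order o_drbd_before_mysql inf: ms_drbd_mysql:promote g_mysql:start
```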
- Check the cluster status and test failover
Check the status:
[root@centos189 mysql]# crm_mon -1r
Last updated: Wed Mar 13 11:24:44 2013
Last change: Wed Mar 13 11:24:04 2013 via crm_attribute on centos193
Stack: classic openais (with plugin)
Current DC: centos189 - partition with quorum
Version: 1.1.8-7.el6-394e906
2 Nodes configured, 2 expected votes
5 Resources configured.

Online: [ centos189 centos193 ]

Full list of resources:

 Master/Slave Set: ms_drbd_mysql [p_drbd_mysql]
     Masters: [ centos189 ]
     Slaves: [ centos193 ]
 Resource Group: g_mysql
     p_fs_mysql (ocf::heartbeat:Filesystem):    Started centos189
     p_ip_mysql (ocf::heartbeat:IPaddr2):       Started centos189
     p_mysql    (lsb:mysql):                    Started centos189
Failover test:
Put Node1 into standby:
[root@centos189 ~]# crm node standby
Check the cluster status after a few minutes (on a successful failover it looks like this):
[root@centos189 ~]# crm status
Last updated: Wed Mar 13 11:29:41 2013
Last change: Wed Mar 13 11:26:46 2013 via crm_attribute on centos189
Stack: classic openais (with plugin)
Current DC: centos189 - partition with quorum
Version: 1.1.8-7.el6-394e906
2 Nodes configured, 2 expected votes
5 Resources configured.

Node centos189: standby
Online: [ centos193 ]

 Master/Slave Set: ms_drbd_mysql [p_drbd_mysql]
     Masters: [ centos193 ]
     Stopped: [ p_drbd_mysql:1 ]
 Resource Group: g_mysql
     p_fs_mysql (ocf::heartbeat:Filesystem):    Started centos193
     p_ip_mysql (ocf::heartbeat:IPaddr2):       Started centos193
     p_mysql    (lsb:mysql):                    Started centos193
Bring Node1 (centos189) back online:
[root@centos189 mysql]# crm node online
[root@centos189 mysql]# crm status
Last updated: Wed Mar 13 11:32:49 2013
Last change: Wed Mar 13 11:31:23 2013 via crm_attribute on centos189
Stack: classic openais (with plugin)
Current DC: centos189 - partition with quorum
Version: 1.1.8-7.el6-394e906
2 Nodes configured, 2 expected votes
5 Resources configured.

Online: [ centos189 centos193 ]

 Master/Slave Set: ms_drbd_mysql [p_drbd_mysql]
     Masters: [ centos193 ]
     Slaves: [ centos189 ]
 Resource Group: g_mysql
     p_fs_mysql (ocf::heartbeat:Filesystem):    Started centos193
     p_ip_mysql (ocf::heartbeat:IPaddr2):       Started centos193
     p_mysql    (lsb:mysql):                    Started centos193
Demoting the Master on network failure
- Avoid a split brain caused by network failure
Have Pacemaker ping an independent network host (such as the network router); when a host loses its network (is isolated), prevent it from being the DRBD master.
[root@centos189 ~]# crm configure
crm(live)configure# primitive p_ping ocf:pacemaker:ping params name="ping" \
> multiplier="1000" host_list="192.168.1.1" op monitor interval="15s" timeout="60s" \
> start timeout="60s"
Since both hosts need to run ping to check their own network connectivity, create a clone (cl_ping) so the ping resource runs on every host in the cluster:
crm(live)configure# clone cl_ping p_ping meta interleave="true"
Tell Pacemaker what to do with the ping results:
crm(live)configure# location l_drbd_master_on_ping ms_drbd_mysql rule $role="Master" \
> -inf: not_defined ping or ping number:lte 0
The rule above means: when a host has no ping attribute, or cannot reach at least one of the listed hosts, it gets a preference score of negative infinity (-inf),
so the location constraint (l_drbd_master_on_ping) keeps the DRBD master role away from that host. Verify and commit the configuration:
crm(live)configure# verify
WARNING: p_drbd_mysql: action monitor not advertised in meta-data, it may not be supported by the RA
crm(live)configure# commit
crm(live)configure# quit
Check that the ping resource is running:
[root@centos189 ~]# crm_mon -1
Last updated: Thu Mar 14 01:02:14 2013
Last change: Thu Mar 14 01:01:20 2013 via cibadmin on centos189
Stack: classic openais (with plugin)
Current DC: centos189 - partition with quorum
Version: 1.1.8-7.el6-394e906
2 Nodes configured, 2 expected votes
7 Resources configured.

Online: [ centos189 centos193 ]

 Master/Slave Set: ms_drbd_mysql [p_drbd_mysql]
     Masters: [ centos193 ]
     Slaves: [ centos189 ]
 Resource Group: g_mysql
     p_fs_mysql (ocf::heartbeat:Filesystem):    Started centos193
     p_ip_mysql (ocf::heartbeat:IPaddr2):       Started centos193
     p_mysql    (lsb:mysql):                    Started centos193
 Clone Set: cl_ping [p_ping]
     Started: [ centos189 centos193 ]
- Network-failure test
- Stop the network service on the current Master:
[root@centos193 ~]# service network stop
On the other node, the resources are now stopped:
[root@centos189 ~]# crm resource status
 Master/Slave Set: ms_drbd_mysql [p_drbd_mysql]
     Slaves: [ centos189 ]
     Stopped: [ p_drbd_mysql:1 ]
 Resource Group: g_mysql
     p_fs_mysql (ocf::heartbeat:Filesystem):    Stopped
     p_ip_mysql (ocf::heartbeat:IPaddr2):       Stopped
     p_mysql    (lsb:mysql):                    Stopped
 Clone Set: cl_ping [p_ping]
     Started: [ centos189 ]
     Stopped: [ p_ping:1 ]
- Restore the network service on the Master:
[root@centos193 ~]# service network start
[root@centos189 ~]# crm resource status
 Master/Slave Set: ms_drbd_mysql [p_drbd_mysql]
     Masters: [ centos193 ]
     Slaves: [ centos189 ]
 Resource Group: g_mysql
     p_fs_mysql (ocf::heartbeat:Filesystem):    Started
     p_ip_mysql (ocf::heartbeat:IPaddr2):       Started
     p_mysql    (lsb:mysql):                    Started
 Clone Set: cl_ping [p_ping]
     Started: [ centos189 centos193 ]
[root@centos189 ~]# crm status
Last updated: Thu Mar 14 01:09:51 2013
Last change: Thu Mar 14 01:09:49 2013 via crmd on centos189
Stack: classic openais (with plugin)
Current DC: centos193 - partition with quorum
Version: 1.1.8-7.el6-394e906
2 Nodes configured, 2 expected votes
7 Resources configured.

Online: [ centos189 centos193 ]

 Master/Slave Set: ms_drbd_mysql [p_drbd_mysql]
     Masters: [ centos193 ]
     Slaves: [ centos189 ]
 Resource Group: g_mysql
     p_fs_mysql (ocf::heartbeat:Filesystem):    Started centos193
     p_ip_mysql (ocf::heartbeat:IPaddr2):       Started centos193
     p_mysql    (lsb:mysql):                    Started centos193
 Clone Set: cl_ping [p_ping]
     Started: [ centos189 centos193 ]
Boot-time service settings
- Configure which services start at boot
Since DRBD, MySQL and the related services are now managed by Pacemaker, their own autostart entries must be turned off, while Corosync and Pacemaker must start with the system.
[root@centos189 ~]# chkconfig drbd off
[root@centos189 ~]# chkconfig mysql off
[root@centos189 ~]# chkconfig corosync on
[root@centos189 ~]# chkconfig pacemaker on
[root@centos193 ~]# chkconfig drbd off
[root@centos193 ~]# chkconfig mysql off
[root@centos193 ~]# chkconfig corosync on
[root@centos193 ~]# chkconfig pacemaker on
Resolving a split brain manually
- Recovering from a split brain
Even in a DRBD Active/Standby design, the data on the two hosts can diverge for various reasons. When that happens,
DRBD drops the connection between the two hosts (their relationship can be inspected with /etc/init.d/drbd status or drbd-overview).
If the log (/var/log/messages) confirms that the DRBD disconnection was caused by a split brain, identify the host holding the correct data and then have DRBD resynchronize.
- Check the DRBD host states and the log:
[root@centos189 ~]# cat /proc/drbd
version: 8.4.2 (api:1/proto:86-101)
GIT-hash: 7ad5f850d711223713d6dcadc3dd48860321070c build by dag@Build64R6, 2012-09-06 08:16:10
 0: cs:StandAlone ro:Primary/Unknown ds:UpToDate/DUnknown   r-----
    ns:32948 nr:0 dw:4 dr:34009 al:1 bm:9 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
[root@centos189 ~]# cat /var/log/messages | grep Split-Brain
Mar 14 21:11:48 node1 kernel: block drbd0: Split-Brain detected but unresolved, dropping connection!
[root@centos193 drbd.d]# cat /proc/drbd
version: 8.4.2 (api:1/proto:86-101)
GIT-hash: 7ad5f850d711223713d6dcadc3dd48860321070c build by dag@Build64R6, 2012-09-06 08:16:10
 0: cs:WFConnection ro:Secondary/Unknown ds:UpToDate/DUnknown C r-----
    ns:0 nr:32948 dw:32948 dr:0 al:0 bm:9 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
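The symptom pattern above (StandAlone with the peer's disk state shown as DUnknown) can also be spotted from a script. A minimal sketch; the helper name split_brain_suspect is ours, this is only a heuristic, and the kernel log message remains the authoritative signal:

```shell
# Flag a /proc/drbd resource line that looks like the local side of a
# split brain: the node runs StandAlone and no longer knows its peer's
# disk state (DUnknown).
split_brain_suspect() {
    case "$1" in
        *cs:StandAlone*DUnknown*) echo "suspect" ;;
        *) echo "ok" ;;
    esac
}

split_brain_suspect "0: cs:StandAlone ro:Primary/Unknown ds:UpToDate/DUnknown   r-----"   # suspect
split_brain_suspect "0: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r-----"  # ok
```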
- Resolve the split brain manually:
Here the host with the "good" data is centos189 and the host with the "bad" data is centos193. On the bad-data host centos193:
[root@centos193 ~]# drbdadm disconnect dbcluster
[root@centos193 ~]# cat /proc/drbd
version: 8.4.2 (api:1/proto:86-101)
GIT-hash: 7ad5f850d711223713d6dcadc3dd48860321070c build by dag@Build64R6, 2012-09-06 08:16:10
 0: cs:StandAlone ro:Secondary/Unknown ds:UpToDate/DUnknown   r-----
    ns:0 nr:32948 dw:32948 dr:0 al:0 bm:9 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
[root@centos193 ~]# drbdadm secondary dbcluster
[root@centos193 ~]# drbdadm connect --discard-my-data dbcluster
On the good-data host centos189 (if its cs: state below is WFConnection, the following steps are unnecessary):
[root@centos189 ~]# cat /proc/drbd
version: 8.4.2 (api:1/proto:86-101)
GIT-hash: 7ad5f850d711223713d6dcadc3dd48860321070c build by dag@Build64R6, 2012-09-06 08:16:10
 0: cs:StandAlone ro:Primary/Unknown ds:UpToDate/DUnknown   r-----
    ns:32948 nr:0 dw:4 dr:34009 al:1 bm:9 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
[root@centos189 ~]# drbdadm connect dbcluster
[root@centos189 ~]# /etc/init.d/drbd status
drbd driver loaded OK; device status:
version: 8.4.2 (api:1/proto:86-101)
GIT-hash: 7ad5f850d711223713d6dcadc3dd48860321070c build by dag@Build64R6, 2012-09-06 08:16:10
m:res        cs         ro                 ds                 p  mounted              fstype
0:dbcluster  Connected  Primary/Secondary  UpToDate/UpToDate  C  /var/lib/mysql_drbd  ext4
Please credit the source when reposting: http://www.51itstudy.com/30152.html
From the ITPUB blog: http://blog.itpub.net/14431099/viewspace-1316638/; please credit the source when reposting.