Highly available MySQL with DRBD + Pacemaker + Corosync

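DRBD-based Active/Passive architecture
  • Overview of the DRBD Active/Passive architecture
    [Figure: drbd_active_standby]

    Note: in an Active/Standby architecture only the Active host ever serves clients; the Standby host provides no service of any kind (not even MySQL reads).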

  • Environment

    node1:
    - IP: 192.168.1.189
    - HostName: centos189
    node2:
    - IP: 192.168.1.193
    - HostName: centos193

    Virtual IP (VIP):
    - IP: 192.168.1.198

Network and server setup
  • Time synchronization
    # ntpdate cn.pool.ntp.org
  • Configure SELinux

    Set SELINUX to permissive or disabled:

    [root@centos193 ~]# cat /etc/sysconfig/selinux
    # This file controls the state of SELinux on the system.
    # SELINUX= can take one of these three values:
    #     enforcing - SELinux security policy is enforced.
    #     permissive - SELinux prints warnings instead of enforcing.
    #     disabled - No SELinux policy is loaded.
    SELINUX=disabled
    # SELINUXTYPE= can take one of these two values:
    #     targeted - Targeted processes are protected,
    #     mls - Multi Level Security protection.
    SELINUXTYPE=targeted
  • iptables firewall settings

    For convenience, the iptables firewall is turned off here:

    # service iptables stop
    iptables: Flushing firewall rules:                         [  OK  ]
    iptables: Setting chains to policy ACCEPT: filter          [  OK  ]
    iptables: Unloading modules:                               [  OK  ]
    # chkconfig iptables off

    Note: in a real environment the firewall does not have to be turned off; opening the relevant ports is enough (DRBD: 7788-7789, Corosync: 3999-4000).
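
    As a sketch of the "open only the required ports" alternative, the function below prints iptables rules instead of applying them. It assumes DRBD replicates over TCP and Corosync talks over UDP, using the port ranges from the note above. The helper name ha_firewall_rules and the choice to restrict traffic to the peer's address are illustrative assumptions, not part of the original setup.

```shell
#!/bin/sh
# Print (do not apply) iptables rules that admit cluster traffic from one peer.
# Assumption: DRBD uses TCP 7788-7789 and Corosync uses UDP 3999-4000,
# as listed in the note above. ha_firewall_rules is a hypothetical helper name.
# Usage: ha_firewall_rules PEER_IP
ha_firewall_rules() (
    peer="$1"
    [ -n "$peer" ] || { echo "usage: ha_firewall_rules PEER_IP" >&2; exit 1; }
    echo "iptables -I INPUT -p tcp -s $peer --dport 7788:7789 -j ACCEPT"
    echo "iptables -I INPUT -p udp -s $peer --dport 3999:4000 -j ACCEPT"
)
```

    On node1, ha_firewall_rules 192.168.1.193 prints two rules for review; once they have been run, service iptables save keeps them across reboots.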

  • Set the machine hostname
    [root@centos193 ~]# cat /etc/sysconfig/network
    NETWORKING=yes
    HOSTNAME=centos193
    [root@centos193 ~]# source /etc/sysconfig/network
    [root@centos193 ~]# hostname $HOSTNAME
  • Add the hostnames to /etc/hosts on every machine
    [root@centos193 ~]# cat /etc/hosts
    ...
    192.168.1.189 node1.zrwm.com centos189
    192.168.1.193 node2.zrwm.com centos193

    Recommendation: do not rely on an external DNS service (it becomes an extra point of failure); put these mappings in /etc/hosts on every machine instead.
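
    Because every node depends on these local mappings, it can help to check them before going further. The following is a minimal sketch; missing_hosts is a hypothetical helper name, and it only checks that each name appears in some non-comment line after the IP column.

```shell
#!/bin/sh
# Report cluster hostnames that are missing from a hosts file.
# missing_hosts is a hypothetical helper, not a standard tool.
# Usage: missing_hosts HOSTS_FILE NAME...
missing_hosts() (
    file="$1"; shift
    rc=0
    for name in "$@"; do
        # Skip comment lines, then look for the name in any column after the IP.
        if ! awk -v n="$name" '
                /^[[:space:]]*#/ { next }
                { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
                END { exit found ? 0 : 1 }' "$file"; then
            echo "missing: $name"
            rc=1
        fi
    done
    exit "$rc"
)
```

    Running missing_hosts /etc/hosts centos189 centos193 on each node prints nothing and returns 0 when both names resolve locally.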

  • Configure passwordless SSH between the machines
    [root@centos193 ~]# ssh-keygen -t rsa -b 1024
    [root@centos193 ~]# ssh-copy-id root@192.168.1.189
    [root@centos189 ~]# ssh-keygen -t rsa -b 1024
    [root@centos189 ~]# ssh-copy-id root@192.168.1.193
Installing and configuring DRBD
  • Download and install DRBD
    # wget -c http://elrepo.org/linux/elrepo/el6/x86_64/RPMS/drbd84-utils-8.4.2-1.el6.elrepo.x86_64.rpm
    # wget -c http://elrepo.org/linux/elrepo/el6/x86_64/RPMS/kmod-drbd84-8.4.2-1.el6_3.elrepo.x86_64.rpm
    # rpm -ivh *.rpm
    warning: drbd84-utils-8.4.2-1.el6.elrepo.x86_64.rpm: Header V4 DSA/SHA1 Signature, key ID baadae52: NOKEY
    Preparing...                ########################################### [100%]
       1:drbd84-utils           ########################################### [ 50%]
       2:kmod-drbd84            ########################################### [100%]
    Working. This may take some time ...
    Done.
  • Configure DRBD
    • Obtain a SHA1 value to use as the shared-secret
      [root@centos189 drbd]# sha1sum /etc/drbd.conf
      8a6c5f3c21b84c66049456d34b4c4980468bcfb3  /etc/drbd.conf
    • Create and edit the resource configuration file: /etc/drbd.d/dbcluster.res
      [root@centos189 drbd.d]# cat /etc/drbd.d/dbcluster.res 
      resource dbcluster {
          protocol C;
          net {
              cram-hmac-alg sha1;
              shared-secret "8a6c5f3c21b84c66049456d34b4c4980468bcfb3";
              after-sb-0pri discard-zero-changes;
              after-sb-1pri discard-secondary;
              after-sb-2pri disconnect;
              rr-conflict disconnect;
          }
          device    /dev/drbd0;
          disk      /dev/sdb1;
          meta-disk internal;
          on centos189 {
              address   192.168.1.189:7789;
          }
          on centos193 {
              address   192.168.1.193:7789;
          }
      }

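      Parameters used in the configuration above:
      RESOURCE: name of the resource.
      PROTOCOL: protocol "C" is synchronous; a write is considered complete only after the remote node has acknowledged it.
      NET: both nodes must use the same SHA1 key (shared-secret).
      after-sb-0pri: split brain detected while neither node is Primary; with discard-zero-changes, a node that made no changes takes the other node's data and the nodes reconnect normally.
      after-sb-1pri: split brain detected while one node is Primary; discard-secondary drops the Secondary's changes and resynchronizes it from the Primary.
      after-sb-2pri: split brain detected while both nodes are Primary; disconnect drops the connection, and the split brain must then be resolved manually.
      rr-conflict: if the resynchronization decision conflicts with the nodes' current roles, disconnect drops the connection between the nodes.
      DEVICE: the virtual (DRBD) block device.
      DISK: the backing physical disk device.
      META-DISK: internal means the metadata is kept on the same disk (sdb1).
      ON: the nodes that make up the cluster.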
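
      The shared-secret only has to be the same string on both nodes. Hashing /etc/drbd.conf yields a value that anyone with the same package can reproduce, so one alternative (a sketch, not the method used in this article) is to hash random bytes. gen_drbd_secret is a hypothetical helper name.

```shell
#!/bin/sh
# Generate a 40-hex-character secret by hashing 64 random bytes.
# gen_drbd_secret is a hypothetical helper, not part of the DRBD tools.
gen_drbd_secret() {
    head -c 64 /dev/urandom | sha1sum | cut -d' ' -f1
}
```

      The printed value would go into shared-secret in dbcluster.res before the file is copied to the second node; drbdadm dump dbcluster can then be used to confirm that the file still parses.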

    • Copy the DRBD configuration to the other node (node2):
      [root@centos189 drbd]# scp /etc/drbd.d/dbcluster.res root@192.168.1.193:/etc/drbd.d/
  • Create the resource and filesystem
    • Create the partition (not yet formatted)
      Create the partition on both node1 and node2:


      # fdisk /dev/sdb
      WARNING: DOS-compatible mode is deprecated. It's strongly recommended to
               switch off the mode (command 'c') and change display units to
               sectors (command 'u').
      
      Command (m for help): n
      Command action
         e   extended
         p   primary partition (1-4)
      p
      Partition number (1-4): 1
      First cylinder (1-1044, default 1): 
      Using default value 1
      Last cylinder, +cylinders or +size{K,M,G} (1-1044, default 1044): +8096M
      
      Command (m for help): w
      The partition table has been altered!
      
      Calling ioctl() to re-read partition table.
      Syncing disks.
    • Create metadata for the resource (dbcluster)
      Node1:


      [root@centos189 drbd]# drbdadm create-md dbcluster
      md_offset 8496676864
      al_offset 8496644096
      bm_offset 8496381952
      
      Found ext3 filesystem
           7911980 kB data area apparently used
           8297248 kB left usable by current configuration
      
      Even though it looks like this would place the new meta data into
      unused space, you still need to confirm, as this is only a guess.
      
      Do you want to proceed?
      [need to type 'yes' to confirm] yes
      
      Writing meta data...
      initializing activity log
      NOT initializing bitmap
      New drbd meta data block successfully created.
      success

      Node2:

      [root@centos193 ~]# drbdadm create-md dbcluster
      md_offset 8496676864
      al_offset 8496644096
      bm_offset 8496381952
      
      Found ext3 filesystem
           7911980 kB data area apparently used
           8297248 kB left usable by current configuration
      
      Even though it looks like this would place the new meta data into
      unused space, you still need to confirm, as this is only a guess.
      
      Do you want to proceed?
      [need to type 'yes' to confirm] yes
      
      Writing meta data...
      initializing activity log
      NOT initializing bitmap
      New drbd meta data block successfully created.
      success
    • Activate the resource
      - First make sure the drbd module is loaded
      Check whether it is loaded:


      # lsmod | grep drbd

      If it is not loaded, load it:

      # modprobe drbd
      # lsmod | grep drbd
      drbd                  317261  0 
      libcrc32c               1246  1 drbd

      - Start DRBD for the resource on both nodes:

      [root@centos189 drbd]# drbdadm up dbcluster
      [root@centos193 drbd]# drbdadm up dbcluster

      Check the DRBD status:
      Node1:

      [root@centos189 drbd]# /etc/init.d/drbd status
      drbd driver loaded OK; device status:
      version: 8.4.2 (api:1/proto:86-101)
      GIT-hash: 7ad5f850d711223713d6dcadc3dd48860321070c build by dag@Build64R6, 2012-09-06 08:16:10
      m:res        cs         ro                   ds                         p  mounted  fstype
      0:dbcluster  Connected Secondary/Secondary Inconsistent/Inconsistent  C

      Node2:

      [root@centos193 drbd]# /etc/init.d/drbd status
      drbd driver loaded OK; device status:
      version: 8.4.2 (api:1/proto:86-101)
      GIT-hash: 7ad5f850d711223713d6dcadc3dd48860321070c build by dag@Build64R6, 2012-09-06 08:16:10
      m:res        cs         ro                   ds                         p  mounted  fstype
      0:dbcluster  Connected Secondary/Secondary Inconsistent/Inconsistent  C

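      The output above shows DRBD running on both machines, but neither of them is the primary host yet, so the resource (block device) cannot be accessed.

    • Start the initial synchronization
      Run this on the primary node only (node1 here).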


      [root@centos189 drbd]# drbdadm -- --overwrite-data-of-peer primary dbcluster

      Check the synchronization status:

      [root@centos189 drbd.d]# cat /proc/drbd 
      version: 8.4.2 (api:1/proto:86-101)
      GIT-hash: 7ad5f850d711223713d6dcadc3dd48860321070c build by dag@Build64R6, 2012-09-06 08:16:10
       0: cs:SyncSource ro:Primary/Secondary ds:UpToDate/Inconsistent C r---n-
          ns:4347852 nr:0 dw:0 dr:4349592 al:0 bm:265 lo:0 pe:2 ua:2 ap:0 ep:1 wo:f oos:3951392
      	[=========>..........] sync'ed: 52.5% (3856/8100)M
      	finish: 0:01:26 speed: 45,852 (46,232) K/sec
      [root@centos189 drbd.d]# cat /proc/drbd 
      version: 8.4.2 (api:1/proto:86-101)
      GIT-hash: 7ad5f850d711223713d6dcadc3dd48860321070c build by dag@Build64R6, 2012-09-06 08:16:10
       0: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r-----
          ns:8297248 nr:0 dw:0 dr:8297912 al:0 bm:507 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0

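      Notes on the output above:

      cs (connection state): state of the network connection
      ro (roles): roles of the nodes (the local node is shown first)
      ds (disk states): state of the disks (the local disk is shown first)
      Replication protocol: A, B or C (C in this setup)

      A state of "cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate" means that synchronization has finished.

      The DRBD state can also be checked this way: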

      [root@centos193 drbd]# drbd-overview 
        0:dbcluster/0  Connected Secondary/Primary UpToDate/UpToDate C r-----
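
      The "synchronization has finished" condition described above can be checked from a script. This is a minimal sketch that reads the first device line of a /proc/drbd-style file; drbd_synced is a hypothetical helper name, and it assumes a single DRBD device as in this setup.

```shell
#!/bin/sh
# Succeed when the first DRBD device in a /proc/drbd-style file is connected
# and both disks are UpToDate. drbd_synced is a hypothetical helper name.
# Usage: drbd_synced [FILE]   (FILE defaults to /proc/drbd)
drbd_synced() (
    file="${1:-/proc/drbd}"
    # The device line looks like:
    #  0: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r-----
    line="$(grep -m 1 'cs:' "$file")" || exit 2
    cs="$(printf '%s\n' "$line" | sed -n 's/.*cs:\([^ ]*\).*/\1/p')"
    ds="$(printf '%s\n' "$line" | sed -n 's/.*ds:\([^ ]*\).*/\1/p')"
    echo "cs=$cs ds=$ds"
    [ "$cs" = "Connected" ] && [ "$ds" = "UpToDate/UpToDate" ]
)
```

      A loop such as until drbd_synced >/dev/null; do sleep 10; done would wait for the initial sync before the filesystem is created.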
    • Create the filesystem
      Create the filesystem on the primary node (Node1):


      [root@centos189 drbd]# mkfs -t ext4 /dev/drbd0
      mke2fs 1.41.12 (17-May-2010)
      Filesystem label=
      OS type: Linux
      Block size=4096 (log=2)
      Fragment size=4096 (log=2)
      Stride=0 blocks, Stripe width=0 blocks
      519168 inodes, 2074312 blocks
      103715 blocks (5.00%) reserved for the super user
      First data block=0
      Maximum filesystem blocks=2126512128
      64 block groups
      32768 blocks per group, 32768 fragments per group
      8112 inodes per group
      Superblock backups stored on blocks: 
      	32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632
      
      Writing inode tables: done                            
      Creating journal (32768 blocks): done
      Writing superblocks and filesystem accounting information: done
      
      This filesystem will be automatically checked every 39 mounts or
      180 days, whichever comes first.  Use tune2fs -c or -i to override.

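      Note: there is no need to repeat this on the secondary node (Node2), because DRBD replicates the raw block data.
      The DRBD filesystem also does not need to be mounted permanently on either machine (it is mounted temporarily while MySQL is installed), because the cluster manager handles mounting. Make sure the replicated filesystem is only ever mounted on the Active server.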

Installing and configuring MySQL
  • Installing MySQL 5.6
    • Create the mysql group and user (Node1, Node2)
      # groupadd mysql
      # useradd -g mysql mysql
    • Install MySQL (Node1, Node2)
      # yum -y install gcc-c++  ncurses-devel  cmake
      # wget -c http://dev.mysql.com/get/Downloads/MySQL-5.6/mysql-5.6.10.tar.gz/from/http://cdn.mysql.com/
      # tar zxvf mysql-5.6.10.tar.gz
      # cd mysql-5.6.10
      # cmake . -DCMAKE_INSTALL_PREFIX=/usr/local/mysql
      # make && make install
    • Create the mount directory for the DRBD partition (Node1, Node2)
      # mkdir /var/lib/mysql_drbd
      # mkdir /var/lib/mysql
      # chown mysql:mysql -R /var/lib/mysql_drbd
      # chown mysql:mysql -R /var/lib/mysql
    • Initialize the MySQL database
      - Before initializing, temporarily mount the DRBD filesystem on the primary node (Node1)


      [root@centos189 ~]# mount /dev/drbd0 /var/lib/mysql_drbd/

      - Initialization (Node1):

      [root@centos189 mysql]# cd /usr/local/mysql
      [root@centos189 mysql]# mkdir /var/lib/mysql_drbd/data
      [root@centos189 mysql]# chown -R mysql:mysql /var/lib/mysql_drbd/data
      [root@centos189 mysql]# chown -R mysql:mysql .
      [root@centos189 mysql]# scripts/mysql_install_db --datadir=/var/lib/mysql_drbd/data --user=mysql

      - After initialization:

      [root@centos189 mysql]# cp support-files/mysql.server /etc/init.d/mysql
      [root@centos189 mysql]# mv support-files/my-default.cnf /etc/my.cnf
      [root@centos189 mysql]# chown mysql /etc/my.cnf 
      [root@centos189 mysql]# chmod 644 /etc/my.cnf 
      [root@centos189 mysql]# chown -R root .
      [root@centos189 mysql]# cd /var/lib/mysql_drbd
      [root@centos189 mysql_drbd]# chmod -R uog+rw *
      [root@centos189 mysql_drbd]# chown -R mysql data
    • Configure MySQL (Node1):
      [root@centos189 mysql_drbd]# cat /etc/my.cnf
      #
      # /etc/my.cnf
      #
      
      [client]
      
      port                           = 3306
      socket                         = /var/lib/mysql/mysql.sock
      
      [mysqld]
      
      port                           = 3306
      socket                         = /var/lib/mysql/mysql.sock
      
      datadir                        = /var/lib/mysql_drbd/data
      user                           = mysql
      #memlock                        = 1
      
      #table_open_cache               = 3072
      #table_definition_cache         = 1024
      max_heap_table_size            = 64M
      tmp_table_size                 = 64M
      
      # Connections
      
      max_connections                = 505
      max_user_connections           = 500
      max_allowed_packet             = 16M
      thread_cache_size              = 32
      
      # Buffers
      
      sort_buffer_size               = 8M
      join_buffer_size               = 8M
      read_buffer_size               = 2M
      read_rnd_buffer_size           = 16M
      
      # Query Cache
      
      #query_cache_size               = 64M
      
      # InnoDB
      
      #innodb_buffer_pool_size        = 1G
      #innodb_data_file_path          = ibdata1:2G:autoextend
      
      #innodb_log_file_size           = 128M
      #innodb_log_files_in_group      = 2
      
      # MyISAM
      
      myisam_recover                 = backup,force
      
      # Logging
      
      #general-log = 0
      #general_log_file               = /var/lib/mysql/mysql_general.log
      
      log_warnings                   = 2
      log_error                      = /var/lib/mysql/mysql_error.log
      
      #slow_query_log                 = 1
      #slow_query_log_file            = /var/lib/mysql/mysql_slow.log
      #long_query_time                = 0.5
      #log_queries_not_using_indexes  = 1
      #min_examined_row_limit         = 20
      
      # Binary Log / Replication
      
      server_id                      = 1
      log-bin                        = mysql-bin
      binlog_cache_size              = 1M
      #sync_binlog                    = 8
      binlog_format                  = row
      expire_logs_days               = 7
      max_binlog_size                = 128M
      
      [mysqldump]
      
      quick
      max_allowed_packet             = 16M
      
      [mysql]
      
      no_auto_rehash
      
      [myisamchk]
      
      #key_buffer                     = 512M
      #sort_buffer_size               = 512M
      read_buffer                    = 8M
      write_buffer                   = 8M
      
      [mysqld_safe]
      
      open-files-limit               = 8192
      pid-file                       = /var/lib/mysql/mysql.pid
    • Test MySQL on the primary node (Node1)
      [root@centos189 mysql_drbd]# /usr/local/mysql/bin/mysqld_safe --user=mysql > /dev/null &
      [root@centos189 mysql_drbd]# mysql -uroot -p
      Enter password: 
      Welcome to the MySQL monitor.  Commands end with ; or \g.
      Your MySQL connection id is 1
      Server version: 5.6.10-log Source distribution
      
      Copyright (c) 2000, 2013, Oracle and/or its affiliates. All rights reserved.
      
      Oracle is a registered trademark of Oracle Corporation and/or its
      affiliates. Other names may be trademarks of their respective
      owners.
      
      Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
      
      mysql> use test;
      Database changed
      mysql> show tables;
      Empty set (0.10 sec)
      
      mysql> create table tbl (a int);
      Query OK, 0 rows affected (3.80 sec)
      
      mysql> insert into tbl values (1), (2);
      Query OK, 2 rows affected (0.25 sec)
      Records: 2  Duplicates: 0  Warnings: 0
      
      mysql> quit;
      Bye
      [root@centos189 mysql_drbd]# /usr/local/mysql/bin/mysqladmin -uroot -p shutdown
      Enter password: 
      [1]+  Done                    /usr/local/mysql/bin/mysqld_safe --user=mysql > /dev/null
    • Unmount the DRBD filesystem on Node1
      [root@centos189 ~]# umount /var/lib/mysql_drbd
      [root@centos189 ~]# drbdadm secondary dbcluster
    • Mount the DRBD filesystem on Node2
      [root@centos193 ~]# drbdadm primary dbcluster
      [root@centos193 ~]# mount /dev/drbd0 /var/lib/mysql_drbd
      [root@centos193 ~]# ll /var/lib/mysql_drbd/
      total 20
      drwxrwxrwx 5 mysql mysql  4096 Mar 12 09:30 data
      drwxrw-rw- 2 mysql mysql 16384 Mar 10 07:49 lost+found
    • Configure and test MySQL on Node2
      [root@centos193 ~]# scp centos189:/etc/my.cnf /etc/my.cnf
      [root@centos193 ~]# chown mysql /etc/my.cnf
      [root@centos193 ~]# chmod 644 /etc/my.cnf 
      [root@centos193 ~]# cd /usr/local/mysql/
      [root@centos193 mysql]# cp support-files/mysql.server /etc/init.d/mysql
      [root@centos193 mysql]# chown -R root:mysql .

      Test MySQL:

      [root@centos193 mysql]# /usr/local/mysql/bin/mysqld_safe --user=mysql > /dev/null &
      [1] 15864
      [root@centos193 mysql]# mysql -uroot -p
      Enter password: 
      Welcome to the MySQL monitor.  Commands end with ; or \g.
      Your MySQL connection id is 1
      Server version: 5.6.10-log Source distribution
      
      Copyright (c) 2000, 2013, Oracle and/or its affiliates. All rights reserved.
      
      Oracle is a registered trademark of Oracle Corporation and/or its
      affiliates. Other names may be trademarks of their respective
      owners.
      
      Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
      
      mysql> use test;
      Database changed
      mysql> select * from tbl;
      +------+
      | a    |
      +------+
      |    1 |
      |    2 |
      +------+
      2 rows in set (0.26 sec)
      
      mysql> quit
      Bye
      [root@centos193 mysql]# /usr/local/mysql/bin/mysqladmin -uroot -p shutdown
      Enter password: 
      [1]+  Done                    /usr/local/mysql/bin/mysqld_safe --user=mysql > /dev/null
    • Unmount the DRBD filesystem on Node2 and hand it over to the cluster manager, Pacemaker
      [root@centos193 mysql]# umount /var/lib/mysql_drbd
      [root@centos193 mysql]# drbdadm secondary dbcluster
      [root@centos193 mysql]# drbd-overview 
        0:dbcluster/0  Connected Secondary/Secondary UpToDate/UpToDate C r----- 
      [root@centos193 mysql]#
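
      The manual switchover performed above (unmount and demote on one node, then promote and mount on the other) is exactly what Pacemaker will later automate. As a dry-run sketch, the function below only prints those steps in order. switchover_plan and the DRBD_RES, DRBD_DEV and DRBD_MNT variables are hypothetical names; the defaults are the values used in this article.

```shell
#!/bin/sh
# Print the manual switchover steps used above, in order, without running them.
# switchover_plan is a hypothetical helper; resource, device and mount point
# default to the values used in this article.
# Usage: switchover_plan FROM_HOST TO_HOST
switchover_plan() (
    from="$1"; to="$2"
    res="${DRBD_RES:-dbcluster}"
    dev="${DRBD_DEV:-/dev/drbd0}"
    mnt="${DRBD_MNT:-/var/lib/mysql_drbd}"
    [ -n "$from" ] && [ -n "$to" ] || { echo "usage: switchover_plan FROM TO" >&2; exit 1; }
    # Release on the old primary first: DRBD refuses a second primary by default.
    echo "ssh root@$from umount $mnt"
    echo "ssh root@$from drbdadm secondary $res"
    echo "ssh root@$to drbdadm primary $res"
    echo "ssh root@$to mount $dev $mnt"
)
```

      MySQL has to be shut down on the old node before the first step, as was done with mysqladmin shutdown above.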
Corosync and Pacemaker
  • Install the software

    Install corosync:

    # yum -y install corosync corosynclib-devel

    Install pacemaker:

    # yum -y install libtool-ltdl-devel libuuid-devel libxslt-devel libqb libqb-devel 
    # yum -y install glib2-devel bzip2-devel libxml2-devel docbook-dtds.noarch
    # yum -y install pacemaker resource-agents pacemaker-libs-devel

    Install cluster-glue:

    # yum install cluster-glue cluster-glue-libs-devel

    Install crmsh:

    [root@centos189 pacemaker]# wget -c http://hg.savannah.gnu.org/hgweb/crmsh/archive/tip.tar.gz
    [root@centos189 pacemaker]# tar zxvf tip.tar.gz
    [root@centos189 pacemaker]# cd crmsh-6cf4ba4f2568/
    [root@centos189 crmsh-6cf4ba4f2568]# ./autogen.sh
    [root@centos189 crmsh-6cf4ba4f2568]# ./configure
    [root@centos189 crmsh-6cf4ba4f2568]# make && make install
  • Configure corosync
    • Corosync key
      - Generate the key for secure communication between the nodes:


      [root@centos189 ~]# corosync-keygen
      Corosync Cluster Engine Authentication key generator.
      Gathering 1024 bits for key from /dev/random.
      Press keys on your keyboard to generate entropy.
      Writing corosync key to /etc/corosync/authkey.

      - Copy authkey to the other node (keep its permissions at 400):

      [root@centos189 ~]# scp /etc/corosync/authkey centos193:/etc/corosync/
      [root@centos189 ~]# ll /etc/corosync/authkey 
      -r-------- 1 root root 128 Mar 10 13:56 /etc/corosync/authkey
      [root@centos193 ~]# ll /etc/corosync/authkey 
      -r-------- 1 root root 128 Mar 10 14:02 /etc/corosync/authkey
    • Corosync configuration:
      - Edit /etc/corosync/corosync.conf:


      [root@centos189 ~]# cp /etc/corosync/corosync.conf.example /etc/corosync/corosync.conf
      [root@centos189 ~]# vi /etc/corosync/corosync.conf
      [root@centos189 ~]# cat /etc/corosync/corosync.conf
      # Please read the corosync.conf.5 manual page
      compatibility: whitetank
      
      aisexec { 
              user: root 
              group: root 
      } 
      
      totem {
      	version: 2
      	secauth: off
      	threads: 0
      	interface {
      		ringnumber: 0
      		bindnetaddr: 192.168.1.0
      		mcastaddr: 226.94.1.1
      		mcastport: 4000
      		ttl: 1
      	}
      }
      
      logging {
      	fileline: off
      	to_stderr: no
      	to_logfile: yes
      	to_syslog: yes
      	logfile: /var/log/cluster/corosync.log
      	debug: off
      	timestamp: on
      	logger_subsys {
      		subsys: AMF
      		debug: off
      	}
      }
      
      amf {
      	mode: disabled
      }

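      Note: Corosync uses two UDP ports: mcastport (4000 here) for multicast receives and mcastport - 1 (3999 here) for multicast sends. Configuring a port N in the configuration file therefore means that N-1 is used as well.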
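
      To see which port pair a given configuration implies (for firewall rules, for instance), the N and N-1 relationship can be read straight from the file. A minimal sketch, with corosync_ports as a hypothetical helper name:

```shell
#!/bin/sh
# Print the two UDP ports implied by each mcastport setting in a corosync.conf.
# corosync_ports is a hypothetical helper name.
# Usage: corosync_ports [FILE]   (FILE defaults to /etc/corosync/corosync.conf)
corosync_ports() {
    awk '
        /^[[:space:]]*#/ { next }                 # ignore comment lines
        $1 == "mcastport:" { print $2 - 1, $2 }   # N-1 and N
    ' "${1:-/etc/corosync/corosync.conf}"
}
```

      With the configuration shown above, the function prints one line containing 3999 and 4000.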

      - Create and edit /etc/corosync/service.d/pcmk to add the "pacemaker" service:

      [root@centos189 ~]# cat /etc/corosync/service.d/pcmk 
      service {
      # Load the Pacemaker Cluster Resource Manager
      name: pacemaker
      ver: 1
      }

      Copy both configuration files to the other node:

      [root@centos189 ~]# scp /etc/corosync/corosync.conf centos193:/etc/corosync/corosync.conf
      [root@centos189 ~]# scp /etc/corosync/service.d/pcmk centos193:/etc/corosync/service.d/pcmk
  • Start corosync and Pacemaker

    - Start corosync on each of the two nodes and check it.

    [root@centos189 ~]# /etc/init.d/corosync start
    Starting Corosync Cluster Engine (corosync):               [  OK  ]
    [root@centos189 ~]# /etc/init.d/corosync status
    corosync (pid  24831) is running...
    [root@centos189 ~]# corosync-cfgtool -s
    Printing ring status.
    Local node ID -1123964736
    RING ID 0
    	id	= 192.168.1.189
    	status	= ring 0 active with no faults
    [root@centos193 ~]# /etc/init.d/corosync start
    Starting Corosync Cluster Engine (corosync):               [  OK  ]
    [root@centos193 ~]# /etc/init.d/corosync status
    corosync (pid  19251) is running...
    [root@centos193 ~]# corosync-objctl | grep members
    runtime.totem.pg.mrp.srp.members.-1123964736.ip=r(0) ip(192.168.1.189) 
    runtime.totem.pg.mrp.srp.members.-1123964736.join_count=1
    runtime.totem.pg.mrp.srp.members.-1123964736.status=joined
    runtime.totem.pg.mrp.srp.members.-1056855872.ip=r(0) ip(192.168.1.193) 
    runtime.totem.pg.mrp.srp.members.-1056855872.join_count=1
    runtime.totem.pg.mrp.srp.members.-1056855872.status=joined

    - Before starting Pacemaker, check the logs for errors:

    # cat /var/log/cluster/corosync.log
    # tail -f /var/log/messages

    - Start Pacemaker on each of the two nodes:

    [root@centos189 corosync]# /etc/init.d/pacemaker start
    Starting Pacemaker Cluster Manager:                        [  OK  ]
    [root@centos189 corosync]# /etc/init.d/pacemaker status
    pacemakerd (pid  24895) is running...
    [root@centos193 ~]# /etc/init.d/pacemaker start
    Starting Pacemaker Cluster Manager:                        [  OK  ]
    [root@centos193 ~]# /etc/init.d/pacemaker status
    pacemakerd (pid  19417) is running...

    - Check the cluster status:

    [root@centos189 ~]# crm_mon -1
    Last updated: Sun Mar 10 15:06:52 2013
    Last change: Sun Mar 10 14:59:27 2013 via crmd on centos189
    Stack: classic openais (with plugin)
    Current DC: centos189 - partition with quorum
    Version: 1.1.8-7.el6-394e906
    2 Nodes configured, 2 expected votes
    0 Resources configured.
    
    Online: [ centos189 centos193 ]
Resource configuration
  • Configure resources and constraints
    • Configure default properties
      Show the existing configuration:


      [root@centos189 ~]# crm configure show
      node centos189
      node centos193
      property $id="cib-bootstrap-options" \
      	dc-version="1.1.8-7.el6-394e906" \
      	cluster-infrastructure="classic openais (with plugin)" \
      	expected-quorum-votes="2"

      Verify that the configuration is valid:

      [root@centos189 ~]# crm_verify -L -V
         error: unpack_resources: 	Resource start-up disabled since no STONITH resources have been defined
         error: unpack_resources: 	Either configure some or disable STONITH with the stonith-enabled option
         error: unpack_resources: 	NOTE: Clusters with shared data need STONITH to ensure data integrity
      Errors found during check: config not valid
        -V may provide more details

      Disable STONITH to clear the error:

      [root@centos189 ~]# crm configure property stonith-enabled=false
      [root@centos189 ~]# crm_verify -L

      Make the cluster ignore loss of quorum:

      [root@centos189 ~]# crm configure property no-quorum-policy=ignore

      Prevent resources from moving back after a node recovers:

      [root@centos189 ~]# crm configure rsc_defaults resource-stickiness=100

      Set the default operation timeout:

      [root@centos189 www]# crm configure property default-action-timeout="180s"

      Set whether a start failure is treated as fatal by default:

      [root@centos189 www]# crm configure property start-failure-is-fatal="false"
    • Configure the DRBD resource
      - Stop DRBD before configuring it:


      [root@centos189 ~]# /etc/init.d/drbd stop
      [root@centos193 ~]# /etc/init.d/drbd stop

      - Configure the DRBD resource:

      [root@centos189 www]# crm configure
      crm(live)configure# primitive p_drbd_mysql ocf:linbit:drbd params \
      > drbd_resource="dbcluster" op monitor interval="15s" op start timeout="240s" \
      > op stop timeout="100s"

      - Configure the DRBD master/slave relationship (only one Master node is allowed):

      crm(live)configure# ms ms_drbd_mysql p_drbd_mysql meta master-max="1" \
      > master-node-max="1" clone-max="2" clone-node-max="1" notify="true"

      - Configure the filesystem resource and define the mount point:

      crm(live)configure# primitive p_fs_mysql ocf:heartbeat:Filesystem \
      > params device="/dev/drbd0" directory="/var/lib/mysql_drbd/" fstype="ext4"
    • Configure the VIP resource
      crm(live)configure# primitive p_ip_mysql ocf:heartbeat:IPaddr2 params \
      > ip="192.168.1.198" cidr_netmask="24" op monitor interval="30s"
    • Configure the MySQL resource
      Using the LSB method (used in this article):


      crm(live)configure# primitive p_mysql lsb:mysql \
      > op monitor interval="20s" timeout="30s" \
      > op start interval="0" timeout="180s" \
      > op stop interval="0" timeout="240s"

      Or using the OCF method:

      crm(live)configure# primitive p_mysql ocf:heartbeat:mysql params \
      > binary="/usr/local/mysql/bin/mysqld_safe" config="/etc/my.cnf" \
      > user="mysql" group="mysql" log="/var/lib/mysql/mysql_error.log" \
      > pid="/var/lib/mysql/mysql.pid" socket="/var/lib/mysql/mysql.sock" \
      > datadir="/var/lib/mysql_drbd/data" \
      > op monitor interval="60s" timeout="60s" \
      > op start timeout="180s" op stop timeout="240s"
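    • Group and constraints
      A group keeps the filesystem, the VIP and MySQL together on one node (the DRBD Master) and fixes the order in which they start and stop.
      Start: p_fs_mysql -> p_ip_mysql -> p_mysql
      Stop: p_mysql -> p_ip_mysql -> p_fs_mysql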


       
      crm(live)configure# group g_mysql p_fs_mysql p_ip_mysql p_mysql

      The group g_mysql must always run on the DRBD Master node:

      crm(live)configure# colocation c_mysql_on_drbd inf: g_mysql ms_drbd_mysql:Master

      MySQL must always start after DRBD has been promoted to Master:

      crm(live)configure# order o_drbd_before_mysql inf: ms_drbd_mysql:promote g_mysql:start
    • Check and commit the configuration
      crm(live)configure# verify
      crm(live)configure# commit
      crm(live)configure# quit
    • Check the cluster status and test failover
      Check the status:


      [root@centos189 mysql]# crm_mon -1r
      Last updated: Wed Mar 13 11:24:44 2013
      Last change: Wed Mar 13 11:24:04 2013 via crm_attribute on centos193
      Stack: classic openais (with plugin)
      Current DC: centos189 - partition with quorum
      Version: 1.1.8-7.el6-394e906
      2 Nodes configured, 2 expected votes
      5 Resources configured.
      
      Online: [ centos189 centos193 ]
      
      Full list of resources:
      
       Master/Slave Set: ms_drbd_mysql [p_drbd_mysql]
           Masters: [ centos189 ]
           Slaves: [ centos193 ]
       Resource Group: g_mysql
           p_fs_mysql	(ocf::heartbeat:Filesystem):	Started centos189
           p_ip_mysql	(ocf::heartbeat:IPaddr2):	Started centos189
           p_mysql	(lsb:mysql):	Started centos189

      Failover test:
      Put Node1 into standby:

      [root@centos189 ~]# crm node standby

      Check the cluster status after a few minutes (a successful switchover looks like this):

      [root@centos189 ~]# crm status
      Last updated: Wed Mar 13 11:29:41 2013
      Last change: Wed Mar 13 11:26:46 2013 via crm_attribute on centos189
      Stack: classic openais (with plugin)
      Current DC: centos189 - partition with quorum
      Version: 1.1.8-7.el6-394e906
      2 Nodes configured, 2 expected votes
      5 Resources configured.
      
      Node centos189: standby
      Online: [ centos193 ]
      
       Master/Slave Set: ms_drbd_mysql [p_drbd_mysql]
           Masters: [ centos193 ]
           Stopped: [ p_drbd_mysql:1 ]
       Resource Group: g_mysql
           p_fs_mysql	(ocf::heartbeat:Filesystem):	Started centos193
           p_ip_mysql	(ocf::heartbeat:IPaddr2):	Started centos193
           p_mysql	(lsb:mysql):	Started centos193

      Bring Node1 (centos189) back online:

      [root@centos189 mysql]# crm node online
      [root@centos189 mysql]# crm status
      Last updated: Wed Mar 13 11:32:49 2013
      Last change: Wed Mar 13 11:31:23 2013 via crm_attribute on centos189
      Stack: classic openais (with plugin)
      Current DC: centos189 - partition with quorum
      Version: 1.1.8-7.el6-394e906
      2 Nodes configured, 2 expected votes
      5 Resources configured.
      
      Online: [ centos189 centos193 ]
      
       Master/Slave Set: ms_drbd_mysql [p_drbd_mysql]
           Masters: [ centos193 ]
           Slaves: [ centos189 ]
       Resource Group: g_mysql
           p_fs_mysql	(ocf::heartbeat:Filesystem):	Started centos193
           p_ip_mysql	(ocf::heartbeat:IPaddr2):	Started centos193
           p_mysql	(lsb:mysql):	Started centos193
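
      After a failover, the point to confirm is that every resource in the group runs on whichever node currently holds the DRBD Master role. The sketch below checks this from crm_mon -1 style text on stdin. colocated_with_master is a hypothetical helper name, and the parsing assumes the multi-line output layout shown directly above.

```shell
#!/bin/sh
# Read `crm_mon -1` style text on stdin and check that every started primitive
# in the listing runs on the DRBD Master node.
# colocated_with_master is a hypothetical helper name.
colocated_with_master() {
    awk '
        $1 == "Masters:" { master = $3 }        # Masters: [ centos193 ]
        /Started [^ ]+/ && /\(/ {               # p_x (type): Started node
            if ($NF != master) bad = bad " " $1
        }
        END {
            if (master == "") { print "no master"; exit 2 }
            if (bad != "")    { print "off-master:" bad; exit 1 }
            print "ok " master
        }'
}
```

      Typical use would be crm_mon -1 | colocated_with_master, run on either node after the standby/online test.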
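Stopping the Master service on network loss
  • Avoiding split brain caused by network loss

    Have Pacemaker ping an independent network target (the network router, for example); when a host finds that it has lost its network connection (is isolated), it is prevented from being the DRBD master.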

    [root@centos189 ~]# crm configure
    crm(live)configure# primitive p_ping ocf:pacemaker:ping params name="ping" \
    > multiplier="1000" host_list="192.168.1.1" op monitor interval="15s" timeout="60s" \
    > op start timeout="60s"

    Both hosts need to run ping to check their own network connectivity, so create a clone (cl_ping) that lets the ping resource run on every host in the cluster.

    crm(live)configure# clone cl_ping p_ping meta interleave="true"

    Tell Pacemaker how to act on the ping result:

    crm(live)configure# location l_drbd_master_on_ping ms_drbd_mysql rule $role="Master" \
    > -inf: not_defined ping or ping number:lte 0

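    The rule above means: when a host has no ping attribute defined, or cannot reach any host in host_list (ping <= 0), it receives a preference score of negative infinity (-inf) for the Master role,
    so the location constraint (l_drbd_master_on_ping) controls where the DRBD master may run.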
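
    The decision encoded in that rule can be modelled in a few lines. This sketch only mirrors the rule's logic and does not query the cluster; master_allowed is a hypothetical helper name. With multiplier="1000" and a single entry in host_list, the ping attribute is 1000 while the router is reachable and 0 when it is not.

```shell
#!/bin/sh
# Mirror the location rule: a node is barred from the Master role (-inf)
# when its "ping" attribute is undefined or <= 0.
# master_allowed is a hypothetical helper; it only models the rule and does
# not query the cluster.
# Usage: master_allowed PING_VALUE   (pass "" for an undefined attribute)
master_allowed() {
    case "$1" in
        ''|*[!0-9-]*) return 1 ;;   # not_defined ping (or not a number)
    esac
    [ "$1" -gt 0 ]                  # ping number:lte 0 -> barred
}
```

    For example, master_allowed 1000 succeeds, while master_allowed 0 and master_allowed "" both fail, matching the two branches of the rule.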

    Verify and commit the configuration:

    crm(live)configure# verify
    WARNING: p_drbd_mysql: action monitor not advertised in meta-data, it may not be supported by the RA
    crm(live)configure# commit
    crm(live)configure# quit

    Check that the ping service is running:

    [root@centos189 ~]# crm_mon -1
    Last updated: Thu Mar 14 01:02:14 2013
    Last change: Thu Mar 14 01:01:20 2013 via cibadmin on centos189
    Stack: classic openais (with plugin)
    Current DC: centos189 - partition with quorum
    Version: 1.1.8-7.el6-394e906
    2 Nodes configured, 2 expected votes
    7 Resources configured.
    
    Online: [ centos189 centos193 ]
    
     Master/Slave Set: ms_drbd_mysql [p_drbd_mysql]
         Masters: [ centos193 ]
         Slaves: [ centos189 ]
     Resource Group: g_mysql
         p_fs_mysql	(ocf::heartbeat:Filesystem):	Started centos193
         p_ip_mysql	(ocf::heartbeat:IPaddr2):	Started centos193
         p_mysql	(lsb:mysql):	Started centos193
     Clone Set: cl_ping [p_ping]
         Started: [ centos189 centos193 ]
     

     

  • Network-loss test

    - Stop the network service on the current Master:

    [root@centos193 ~]# service network stop
    [root@centos189 ~]# crm resource status
     Master/Slave Set: ms_drbd_mysql [p_drbd_mysql]
         Slaves: [ centos189 ]
         Stopped: [ p_drbd_mysql:1 ]
     Resource Group: g_mysql
         p_fs_mysql	(ocf::heartbeat:Filesystem):	Stopped 
         p_ip_mysql	(ocf::heartbeat:IPaddr2):	Stopped 
         p_mysql	(lsb:mysql):	Stopped 
     Clone Set: cl_ping [p_ping]
         Started: [ centos189 ]
         Stopped: [ p_ping:1 ]

    - Restore the network service on the Master:

    [root@centos193 ~]# service network start
    [root@centos189 ~]# crm resource status
     Master/Slave Set: ms_drbd_mysql [p_drbd_mysql]
         Masters: [ centos193 ]
         Slaves: [ centos189 ]
     Resource Group: g_mysql
         p_fs_mysql	(ocf::heartbeat:Filesystem):	Started 
         p_ip_mysql	(ocf::heartbeat:IPaddr2):	Started 
         p_mysql	(lsb:mysql):	Started 
     Clone Set: cl_ping [p_ping]
         Started: [ centos189 centos193 ]
    [root@centos189 ~]# crm status
    Last updated: Thu Mar 14 01:09:51 2013
    Last change: Thu Mar 14 01:09:49 2013 via crmd on centos189
    Stack: classic openais (with plugin)
    Current DC: centos193 - partition with quorum
    Version: 1.1.8-7.el6-394e906
    2 Nodes configured, 2 expected votes
    7 Resources configured.
    
    Online: [ centos189 centos193 ]
    
     Master/Slave Set: ms_drbd_mysql [p_drbd_mysql]
         Masters: [ centos193 ]
         Slaves: [ centos189 ]
     Resource Group: g_mysql
         p_fs_mysql	(ocf::heartbeat:Filesystem):	Started centos193
         p_ip_mysql	(ocf::heartbeat:IPaddr2):	Started centos193
         p_mysql	(lsb:mysql):	Started centos193
     Clone Set: cl_ping [p_ping]
         Started: [ centos189 centos193 ]
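
    To compare the cluster state before and after the outage without reading the whole status output, the node holding the Master role can be extracted with a small filter. This is only a sketch: current_master is a hypothetical name, and it assumes the master/slave set is called ms_drbd_mysql and that the status output has the layout shown above.

```shell
#!/bin/sh
# Hypothetical helper (not from the original article).
# Reads `crm resource status` or `crm_mon -1` output on stdin and prints
# the node listed under "Masters:" for ms_drbd_mysql, or "none".
current_master() {
    awk '
        /Master\/Slave Set: ms_drbd_mysql/ { in_ms = 1; next }
        in_ms && /Masters:/ {
            gsub(/.*\[ */, "")     # drop everything up to the opening bracket
            gsub(/ *\].*/, "")     # drop the closing bracket and the rest
            print; found = 1; exit
        }
        in_ms && /(Set|Group):/ { exit }   # next resource block reached
        END { if (!found) print "none" }'
}
```

    Possible usage: crm resource status | current_master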

 

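System startup settings
  • Boot-time service settings

    Because DRBD, MySQL, and the related services are now managed by Pacemaker, disable their own autostart at boot, and make sure Corosync and Pacemaker do start at boot.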

    [root@centos189 ~]# chkconfig drbd off
    [root@centos189 ~]# chkconfig mysql off
    [root@centos189 ~]# chkconfig corosync on
    [root@centos189 ~]# chkconfig pacemaker on
    [root@centos193 ~]# chkconfig drbd off
    [root@centos193 ~]# chkconfig mysql off
    [root@centos193 ~]# chkconfig corosync on
    [root@centos193 ~]# chkconfig pacemaker on
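
    Since the same four commands run on both nodes, they can be generated once and applied over SSH. A minimal sketch, assuming the passwordless root SSH access configured earlier; the function names boot_commands and apply_boot_settings are hypothetical.

```shell
#!/bin/sh
# Hypothetical helpers (not from the original article).

# Print the chkconfig commands: services handed over to Pacemaker must not
# start at boot, while the cluster stack itself must.
boot_commands() {
    for svc in drbd mysql; do
        echo "chkconfig $svc off"
    done
    for svc in corosync pacemaker; do
        echo "chkconfig $svc on"
    done
}

# Run the generated commands on every node given as an argument.
# Assumes passwordless root SSH to each node.
apply_boot_settings() {
    for node in "$@"; do
        boot_commands | ssh "root@$node" sh
    done
}
```

    Possible usage: review the output of boot_commands first, then run apply_boot_settings centos189 centos193.
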
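Resolving "Split-Brain" manually
  • Recovering from "Split-Brain"

    Even in a DRBD Active/Standby setup, the data on the two hosts can diverge for various reasons. When that happens,
    the two DRBD hosts drop their connection (check the connection state with /etc/init.d/drbd status or drbd-overview).
    If the log (/var/log/messages) confirms that the disconnect was caused by "Split-Brain", identify the host that holds the correct data, then have DRBD resynchronize from it.

    - Check the DRBD state on each host and look at the log: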

    [root@centos189 ~]# cat /proc/drbd 
    version: 8.4.2 (api:1/proto:86-101)
    GIT-hash: 7ad5f850d711223713d6dcadc3dd48860321070c build by dag@Build64R6, 2012-09-06 08:16:10
     0: cs:StandAlone ro:Primary/Unknown ds:UpToDate/DUnknown   r-----
        ns:32948 nr:0 dw:4 dr:34009 al:1 bm:9 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
    [root@centos189 ~]# cat /var/log/messages | grep Split-Brain
    Mar 14 21:11:48 node1 kernel: block drbd0: Split-Brain detected but unresolved, dropping connection!
    [root@centos193 drbd.d]# cat /proc/drbd 
    version: 8.4.2 (api:1/proto:86-101)
    GIT-hash: 7ad5f850d711223713d6dcadc3dd48860321070c build by dag@Build64R6, 2012-09-06 08:16:10
     0: cs:WFConnection ro:Secondary/Unknown ds:UpToDate/DUnknown C r-----
        ns:0 nr:32948 dw:32948 dr:0 al:0 bm:9 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
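
    The two checks above (connection state and log message) can be wrapped in small filters for use in a monitoring script. This is a sketch under assumptions: the function names drbd_cs and split_brain_logged are hypothetical, and the patterns rely on the /proc/drbd layout and the kernel log message shown above for DRBD 8.4.2.

```shell
#!/bin/sh
# Hypothetical helpers (not from the original article).

# Reads /proc/drbd content on stdin and prints the connection state (cs:)
# of each configured device, e.g. StandAlone, WFConnection, Connected.
drbd_cs() {
    sed -n 's/^ *[0-9][0-9]*: cs:\([A-Za-z]*\) .*/\1/p'
}

# Reads log lines on stdin; succeeds if a split-brain detection is reported.
split_brain_logged() {
    grep -q 'Split-Brain detected'
}
```

    Possible usage: drbd_cs < /proc/drbd and split_brain_logged < /var/log/messages && echo "split-brain reported"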

    Resolve the "Split-Brain" manually:
    Here the host with the "good" data is centos189, and the host with the "bad" data is centos193.

    On the "bad data" host centos193:

    [root@centos193 ~]# drbdadm disconnect dbcluster
    [root@centos193 ~]# cat /proc/drbd 
    version: 8.4.2 (api:1/proto:86-101)
    GIT-hash: 7ad5f850d711223713d6dcadc3dd48860321070c build by dag@Build64R6, 2012-09-06 08:16:10
     0: cs:StandAlone ro:Secondary/Unknown ds:UpToDate/DUnknown   r-----
        ns:0 nr:32948 dw:32948 dr:0 al:0 bm:9 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
    [root@centos193 ~]# drbdadm secondary dbcluster
    [root@centos193 ~]# drbdadm connect --discard-my-data dbcluster

    On the "good data" host centos189 (if its cs: state is already WFConnection, the connect step below is unnecessary):

    [root@centos189 ~]# cat /proc/drbd 
    version: 8.4.2 (api:1/proto:86-101)
    GIT-hash: 7ad5f850d711223713d6dcadc3dd48860321070c build by dag@Build64R6, 2012-09-06 08:16:10
     0: cs:StandAlone ro:Primary/Unknown ds:UpToDate/DUnknown   r-----
        ns:32948 nr:0 dw:4 dr:34009 al:1 bm:9 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
    [root@centos189 ~]# drbdadm connect dbcluster
    [root@centos189 ~]# /etc/init.d/drbd status
    drbd driver loaded OK; device status:
    version: 8.4.2 (api:1/proto:86-101)
    GIT-hash: 7ad5f850d711223713d6dcadc3dd48860321070c build by dag@Build64R6, 2012-09-06 08:16:10
    m:res        cs         ro                 ds                 p  mounted              fstype
    0:dbcluster Connected Primary/Secondary  UpToDate/UpToDate  C  /var/lib/mysql_drbd  ext4
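
    The recovery steps above can be summarized as a dry-run helper that prints the commands for each side. This is a sketch only: recovery_commands and its role names (victim for the host whose data is discarded, survivor for the host whose data is kept) are hypothetical, and the drbdadm syntax is the DRBD 8.4 form used in this article.

```shell
#!/bin/sh
# Hypothetical helper (not from the original article).
# Usage: recovery_commands <victim|survivor> <resource> <current cs: state>
# Prints the drbdadm commands for that side; nothing is executed.
recovery_commands() {
    role=$1
    res=$2
    cs=$3
    case "$role" in
        victim)
            # This host's changes are thrown away and resynced from the peer.
            echo "drbdadm disconnect $res"
            echo "drbdadm secondary $res"
            echo "drbdadm connect --discard-my-data $res"
            ;;
        survivor)
            # Already waiting for the peer: nothing to do.
            [ "$cs" = "WFConnection" ] || echo "drbdadm connect $res"
            ;;
        *)
            echo "unknown role: $role" >&2
            return 1
            ;;
    esac
}
```

    Possible usage: run recovery_commands victim dbcluster StandAlone on centos193, review the printed commands, and only then pipe them to sh. Choosing the wrong victim discards the good data, so the printed commands should always be checked by hand.
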
  • When reposting, please credit: http://www.51itstudy.com/30152.html

From the "ITPUB Blog", link: http://blog.itpub.net/14431099/viewspace-1316638/. When reposting, please cite the source; otherwise legal liability will be pursued.
