High-availability clustering on Linux: corosync + openais + pacemaker + web + drbd

Project topology:

image

Detailed corosync configuration:

1. Configure the IP addresses (using setup)

image

image

2. Make sure the node names can resolve each other; the kernel release (uname -r) must be identical on both nodes

[root@www1 ~]# uname -rn 
www1.gjp.com 2.6.18-164.el5

Configuration on www1.gjp.com:

[root@gjp99 ~]# cat /etc/sysconfig/network 
NETWORKING=yes 
NETWORKING_IPV6=yes 
HOSTNAME=www1.gjp.com 
[root@gjp99 ~]# hostname www1.gjp.com 
[root@gjp99 ~]# hostname 
www1.gjp.com

Log out and log back in, and the new hostname takes effect!

3. Make sure the system clocks agree

[root@www1 ~]# hwclock -s 
[root@www1 ~]# clock 
Tue 23 Oct 2012 05:20:36 PM CST  -0.017990 seconds

4. Edit /etc/hosts (in place of DNS)

[root@www1 ~]# cat /etc/hosts 
# Do not remove the following line, or various programs 
# that require network functionality will fail. 
127.0.0.1   localhost.localdomain  localhost 
::1        localhost6.localdomain6 localhost6 
192.168.2.1     www1.gjp.com    www1 
192.168.2.2     www2.gjp.com    www2

[root@www1 ~]# ping www2.gjp.com 
PING www2.gjp.com (192.168.2.2) 56(84) bytes of data. 
64 bytes from www2.gjp.com (192.168.2.2): icmp_seq=1 ttl=64 time=3.45 ms 
64 bytes from www2.gjp.com (192.168.2.2): icmp_seq=2 ttl=64 time=0.658 ms

The names now resolve to each other!

5. Mount the CD-ROM and install the packages corosync requires

[root@www1 ~]# mkdir /mnt/cdrom 
[root@www1 ~]# mount /dev/cdrom /mnt/cdrom 
mount: block device /dev/cdrom is write-protected, mounting read-only

[root@www2 ~]# scp *.rpm www1:/root

On www2, copy the uploaded rpm packages to www1's /root directory (the command above), then install them on www1: 
[root@www1 ~]# yum localinstall -y *.rpm --nogpgcheck

6. Edit the corosync configuration file

[root@www1 ~]# cd /etc/corosync/ 
[root@www1 corosync]# ll 
total 20 
-rw-r--r-- 1 root root 5384 Jul 28  2010 amf.conf.example 
-rw-r--r-- 1 root root  436 Jul 28  2010 corosync.conf.example 
drwxr-xr-x 2 root root 4096 Jul 28  2010 service.d 
drwxr-xr-x 2 root root 4096 Jul 28  2010 uidgid.d

[root@www1 corosync]# cp corosync.conf.example corosync.conf 
[root@www1 corosync]# vim corosync.conf

compatibility: whitetank  (backward-compatible with the old whitetank/OpenAIS 0.80 releases; some 
                           newer features may be unavailable)

(totem: the protocol settings used when nodes exchange heartbeats) 
totem { 
        version: 2        # protocol version 
        secauth: off      # whether to enable secure authentication 
        threads: 0        # number of authentication threads; 0 means no limit 
        interface { 
                ringnumber: 0 
                bindnetaddr: 192.168.2.0   # the network address to communicate over (a host address also works); here 192.168.2.0 
                mcastaddr: 226.94.1.1 
                mcastport: 5405 
        } 
}

logging { 
        fileline: off 
        to_stderr: no     # whether to send output to standard error 
        to_logfile: yes   # log to a file 
        to_syslog: yes    # log to syslog (it is advisable to disable one of the two, since logging to both hurts performance) 
        logfile: /var/log/cluster/corosync.log   # this directory must be created manually 
        debug: off        # enable when troubleshooting 
        timestamp: on     # whether to record timestamps in the log

        # the following is an OpenAIS item and does not need to be enabled 
        logger_subsys { 
                subsys: AMF 
                debug: off 
        } 
}

amf { 
        mode: disabled 
}

Append the following as well; everything above only configures the messaging layer, and since pacemaker is used it must be declared as a service:

service { 
        ver: 0 
        name: pacemaker 
}

OpenAIS itself is not used, but some of its sub-options are:

aisexec { 
        user: root 
        group: root 
}
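Assembled from the annotated fragments above (with all braces closed), the complete corosync.conf used in this setup looks roughly like:

```
compatibility: whitetank

totem {
        version: 2
        secauth: off
        threads: 0
        interface {
                ringnumber: 0
                bindnetaddr: 192.168.2.0
                mcastaddr: 226.94.1.1
                mcastport: 5405
        }
}

logging {
        fileline: off
        to_stderr: no
        to_logfile: yes
        to_syslog: yes
        logfile: /var/log/cluster/corosync.log
        debug: off
        timestamp: on
        logger_subsys {
                subsys: AMF
                debug: off
        }
}

amf {
        mode: disabled
}

service {
        ver: 0
        name: pacemaker
}

aisexec {
        user: root
        group: root
}
```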

7. To prevent unauthorized hosts from joining the cluster, authentication is needed; generate an authkey

[root@www1 corosync]# corosync-keygen

[root@www1 corosync]# ll 
total 28 
-rw-r--r-- 1 root root 5384 Jul 28  2010 amf.conf.example 
-r-------- 1 root root  128 Oct 24 13:59 authkey 
-rw-r--r-- 1 root root  538 Oct 24 13:56 corosync.conf 
-rw-r--r-- 1 root root  436 Jul 28  2010 corosync.conf.example 
drwxr-xr-x 2 root root 4096 Jul 28  2010 service.d 
drwxr-xr-x 2 root root 4096 Jul 28  2010 uidgid.d

[root@www1 corosync]# scp -p authkey corosync.conf www2:/etc/corosync/

8. This directory must be created in advance

[root@www1 ~]# mkdir /var/log/cluster

[root@www1 corosync]# ssh www2 'mkdir /var/log/cluster'

9. Start the corosync service

[root@www1 corosync]# service corosync start 
Starting Corosync Cluster Engine (corosync):               [  OK  ] 
[root@www1 corosync]# ssh www2 'service corosync start' 
root@www2's password: 
Starting Corosync Cluster Engine (corosync): 
[  OK  ]

10. Check corosync for errors

Verify that the corosync engine started properly:

[root@www1 corosync]# grep -i  -e "corosync cluster engine" -e "configuration file" /var/log/messages 
Oct 24 11:09:04 www1 smartd[3260]: Opened configuration file /etc/smartd.conf 
Oct 24 11:09:04 www1 smartd[3260]: Configuration file /etc/smartd.conf was parsed, found DEVICESCAN, scanning devices 
Oct 24 17:08:33 www1 corosync[26362]:   [MAIN  ] Corosync Cluster Engine ('1.2.7'): started and ready to provide service. 
Oct 24 17:08:33 www1 corosync[26362]:   [MAIN  ] Successfully read main configuration file '/etc/corosync/corosync.conf'.

Check whether the initial membership notification was sent:

[root@www1 corosync]# grep -i totem /var/log/messages 
Oct 24 17:08:33 www1 corosync[26362]:   [TOTEM ] Initializing transport (UDP/IP). 
Oct 24 17:08:33 www1 corosync[26362]:   [TOTEM ] Initializing transmit/receive security: libtomcrypt SOBER128/SHA1HMAC (mode 0). 
Oct 24 17:08:33 www1 corosync[26362]:   [TOTEM ] The network interface is down. 
Oct 24 17:08:34 www1 corosync[26362]:   [TOTEM ] A processor joined or left the membership and a new membership was formed.

[root@www2 ~]# grep -i totem /var/log/messages 
Oct 24 17:09:07 www2 corosync[28610]:   [TOTEM ] Initializing transport (UDP/IP). 
Oct 24 17:09:07 www2 corosync[28610]:   [TOTEM ] Initializing transmit/receive security: libtomcrypt SOBER128/SHA1HMAC (mode 0). 
Oct 24 17:09:07 www2 corosync[28610]:   [TOTEM ] The network interface is down. 
Oct 24 17:09:08 www2 corosync[28610]:   [TOTEM ] A processor joined or left the membership and a new membership was formed.

Check whether any errors occurred during the process:

[root@www1 corosync]# grep -i error:  /var/log/messages  |grep -v unpack_resources

[root@www2 ~]# grep -i error:  /var/log/messages  |grep -v unpack_resources

No output means everything is error-free!

Check whether pacemaker has started:

[root@www1 corosync]# grep -i totem /var/log/messages 
Oct 24 17:08:33 www1 corosync[26362]:   [TOTEM ] Initializing transport (UDP/IP). 
Oct 24 17:08:33 www1 corosync[26362]:   [TOTEM ] Initializing transmit/receive sec 
Oct 24 17:08:33 www1 corosync[26362]:   [TOTEM ] The network interface is down. 
Oct 24 17:08:34 www1 corosync[26362]:   [TOTEM ] A processor joined or left the me 
[root@www1 corosync]# grep -i error:  /var/log/messages  |grep -v unpack_resources 
[root@www1 corosync]# grep -i pcmk_startup /var/log/messages 
Oct 24 17:08:34 www1 corosync[26362]:   [pcmk  ] info: pcmk_startup: CRM: Initialized 
Oct 24 17:08:34 www1 corosync[26362]:   [pcmk  ] Logging: Initialized pcmk_startup 
Oct 24 17:08:34 www1 corosync[26362]:   [pcmk  ] info: pcmk_startup: Maximum core file size is: 4294967295 
Oct 24 17:08:34 www1 corosync[26362]:   [pcmk  ] info: pcmk_startup: Service: 9 
Oct 24 17:08:34 www1 corosync[26362]:   [pcmk  ] info: pcmk_startup: Local hostname: www1.gjp.com

[root@www2 ~]# grep -i pcmk_startup /var/log/messages 

Start the other node from a node already in the cluster:

[root@www1 ~]# /etc/init.d/corosync start 
Starting Corosync Cluster Engine (corosync):               [  OK  ] 
[root@www1 ~]# ssh www2  '/etc/init.d/corosync start' 
root@www2's password: 
Starting Corosync Cluster Engine (corosync): [  OK  ] 

[root@www2 corosync]# crm status 
============ 
Last updated: Wed Oct 24 20:11:19 2012 
Stack: openais 
Current DC: www1.gjp.com - partition with quorum 
Version: 1.1.5-1.1.el5-01e86afaaa6d4a8c4836f68df80ababd6ca3902f 
2 Nodes configured, 2 expected votes 
0 Resources configured. 
============

Online: [ www1.gjp.com www2.gjp.com ]

Note: the clocks of the cluster nodes should be kept synchronized.

Providing the highly available service 
In corosync, services can be defined through two kinds of interfaces:

1. a graphical interface (using hb_gui) 
2. crm (provided by pacemaker; it is a shell)

image

Used to view information about the CIB.

How to check the configuration for syntax errors:

[root@www1 corosync]# crm_verify  -L 
crm_verify[4329]: 2012/10/25_14:59:35 ERROR: unpack_resources: Resource start-up disabled since no STONITH resources have been defined 
crm_verify[4329]: 2012/10/25_14:59:35 ERROR: unpack_resources: Either configure some or disable STONITH with the stonith-enabled option 
crm_verify[4329]: 2012/10/25_14:59:35 ERROR: unpack_resources: NOTE: Clusters with shared data need STONITH to ensure data integrity 
Errors found during check: config not valid 
  -V may provide more details

The stonith errors appear because, in an HA environment, resources are kept from starting until STONITH is configured. 
STONITH can simply be disabled:

[root@www1 corosync]# crm 
crm(live)# configure 
crm(live)configure#  property stonith-enabled=false 
crm(live)configure# commit 
crm(live)configure# show 
node www1.gjp.com 
node www2.gjp.com 
property $id="cib-bootstrap-options" \ 
    dc-version="1.1.5-1.1.el5-01e86afaaa6d4a8c4836f68df80ababd6ca3902f" \ 
    cluster-infrastructure="openais" \ 
    expected-quorum-votes="2" \ 
    stonith-enabled="false"

Check again:

[root@www1 corosync]# crm_verify  -L

    No more errors! 
    The system also ships a dedicated stonith command.

stonith -L   lists the available stonith device types. 
crm can be used interactively; 
run help for usage. 
The configuration is saved in the CIB in XML format.

11. Configuring resources

There are 4 resource types in the cluster: 
primitive   a basic, local resource (can run on only one node) 
group       puts several resources into one group for easier management 
clone       for resources that must run on several nodes at once (e.g. ocfs2, stonith; no master/slave distinction) 
master      has a master/slave distinction, e.g. drbd
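As a sketch, the four types look like this in the crm shell (the clone and master/slave resource names here are hypothetical examples, not part of this setup):

```
primitive webip ocf:heartbeat:IPaddr params ip="192.168.2.66"    # primitive
group web webip webserver                                        # group
clone cl-st st-ssh                                               # clone (hypothetical stonith resource)
ms ms-drbd drbd-web meta master-max="1" clone-max="2"            # master/slave (hypothetical drbd resource)
```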

The resources used here: 
an IP address, the httpd service, and shared storage, 
configured through resource agents 
of the ocf and lsb classes. 
Use list to view them.

[root@www1 corosync]# crm 
crm(live)# help

This is the CRM command line interface program.

Available commands:

    cib              manage shadow CIBs 
    resource         resources management 
    configure        CRM cluster configuration 
    node             nodes management 
    options          user preferences 
    ra               resource agents information center 
    status           show cluster status 
    quit,bye,exit    exit the program 
    help             show help 
    end,cd,up        go back one level

crm(live)# ra 
image

(these are the scripts under the /etc/init.d directory)

crm(live)ra# list ocf heartbeat

Use info or meta to display the detailed information of a resource agent:

  meta ocf:heartbeat:IPaddr   (the components are separated by colons)

crm(live)ra# meta ocf:heartbeat:IPaddr 

image

A resource can be configured under the configure level:

1. First give the resource a name:

image

crm(live)configure# commit 
crm(live)configure# end 
crm(live)# status 
============ 
Last updated: Thu Oct 25 15:18:54 2012 
Stack: openais 
Current DC: www1.gjp.com - partition with quorum 
Version: 1.1.5-1.1.el5-01e86afaaa6d4a8c4836f68df80ababd6ca3902f 
2 Nodes configured, 2 expected votes 
1 Resources configured. 
============

Online: [ www1.gjp.com www2.gjp.com ]

webip    (ocf::heartbeat:IPaddr):    Started www1.gjp.com

You can see the resource started on www1.
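The command hidden behind the screenshot above was presumably the following (the IP 192.168.2.66 matches the `show` output later in this post):

```
crm(live)configure# primitive webip ocf:heartbeat:IPaddr params ip="192.168.2.66"
```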

[root@www1 corosync]# ifconfig |less

image

[root@www1 corosync]# mount /dev/cdrom /mnt/cdrom 
mount: block device /dev/cdrom is write-protected, mounting read-only 
[root@www1 corosync]# yum install httpd -y

[root@www1 corosync]# service httpd status 
httpd is stopped 
[root@www1 corosync]# chkconfig --list |grep httpd 
httpd              0:off    1:off    2:off    3:off    4:off    5:off    6:off 
[root@www1 corosync]# crm 
crm(live)# ra 
crm(live)ra# classes 
heartbeat 
lsb 
ocf / heartbeat pacemaker 
stonith

Defining the web service resource: 
  httpd must be installed on both nodes. 
  Once installed, the httpd lsb script can be inspected.

[root@www1 corosync]# crm ra list lsb

[root@www1 corosync]# crm 
or

crm(live)# ra 
crm(live)ra# list lsb

image

crm(live)ra# end 
crm(live)# configure 
crm(live)configure# primitive webserver lsb:httpd

This defines the httpd resource. 
crm(live)configure# show 
node www1.gjp.com 
node www2.gjp.com 
primitive webip ocf:heartbeat:IPaddr \ 
    params ip="192.168.2.66" 
primitive webserver lsb:httpd 
property $id="cib-bootstrap-options" \ 
    dc-version="1.1.5-1.1.el5-01e86afaaa6d4a8c4836f68df80ababd6ca3902f" \ 
    cluster-infrastructure="openais" \ 
    expected-quorum-votes="2" \ 
    stonith-enabled="false" 
crm(live)configure# commit 
crm(live)configure# end 
crm(live)# status 
============ 
Last updated: Thu Oct 25 16:06:46 2012 
Stack: openais 
Current DC: www1.gjp.com - partition with quorum 
Version: 1.1.5-1.1.el5-01e86afaaa6d4a8c4836f68df80ababd6ca3902f 
2 Nodes configured, 2 expected votes 
2 Resources configured. 
============

Online: [ www1.gjp.com www2.gjp.com ]

webip    (ocf::heartbeat:IPaddr):    Started www1.gjp.com 
webserver    (lsb:httpd):    Started www1.gjp.com

Failed actions: 
    webserver_monitor_0 (node=www2.gjp.com, call=3, rc=5, status=complete): not installed

Had the httpd service already been installed on www2.gjp.com as well, it could happen that the IP runs on www1 while the service runs on www2!

[root@www1 ~]# service httpd status 
httpd (pid  4897) is running...

[root@www1 ~]# echo "www1.gjp.com">/var/www/html/index.html 
[root@www1 ~]# crm 
crm(live)# configure 
crm(live)configure# help group

The `group` command creates a group of resources.

Usage: 
............... 
        group <name> <rsc> [<rsc>...] 
          [meta attr_list] 
          [params attr_list]

        attr_list :: [$id=<id>] <attr>=<val> [<attr>=<val>...] | $id-ref=<id> 
............... 
Example: 
............... 
        group internal_www disk0 fs0 internal_ip apache \ 
          meta target_role=stopped 
...............

crm(live)configure# group web webip webserver 
crm(live)configure# commit 
image

 

Client test:

image 

[root@www1 ~]# crm status 
============ 
Last updated: Thu Oct 25 16:34:28 2012 
Stack: openais 
Current DC: www1.gjp.com - partition with quorum 
Version: 1.1.5-1.1.el5-01e86afaaa6d4a8c4836f68df80ababd6ca3902f 
2 Nodes configured, 2 expected votes 
1 Resources configured. 
============

Online: [ www1.gjp.com www2.gjp.com ]

Resource Group: web 
     webip    (ocf::heartbeat:IPaddr):    Started www1.gjp.com 
     webserver    (lsb:httpd):    Started www1.gjp.com

Failed actions: 
    webserver_monitor_0 (node=www2.gjp.com, call=3, rc=5, status=complete): not installed

Simulating the death of www1: 
[root@www1 ~]# service corosync stop 
Signaling Corosync Cluster Engine (corosync) to terminate: [  OK  ] 
Waiting for corosync services to unload:.......            [  OK  ]

[root@www1 ~]# service httpd status 
httpd is stopped

[root@www2 Server]# crm status 
============ 
Last updated: Thu Oct 25 16:43:01 2012 
Stack: openais 
Current DC: www2.gjp.com - partition WITHOUT quorum 
Version: 1.1.5-1.1.el5-01e86afaaa6d4a8c4836f68df80ababd6ca3902f 
2 Nodes configured, 2 expected votes 
1 Resources configured. 
============

Online: [ www2.gjp.com ] 
OFFLINE: [ www1.gjp.com ]

Failed actions: 
    webserver_monitor_0 (node=www2.gjp.com, call=3, rc=5, status=complete): not installed
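Note that in the status above www2 reports "partition WITHOUT quorum" and the web resource group is absent. With Pacemaker's default no-quorum-policy=stop, a two-node cluster that loses one node also loses quorum and stops all resources. A common remedy for two-node clusters (an assumption here; this step is not shown in the transcript) is:

```
crm(live)configure# property no-quorum-policy=ignore
crm(live)configure# commit
```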

This error message keeps appearing:

As it suggests, install the httpd service on www2; the corosync service must then be restarted, otherwise the change is not detected and the error remains!

[root@www2 Server]# service corosync stop 
Signaling Corosync Cluster Engine (corosync) to terminate: [  OK  ] 
Waiting for corosync services to unload:^[[A.^H.....       [  OK  ] 
[root@www2 Server]# service corosync start 
Starting Corosync Cluster Engine (corosync):               [  OK  ] 
[root@www2 Server]# crm status 
============ 
Last updated: Thu Oct 25 16:47:18 2012 
Stack: openais 
Current DC: www1.gjp.com - partition with quorum 
Version: 1.1.5-1.1.el5-01e86afaaa6d4a8c4836f68df80ababd6ca3902f 
2 Nodes configured, 2 expected votes 
1 Resources configured. 
============

Online: [ www1.gjp.com www2.gjp.com ]

Resource Group: web 
     webip    (ocf::heartbeat:IPaddr):    Started www1.gjp.com 
     webserver    (lsb:httpd):    Started www1.gjp.com
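As an alternative to restarting corosync after installing httpd on www2, re-probing the failed resource should also clear the stale "not installed" record (assuming this crm shell version provides the cleanup command):

```
[root@www2 ~]# crm resource cleanup webserver
```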

Solving the problem of www2 being unable to take over the service:

[root@www1 ~]# service corosync start 
Starting Corosync Cluster Engine (corosync):               [  OK  ] 
[root@www1 ~]# service httpd status 
httpd (pid  5233) is running...

image

Create the web page on www2:

[root@www2 Server]# echo "www2.gjp.com " >/var/www/html/index.html

As soon as www1 dies:

[root@www1 ~]# service corosync stop 
Signaling Corosync Cluster Engine (corosync) to terminate: [  OK  ] 
Waiting for corosync services to unload:.......            [  OK  ]

image

Access still works:

[root@www2 Server]# service httpd status 
httpd (pid  4656) is running...

[root@www2 Server]# crm status 
============ 
Last updated: Thu Oct 25 17:12:16 2012 
Stack: openais 
Current DC: www2.gjp.com - partition WITHOUT quorum 
Version: 1.1.5-1.1.el5-01e86afaaa6d4a8c4836f68df80ababd6ca3902f 
2 Nodes configured, 2 expected votes 
1 Resources configured. 
============

Online: [ www2.gjp.com ] 
OFFLINE: [ www1.gjp.com ]

Resource Group: web 
     webip    (ocf::heartbeat:IPaddr):    Started www2.gjp.com 
     webserver    (lsb:httpd):    Started www2.gjp.com

After www1 recovers, it cannot seize the resources back! Observe the following:

[root@www1 ~]# service corosync start 
Starting Corosync Cluster Engine (corosync):               [  OK  ] 
[root@www1 ~]# crm status 
============ 
Last updated: Thu Oct 25 17:20:44 2012 
Stack: openais 
Current DC: www2.gjp.com - partition with quorum 
Version: 1.1.5-1.1.el5-01e86afaaa6d4a8c4836f68df80ababd6ca3902f 
2 Nodes configured, 2 expected votes 
1 Resources configured. 
============

Online: [ www1.gjp.com www2.gjp.com ]

Resource Group: web 
     webip    (ocf::heartbeat:IPaddr):    Started www2.gjp.com 
     webserver    (lsb:httpd):    Started www2.gjp.com

image

No matter how often you refresh, you land on www2!
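This stay-put behavior is expected: with no location constraints defined, Pacemaker has no reason to move a running group back to a recovered node. To control failback explicitly you could set a default resource stickiness (a hypothetical addition, not part of the original configuration):

```
crm(live)configure# rsc_defaults resource-stickiness=100
crm(live)configure# commit
```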

Unless the corosync service on www2 dies as well:

[root@www2 Server]# service corosync stop 
Signaling Corosync Cluster Engine (corosync) to terminate: [  OK  ] 
Waiting for corosync services to unload:.......            [  OK  ]

image

image

Configuration on www2.gjp.com:

[root@gjp99 ~]# cat /etc/sysconfig/network 
NETWORKING=yes 
NETWORKING_IPV6=yes 
HOSTNAME=www2.gjp.com 
[root@gjp99 ~]# hostname www2.gjp.com

[root@www2 ~]# hwclock -s 
[root@www2 ~]# clock 
Tue 23 Oct 2012 05:20:32 PM CST  -0.018132 seconds

[root@www2 .ssh]# cat /etc/hosts 
# Do not remove the following line, or various programs 
# that require network functionality will fail. 
127.0.0.1   localhost.localdomain  localhost 
::1        localhost6.localdomain6 localhost6 
192.168.2.1     www1.gjp.com      www1 
192.168.2.2     www2.gjp.com      www2

[root@www2 ~]# ping www1.gjp.com 
PING www1.gjp.com (192.168.2.1) 56(84) bytes of data. 
64 bytes from www1.gjp.com (192.168.2.1): icmp_seq=1 ttl=64 time=1.11 ms 
64 bytes from www1.gjp.com (192.168.2.1): icmp_seq=2 ttl=64 time=0.506 ms

The names now resolve to each other!

[root@www2 ~]# cat /etc/yum.repos.d/rhel-debuginfo.repo 
[rhel-server] 
name=Red Hat Enterprise Linux Server 
baseurl=file:///mnt/cdrom/Server 
enabled=1 
gpgcheck=1 
gpgkey=file:///mnt/cdrom/RPM-GPG-KEY-redhat-release

[rhel-cluster] 
name=Red Hat Enterprise Linux Cluster 
baseurl=file:///mnt/cdrom/Cluster 
enabled=1 
gpgcheck=1 
gpgkey=file:///mnt/cdrom/RPM-GPG-KEY-redhat-release

Make sure the optical drive is attached:

image

[root@www2 ~]# mkdir /mnt/cdrom 
[root@www2 ~]# mount /dev/cdrom /mnt/cdrom

[root@www2 ~]# yum grouplist all 
Loaded plugins: rhnplugin, security 
This system is not registered with RHN. 
RHN support will be disabled. 
Setting up Group Process 
rhel-cluster                                                                | 1.3 kB     00:00     
rhel-cluster/primary                                                        | 6.5 kB     00:00     
rhel-server                                                                 | 1.3 kB     00:00     
rhel-server/primary                                                         | 732 kB     00:00     
rhel-cluster/group                                                          | 101 kB     00:00     
rhel-server/group                                                           | 1.0 MB     00:00     
Done

Set up password-less SSH communication between the nodes on the same subnet:

[root@www2 ~]# ssh-keygen -t rsa 
Generating public/private rsa key pair. 
Enter file in which to save the key (/root/.ssh/id_rsa): 
/root/.ssh/id_rsa already exists. 
Overwrite (y/n)? 
[root@www2 ~]# cd .ssh/ 
[root@www2 .ssh]# ls 
id_rsa  id_rsa.pub 
[root@www2 .ssh]# ssh-copy-id -i id_rsa.pub www1 
The authenticity of host 'www1 (192.168.2.1)' can't be established. 
RSA key fingerprint is 87:be:8b:a4:bd:11:11:10:c2:ec:2d:ef:02:68:f6:0e. 
Are you sure you want to continue connecting (yes/no)? yes 
Warning: Permanently added 'www1,192.168.2.1' (RSA) to the list of known hosts. 
root@www1's password: 
Now try logging into the machine, with "ssh 'www1'", and check in:

  .ssh/authorized_keys

to make sure we haven't added extra keys that you weren't expecting.

[root@www2 .ssh]# scp /etc/yum.repos.d/rhel-debuginfo.repo  www1:/etc/yum.repos.d/ 
rhel-debuginfo.repo                                              100%  318     0.3KB/s   00:00    
[root@www2 .ssh]# date 
Wed Oct 24 11:30:30 CST 2012 
[root@www2 .ssh]# ssh www1 'date' 
Wed Oct 24 11:30:40 CST 2012

[root@www1 ~]# ssh-keygen -t rsa

[root@www1 .ssh]# ssh-copy-id -i id_rsa.pub www2

 

Upload the required packages:

image

[root@www2 ~]# mount /dev/cdrom /mnt/cdrom 
mount: block device /dev/cdrom is write-protected, mounting read-only 
[root@www2 ~]# yum localinstall -y *.rpm --nogpgcheck

 

Verify that the corosync engine started properly:

[root@www2 ~]#  grep -i  -e "corosync cluster engine" -e "configuration file" /var/log/messages 
Oct 24 11:09:03 www2 smartd[3259]: Opened configuration file /etc/smartd.conf 
Oct 24 11:09:03 www2 smartd[3259]: Configuration file /etc/smartd.conf was parsed, found DEVICESCAN, scanning devices 
Oct 24 17:09:07 www2 corosync[28610]:   [MAIN  ] Corosync Cluster Engine ('1.2.7'): started and ready to provide service. 
Oct 24 17:09:07 www2 corosync[28610]:   [MAIN  ] Successfully read main configuration file '/etc/corosync/corosync.conf'.

[root@www2 Server]# yum install httpd -y

 

Configuring DRBD:

Configuration on www1:

[root@www1 ~]# fdisk /dev/sda

The number of cylinders for this disk is set to 2610. 
There is nothing wrong with that, but this is larger than 1024, 
and could in certain setups cause problems with: 
1) software that runs at boot time (e.g., old versions of LILO) 
2) booting and partitioning software from other OSs 
   (e.g., DOS FDISK, OS/2 FDISK)

Command (m for help): n 
Command action 
   e   extended 
   p   primary partition (1-4) 

Selected partition 4 
First cylinder (1354-2610, default 1354): 
Using default value 1354 
Last cylinder or +size or +sizeM or +sizeK (1354-2610, default 2610): 
Using default value 2610

Command (m for help): p

Disk /dev/sda: 21.4 GB, 21474836480 bytes 
255 heads, 63 sectors/track, 2610 cylinders 
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot      Start         End      Blocks   Id  System 
/dev/sda1   *           1          13      104391   83  Linux 
/dev/sda2              14        1288    10241437+  83  Linux 
/dev/sda3            1289        1353      522112+  82  Linux swap / Solaris 
/dev/sda4            1354        2610    10096852+   5  Extended

Command (m for help): n 
First cylinder (1354-2610, default 1354): p 
First cylinder (1354-2610, default 1354): 
Using default value 1354 
Last cylinder or +size or +sizeM or +sizeK (1354-2610, default 2610): +2g

Command (m for help): p

Disk /dev/sda: 21.4 GB, 21474836480 bytes 
255 heads, 63 sectors/track, 2610 cylinders 
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot      Start         End      Blocks   Id  System 
/dev/sda1   *           1          13      104391   83  Linux 
/dev/sda2              14        1288    10241437+  83  Linux 
/dev/sda3            1289        1353      522112+  82  Linux swap / Solaris 
/dev/sda4            1354        2610    10096852+   5  Extended 
/dev/sda5            1354        1597     1959898+  83  Linux

Command (m for help): w 
The partition table has been altered!

Calling ioctl() to re-read partition table.

WARNING: Re-reading the partition table failed with error 16: Device or resource busy. 
The kernel still uses the old table. 
The new table will be used at the next reboot. 
Syncing disks.

[root@www1 ~]# partprobe /dev/sda 
[root@www1 ~]# cat /proc/partitions 
major minor  #blocks  name

   8     0   20971520 sda 
   8     1     104391 sda1 
   8     2   10241437 sda2 
   8     3     522112 sda3 
   8     4          0 sda4 
   8     5    1959898 sda5

Do the same partitioning on node 2.

Install drbd, which is used to build the distributed storage.

Choose the build that matches your system; the packages used here are:

drbd83-8.3.8-1.el5.centos.i386.rpm

kmod-drbd83-8.3.8-1.el5.centos.i686.rpm

image

[root@www1 ~]# yum localinstall -y drbd83-8.3.8-1.el5.centos.i386.rpm --nogpgcheck

[root@www1 ~]# yum localinstall -y kmod-drbd83-8.3.8-1.el5.centos.i686.rpm --nogpgcheck

Do the same on node 2.

[root@www1 ~]# cp /usr/share/doc/drbd83-8.3.8/drbd.conf  /etc 
cp: overwrite `/etc/drbd.conf'? y   (you must answer yes to overwrite!) 
[root@www1 ~]# scp /etc/drbd.conf  www2:/etc/

[root@www1 ~]# vim /etc/drbd.d/global_common.conf 
[root@www1 ~]# cat /etc/drbd.d/global_common.conf

global { 
        usage-count yes; 
        # minor-count dialog-refresh disable-ip-verification 
}

common { 
        protocol C;

        startup { 
                wfc-timeout  120; 
                degr-wfc-timeout 120; 
         } 
        disk { 
                  on-io-error detach; 
                  fencing resource-only;

          } 
        net { 
                cram-hmac-alg "sha1"; 
                shared-secret  "mydrbdlab"; 
         } 
        syncer { 
                  rate  100M; 
         }

}

[root@www1 ~]# vim /etc/drbd.d/web.res 
[root@www1 ~]# cat /etc/drbd.d/web.res 
resource  web { 
        on www1.gjp.com { 
        device   /dev/drbd0; 
        disk    /dev/sda5; 
        address  192.168.2.1:7789; 
        meta-disk       internal; 
        }  

        on www2.gjp.com { 
        device   /dev/drbd0; 
        disk    /dev/sda5; 
        address  192.168.2.2:7789; 
        meta-disk       internal; 
        }   
}

Start the initialization.

Run on both nodes:

drbdadm create-md web

Start the service on both nodes:

service drbd start

Check the status:

[root@www1 ~]# drbdadm create-md web

[root@www1 ~]# service drbd start

Both sides must be started at the same time!
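Collected in one place, the DRBD bring-up sequence used in the rest of this section is:

```
# on BOTH nodes:
drbdadm create-md web       # initialize the metadata for resource "web"
service drbd start          # start on both nodes at (roughly) the same time

# then on the node that is to become Primary (www1 here):
drbdadm -- --overwrite-data-of-peer primary web
cat /proc/drbd              # watch until ds shows UpToDate/UpToDate
```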

 

Configuration on www2:

[root@www2 Server]# fdisk /dev/sda

The number of cylinders for this disk is set to 2610. 
There is nothing wrong with that, but this is larger than 1024, 
and could in certain setups cause problems with: 
1) software that runs at boot time (e.g., old versions of LILO) 
2) booting and partitioning software from other OSs 
   (e.g., DOS FDISK, OS/2 FDISK)

Command (m for help): p

Disk /dev/sda: 21.4 GB, 21474836480 bytes 
255 heads, 63 sectors/track, 2610 cylinders 
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot      Start         End      Blocks   Id  System 
/dev/sda1   *           1          13      104391   83  Linux 
/dev/sda2              14        1288    10241437+  83  Linux 
/dev/sda3            1289        1353      522112+  82  Linux swap / Solaris

Command (m for help): n 
Command action 
   e   extended 
   p   primary partition (1-4) 

Selected partition 4 
First cylinder (1354-2610, default 1354): 
Using default value 1354 
Last cylinder or +size or +sizeM or +sizeK (1354-2610, default 2610): 
Using default value 2610

Command (m for help): p

Disk /dev/sda: 21.4 GB, 21474836480 bytes 
255 heads, 63 sectors/track, 2610 cylinders 
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot      Start         End      Blocks   Id  System 
/dev/sda1   *           1          13      104391   83  Linux 
/dev/sda2              14        1288    10241437+  83  Linux 
/dev/sda3            1289        1353      522112+  82  Linux swap / Solaris 
/dev/sda4            1354        2610    10096852+   5  Extended

Command (m for help): n 
First cylinder (1354-2610, default 1354): 
Using default value 1354 
Last cylinder or +size or +sizeM or +sizeK (1354-2610, default 2610): +2g

Command (m for help): p

Disk /dev/sda: 21.4 GB, 21474836480 bytes 
255 heads, 63 sectors/track, 2610 cylinders 
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot      Start         End      Blocks   Id  System 
/dev/sda1   *           1          13      104391   83  Linux 
/dev/sda2              14        1288    10241437+  83  Linux 
/dev/sda3            1289        1353      522112+  82  Linux swap / Solaris 
/dev/sda4            1354        2610    10096852+   5  Extended 
/dev/sda5            1354        1597     1959898+  83  Linux

Command (m for help): w 
The partition table has been altered!

Calling ioctl() to re-read partition table.

WARNING: Re-reading the partition table failed with error 16: Device or resource busy. 
The kernel still uses the old table. 
The new table will be used at the next reboot. 
Syncing disks.

[root@www2 Server]# partprobe /dev/sda 
[root@www2 Server]# cat /proc/partitions 
major minor  #blocks  name

   8     0   20971520 sda 
   8     1     104391 sda1 
   8     2   10241437 sda2 
   8     3     522112 sda3 
   8     4          0 sda4 
   8     5    1959898 sda5

[root@www1 ~]# scp drbd83-8.3.8-1.el5.centos.i386.rpm kmod-drbd83-8.3.8-1.el5.centos.i686.rpm www2:/root 
root@www2's password: 
drbd83-8.3.8-1.el5.centos.i386.rpm              100%  217KB 216.7KB/s   00:00    
kmod-drbd83-8.3.8-1.el5.centos.i686.rpm         100%  123KB 123.0KB/s   00:00 

[root@www2 ~]# rpm -ivh drbd83-8.3.8-1.el5.centos.i386.rpm 
warning: drbd83-8.3.8-1.el5.centos.i386.rpm: Header V3 DSA signature: NOKEY, key ID e8562897 
Preparing...                ########################################### [100%] 
   1:drbd83                 warning: /etc/drbd.conf created as /etc/drbd.conf.rpmnew 
########################################### [100%] 
[root@www2 ~]# rpm -ivh kmod-drbd83-8.3.8-1.el5.centos.i686.rpm 
warning: kmod-drbd83-8.3.8-1.el5.centos.i686.rpm: Header V3 DSA signature: NOKEY, key ID e8562897 
Preparing...                ########################################### [100%] 
   1:kmod-drbd83            ########################################### [100%] 
[root@www2 ~]# cp /usr/share/doc/drbd83-8.3.8/drbd.conf   /etc/ 
cp: overwrite `/etc/drbd.conf'? y

[root@www2 ~]# scp www1:/etc/drbd.d/global_common.conf  /etc/drbd.d/global_common.conf 
global_common.conf                              100%  505     0.5KB/s   00:00

[root@www2 ~]# scp www1:/etc/drbd.d/web.res  /etc/drbd.d/web.res 
web.res                                         100%  348     0.3KB/s   00:00

[root@www2 ~]# drbdadm   create-md web

[root@www2 ~]# service drbd start 
Starting DRBD resources: [ 
web 
Found valid meta data in the expected location, 2006929408 bytes into /dev/sda5. 
d(web) s(web) n(web) ].

[root@www1 ~]# service drbd start 
Starting DRBD resources: [ ]. 
[root@www1 ~]# cat /proc/drbd 
version: 8.3.8 (api:88/proto:86-94) 
GIT-hash: d78846e52224fd00562f7c225bcc25b2d422321d build by mockbuild@builder10.centos.org, 2010-06-04 08:04:16 
0: cs:Connected ro:Secondary/Secondary ds:Inconsistent/Inconsistent C r---- 
    ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:195980

Both sides are in Secondary state; nothing has been synchronized yet.

You can also check with drbd-overview.

Then promote www1 to Primary and force the initial sync: 
[root@www1 ~]# drbdadm   -- --overwrite-data-of-peer primary web

[root@www1 ~]# vim /etc/drbd.d/global_common.conf

The synchronization rate can be tuned via the rate option.

[root@www1 ~]# cat /proc/drbd 
version: 8.3.8 (api:88/proto:86-94) 
GIT-hash: d78846e52224fd00562f7c225bcc25b2d422321d build by mockbuild@builder10.centos.org, 2010-06-04 08:04:16 
0: cs:SyncSource ro:Primary/Secondary ds:UpToDate/Inconsistent C r---- 
    ns:259716 nr:0 dw:0 dr:267904 al:0 bm:15 lo:1 pe:31 ua:256 ap:0 ep:1 wo:b oos:1701048 
    [=>..................] sync'ed: 13.4% (1701048/1959800)K delay_probe: 25 
    finish: 0:00:37 speed: 45,120 (23,520) K/sec

[root@www1 ~]# cat /proc/drbd 
version: 8.3.8 (api:88/proto:86-94) 
GIT-hash: d78846e52224fd00562f7c225bcc25b2d422321d build by mockbuild@builder10.centos.org, 2010-06-04 08:04:16 
0: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r---- 
    ns:1959800 nr:0 dw:0 dr:1959800 al:0 bm:120 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:0

[root@www1 ~]# drbd-overview 
  0:web  Connected Primary/Secondary UpToDate/UpToDate C r----

[root@www2 ~]# cat /proc/drbd 
version: 8.3.8 (api:88/proto:86-94) 
GIT-hash: d78846e52224fd00562f7c225bcc25b2d422321d build by mockbuild@builder10.centos.org, 2010-06-04 08:04:16 
0: cs:Connected ro:Secondary/Secondary ds:Inconsistent/Inconsistent C r---- 
    ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:1959800 
[root@www2 ~]# cat /proc/drbd 
version: 8.3.8 (api:88/proto:86-94) 
GIT-hash: d78846e52224fd00562f7c225bcc25b2d422321d build by mockbuild@builder10.centos.org, 2010-06-04 08:04:16 
0: cs:Connected ro:Secondary/Primary ds:UpToDate/UpToDate C r---- 
    ns:0 nr:1959800 dw:1959800 dr:0 al:0 bm:120 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:

Create the file system (done on the Primary node):

mkfs -t ext3  -L drbdweb  /dev/drbd0

[root@www1 ~]# mkfs -t ext3  -L drbdweb  /dev/drbd0 
mke2fs 1.39 (29-May-2006) 
Filesystem label=drbdweb 
OS type: Linux 
Block size=4096 (log=2) 
Fragment size=4096 (log=2) 
245280 inodes, 489950 blocks 
24497 blocks (5.00%) reserved for the super user 
First data block=0 
Maximum filesystem blocks=503316480 
15 block groups 
32768 blocks per group, 32768 fragments per group 
16352 inodes per group 
Superblock backups stored on blocks: 
    32768, 98304, 163840, 229376, 294912

Writing inode tables: done                            
Creating journal (8192 blocks): done 
Writing superblocks and filesystem accounting information: done

This filesystem will be automatically checked every 37 mounts or 
180 days, whichever comes first.  Use tune2fs -c or -i to override.
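As the mkfs output notes, the periodic forced check can be tuned with tune2fs. A hedged sketch (run on whichever node is currently Primary; disabling the checks here is my suggestion, not part of the original walkthrough), so a failover is never delayed by a scheduled fsck:

```shell
# -c 0: never force a check based on mount count
# -i 0: never force a check based on elapsed time
tune2fs -c 0 -i 0 /dev/drbd0

# confirm the new settings
tune2fs -l /dev/drbd0 | grep -Ei 'mount count|check interval'
```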

[root@www1 ~]# mkdir /web 
[root@www1 ~]# mount /dev/drbd0 /web/ 
[root@www1 ~]# mount 
/dev/sda2 on / type ext3 (rw) 
proc on /proc type proc (rw) 
sysfs on /sys type sysfs (rw) 
devpts on /dev/pts type devpts (rw,gid=5,mode=620) 
/dev/sda1 on /boot type ext3 (rw) 
tmpfs on /dev/shm type tmpfs (rw) 
none on /proc/sys/fs/binfmt_misc type binfmt_misc (rw) 
sunrpc on /var/lib/nfs/rpc_pipefs type rpc_pipefs (rw) 
/dev/hdc on /mnt/cdrom type iso9660 (ro) 
/dev/drbd0 on /web type ext3 (rw)

[root@www1 ~]# cd /web 
[root@www1 web]# echo "web1 " >index.html 
[root@www1 web]# ll 
total 20 
-rw-r--r-- 1 root root     6 Oct 25 21:11 index.html 
drwx------ 2 root root 16384 Oct 25 20:57 lost+found

 

[root@www2 ~]# mkdir /web2 
[root@www2 ~]# mount /dev/drbd0 /web2 
mount: block device /dev/drbd0 is write-protected, mounting read-only 
mount: Wrong medium type

The secondary has no access to the device at all: a DRBD Secondary cannot be mounted, not even read-only!

[root@www1 ~]# umount /web 
[root@www1 ~]# drbdadm secondary web 
[root@www1 ~]# cat /proc/drbd 
version: 8.3.8 (api:88/proto:86-94) 
GIT-hash: d78846e52224fd00562f7c225bcc25b2d422321d build by mockbuild@builder10.centos.org, 2010-06-04 08:04:16 
0: cs:Connected ro:Secondary/Secondary ds:UpToDate/UpToDate C r---- 
    ns:2024140 nr:0 dw:64340 dr:1959937 al:24 bm:135 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:0

[root@www2 ~]# drbdadm primary web 
[root@www2 ~]# mount /dev/drbd0 /web2 
[root@www2 ~]# ll /web2 
total 20 
-rw-r--r-- 1 root root     6 Oct 25 21:11 index.html 
drwx------ 2 root root 16384 Oct 25 20:57 lost+found

[root@www2 ~]# cat /proc/drbd 
version: 8.3.8 (api:88/proto:86-94) 
GIT-hash: d78846e52224fd00562f7c225bcc25b2d422321d build by mockbuild@builder10.centos.org, 2010-06-04 08:04:16 
0: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r---- 
    ns:40 nr:2024140 dw:2024180 dr:221 al:1 bm:120 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:0 
[root@www2 ~]# cd /web2 
[root@www2 web2]# touch gjp.txt 
[root@www2 web2]# ll 
total 20 
-rw-r--r-- 1 root root     0 Oct 25 21:16 gjp.txt 
-rw-r--r-- 1 root root     6 Oct 25 21:11 index.html 
drwx------ 2 root root 16384 Oct 25 20:57 lost+found

Note: to restore www1 as primary and www2 as secondary, you must first unmount the mount point on www2, and only then switch the primary/secondary roles!
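The failback described above can be sketched as the following sequence, assuming the resource name web and the mount points used in this walkthrough:

```shell
# On www2 (current Primary): unmount first, then demote
umount /web2
drbdadm secondary web

# On www1: promote, then remount
drbdadm primary web
mount /dev/drbd0 /web

# Verify roles from either node; www1 should again show Primary/Secondary
drbd-overview
```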

[root@www1 ~]# cat /proc/drbd 
version: 8.3.8 (api:88/proto:86-94) 
GIT-hash: d78846e52224fd00562f7c225bcc25b2d422321d build by mockbuild@builder10.centos.org, 2010-06-04 08:04:16 
0: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r---- 
    ns:2024140 nr:96 dw:64436 dr:1959937 al:24 bm:135 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:0

[root@www1 ~]# cd /var/www/html 
[root@www1 html]# ll 
total 4 
-rw-r--r-- 1 root root  0 Oct 25 16:14 gjp1 
-rw-r--r-- 1 root root 13 Oct 25 16:23 index.html 
[root@www1 html]# mv index.html /web/

mv: overwrite `/web/index.html'? y

The old index.html must be overwritten; it was just a scratch file, not the real site page.

[root@www1 html]# cd /web/ 
[root@www1 web]# ll 
total 20 
-rw-r--r-- 1 root root     0 Oct 25 21:16 gjp.txt 
-rw-r--r-- 1 root root    13 Oct 25 16:23 index.html 
drwx------ 2 root root 16384 Oct 25 20:57 lost+found

[root@www1 web]# vim /etc/httpd/conf/httpd.conf

image

Change the DocumentRoot to /web:

image

On www2, make the same change:

image

Next we make corosync manage drbd automatically.

Once drbd mounts the mount point automatically, and the web content already lives under that mount point /web, the whole stack can fail over without manual steps.

Changes required:

Since both corosync nodes share one configuration, the mount point must be identical on both machines:

that is, create the mount point /web on www2 as well, and change the site's default directory in httpd.conf to /web.
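The preparation on www2 can be sketched as follows (the httpd.conf path is the RHEL 5 default; the sed one-liner is my shorthand for the manual edit shown in the screenshots):

```shell
# On www2: create the same mount point the cluster will use
mkdir -p /web

# Point Apache's DocumentRoot at the DRBD-backed directory
sed -i 's#^DocumentRoot .*#DocumentRoot "/web"#' /etc/httpd/conf/httpd.conf

# Confirm the change took effect
grep '^DocumentRoot' /etc/httpd/conf/httpd.conf
```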

How do we tie drbd into corosync?

Add drbd as a resource managed by the corosync/pacemaker cluster.

Commands to run:

crm configure primitive drbd_web_FS ocf:heartbeat:Filesystem params device="/dev/drbd0" directory="/web" fstype="ext3"

crm configure primitive httpd_drbd_web ocf:heartbeat:drbd params drbd_resource="web" op monitor interval="60s" role="Master" timeout="40s" op monitor interval="70s" role="Slave" timeout="40s"

crm configure master MS_Webdrbd httpd_drbd_web meta master-max="1" master-node-max="1" clone-max="2" clone-node-max="1" notify="true"

crm configure colocation drbd_web_FS_on_MS_Webdrbd inf: drbd_web_FS MS_Webdrbd:Master

crm configure order drbd_web_FS_after_MS_Webdrbd inf: MS_Webdrbd:promote drbd_web_FS:start

crm configure property no-quorum-policy="ignore"
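After entering the commands above, the resulting configuration can be checked before relying on it (a sketch; these are standard crm shell subcommands, the resource names are the ones defined above):

```shell
crm configure verify   # syntax-check the pending CIB
crm configure show     # print the full cluster configuration
crm status             # watch MS_Webdrbd, drbd_web_FS and the web group come up
```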

Viewing the configuration:

image

[root@www1 ~]# cd /etc/drbd.d/ 
[root@www1 drbd.d]# vim global_common.conf

global { 
        usage-count no;   # note: set this to no 
        # minor-count dialog-refresh disable-ip-verification 
}

common { 
        protocol C;

handlers { 
                pri-on-incon-degr "/usr/lib/drbd/notify-pri-on-incon-degr.sh; /usr/lib/drbd/notify-emergency-reboot.sh; echo b > /proc/sysrq-trigger ; reboot -f"; 
                pri-lost-after-sb "/usr/lib/drbd/notify-pri-lost-after-sb.sh; /usr/lib/drbd/notify-emergency-reboot.sh; echo b > /proc/sysrq-trigger ; reboot -f"; 
                local-io-error "/usr/lib/drbd/notify-io-error.sh; /usr/lib/drbd/notify-emergency-shutdown.sh; echo o > /proc/sysrq-trigger ; halt -f";  
                fence-peer "/usr/lib/drbd/crm-fence-peer.sh";  
                split-brain "/usr/lib/drbd/notify-split-brain.sh root";  
                out-of-sync "/usr/lib/drbd/notify-out-of-sync.sh root";  
                before-resync-target "/usr/lib/drbd/snapshot-resync-target-lvm.sh -p 15 -- -c 16k";  
                after-resync-target "/usr/lib/drbd/unsnapshot-resync-target-lvm.sh"; 
        }

        startup { 
                wfc-timeout  120; 
                degr-wfc-timeout 120; 
         }   
        disk { 
                  on-io-error detach;

                 fencing resource-only;

          } 
        net { 
                cram-hmac-alg "sha1"; 
                shared-secret  "mydrbdlab"; 
         } 
        syncer { 
                  rate  100M; 
         }

}

 

Because we enabled resource fencing and split-brain handling in /etc/drbd.d/global_common.conf, a location constraint will automatically appear in the crm configuration (the CIB) when the primary node goes down: it forbids the secondary from being promoted, so that when the old primary comes back there is no split-brain and no fight over the resources. Here, however, we only want to verify that the resources can fail over, so we delete that location constraint:

[root@www1 drbd.d]# crm configure edit

image

These two lines must be deleted on both nodes!
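If you prefer not to use the interactive editor, the auto-generated constraint can also be removed by id. The id below follows the naming pattern crm-fence-peer.sh typically uses, but it is an assumption — check the actual id in your own configuration first:

```shell
# Find the fence-generated location constraint and its id
crm configure show | grep -B1 -A1 'location'

# Delete it by id (hypothetical id; substitute the one from your output)
crm configure delete drbd-fence-by-handler-web
```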

node www1.gjp.com \ 
        attributes standby="on"   
node www2.gjp.com \ 
        attributes standby="off" 
primitive drbd_web_FS ocf:heartbeat:Filesystem \ 
        params device="/dev/drbd0" directory="/web" fstype="ext3" 
primitive httpd_drbd_web ocf:heartbeat:drbd \ 
        params drbd_resource="web" \ 
        op monitor interval="60s" role="Master" timeout="40s" \ 
        op monitor interval="70s" role="Slave" timeout="40s" 
primitive webip ocf:heartbeat:IPaddr \ 
        params ip="192.168.2.66" 
primitive webserver lsb:httpd 
group web webip webserver 
ms MS_Webdrbd httpd_drbd_web \ 
        meta master-max="1" master-node-max="1" clone-max="2" clone-node-max="1" notify="true" 
colocation drbd_web_FS_on_MS_Webdrbd inf: drbd_web_FS MS_Webdrbd:Master 
order drbd_web_FS_after_MS_Webdrbd inf: MS_Webdrbd:promote drbd_web_FS:start 
property $id="cib-bootstrap-options" \ 
        dc-version="1.1.5-1.1.el5-01e86afaaa6d4a8c4836f68df80ababd6ca3902f" \ 
        cluster-infrastructure="openais" \ 
        expected-quorum-votes="2" \ 
        stonith-enabled="false" \ 
        no-quorum-policy="ignore"

[root@www1 drbd.d]# crm status 
============ 
Last updated: Sun Oct 28 15:55:51 2012 
Stack: openais 
Current DC: www1.gjp.com - partition with quorum 
Version: 1.1.5-1.1.el5-01e86afaaa6d4a8c4836f68df80ababd6ca3902f 
2 Nodes configured, 2 expected votes 
3 Resources configured. 
============

Online: [ www1.gjp.com www2.gjp.com ]

Resource Group: web 
     webip    (ocf::heartbeat:IPaddr):    Started www1.gjp.com 
     webserver    (lsb:httpd):    Started www1.gjp.com 
drbd_web_FS    (ocf::heartbeat:Filesystem):    Started www1.gjp.com 
Master/Slave Set: MS_Webdrbd [httpd_drbd_web] 
     Masters: [ www1.gjp.com ] 
     Stopped: [ httpd_drbd_web:1 ]

[root@www1 drbd.d]# service drbd status

drbd driver loaded OK; device status: 
version: 8.3.8 (api:88/proto:86-94) 
GIT-hash: d78846e52224fd00562f7c225bcc25b2d422321d build by mockbuild@builder10.centos.org, 2010-06-04 08:04:16 
m:res  cs            ro               ds                         p                 mounted  fstype 
0:web  WFConnection  Primary/Unknown  UpToDate/Outdated  C  /web     ext3

[root@www1 drbd.d]# crm status 
============ 
Last updated: Sun Oct 28 16:08:38 2012 
Stack: openais 
Current DC: www1.gjp.com - partition with quorum 
Version: 1.1.5-1.1.el5-01e86afaaa6d4a8c4836f68df80ababd6ca3902f 
2 Nodes configured, 2 expected votes 
3 Resources configured. 
============

Online: [ www1.gjp.com www2.gjp.com ]

Resource Group: web 
    webip    (ocf::heartbeat:IPaddr):    Started www1.gjp.com 
     webserver    (lsb:httpd):    Started www1.gjp.com 
drbd_web_FS    (ocf::heartbeat:Filesystem):    Started www2.gjp.com 
Master/Slave Set: MS_Webdrbd [httpd_drbd_web] 
     Masters: [ www2.gjp.com ] 
     Slaves: [ www1.gjp.com ]

A split-brain has occurred; it is resolved as follows:

image 
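The fix shown in the screenshot corresponds to the standard manual split-brain recovery for DRBD 8.3: pick one node as the split-brain "victim" whose local changes are discarded, then reconnect. A sketch using the resource name web (which node you sacrifice depends on whose data you want to keep — adjust accordingly):

```shell
# On the node whose data will be DISCARDED (the split-brain victim):
drbdadm secondary web
drbdadm -- --discard-my-data connect web

# On the surviving node (only needed if it is also StandAlone):
drbdadm connect web

# Watch the resynchronization progress
watch -n 1 'cat /proc/drbd'
```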

[root@www1 drbd.d]# watch -n 1 'crm status'

image

Synchronization works again; check the mount points:

[root@www1 drbd.d]# mount 
/dev/sda2 on / type ext3 (rw) 
proc on /proc type proc (rw) 
sysfs on /sys type sysfs (rw) 
devpts on /dev/pts type devpts (rw,gid=5,mode=620) 
/dev/sda1 on /boot type ext3 (rw) 
tmpfs on /dev/shm type tmpfs (rw) 
none on /proc/sys/fs/binfmt_misc type binfmt_misc (rw) 
sunrpc on /var/lib/nfs/rpc_pipefs type rpc_pipefs (rw) 
/dev/drbd0 on /web type ext3 (rw)

image

Check the state on www2:

[root@www2 drbd.d]# mount 
/dev/sda2 on / type ext3 (rw) 
proc on /proc type proc (rw) 
sysfs on /sys type sysfs (rw) 
devpts on /dev/pts type devpts (rw,gid=5,mode=620) 
/dev/sda1 on /boot type ext3 (rw) 
tmpfs on /dev/shm type tmpfs (rw) 
none on /proc/sys/fs/binfmt_misc type binfmt_misc (rw) 
sunrpc on /var/lib/nfs/rpc_pipefs type rpc_pipefs (rw)

[root@www2 drbd.d]# service httpd status 
httpd is stopped 
[root@www2 drbd.d]# service drbd status 
drbd driver loaded OK; device status: 
version: 8.3.8 (api:88/proto:86-94) 
GIT-hash: d78846e52224fd00562f7c225bcc25b2d422321d build by mockbuild@builder10.centos.org, 2010-06-04 08:04:16 
m:res  cs          ro                 ds                 p      mounted  fstype 
0:web  StandAlone  Secondary/Unknown  UpToDate/Outdated  r----

Now simulate the failure of www1!

image
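The failure can be simulated with Pacemaker's standby mode, which matches the "Node www1.gjp.com: standby" state visible in the crm status output (a sketch; node names as in this cluster):

```shell
# Put www1 into standby, forcing its resources to migrate to www2
crm node standby www1.gjp.com

# After verifying the failover, bring www1 back online
crm node online www1.gjp.com
```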

[root@www2 drbd.d]# crm status 
============ 
Last updated: Sun Oct 28 17:25:27 2012 
Stack: openais 
Current DC: www1.gjp.com - partition with quorum 
Version: 1.1.5-1.1.el5-01e86afaaa6d4a8c4836f68df80ababd6ca3902f 
2 Nodes configured, 2 expected votes 
3 Resources configured. 
============

Node www1.gjp.com: standby 
Online: [ www2.gjp.com ]

Resource Group: web 
     webip    (ocf::heartbeat:IPaddr):    Started www2.gjp.com 
     webserver    (lsb:httpd):    Started www2.gjp.com 
Master/Slave Set: MS_Webdrbd [httpd_drbd_web] 
     Masters: [ www2.gjp.com ] 
     Stopped: [ httpd_drbd_web:0 ] 
drbd_web_FS    (ocf::heartbeat:Filesystem):    Started www2.gjp.com 
[root@www2 drbd.d]# service httpd status 
httpd (pid  8509) is running...

image 

The site is still reachable as before!

eth0      Link encap:Ethernet  HWaddr 00:0C:29:99:12:74  
          inet addr:192.168.2.2  Bcast:192.168.2.255  Mask:255.255.255.0 
          inet6 addr: fe80::20c:29ff:fe99:1274/64 Scope:Link 
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1 
          RX packets:192191 errors:0 dropped:0 overruns:0 frame:0 
          TX packets:103068 errors:0 dropped:0 overruns:0 carrier:0 
          collisions:0 txqueuelen:1000 
          RX bytes:121841514 (116.1 MiB)  TX bytes:13390418 (12.7 MiB) 
          Interrupt:67 Base address:0x2000

eth0:0    Link encap:Ethernet  HWaddr 00:0C:29:99:12:74  
          inet addr:192.168.2.66  Bcast:192.168.2.255  Mask:255.255.255.0 
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1 
          Interrupt:67 Base address:0x2000

lo        Link encap:Local Loopback  
          inet addr:127.0.0.1  Mask:255.0.0.0 
          inet6 addr: ::1/128 Scope:Host

[root@www2 drbd.d]# service drbd status 
drbd driver loaded OK; device status: 
version: 8.3.8 (api:88/proto:86-94) 
GIT-hash: d78846e52224fd00562f7c225bcc25b2d422321d build by mockbuild@builder10.centos.org, 2010-06-04 08:04:16 
m:res  cs          ro               ds                 p      mounted  fstype 
0:web  StandAlone  Primary/Unknown  UpToDate/Outdated  r----  ext3

Source: http://blog.51cto.com/guojiping/1036951
