Deploying a high-availability cluster

Preparing the lab environment:

Prepare three RHEL 6.5 virtual machines; the physical host is used for testing. Configure host-name resolution on all machines.

Name resolution

[root@server3 ~]# cat /etc/hosts

127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4

::1         localhost localhost.localdomain localhost6 localhost6.localdomain6

172.25.50.10 server1.example.com

172.25.50.20 server2.example.com

172.25.50.30 server3.example.com

172.25.50.250 real50.example.com

Configuring the yum repositories

[root@server3 ~]# cat /etc/yum.repos.d/redhat6.repo

[Server]

name=rhel6.5 Server

baseurl=http://172.25.50.250/rhel6.5

gpgcheck=0

 

[HighAvailability]

name=rhel6.5 HighAvailability

baseurl=http://172.25.50.250/rhel6.5/HighAvailability

gpgcheck=0

 

[LoadBalancer]

name=rhel6.5 LoadBalancer

baseurl=http://172.25.50.250/rhel6.5/LoadBalancer

gpgcheck=0

 

[ResilientStorage]

name=rhel6.5 ResilientStorage

baseurl=http://172.25.50.250/rhel6.5/ResilientStorage

gpgcheck=0

 

[ScalableFileSystem]

name=rhel6.5 ScalableFileSystem

baseurl=http://172.25.50.250/rhel6.5/ScalableFileSystem

gpgcheck=0

 

[root@server3 ~]# yum repolist

Loaded plugins: product-id, subscription-manager

This system is not registered to Red Hat Subscription Management. You can use subscription-manager to register.

HighAvailability                                         | 3.9 kB     00:00     

HighAvailability/primary_db                              |  43 kB     00:00     

LoadBalancer                                             | 3.9 kB     00:00     

LoadBalancer/primary_db                                  | 7.0 kB     00:00     

ResilientStorage                                         | 3.9 kB     00:00     

ResilientStorage/primary_db                              |  47 kB     00:00     

ScalableFileSystem                                       | 3.9 kB     00:00     

ScalableFileSystem/primary_db                            | 6.8 kB     00:00     

Server                                                   | 3.9 kB     00:00     

Server/primary_db                                        | 3.1 MB     00:00     

repo id                          repo name                                status

HighAvailability                 rhel6.5 HighAvailability                    56

LoadBalancer                     rhel6.5 LoadBalancer                         4

ResilientStorage                 rhel6.5 ResilientStorage                    62

ScalableFileSystem               rhel6.5 ScalableFileSystem                   7

Server                           rhel6.5 Server                           3,690

repolist: 3,819

 

All three RHEL 6.5 virtual machines get the same configuration.

Installing the software

server1

[root@server1 yum.repos.d]# yum install ricci -y

[root@server1 yum.repos.d]# passwd ricci

Changing password for user ricci.

New password:                 # westos

BAD PASSWORD: it is based on a dictionary word

BAD PASSWORD: it is too simple

Retype new password:          # westos

passwd: all authentication tokens updated successfully.

 

[root@server1 yum.repos.d]# /etc/init.d/ricci start     # start the service

Starting system message bus:                               [  OK  ]

Starting oddjobd:                                          [  OK  ]

generating SSL certificates...  done

Generating NSS database...  done

Starting ricci:                                            [  OK  ]

[root@server1 yum.repos.d]#

Broadcast message from root@server1.example.com

(unknown) at 14:48 ...

 

The system is going down for reboot NOW!

Connection to 172.25.50.10 closed by remote host.

Connection to 172.25.50.10 closed.
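Since the nodes are rebooted when the cluster is created, it is worth making sure ricci comes back up automatically. A minimal sketch using chkconfig (run the same on server2):

[root@server1 ~]# chkconfig ricci on            # start ricci at boot
[root@server1 ~]# chkconfig --list ricci        # verify runlevels 2-5 show "on"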

 

 

server2

[root@server2 yum.repos.d]# yum install ricci -y

[root@server2 yum.repos.d]# passwd ricci

Changing password for user ricci.

New password:                 # westos

BAD PASSWORD: it is based on a dictionary word

BAD PASSWORD: it is too simple

Retype new password:          # westos

passwd: all authentication tokens updated successfully.

 

[root@server2 yum.repos.d]# /etc/init.d/ricci start     # start the service

Starting system message bus:                               [  OK  ]

Starting oddjobd:                                          [  OK  ]

generating SSL certificates...  done

Generating NSS database...  done

Starting ricci:                                            [  OK  ]

[root@server2 yum.repos.d]#

Broadcast message from root@server2.example.com

(unknown) at 14:48 ...

 

The system is going down for reboot NOW!

Connection to 172.25.50.20 closed by remote host.

Connection to 172.25.50.20 closed.

 

 

 

server3

Install the luci package

[root@server3 ~]# yum install luci -y

 

[root@server3 ~]# /etc/init.d/luci start

Adding following auto-detected host IDs (IP addresses/domain names), corresponding to `server3.example.com' address, to the configuration of self-managed certificate `/var/lib/luci/etc/cacert.config' (you can change them by editing `/var/lib/luci/etc/cacert.config', removing the generated certificate `/var/lib/luci/certs/host.pem' and restarting luci):

(none suitable found, you can still do it manually as mentioned above)

 

Generating a 2048 bit RSA private key

writing new private key to '/var/lib/luci/certs/host.pem'

Starting saslauthd:                                        [  OK  ]

Start luci...                                              [  OK  ]

Point your web browser to https://server3.example.com:8084 (or equivalent) to access luci
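To keep luci available after server3 reboots, it can be enabled at boot and its port checked; a small sketch:

[root@server3 ~]# chkconfig luci on                 # start luci at boot
[root@server3 ~]# netstat -tlnp | grep 8084         # luci should be listening on port 8084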

 

In a browser on the physical host:

https://server3.example.com:8084

Log in with the root account and password of the host --> Create --> enter the cluster node information

The password required for each node is the ricci user's password (westos)

The options are selected as shown below:

[screenshots of the luci cluster-creation dialog omitted]
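Once the cluster has been created (it is named lyitx in this lab, as the clustat output later shows), luci pushes the configuration to both nodes and starts the cluster stack on them. A quick verification sketch on either node:

[root@server1 ~]# cat /etc/cluster/cluster.conf     # configuration written by luci
[root@server1 ~]# clustat                           # both nodes should appear Online
[root@server1 ~]# cman_tool status | head           # cluster name, quorum and votes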

 

Terms to know: cman -- the cluster manager

  rgmanager -- the resource group manager

  fence -- the power fencing device

  corosync -- the cluster messaging and membership layer
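On the nodes these components run as ordinary SysV services, so their state can be checked directly; a sketch:

[root@server1 ~]# /etc/init.d/cman status           # corosync membership/quorum layer
[root@server1 ~]# /etc/init.d/rgmanager status      # resource group manager
[root@server1 ~]# chkconfig --list | egrep 'cman|rgmanager|ricci|modclusterd'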

 

Click Fence Devices

Fence virt (Multicast Mode) --> Name: vmfence --> Submit

[screenshot of the luci Fence Devices page omitted]

 

In the luci web interface, select Nodes --> click server1.example.com --> Add Fence Method --> Method Name: fence-1 --> look up the corresponding UUID and fill it into the first field     ## the UUID is used here because the physical host cannot resolve the guests' hostnames

In the luci web interface, select Nodes --> click server2.example.com --> Add Fence Method --> Method Name: fence-2 --> look up the corresponding UUID and fill it into the first field     ## same reason: the physical host cannot resolve the guests' hostnames

The UUIDs to fill in can be listed with virsh list --uuid; they appear in the same order as the guests in virt-manager.

[screenshot of the node fence-method configuration omitted]
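To match each guest with its UUID on the physical host, something like the following can be used (a sketch; the domain names are whatever virt-manager shows for server1 and server2):

[root@foundation50 ~]# for vm in $(virsh list --name); do echo "$vm  $(virsh domuuid "$vm")"; done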

 

Install the following packages on the physical host (see the install sketch after the list):

fence-virtd-multicast-0.3.2-1.el7.x86_64

fence-virtd-0.3.2-2.el7.x86_64

fence-virtd-libvirt-0.3.2-2.el7.x86_64
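A sketch of the install command on the physical host, assuming these packages are available in its repositories:

[root@foundation50 ~]# yum install -y fence-virtd fence-virtd-libvirt fence-virtd-multicast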

 

On the physical host:

#mkdir /etc/cluster

[root@foundation50 cluster]# fence_virtd -c

Module search path [/usr/lib64/fence-virt]:

 

Available backends:

    libvirt 0.1

Available listeners:

    multicast 1.2

    serial 0.4

 

Listener modules are responsible for accepting requests

from fencing clients.

 

Listener module [multicast]:

 

The multicast listener module is designed for use environments

where the guests and hosts may communicate over a network using

multicast.

 

The multicast address is the address that a client will use to

send fencing requests to fence_virtd.

 

Multicast IP Address [225.0.0.12]:

 

Using ipv4 as family.

 

Multicast IP Port [1229]:

 

Setting a preferred interface causes fence_virtd to listen only

on that interface.  Normally, it listens on all interfaces.

In environments where the virtual machines are using the host

machine as a gateway, this *must* be set (typically to virbr0).

Set to 'none' for no interface.

 

Interface [br0]:     ## if the default shown here is not br0, type br0

 

The key file is the shared key information which is used to

authenticate fencing requests.  The contents of this file must

be distributed to each physical host and virtual machine within

a cluster.

 

Key File [/etc/cluster/fence_xvm.key]:

 

Backend modules are responsible for routing requests to

the appropriate hypervisor or management layer.

 

Backend module [libvirt]:

 

Configuration complete.

 

=== Begin Configuration ===

fence_virtd {

listener = "multicast";

backend = "libvirt";

module_path = "/usr/lib64/fence-virt";

}

 

listeners {

multicast {

key_file = "/etc/cluster/fence_xvm.key";

address = "225.0.0.12";

interface = "br0";

family = "ipv4";

port = "1229";

}

 

}

 

backends {

libvirt {

uri = "qemu:///system";

}

 

}

 

=== End Configuration ===

Replace /etc/fence_virt.conf with the above [y/N]? y

 

[root@foundation50 etc]# dd if=/dev/urandom of=/etc/cluster/fence_xvm.key bs=128 count=1

1+0 records in

1+0 records out

128 bytes (128 B) copied, 0.000185659 s, 689 kB/s

 

# scp fence_xvm.key root@172.25.50.10:/etc/cluster/

# scp fence_xvm.key root@172.25.50.20:/etc/cluster/

If the /etc/cluster directory does not exist on server1 or server2, create it first.

Finally, restart the fence_virtd service on the physical host: systemctl restart fence_virtd
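A sketch of that last part, run from the physical host (the mkdir is only needed if /etc/cluster is missing on the nodes):

[root@foundation50 cluster]# ssh root@172.25.50.10 mkdir -p /etc/cluster
[root@foundation50 cluster]# ssh root@172.25.50.20 mkdir -p /etc/cluster
[root@foundation50 cluster]# systemctl restart fence_virtd
[root@foundation50 cluster]# netstat -anup | grep 1229      # fence_virtd should be listening on UDP 1229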

Test:

On server1, run: fence_node server2.example.com     ## the domain name must be used here

or

On server2, run: fence_node server1.example.com     ## the domain name must be used here
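The fencing path can also be exercised from the physical host with fence_xvm; a sketch (-H takes the libvirt domain name or UUID of the guest, and the value below is a placeholder):

[root@foundation50 ~]# fence_xvm -o list                                 # guests visible to fence_virtd
[root@foundation50 ~]# fence_xvm -o reboot -H <server2-domain-or-uuid>   # placeholder: substitute server2's domain name or UUID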

 

 

Configuring the high-availability cluster

# After the physical host is rebooted, the fence_virtd service on it must be started again (and luci on server3 must be running) before continuing:

systemctl start fence_virtd

In the luci web interface, select the Failover Domains tab.

Click Add --> fill in a name (webfile) --> check all of the options below --> Create

[screenshot of the Failover Domain configuration omitted]

Set a priority for each node; the smaller the number, the higher the priority.

 

Then select the Resources tab --> Add --> IP Address --> set the virtual IP to 172.25.50.100 --> netmask bits: 24 --> check Monitor Link --> set the last field (the delay time) to 5 --> Submit

[screenshot of the IP Address resource configuration omitted]

 

Select Resources --> Add --> Script --> Name: httpd --> Full Path to Script File: /etc/init.d/httpd --> Submit
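The Script resource assumes httpd is already installed on both nodes, and it must not be enabled by init, because rgmanager starts and stops it. A minimal sketch (the index.html line is just a hypothetical test page; repeat on server2 with its own hostname):

[root@server1 ~]# yum install -y httpd
[root@server1 ~]# chkconfig httpd off                                     # rgmanager, not init, controls httpd
[root@server1 ~]# echo server1.example.com > /var/www/html/index.html     # hypothetical test page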

 

Select Service Groups --> Add --> Name: apache --> check all boxes --> Failover Domain: webfile --> Recovery Policy: Relocate --> Add Resource --> add the IP Address first, then the Script --> Submit     ## Run Exclusive means the service runs exclusively (no other cluster service may run on the same node)

[screenshot of the Service Group configuration omitted]
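Besides the web interface, the service group can be driven from the command line with clusvcadm; a sketch:

[root@server1 ~]# clusvcadm -r apache -m server2.example.com    # relocate the service to server2
[root@server1 ~]# clusvcadm -d apache                           # disable (stop) the service
[root@server1 ~]# clusvcadm -e apache                           # enable (start) it again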

Check the current cluster status:

 

[root@server1 cluster]# clustat

Cluster Status for lyitx @ Wed Feb 15 17:08:17 2017

Member Status: Quorate

 

 Member Name                            ID   Status

 ------ ----                            ---- ------

 server1.example.com                        1 Online, Local, rgmanager

 server2.example.com                        2 Online, rgmanager

 

 Service Name                  Owner (Last)                  State         

 ------- ----                  ----- ------                  -----         

 service:apache                server1.example.com           started       
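With the service started, the virtual IP should answer from the physical host; a quick check sketch (eth0 is the node interface used elsewhere in this lab):

[root@foundation50 ~]# curl http://172.25.50.100                 # served by whichever node owns the VIP
[root@server1 ~]# ip addr show eth0 | grep 172.25.50.100         # the VIP appears as a secondary address on the owner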

Test 1:

[root@server1 cluster]# /etc/init.d/httpd stop

Stopping httpd:                                            [  OK  ]

 

Then watch the status with the clustat command on server2:

 

[root@server2 ~]# clustat

Cluster Status for lyitx @ Wed Feb 15 17:09:04 2017

Member Status: Quorate

 

 Member Name                            ID   Status

 ------ ----                            ---- ------

 server1.example.com                        1 Online, rgmanager

 server2.example.com                        2 Online, Local, rgmanager

 

 Service Name                  Owner (Last)                  State         

 ------- ----                  ----- ------                  -----         

 service:apache                server1.example.com           started    

   

[root@server2 ~]# clustat

Cluster Status for lyitx @ Wed Feb 15 17:09:10 2017

Member Status: Quorate

 

 Member Name                            ID   Status

 ------ ----                            ---- ------

 server1.example.com                        1 Online, rgmanager

 server2.example.com                        2 Online, Local, rgmanager

 

 Service Name                  Owner (Last)                  State         

 ------- ----                  ----- ------                  -----         

 service:apache                none                          recovering    

[root@server2 ~]# clustat

Cluster Status for lyitx @ Wed Feb 15 17:09:11 2017

Member Status: Quorate

 

 Member Name                            ID   Status

 ------ ----                            ---- ------

 server1.example.com                        1 Online, rgmanager

 server2.example.com                        2 Online, Local, rgmanager

 

 Service Name                  Owner (Last)                  State         

 ------- ----                  ----- ------                  -----         

 service:apache                server2.example.com           starting      

[root@server2 ~]# clustat

Cluster Status for lyitx @ Wed Feb 15 17:09:12 2017

Member Status: Quorate

 

 Member Name                            ID   Status

 ------ ----                            ---- ------

 server1.example.com                        1 Online, rgmanager

 server2.example.com                        2 Online, Local, rgmanager

 

 Service Name                  Owner (Last)                  State         

 ------- ----                  ----- ------                  -----         

 service:apache                server2.example.com           starting      

As observed, the apache service failed over from server1 to server2: the Script resource monitors httpd, and with the Relocate recovery policy rgmanager restarted the service on server2 once httpd was stopped on server1.

 

Test 2

On server2: ip link set eth0 down     # take the network interface down

Then on server1:

[root@server1 cluster]# clustat

Cluster Status for lyitx @ Wed Feb 15 17:21:05 2017

Member Status: Quorate

 

 Member Name                            ID   Status

 ------ ----                            ---- ------

 server1.example.com                        1 Online, Local, rgmanager

 server2.example.com                        2 Offline

 

 Service Name                  Owner (Last)                  State         

 ------- ----                  ----- ------                  -----         

 service:apache                server1.example.com           started       

[root@server1 cluster]# clustat

Cluster Status for lyitx @ Wed Feb 15 17:21:11 2017

Member Status: Quorate

 

 Member Name                            ID   Status

 ------ ----                            ---- ------

 server1.example.com                        1 Online, Local, rgmanager

 server2.example.com                        2 Online

 

 Service Name                  Owner (Last)                  State         

 ------- ----                  ----- ------                  -----         

 service:apache                server1.example.com           started  

After its NIC went down, server2 was fenced (rebooted through fence_xvm) and rejoined the cluster once it came back up; the apache service stayed on server1 the whole time.

Test 3

 

[root@server1 cluster]# echo c > /proc/sysrq-trigger     # crash the kernel on server1

 

Check on server2:

[root@server2 ~]# clustat

Cluster Status for lyitx @ Wed Feb 15 17:30:31 2017

Member Status: Quorate

 

 Member Name                              ID   Status

 ------ ----                              ---- ------

 server1.example.com                          1 Offline

 server2.example.com                          2 Online, Local, rgmanager

 

 Service Name                    Owner (Last)                    State         

 ------- ----                    ----- ------                    -----         

 service:apache                  server1.example.com             started       

[root@server2 ~]# clustat

Cluster Status for lyitx @ Wed Feb 15 17:30:34 2017

Member Status: Quorate

 

 Member Name                              ID   Status

 ------ ----                              ---- ------

 server1.example.com                          1 Offline

 server2.example.com                          2 Online, Local, rgmanager

 

 Service Name                    Owner (Last)                    State         

 ------- ----                    ----- ------                    -----         

 service:apache                  server2.example.com             starting

After server1's kernel crash, the node was fenced and the apache service was recovered and started on server2.