With the load balancer in place, we can now try to make it highly available.
1. High-availability environment setup
[root@server4 ~]# yum install haproxy -y
[root@server1 ~]# scp /etc/haproxy/haproxy.cfg root@172.25.254.14:/etc/haproxy/
Simply scp server1's configuration file over; it saves redoing the configuration.
[root@server4 ~]# systemctl start haproxy.service
Done.
keepalived pairs well with LVS, but not so well with haproxy, which is why pacemaker/corosync is used here instead.
[root@server1 ~]# systemctl status keepalived.service
● keepalived.service - LVS and VRRP High Availability Monitor
Loaded: loaded (/usr/lib/systemd/system/keepalived.service; disabled; vendor preset: disabled)
Active: inactive (dead)
First, make sure keepalived is stopped on both server1 and server4.
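If keepalived is still enabled from the earlier LVS experiments, a quick way to make sure it stays down (a sketch, run on whichever of the two nodes has it installed):
[root@server1 ~]# systemctl disable --now keepalived.service
[root@server4 ~]# systemctl disable --now keepalived.service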
The high-availability software is installed on these two nodes, server1 and server4.
[root@chihao yum.repos.d]# cd /var/www/html/rhel7.6/
[root@chihao rhel7.6]# ls
addons EULA GPL isolinux media.repo repodata RPM-GPG-KEY-redhat-release
EFI extra_files.json images LiveOS Packages RPM-GPG-KEY-redhat-beta TRANS.TBL
[root@chihao rhel7.6]# cd addons/
[root@chihao addons]# ls
HighAvailability ResilientStorage
First we need to solve the problem that the high-availability packages cannot be found: the RHEL 7 ISO mount contains an addons directory with a HighAvailability repository, so the yum repo configuration has to be modified.
Add a HighAvailability section to the repo file, as shown in the figure.
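A minimal sketch of the added stanza in /etc/yum.repos.d/dvd.repo, assuming the rhel7.6 tree above is served over HTTP (replace <repo-host> with the yum server's address):
[HighAvailability]
name=HighAvailability
baseurl=http://<repo-host>/rhel7.6/addons/HighAvailability
gpgcheck=0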
[root@server1 ~]# yum install -y pacemaker pcs psmisc policycoreutils-python
[root@server1 ~]# scp /etc/yum.repos.d/dvd.repo root@172.25.254.14:/etc/yum.repos.d/
[root@server4 ~]# yum install -y pacemaker pcs psmisc policycoreutils-python
Install the packages above; the repo file is copied to server4 first so the same installation can be done there.
Disable the firewall on both nodes.
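A sketch, assuming firewalld is the active firewall on both nodes (alternatively, leave it running and open the pcsd and corosync ports, 2224/tcp and 5404-5405/udp):
[root@server1 ~]# systemctl disable --now firewalld
[root@server4 ~]# systemctl disable --now firewalld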
[root@server1 ~]# systemctl enable --now pcsd.service
[root@server4 ~]# systemctl enable --now pcsd.service
2. Cluster creation
hacluster:x:189:189:cluster user:/home/hacluster:/sbin/nologin
Installing these packages automatically creates the hacluster user shown above (an /etc/passwd entry); this user needs to be given a password.
[root@server1 ~]# echo yume |passwd --stdin hacluster
Changing password for user hacluster.
passwd: all authentication tokens updated successfully.
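The authentication below also needs the hacluster password on server4, so the same command is presumably run there as well:
[root@server4 ~]# echo yume | passwd --stdin hacluster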
Do the authentication:
[root@server1 ~]# pcs cluster auth 172.25.254.11 172.25.254.14
Username: hacluster
Password:
172.25.254.14: Authorized
172.25.254.11: Authorized
Once authentication succeeds, a cluster can be created from these two nodes.
[root@server1 ~]# pcs cluster setup --name mycluster 172.25.254.11 172.25.254.14
Destroying cluster on nodes: 172.25.254.11, 172.25.254.14...
172.25.254.14: Stopping Cluster (pacemaker)...
172.25.254.11: Stopping Cluster (pacemaker)...
172.25.254.14: Successfully destroyed cluster
172.25.254.11: Successfully destroyed cluster
Sending 'pacemaker_remote authkey' to '172.25.254.11', '172.25.254.14'
172.25.254.11: successful distribution of the file 'pacemaker_remote authkey'
172.25.254.14: successful distribution of the file 'pacemaker_remote authkey'
Sending cluster config files to the nodes...
172.25.254.11: Succeeded
172.25.254.14: Succeeded
Synchronizing pcsd certificates on nodes 172.25.254.11, 172.25.254.14...
172.25.254.14: Success
172.25.254.11: Success
Restarting pcsd on the nodes in order to reload the certificates...
172.25.254.14: Success
172.25.254.11: Success
The two-node cluster has now been created successfully; next, start it.
[root@server1 ~]# pcs cluster start --all
172.25.254.11: Starting Cluster (corosync)...
172.25.254.14: Starting Cluster (corosync)...
172.25.254.11: Starting Cluster (pacemaker)...
172.25.254.14: Starting Cluster (pacemaker)...
With the cluster started, enable it as well so that it also comes up automatically at boot:
[root@server1 ~]# pcs cluster enable --all
172.25.254.11: Cluster Enabled
172.25.254.14: Cluster Enabled
corosync: provides the cluster heartbeat, i.e. it passes heartbeat/membership messages between the nodes; put simply, the heartbeat is how the surviving node knows the other host is down so it can take over.
pacemaker: the cluster resource manager, which manages resources such as the VIP and services.
Check the cluster status:
[root@server1 ~]# pcs status
Cluster name: mycluster
WARNINGS:
No stonith devices and stonith-enabled is not false
Corosync and pacemaker node names do not match (IPs used in setup?)
Stack: corosync
Current DC: server1 (version 1.1.19-8.el7-c3c624ea3d) - partition with quorum
Last updated: Thu Aug 5 11:59:05 2021
Last change: Thu Aug 5 11:55:15 2021 by hacluster via crmd on server1
2 nodes configured
0 resources configured
Online: [ server1 server4 ]
No resources
Daemon Status:
corosync: active/enabled
pacemaker: active/enabled
pcsd: active/enabled
Resolve the warning: no stonith devices are configured, so disable stonith for this test setup.
[root@server1 ~]# pcs property set stonith-enabled=false
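To confirm the property was applied, the cluster properties can be listed; stonith-enabled should now show as false:
[root@server1 ~]# pcs property list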
Create resources:
[root@server1 ~]# pcs resource create --help
Usage: pcs resource create...
create <resource id> [<standard>:[<provider>:]]<type> [resource options]
[op <operation action> <operation options> [<operation action>
<operation options>]...] [meta <meta options>...]
[clone [<clone options>] | master [<master options>] |
--group <group id> [--before <resource id> | --after <resource id>]
| bundle <bundle id>] [--disabled] [--no-default-ops] [--wait[=n]]
Create specified resource. If clone is used a clone resource is
created. If master is specified a master/slave resource is created.
If --group is specified the resource is added to the group named. You
can use --before or --after to specify the position of the added
resource relatively to some resource already existing in the group.
If bundle is used, the resource will be created inside of the specified
bundle. If --disabled is specified the resource is not started
automatically. If --no-default-ops is specified, only monitor
operations are created for the resource and all other operations use
default settings. If --wait is specified, pcs will wait up to 'n'
seconds for the resource to start and then return 0 if the resource is
started, or 1 if the resource has not yet started. If 'n' is not
specified it defaults to 60 minutes.
Example: Create a new resource called 'VirtualIP' with IP address
192.168.0.99, netmask of 32, monitored everything 30 seconds,
on eth2:
pcs resource create VirtualIP ocf:heartbeat:IPaddr2 \
ip=192.168.0.99 cidr_netmask=32 nic=eth2 \
op monitor interval=30s
Resources are created with the resource agents (scripts) that ship with the cluster stack.
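To see which agents are available and what parameters a given agent accepts, pcs can list and describe them, for example:
[root@server1 ~]# pcs resource standards
[root@server1 ~]# pcs resource agents ocf:heartbeat | grep -i ipaddr
[root@server1 ~]# pcs resource describe ocf:heartbeat:IPaddr2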
[root@server1 ~]# pcs resource create vip ocf:heartbeat:IPaddr2 ip=172.25.254.100 cidr_netmask=24 op monitor interval=30s
[root@server1 ~]# pcs status
Cluster name: mycluster
WARNINGS:
Corosync and pacemaker node names do not match (IPs used in setup?)
Stack: corosync
Current DC: server1 (version 1.1.19-8.el7-c3c624ea3d) - partition with quorum
Last updated: Fri Aug 6 11:51:47 2021
Last change: Fri Aug 6 11:51:41 2021 by root via cibadmin on server1
2 nodes configured
1 resource configured
Online: [ server1 server4 ]
Full list of resources:
vip (ocf::heartbeat:IPaddr2): Started server1
Daemon Status:
corosync: active/enabled
pacemaker: active/enabled
pcsd: active/enabled
Once this is done, 172.25.254.100 can be accessed.
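The VIP should now be attached to one of server1's interfaces, which can be checked with ip addr (eth0 here is an assumption; use whatever interface carries 172.25.254.0/24):
[root@server1 ~]# ip addr show eth0 | grep 172.25.254.100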
When server1 is put into standby, server4 takes over:
[root@server1 ~]# pcs node standby
[root@server1 ~]# pcs status
Cluster name: mycluster
WARNINGS:
Corosync and pacemaker node names do not match (IPs used in setup?)
Stack: corosync
Current DC: server1 (version 1.1.19-8.el7-c3c624ea3d) - partition with quorum
Last updated: Fri Aug 6 11:56:16 2021
Last change: Fri Aug 6 11:56:12 2021 by root via cibadmin on server1
2 nodes configured
1 resource configured
Node server1: standby
Online: [ server4 ]
Full list of resources:
vip (ocf::heartbeat:IPaddr2): Started server4
Daemon Status:
corosync: active/enabled
pacemaker: active/enabled
pcsd: active/enabled
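The status output in the next step shows server1 online again, so it was presumably taken out of standby first:
[root@server1 ~]# pcs node unstandby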
Next, add haproxy itself as a cluster resource. Since pacemaker will manage the service from now on, the haproxy that was started by hand in step 1 should first be stopped and disabled on both nodes, as sketched below.
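A sketch of handing the service over to the cluster, so that systemd and pacemaker do not both try to manage it:
[root@server1 ~]# systemctl disable --now haproxy.service
[root@server4 ~]# systemctl disable --now haproxy.service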
[root@server1 ~]# pcs resource create haproxy systemd:haproxy op monitor interval=60s
[root@server1 ~]# pcs status
Cluster name: mycluster
WARNINGS:
Corosync and pacemaker node names do not match (IPs used in setup?)
Stack: corosync
Current DC: server1 (version 1.1.19-8.el7-c3c624ea3d) - partition with quorum
Last updated: Fri Aug 6 12:00:05 2021
Last change: Fri Aug 6 12:00:00 2021 by root via cibadmin on server1
2 nodes configured
2 resources configured
Online: [ server1 server4 ]
Full list of resources:
vip (ocf::heartbeat:IPaddr2): Started server4
haproxy (systemd:haproxy): Started server1
Daemon Status:
corosync: active/enabled
pacemaker: active/enabled
pcsd: active/enabled
The resource has been added, but clearly vip and haproxy are running on different nodes. The two services have to run on the same machine, which can be done by putting them into the same resource group.
[root@server1 ~]# pcs resource group add hagroup vip haproxy
[root@server1 ~]# pcs status
Cluster name: mycluster
WARNINGS:
Corosync and pacemaker node names do not match (IPs used in setup?)
Stack: corosync
Current DC: server1 (version 1.1.19-8.el7-c3c624ea3d) - partition with quorum
Last updated: Fri Aug 6 12:04:52 2021
Last change: Fri Aug 6 12:04:46 2021 by root via cibadmin on server1
2 nodes configured
2 resources configured
Online: [ server1 server4 ]
Full list of resources:
Resource Group: hagroup
vip (ocf::heartbeat:IPaddr2): Started server4
haproxy (systemd:haproxy): Started server4
Daemon Status:
corosync: active/enabled
pacemaker: active/enabled
pcsd: active/enabled
hagroup is just an arbitrary name.
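A group also implies colocation and start order (vip before haproxy). An equivalent approach that is not used here would be explicit constraints, roughly:
[root@server1 ~]# pcs constraint colocation add haproxy with vip INFINITY
[root@server1 ~]# pcs constraint order vip then haproxy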
Now, if server4 is put into standby, all the resources fail over to server1:
[root@server4 ~]# pcs node standby
Resource Group: hagroup
vip (ocf::heartbeat:IPaddr2): Started server1
haproxy (systemd:haproxy): Started server1
After server4 is brought back, the resources stay on server1. Failovers like this should be kept to a minimum, because even a very short switchover breaks existing connections.
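To make sure resources never bounce back automatically when a node recovers (each move means a brief outage), a resource-stickiness default can be set; a sketch, not part of the steps above:
[root@server1 ~]# pcs resource defaults resource-stickiness=100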
At this point, as long as both nodes are not in standby at the same time, 172.25.254.100 remains reachable from the client side; this is the high-availability setup working.
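A simple way to watch a failover from the client side; 'client' here stands for the test host, and the output depends on what the backend servers behind haproxy return:
[root@client ~]# while true; do curl -s 172.25.254.100; sleep 1; done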