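18.1 Cluster overview
Linux cluster overview
- By function, clusters fall into two broad categories: high availability (HA) and load balancing.
- An HA cluster usually consists of two servers: one does the work and the other stands by as a redundant node. When the serving machine goes down, the standby takes over and keeps providing the service. (Large enterprises typically aim for 99.99% availability, or even five nines, 99.999%.)
- Open-source HA software includes heartbeat and keepalived. heartbeat has many bugs on CentOS 6 and has not been updated for a long time, so it is no longer recommended. keepalived provides load balancing as well as high availability.
- A load-balancing cluster needs one server acting as the dispatcher, which distributes user requests to the back-end servers for processing. Apart from the dispatcher, the other machines in the cluster are the servers that actually serve users, and there must be at least two of them.
- Open-source load-balancing software includes LVS, keepalived, haproxy and nginx; commercial products include F5 and Netscaler.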
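To make the availability figures above concrete, the following sketch converts an availability percentage into a yearly downtime budget. It assumes a 365-day year, and downtime_minutes is a hypothetical helper name used only for illustration.

```shell
#!/bin/bash
# Yearly downtime budget, in minutes, for a given availability percentage.
# Assumption: a 365-day year. downtime_minutes is an illustrative helper name.
downtime_minutes() {
    awk -v pct="$1" 'BEGIN { printf "%.2f\n", (100 - pct) / 100 * 365 * 24 * 60 }'
}

downtime_minutes 99.99    # four nines
downtime_minutes 99.999   # five nines
```

Under these assumptions, four nines allows roughly 52.56 minutes of downtime per year and five nines roughly 5.26 minutes.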
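18.2 Introduction to keepalived
Introduction to keepalived
- Here we use keepalived to build the high-availability cluster, because heartbeat has some problems on CentOS 6 that affect the experiment.
With heartbeat, failover is not very prompt.
- keepalived implements high availability through VRRP (Virtual Router Redundancy Protocol).
The Virtual Router Redundancy Protocol (VRRP) is a routing protocol proposed by the IETF to remove the single point of failure that arises when hosts on a LAN use a statically configured gateway. It was published as the formal standard RFC 2338 in 1998. VRRP is widely used in edge networks. Its design goals are to fail over IP traffic without disruption in specific situations, to let hosts use a single router, and to keep connectivity even when the actual first-hop router fails.
- In this protocol, several routers with the same function are grouped together. The group has one master role and N (N>=1) backup roles.
- The master sends VRRP packets to the backups by multicast. When the backups stop receiving VRRP packets from the master, they consider the master to be down, and the priorities of the backups then decide which one becomes the new master.
- keepalived mainly consists of three modules: core, check and vrrp. The core module is the heart of keepalived and is responsible for starting and maintaining the main process and for loading and parsing the global configuration file. The check module performs health checks, and the vrrp module implements the VRRP protocol.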
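The priority-based choice of a new master described above can be sketched as a small shell function. This is only an illustration of the selection rule, not keepalived's implementation; the node names and the elect_master helper are hypothetical, and ties between equal priorities (which VRRP resolves by comparing IP addresses) are ignored here.

```shell
#!/bin/bash
# Read "name priority" lines on stdin and print the name with the highest priority.
# Illustrative only: elect_master and the node names are made up for this sketch.
elect_master() {
    sort -k2,2nr | head -n 1 | awk '{ print $1 }'
}

printf '%s\n' "backup-a 90" "backup-b 80" "backup-c 95" | elect_master
```

With the sample input, backup-c has the highest priority and is printed as the new master.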
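18.3/18.4/18.5 Configuring a high-availability cluster with keepalived
A high-availability setup needs two things: the HA tool itself, and a service for the tool to keep available. In this experiment nginx is the service being made highly available, because nginx is widely used in production.
Environment preparation
master: 192.168.133.131 (LNMP already installed)
backup: 192.168.133.130 (LAMP, no nginx service)
Install keepalived on both machines
Run yum install -y keepalived
To keep the experiment simple,
check SELinux and the firewall on both machines: SELinux should be disabled and firewalld should be stopped.
Install nginx on the backup machine
yum install -y nginx
Master configuration
With the service and the tool in place, configure keepalived. The default configuration file is
/etc/keepalived/keepalived.conf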
global_defs {
    notification_email {              # notification mail recipients
        aming@aminglinux.com
    }
    notification_email_from root@aminglinux.com
    smtp_server 127.0.0.1
    smtp_connect_timeout 30
    router_id LVS_DEVEL
}
vrrp_script chk_nginx {
    script "/usr/local/sbin/check_ng.sh"    # script that checks whether the service is healthy
    interval 3                              # run the check every 3 seconds
}
vrrp_instance VI_1 {
    state MASTER                      # this node starts as the master
    interface ens32                   # NIC that carries VRRP; NIC names are not fixed, so check yours
    virtual_router_id 51              # virtual router ID; must be the same on the backup
    priority 100                      # priority; must be higher than the backup's
    advert_int 1                      # advertisement interval in seconds
    authentication {                  # authentication settings
        auth_type PASS
        auth_pass aminglinux>com
    }
    virtual_ipaddress {               # the shared IP (VIP)
        192.168.133.100               # changed from 192.168.188.100 to match this lab's subnet
    }
    track_script {
        chk_nginx
    }
}
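virtual_ipaddress defines the virtual IP, or VIP for short. There are two machines, a master and a backup. Normally the master serves requests; when it goes down, the backup takes over and serves nginx instead. Which IP should clients access then, and which IP should the domain name resolve to? If the domain resolved to the master's own IP, the site would become unreachable as soon as the master went down. So a shared IP is defined that either machine can hold: the master normally, the backup after a failover. This IP can be bound and released at any time.
Define the check script (/usr/local/sbin/check_ng.sh)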
#!/bin/bash
# Timestamp, used in the log entries
d=$(date --date today +%Y%m%d_%H:%M:%S)
# Count the nginx processes
n=$(ps -C nginx --no-heading | wc -l)
# If there are no nginx processes, start nginx and count again.
# If the count is still 0, nginx cannot be started, so keepalived must be stopped.
if [ "$n" -eq 0 ]; then
    /etc/init.d/nginx start
    n2=$(ps -C nginx --no-heading | wc -l)
    if [ "$n2" -eq 0 ]; then
        echo "$d nginx down,keepalived will stop" >> /var/log/check_ng.log
        systemctl stop keepalived    # stopping keepalived releases the VIP; this relates to the "split-brain" issue
    fi
fi
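- "Split-brain"
- In a high-availability (HA) system, when the "heartbeat link" connecting the two nodes is broken, the HA system, which used to act as one coordinated whole, splits into two independent nodes.
- Having lost contact, each node assumes the other has failed. The HA software on the two nodes, like a "split-brain patient", fights over the shared resources and the application service, with serious consequences: either the shared resources are divided up and the service cannot start on either side, or the service starts on both sides and both write to the shared storage at the same time, corrupting data.
After creating the script, set its permissions; without execute permission keepalived cannot run it.
chmod 755 /usr/local/sbin/check_ng.sh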
Start keepalived
systemctl start keepalived
Check that the services are running
[root@aminglinux-02 bin]# ps aux |grep keepalived
root 2552 0.0 0.0 111708 1308 ? Ss 12:34 0:00 /usr/sbin/keepalived -D
root 2553 0.0 0.1 111708 2560 ? S 12:34 0:00 /usr/sbin/keepalived -D
root 2554 0.0 0.0 111708 1528 ? S 12:34 0:00 /usr/sbin/keepalived -D
root 2564 0.0 0.0 112664 976 pts/0 S+ 12:34 0:00 grep --color=auto keepalived
[root@aminglinux-02 bin]# ps aux |grep nginx
root 1233 0.0 0.0 45484 1256 ? Ss 10:44 0:00 nginx: master process /usr/local/nginx/sbin/nginx -c /usr/local/nginx/conf/nginx.conf
nobody 1235 0.0 0.2 47972 4152 ? S 10:44 0:00 nginx: worker process
nobody 1236 0.0 0.2 47972 3896 ? S 10:44 0:00 nginx: worker process
root 2566 0.0 0.0 112664 972 pts/0 R+ 12:35 0:00 grep --color=auto nginx
Stop nginx first and see whether it is started again automatically
[root@aminglinux-02 bin]# date
2017年 09月 04日 星期一 12:37:31 CST
[root@aminglinux-02 bin]# /etc/init.d/nginx stop
Stopping nginx (via systemctl): [ 确定 ]
[root@aminglinux-02 bin]# !ps
ps aux |grep nginx
root 2627 0.0 0.0 45484 1276 ? Ss 12:38 0:00 nginx: master process /usr/local/nginx/sbin/nginx -c /usr/local/nginx/conf/nginx.conf
nobody 2631 0.0 0.2 47972 3912 ? S 12:38 0:00 nginx: worker process
nobody 2632 0.0 0.2 47972 3912 ? S 12:38 0:00 nginx: worker process
root 2640 0.0 0.0 112664 968 pts/0 R+ 12:38 0:00 grep --color=auto nginx
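The timestamps show that nginx was stopped at about 12:37:31 and that new nginx processes were running again at 12:38: the check_ng script restarted nginx automatically.
Check the current state of the NIC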
[root@aminglinux-02 bin]# ip add
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN qlen 1
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: ens32: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
link/ether 00:0c:29:c4:13:b8 brd ff:ff:ff:ff:ff:ff
inet 192.168.133.131/24 brd 192.168.133.255 scope global ens32
valid_lft forever preferred_lft forever
inet 192.168.133.100/32 scope global ens32
valid_lft forever preferred_lft forever
inet6 fe80::6e6a:61ff:f17c:5942/64 scope link
valid_lft forever preferred_lft forever
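The NIC now has an extra IP, 192.168.133.100/32. This is the VIP, the address dedicated to high availability. The web service should be reached (and the domain resolved) through this address, so that the backup can take it over when the master fails.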
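Checking for the VIP by eye works for a lab, but the same check can be scripted. The sketch below scans the output of ip addr for the VIP; has_vip is a hypothetical helper name, and the interface name and address in the usage comment are taken from the output above.

```shell
#!/bin/bash
# Succeed if the given VIP appears as an IPv4 address in "ip addr" output read from stdin.
# has_vip is an illustrative helper, not part of keepalived or iproute2.
has_vip() {
    grep -qF "inet $1/"
}

# Intended use on a real node:
#   ip addr show dev ens32 | has_vip 192.168.133.100 && echo "this node holds the VIP"
```

Running the check on both nodes at the same time gives a simple way to spot a split-brain situation: if both report that they hold the VIP, something is wrong.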
Backup configuration
Configure keepalived on the backup machine
global_defs {
    notification_email {
        aming@aminglinux.com
    }
    notification_email_from root@aminglinux.com
    smtp_server 127.0.0.1
    smtp_connect_timeout 30
    router_id LVS_DEVEL
}
vrrp_script chk_nginx {
    script "/usr/local/sbin/check_ng.sh"
    interval 3
}
vrrp_instance VI_1 {
    state BACKUP                      # differs from the master, which uses MASTER
    interface ens32
    virtual_router_id 51              # same as on the master
    priority 90                       # lower than the master's priority
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass aminglinux>com
    }
    virtual_ipaddress {
        192.168.133.100               # the shared IP (VIP)
    }
    track_script {
        chk_nginx
    }
}
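The priority and advert_int values above also influence how quickly the backup reacts to a silent master. As a rough sketch, assuming keepalived follows the VRRPv2 timers from RFC 3768 (Skew_Time = (256 - priority) / 256 and Master_Down_Interval = 3 * advert_int + Skew_Time, both in seconds), the wait can be estimated like this. master_down_interval is a hypothetical helper name.

```shell
#!/bin/bash
# Estimate the VRRPv2 Master_Down_Interval in seconds.
# Assumption: timers as defined in RFC 3768; master_down_interval is an illustrative name.
#   $1 = priority of the backup, $2 = advert_int in seconds
master_down_interval() {
    awk -v prio="$1" -v adv="$2" 'BEGIN { printf "%.3f\n", 3 * adv + (256 - prio) / 256 }'
}

master_down_interval 90 1   # backup settings from this section
```

With priority 90 and advert_int 1 this gives about 3.648 seconds when the master disappears without warning. When keepalived is stopped cleanly, the master announces priority 0 and the backup only needs to wait for the skew time, which is well under a second here.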
Create the check script
#!/bin/bash
# Timestamp, used in the log entries
d=$(date --date today +%Y%m%d_%H:%M:%S)
# Count the nginx processes
n=$(ps -C nginx --no-heading | wc -l)
# If there are no nginx processes, start nginx and count again.
# If the count is still 0, nginx cannot be started, so keepalived must be stopped.
if [ "$n" -eq 0 ]; then
    systemctl start nginx    # differs from the master: nginx was installed with yum here, so it is started with systemctl
    n2=$(ps -C nginx --no-heading | wc -l)
    if [ "$n2" -eq 0 ]; then
        echo "$d nginx down,keepalived will stop" >> /var/log/check_ng.log
        systemctl stop keepalived
    fi
fi
Change the script permissions
chmod 755 /usr/local/sbin/check_ng.sh
Start keepalived
systemctl start keepalived
Check that keepalived is running
[root@localhost ~]# ps aux |grep keepalived
root 3728 0.0 0.0 111708 1304 ? Ss 12:51 0:00 /usr/sbin/keepalived -D
root 3729 0.0 0.1 111708 2556 ? S 12:51 0:00 /usr/sbin/keepalived -D
root 3730 0.0 0.0 111708 1640 ? S 12:51 0:00 /usr/sbin/keepalived -D
root 3798 0.0 0.0 112664 980 pts/0 S+ 12:51 0:00 grep --color=auto keepalived
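keepalived is now configured on both the master and the backup, and both machines run nginx. How do we tell the two nginx servers apart?
First look at the default virtual host in the master's nginx configuration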
[root@aminglinux-02 bin]# cat /usr/local/nginx/conf/vhost/aaa.com.conf
server
{
listen 80 default_server;
server_name aaa.com;
index index.html index.htm index.php;
root /data/wwwroot/default;
location ~ \.php$
{
include fastcgi_params;
fastcgi_pass unix:/tmp/aming.sock;
fastcgi_index index.php;
fastcgi_param SCRIPT_FILENAME /data/wwwroot/default$fastcgi_script_name;
}
}
This is the default virtual host. Edit its default index page
vim /data/wwwroot/default/index.html    # contents as follows
Master Master
This is the default site.
Now the backup machine. Because nginx was installed there with yum,
the default index page is at
vim /usr/share/nginx/html/index.html    # change the contents to
Backup Backup
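Visiting the master (192.168.133.131) in a browser shows the master's index page (Master Master).
Visiting the backup (192.168.133.130) in a browser shows the backup's index page (Backup Backup).
Visiting the VIP address (192.168.133.100):
Because keepalived is running and the master currently holds the VIP, requests to the VIP are served by the master, so the page shown is the master's default index page.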
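The browser check can also be done from the command line. The sketch below classifies a page body by the marker text placed in the two index pages earlier; which_node is a hypothetical helper name, and the curl call in the comment assumes the VIP is reachable from the machine running it.

```shell
#!/bin/bash
# Read a page body on stdin and report which node's index page it is,
# based on the marker text written into the two index.html files.
# which_node is an illustrative helper name.
which_node() {
    body=$(cat)
    case "$body" in
        *"Master Master"*) echo master ;;
        *"Backup Backup"*) echo backup ;;
        *) echo unknown ;;
    esac
}

# Intended use:
#   curl -s http://192.168.133.100/ | which_node
```

Repeating the command before and after a failover should show the answer changing from master to backup.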
Testing high availability
The quickest and simplest way to simulate a master failure is to stop the keepalived service on the master.
[root@aminglinux-02 bin]# ip add
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN qlen 1
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: ens32: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
link/ether 00:0c:29:c4:13:b8 brd ff:ff:ff:ff:ff:ff
inet 192.168.133.131/24 brd 192.168.133.255 scope global ens32
valid_lft forever preferred_lft forever
inet 192.168.133.100/32 scope global ens32
valid_lft forever preferred_lft forever
inet6 fe80::6e6a:61ff:f17c:5942/64 scope link
valid_lft forever preferred_lft forever
[root@aminglinux-02 bin]# systemctl stop keepalived
[root@aminglinux-02 bin]# ip add
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN qlen 1
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: ens32: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
link/ether 00:0c:29:c4:13:b8 brd ff:ff:ff:ff:ff:ff
inet 192.168.133.131/24 brd 192.168.133.255 scope global ens32
valid_lft forever preferred_lft forever
inet6 fe80::6e6a:61ff:f17c:5942/64 scope link
valid_lft forever preferred_lft forever
Check the log
[root@aminglinux-02 bin]# tail /var/log/messages
Sep 4 12:38:55 aminglinux-02 Keepalived_vrrp[2606]: VRRP_Instance(VI_1) setting protocol VIPs.
Sep 4 12:38:55 aminglinux-02 Keepalived_vrrp[2606]: VRRP_Instance(VI_1) Sending gratuitous ARPs on ens32 for 192.168.133.100
Sep 4 12:38:55 aminglinux-02 Keepalived_healthcheckers[2605]: Netlink reflector reports IP 192.168.133.100 added
Sep 4 12:39:00 aminglinux-02 Keepalived_vrrp[2606]: VRRP_Instance(VI_1) Sending gratuitous ARPs on ens32 for 192.168.133.100
Sep 4 13:17:51 aminglinux-02 Keepalived[2604]: Stopping Keepalived v1.2.13 (05/25,2017)
Sep 4 13:17:51 aminglinux-02 systemd: Stopping LVS and VRRP High Availability Monitor...
Sep 4 13:17:51 aminglinux-02 Keepalived_vrrp[2606]: VRRP_Instance(VI_1) sending 0 priority
Sep 4 13:17:51 aminglinux-02 Keepalived_vrrp[2606]: VRRP_Instance(VI_1) removing protocol VIPs.
Sep 4 13:17:51 aminglinux-02 Keepalived_healthcheckers[2605]: Netlink reflector reports IP 192.168.133.100 removed
Sep 4 13:17:51 aminglinux-02 systemd: Stopped LVS and VRRP High Availability Monitor.
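As soon as keepalived is stopped, the VIP is released.
Check the backup machine (ip add is run twice; the VIP appears only in the second output)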
[root@localhost html]# ip add
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN qlen 1
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: ens32: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
link/ether 00:0c:29:7d:ea:88 brd ff:ff:ff:ff:ff:ff
inet 192.168.133.130/24 brd 192.168.133.255 scope global ens32
valid_lft forever preferred_lft forever
inet6 fe80::daff:1b44:6a0f:1211/64 scope link
valid_lft forever preferred_lft forever
[root@localhost html]# ip add
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN qlen 1
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: ens32: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
link/ether 00:0c:29:7d:ea:88 brd ff:ff:ff:ff:ff:ff
inet 192.168.133.130/24 brd 192.168.133.255 scope global ens32
valid_lft forever preferred_lft forever
inet 192.168.133.100/32 scope global ens32
valid_lft forever preferred_lft forever
inet6 fe80::daff:1b44:6a0f:1211/64 scope link
valid_lft forever preferred_lft forever
Check the log
[root@localhost html]# tail /var/log/messages
Sep 4 13:01:01 localhost systemd: Started Session 15 of user root.
Sep 4 13:01:01 localhost systemd: Starting Session 15 of user root.
Sep 4 13:10:01 localhost systemd: Started Session 16 of user root.
Sep 4 13:10:01 localhost systemd: Starting Session 16 of user root.
Sep 4 13:17:52 localhost Keepalived_vrrp[3730]: VRRP_Instance(VI_1) Transition to MASTER STATE
Sep 4 13:17:53 localhost Keepalived_vrrp[3730]: VRRP_Instance(VI_1) Entering MASTER STATE
Sep 4 13:17:53 localhost Keepalived_vrrp[3730]: VRRP_Instance(VI_1) setting protocol VIPs.
Sep 4 13:17:53 localhost Keepalived_vrrp[3730]: VRRP_Instance(VI_1) Sending gratuitous ARPs on ens32 for 192.168.133.100
Sep 4 13:17:53 localhost Keepalived_healthcheckers[3729]: Netlink reflector reports IP 192.168.133.100 added
Sep 4 13:17:58 localhost Keepalived_vrrp[3730]: VRRP_Instance(VI_1) Sending gratuitous ARPs on ens32 for 192.168.133.100
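Because the master went down, the backup took over the VIP almost immediately.
Visiting the VIP address now shows
the default index page of the backup machine, which proves that the experiment works.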
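When reviewing a failover, most lines in /var/log/messages are noise. As a small sketch, the function below keeps only the VRRP state-change lines, matching the wording of the keepalived 1.2.13 log entries shown above; other versions may phrase these messages differently, and vrrp_transitions is a hypothetical helper name.

```shell
#!/bin/bash
# Print only the VRRP state-change lines from syslog-style input on stdin.
# The pattern follows the log wording shown in this section (keepalived 1.2.13).
# vrrp_transitions is an illustrative helper name.
vrrp_transitions() {
    grep -E 'VRRP_Instance\([^)]*\) (Transition to|Entering) [A-Z]+ STATE'
}

# Intended use:
#   vrrp_transitions < /var/log/messages
```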