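I) Overview
This article covers two components: LVS and keepalived.
Together they give us a load-balanced, highly available server cluster.
LVS implements three IP load-balancing techniques (NAT, IP tunneling, and direct routing) and eight connection-scheduling algorithms. An LVS cluster has a three-tier architecture: the load balancer (director), the server pool, and shared storage.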
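1) Load balancer
The load balancer is the single entry point of an LVS cluster. It uses IP load balancing, content-based request distribution, or a combination of the two.
With IP load balancing, every server in the pool must hold the same content and provide the same service. When a client request arrives, the load balancer picks one server from the pool based only on server load and the configured scheduling algorithm, forwards the request to it, and records the decision. Later packets belonging to the same connection are forwarded to the server chosen earlier.
With content-based distribution, servers may provide different services; when a request arrives, the load balancer chooses a server according to the content of the request.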
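The scheduling behavior described for IP load balancing can be illustrated with a small Python model. This is only a sketch, not how the IPVS kernel module is implemented: the class name `RoundRobinDirector` and the use of (client IP, client port) as the connection key are assumptions made for illustration, and server load is ignored so that plain round-robin (the `rr` algorithm used later in this article) is all that remains.

```python
from itertools import cycle


class RoundRobinDirector:
    """Toy model of a director: rr scheduling plus a connection table."""

    def __init__(self, real_servers):
        self._next_server = cycle(real_servers)  # endless round-robin iterator
        self._connections = {}  # (client_ip, client_port) -> real server

    def forward(self, client_ip, client_port):
        """Return the real server that should receive this packet."""
        key = (client_ip, client_port)
        if key not in self._connections:
            # First packet of a connection: schedule and remember the choice.
            self._connections[key] = next(self._next_server)
        # Later packets of the same connection reuse the recorded server.
        return self._connections[key]


director = RoundRobinDirector(["10.1.1.163", "10.1.1.164"])
first = director.forward("10.1.1.165", 40001)   # new connection
again = director.forward("10.1.1.165", 40001)   # same connection again
second = director.forward("10.1.1.165", 40002)  # another new connection
print(first, again, second)
```

The second call returns the same real server as the first because the connection is already recorded, while the third call, a new connection, moves on to the next server in the pool.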
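2) Server pool
The server pool is the set of real servers, the machines that actually handle application requests.
3) Shared storage
Shared storage gives the server pool a common storage area, which makes it easy for all real servers to hold the same content and provide the same service.
keepalived
Here keepalived is used mainly to health-check the real servers and to implement failover between the master and backup load balancers.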
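II) Test environment
Load balancer (master): 10.1.1.160
Load balancer (slave): 10.1.1.162
VIP: 10.1.1.166
real server1: 10.1.1.163
real server2: 10.1.1.164
Test client: 10.1.1.165
All five machines run Debian 5.0.
We first install LVS (ipvsadm) and keepalived on the two load balancers, 10.1.1.160 and 10.1.1.162,
and then install apache2 on the real servers.
III) Installing and configuring keepalived/LVS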
1) Install keepalived and ipvsadm on the first load balancer (10.1.1.160):
Install keepalived:
apt-get install keepalived
Install ipvsadm:
apt-get install ipvsadm
Create the keepalived configuration file with the following content:
vi /etc/keepalived/keepalived.conf
! Configuration File for keepalived
global_defs {
router_id LVS_1
}
vrrp_instance VI_1 {
state MASTER
interface eth0
virtual_router_id 51
priority 100
advert_int 1
authentication {
auth_type PASS
auth_pass 1111
}
virtual_ipaddress {
10.1.1.166
}
}
virtual_server 10.1.1.166 80 {
delay_loop 6
lb_algo rr
lb_kind DR
# persistence_timeout 60
protocol TCP
real_server 10.1.1.163 80 {
TCP_CHECK {
connect_timeout 10
nb_get_retry 3
delay_before_retry 3
connect_port 80
}
}
real_server 10.1.1.164 80 {
TCP_CHECK {
connect_timeout 10
nb_get_retry 3
delay_before_retry 3
connect_port 80
}
}
}
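Note: the IP load-balancing technique used here is DR (direct routing).
2) Install keepalived and ipvsadm on the second load balancer (10.1.1.162):
Install keepalived:
apt-get install keepalived
Install ipvsadm:
apt-get install ipvsadm
Create /etc/keepalived/keepalived.conf with the following content: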
! Configuration File for keepalived
global_defs {
router_id LVS_2
}
vrrp_instance VI_1 {
state BACKUP
interface eth0
virtual_router_id 51
priority 50
advert_int 1
authentication {
auth_type PASS
auth_pass 1111
}
virtual_ipaddress {
10.1.1.166
}
}
virtual_server 10.1.1.166 80 {
delay_loop 6
lb_algo rr
lb_kind DR
# persistence_timeout 60
protocol TCP
real_server 10.1.1.163 80 {
TCP_CHECK {
connect_timeout 10
nb_get_retry 3
delay_before_retry 3
connect_port 80
}
}
real_server 10.1.1.164 80 {
TCP_CHECK {
connect_timeout 10
nb_get_retry 3
delay_before_retry 3
connect_port 80
}
}
}
3) Configure the real servers
3.1) Bind the VIP 10.1.1.166 to a loopback alias (lo:0) on each real server:
ifconfig lo:0 10.1.1.166 broadcast 10.1.1.166 netmask 255.255.255.255 up
3.2) Suppress ARP replies and announcements for the VIP:
echo "1">/proc/sys/net/ipv4/conf/lo/arp_ignore
echo "2">/proc/sys/net/ipv4/conf/lo/arp_announce
echo "1">/proc/sys/net/ipv4/conf/all/arp_ignore
echo "2">/proc/sys/net/ipv4/conf/all/arp_announce
3.3) Install apache:
apt-get install apache2
echo "real server no1" >>/var/www/index.html
Note: perform the same steps on both real servers.
4) Testing
4.1) Start the keepalived service:
lvs1:
/etc/init.d/keepalived restart
lvs2:
/etc/init.d/keepalived restart
4.2) Test from the test client:
ping 10.1.1.166
PING 10.1.1.166 (10.1.1.166) 56(84) bytes of data.
64 bytes from 10.1.1.166: icmp_req=1 ttl=64 time=0.225 ms
64 bytes from 10.1.1.166: icmp_req=2 ttl=64 time=0.179 ms
64 bytes from 10.1.1.166: icmp_req=3 ttl=64 time=0.163 ms
64 bytes from 10.1.1.166: icmp_req=4 ttl=64 time=0.226 ms
64 bytes from 10.1.1.166: icmp_req=5 ttl=64 time=0.218 ms
Capture packets on lvs1:
tcpdump -p icmp -i eth0
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), capture size 65535 bytes
09:45:12.572695 IP 10.1.1.165 > 10.1.1.166: ICMP echo request, id 17181, seq 4, length 64
09:45:12.572713 IP 10.1.1.166 > 10.1.1.165: ICMP echo reply, id 17181, seq 4, length 64
09:45:13.572693 IP 10.1.1.165 > 10.1.1.166: ICMP echo request, id 17181, seq 5, length 64
09:45:13.572708 IP 10.1.1.166 > 10.1.1.165: ICMP echo reply, id 17181, seq 5, length 64
09:45:14.572724 IP 10.1.1.165 > 10.1.1.166: ICMP echo request, id 17181, seq 6, length 64
09:45:14.572741 IP 10.1.1.166 > 10.1.1.165: ICMP echo reply, id 17181, seq 6, length 64
09:45:15.572738 IP 10.1.1.165 > 10.1.1.166: ICMP echo request, id 17181, seq 7, length 64
09:45:15.572756 IP 10.1.1.166 > 10.1.1.165: ICMP echo reply, id 17181, seq 7, length 64
09:45:16.572694 IP 10.1.1.165 > 10.1.1.166: ICMP echo request, id 17181, seq 8, length 64
09:45:16.572710 IP 10.1.1.166 > 10.1.1.165: ICMP echo reply, id 17181, seq 8, length 64
This shows that the VIP is currently held by lvs1.
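IV) Analysis of keepalived master/backup communication
1) The VRRP protocol and the master/backup switchover mechanism
The keepalived master and backup communicate over VRRPv2 to determine each other's state and which node holds the VIP. The MASTER sends multicast advertisements to the address 224.0.0.18.
A packet capture shows this: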
tcpdump -X -n -vvv 'dst 224.0.0.18'
tcpdump: listening on eth0, link-type EN10MB (Ethernet), capture size 65535 bytes
09:43:04.295639 IP (tos 0x0, ttl 255, id 51508, offset 0, flags [none], proto VRRP (112), length 40)
0x0000:  4500 0028 c934 0000 ff70 067e 0a01 01a0  E..(.4...p.~....
0x0010:  e000 0012 2133 c801 0101 a7c0 0a01 01a6  ....!3..........
0x0020:  3131 3131 0000 0000                      1111....
09:43:05.295686 IP (tos 0x0, ttl 255, id 55831, offset 0, flags [none], proto VRRP (112), length 40)
0x0000:  4500 0028 da17 0000 ff70 f598 0a01 01a2  E..(.....p......
0x0010:  e000 0012 2134 6401 0101 0bc0 0a01 01a6  ....!4d.........
0x0020:  3131 3131 0000 0000                      1111....
09:43:05.296837 IP (tos 0x0, ttl 255, id 51509, offset 0, flags [none], proto VRRP (112), length 40)
0x0000:  4500 0028 c935 0000 ff70 067d 0a01 01a0  E..(.5...p.}....
0x0010:  e000 0012 2133 c801 0101 a7c0 0a01 01a6  ....!3..........
0x0020:  3131 3131 0000 0000                      1111....
Take the advertisement sent by 10.1.1.160 as an example:
10.1.1.160 > 224.0.0.18: VRRPv2, Advertisement, vrid 51, prio 200, authtype simple, intvl 1s, length 20, addrs: 10.1.1.166 auth "1111^@^@^@^@"
0x0000:  4500 0028 c934 0000 ff70 067e 0a01 01a0  E..(.4...p.~....
0x0010:  e000 0012 2133 c801 0101 a7c0 0a01 01a6  ....!3..........
0x0020:  3131 3131 0000 0000                      1111....
The VRRPv2 message starts at offset 0x0014, right after the 20-byte IP header:
0x0014:  2133 c801 0101 a7c0 0a01 01a6
0x0020:  3131 3131 0000 0000
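version: 4 bits. The RFC defines it as 2, so the value here is 2.
type: 4 bits. Only one type is defined, Advertisement, with the value 1, so the value here is 1.
Virtual Rtr ID: the virtual router ID, 8 bits. keepalived on lvs1 sets virtual_router_id to 51, which is 0x33.
Priority: 8 bits. keepalived on lvs1 sets priority to 200, which is 0xC8.
Count IP Addrs: the number of IP addresses carried in the packet, 8 bits. There is only one address here, so the value is 01.
Auth Type: the authentication type, 8 bits. RFC 3768 removed authentication from VRRP, so the value 01 (simple text password) is kept only for compatibility with older implementations; with no authentication the field is 00.
Adver Int: the advertisement interval in seconds, 8 bits. The default is 1 second, which is also what we configured, so the value is 01.
Checksum: 16 bits. It covers only the VRRP message, not the IP header.
IP Address: the VIP, 32 bits per address. Our VIP is 10.1.1.166, which is 0a0101a6 in hexadecimal.
Authentication Data: the password. It is at most 8 characters (64 bits) and is padded with zeros when shorter, so here it is 3131 3131 0000 0000.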
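To make the field layout concrete, the 20 bytes of the VRRP message from the capture can be decoded with a few lines of Python. This is a minimal sketch that assumes the VRRPv2 layout described in the field list; the helper names `parse_vrrpv2` and `internet_checksum` are made up for this example and are not part of keepalived or tcpdump.

```python
import ipaddress
import struct

# VRRP message from the capture above: the 20 bytes that follow the IP header.
PACKET = bytes.fromhex("2133c8010101a7c00a0101a63131313100000000")


def internet_checksum(data: bytes) -> int:
    """One's complement of the one's-complement sum of 16-bit words."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def parse_vrrpv2(packet: bytes) -> dict:
    """Split a VRRPv2 advertisement into its header fields."""
    ver_type, vrid, priority, count, auth_type, adver_int, checksum = struct.unpack(
        "!BBBBBBH", packet[:8]
    )
    addr_end = 8 + 4 * count
    addresses = [
        str(ipaddress.IPv4Address(packet[i:i + 4])) for i in range(8, addr_end, 4)
    ]
    auth_data = packet[addr_end:addr_end + 8].rstrip(b"\x00").decode("ascii")
    # The checksum is computed with the checksum field itself set to zero.
    zeroed = packet[:6] + b"\x00\x00" + packet[8:]
    return {
        "version": ver_type >> 4,
        "type": ver_type & 0x0F,
        "vrid": vrid,
        "priority": priority,
        "count_ip_addrs": count,
        "auth_type": auth_type,
        "adver_int": adver_int,
        "checksum": hex(checksum),
        "checksum_ok": internet_checksum(zeroed) == checksum,
        "addresses": addresses,
        "auth_data": auth_data,
    }


fields = parse_vrrpv2(PACKET)
for name, value in fields.items():
    print(f"{name}: {value}")
```

Recomputing the checksum over the VRRP message alone reproduces the captured value 0xa7c0, which agrees with the statement that the IP header is not included.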
2) The keepalived VRRP configuration
The master configuration:
! Configuration File for keepalived
global_defs {
router_id LVS1
}
vrrp_instance VI_1 {
state MASTER
interface eth0
virtual_router_id 51
priority 200
advert_int 1
authentication {
auth_type PASS
auth_pass 1111
}
virtual_ipaddress {
10.1.1.166
}
}
The backup configuration:
! Configuration File for keepalived
global_defs {
router_id LVS2
}
vrrp_instance VI_1 {
state BACKUP
interface eth0
virtual_router_id 51
priority 90
advert_int 1
authentication {
auth_type PASS
auth_pass 1111
}
virtual_ipaddress {
10.1.1.166
}
}
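Notes:
global_defs {} holds the global configuration.
router_id is a string that identifies this keepalived node (it is not the VRRP virtual router ID). Any value works; the host name is a sensible choice.
vrrp_instance <instance name> {} configures a VRRP instance. Only the basics are covered here.
state MASTER: declares whether this keepalived node starts as the master or as a backup. Use BACKUP for a backup node.
Question 1:
If both machines are configured as MASTER, which one becomes the master and which the backup?
The answer depends on their priorities (the priority option). state is not part of the VRRP protocol, so priority decides.
Below, keepalived on both machines is set to MASTER:
lvs1: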
Sep  6 13:45:45 10 kernel: [ 7290.447277] IPVS: sync thread started: state = MASTER, mcast_ifn = eth0, syncid = 51
Sep  6 13:45:46 10 Keepalived_vrrp: VRRP_Instance(VI_1) Transition to MASTER STATE
Sep  6 13:45:47 10 Keepalived_vrrp: VRRP_Instance(VI_1) Entering MASTER STATE
Sep  6 14:44:57 10 Keepalived_vrrp: VRRP_Instance(VI_1) Received lower prio advert, forcing new election
lvs2:
Sep  6 14:44:56 debian kernel: [536121.748395] IPVS: sync thread started: state = MASTER, mcast_ifn = eth0, syncid = 51
Sep  6 14:44:57 debian Keepalived_vrrp: VRRP_Instance(VI_1) Transition to MASTER STATE
Sep  6 14:44:57 debian Keepalived_vrrp: VRRP_Instance(VI_1) Received higher prio advert
Sep  6 14:44:57 debian Keepalived_vrrp: VRRP_Instance(VI_1) Entering BACKUP STATE
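Note:
The MASTER was running on lvs1. We then changed lvs2 to MASTER and restarted keepalived, so two MASTER nodes were using the same virtual_router_id and the priority had to decide which one is the master and which the backup.
That produced the following log output:
lvs1:
Sep  6 14:44:57 10 Keepalived_vrrp: VRRP_Instance(VI_1) Received lower prio advert, forcing new election
lvs2:
Sep  6 14:44:57 debian Keepalived_vrrp: VRRP_Instance(VI_1) Received higher prio advert
Sep  6 14:44:57 debian Keepalived_vrrp: VRRP_Instance(VI_1) Entering BACKUP STATE
What if the priorities are equal as well?
In this setup both keepalived instances become MASTER and both configure the VIP, which causes an address conflict. (RFC 3768 specifies a tie-break on the sender's primary IP address, so the exact behavior may depend on the keepalived version; either way, give the two nodes different priorities.)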
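The priority comparison can be sketched as a small Python function. This is an illustration rather than keepalived's actual code: the `Node` type and `elect_master` function are hypothetical, the priorities 200 and 90 are taken from the master and backup configurations shown earlier, and the tie-break on the higher primary IP address follows RFC 3768, which may differ from the equal-priority behavior observed in this test.

```python
import ipaddress
from typing import NamedTuple


class Node(NamedTuple):
    name: str
    priority: int
    address: str


def elect_master(nodes):
    """Pick the winner by priority; the configured state plays no role.

    Assumption: equal priorities are resolved by the higher primary
    IP address, as RFC 3768 describes.
    """
    return max(
        nodes,
        key=lambda n: (n.priority, int(ipaddress.IPv4Address(n.address))),
    )


lvs1 = Node("lvs1", 200, "10.1.1.160")
lvs2 = Node("lvs2", 90, "10.1.1.162")
winner = elect_master([lvs1, lvs2])
print(winner.name)

# Equal priorities: under the RFC tie-break the higher address wins.
tie_winner = elect_master([lvs1._replace(priority=100), lvs2._replace(priority=100)])
print(tie_winner.name)
```

Because the function never looks at a configured state, it mirrors the point made above: setting state MASTER on both nodes changes nothing about who wins.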
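Question 2:
If keepalived on the MASTER is stopped, how does the BACKUP take over?
While it is running, the MASTER sends VRRPv2 multicast advertisements to the local network segment: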
tcpdump -X -n -vvv 'dst 224.0.0.18'
tcpdump: listening on eth0, link-type EN10MB (Ethernet), capture size 65535 bytes
16:54:47.816024 IP (tos 0x0, ttl 255, id 2250, offset 0, flags [none], proto VRRP (112), length 40)
0x0000:  4500 0028 08ca 0000 ff70 c6e8 0a01 01a0  E..(.....p......
0x0010:  e000 0012 2133 6401 0101 0bc1 0a01 01a6  ....!3d.........
0x0020:  3131 3131 0000 0000                      1111....
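Notes:
We analyzed the multicast advertisement earlier. The point here is that a BACKUP does not send advertisements.
If the MASTER goes down, however, the BACKUP, once it has stopped receiving the MASTER's advertisements, starts sending advertisements of its own to announce its keepalived state, then brings up the VIP and formally takes over from the failed master.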
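How long the BACKUP waits before deciding that the MASTER is gone follows from the timers defined in RFC 3768: Master_Down_Interval = 3 x Advertisement_Interval + Skew_Time, with Skew_Time = (256 - Priority) / 256 seconds. The Python sketch below only evaluates that formula; the function name is made up for this example, and it is an assumption that the keepalived version used here follows the RFC timers exactly.

```python
def master_down_interval(priority: int, advert_int: float = 1.0) -> float:
    """Seconds a BACKUP waits without advertisements before taking over.

    Formula from RFC 3768; priority is the BACKUP's own priority.
    """
    skew_time = (256 - priority) / 256
    return 3 * advert_int + skew_time


# A backup with priority 100 and advert_int 1, then the same with advert_int 5.
wait_1s = master_down_interval(100, advert_int=1)
wait_5s = master_down_interval(100, advert_int=5)
print(wait_1s, wait_5s)
```

A higher priority gives a smaller skew time, so when several backups are waiting, the one with the highest priority times out first.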
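Question 3:
After the failed MASTER comes back, what state is the BACKUP in, and how does keepalived handle it?
With the configuration above, if lvs1 goes down, lvs2 takes over the VIP and is promoted to MASTER. When lvs1 recovers, it takes the VIP back and returns to the MASTER state,
and lvs2 is demoted to BACKUP.
Is there a way to avoid switching back after lvs1 recovers?
Yes.
The nopreempt option solves this.
Change the lvs1 configuration as follows: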
cat /etc/keepalived/keepalived.conf
! Configuration File for keepalived
global_defs {
router_id LVS1
}
vrrp_instance VI_1 {
state BACKUP
interface eth0
virtual_router_id 51
priority 100
advert_int 1
authentication {
auth_type PASS
auth_pass 1111
}
virtual_ipaddress {
10.1.1.166
}
}
The state is changed to BACKUP here, so both keepalived nodes are now configured as BACKUP.
Change the lvs2 configuration as follows:
cat /etc/keepalived/keepalived.conf
! Configuration File for keepalived
global_defs {
}
vrrp_instance VI_1 {
virtual_ipaddress {
}
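Here the nopreempt option is added and the priority is raised to 150, above lvs1's priority of 100.
Next we simulate a backup switchover.
The MASTER is currently on lvs1, as the log shows: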
Sep  7 10:54:10 10 Keepalived_vrrp: VRRP_Instance(VI_1) Transition to MASTER STATE
Sep  7 10:54:11 10 Keepalived_vrrp: VRRP_Instance(VI_1) Entering MASTER STATE
Sep  7 10:54:11 10 kernel: [80003.605718] IPVS: stopping backup sync thread 5160 ...
Sep  7 10:54:11 10 kernel: [80003.606177] IPVS: sync thread started: state = MASTER, mcast_ifn = eth0, syncid = 51
Stop the keepalived service on lvs1:
/etc/init.d/keepalived stop
Watch the messages log on lvs2:
tail -f /var/log/messages
Sep  7 10:53:58 debian Keepalived_vrrp: VRRP_Instance(VI_1) Transition to MASTER STATE
Sep  7 10:53:59 debian Keepalived_vrrp: VRRP_Instance(VI_1) Entering MASTER STATE
Sep  7 10:54:06 debian Keepalived_vrrp: Terminating VRRP child process on signal
Sep  7 10:54:06 debian Keepalived_healthcheckers: Terminating Healthchecker child process on signal
Note: lvs2 has changed from BACKUP to MASTER.
Now start the keepalived service on lvs1 again:
/etc/init.d/keepalived start
Check the lvs1 log:
Sep  7 11:08:52 10 Keepalived_vrrp: VRRP_Instance(VI_1) Entering BACKUP STATE
Sep  7 11:08:52 10 Keepalived_healthcheckers: Using LinkWatch kernel netlink reflector...
Sep  7 11:08:52 10 kernel: [80885.206211] IPVS: sync thread started: state = BACKUP, mcast_ifn = eth0, syncid = 51
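Note: lvs1 is still in the BACKUP state after keepalived is restarted.
To recap the reasoning:
Why configure both nodes as BACKUP? So that neither preempts the other.
And why should one node still have a higher priority than the other? Because nopreempt is set on the higher-priority server, the higher-priority node will not preempt the lower-priority one.
In other words, a node takes over only when keepalived on the other node fails.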
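The takeover rule with and without nopreempt can be written down as a short decision function. This is a simplified Python model of the behavior described here, not keepalived's implementation; the function name and the convention of passing `None` when no MASTER advertisement is being received are assumptions for illustration.

```python
from typing import Optional


def should_become_master(
    my_priority: int, master_priority: Optional[int], nopreempt: bool
) -> bool:
    """Decide whether a node in BACKUP state takes over the VIP.

    master_priority is None when no advertisement from a MASTER arrives.
    """
    if master_priority is None:
        return True  # the MASTER is gone: always take over
    if nopreempt:
        return False  # never displace a live MASTER
    return my_priority > master_priority


# A priority-150 node returning while a priority-100 node is MASTER.
with_nopreempt = should_become_master(150, 100, nopreempt=True)
without_nopreempt = should_become_master(150, 100, nopreempt=False)
master_dead = should_become_master(150, None, nopreempt=True)
print(with_nopreempt, without_nopreempt, master_dead)
```

Only the last case switches the VIP when nopreempt is set, which is the "take over only on failure" behavior summarized above.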
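interface eth0: the network interface used for VRRP communication.
virtual_router_id: the virtual router ID (VRID) that identifies the VRRP group.
Within one MASTER/BACKUP group the virtual_router_id must be identical on all nodes. If it differs, both the MASTER and the BACKUP send multicast advertisements,
so the VIP becomes active on both machines and causes an address conflict.
priority 100: the priority; the node with the higher priority becomes MASTER.
If a node's state is MASTER but its priority is lower than that of the node configured as BACKUP, it is demoted to BACKUP right away.
Priorities should not be equal; if they are, both keepalived instances become active and both send multicast advertisements.
advert_int 1: the interval between VRRP advertisements, in seconds.
With advert_int set to 5, an advertisement is sent every 5 seconds: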
tcpdump vrrp
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), capture size 65535 bytes
14:14:51.683320 IP 10.1.1.160 > vrrp.mcast.net: VRRPv2, Advertisement, vrid 51, prio 150, authtype simple, intvl 5s, length 20
14:14:56.684241 IP 10.1.1.160 > vrrp.mcast.net: VRRPv2, Advertisement, vrid 51, prio 150, authtype simple, intvl 5s, length 20
14:15:01.685193 IP 10.1.1.160 > vrrp.mcast.net: VRRPv2, Advertisement, vrid 51, prio 150, authtype simple, intvl 5s, length 20
14:15:06.686163 IP 10.1.1.160 > vrrp.mcast.net: VRRPv2, Advertisement, vrid 51, prio 150, authtype simple, intvl 5s, length 20
14:15:11.687132 IP 10.1.1.160 > vrrp.mcast.net: VRRPv2, Advertisement, vrid 51, prio 150, authtype simple, intvl 5s, length 20
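Note that if the master and backup use different advertisement intervals, for example 5 seconds on the master and 1 second on the backup, the backup becomes active and keepalived on the master stops working; only the backup sends advertisements.
The log on the master shows:
tail -f /var/log/messages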
Sep  7 14:21:16 10 Keepalived_vrrp: advertissement interval mismatch mine=5000000 rcved=1
Sep  7 14:21:16 10 Keepalived_vrrp: Sync instance needed on eth0 !!!
Sep  7 14:21:16 10 Keepalived_vrrp: VRRP_Instance(VI_1) Dropping received VRRP packet...
authentication {
auth_type PASS
auth_pass 1111
}
The authentication block sets the authentication type and password; they must match on the MASTER and the BACKUP.
Note: if the passwords differ, keepalived rejects the received packets, as shown below:
Sep  7 14:34:43 debian Keepalived_vrrp: bogus VRRP packet received on eth0 !!!
Sep  7 14:34:43 debian Keepalived_vrrp: VRRP_Instance(VI_1) Dropping received VRRP packet...
Sep  7 14:34:44 debian Keepalived_vrrp: receive an invalid passwd!
virtual_ipaddress {
10.1.1.166
}
This is the VRRP HA virtual address, that is, the VIP.
Note that more than one VIP can be defined in this block, for example:
virtual_ipaddress {
}
Check the VIP addresses:
ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436 qdisc noqueue state UNKNOWN
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
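V) Health checks with a custom script
vrrp_script <script name> {}
A script or command can be used to check the system; if it fails, a master/backup switchover is triggered.
Here is the lvs1 configuration with the script added: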
! Configuration File for keepalived
global_defs {
router_id LVS1
}
vrrp_script chk_nfs {
}
vrrp_instance VI_1 {
state MASTER
interface eth0
virtual_router_id 51
priority 150
advert_int 1
preempt_delay 300
track_script {
authentication {
auth_type PASS
auth_pass 1111
}
virtual_ipaddress {
10.1.1.166
}
}
Here is the lvs2 configuration with the script added:
! Configuration File for keepalived
global_defs {
}
vrrp_script chk_nfs {
}
vrrp_instance VI_1 {
}
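Notes:
1) /bin/pidof nfsd checks whether the nfsd service is running on the system; the check runs every 10 seconds.
2) If the script fails 3 times in a row on lvs1 (the master), keepalived lowers the current priority by 90; once the script succeeds again, the priority is restored.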
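The effect of the check on the advertised priority can be modeled in Python. This is only a sketch of the behavior described in the two notes: the class `TrackedScript` and its methods are hypothetical, the base priority 150 comes from the lvs1 configuration, and it is assumed that a single successful run restores the priority.

```python
class TrackedScript:
    """Counts consecutive failures of a check such as `/bin/pidof nfsd`."""

    def __init__(self, fall: int = 3, weight: int = -90):
        self.fall = fall        # failures needed before the script counts as failed
        self.weight = weight    # priority adjustment while failed
        self.failures = 0

    def record(self, exit_code: int) -> None:
        # A zero exit code means the check passed and resets the counter.
        self.failures = 0 if exit_code == 0 else self.failures + 1

    @property
    def failed(self) -> bool:
        return self.failures >= self.fall

    def effective_priority(self, base: int) -> int:
        # Negative weight: applied only while the script is considered failed.
        return base + self.weight if self.failed else base


chk_nfs = TrackedScript(fall=3, weight=-90)
history = []
for exit_code in (1, 1, 1, 0):  # three failed runs, then nfsd is back
    chk_nfs.record(exit_code)
    history.append(chk_nfs.effective_priority(150))
print(history)
```

After the third failure lvs1 advertises 150 - 90 = 60, so any peer with a priority above 60 sees itself as the higher-priority node and takes over.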
Test:
Stop the nfs service on lvs1:
/etc/init.d/nfs-kernel-server stop
Check the lvs1 log:
Sep  7 16:41:22 10 Keepalived_vrrp: VRRP_Instance(VI_1) forcing a new MASTER election
Sep  7 16:41:23 10 Keepalived_vrrp: VRRP_Instance(VI_1) Transition to MASTER STATE
Sep  7 16:41:24 10 Keepalived_vrrp: VRRP_Instance(VI_1) Entering MASTER STATE
Sep  7 16:49:16 10 kernel: [ 5736.924654] nfsd: last server has exited, flushing export cache
Sep  7 16:49:42 10 Keepalived_vrrp: VRRP_Script(chk_nfs) failed
Sep  7 16:49:43 10 Keepalived_vrrp: VRRP_Instance(VI_1) Received higher prio advert
Sep  7 16:49:43 10 Keepalived_vrrp: VRRP_Instance(VI_1) Entering BACKUP STATE
Now check the log on lvs2:
Sep  7 16:49:08 debian Keepalived_vrrp: VRRP_Script(chk_nfs) succeeded
Sep  7 16:49:43 debian Keepalived_vrrp: VRRP_Instance(VI_1) forcing a new MASTER election
Sep  7 16:49:44 debian Keepalived_vrrp: VRRP_Instance(VI_1) Transition to MASTER STATE
Sep  7 16:49:45 debian Keepalived_vrrp: VRRP_Instance(VI_1) Entering MASTER STATE
Now start nfs on lvs1 again:
/etc/init.d/nfs-kernel-server start
Check the lvs1 log:
Sep  7 17:21:52 10 Keepalived_vrrp: VRRP_Script(chk_nfs) succeeded
Sep  7 17:21:52 10 Keepalived_vrrp: VRRP_Instance(VI_1) forcing a new MASTER election
Sep  7 17:21:53 10 Keepalived_vrrp: VRRP_Instance(VI_1) Transition to MASTER STATE
Sep  7 17:21:54 10 Keepalived_vrrp: VRRP_Instance(VI_1) Entering MASTER STATE
Note: with its priority restored, lvs1 is promoted back to MASTER.