1. How DR works
- When a client request reaches the Director Server, the packet first enters the PREROUTING chain in kernel space. At this point the source IP is the CIP and the destination IP is the VIP.
- PREROUTING finds that the packet's destination IP is a local address and passes it to the INPUT chain.
- IPVS checks whether the requested service is a cluster service. If so, it rewrites the source MAC address to the DIP's MAC and the destination MAC to the RIP's MAC, then hands the packet to the POSTROUTING chain. The source and destination IPs are left unchanged; only the MAC addresses are modified.
- Because the DS and the RS are on the same network, the packet is forwarded at layer 2. POSTROUTING sees that the destination MAC is the RIP's MAC, so the packet is delivered to the Real Server.
- The RS finds that the destination MAC of the request is its own and accepts the packet. After processing it, the RS sends the response out through the lo interface to the eth0 NIC and onto the wire. The response's source IP is the VIP and its destination IP is the CIP.
- The response is finally delivered to the client.
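You can watch this MAC rewriting happen by capturing the VIP's traffic with link-layer headers on a real server (a minimal sketch; the interface name eth0 and the VIP match the experiment below):
tcpdump -e -n -i eth0 host 172.25.60.100
# -e prints Ethernet headers: requests arrive with destination IP = VIP but destination
# MAC = this RS's MAC (rewritten by the director); responses leave with source IP = VIP
# and destination IP = CIP, never passing back through the director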
2. Characteristics of the LVS-DR model
Feature 1: the front-end router must deliver every packet destined for the VIP to the Director Server, never to an RS.
The RS and the Director Server must be on the same physical network.
All request packets pass through the Director Server, but response packets must not pass back through it.
Neither address translation nor port mapping is supported.
The RS can run most common operating systems.
The RS's gateway must never point to the DIP (responses are not allowed to travel through the director).
The VIP is configured on the RS's lo interface.
Drawback: the RS and DS must be in the same machine room (the same layer-2 domain).
3. Experiment
In this setup server1 is the Director Server, while server2 and server3 are Real Servers.
[root@server1 ~]# yum install ipvsadm -y
[root@server1 ~]# ipvsadm -ln # list the currently configured ipvsadm rules
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
-> RemoteAddress:Port Forward Weight ActiveConn InActConn
[root@server1 ~]# ipvsadm -A -t 172.25.60.100:80 -s rr
[root@server1 ~]# ipvsadm -a -t 172.25.60.100:80 -r 172.25.60.2:80 -g
[root@server1 ~]# ipvsadm -a -t 172.25.60.100:80 -r 172.25.60.3:80 -g
[root@server1 ~]# ipvsadm -ln
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
-> RemoteAddress:Port Forward Weight ActiveConn InActConn
TCP 172.25.60.100:80 rr
-> 172.25.60.2:80 Route 1 0 0
-> 172.25.60.3:80 Route 1 0 0
Command explanation:
ipvsadm -A -t 172.25.60.100:80 -s rr ## scheduling policy; rr = round robin
-A --add-service: add a new virtual service
-t: the virtual service uses the TCP protocol (-u would select UDP)
-s: scheduling algorithm
ipvsadm -a -t 172.25.60.100:80 -r 172.25.60.2:80 -g
-a: add a new real server to a virtual service
-g|-m|-i: LVS mode dr|nat|tun (gateway, masquerade, ipip tunnel)
-t: the virtual service is a TCP service
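Rules added with ipvsadm live only in the kernel and disappear on reboot. A minimal sketch for persisting them, assuming the RHEL 7 ipvsadm service unit, which restores /etc/sysconfig/ipvsadm at boot:
ipvsadm-save -n > /etc/sysconfig/ipvsadm # dump the current table in a restorable format
systemctl enable ipvsadm # replay the saved rules automatically at boot
Next, add the VIP on the director: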
[root@server1 ~]# ip addr add 172.25.60.100/32 dev eth0
[root@server1 ~]# ip addr show
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN qlen 1
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
link/ether 52:54:00:a9:ae:f7 brd ff:ff:ff:ff:ff:ff
inet 172.25.60.253/24 brd 172.25.60.255 scope global eth0
valid_lft forever preferred_lft forever
inet 172.25.60.100/32 scope global eth0
valid_lft forever preferred_lft forever
inet6 fe80::5054:ff:fea9:aef7/64 scope link
valid_lft forever preferred_lft forever
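An address added with ip addr add is likewise lost on reboot. One minimal way to restore the VIP at boot (a sketch; on RHEL 7 rc.local only runs if it is executable):
echo 'ip addr add 172.25.60.100/32 dev eth0' >> /etc/rc.d/rc.local
chmod +x /etc/rc.d/rc.local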
Add the VIP on the real servers as well:
[root@server3 ~]# ip addr add 172.25.60.100/32 dev eth0
[root@server3 ~]# ip addr show
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN qlen 1
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
link/ether 52:54:00:9b:bb:d5 brd ff:ff:ff:ff:ff:ff
inet 172.25.60.3/24 brd 172.25.60.255 scope global eth0
valid_lft forever preferred_lft forever
inet 172.25.60.100/32 scope global eth0
valid_lft forever preferred_lft forever
inet6 fe80::5054:ff:fe9b:bbd5/64 scope link
valid_lft forever preferred_lft forever
[root@server2 ~]# ip addr add 172.25.60.100/32 dev eth0
[root@server2 ~]# ip addr show
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN qlen 1
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
link/ether 52:54:00:80:38:f1 brd ff:ff:ff:ff:ff:ff
inet 172.25.60.2/24 brd 172.25.60.255 scope global eth0
valid_lft forever preferred_lft forever
inet 172.25.60.100/32 scope global eth0
valid_lft forever preferred_lft forever
inet6 fe80::5054:ff:fe80:38f1/64 scope link
valid_lft forever preferred_lft forever
Start httpd (listening on port 80) on server2 and server3:
[root@server2 ~]# systemctl start httpd
[root@server3 ~]# systemctl restart httpd
[root@server2 html]# cat /var/www/html/index.html
server2
[root@server3 ~]# cat /var/www/html/index.html
server3
Test from the client:
[kiosk@foundation60 images]$ curl 172.25.60.100
server2
[kiosk@foundation60 images]$ curl 172.25.60.100
server2
This raises a question: why is round robin not happening? The ARP sections below explain why.
4. The ARP protocol in detail
ARP is short for "Address Resolution Protocol". In an Ethernet (LAN) environment data transmission relies on MAC addresses rather than IP addresses, and mapping a known IP address to a MAC address is the job of the ARP protocol.
On a LAN what is actually transmitted are frames, and a frame carries the MAC address of the destination host. For one Ethernet host to communicate directly with another, it must know the target host's MAC address. How is that address obtained? Through the Address Resolution Protocol. "Address resolution" is the process by which a host translates the destination IP address into the destination MAC address before sending a frame. The basic function of ARP is to look up a target device's MAC address from its IP address so that communication can proceed.
5. ARP requests
Whenever a host needs to find the physical address of another host on the network, it sends an ARP request packet containing the sender's MAC and IP addresses and the receiver's IP address. Because the sender does not know the receiver's physical address, the query is broadcast on the local network as a link-layer broadcast.
[kiosk@foundation60 images]$ arp -an | grep 100 # look up the cached MAC for the VIP
? (172.25.60.100) at 52:54:00:80:38:f1 [ether] on br0
This is server2's MAC address: server2 answered the broadcast, so the client is talking to server2 directly and the director is bypassed.
[root@server2 html]# ip addr show
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN qlen 1
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
link/ether 52:54:00:80:38:f1 brd ff:ff:ff:ff:ff:ff
inet 172.25.60.2/24 brd 172.25.60.255 scope global eth0
valid_lft forever preferred_lft forever
inet 172.25.60.100/32 scope global eth0
valid_lft forever preferred_lft forever
inet6 fe80::5054:ff:fe80:38f1/64 scope link
valid_lft forever preferred_lft forever
6. ARP replies
Every host on the LAN receives and processes the ARP request, checking whether the target IP address is its own. Only the host for which the check succeeds returns an ARP reply, which contains its IP address and physical address. Using the requester's physical address carried in the ARP request, the reply is sent as a unicast directly back to the requester.
[root@foundation60 images]# arp -d 172.25.60.100 # flush the stale ARP cache entry
[root@foundation60 images]# curl 172.25.60.100 # this time server3 grabbed the request
server3.www.westos.org
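Because all three machines hold the VIP, whichever one answers the ARP broadcast first wins the client. You can probe who is currently answering for the VIP with arping (a sketch using the iputils arping; br0 is the client's bridge interface seen above):
arping -I br0 -c 3 172.25.60.100 # each reply prints the MAC of the host claiming the VIP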
Add ARP policies with arptables:
[root@server3 ~]# yum whatprovides arptables
Loaded plugins: product-id, search-disabled-repos, subscription-manager
This system is not registered to Red Hat Subscription Management. You can use subscription-manager to register.
arptables-0.0.4-8.el7.x86_64 : User space tool to set up tables of ARP
: rules in kernel
Repo : rhel7.3
[root@server3 ~]# yum install arptables-0.0.4-8.el7.x86_64 -y
[root@server2 html]# yum install arptables-0.0.4-8.el7.x86_64 -y
Make only the director answer ARP for the VIP. Method 1 (recommended): arptables on the real servers.
[root@server2 html]# arptables -A INPUT -d 172.25.60.100 -j DROP
# incoming ARP requests whose target IP is the VIP are dropped
[root@server2 html]# arptables -A OUTPUT -s 172.25.60.100 -j mangle --mangle-ip-s 172.25.60.2
# outgoing ARP packets whose source IP is the VIP have it rewritten to the real server's own IP
[root@server2 html]# arptables -nL
Chain INPUT (policy ACCEPT)
-j DROP -d 172.25.60.100
Chain OUTPUT (policy ACCEPT)
-j mangle -s 172.25.60.100 --mangle-ip-s 172.25.60.2
Chain FORWARD (policy ACCEPT)
[root@server3 ~]# arptables -A INPUT -d 172.25.60.100 -j DROP
[root@server3 ~]# arptables -A OUTPUT -s 172.25.60.100 -j mangle --mangle-ip-s 172.25.60.3
[root@server3 ~]# arptables -nL
Chain INPUT (policy ACCEPT)
-j DROP -d 172.25.60.100
Chain OUTPUT (policy ACCEPT)
-j mangle -s 172.25.60.100 --mangle-ip-s 172.25.60.3
Chain FORWARD (policy ACCEPT)
Test: round robin now works.
[root@foundation60 images]# arp -d 172.25.60.100
[root@foundation60 images]# curl 172.25.60.100
server2
[root@foundation60 images]# curl 172.25.60.100
server3.www.westos.org
[root@foundation60 images]# curl 172.25.60.100
server2
[root@foundation60 images]# curl 172.25.60.100
server3.www.westos.org
[root@server1 ~]# ipvsadm -ln
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
-> RemoteAddress:Port Forward Weight ActiveConn InActConn
TCP 172.25.60.100:80 rr
-> 172.25.60.2:80 Route 1 0 3
-> 172.25.60.3:80 Route 1 0 3
Method 2: kernel ARP parameters on the real servers
The arp_ignore and arp_announce parameters are both ARP-related and control how the system answers ARP requests and how it chooses addresses when sending them. In the LVS-DR scenario their settings directly determine whether DR forwarding works:
arp_ignore controls whether the system answers an ARP request it receives. With the value 1, it replies only when the target IP address is a local address configured on the interface the request arrived on.
arp_announce controls how the system picks the source IP address of the ARP requests it sends. With the value 2, it ignores the source IP of the IP packet and chooses the most appropriate local address on the sending interface.
Note: this method assumes the VIP sits on the real server's lo interface rather than eth0. ARP requests then arrive on eth0, where the VIP is not configured, so arp_ignore=1 suppresses the reply.
sysctl -w net.ipv4.conf.lo.arp_ignore=1
sysctl -w net.ipv4.conf.lo.arp_announce=2
sysctl -w net.ipv4.conf.all.arp_ignore=1
sysctl -w net.ipv4.conf.all.arp_announce=2
# sysctl -w takes effect immediately; sysctl -p reloads the values saved in /etc/sysctl.conf
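To make these settings survive a reboot, append them to /etc/sysctl.conf (a sketch; this is the file sysctl -p reads):
cat >> /etc/sysctl.conf <<'EOF'
net.ipv4.conf.lo.arp_ignore = 1
net.ipv4.conf.lo.arp_announce = 2
net.ipv4.conf.all.arp_ignore = 1
net.ipv4.conf.all.arp_announce = 2
EOF
sysctl -p # load and verify the persisted values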
7. Handling a failed real server (backend health checks)
Configure the HighAvailability repository to resolve ldirectord's dependencies:
[root@foundation60 addons]# pwd # location of the HighAvailability repo
/var/www/html/rhel7.3/addons
[root@foundation60 addons]# ls
HighAvailability ResilientStorage
[root@server1 yum.repos.d]# cat /etc/yum.repos.d/yum.repo
[rhel7.3]
name=rhel7.3
gpgcheck=0
baseurl=http://172.25.60.250/rhel7.3
[HighAvailability]
name=HighAvailability
gpgcheck=0
baseurl=http://172.25.60.250/rhel7.3/addons/HighAvailability
ldirectord-3.9.5-3.1.x86_64.rpm is the high-availability health-check package:
[root@server1 ~]# yum install ldirectord-3.9.5-3.1.x86_64.rpm -y
[root@server1 ~]# rpm -qpl ldirectord-3.9.5-3.1.x86_64.rpm # list the files the package installs
warning: ldirectord-3.9.5-3.1.x86_64.rpm: Header V3 DSA/SHA1 Signature, key ID 7b709911: NOKEY
/etc/ha.d
/etc/ha.d/resource.d
/etc/ha.d/resource.d/ldirectord
/etc/init.d/ldirectord
/etc/logrotate.d/ldirectord
/usr/lib/ocf/resource.d/heartbeat/ldirectord
/usr/sbin/ldirectord
/usr/share/doc/ldirectord-3.9.5
/usr/share/doc/ldirectord-3.9.5/COPYING
/usr/share/doc/ldirectord-3.9.5/ldirectord.cf
/usr/share/man/man8/ldirectord.8.gz
[root@server1 ~]# cp /usr/share/doc/ldirectord-3.9.5/ldirectord.cf /etc/ha.d/
[root@server1 ~]# cd /etc/ha.d/
[root@server1 ha.d]# ls
ldirectord.cf resource.d shellfuncs
[root@server1 ha.d]# vim ldirectord.cf
#
# Sample ldirectord configuration file to configure various virtual services.
#
# Ldirectord will connect to each real server once per second and request
# /index.html. If the data returned by the server does not contain the
# string "Test Message" then the test fails and the real server will be
# taken out of the available pool. The real server will be added back into
# the pool once the test succeeds. If all real servers are removed from the
# pool then localhost:80 is added to the pool as a fallback measure.
# Global Directives
checktimeout=3
checkinterval=1
#fallback=127.0.0.1:80
#fallback6=[::1]:80
autoreload=yes
#logfile="/var/log/ldirectord.log"
#logfile="local0"
#emailalert="admin@x.y.z"
#emailalertfreq=3600
#emailalertstatus=all
quiescent=no
# Sample for an http virtual service
virtual=172.25.60.100:80
real=172.25.60.2:80 gate
real=172.25.60.3:80 gate
fallback=127.0.0.1:80 gate
service=http
scheduler=rr
#persistent=600
#netmask=255.255.255.255
protocol=tcp
checktype=negotiate
checkport=80
request="index.html"
#receive="Test Page"
#virtualhost=www.x.y.z
Directive explanation: checktimeout and checkinterval set the health-check timeout and polling frequency; virtual defines the VIP service and each real line one backend, with gate meaning DR mode; scheduler=rr keeps round robin; checktype=negotiate with request="index.html" makes ldirectord fetch that page from every real server each interval (with receive commented out, any successful response passes); quiescent=no removes a failed real server from the table entirely instead of leaving it with weight 0; fallback names the server used when all real servers are down.
Clear the manually added rules and let ldirectord manage the table:
[root@server1 ha.d]# ipvsadm -ln
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
-> RemoteAddress:Port Forward Weight ActiveConn InActConn
TCP 172.25.60.100:80 rr
-> 172.25.60.2:80 Route 1 0 0
-> 172.25.60.3:80 Route 1 0 0
[root@server1 ha.d]# ipvsadm -C
[root@server1 ha.d]# ipvsadm -ln
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
-> RemoteAddress:Port Forward Weight ActiveConn InActConn
[root@server1 ha.d]# systemctl start ldirectord
[root@server1 ha.d]# ipvsadm -ln
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
-> RemoteAddress:Port Forward Weight ActiveConn InActConn
TCP 172.25.60.100:80 rr
-> 172.25.60.2:80 Route 1 0 0
-> 172.25.60.3:80 Route 1 0 0
Test:
[root@foundation60 addons]# curl 172.25.60.100
server3.www.westos.org
[root@foundation60 addons]# curl 172.25.60.100
server2
[root@server1 ha.d]# ipvsadm -ln
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
-> RemoteAddress:Port Forward Weight ActiveConn InActConn
TCP 172.25.60.100:80 rr
-> 172.25.60.2:80 Route 1 0 4
-> 172.25.60.3:80 Route 1 0 4
After stopping httpd on server2, ldirectord removes it from the pool:
[root@server2 html]# systemctl stop httpd
[root@server1 ha.d]# ipvsadm -ln
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
-> RemoteAddress:Port Forward Weight ActiveConn InActConn
TCP 172.25.60.100:80 rr
-> 172.25.60.3:80 Route 1 0 5
[root@foundation60 addons]# curl 172.25.60.100
server3.www.westos.org
[root@foundation60 addons]# curl 172.25.60.100
server3.www.westos.org
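With fallback=127.0.0.1:80 configured, if server3 failed too, ldirectord would point the virtual service at the director itself. A sketch for serving a maintenance page there (the page content is a hypothetical placeholder; assumes httpd is available on server1):
yum install httpd -y
echo 'Site under maintenance' > /var/www/html/index.html # hypothetical placeholder page
systemctl start httpd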