An In-Depth Analysis of LVS (at the Packet Level)

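With internet companies opening their hiring season, I have been busy job hunting lately. I sent a résumé or an online application to pretty much every company with an operations (ops) opening, and some of them are still at the interview stage or have yet to get back to me. This post digs into an LVS question I was asked in my first-round interview at YY. I will write up the rest of my written-test and interview experiences once everything is over.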

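Why pick the question from the YY first-round interview? Because that interview left a deep impression on me. Competition for YY's ops positions is fierce this year: they are hiring 10 people nationwide, and the interviewer told me only 4 of those are in Guangzhou. My school is neither a 985 nor a 211 university, just an ordinary second-tier undergraduate school. No joke: judging by the sign-in sheet posted outside the written-test room, apart from one classmate from Hanshan Normal University, my school was the lowest ranked in the room. There were candidates from South China University of Technology and Sun Yat-sen University, of course, but also from Harbin Institute of Technology and even the University of Hong Kong. That put a lot of pressure on me, but I went for it anyway. The first round felt no easier than a second-round interview at Tencent. The interviewer held the initiative throughout and kept a tight grip on it. He asked about basically everything on my résumé, item by item and in depth. The LVS question analyzed below is one of them. So far I have been through three rounds at YY (first technical interview, second technical interview, HR interview) and am waiting for the result.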

Enough preamble. Here is the question from the first round: Are you familiar with LVS? Have you ever captured the packets yourself and analyzed them?

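Question 1: How do the MASTER and the BACKUP know which of them has the higher priority, and which one acts as MASTER and which as BACKUP?

Question 2: When the MASTER goes down, how does the BACKUP take over the service?

Question 3: If both servers are configured as MASTER, which one becomes the real MASTER?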


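A side note: honestly, I have worked with LVS for quite a while and used it in a project during my internship, but my understanding stopped at theory (scheduling algorithms, principles, and so on) and basic usage. I had never looked at it at the packet level, so my biggest takeaway from the first round was how shallow my knowledge was.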


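The experiment follows.

I will not repeat the installation of LVS + keepalived here and only show the configuration. If you need the installation steps, see my earlier post on building a high-availability cluster with LVS + keepalived.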


The lab topology is as follows:

[Figure: lab topology diagram (160734854.jpg)]

IP address assignment:

node1 192.168.30.111

node2 192.168.30.112

web1 192.168.30.113

web2 192.168.30.114

VIP 192.168.30.254


I. Configuration
1. The keepalived configuration:
[root@node1 ~]# vim /etc/keepalived/keepalived.conf
! Configuration File for keepalived
                                                                                                                                                                                                                                                                                         
global_defs {
   notification_email {
        pmghong@163.com
   }
   notification_email_from Alexandre.Cassen@firewall.loc
   smtp_server 127.0.0.1
   smtp_connect_timeout 30
   router_id LVS_1
}
                                                                                                                                                                                                                                                                                         
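vrrp_instance VI_1 {
    state MASTER          # change to BACKUP on the backup server
    interface eth0
    virtual_router_id 51    # must be identical on both nodes
    priority 100        # the backup server's priority must be lower than the master's (a larger value means a higher priority)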
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    virtual_ipaddress {
        192.168.30.254
    }
}
                                                                                                                                                                                                                                                                                         
virtual_server 192.168.30.254 80 {
    delay_loop 6
    lb_algo rr
    lb_kind DR
    #nat_mask 255.255.255.0
    persistence_timeout 50
    protocol TCP
                                                                                                                                                                                                                                                                                         
    real_server 192.168.30.113 80 {
        weight 1
        TCP_CHECK {
            connect_timeout 3
            nb_get_retry 3
            connect_port 80
        }
    }
    real_server 192.168.30.114 80 {
        weight 1
        TCP_CHECK {
            connect_timeout 3
            nb_get_retry 3
            connect_port 80
        }
    }
}


2. RealServer configuration
Create the following script on both RS1 and RS2 and run it with the start argument:
#!/bin/bash
# LVS-DR real server setup: hold the VIP locally without answering ARP for it.
VIP=192.168.30.254
case "$1" in
start)
    echo "Start LVS of DR"
    # Lab-specific: take down the unused second NIC.
    /sbin/ifdown eth1
    # Bind the VIP to a loopback alias with a /32 mask. With the VIP on eth0,
    # arp_ignore=1 would still let this host answer ARP requests for the VIP.
    /sbin/ifconfig lo:0 "$VIP" netmask 255.255.255.255 broadcast "$VIP"
    /sbin/route add -host "$VIP" dev lo:0
    #route add default gw 192.168.30.200

    # Writes to /proc take effect immediately, so no "sysctl -p" is needed
    # (it would only reload /etc/sysctl.conf and could undo these values).
    echo "1" > /proc/sys/net/ipv4/conf/lo/arp_ignore
    echo "2" > /proc/sys/net/ipv4/conf/lo/arp_announce
    echo "1" > /proc/sys/net/ipv4/conf/all/arp_ignore
    echo "2" > /proc/sys/net/ipv4/conf/all/arp_announce
    ;;
stop)
    echo "Stop LVS of DR"
    /sbin/ifconfig lo:0 down
    echo "0" > /proc/sys/net/ipv4/conf/lo/arp_ignore
    echo "0" > /proc/sys/net/ipv4/conf/lo/arp_announce
    echo "0" > /proc/sys/net/ipv4/conf/all/arp_ignore
    echo "0" > /proc/sys/net/ipv4/conf/all/arp_announce
    ;;
*)
    echo "Usage: $0 {start|stop}"
    exit 1
    ;;
esac

Once the script has run, install Apache on each real server:

[root@web1 ~]# yum -y install httpd
[root@web1 ~]# service httpd start
[root@web1 ~]# echo "<h1>Web1</h1>" > /var/www/html/index.html


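3. Observing with tcpdump
Ping the VIP (192.168.30.254) from the test client while running tcpdump on the LVS MASTER:

[Figure: screenshot of the ping test from the client (161223869.png)]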


[root@node1 ~]# tcpdump -p icmp -i eth0
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), capture size 65535 bytes
19:12:36.745383 IP 192.168.30.1 > 192.168.30.254: ICMP echo request, id 1024, seq 7424, length 40
19:12:36.745419 IP 192.168.30.254 > 192.168.30.1: ICMP echo reply, id 1024, seq 7424, length 40
19:12:37.745694 IP 192.168.30.1 > 192.168.30.254: ICMP echo request, id 1024, seq 7680, length 40
19:12:37.745748 IP 192.168.30.254 > 192.168.30.1: ICMP echo reply, id 1024, seq 7680, length 40
19:12:38.746653 IP 192.168.30.1 > 192.168.30.254: ICMP echo request, id 1024, seq 7936, length 40
19:12:38.746692 IP 192.168.30.254 > 192.168.30.1: ICMP echo reply, id 1024, seq 7936, length 40
19:12:39.746569 IP 192.168.30.1 > 192.168.30.254: ICMP echo request, id 1024, seq 8192, length 40
19:12:39.746602 IP 192.168.30.254 > 192.168.30.1: ICMP echo reply, id 1024, seq 8192, length 40

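The echo replies for 192.168.30.254 are sent from node1, so the VIP currently lives on the MASTER.

Question 1: How do the MASTER and the BACKUP know which of them has the higher priority, and which one acts as MASTER and which as BACKUP?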
[root@node1 ~]# tcpdump vrrp
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), capture size 65535 bytes
19:21:14.929139 IP node1 > 224.0.0.18: VRRPv2, Advertisement, vrid 51, prio 100, authtype simple, intvl 1s, length 20
19:21:15.930098 IP node1 > 224.0.0.18: VRRPv2, Advertisement, vrid 51, prio 100, authtype simple, intvl 1s, length 20
19:21:16.931275 IP node1 > 224.0.0.18: VRRPv2, Advertisement, vrid 51, prio 100, authtype simple, intvl 1s, length 20
19:21:17.932976 IP node1 > 224.0.0.18: VRRPv2, Advertisement, vrid 51, prio 100, authtype simple, intvl 1s, length 20
19:21:18.933830 IP node1 > 224.0.0.18: VRRPv2, Advertisement, vrid 51, prio 100, authtype simple, intvl 1s, length 20
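This shows that the keepalived MASTER and BACKUP communicate over VRRPv2 to determine their respective states and to exchange information such as the VIP. The MASTER sends multicast (not broadcast) packets to the VRRP group address 224.0.0.18.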

[root@node1 ~]# tcpdump -X -n -vvv 'dst 224.0.0.18'
tcpdump: listening on eth0, link-type EN10MB (Ethernet), capture size 65535 bytes
19:22:57.102636 IP (tos 0xc0, ttl 255, id 973, offset 0, flags [none], proto VRRP (112), length 40)
192.168.30.111 > 224.0.0.18: VRRPv2, Advertisement, vrid 51, prio 100, authtype simple, intvl 1s, length 20, addrs: 192.168.30.254 auth "1111^@^@^@^@"
0x0000: 45c0 0028 03cd 0000 ff70 f7ae c0a8 1e6f E..(.....p.....o
0x0010: e000 0012 2133 6401 0101 37c1 c0a8 1efe ....!3d...7.....
0x0020: 3131 3131 0000 0000 1111....
19:22:58.103593 IP (tos 0xc0, ttl 255, id 974, offset 0, flags [none], proto VRRP (112), length 40)
192.168.30.111 > 224.0.0.18: VRRPv2, Advertisement, vrid 51, prio 100, authtype simple, intvl 1s, length 20, addrs: 192.168.30.254 auth "1111^@^@^@^@"
0x0000: 45c0 0028 03ce 0000 ff70 f7ad c0a8 1e6f E..(.....p.....o
0x0010: e000 0012 2133 6401 0101 37c1 c0a8 1efe ....!3d...7.....
0x0020: 3131 3131 0000 0000 1111....


Take the multicast advertisement sent by 192.168.30.111 as an example:
192.168.30.111 > 224.0.0.18: VRRPv2, Advertisement, vrid 51, prio 100, authtype simple, intvl 1s, length 20, addrs: 192.168.30.254 auth "1111^@^@^@^@"
0x0000: 45c0 0028 03ce 0000 ff70 f7ad c0a8 1e6f E..(.....p.....o
0x0010: e000 0012 2133 6401 0101 37c1 c0a8 1efe ....!3d...7.....
0x0020: 3131 3131 0000 0000 1111....

The first 20 bytes (offsets 0x0000 to 0x0013) are the IP header. The VRRPv2 message starts at offset 0x0014, with the bytes 2133, and runs to the end of the packet:
2133 6401 0101 37c1 c0a8 1efe 3131 3131 0000 0000
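To double-check where the IP header ends, the dump can be decoded programmatically. The following is a minimal sketch, not part of the original experiment: the function name parse_ipv4_header and the dictionary keys are my own, and the bytes are copied from the tcpdump output above.

```python
import ipaddress
import struct

# Bytes of the captured advertisement, copied from the tcpdump -X output
# (offset column and ASCII column removed).
PACKET = bytes.fromhex(
    "45c0 0028 03ce 0000 ff70 f7ad c0a8 1e6f "
    "e000 0012 2133 6401 0101 37c1 c0a8 1efe "
    "3131 3131 0000 0000"
)


def parse_ipv4_header(packet: bytes) -> dict:
    """Decode the fixed 20-byte part of an IPv4 header (hypothetical helper)."""
    (ver_ihl, _tos, total_len, _ident, _flags_frag,
     ttl, proto, _checksum, src, dst) = struct.unpack("!BBHHHBBH4s4s", packet[:20])
    return {
        "version": ver_ihl >> 4,
        # IHL counts 32-bit words, so multiply by 4 to get bytes.
        "header_len": (ver_ihl & 0x0F) * 4,
        "total_len": total_len,
        "ttl": ttl,
        "protocol": proto,
        "src": str(ipaddress.IPv4Address(src)),
        "dst": str(ipaddress.IPv4Address(dst)),
    }


ip = parse_ipv4_header(PACKET)
# Everything after the IP header is the VRRP message.
vrrp_payload = PACKET[ip["header_len"]:]
print(ip)
print(vrrp_payload.hex())
```

The decoded header agrees with the tcpdump summary line: protocol 112 (VRRP), TTL 255, source 192.168.30.111, destination 224.0.0.18, and a 20-byte header, which leaves 20 bytes of VRRP payload.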


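Field breakdown:

version: 4 bits. RFC 3768 defines version 2, so the value here is 2.
type: 4 bits. Only one type is defined, Advertisement, with the value 1, so the value here is 1. Together with the version it forms the first byte, 0x21.
Virtual ID: virtual router ID, 8 bits. keepalived on node1 sets virtual_router_id to 51, which is 0x33 in hexadecimal.
Priority: 8 bits. keepalived on node1 sets priority to 100, which is 0x64 in hexadecimal.
count ip addrs: number of IP addresses carried in the VRRP packet, 8 bits. There is only one address here, so the value is 01.
auth type: authentication type, 8 bits. The value is 01 (simple text password) because the configuration uses auth_type PASS. RFC 3768 removed authentication from the protocol and keeps types 1 and 2 only for backward compatibility with older implementations; with no authentication the field would be 00.
adver int: advertisement interval in seconds, 8 bits. The default is 1 second, which is also what we configured (advert_int 1), so the value is 01.
checksum: 16 bits, 37c1 here. It covers only the VRRP message and does not include the IP header.
ip address: the VIP, 32 bits per address. Our VIP is 192.168.30.254, which is c0a8 1efe in hexadecimal.
auth data: the authentication password. It is at most 8 characters, that is 8 bytes (64 bits), and a shorter password is padded with zeros. The password 1111 therefore appears as 3131 3131 0000 0000.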
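The field layout above can be checked against the captured bytes. The sketch below is my own illustration, assuming the RFC 3768 layout for a VRRPv2 message; the names parse_vrrpv2 and internet_checksum are hypothetical helpers, not keepalived APIs.

```python
import ipaddress
import struct

# The 20-byte VRRPv2 message from the capture (everything after the IP header).
VRRP_MSG = bytes.fromhex("2133 6401 0101 37c1 c0a8 1efe 3131 3131 0000 0000")


def internet_checksum(data: bytes) -> int:
    """16-bit one's complement of the one's complement sum of 16-bit words."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def parse_vrrpv2(msg: bytes) -> dict:
    """Split a VRRPv2 advertisement into its fields (hypothetical helper)."""
    ver_type, vrid, prio, count, auth_type, adver_int, checksum = struct.unpack(
        "!BBBBBBH", msg[:8]
    )
    addr_end = 8 + 4 * count  # each IPv4 address takes 4 bytes
    addrs = [str(ipaddress.IPv4Address(msg[i:i + 4])) for i in range(8, addr_end, 4)]
    return {
        "version": ver_type >> 4,
        "type": ver_type & 0x0F,
        "vrid": vrid,
        "priority": prio,
        "count_ip_addrs": count,
        "auth_type": auth_type,
        "adver_int": adver_int,
        "checksum": checksum,
        "addrs": addrs,
        "auth_data": msg[addr_end:addr_end + 8],
    }


fields = parse_vrrpv2(VRRP_MSG)
print(fields)

# Recompute the checksum with the checksum field (bytes 6-7) zeroed out.
zeroed = VRRP_MSG[:6] + b"\x00\x00" + VRRP_MSG[8:]
print(hex(internet_checksum(zeroed)))
```

Recomputing the checksum over the message with the checksum field zeroed reproduces the captured value 37c1, which confirms that the checksum covers the VRRP message alone.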


Question 2: When the MASTER goes down, how does the BACKUP take over the service?
While it is running, the MASTER continuously sends VRRPv2 multicast advertisements to the local network segment, as shown here:
192.168.30.111 > 224.0.0.18: VRRPv2, Advertisement, vrid 51, prio 100, authtype simple, intvl 1s, length 20, addrs: 192.168.30.254 auth "1111^@^@^@^@"
0x0000: 45c0 0028 03cd 0000 ff70 f7ae c0a8 1e6f E..(.....p.....o
0x0010: e000 0012 2133 6401 0101 37c1 c0a8 1efe ....!3d...7.....
0x0020: 3131 3131 0000 0000 1111....


Note: the BACKUP does not send multicast advertisements. As shown below, a capture on the BACKUP server sees only the MASTER's VRRP packets.
[root@node2 ~]# tcpdump dst 224.0.0.18
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), capture size 65535 bytes
20:36:18.975572 IP node1 > 224.0.0.18: VRRPv2, Advertisement, vrid 51, prio 100, authtype simple, intvl 1s, length 20
20:36:19.976512 IP node1 > 224.0.0.18: VRRPv2, Advertisement, vrid 51, prio 100, authtype simple, intvl 1s, length 20
20:36:20.977505 IP node1 > 224.0.0.18: VRRPv2, Advertisement, vrid 51, prio 100, authtype simple, intvl 1s, length 20
20:36:21.978394 IP node1 > 224.0.0.18: VRRPv2, Advertisement, vrid 51, prio 100, authtype simple, intvl 1s, length 20


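If the MASTER goes down, however, the BACKUP stops receiving its advertisements. Once it has confirmed that none are arriving, it starts sending advertisements of its own to announce its new state, brings up the VIP, and formally takes over the service. In the capture below, keepalived on node1 was stopped cleanly, so node1 sends one last advertisement with priority 0 to tell the BACKUP that it is giving up the MASTER role.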


[root@node2 ~]# tcpdump dst 224.0.0.18
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), capture size 65535 bytes
20:37:52.100343 IP node1 > 224.0.0.18: VRRPv2, Advertisement, vrid 51, prio 100, authtype simple, intvl 1s, length 20
20:37:53.090777 IP node1 > 224.0.0.18: VRRPv2, Advertisement, vrid 51, prio 0, authtype simple, intvl 1s, length 20
20:37:53.779780 IP node2 > 224.0.0.18: VRRPv2, Advertisement, vrid 51, prio 80, authtype simple, intvl 1s, length 20
20:37:54.794518 IP node2 > 224.0.0.18: VRRPv2, Advertisement, vrid 51, prio 80, authtype simple, intvl 1s, length 20
20:37:55.795738 IP node2 > 224.0.0.18: VRRPv2, Advertisement, vrid 51, prio 80, authtype simple, intvl 1s, length 20
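The timing in this capture can be compared with the timers defined in RFC 3768: Skew_Time = (256 - Priority) / 256 seconds, and Master_Down_Interval = 3 * Advertisement_Interval + Skew_Time. A BACKUP that receives a priority-0 advertisement waits only Skew_Time before taking over. The sketch below is my own illustration; it assumes node2's priority is 80 (as seen in its advertisements) and that keepalived follows the RFC timers closely, which may vary by version.

```python
from datetime import datetime


def skew_time(priority: int) -> float:
    """RFC 3768 Skew_Time in seconds: higher priority means a shorter wait."""
    return (256 - priority) / 256


def master_down_interval(priority: int, advert_int: float = 1.0) -> float:
    """Seconds a BACKUP waits without advertisements before becoming MASTER."""
    return 3 * advert_int + skew_time(priority)


def seconds_between(earlier: str, later: str) -> float:
    """Difference between two tcpdump timestamps taken on the same day."""
    fmt = "%H:%M:%S.%f"
    delta = datetime.strptime(later, fmt) - datetime.strptime(earlier, fmt)
    return delta.total_seconds()


NODE2_PRIORITY = 80  # taken from node2's advertisements in the capture

# node1's priority-0 advertisement vs. node2's first advertisement as MASTER.
observed_gap = seconds_between("20:37:53.090777", "20:37:53.779780")

print("Skew_Time for priority 80:", skew_time(NODE2_PRIORITY))
print("Master_Down_Interval for priority 80:", master_down_interval(NODE2_PRIORITY))
print("Observed takeover gap:", observed_gap)
```

The observed gap of about 0.689 s is within a couple of milliseconds of Skew_Time for priority 80 (0.6875 s). Had node1 crashed without sending the priority-0 advertisement, node2 would have waited the full Master_Down_Interval (about 3.7 s under the same assumptions).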


The log confirms that the BACKUP has taken over the VIP:
[root@node2 ~]# tailf /var/log/messages
Oct 18 20:36:18 node2 kernel: device eth0 entered promiscuous mode
Oct 18 20:36:22 node2 kernel: device eth0 left promiscuous mode
Oct 18 20:37:51 node2 kernel: device eth0 entered promiscuous mode
Oct 18 20:37:53 node2 Keepalived_vrrp[40622]: VRRP_Instance(VI_1) Transition to MASTER STATE
Oct 18 20:37:54 node2 Keepalived_vrrp[40622]: VRRP_Instance(VI_1) Entering MASTER STATE
Oct 18 20:37:54 node2 Keepalived_vrrp[40622]: VRRP_Instance(VI_1) setting protocol VIPs.
Oct 18 20:37:54 node2 Keepalived_healthcheckers[40621]: Netlink reflector reports IP 192.168.30.254 added
Oct 18 20:37:54 node2 Keepalived_vrrp[40622]: VRRP_Instance(VI_1) Sending gratuitous ARPs on eth0 for 192.168.30.254
Oct 18 20:37:56 node2 kernel: device eth0 left promiscuous mode
Oct 18 20:37:59 node2 Keepalived_vrrp[40622]: VRRP_Instance(VI_1) Sending gratuitous ARPs on eth0 for 192.168.30.254


Question 3: If both servers are configured as MASTER, which one becomes the real MASTER?
Change the state on the BACKUP (node2) to MASTER as well:
[root@node2 ~]# vim /etc/keepalived/keepalived.conf
state MASTER

Restart keepalived:
[root@node2 ~]# service keepalived restart
Stopping keepalived: [ OK ]
Starting keepalived: [ OK ]


Check the log:
[root@node2 ~]# tailf /var/log/messages
Oct 18 19:35:03 node2 Keepalived_vrrp[40602]: VRRP_Instance(VI_1) Transition to MASTER STATE
Oct 18 19:35:03 node2 Keepalived_vrrp[40602]: VRRP_Instance(VI_1) Received higher prio advert
Oct 18 19:35:03 node2 Keepalived_vrrp[40602]: VRRP_Instance(VI_1) Entering BACKUP STATE


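As the log shows, node2 does start as MASTER, but it then receives an advertisement with a higher priority and falls back to BACKUP.
What happens if the priorities are equal as well?

According to RFC 3768, a tie in priority is broken by comparing the senders' primary IP addresses, and the node with the higher address stays MASTER, so the two nodes should still settle on a single MASTER. A true dual-MASTER situation, in which both nodes hold the VIP and cause an IP conflict, arises when the two nodes cannot receive each other's advertisements, for example because a firewall drops VRRP traffic.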
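For reference, RFC 3768 (section 6.4.3) describes what a router in the MASTER state does when it hears another advertisement: it steps down if the advertised priority is higher, or if the priorities are equal and the sender's primary IP address is higher. The sketch below only models that rule; the function name master_should_yield is hypothetical, and keepalived's actual behavior may differ between versions.

```python
import ipaddress


def master_should_yield(local_prio: int, local_ip: str,
                        adv_prio: int, adv_ip: str) -> bool:
    """Return True if a MASTER should fall back to BACKUP after an advertisement.

    Models the RFC 3768 rule: higher priority wins, and on a tie the higher
    primary IP address wins.
    """
    if adv_prio > local_prio:
        return True
    if adv_prio == local_prio:
        return ipaddress.IPv4Address(adv_ip) > ipaddress.IPv4Address(local_ip)
    return False


NODE1 = "192.168.30.111"
NODE2 = "192.168.30.112"

# The case from the log: node2 (priority 80) hears node1 (priority 100).
print(master_should_yield(80, NODE2, 100, NODE1))

# Equal priorities: node1 hears node2, and node2 hears node1.
print(master_should_yield(100, NODE1, 100, NODE2))
print(master_should_yield(100, NODE2, 100, NODE1))
```

Under this rule, with equal priorities node1 (192.168.30.111) yields to node2 (192.168.30.112), so a tie alone is resolved as long as both nodes can see each other's advertisements.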



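Question 4: When the failed MASTER comes back, does it take over again?
Yes. Once the MASTER is back, it takes the service over from the BACKUP, and the BACKUP drops from MASTER back to BACKUP.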
[root@node1 ~]# tailf /var/log/messages
Oct 18 20:41:04 node1 Keepalived_vrrp[46564]: VRRP_Instance(VI_1) Transition to MASTER STATE
Oct 18 20:41:05 node1 Keepalived_vrrp[46564]: VRRP_Instance(VI_1) Entering MASTER STATE
Oct 18 20:41:05 node1 Keepalived_vrrp[46564]: VRRP_Instance(VI_1) setting protocol VIPs.
Oct 18 20:41:05 node1 Keepalived_vrrp[46564]: VRRP_Instance(VI_1) Sending gratuitous ARPs on eth0 for 192.168.30.254


How can we keep the recovered MASTER from preempting the VIP?
On node1, change state to BACKUP:
[root@node1 ~]# vim /etc/keepalived/keepalived.conf
vrrp_instance VI_1 {
    state BACKUP
    interface eth0
    priority 100
    virtual_router_id 51
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    # ... remaining settings (advert_int, virtual_ipaddress) unchanged ...
}


On node2, add the nopreempt option and raise the priority to 150 (anything higher than node1's will do):
[root@node2 ~]# vim /etc/keepalived/keepalived.conf
vrrp_instance VI_1 {
    state BACKUP
    interface eth0
    priority 150
    nopreempt
    virtual_router_id 51
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    # ... remaining settings (advert_int, virtual_ipaddress) unchanged ...
}
Restart keepalived on both node1 and node2:
[root@node1 ~]# service keepalived restart

[root@node2 ~]# service keepalived restart


First stop keepalived on node1:

[root@node1 ~]# service keepalived stop


Then start keepalived on node1 again:
[root@node1 ~]# service keepalived start
[root@node1 ~]# tailf /var/log/messages
Oct 18 20:59:50 node1 Keepalived_vrrp[46653]: VRRP_Instance(VI_1) Entering BACKUP STATE
Oct 18 20:59:50 node1 Keepalived_healthcheckers[46651]: Using LinkWatch kernel netlink reflector...
Oct 18 20:59:50 node1 Keepalived_vrrp[46653]: VRRP sockpool: [ifindex(2), proto(112), fd(10,11)]


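After starting, node1 stays in the BACKUP state and does not take over as MASTER. Note that node1 would stay BACKUP in this particular test even without nopreempt, because its priority (100) is lower than node2's (150). The option matters in the opposite direction: when node2 comes back after node1 has taken over, node2 will not take the VIP back despite its higher priority.

Both nodes are configured with state BACKUP so that neither preempts the other.

Why does one node still get a higher priority than the other? Because nopreempt is set on the higher-priority server, the higher priority no longer preempts the lower one. In other words, one node takes over only when keepalived on the other fails.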
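The effect of nopreempt can be modeled with the BACKUP-state rule from RFC 3768 (section 6.4.2): a BACKUP keeps deferring to the current MASTER if Preempt_Mode is false, or if the advertised priority is at least its own; otherwise it ignores the advertisement and eventually takes over. The sketch below is a simplified model under the assumption that keepalived's nopreempt corresponds to Preempt_Mode = False; the function name is hypothetical and the priority-0 special case is left out.

```python
def backup_defers_to_master(local_prio: int, adv_prio: int, preempt: bool) -> bool:
    """Return True if a BACKUP accepts the advertisement and stays BACKUP.

    Simplified RFC 3768 rule; priority-0 advertisements are not handled here.
    """
    return (not preempt) or adv_prio >= local_prio


# node2 (priority 150, nopreempt) comes back while node1 (priority 100) is MASTER.
print(backup_defers_to_master(local_prio=150, adv_prio=100, preempt=False))

# The same situation without nopreempt: node2 would ignore node1's
# advertisements and take the VIP back.
print(backup_defers_to_master(local_prio=150, adv_prio=100, preempt=True))

# node1 (priority 100) coming back while node2 (priority 150) is MASTER
# defers either way, because the advertised priority is higher.
print(backup_defers_to_master(local_prio=100, adv_prio=150, preempt=True))
```

This is why the option only makes a difference on the higher-priority node: the lower-priority node never preempts in the first place.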


Next, stop keepalived on node2:
[root@node2 ~]# service keepalived stop
node1 then becomes MASTER:
[root@node1 ~]# tailf /var/log/messages
Oct 18 21:15:34 node1 Keepalived_vrrp[46653]: VRRP_Instance(VI_1) Transition to MASTER STATE
Oct 18 21:15:35 node1 Keepalived_vrrp[46653]: VRRP_Instance(VI_1) Entering MASTER STATE
Oct 18 21:15:35 node1 Keepalived_vrrp[46653]: VRRP_Instance(VI_1) setting protocol VIPs.
Oct 18 21:15:35 node1 Keepalived_vrrp[46653]: VRRP_Instance(VI_1) Sending gratuitous ARPs on eth0 for 192.168.30.254
Oct 18 21:15:35 node1 Keepalived_healthcheckers[46651]: Netlink reflector reports IP 192.168.30.254 added
Oct 18 21:15:40 node1 Keepalived_vrrp[46653]: VRRP_Instance(VI_1) Sending gratuitous ARPs on eth0 for 192.168.30.254

