Introduction to VRRP

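  1. VRRP concepts

The Virtual Router Redundancy Protocol (VRRP) is a fault-tolerance protocol that improves network reliability. When a host's next-hop device fails, VRRP switches traffic to a backup device promptly, keeping network communication continuous and reliable.

(1) VRRP router

A VRRP router is a device that runs the VRRP protocol. It may belong to one or more virtual routers.

(2) Virtual router

A virtual router, also called a VRRP backup group, consists of one Master device and one or more Backup devices. Hosts on the shared LAN use it as their default gateway.

(3) Master router

The Master router (Virtual Router Master) is the VRRP device that forwards packets.

(4) Backup router

Backup routers (Virtual Router Backup) are the VRRP devices that do not forward packets. When the Master fails, they elect a new Master from among themselves.

(5) VRID

The VRID is the identifier of a virtual router.

(6) Virtual IP address

The virtual IP address is the IP address of the virtual router. A virtual router can have one or more IP addresses, which are configured by the user.

(7) IP address owner

A VRRP device whose real interface address is the virtual router's IP address is called the IP address owner. If the IP address owner is available, it normally becomes the Master.

(8) Virtual MAC address

The virtual MAC address is the MAC address that the virtual router derives from its VRID. When the virtual router answers ARP requests, it uses the virtual MAC address rather than the interface's real MAC address.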

For example, VPP reports the following for a virtual router with VRID 2. The virtual MAC is a fixed prefix followed by the VRID:

DBGvpp# show vrrp vr
[0] sw_if_index 1 VR ID 2 IPv4
   state Initialize flags: preempt yes accept yes unicast no
   priority: configured 200 adjusted 200
   timers: adv interval 100 master adv 0 skew 0 master down 0
   virtual MAC 00:00:5e:00:01:02
   addresses 2.2.2.222
   peer addresses
   tracked interfaces
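
The derivation of the virtual MAC from the VRID can be sketched in a few lines of C. This is a minimal illustration, assuming the prefixes defined in RFC 5798 (00-00-5E-00-01 for IPv4, 00-00-5E-00-02 for IPv6). The type `sketch_mac_t` and the function `sketch_vrrp_virtual_mac` are hypothetical names, not VPP APIs.

```c
#include <stdint.h>

/* Hypothetical type for illustration only; not a VPP type. */
typedef struct
{
  uint8_t bytes[6];
} sketch_mac_t;

/* Build the VRRP virtual MAC for a VRID.
 * Assumption (RFC 5798): prefix 00-00-5E-00-01 for IPv4 and
 * 00-00-5E-00-02 for IPv6; the last byte is the VRID. */
static sketch_mac_t
sketch_vrrp_virtual_mac (uint8_t vr_id, int is_ipv6)
{
  sketch_mac_t mac = { {0x00, 0x00, 0x5e, 0x00, 0x01, 0x00} };

  if (is_ipv6)
    mac.bytes[4] = 0x02;
  mac.bytes[5] = vr_id;
  return mac;
}
```

For VRID 2 on IPv4 the bytes are 00:00:5e:00:01:02, the same value that appears in the `show vrrp vr` output.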

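  1. VRRP RFCs

VRRP currently has two versions: VRRPv2 (RFC 3768) and VRRPv3 (RFC 5798). VRRPv2 applies only to IPv4 networks; VRRPv3 applies to both IPv4 and IPv6 networks.

By network type, VRRP is divided into VRRP for IPv4 and VRRP for IPv6 (VRRP6 for short). VRRP for IPv4 supports both VRRPv2 and VRRPv3, whereas VRRP for IPv6 supports only VRRPv3.

The packet structures of VRRPv2 and VRRPv3 are shown in the following figures.

Figure: VRRPv2 packet structure

Figure: VRRPv3 packet structure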

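  1. VRRP use cases

As networks spread and applications deepen, value-added services such as IPTV and video conferencing are now widely deployed. The reliability of the underlying network has become a growing concern, and uninterrupted transmission matters greatly to end users.

Hosts on a live network reach external networks through a default gateway. If that gateway fails, every host attached to it loses contact with the outside world and services are interrupted.

Figure: default gateway on a LAN

VRRP solves this problem. It groups several devices into one virtual device and configures the virtual device's IP address as the default gateway, which provides gateway redundancy. When a gateway device fails, VRRP elects a new gateway device to carry the traffic, keeping communication reliable. As shown in the figure below, when the Master fails, traffic sent to the default gateway is forwarded by a Backup device.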

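  1. How VRRP works

    1. VRRP state machine

VRRP defines three states: Initialize, Master, and Backup. Only a device in the Master state forwards packets sent to the virtual IP address. The table below describes the three states.

| State | Description |
| :- | :- |
| Initialize | VRRP is unavailable in this state, and the device does not process VRRP advertisements. A device usually enters the Initialize state when it starts up or when it detects a fault. |
| Master | A VRRP device in the Master state carries out all forwarding for the virtual router and periodically sends VRRP advertisements to the whole virtual router. |
| Backup | A VRRP device in the Backup state does not forward traffic for the virtual router. It periodically receives the Master's VRRP advertisements and uses them to judge whether the Master is working normally. |
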
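  1. VRRP election

Master election process

(1) Initialize

VRRP is unavailable in this state, and the device does not process VRRP packets.

  • A device usually enters the Initialize state when VRRP has just been configured or when it detects a fault.
  • On receiving an interface-up event, a device with priority 255 becomes the Master directly; a device with a priority lower than 255 first switches to the Backup state.

(2) Master

A VRRP device in the Master state does the following:

  • Sends VRRP advertisements periodically, once every Advertisement Interval.
  • Answers ARP requests for the virtual IP address with the virtual MAC address.
  • Forwards IP packets whose destination MAC address is the virtual MAC address.
  • Accepts IP packets whose destination IP address is the virtual IP address if it owns that address (or if accept mode is enabled); otherwise it discards them.
  • Switches to Backup immediately if it receives an advertisement with a higher priority than its own.
  • Switches to Backup immediately if it receives an advertisement with the same priority as its own and its local interface IP address is lower than the sender's.

(3) Backup

A VRRP device in the Backup state does the following:

  • Receives the VRRP advertisements sent by the Master and uses them to judge whether the Master is working normally.
  • Does not answer ARP requests for the virtual IP address.
  • Discards IP packets whose destination IP address is the virtual IP address.
  • Resets the Master_Down_Interval timer if it receives an advertisement whose priority is equal to or higher than its own. It does not go on to compare IP addresses.
  • Master_Down_Interval timer: if the Backup has still not received an advertisement when this timer expires, it switches to the Master state. The timer is computed as Master_Down_Interval = (3 * Advertisement_Interval) + Skew_Time, where Skew_Time = (256 - Priority) / 256 (in seconds, as defined for VRRPv2).
  • If it receives an advertisement with a lower priority than its own and that priority is 0, it sets the timer to Skew_Time. If the priority is not 0 and preemption is enabled, it discards the advertisement without resetting the timer, so it takes over as Master as soon as the timer expires.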
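
The Master-state comparison described above (priority first, then interface IP address) can be sketched as a small predicate. This is an illustration under stated assumptions: the function name is hypothetical, IPv4 addresses are passed as host-byte-order integers, and a priority of 0 is treated as "the sender is shutting down", in which case the Master stays Master.

```c
#include <stdint.h>

/* Hypothetical helper, not a VPP function.
 * Return 1 if a Master that receives this advertisement should step down
 * to Backup, 0 if it should remain Master.
 * Addresses are IPv4 addresses in host byte order. */
static int
sketch_master_should_yield (uint8_t local_prio, uint32_t local_ip,
                            uint8_t adv_prio, uint32_t adv_ip)
{
  if (adv_prio == 0)
    return 0;                   /* sender is shutting down: stay Master */
  if (adv_prio > local_prio)
    return 1;                   /* higher priority wins */
  if (adv_prio == local_prio && adv_ip > local_ip)
    return 1;                   /* tie: the higher IP address wins */
  return 0;
}
```

A Master with priority 100 yields to an advertisement with priority 200, and on equal priorities it yields only when the sender's address is numerically higher than its own.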
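    1. Preemption mode

A VRRP device works in one of two modes:

Preemptive mode: if a Backup device has a higher priority than the current Master, it actively switches itself to Master.

Non-preemptive mode: as long as the Master has not failed, a Backup device does not become Master, even if it is later configured with a higher priority.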
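
How a Backup reacts to an advertisement in the two modes can be sketched as a decision function. The sketch follows the Backup-state rules of RFC 5798 section 6.4.2 as I understand them; the enum and function names are hypothetical and do not come from VPP.

```c
#include <stdint.h>

/* Hypothetical action codes for illustration. */
typedef enum
{
  SKETCH_BACKUP_SET_TIMER_TO_SKEW,      /* Master is leaving: elect soon */
  SKETCH_BACKUP_RESET_MASTER_DOWN,      /* Master is alive: keep waiting */
  SKETCH_BACKUP_DISCARD_ADV,            /* ignore it and let the timer run */
} sketch_backup_action_t;

/* Decide what a Backup does with a received advertisement.
 * preempt != 0 means preemptive mode. */
static sketch_backup_action_t
sketch_backup_on_adv (uint8_t local_prio, int preempt, uint8_t adv_prio)
{
  if (adv_prio == 0)
    return SKETCH_BACKUP_SET_TIMER_TO_SKEW;
  if (!preempt || adv_prio >= local_prio)
    return SKETCH_BACKUP_RESET_MASTER_DOWN;
  /* Preemptive mode and we outrank the sender: do not reset the timer,
   * so this Backup becomes Master when Master_Down expires. */
  return SKETCH_BACKUP_DISCARD_ADV;
}
```

In non-preemptive mode the function never returns the discard action for a live Master, which is why a higher-priority Backup stays Backup until the Master actually fails.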

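  1. VRRP packet format

  2. VRRP forwarding

As shown in the figure below, routers A, B, and C are configured with VRRP to form one virtual router. The virtual router's IP address can be the real IP address of one of the devices (which in effect designates that device as the Master), or a different address in the same subnet. This example uses the first approach: the virtual router's IP address is router A's IP address. Because the two addresses are identical, router A is the Master and routers B and C are Backups. The default gateway of Client13 is 10.10.0.1. As the Master, router A handles the packets that Client13 sends to the default gateway 10.10.0.1.

When the Master fails, routers B and C elect a new Master. The new Master starts answering ARP requests for the virtual IP address and sends VRRP advertisements periodically.

VRRP works as follows in detail:

The devices in a VRRP backup group elect a Master by priority. The Master sends gratuitous ARP packets to announce the virtual MAC address to the devices and hosts connected to it, and thereby takes over packet forwarding.

The Master periodically sends VRRP advertisements to all Backup devices in the group, announcing its configuration (priority and so on) and its operating status.

If the Master fails, the Backup devices in the group elect a new Master by priority.

When the group changes state and the Master role moves from one device to another, the new Master immediately sends a gratuitous ARP packet carrying the virtual router's virtual MAC address and virtual IP address. This refreshes the MAC table entries of the connected devices and hosts and steers user traffic to the new Master. The whole process is transparent to users.

When the original Master recovers, it switches directly to the Master state if it is the IP address owner (priority 255). If its priority is lower than 255, it first switches to the Backup state, and its priority returns to the value configured before the failure.

When a Backup device has a higher priority than the Master, the Backup's working mode (preemptive or non-preemptive) determines whether a new Master is elected.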

  1. VRRP networking

    1. Master/backup mode

  2. Load balancing

  3. Loop prevention and optimal paths

  4. VPP-VRRP

    1. VRRP initialization

static clib_error_t *
vrrp_init (vlib_main_t * vm)
{
  vrrp_main_t *vmp = &vrrp_main;
  clib_error_t *error = 0;
  ip4_main_t *im4 = &ip4_main;
  ip4_add_del_interface_address_callback_t cb4;
  vlib_node_t *intf_output_node;

  clib_memset (vmp, 0, sizeof (*vmp));

  /* Initialize route lookup. Is this initialization strictly required? */
  if ((error = vlib_call_init_function (vm, ip4_lookup_init)) ||
      (error = vlib_call_init_function (vm, ip6_lookup_init)))
    return error;

  vmp->vlib_main = vm;
  vmp->vnet_main = vnet_get_main ();

  /*
   * 1) VRRP protocol packets are sent directly out of the interface.
   * 2) Gratuitous ARP/ND packets are also sent directly out of the
   *    interface.
   */
  intf_output_node = vlib_get_node_by_name (vm, (u8 *) "interface-output");
  vmp->intf_output_node_idx = intf_output_node->index;

  error = vrrp_plugin_api_hookup (vm);
  if (error)
    return error;

  /* Stores the VRRP key:
   *
   * typedef struct vrrp_vr_key
   * {
   *   u32 sw_if_index;
   *   u8 vr_id;
   *   u8 is_ipv6;
   * } vrrp_vr_key_t;
   */
  mhash_init (&vmp->vr_index_by_key, sizeof (u32), sizeof (vrrp_vr_key_t));

  /* VR index */
  vmp->vrrp4_arp_lookup = hash_create (0, sizeof (uword));
  vmp->vrrp6_nd_lookup = hash_create_mem (0, sizeof (vrrp6_nd_key_t),
                                          sizeof (uword));

  /* Register the callback for interface IP address add/delete */
  cb4.function = vrrp_ip4_add_del_interface_addr;
  cb4.function_opaque = 0;
  vec_add1 (im4->add_del_interface_address_callbacks, cb4);

  /* ?? */
  vrrp_ip6_delegate_id = ip6_link_delegate_register (&vrrp_ip6_delegate_vft);

  return error;
}

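  1. Setting up a VRRP environment

Promiscuous mode makes the vSwitch forward frames according to the list of virtual machine NIC addresses managed by VMware instead of its MAC address table. The "MAC address changes" option ensures that frames leaving a virtual machine can only carry a source MAC address that belongs to a VMware-managed virtual machine NIC. With VRRP, however, several hosts share a virtual MAC address of the form 0000-5e00-xxxx to arbitrate ownership of the virtual IP. For VRRP, and for any other scenario that uses virtual MAC addresses, this security option causes Ethernet frames to be dropped outright, so they never reach the physical switch.

Packet captures show that data packets are dropped once they reach the switch, so VPP never receives them, even though ARP learning succeeds.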

  1. Configuration commands

    1. VRRP configuration CLI

/* *INDENT-OFF* */
VLIB_CLI_COMMAND (vrrp_vr_add_command, static) =
{
  .path = "vrrp vr add",
  .short_help =
  "vrrp vr add [vr_id ] [ipv6] [priority ] [interval ] [no_preempt] [accept_mode] [unicast] [<ip_addr> …]",
  .function = vrrp_vr_add_command_fn,
};

/* *INDENT-OFF* */
VLIB_CLI_COMMAND (vrrp_proto_start_stop_command, static) =
{
  .path = "vrrp proto",
  .short_help =
  "vrrp proto (start|stop) (<intf_name>|sw_if_index ) vr_id [ipv6]",
  .function = vrrp_proto_start_stop_command_fn,
};

// (master)
vrrp vr add GigabitEthernet2/9/0 vr_id 1 priority 200 accept_mode 2.2.2.254
vrrp proto start GigabitEthernet2/9/0 vr_id 1

// (backup)
vrrp vr add GigabitEthernet2/6/0 vr_id 1 priority 100 no_preempt accept_mode 2.2.2.254
vrrp proto start GigabitEthernet2/6/0 vr_id 1

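  1. Analysis of the VRRP configuration code

typedef enum vrrp_vr_flags
{
  VRRP_VR_PREEMPT = 0x1, /* preemptive mode, on by default */
  VRRP_VR_ACCEPT = 0x2,  /* whether the virtual address is configured on the interface */
  VRRP_VR_UNICAST = 0x4, /* unicast mode (no multicast); peer IPs must be configured,
                            similar to how OSPF/BGP set up unicast neighbors */
  VRRP_VR_IPV6 = 0x8,    /* IPv6, VRRPv3 */
} vrrp_vr_flags_t;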
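
The flags are independent bits, so a configuration is built by OR-ing them together and tested with a bitwise AND. The sketch below repeats the enum from the listing so that it compiles on its own. The helper `sketch_flags_from_cli` is hypothetical and only mirrors the CLI keywords shown earlier; it is not the VPP command handler.

```c
#include <stdint.h>

/* Copied from the listing above so the sketch is self-contained. */
typedef enum vrrp_vr_flags
{
  VRRP_VR_PREEMPT = 0x1,
  VRRP_VR_ACCEPT = 0x2,
  VRRP_VR_UNICAST = 0x4,
  VRRP_VR_IPV6 = 0x8,
} vrrp_vr_flags_t;

/* Hypothetical helper: translate CLI keywords into a flags byte.
 * Preemption is on unless no_preempt is given. */
static uint8_t
sketch_flags_from_cli (int no_preempt, int accept_mode, int unicast,
                       int ipv6)
{
  uint8_t flags = VRRP_VR_PREEMPT;

  if (no_preempt)
    flags &= (uint8_t) ~VRRP_VR_PREEMPT;
  if (accept_mode)
    flags |= VRRP_VR_ACCEPT;
  if (unicast)
    flags |= VRRP_VR_UNICAST;
  if (ipv6)
    flags |= VRRP_VR_IPV6;
  return flags;
}
```

With the two example commands above, the master (`accept_mode` only) gets 0x3 and the backup (`no_preempt accept_mode`) gets 0x2.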

  1. Save the basic VRRP configuration
  2. Save the VRRP runtime configuration
  3. Save the VRRP interface
  4. Configure the multicast route

static int
vrrp_intf_enable_disable_mcast (u8 enable, u32 sw_if_index, u8 is_ipv6)
{
  vrrp_main_t *vrm = &vrrp_main;
  vrrp_intf_t *intf;
  u32 fib_index;
  const mfib_prefix_t *vrrp_prefix;
  fib_protocol_t proto;
  vnet_link_t link_type;
  /* local path: deliver VRRP multicast to this host */
  fib_route_path_t for_us = {
    .frp_sw_if_index = 0xffffffff,
    .frp_weight = 1,
    .frp_flags = FIB_ROUTE_PATH_LOCAL,
    .frp_mitf_flags = MFIB_ITF_FLAG_FORWARD,
  };
  /* accept VRRP multicast arriving on this interface */
  fib_route_path_t via_itf = {
    .frp_sw_if_index = sw_if_index,
    .frp_weight = 1,
    .frp_mitf_flags = MFIB_ITF_FLAG_ACCEPT,
  };

  intf = vrrp_intf_get (sw_if_index);

  if (is_ipv6)
    {
      proto = FIB_PROTOCOL_IP6;
      link_type = VNET_LINK_IP6;
      vrrp_prefix = &all_vrrp6_routers;
    }
  else
    {
      proto = FIB_PROTOCOL_IP4;
      link_type = VNET_LINK_IP4;
      vrrp_prefix = &all_vrrp4_routers;
    }

  for_us.frp_proto = fib_proto_to_dpo (proto);
  via_itf.frp_proto = fib_proto_to_dpo (proto);
  fib_index = mfib_table_get_index_for_sw_if_index (proto, sw_if_index);

  if (enable)
    {
      /* the local path is added only once, for the first VR */
      if (pool_elts (vrm->vrs) == 1)
        mfib_table_entry_path_update (fib_index, vrrp_prefix, MFIB_SOURCE_API,
                                      &for_us);

      mfib_table_entry_path_update (fib_index, vrrp_prefix, MFIB_SOURCE_API,
                                    &via_itf);
      intf->mcast_adj_index[! !is_ipv6] =
        adj_mcast_add_or_lock (proto, link_type, sw_if_index);
    }
  else
    {
      /* the local path is removed only after the last VR is gone */
      if (pool_elts (vrm->vrs) == 0)
        mfib_table_entry_path_remove (fib_index, vrrp_prefix, MFIB_SOURCE_API,
                                      &for_us);

      mfib_table_entry_path_remove (fib_index, vrrp_prefix, MFIB_SOURCE_API,
                                    &via_itf);
    }

  return 0;
}

  1. VRRP negotiation

    1. VRRP packet format

typedef CLIB_PACKED (struct
{
  /* 4 bits for version (always 2 or 3), 4 bits for type (always 1) */
  u8 vrrp_version_and_type;
  /* VR ID */
  u8 vr_id;
  /* priority of sender on this VR. value of 0 means a master is abdicating */
  u8 priority;
  /* count of addresses being backed up by the VR */
  u8 n_addrs;
  /* max advertisement interval - first 4 bits are reserved and must be 0 */
  u16 rsvd_and_max_adv_int;
  /* checksum */
  u16 checksum;
}) vrrp_header_t;

typedef CLIB_PACKED (struct
{
  ip4_header_t ip4;
  vrrp_header_t vrrp;
}) ip4_and_vrrp_header_t;

typedef CLIB_PACKED (struct
{
  ip6_header_t ip6;
  vrrp_header_t vrrp;
}) ip6_and_vrrp_header_t;
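
Two fields of this header pack more than one value: the first byte holds the version in its upper 4 bits and the type in its lower 4 bits, and the 16-bit interval field reserves its upper 4 bits. A minimal sketch of the packing and unpacking follows. The function names are hypothetical, and the interval helper takes the two bytes in wire (network) order so that it does not depend on host endianness.

```c
#include <stdint.h>

/* Hypothetical helpers for illustration; not VPP functions. */

/* Combine version (upper nibble) and type (lower nibble) into one byte. */
static uint8_t
sketch_vrrp_pack_version_type (uint8_t version, uint8_t type)
{
  return (uint8_t) (((version & 0x0f) << 4) | (type & 0x0f));
}

static uint8_t
sketch_vrrp_version (uint8_t version_and_type)
{
  return version_and_type >> 4;
}

static uint8_t
sketch_vrrp_type (uint8_t version_and_type)
{
  return version_and_type & 0x0f;
}

/* Extract the 12-bit max advertisement interval (centiseconds) from the
 * two bytes as they appear on the wire; the top 4 bits are reserved. */
static uint16_t
sketch_vrrp_adv_int_from_wire (uint8_t hi, uint8_t lo)
{
  return (uint16_t) ((((uint16_t) hi << 8) | lo) & 0x0fff);
}
```

A VRRPv3 advertisement (version 3, type 1) therefore starts with the byte 0x31, and an interval of 100 centiseconds appears on the wire as 0x00 0x64.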

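  1. Joining the multicast group

A host that wants to receive datagrams that other hosts send to a multicast group must first join that group. It can then receive packets addressed to the group.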

int
vrrp_vr_multicast_group_join (vrrp_vr_t * vr)
{
  vlib_main_t *vm = vlib_get_main ();
  vlib_buffer_t *b;
  vlib_frame_t *f;
  vnet_main_t *vnm = vnet_get_main ();
  vrrp_intf_t *intf;
  u32 bi = 0, *to_next;
  int n_buffers = 1;
  u8 is_ipv6;
  u32 node_index;

  if (!vnet_sw_interface_is_up (vnm, vr->config.sw_if_index))
    return 0;

  if (vlib_buffer_alloc (vm, &bi, n_buffers) != n_buffers)
    {
      clib_warning ("Buffer allocation failed for %U", format_vrrp_vr_key,
                    vr);
      return -1;
    }

  is_ipv6 = vrrp_vr_is_ipv6 (vr);

  b = vlib_get_buffer (vm, bi);
  VLIB_BUFFER_TRACE_TRAJECTORY_INIT (b);
  b->flags |= VNET_BUFFER_F_LOCALLY_ORIGINATED;

  vnet_buffer (b)->sw_if_index[VLIB_RX] = 0;
  vnet_buffer (b)->sw_if_index[VLIB_TX] = vr->config.sw_if_index;

  intf = vrrp_intf_get (vr->config.sw_if_index);
  vnet_buffer (b)->ip.adj_index[VLIB_TX] = intf->mcast_adj_index[is_ipv6];

  /* core of the group join: build the membership report */
  if (is_ipv6)
    {
      vrrp_icmp6_mlr_pkt_build (vr, b);
      node_index = ip6_rewrite_mcast_node.index;
    }
  else
    {
      vrrp_igmp_pkt_build (vr, b);
      node_index = ip4_rewrite_mcast_node.index;
    }

  f = vlib_get_frame_to_node (vm, node_index);
  to_next = vlib_frame_vector_args (f);
  to_next[0] = bi;
  f->n_vectors = 1;

  vlib_put_frame_to_node (vm, node_index, f);

  return f->n_vectors;
}

static void
vrrp_igmp_pkt_build (vrrp_vr_t * vr, vlib_buffer_t * b)
{
  ip4_header_t *ip4;
  u8 *ip4_options;
  igmp_membership_report_v3_t *report;
  igmp_membership_group_v3_t *group;

  ip4 = vlib_buffer_get_current (b);
  clib_memcpy (ip4, &igmp_ip4_mcast, sizeof (*ip4));
  ip4_src_address_for_packet (&ip4_main.lookup_main, vr->config.sw_if_index,
                              &ip4->src_address);

  vlib_buffer_chain_increase_length (b, b, sizeof (*ip4));
  vlib_buffer_advance (b, sizeof (*ip4));

  ip4_options = (u8 *) (ip4 + 1);
  ip4_options[0] = 0x94;        /* 10010100 == the router alert option */
  ip4_options[1] = 0x04;        /* length == 4 bytes */
  ip4_options[2] = 0x0;         /* value == Router shall examine packet */
  ip4_options[3] = 0x0;         /* reserved */

  vlib_buffer_chain_increase_length (b, b, 4);
  vlib_buffer_advance (b, 4);

  report = vlib_buffer_get_current (b);

  report->header.type = IGMP_TYPE_membership_report_v3;
  report->header.code = 0;
  report->header.checksum = 0;
  report->unused = 0;
  report->n_groups = clib_host_to_net_u16 (1);

  vlib_buffer_chain_increase_length (b, b, sizeof (*report));
  vlib_buffer_advance (b, sizeof (*report));

  group = vlib_buffer_get_current (b);
  group->type = IGMP_MEMBERSHIP_GROUP_change_to_exclude;
  group->n_aux_u32s = 0;
  group->n_src_addresses = 0;
  /* 0xe0000012 == 224.0.0.18, the VRRP multicast group */
  group->group_address.as_u32 = clib_host_to_net_u32 (0xe0000012);

  vlib_buffer_chain_increase_length (b, b, sizeof (*group));
  vlib_buffer_advance (b, sizeof (*group));

  ip4->length = clib_host_to_net_u16 (b->current_data);
  ip4->checksum = ip4_header_checksum (ip4);

  int payload_len = vlib_buffer_get_current (b) - ((void *) report);
  report->header.checksum =
    ~ip_csum_fold (ip_incremental_checksum (0, report, payload_len));

  vlib_buffer_reset (b);
}
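
The IGMP checksum at the end of this function is the standard Internet checksum: the one's-complement sum of the payload taken as 16-bit words, folded to 16 bits and inverted. A plain C sketch of that arithmetic (RFC 1071) is shown below. It is not the VPP implementation of `ip_incremental_checksum` or `ip_csum_fold`, and the function name is hypothetical. The result is returned as a numeric value whose high byte is the first checksum byte on the wire.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: Internet checksum (RFC 1071) over len bytes. */
static uint16_t
sketch_inet_checksum (const void *data, size_t len)
{
  const uint8_t *p = data;
  uint32_t sum = 0;

  /* add the data as big-endian 16-bit words */
  while (len > 1)
    {
      sum += (uint32_t) ((p[0] << 8) | p[1]);
      p += 2;
      len -= 2;
    }
  /* an odd trailing byte is padded with zero on the right */
  if (len)
    sum += (uint32_t) (p[0] << 8);

  /* fold the carries back into the low 16 bits */
  while (sum >> 16)
    sum = (sum & 0xffff) + (sum >> 16);

  return (uint16_t) ~sum;
}
```

The checksum field must be zero while the sum is computed, which is why the listing sets `report->header.checksum = 0` before filling in the final value.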

  1. Building the multicast packet

vrrp_adv_l2_build_multicast (vrrp_vr_t * vr, vlib_buffer_t * b)

Builds the L2 multicast header.
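
The body of this function is not shown here, but the destination MAC of an IPv4 multicast frame follows a fixed mapping: the prefix 01:00:5e followed by the low 23 bits of the group address. The sketch below illustrates that mapping only. The type and function names are hypothetical, and the group address is assumed to be in host byte order.

```c
#include <stdint.h>

/* Hypothetical type for illustration only; not a VPP type. */
typedef struct
{
  uint8_t bytes[6];
} sketch_mac_t;

/* Map an IPv4 multicast group (host byte order) to its Ethernet
 * multicast MAC: 01:00:5e plus the low 23 bits of the address. */
static sketch_mac_t
sketch_ip4_mcast_to_mac (uint32_t group)
{
  sketch_mac_t mac;

  mac.bytes[0] = 0x01;
  mac.bytes[1] = 0x00;
  mac.bytes[2] = 0x5e;
  mac.bytes[3] = (uint8_t) ((group >> 16) & 0x7f);
  mac.bytes[4] = (uint8_t) ((group >> 8) & 0xff);
  mac.bytes[5] = (uint8_t) (group & 0xff);
  return mac;
}
```

For the VRRP group 224.0.0.18 (0xe0000012) the result is 01:00:5e:00:00:12.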

  1. Gratuitous ARP packet

static void
vrrp4_garp_pkt_build (vrrp_vr_t * vr, vlib_buffer_t * b, ip4_address_t *ip4)
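
Only the signature is given here. As a rough picture of what a gratuitous ARP frame for a virtual IP contains, the sketch below fills a 42-byte Ethernet + ARP frame by hand. It is an illustration under assumptions, not the VPP function body: the names are hypothetical, the frame is built as an ARP request with sender IP equal to target IP, and the target MAC is set to broadcast.

```c
#include <stdint.h>
#include <string.h>

#define SKETCH_GARP_LEN 42      /* 14-byte Ethernet header + 28-byte ARP */

/* Hypothetical container for the frame bytes. */
typedef struct
{
  uint8_t bytes[SKETCH_GARP_LEN];
} sketch_garp_frame_t;

/* Build a gratuitous ARP frame announcing that virtual IP vip is at
 * virtual MAC vmac. Assumption: request opcode, sender IP == target IP. */
static sketch_garp_frame_t
sketch_garp_build (const uint8_t vmac[6], const uint8_t vip[4])
{
  sketch_garp_frame_t f;
  uint8_t *arp = f.bytes + 14;

  memset (f.bytes, 0xff, 6);    /* Ethernet destination: broadcast */
  memcpy (f.bytes + 6, vmac, 6);        /* Ethernet source: virtual MAC */
  f.bytes[12] = 0x08;           /* EtherType 0x0806: ARP */
  f.bytes[13] = 0x06;

  arp[0] = 0x00;                /* hardware type 1: Ethernet */
  arp[1] = 0x01;
  arp[2] = 0x08;                /* protocol type 0x0800: IPv4 */
  arp[3] = 0x00;
  arp[4] = 6;                   /* hardware address length */
  arp[5] = 4;                   /* protocol address length */
  arp[6] = 0x00;                /* opcode 1: request (assumption) */
  arp[7] = 0x01;
  memcpy (arp + 8, vmac, 6);    /* sender MAC: virtual MAC */
  memcpy (arp + 14, vip, 4);    /* sender IP: virtual IP */
  memset (arp + 18, 0xff, 6);   /* target MAC: broadcast (assumption) */
  memcpy (arp + 24, vip, 4);    /* target IP: same virtual IP */
  return f;
}
```

Because the sender MAC is the virtual MAC, switches and hosts that see this frame update their tables to point the virtual IP at the new Master.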

  1. ND packet

static void
vrrp6_na_pkt_build (vrrp_vr_t * vr, vlib_buffer_t * b, ip6_address_t * addr6)

int
vrrp_garp_or_na_send (vrrp_vr_t * vr)
{
  vlib_main_t *vm = vlib_get_main ();
  vrrp_main_t *vmp = &vrrp_main;
  vlib_frame_t *to_frame;
  u32 *bi = 0;
  u32 n_buffers;
  u32 *to_next;
  int i;

  if (vec_len (vr->config.peer_addrs))
    return 0;                   /* unicast is used in routed environments - don't garp */

  n_buffers = vec_len (vr->config.vr_addrs);
  if (!n_buffers)
    {
      clib_warning ("Unable to send gratuitous ARP for VR %U - no addresses",
                    format_vrrp_vr_key, vr);
      return -1;
    }

  /* need to send a packet for each VR address */
  vec_validate (bi, n_buffers - 1);

  if (vlib_buffer_alloc (vm, bi, n_buffers) != n_buffers)
    {
      clib_warning ("Buffer allocation failed for %U", format_vrrp_vr_key,
                    vr);
      vec_free (bi);
      return -1;
    }

  to_frame = vlib_get_frame_to_node (vm, vmp->intf_output_node_idx);
  to_frame->n_vectors = 0;
  to_next = vlib_frame_vector_args (to_frame);

  for (i = 0; i < n_buffers; i++)
    {
      vlib_buffer_t *b;
      ip46_address_t *addr;

      addr = vec_elt_at_index (vr->config.vr_addrs, i);
      b = vlib_get_buffer (vm, bi[i]);
      VLIB_BUFFER_TRACE_TRAJECTORY_INIT (b);
      b->flags |= VNET_BUFFER_F_LOCALLY_ORIGINATED;
      vnet_buffer (b)->sw_if_index[VLIB_RX] = 0;
      vnet_buffer (b)->sw_if_index[VLIB_TX] = vr->config.sw_if_index;

      if (vrrp_vr_is_ipv6 (vr))
        vrrp6_na_pkt_build (vr, b, &addr->ip6);
      else
        vrrp4_garp_pkt_build (vr, b, &addr->ip4);

      vlib_buffer_reset (b);

      to_next[i] = bi[i];
      to_frame->n_vectors++;
    }

  vlib_put_frame_to_node (vm, vmp->intf_output_node_idx, to_frame);

  return 0;
}

  1. Sending VRRP advertisements

int
vrrp_adv_send (vrrp_vr_t * vr, int shutdown)
{
  vlib_main_t *vm = vlib_get_main ();
  vlib_frame_t *to_frame;
  int i, n_buffers = 1;
  u32 node_index, *to_next, *bi = 0;
  u8 is_unicast = vrrp_vr_is_unicast (vr);

  /* send directly out of the interface */
  node_index = vrrp_adv_next_node (vr);

  if (is_unicast)
    n_buffers = vec_len (vr->config.peer_addrs);

  if (n_buffers < 1)
    {
      /* A unicast VR will not start without peers added so this should
       * not happen. Just avoiding a crash if it happened somehow.
       */
      clib_warning ("Unicast VR configuration corrupted for %U",
                    format_vrrp_vr_key, vr);
      return -1;
    }

  vec_validate (bi, n_buffers - 1);
  if (vlib_buffer_alloc (vm, bi, n_buffers) != n_buffers)
    {
      clib_warning ("Buffer allocation failed for %U", format_vrrp_vr_key,
                    vr);
      vec_free (bi);
      return -1;
    }

  to_frame = vlib_get_frame_to_node (vm, node_index);
  to_next = vlib_frame_vector_args (to_frame);

  for (i = 0; i < n_buffers; i++)
    {
      vlib_buffer_t *b;
      u32 bi0;
      /* get the IPv4 or IPv6 multicast address */
      const ip46_address_t *dst = vrrp_adv_mcast_addr (vr);

      bi0 = vec_elt (bi, i);
      b = vlib_get_buffer (vm, bi0);
      VLIB_BUFFER_TRACE_TRAJECTORY_INIT (b);
      b->flags |= VNET_BUFFER_F_LOCALLY_ORIGINATED;
      vnet_buffer (b)->sw_if_index[VLIB_RX] = 0;
      /* set the outgoing interface */
      vnet_buffer (b)->sw_if_index[VLIB_TX] = vr->config.sw_if_index;

      if (is_unicast)
        {
          dst = vec_elt_at_index (vr->config.peer_addrs, i);
          vnet_buffer (b)->sw_if_index[VLIB_TX] = ~0;
        }
      else
        /* build the L2 multicast header */
        vrrp_adv_l2_build_multicast (vr, b);

      /* add the L3 header */
      vrrp_adv_l3_build (vr, b, dst);
      /* add the VRRP header */
      vrrp_adv_payload_build (vr, b, shutdown);

      vlib_buffer_reset (b);

      to_next[i] = bi0;
    }

  to_frame->n_vectors = n_buffers;

  vlib_put_frame_to_node (vm, node_index, to_frame);

  vec_free (bi);

  return 0;
}

  1. State machine handling

  2. Node registration

VLIB_REGISTER_NODE (vrrp_periodic_node) =
{
  .function = vrrp_periodic_process,
  .type = VLIB_NODE_TYPE_PROCESS,
  .name = "vrrp-periodic-process",
};

static uword
vrrp_periodic_process (vlib_main_t * vm,
                       vlib_node_runtime_t * rt, vlib_frame_t * f)
{
  vrrp_main_t *pm = &vrrp_main;
  f64 now;
  f64 timeout = 10.0;
  uword *event_data = 0;
  uword event_type;
  u32 next_timer = ~0;
  vrrp_vr_timer_t *timer;

  while (1)
    {
      now = vlib_time_now (vm);

      if (next_timer == ~0)
        {
          /* no timer pending: wait for an event */
          vlib_process_wait_for_event (vm);
        }
      else
        {
          timer = pool_elt_at_index (pm->vr_timers, next_timer);
          timeout = timer->expire_time - now;

          /*
           * vlib_process_wait_for_event_or_clock first checks whether any
           * bit is set in non_empty_event_type_bitmap. If so, there are
           * events to handle and it returns immediately. Otherwise it sets
           * the suspend flag to mark the process as suspended and waits
           * for an event or for the clock.
           */
          vlib_process_wait_for_event_or_clock (vm, timeout);
        }

      /* get the event type */
      event_type = vlib_process_get_events (vm, (uword **) & event_data);

      switch (event_type)
        {
          /* Handle VRRP_EVENT_VR_TIMER_UPDATE */
        case VRRP_EVENT_VR_TIMER_UPDATE:
          next_timer = vrrp_vr_timer_get_next ();
          break;

          /* Handle periodic timeouts */
        case ~0:
          /* handle the expired timer, e.g. send an advertisement */
          vrrp_vr_timer_timeout (next_timer);
          next_timer = vrrp_vr_timer_get_next ();
          break;
        }

      vec_reset_length (event_data);
    }

  return 0;
}

void
vrrp_vr_timer_timeout (u32 timer_index)
{
  vrrp_main_t *vmp = &vrrp_main;
  vrrp_vr_timer_t *timer;
  vrrp_vr_t *vr;

  if (pool_is_free_index (vmp->vr_timers, timer_index))
    {
      clib_warning ("Timeout on free timer index %u", timer_index);
      return;
    }

  timer = pool_elt_at_index (vmp->vr_timers, timer_index);
  vr = pool_elt_at_index (vmp->vrs, timer->vr_index);

  switch (timer->type)
    {
    case VRRP_VR_TIMER_ADV:
      /* Master: send the next advertisement and re-arm the timer */
      vrrp_adv_send (vr, 0);
      vrrp_vr_timer_set (vr, VRRP_VR_TIMER_ADV);
      break;
    case VRRP_VR_TIMER_MASTER_DOWN:
      /* Backup: no advertisement arrived in time, take over as Master */
      vrrp_vr_transition (vr, VRRP_VR_STATE_MASTER, NULL);
      break;
    default:
      clib_warning ("Unrecognized timer type %d", timer->type);
      return;
    }
}

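  1. Timer handling

Master_Down_Interval timer: if a Backup device has still not received an advertisement when this timer expires, it switches to the Master state. The timer is computed as Master_Down_Interval = (3 * Advertisement_Interval) + Skew_Time, where Skew_Time = (256 - Priority) / 256. This is the VRRPv2 form, in seconds; RFC 5798 counts in centiseconds and scales Skew_Time by the Master's advertisement interval.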
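
In VRRPv3 (RFC 5798) the same computation is done in centiseconds, with Skew_Time = ((256 - Priority) * Master_Adver_Interval) / 256. A minimal sketch of that arithmetic follows. The function names are hypothetical; they are not the VPP functions `vrrp_vr_skew_compute` and `vrrp_vr_master_down_compute`, whose bodies are not shown in this article.

```c
#include <stdint.h>

/* Hypothetical helpers; all intervals are in centiseconds (RFC 5798). */

/* Skew_Time = ((256 - priority) * Master_Adver_Interval) / 256 */
static uint16_t
sketch_vrrp_skew_cs (uint8_t priority, uint16_t master_adv_int_cs)
{
  return (uint16_t) (((256u - priority) * master_adv_int_cs) / 256u);
}

/* Master_Down_Interval = 3 * Master_Adver_Interval + Skew_Time */
static uint16_t
sketch_vrrp_master_down_cs (uint8_t priority, uint16_t master_adv_int_cs)
{
  return (uint16_t) (3u * master_adv_int_cs +
                     sketch_vrrp_skew_cs (priority, master_adv_int_cs));
}
```

With a 100-centisecond (1 s) interval, a Backup with priority 100 gets a skew of 60 and a Master_Down interval of 360 centiseconds, while priority 200 gives 21 and 321. The higher-priority Backup therefore times out first and wins the election.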

  1. Setting a timer

void
vrrp_vr_timer_set (vrrp_vr_t * vr, vrrp_vr_timer_type_t type)
{
  vrrp_main_t *vmp = &vrrp_main;
  vlib_main_t *vm = vlib_get_main ();
  vrrp_vr_timer_t *timer;
  f64 now;

  /* Each VR should be waiting on at most 1 timer at any given time.
   * If there is already a timer set for this VR, cancel it.
   */
  if (vr->runtime.timer_index != ~0)
    vrrp_vr_timer_cancel (vr);

  pool_get (vmp->vr_timers, timer);
  vr->runtime.timer_index = timer - vmp->vr_timers;

  timer->vr_index = vr - vmp->vrs;
  timer->type = type;

  /* current time */
  now = vlib_time_now (vm);

  /* RFC 5798 specifies that timers are in centiseconds, so x / 100.0 */
  switch (type)
    {
    case VRRP_VR_TIMER_ADV:
      timer->expire_time = now + (vr->config.adv_interval / 100.0);
      break;
    case VRRP_VR_TIMER_MASTER_DOWN:
      timer->expire_time = now + (vr->runtime.master_down_int / 100.0);
      break;
    default:
      /* should never reach here */
      clib_warning ("Unrecognized VRRP timer type (%d)", type);
      return;
    }

  vec_add1 (vmp->pending_timers, vr->runtime.timer_index);

  /* sort by expiry time, ascending */
  vec_sort_with_function (vmp->pending_timers, vrrp_vr_timer_compare);

  /* signal a TIMER UPDATE event to the periodic process */
  vlib_process_signal_event (vmp->vlib_main, vrrp_periodic_node.index,
                             VRRP_EVENT_VR_TIMER_UPDATE, 0);
}
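
The idea of keeping pending timers ordered by expiry time, so that the periodic process only has to look at one end of the list, can be sketched with the standard `qsort`. This is an illustration only: the struct and function names are hypothetical, the ascending order follows the comment in the listing, and the real `vrrp_vr_timer_compare` (not shown in this article) may differ.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical timer record for illustration. */
typedef struct
{
  double expire_time;           /* absolute time in seconds */
  unsigned vr_index;            /* which VR the timer belongs to */
} sketch_timer_t;

/* qsort comparator: earlier expiry sorts first. */
static int
sketch_timer_compare (const void *a, const void *b)
{
  const sketch_timer_t *t1 = a;
  const sketch_timer_t *t2 = b;

  if (t1->expire_time < t2->expire_time)
    return -1;
  if (t1->expire_time > t2->expire_time)
    return 1;
  return 0;
}

/* Sort n timers in place so that timers[0] is the next one to fire. */
static void
sketch_timers_sort (sketch_timer_t * timers, size_t n)
{
  qsort (timers, n, sizeof (timers[0]), sketch_timer_compare);
}
```

Re-sorting on every insertion is simple and cheap when there are only a handful of virtual routers; a heap or timer wheel would scale better for large numbers of timers.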

  1. Master/Backup switchover handling

void
vrrp_vr_transition (vrrp_vr_t * vr, vrrp_vr_state_t new_state, void *data)
{
  clib_warning ("VR %U transitioning to %U", format_vrrp_vr_key, vr,
                format_vrrp_vr_state, new_state);

  /* Don't do anything if transitioning to the state VR is already in.
   * This should never happen, just covering our bases.
   */
  if (new_state == vr->runtime.state)
    return;

  if (new_state == VRRP_VR_STATE_MASTER)
    {
      /* RFC 5798 sec 6.4.1 (105) - startup event for VR with priority 255
       *          sec 6.4.2 (365) - master down timer fires on backup VR
       */
      vrrp_vr_multicast_group_join (vr);
      vrrp_adv_send (vr, 0);
      vrrp_garp_or_na_send (vr);

      vrrp_vr_timer_set (vr, VRRP_VR_TIMER_ADV);
    }
  else if (new_state == VRRP_VR_STATE_BACKUP)
    {
      /* RFC 5798 sec 6.4.1 (150) - startup event for VR with priority < 255
       *          sec 6.4.3 (735) - master preempted by higher priority VR
       */
      vrrp_vr_multicast_group_join (vr);

      if (vr->runtime.state == VRRP_VR_STATE_MASTER)
        {
          vrrp_header_t *pkt = data;
          vr->runtime.master_adv_int = vrrp_adv_int_from_packet (pkt);
        }
      else                      /* INIT, INTF_DOWN */
        vr->runtime.master_adv_int = vr->config.adv_interval;

      vrrp_vr_skew_compute (vr);
      vrrp_vr_master_down_compute (vr);
      vrrp_vr_timer_set (vr, VRRP_VR_TIMER_MASTER_DOWN);
    }
  else if (new_state == VRRP_VR_STATE_INIT)
    {
      /* RFC 5798 sec 6.4.2 (345) - shutdown event for backup VR
       *          sec 6.4.3 (655) - shutdown event for master VR
       */
      vrrp_vr_timer_cancel (vr);
      if (vr->runtime.state == VRRP_VR_STATE_MASTER)
        vrrp_adv_send (vr, 1);
    }
  else if (new_state == VRRP_VR_STATE_INTF_DOWN)
    /* State is not specified by RFC. This is to avoid attempting to
     * send packets on an interface that's down and to avoid having a
     * VR believe it is already the master when an interface is brought up
     */
    vrrp_vr_timer_cancel (vr);

  /* add/delete virtual IP addrs if accept_mode is true */
  vrrp_vr_transition_addrs (vr, new_state);

  /* enable/disable arp/ND input features if necessary */
  vrrp_vr_transition_intf (vr, new_state);

  /* add/delete virtual MAC address on NIC if necessary */
  vrrp_vr_transition_vmac (vr, new_state);

  vr->runtime.state = new_state;
}

  1. Processing VRRP advertisements

static void
vrrp_input_process_master (vrrp_vr_t * vr, vrrp_header_t * pkt)
{
  /* received priority 0, another VR is shutting down. send an adv and
   * remain in the master state
   */
  if (pkt->priority == 0)
    {
      clib_warning ("Received shutdown message from a peer on VR %U",
                    format_vrrp_vr_key, vr);
      vrrp_adv_send (vr, 0);
      vrrp_vr_timer_set (vr, VRRP_VR_TIMER_ADV);
      return;
    }

  /* if either:
   * - received priority > adjusted priority, or
   * - received priority == adjusted priority and peer addr > local addr
   * allow the local VR to be preempted by the peer
   */
  if ((pkt->priority > vrrp_vr_priority (vr)) ||
      ((pkt->priority == vrrp_vr_priority (vr)) &&
       (vrrp_vr_addr_cmp (vr, pkt) < 0)))
    {
      vrrp_vr_transition (vr, VRRP_VR_STATE_BACKUP, pkt);
      return;
    }

  /* if we made it this far, either received priority < adjusted priority or
   * received == adjusted and local addr > peer addr. Ignore.
   */
  return;
}

/* RFC 5798 section 6.4.2 */
static void
vrrp_input_process_backup (vrrp_vr_t * vr, vrrp_header_t * pkt)
{
  vrrp_vr_config_t *vrc = &vr->config;
  vrrp_vr_runtime_t *vrt = &vr->runtime;

  /* master shutting down, ready for election */
  if (pkt->priority == 0)
    {
      clib_warning ("Master for VR %U is shutting down", format_vrrp_vr_key,
                    vr);
      vrt->master_down_int = vrt->skew;
      vrrp_vr_timer_set (vr, VRRP_VR_TIMER_MASTER_DOWN);
      return;
    }

  /* no preempt set or adv from a higher priority router, update timers */
  if (!(vrc->flags & VRRP_VR_PREEMPT) ||
      (pkt->priority >= vrrp_vr_priority (vr)))
    {
      vrt->master_adv_int = clib_net_to_host_u16 (pkt->rsvd_and_max_adv_int);
      vrt->master_adv_int &= ((u16) 0x0fff);    /* ignore rsvd bits */

      vrrp_vr_skew_compute (vr);
      vrrp_vr_master_down_compute (vr);
      vrrp_vr_timer_set (vr, VRRP_VR_TIMER_MASTER_DOWN);
      return;
    }

  /* preempt set and our priority > received, continue to wait on master down */
  return;
}

always_inline void
vrrp_input_process (vrrp_input_process_args_t * args)
{
  vrrp_vr_t *vr;

  vr = vrrp_vr_lookup_index (args->vr_index);

  if (!vr)
    {
      clib_warning ("Error retrieving VR with index %u", args->vr_index);
      return;
    }

  switch (vr->runtime.state)
    {
    case VRRP_VR_STATE_INIT:
      return;
    case VRRP_VR_STATE_BACKUP:
      /* this is usually the only state an advertisement should be received */
      vrrp_input_process_backup (vr, args->pkt);
      break;
    case VRRP_VR_STATE_MASTER:
      /* might be getting preempted. or have a misbehaving peer */
      clib_warning ("Received advertisement for master VR %U",
                    format_vrrp_vr_key, vr);
      vrrp_input_process_master (vr, args->pkt);
      break;
    default:
      clib_warning ("Received advertisement for VR %U in unknown state %d",
                    format_vrrp_vr_key, vr, vr->runtime.state);
      break;
    }

  return;
}