Routing TCP/IP, Volume II by Jeff Doyle
Introduction to BGP-4
CIDR
The Internet
Internet subscribers connect to an Internet service provider (ISP). These local ISPs in turn are the customers of larger ISPs that cover an entire geographic region such as a state or a group of adjacent states. These larger ISPs are called regional service providers. The regional service providers, in turn, connect to large ISPs with high-speed backbones spanning a national or global area. More commonly, these various providers are referred to as Tier III, Tier II, and Tier I providers, respectively.
CIDR
Before CIDR, if your company needed 500 host addresses, a Class C address would not have served your needs. You probably would have requested a Class B address, even though you would be wasting 65,000 host addresses. With CIDR, your needs can be met with a /23 block.
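The arithmetic behind this example can be checked with Python's standard ipaddress module. This is only a sketch: the 10.1.0.0 prefixes are hypothetical stand-ins, and the helper name usable_hosts is not from the book.

```python
import ipaddress

def usable_hosts(prefix: str) -> int:
    """Host addresses in a prefix, excluding the network and broadcast addresses."""
    net = ipaddress.ip_network(prefix)
    return net.num_addresses - 2

needed = 500
class_c = usable_hosts("10.1.0.0/24")   # same size as a Class C network
cidr_23 = usable_hosts("10.1.0.0/23")   # two contiguous /24s
class_b = usable_hosts("10.1.0.0/16")   # same size as a Class B network

print("Class C fits:", class_c >= needed)
print("/23 fits:", cidr_23 >= needed, "spare:", cidr_23 - needed)
print("Class B spare:", class_b - needed)
```

The /23 leaves only a handful of spare addresses, whereas the Class B allocation strands roughly 65,000.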
Difficulties with CIDR
Who needs BGP
An important principle to remember when working with inter-AS traffic is that each physical link actually represents two logical links: one for incoming traffic and one for outgoing traffic.
A Single-Homed Autonomous System
Static Routes Are All That Is Needed
Multihoming to a Single Autonomous System
When the redundant link is used only for backup, there is again no call for BGP. The routes can be advertised just as they were in the single-homed scenario, except that the routes associated with the backup link have the distances set high so that they are used only if the primary link fails.
If the geographical separation between the two (or more) exit points is large enough for delay variations to become significant, you might have a need for better control of the routing. You might now consider BGP.
Remember that incoming route advertisements influence your outgoing traffic, and outgoing route advertisements influence your incoming traffic.
You should use BGP only when you can realize an advantage in traffic control. Consider the incoming and outgoing traffic separately. If it is only important to control your incoming traffic, use BGP to advertise routes to your provider while still advertising only a default route into your AS. On the other hand, if it is only important to control your outgoing traffic, use BGP only to receive routes from your provider. Consider carefully the ramifications of accepting routes from your provider, because a full Internet routing table is very large. Taking “partial BGP routes” is a compromise: for example, a provider might advertise only routes to its other subscribers, plus a default route to reach the rest of the Internet.
Multihoming to Multiple Autonomous Systems
The best candidates for multihoming to multiple providers are corporations and ISPs that are large enough to qualify for a provider-independent address space (or who already have one) and a public autonomous system number.
One option is to use one ISP as a primary Internet connection and the other as a backup only; another option is to default route to both providers and let the routing chips fall where they may. If neither of these solutions is acceptable, BGP is the preferred option, and there are several ways to use it. If full routes are accepted from both providers, the best route for every Internet destination is chosen. Alternatively, full routes can be taken from the preferred provider and partial routes from the other provider. In a third arrangement, each provider sends only its own customer routes, and the subscriber points default routes to both providers. In a fourth, each ISP sends its customer routes and also the customer routes of its upstream provider.
“Load Balancing”
Multihoming is for redundancy and increased routing efficiency, not load balancing.
BGP Basics
BGP runs over TCP, using port 179. For each destination, BGP carries a list of the AS numbers through which a packet must pass to reach that destination. This list is called the AS_PATH and is one of several path attributes; because of it, BGP is called a path vector routing protocol.
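A minimal sketch of how an AS_PATH grows and how it can be used to reject looping routes, assuming a plain list of AS numbers with the most recent AS first. The function names and AS numbers are illustrative only; real BGP also carries AS_SET segments, which are ignored here.

```python
from typing import List

def advertise(as_path: List[int], local_as: int) -> List[int]:
    """Return the AS_PATH as sent to an EBGP peer: the local AS is added at the front."""
    return [local_as] + as_path

def accept(as_path: List[int], local_as: int) -> bool:
    """A router rejects a route whose AS_PATH already contains its own AS number."""
    return local_as not in as_path

# A route originated in AS 100, passed to AS 200, then to AS 300.
path = advertise([], 100)
path = advertise(path, 200)
path = advertise(path, 300)

print(path)                 # most recent AS is listed first
print(accept(path, 400))    # AS 400 has not seen this route before
print(accept(path, 200))    # AS 200 finds itself in the path: a loop
```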
BGP does not show the details of the topologies within each AS. Because BGP sees only a tree of autonomous systems, it can be said that BGP takes a higher view of the Internet than IGP, which sees only the topology within an AS.
show ip bgp
The table shows destination networks, next-hop routers, Metric, LocPrf, and Weight. Notice that each AS_PATH ends in an i. This is the origin code, indicating that the route was originated from an IGP.
BGP Message Types
Open, Keepalive, Update, Notification
BGP States
Idle, Connect, Active, OpenSent, OpenConfirm, Established
Path Attributes
ORIGIN Well-known mandatory
AS_PATH Well-known mandatory
NEXT_HOP Well-known mandatory
LOCAL_PREF Well-known discretionary
ATOMIC_AGGREGATE Well-known discretionary
AGGREGATOR Optional transitive
COMMUNITY Optional transitive
MULTI_EXIT_DISC (MED) Optional nontransitive
ORIGINATOR_ID Optional nontransitive
CLUSTER_LIST Optional nontransitive
If an internal BGP speaker receives multiple routes to the same destination, it compares the LOCAL_PREF attributes of the routes. The route with the highest LOCAL_PREF is selected.
To influence incoming traffic, the MULTI_EXIT_DISC attribute, known as the MED for short, is used. This optional nontransitive attribute is carried in EBGP updates and allows an AS to inform another AS of its preferred ingress points.
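The interplay of these attributes can be sketched as a simplified comparison. This is not the full BGP decision process: it assumes only three tie-breakers (highest LOCAL_PREF, then shortest AS_PATH, then lowest MED) and compares MED across all routes, which real implementations do only under specific conditions. The Route class and the sample values are hypothetical.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Route:
    next_hop: str
    local_pref: int
    as_path: List[int]
    med: int

def best_route(routes: List[Route]) -> Route:
    """Pick a route by highest LOCAL_PREF, then shortest AS_PATH, then lowest MED."""
    return min(routes, key=lambda r: (-r.local_pref, len(r.as_path), r.med))

candidates = [
    Route("10.0.0.1", local_pref=100, as_path=[75, 50], med=0),
    Route("10.0.0.2", local_pref=300, as_path=[75, 60, 50], med=50),
]
print(best_route(candidates).next_hop)  # LOCAL_PREF outweighs the longer AS_PATH

tied = [
    Route("10.0.0.3", local_pref=100, as_path=[75, 50], med=200),
    Route("10.0.0.4", local_pref=100, as_path=[75, 50], med=100),
]
print(best_route(tied).next_hop)        # equal so far, so the lower MED wins
```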
Route Dampening
Penalty, Suppress limit, Reuse limit, Half-life, Maximum suppress time
IBGP and IGP
To protect against loops, BGP does not advertise routes that have been learned from an IBGP peer to another IBGP peer.
The IBGP internetwork must be fully meshed.
Two tools are available for relaxing the full IBGP mesh requirement: route reflectors and confederations.
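The scaling problem behind the full-mesh requirement is simple arithmetic. The sketch below counts IBGP sessions for a full mesh and, for contrast, for a simplified design in which one router acts as the only route reflector and every other router is its client. That single-reflector layout is an assumption made for illustration.

```python
def full_mesh_sessions(routers: int) -> int:
    """Every router peers with every other router: n(n-1)/2 sessions."""
    return routers * (routers - 1) // 2

def single_reflector_sessions(routers: int) -> int:
    """One route reflector, all other routers are clients: n-1 sessions."""
    return max(routers - 1, 0)

for n in (4, 10, 50):
    print(n, full_mesh_sessions(n), single_reflector_sessions(n))
```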
IGP is used for establishing IBGP connectivity.
Managing Large-Scale BGP Peering
Route Reflector
A route reflector and its clients are known collectively as a cluster.
Route reflectors work by relaxing the rule that IBGP peers cannot advertise routes learned from other IBGP peers. To avoid possible routing loops or other routing errors, the route reflector cannot change the attributes of the routes it receives from clients.
For redundancy, a cluster can have more than one RR. The clients have physical connections to each of the route reflectors.
An AS also can have multiple clusters, with each cluster having redundant route reflectors.
A route reflector can be the client of another route reflector. Thus, you can build “nested” route reflection clusters.
To prevent routing loops, route reflectors use two BGP path attributes: ORIGINATOR_ID and CLUSTER_LIST.
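A rough sketch of how these two attributes can stop a reflected route from looping, assuming the usual behavior: a reflector sets ORIGINATOR_ID if it is not already present and adds its own cluster ID to the front of CLUSTER_LIST, and a receiver discards an update that carries its own router ID or cluster ID. The Update class, function names, and IDs are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Update:
    prefix: str
    originator_id: Optional[str] = None
    cluster_list: List[str] = field(default_factory=list)

def reflect(update: Update, learned_from: str, cluster_id: str) -> Update:
    """Return the update as a route reflector would pass it on."""
    return Update(
        prefix=update.prefix,
        originator_id=update.originator_id or learned_from,  # set only once
        cluster_list=[cluster_id] + update.cluster_list,     # record this cluster
    )

def accept(update: Update, router_id: str, cluster_id: Optional[str] = None) -> bool:
    """Discard updates that have already passed through this router or cluster."""
    if update.originator_id == router_id:
        return False
    if cluster_id is not None and cluster_id in update.cluster_list:
        return False
    return True

original = Update("10.33.5.0/24")
reflected = reflect(original, learned_from="10.33.255.2", cluster_id="33")

print(accept(reflected, router_id="10.33.255.3"))                   # another client
print(accept(reflected, router_id="10.33.255.2"))                   # the originator
print(accept(reflected, router_id="10.33.255.5", cluster_id="33"))  # RR in same cluster
```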
Confederations
A confederation is an AS that has been subdivided into a group of member autonomous systems. A confederation ID is the AS number of the entire confederation.
Confederations add two more types to the AS_PATH, AS_CONFED_SEQUENCE and AS_CONFED_SET.
It is common practice to use the reserved range 64512 to 65535 to number the member autonomous systems.
Configuring BGP-4
Basic BGP Configuration
Peering BGP routers
Taos
router bgp 200
neighbor 192.168.1.226 remote-as 100
Vail
router bgp 100
neighbor 192.168.1.222 remote-as 100
neighbor 192.168.1.225 remote-as 200
show ip bgp neighbors
The interface from which the router ID is taken does not have to be running BGP.
clear ip bgp
bgp router-id
“Real-life” IBGP implementations use either the next-hop-self function or run an IGP in passive mode on the external interfaces.
Injecting IGP routes into BGP
For each prefix specified with the command network, BGP looks into the routing table. If an entry in the table exactly matches the network prefix, that prefix is entered into the BGP table and advertised.
router eigrp 200
passive-interface Serial0
network 192.168.1.0
network 192.168.100.0
!
router bgp 200
network 192.168.1.216 mask 255.255.255.252
network 192.168.100.0
network 192.168.200.0
neighbor 192.168.1.226 remote-as 100
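The exact-match rule for the network command can be sketched as a set lookup. The routing table contents below are hypothetical; the point is only that a prefix covered by a less specific entry is not treated as a match.

```python
import ipaddress
from typing import Iterable, List

def advertised(network_statements: Iterable[str], routing_table: Iterable[str]) -> List[str]:
    """Return the network statements that exactly match a routing table entry."""
    table = {ipaddress.ip_network(p) for p in routing_table}
    return [p for p in network_statements if ipaddress.ip_network(p) in table]

# Hypothetical routing table: 192.168.200.0/24 is absent, although
# the less specific 192.168.0.0/16 covers it.
routing_table = ["192.168.1.216/30", "192.168.100.0/24", "192.168.0.0/16"]
statements = ["192.168.1.216/30", "192.168.100.0/24", "192.168.200.0/24"]

print(advertised(statements, routing_table))
```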
IBGP Over an IGP
A single IBGP session can be created between the loopback interfaces of the routers. OSPF takes care of finding the best path for the IBGP session.
You can also establish EBGP sessions between loopback interfaces, though this is rarely done. The neighbor ebgp-multihop command is needed to raise the TTL of the EBGP packets from the default of 1 to 2, and static routes are necessary so that each router knows how to find the address of its neighbor’s loopback interface to begin the TCP session.
Aggregate Routes
router eigrp 100
network 192.168.199.0
!
router bgp 100
network 192.168.192.0 mask 255.255.248.0
neighbor 192.168.1.253 remote-as 200
!
ip route 192.168.192.0 255.255.248.0 Null0
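To see what the aggregate in this configuration covers, the mask 255.255.248.0 can be expanded with Python's ipaddress module. This is a quick check rather than part of the configuration.

```python
import ipaddress

aggregate = ipaddress.ip_network("192.168.192.0/255.255.248.0")
members = list(aggregate.subnets(new_prefix=24))

print(aggregate)                     # the aggregate in prefix-length notation
print(members[0], "to", members[-1]) # first and last /24 inside it
print(len(members))                  # how many /24 networks are summarized

# The EIGRP network from the configuration falls inside the aggregate.
eigrp_net = ipaddress.ip_network("192.168.199.0/24")
print(eigrp_net.subnet_of(aggregate))

# Collapsing the member /24s yields the aggregate again.
print(list(ipaddress.collapse_addresses(members)) == [aggregate])
```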
Use the aggregate-address command
router eigrp 100
network 192.168.199.0
!
router bgp 100
aggregate-address 192.168.192.0 255.255.248.0 summary-only
redistribute eigrp 100
neighbor 192.168.1.253 remote-as 200
Managing BGP Connections
neighbor description
neighbor password
advertisement-interval
bgp bestpath as-path ignore
neighbor maximum-prefix
neighbor shutdown
timers bgp
Routing Policies
No other IP routing protocol offers policy features as powerful as those of BGP, and no other protocol carries as great a potential for getting you into trouble as does BGP.
clear ip bgp *
clear ip bgp soft in
As with a “hard” reset, you can specify a single neighbor, a peer group, or all BGP connections.
Filtering Routes by NLRI
The first and simplest of the route filters available to BGP are defined by the distribute-list command. This route filter is defined for each neighbor or peer group and points to an access list that defines the prefixes, or NLRI, on which the filter will act.
Filtering Routes by AS_PATH
ip as-path access-list
neighbor filter-list
Filtering with Route Maps
neighbor route-map
match ip address
match as-path
Administrative Weights
neighbor weight
neighbor 10.200.60.1 weight 50000
neighbor filter-list weight
router bgp 30
neighbor 10.200.60.1 filter-list 2 weight 60000
!
ip as-path access-list 2 permit _75$
neighbor route-map
router bgp 30
neighbor 10.200.60.1 route-map Innsbruck in
!
ip as-path access-list 2 permit _75$
ip as-path access-list 3 permit _50$
!
route-map Innsbruck permit 10
match as-path 2
set weight 40000
route-map Innsbruck permit 20
match as-path 3
set weight 60000
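The AS_PATH patterns used in these access lists can be tried out in Python. The sketch assumes the usual meaning of the underscore in IOS regular expressions (start of string, end of string, a space, a comma, a brace, or a parenthesis) and translates it into an equivalent Python group. The helper names are hypothetical and the translation is only an approximation of router behavior.

```python
import re

def cisco_to_python(pattern: str) -> str:
    """Translate the IOS underscore into an explicit delimiter group."""
    return pattern.replace("_", r"(?:^|$|[ ,{}()])")

def matches(pattern: str, as_path: str) -> bool:
    """True if the AS_PATH string matches the IOS-style pattern."""
    return re.search(cisco_to_python(pattern), as_path) is not None

# _75$ selects routes that originated in AS 75 (75 is the last AS in the path).
print(matches("_75$", "50 75"))    # originated in AS 75
print(matches("_75$", "75"))       # AS 75 is a direct neighbor and the origin
print(matches("_75$", "75 50"))    # passes through 75 but originated in AS 50
print(matches("_75$", "50 175"))   # 175 is a different AS number
```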
Local Preference
Unlike administrative weight, the LOCAL_PREF is not limited to a single router. Rather, it is communicated to IBGP peers. The attribute is not communicated to EBGP peers—hence the name local preference.
ip default local-preference
set local-preference
router bgp 30
neighbor 10.100.65.1 route-map PREF in
!
ip as-path access-list 2 permit _75$
route-map PREF permit 10
match as-path 2
set local-preference 300
route-map PREF permit 20
Multi_Exit_Disc
The MULTI_EXIT_DISC attribute, or MED, is used to influence the routing decisions in neighboring autonomous systems.
Another term for MED is metric, and another term for metric is distance. So remember “highest preference, shortest distance.”
router bgp 30
neighbor 10.100.83.1 route-map MED out
!
access-list 1 permit 172.31.0.0
route-map MED permit 10
match ip address 1
set metric 100
Prepending the AS_PATH
route-map PATH permit 10
match ip address 3
set as-path prepend 30
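The effect of prepending can be sketched with plain lists. The example assumes AS 30 advertises the same prefix over two links and prepends one extra copy of its AS number on the link it wants used less. The function name and the two-link scenario are illustrative.

```python
from typing import List

def advertise(as_path: List[int], local_as: int, prepend_count: int = 0) -> List[int]:
    """EBGP advertisement: the local AS is added once, plus any prepended copies."""
    return [local_as] * (1 + prepend_count) + as_path

primary = advertise([], 30)                   # no prepending
backup = advertise([], 30, prepend_count=1)   # one extra copy of AS 30

print(primary, backup)

# All else being equal, the neighbor prefers the shorter AS_PATH.
preferred = min([primary, backup], key=len)
print(preferred is primary)
```

Prepending only helps when earlier steps of the neighbor's decision process, such as LOCAL_PREF, are tied.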
Route Tagging
Tags are useful when a route is redistributed from protocol A into protocol B and then redistributed back into protocol A at some other point.
Route Dampening
Route dampening is enabled under the BGP process configuration with the command bgp dampening. If you want to change the default values, the syntax is bgp dampening half-life reuse suppress max-suppress.
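How these parameters interact can be sketched with exponential decay. The values below (a penalty of 1000 per flap, a half-life of 15 minutes, a reuse limit of 750, and a suppress limit of 2000) are the commonly cited IOS defaults and should be treated as assumptions here. The maximum suppress time is not modeled, and the function names are hypothetical.

```python
import math
from typing import List

PENALTY_PER_FLAP = 1000.0
HALF_LIFE_MIN = 15.0
REUSE_LIMIT = 750.0
SUPPRESS_LIMIT = 2000.0

def decayed(penalty: float, minutes: float, half_life: float = HALF_LIFE_MIN) -> float:
    """Penalty after decaying for the given time: it halves every half-life."""
    return penalty * 0.5 ** (minutes / half_life)

def penalty_after_flaps(flap_times: List[float]) -> float:
    """Accumulated penalty right after the last flap (times in minutes, ascending)."""
    penalty, last = 0.0, None
    for t in flap_times:
        if last is not None:
            penalty = decayed(penalty, t - last)
        penalty += PENALTY_PER_FLAP
        last = t
    return penalty

def minutes_until_reuse(penalty: float) -> float:
    """Time for the penalty to decay down to the reuse limit."""
    if penalty <= REUSE_LIMIT:
        return 0.0
    return HALF_LIFE_MIN * math.log2(penalty / REUSE_LIMIT)

p = penalty_after_flaps([0, 1, 2])   # three flaps, one minute apart
print(round(p), p > SUPPRESS_LIMIT, round(minutes_until_reuse(p), 1))
```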
show ip bgp flap-statistics
show ip bgp dampened-paths
clear ip bgp flap-statistics
clear ip bgp dampening
Large-Scale BGP
Private AS Numbers
AS numbers 64512 to 65535 are reserved for private use. (RFC 6996 later narrowed the private range to 64512 to 65534 and left 65535 reserved.)
neighbor remove-private-AS
BGP Confederations
IBGP is used normally within each member AS, but a special version of EBGP known as confederation EBGP is run between member autonomous systems.
router ospf 65534
network 10.34.0.0 0.0.255.255 area 65534
network 10.255.0.0 0.0.255.255 area 0
!
router bgp 65534
no synchronization
bgp confederation identifier 1200
bgp confederation peers 65533 65535
neighbor Confed peer-group
neighbor Confed ebgp-multihop 2
neighbor Confed update-source Loopback0
neighbor Confed next-hop-self
neighbor MyGroup peer-group
neighbor MyGroup remote-as 65534
neighbor MyGroup update-source Loopback0
neighbor 10.33.255.1 remote-as 65533
neighbor 10.33.255.1 peer-group Confed
neighbor 10.34.255.2 peer-group MyGroup
neighbor 10.35.255.1 remote-as 65535
neighbor 10.35.255.1 peer-group Confed
Confederation EBGP is something of a hybrid between normal EBGP and IBGP. Specifically, within a confederation, the following applies:
The NEXT_HOP attribute of routes external to the confederation is preserved throughout the confederation.
MULTI_EXIT_DISC attributes of routes advertised into a confederation are preserved throughout the confederation.
LOCAL_PREF attributes of routes are preserved throughout the entire confederation.
The AS numbers of the member autonomous systems are added to the AS_PATH within the confederation but are not advertised outside of the confederation.
The confederation AS numbers in an AS_PATH are used for loop avoidance but are not considered when choosing a shortest AS_PATH within the confederation.
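The last two rules can be sketched by modeling an AS_PATH as a list of typed segments. This is a simplified model: the segment type names follow the attribute names used in these notes, an AS_SET is assumed to count as one hop, and the function names and AS numbers are hypothetical.

```python
from typing import List, Tuple

Segment = Tuple[str, List[int]]

def path_length(path: List[Segment]) -> int:
    """AS_PATH length for route selection: confederation segments count as zero."""
    total = 0
    for kind, asns in path:
        if kind == "AS_SEQUENCE":
            total += len(asns)
        elif kind == "AS_SET":
            total += 1  # assumption: a set counts as a single hop
    return total

def export_outside(path: List[Segment], confed_id: int) -> List[Segment]:
    """Strip member AS numbers and add the confederation ID in their place."""
    stripped = [(k, a) for k, a in path if not k.startswith("AS_CONFED")]
    if stripped and stripped[0][0] == "AS_SEQUENCE":
        return [("AS_SEQUENCE", [confed_id] + stripped[0][1])] + stripped[1:]
    return [("AS_SEQUENCE", [confed_id])] + stripped

inside = [("AS_CONFED_SEQUENCE", [65534, 65533]), ("AS_SEQUENCE", [1000, 2000])]

print(path_length(inside))            # member AS numbers are not counted
print(export_outside(inside, 1200))   # outside peers see only the confederation ID
```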
You can design confederations by taking a cue from OSPF: have all member autonomous systems interconnect through a single backbone, in the same way that all OSPF areas interconnect through a backbone area, which eliminates the possibility of loops between them.
Route Reflectors
Fortress
router bgp 65533
no synchronization
bgp confederation identifier 1200
bgp confederation peers 65000
neighbor 10.33.255.1 remote-as 65000
neighbor 10.33.255.1 ebgp-multihop 2
neighbor 10.33.255.1 update-source Loopback0
neighbor 10.33.255.2 remote-as 65533
neighbor 10.33.255.2 update-source Loopback0
neighbor 10.33.255.2 route-reflector-client
neighbor 10.33.255.2 next-hop-self
neighbor 10.33.255.3 remote-as 65533
neighbor 10.33.255.3 update-source Loopback0
neighbor 10.33.255.3 route-reflector-client
neighbor 10.33.255.3 next-hop-self
Nakiska
router bgp 65533
no synchronization
bgp confederation identifier 1200
network 10.33.5.0 mask 255.255.255.0
neighbor 10.33.255.4 remote-as 65533
neighbor 10.33.255.4 update-source Loopback0
neighbor 10.33.255.4 next-hop-self
neighbor 172.17.255.1 remote-as 1000
neighbor 172.17.255.1 ebgp-multihop 2
neighbor 172.17.255.1 update-source Loopback0
Marmot
router bgp 65533
no synchronization
bgp confederation identifier 1200
network 10.33.4.0 mask 255.255.255.0
neighbor 10.33.255.4 remote-as 65533
neighbor 10.33.255.4 update-source Loopback0
neighbor 10.33.255.4 next-hop-self
If you configure more than one route reflector in a cluster, you must use the bgp cluster-id command to ensure that all RRs are identifying themselves as members of the same cluster.
Fortress
router bgp 65533
no synchronization
bgp cluster-id 33
bgp confederation identifier 1200
bgp confederation peers 65000
neighbor 10.33.255.1 remote-as 65000
neighbor 10.33.255.1 ebgp-multihop 2
neighbor 10.33.255.1 update-source Loopback0
neighbor 10.33.255.2 remote-as 65533
neighbor 10.33.255.2 update-source Loopback0
neighbor 10.33.255.2 route-reflector-client
neighbor 10.33.255.2 next-hop-self
neighbor 10.33.255.3 remote-as 65533
neighbor 10.33.255.3 update-source Loopback0
neighbor 10.33.255.3 route-reflector-client
neighbor 10.33.255.3 next-hop-self
neighbor 10.33.255.5 remote-as 65533
neighbor 10.33.255.5 update-source Loopback0
neighbor 10.33.255.5 next-hop-self
Norquay
router bgp 65533
no synchronization
bgp cluster-id 33
bgp confederation identifier 1200
bgp confederation peers 65000
neighbor 10.33.255.1 remote-as 65000
neighbor 10.33.255.1 ebgp-multihop 2
neighbor 10.33.255.1 update-source Loopback0
neighbor 10.33.255.2 remote-as 65533
neighbor 10.33.255.2 route-reflector-client
neighbor 10.33.255.2 update-source Loopback0
neighbor 10.33.255.2 next-hop-self
neighbor 10.33.255.3 remote-as 65533
neighbor 10.33.255.3 route-reflector-client
neighbor 10.33.255.3 update-source Loopback0
neighbor 10.33.255.3 next-hop-self
neighbor 10.33.255.4 remote-as 65533
neighbor 10.33.255.4 update-source Loopback0
neighbor 10.33.255.4 next-hop-self
The links interconnecting clusters must be between route reflectors, not between clients.
The rule that clients must peer only with their RRs has two exceptions. First, a client itself can be a route reflector for another cluster. The second exception is when there is a full IBGP mesh among the clients.
IP Multicast Routing
Multicast IP Addresses and IGMP
Class D IP addresses in the range 224.0.0.0–239.255.255.255 are used as multicast addresses.
Multicast MAC addresses on Ethernet are created by concatenating the first 25 bits of the MAC address 0100.5E00.0000 with the last 23 bits of the IP address.
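The mapping can be written as a few bit operations. The sketch below is illustrative: the function name is hypothetical, and the output uses the dotted notation that appears in these notes.

```python
import ipaddress

def multicast_mac(ip: str) -> str:
    """Map an IPv4 multicast address to its Ethernet multicast MAC address."""
    addr = ipaddress.IPv4Address(ip)
    if not addr.is_multicast:
        raise ValueError(f"{ip} is not a Class D (multicast) address")
    low23 = int(addr) & 0x7FFFFF          # last 23 bits of the IP address
    mac = 0x01005E000000 | low23          # fixed 25-bit prefix 0100.5E, then a 0 bit
    digits = f"{mac:012X}"
    return ".".join(digits[i:i + 4] for i in range(0, 12, 4))

print(multicast_mac("224.0.0.5"))
print(multicast_mac("239.255.0.1"))
print(multicast_mac("224.128.0.5"))   # differs from 224.0.0.5 only in a discarded bit
```

Because only 23 of the 28 variable bits of a Class D address survive the mapping, 32 different group addresses share each MAC address, as the first and third calls illustrate.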
IGMP messages are limited to the local data link.
Hosts running IGMPv2 use three types of messages: Membership Report, Version 1 Membership Report, and Leave Group.
The local router polls the subnet with two types of queries: General Query and Group-Specific Query.
show ip igmp groups
The primary addition to IGMPv3 is the inclusion of a Group-and-Source-Specific Query.
PIM-SM
PIM-SM supports both shared and source-based trees.
PIM-SM uses seven PIMv2 messages: Hello, Bootstrap, Candidate-RP-Advertisement, Join/Prune, Assert, Register, Register-Stop.
The Bootstrap Protocol
The bootstrap protocol is used to designate and advertise the Rendezvous Point.
PIM-SM and Shared Trees
show ip pim rp mapping
show ip mroute