BGP Best Practices

Cisco Systems Advanced Services

Version 3.0
Cisco
Corporate Headquarters
170 West Tasman Drive
San Jose, CA 95134-1706
USA
http://www.cisco.com
Tel: 408 526-4000
800 553-NETS (6387)
Fax: 408 526-4100
Legal Notice
THE SPECIFICATIONS AND INFORMATION REGARDING THE PRODUCTS IN THIS
DOCUMENT ARE SUBJECT TO CHANGE WITHOUT NOTICE. ALL STATEMENTS,
INFORMATION, AND RECOMMENDATIONS IN THIS MANUAL ARE BELIEVED TO BE
ACCURATE BUT ARE PRESENTED WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED. USERS MUST TAKE FULL RESPONSIBILITY FOR THEIR APPLICATION OF ANY
PRODUCTS.
THE SOFTWARE LICENSE AND LIMITED WARRANTY FOR THE ACCOMPANYING PRODUCT
ARE SET FORTH IN THE INFORMATION PACKET THAT SHIPPED WITH THE PRODUCT AND
ARE INCORPORATED HEREIN BY THIS REFERENCE. IF YOU ARE UNABLE TO LOCATE THE
SOFTWARE LICENSE OR LIMITED WARRANTY, CONTACT YOUR CISCO REPRESENTATIVE
FOR A COPY.
The Cisco implementation of TCP header compression is an adaptation of a program developed by the
University of California, Berkeley (UCB) as part of UCB’s public domain version of the UNIX operating
system. All rights reserved. Copyright © 1981, Regents of the University of California.
NOTWITHSTANDING ANY OTHER WARRANTY HEREIN, ALL DOCUMENT FILES AND
SOFTWARE OF THESE SUPPLIERS ARE PROVIDED “AS IS” WITH ALL FAULTS. CISCO AND
THE ABOVE-NAMED SUPPLIERS DISCLAIM ALL WARRANTIES, EXPRESSED OR IMPLIED,
INCLUDING, WITHOUT LIMITATION, THOSE OF MERCHANTABILITY, FITNESS FOR A
PARTICULAR PURPOSE AND NONINFRINGEMENT OR ARISING FROM A COURSE OF
DEALING, USAGE, OR TRADE PRACTICE.
IN NO EVENT SHALL CISCO OR ITS SUPPLIERS BE LIABLE FOR ANY INDIRECT, SPECIAL,
CONSEQUENTIAL, OR INCIDENTAL DAMAGES, INCLUDING, WITHOUT LIMITATION, LOST
PROFITS OR LOSS OR DAMAGE TO DATA ARISING OUT OF THE USE OR INABILITY TO USE
THIS MANUAL, EVEN IF CISCO OR ITS SUPPLIERS HAVE BEEN ADVISED OF THE
POSSIBILITY OF SUCH DAMAGES.
CCDE, CCENT, CCSI, Cisco Eos, Cisco HealthPresence, the Cisco logo, Cisco Lumin, Cisco Nexus,
Cisco Nurse Connect, Cisco Stackpower, Cisco StadiumVision, Cisco TelePresence, Cisco WebEx, DCE,
and Welcome to the Human Network are trademarks; Changing the Way We Work, Live, Play, and Learn
and Cisco Store are service marks; and Access Registrar, Aironet, AsyncOS, Bringing the Meeting To
You, Catalyst, CCDA, CCDP, CCIE, CCIP, CCNA, CCNP, CCSP, CCVP, Cisco, the Cisco Certified
Internetwork Expert logo, Cisco IOS, Cisco Press, Cisco Systems, Cisco Systems Capital, the Cisco
Systems logo, Cisco Unity, Collaboration Without Limitation, EtherFast, EtherSwitch, Event Center, Fast
Step, Follow Me Browsing, FormShare, GigaDrive, HomeLink, Internet Quotient, IOS, iPhone, iQuick
Study, IronPort, the IronPort logo, LightStream, Linksys, MediaTone, MeetingPlace, MeetingPlace Chime
Sound, MGX, Networkers, Networking Academy, Network Registrar, PCNow, PIX, PowerPanels,
ProConnect, ScriptShare, SenderBase, SMARTnet, Spectrum Expert, StackWise, The Fastest Way to
Increase Your Internet Quotient, TransPath, WebEx, and the WebEx logo are registered trademarks of
Cisco Systems, Inc. and/or its affiliates in the United States and certain other countries.
All other trademarks mentioned in this document or website are the property of their respective owners.
The use of the word partner does not imply a partnership relationship between Cisco and any other
company.
Copyright © 2009 Cisco Systems, Inc. All rights reserved.

Contents
Tables
Figures
About This Design Document
Document Purpose
Intended Audience
Scope
Dynamic Update Peer-Groups and Peer-Templates
Background
Benefits
Guidelines
Risks and Limitations
Resource Allocation and Convergence
Background
Benefit
Guidelines
Details
Improving BGP Convergence
Soft Reconfiguration and Route Refresh
Background
Benefits
Guidelines
Details
BGP Support for Local-AS
Background
Guidelines
Details
BGP Support for Dual-AS Configuration
Background
Guidelines
Details
BGP Infrastructure Security
BGP TTL Security Hack (BTSH)
Background
Benefits
Guideline
BGP Authentication
Background
Benefits
Guideline
BGP Maximum-Prefix
Background
Benefits
Guideline
Route Flap Dampening
Background
Guidelines
Details
Route Reflectors
Background
Benefits
Guidelines
Confederations
Background
Guidelines
Details
Route Propagation in Confederation
BGP-IGP Redistribution Policies
Background
Benefits
Guidelines
BGP Next-hop
Background
Guidelines
Community Attribute
Background
Benefits
Guidelines
Details
Extended BGP Communities
Synchronization Rules
Background
Benefits
Guidelines
Multi-homing
Background
Benefits
Multi-homing Scenarios
Stub Network Single-homed
Stub Network Multi-homed: Single Border Router
Stub Network Multi-homed: Multiple Border Routers
Standard Multi-Homed Network: Single border router / Multiple ISPs
Standard Multi-Homed Network: Multiple border routers
Multi-homing Scenario Examples
Scenario 1: Enterprise stub network, single homed
Scenario 2: Stub network multi-homed to single SP with single border router
Scenario 3: Stub network multi-homed to single SP with multiple border routers
Scenario 4: Standard Network Multi-homed to Multiple ISPs with Multiple Border Routers
Load Balancing
Background
Guidelines
Inbound Load Balancing
Outbound Load Balancing
Scenario 1
Scenario 2
Scenario 3
Scenario 4 (iBGP Multipath)
Scenario 5 (Link Bandwidth)
Details
Using Advertise Map for Inbound Load Balancing
Using EBGP Multipath for Outbound Load Balancing
Using BGP Link Bandwidth for Outbound Load Balancing
Route Aggregation
Background
Benefits
Guidelines
Details
Configuration Example Aggregation using the network command
Other Miscellaneous Areas
Prefix Lists
Background
Conditional Route Injection
Background
Benefits
Guidelines
BGP Deterministic-med
Background
Benefits
Guidelines
BGP Router Identifier
Background
Benefits
Guidelines
BGP Log Neighbor Changes
Background
Benefits
Guidelines

Tables
Table 1 BGP Community Design

Figures
Figure 1 BGP Support for Dual-AS example
Figure 2 Single-homed Enterprise Stub Network with Border Router
Figure 3 Stub Network Multi-homed to Single SP and Border Router
Figure 4 Stub Network Multi-homed to Single SP with Multiple Border Routers
Figure 6 Use of advertise-map to influence inbound traffic
Figure 7 Using EBGP Multipath for Outbound Load Balancing
Figure 8 Using BGP Link Bandwidth for Outbound Load Balancing
Figure 9 BGP Deterministic-med Example

About This Design Document
Routing protocol design and implementation is a mature field, and a good deal of documentation is
available within Cisco and on external sites. Even so, the experience of the Routing VT has been that
the best practices Cisco and Advanced Services recommend for routing protocol implementation are not
being captured.
There is also no central place to store and reuse the design reviews and case studies that apply best
practices to specific situations.
Furthermore, several initiatives are underway to automate exception reports based on violations of
best-practice rules.
To address these issues, the Routing VT began capturing these best practices and providing a way to
update and extend them continuously. This document captures best practices for implementing BGP.
The reader is expected to be familiar with basic BGP routing and experienced in configuring Cisco
routers. No attempt is made to explain fundamental functionality, although some examples are
provided to help in understanding the recommendations.
Throughout the document, recommendations are highlighted in yellow to separate them clearly from
explanations.
Document Purpose
• To provide comprehensive industry best practices for BGP used on Cisco Routers.
• To provide a medium to track and continuously update the technology best practices, so that the
collective technical experience of the Advanced Services Routing VT is recorded, reviewed, and updated.
In other words, to provide a platform for converting Advanced Services expertise into Cisco
intellectual property.
Intended Audience
• This document is for NCEs in Advanced Services, other Cisco Technical Teams, Cisco Partners, and
Customers.
• This document is also for the Advanced Services Tools Team for automation of Best Practices
Compliance.
Scope
This document includes BGP Best Practices for Design, Implementation, Optimization, Planning,
Migration, Case Studies, New Features, and Caveats.
This document focuses on the industry best practices and other related information for BGP
implementation. The following areas are covered:
1. Dynamic Update Peer-Groups and Peer-Templates
2. Resource Allocation and Convergence
3. Soft Reconfiguration and Route Refresh
4. BGP Support for Local-AS
5. BGP Support for Dual-AS Configuration

6. BGP Infrastructure Security
7. Route Flap Dampening
8. Route Reflectors
9. Confederations
10. BGP-IGP Redistribution Policies
11. BGP Next-hop
12. Community Attribute
13. Synchronization Rules
14. Multi-homing
15. Load Balancing
16. Route Aggregation
17. Other Miscellaneous Topics
Information regarding other related technologies that use BGP, such as MPLS, MBGP, and MP-BGP, is
beyond the scope of this document.
Where side symbols are included in this document, they have the following meanings:
This symbol means warning. The user may be in a situation that could cause bodily injury. Before
working on the equipment, the user should be aware of the hazards involved with electrical circuitry and
be familiar with standard practices for preventing accidents.
This symbol means caution. In this situation, the user might do something that could result in equipment
damage or loss of data.
This symbol means timesaver. The user might save time by performing or being aware of the
action/information described in the paragraph.
This symbol means note. The user must add information, written or typed, to the document during the
implementation work, or must take note of the information presented.
This symbol means tip. The text that accompanies this symbol provides the user with a useful tip.

Dynamic Update Peer-Groups and Peer-Templates
Background
The BGP Dynamic Update Peer-Groups feature introduces an algorithm that dynamically calculates
and optimizes update-groups of neighbors that share the same outbound policies and can therefore share
the same update messages. Based on the configuration of the neighbors, the implementation automatically
determines which neighbors can share updates and places them in the same update-group, so update
replication no longer depends on peer-group configuration. The feature requires no manual
configuration.
To address the limitations of peer groups, the BGP Configuration Using Peer Templates feature was
introduced along with the BGP Dynamic Update Peer-Groups feature. It introduces a new mechanism
called the peer template: a configuration pattern that can be applied to neighbors that share common
policies.
Benefits
The BGP Dynamic Update Peer-Group feature requires no configuration and operates automatically. In
versions of Cisco IOS Software prior to 12.0(24)S, BGP update messages were grouped based on
peer-group configuration. This method of grouping updates limited outbound policies and
session-specific configuration. The Dynamic Update Peer-Group feature separates update-group
replication from peer-group configuration, which improves convergence time and the flexibility of
neighbor configuration.
The following are the features of Dynamic Update Peer-Groups:
• Dynamically calculates BGP update-group membership based on outbound routing policies.
• This feature does not require any configuration by the network operator.
• Optimal BGP update message generation occurs automatically and independently.
• BGP neighbor configuration is no longer restricted by outbound routing policies, and update-groups
can belong to different address families.
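On a router that supports the feature, update-group membership can be inspected from exec mode. The
commands below are a minimal sketch; the neighbor address 192.0.2.1 is a hypothetical example value,
and the output format varies by release:
Router# show ip bgp update-group
Router# show ip bgp update-group summary
Router# show ip bgp update-group 192.0.2.1
Neighbors listed under the same update-group index share an outbound policy and receive replicated
update messages.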
In IOS versions prior to 12.0(24)S, peer-groups are used for update grouping as well as
configuration grouping, which imposes some limitations:
- All neighbors that share the same peer-group configuration must also share the same outbound
routing policies.
- All neighbors must belong to the same peer group and address family.
Peer-templates address these limitations. The BGP Configuration using Peer Templates feature improves
the flexibility of BGP neighbor configuration through the introduction of peer-policy and peer-session
configuration templates.
Peer Session Template: Peer session templates are used to group and apply the configuration of general
session commands that are common to all address family and Network Layer Reachability Information
(NLRI) configuration modes.

Peer Policy Template: Peer policy templates are used to group and apply the configuration of commands
that are applied within specific address-families and NLRI configuration modes.
Inheritance is a key component of peer template operation. It expands the scalability and flexibility
of neighbor configuration by allowing peer template configurations to be chained together. The result
can be a simple configuration that inherits common configuration statements, or a complex
configuration that applies very specific statements along with common inherited ones.
• Peer templates are reusable and support inheritance, which allows the network operator to group and
apply distinct neighbor configurations for BGP neighbors that share common policies.
• Allows the network operator to define very complex configuration patterns through the capability of a
peer template to inherit a configuration from another peer template.
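The following sketch illustrates how session and policy templates might be combined through
inheritance. The AS number, the template names (IBGP-SESSION, BASE-POLICY, EDGE-POLICY), the route-map
name, and the neighbor address are hypothetical values chosen for illustration; they are not taken
from the recommendations in this document:
router bgp 65000
 ! Session template: general session commands common to all address families
 template peer-session IBGP-SESSION
  update-source Loopback0
 exit-peer-session
 ! Base policy template: address-family commands shared by many neighbors
 template peer-policy BASE-POLICY
  send-community
  next-hop-self
 exit-peer-policy
 ! More specific policy template that inherits BASE-POLICY (sequence 10)
 template peer-policy EDGE-POLICY
  route-map EDGE-IN in
  inherit peer-policy BASE-POLICY 10
 exit-peer-policy
 neighbor 192.0.2.1 remote-as 65000
 neighbor 192.0.2.1 inherit peer-session IBGP-SESSION
 address-family ipv4 unicast
  neighbor 192.0.2.1 activate
  neighbor 192.0.2.1 inherit peer-policy EDGE-POLICY
The templates defined on the router can be reviewed with show ip bgp template peer-session and
show ip bgp template peer-policy.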
Guidelines
The BGP Dynamic Update Peer-Group feature was first introduced in 12.0(24)S and integrated into
12.2(18)S and 12.3(4)T. The Peer-Template feature was first introduced in 12.0(24)S and integrated into
12.2(18)S, 12.3(4)T, and 12.2(27)SBC.
• The BGP Dynamic Update Peer-Group feature requires no configuration and occurs automatically.
• If the IOS version supports the BGP Dynamic Update Peer-Groups feature, BGP by default
dynamically calculates and optimizes update-groups of neighbors that share the same outbound
policies and can share the same update messages, which improves the performance of BGP update
message generation. On such code, configuring peer-groups brings no performance benefit; it only
reduces the amount of configuration.
• It is recommended to use peer templates, as they improve the flexibility and capability of
neighbor configuration. Peer templates also provide an alternative to peer-group configuration and
overcome some of its limitations.
• The inheritance capability of peer templates eliminates the need to repeat configuration statements
that are commonly reapplied to groups of neighbors: common statements are applied once and then
indirectly inherited by the peer templates applied to neighbor groups with common configurations.
Risks and Limitations

The following restrictions apply to the BGP Configuration Using Peer Templates feature (peer session
templates and peer policy templates):
• A peer session template can directly inherit only one session template, and each inherited session
template can in turn contain one indirectly inherited session template. A neighbor or neighbor
group can therefore be configured with only one directly applied peer session template and seven
additional indirectly inherited peer session templates.
• A peer policy template can directly or indirectly inherit up to eight peer policy templates.
• A BGP neighbor cannot be configured to work with both peer groups and peer templates. A
neighbor can either belong to a peer group or inherit policies from peer templates, but not both.
• For small networks with simple BGP update policies, peer-template configuration can seem
extensive. However, the inheritance of peer-policy and peer-session templates keeps the
configuration flexible as the network grows.
Peer-Groups
If you are running an IOS version that does not support BGP Dynamic Update Peer-Groups, use
peer-groups: update messages are grouped based on peer-group configuration, and peer-groups also
reduce the amount of configuration.
With peer-group configuration, multiple peer-groups are required for each remote AS. With
peer-templates, multiple policies are available for each remote AS, so there is no need to configure an
entirely new peer-group for each different AS. Peer-groups also lack the per-neighbor flexibility
that peer-templates and inheritance provide.
If the IOS version supports Dynamic Update Peer-Groups with Peer-Templates, it is recommended to
use that feature, as it provides greater flexibility, neighbor grouping, and ease of configuration.
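For releases without Dynamic Update Peer-Groups, a peer-group configuration might look like the
following sketch. The peer-group name IBGP-PEERS, the AS number, and the neighbor addresses are
hypothetical example values:
router bgp 65000
 ! Define the peer-group and the settings shared by all members
 neighbor IBGP-PEERS peer-group
 neighbor IBGP-PEERS remote-as 65000
 neighbor IBGP-PEERS update-source Loopback0
 ! Assign individual neighbors to the peer-group
 neighbor 192.0.2.1 peer-group IBGP-PEERS
 neighbor 192.0.2.2 peer-group IBGP-PEERS
All members of the peer-group share the same outbound policy, which is what allows an update to be
generated once and replicated to every member.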
For detailed information on Dynamic Update Peer-Groups and on Peer-Templates, refer to the
corresponding Cisco IOS feature documentation.

Resource Allocation and Convergence
Background
BGP is primarily used to provide control plane connectivity between multiple administrative domains,
called autonomous systems (AS), with varying routing policies. The best-known instance of inter-domain
connectivity provided by BGP is the global Internet. BGP chooses the best loop-free path through the
internetwork, with “best” defined by the local policy of each AS. In the absence of overriding policy,
BGP chooses the shortest path through the internetwork as defined by the AS path.
As individual networks grow larger, maintaining an ever-increasing number of TCP sessions can pose
scalability challenges to service providers and enterprise networks. Router processing and memory
demands increase as well. Because of its importance in the Internet, BGP is a major focus for
scaling and convergence work.
As the size of the Internet routing table and the number of peers grow, service providers and large
enterprise customers are seeing BGP convergence times increase.
Several BGP enhancements have been made to improve convergence and basic scaling properties. Besides
upgrading hardware resources (CPU, memory), customers can take several steps to further reduce
convergence time. These include using peer groups, TCP Path MTU Discovery (PMTUD), large input
queues, and tuning session and update timers.
These features within Cisco IOS software help network operators scale their IP-BGP networks in
ways that mitigate the management, processing, and memory burdens of expanding networks.
Benefit
• Faster convergence: By assessing the requirement of appropriate resources and tuning other
parameters, BGP administrators can achieve faster convergence within their networks.
• Increased scalability: BGP enhancements and configuration recommendations allow us to handle
even greater numbers of routing table entries within the same convergence time frame.
Guidelines
BGP updates consume CPU cycles, and the peers and routes that a router holds consume memory.
Select hardware appropriate to the number of peers and routes. In general, the following components
affect the number of BGP routes and peers a router can support:
• Router CPU
• Router memory
• IOS version
The more memory a router has, the more routes it can support, just as a router with a faster CPU
can support a larger number of peers.
Because BGP updates rely on TCP, optimizing router resources such as memory, and TCP session
parameters such as maximum segment size (MSS), path MTU discovery, interface input queues, and TCP
window size, helps improve convergence.

It is recommended to leave the BGP timers (advertisement-interval, bgp scan-time, bgp scan-time
import, keepalive, and hold time) at their default values. Recent features such as Next-Hop Address
Tracking and Fast Peering Session Deactivation have improved convergence without tuning the BGP
timers. If the default values do not meet the network convergence requirements, tune the timers with
caution.
However, in networks with many peering sessions and large routing tables, it may become necessary to
observe BGP behavior carefully and adjust the following parameters:
• Monitor the interface input queues and increase their depth if drops are seen.
• Increase SPD headroom to prevent BGP sessions from being dropped.
• Enable Path MTU Discovery.
• Set the TCP MSS in addition to enabling Path MTU Discovery for multi-hop BGP sessions.
• Tune the TCP receive window size on routers with a large number of routes and on routers in Long
Fat Networks (networks with large bandwidth and high delay).
Details
Improving BGP Convergence

Longer convergence times result from the increased size of the Internet table and an increase in the
number of peers supported by a single BGP speaker. Detailed below are some areas that can be tuned to
improve BGP convergence.
Memory
The amount of memory required to store BGP routes depends on many factors, such as the router, the
number of alternate paths available, route dampening, communities, the number of maximum paths
configured, BGP attributes, and VPN configurations. Knowledge of these parameters is required to
calculate the amount of memory needed to store a given number of BGP routes. Extra memory is
needed during the initial startup of the BGP peers, because the routing information churn that takes
place consumes additional memory. With sufficient available memory, BGP can use the Update Packing
Optimization feature, which improves convergence. It is also important to understand ways to
reduce memory consumption and achieve optimal routing without receiving the complete Internet
routing table. Route summarization, for example, cuts memory consumption.
For detailed information on achieving optimal routing and memory consumption, refer to the
corresponding Cisco documentation.
Path MTU discovery

Every TCP session has a limit on how much data it can transport in a single packet. This limit is
the Maximum Segment Size (MSS), which is 536 bytes by default. TCP therefore takes all of the data in
a transmit queue and breaks it into 536-byte chunks before passing packets down to the IP layer. An
MSS of 536 bytes ensures that a packet will not be fragmented before it reaches its destination,
because most links have an MTU of at least 1500 bytes.
A small MSS creates a large amount of TCP/IP overhead, especially when TCP has a lot of data to
transport, as it does with BGP. The solution is to determine dynamically how large the MSS can be
without creating packets that need to be fragmented. This is accomplished by enabling ip tcp
path-mtu-discovery (also known as PMTUD). PMTUD allows TCP to determine the smallest MTU among all
links between the two ends of a TCP session. TCP then uses this MTU value, minus room for the IP and
TCP headers, as the MSS for the session. If a TCP session traverses only Ethernet segments, the MSS
will be 1460 bytes. If it traverses only POS segments, the MSS will be 4430 bytes. The increase in
MSS from 536 to 1460 or 4430 bytes reduces TCP/IP overhead, which helps BGP converge faster.
Note that PMTUD derives the smallest MTU on the best (primary) path between the endpoints and sets
the MSS for the TCP session accordingly. If the primary path fails, the secondary path may have a
smaller MTU, and packets will be fragmented until the smallest MTU on the new path is discovered. If
this is not acceptable, it is good practice to set the TCP Maximum Segment Size to match the lowest
MTU in the network (that MTU minus room for the IP and TCP headers) using “ip tcp mss <value>”.
In some environments the firewalls or other devices may be configured (or misconfigured) to block the
ICMP messages used for path MTU discovery. In such cases, PMTU Discovery will not work.
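The two commands discussed in this section might be applied as in the following sketch. The MSS value
of 1460 assumes that the lowest MTU anywhere in the network is 1500 bytes, and the neighbor address
192.0.2.1 is a hypothetical example value:
! Enable Path MTU Discovery for TCP sessions originated by this router
ip tcp path-mtu-discovery
!
! Optionally cap the MSS for the case where a backup path has a smaller MTU:
! 1500-byte MTU - 20-byte IP header - 20-byte TCP header = 1460 bytes
ip tcp mss 1460
The MSS negotiated for an established BGP session can then be checked in the TCP section of the
neighbor output:
Router# show ip bgp neighbors 192.0.2.1 | include max data segment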
Interface Input Queues

Large numbers of interface input queue drops are a very common problem on routers with many peers
and a substantial number of routes. During transient network events, there can be a large number of
TCP update and ACK packets. This condition can overrun the SPD and interface input queues and result
in lost updates.
The default interface input queue size is 75. Increasing the input queue depth helps reduce queue
drops. The optimal value depends on the number of peers and the processor capabilities. Use the
interface configuration command hold-queue <1-4096> in to increase the queue size.
The SPD headroom can be changed using the global SPD headroom configuration command. The SPD
headroom depth is shared by all interfaces.
For additional information on SPD, please refer to the following URL:
http://www.cisco.com/web/about/security/intelligence/spd.html
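As a sketch of the tuning described above, the input queue and SPD headroom might be raised as
follows. The interface name and both values of 1000 are hypothetical and should be sized for the
platform and peer count; the spd headroom syntax is shown as it is commonly documented and should be
confirmed on the target release:
interface GigabitEthernet0/0
 ! Raise the input hold queue from the default of 75 packets
 hold-queue 1000 in
!
! Global SPD headroom, shared by all interfaces
spd headroom 1000
Input queue drops can be monitored before and after the change:
Router# show interfaces GigabitEthernet0/0 | include Input queue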
TCP Window Size

In high-bandwidth, high-delay networks (Long Fat Networks) and on routers under stress, TCP
acknowledgements are delayed. This delay can lead to TCP retransmissions.
The TCP Window Scaling feature can help minimize TCP retransmissions. Enable it by setting the TCP
window size to more than 65535 bytes on both neighboring peers, using the configuration command
ip tcp window-size <bytes more than 65535>.
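For example, the following line, applied on both peers, requests a window larger than the 65535 bytes
that the 16-bit TCP window field can express without scaling. The value 131072 is a hypothetical
illustration, not a recommended setting:
! Configure on both BGP peers; values above 65535 require window scaling
ip tcp window-size 131072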
Minimum Route Advertisement Interval (MRAI)

MRAI determines the minimum amount of time that must elapse between advertisements and/or
withdrawals of routes to a particular destination by a BGP speaker to a peer. This rate limiting
applies on a per-destination basis, although the value of “MinRouteAdvertisementIntervalTimer” is set
on a per-peer basis.
The original intent of the MRAI timer was to reduce the overhead of the BGP routing protocol. It has
the additional benefits that it can promote stability by batching routing changes and can improve
update packing in some scenarios.

For an iBGP peer the default MRAI is 5 seconds; for an eBGP peer the default is 30 seconds.
For iBGP and PE-CE eBGP peers, it is recommended that the MRAI timer be set to 0. This is achieved
with the following command:
neighbor x.x.x.x advertisement-interval 0
This was the default beginning with 12.0(32)S.
For eBGP peers, an MRAI of 0 could cause the peer to dampen routes, so confirm with peer ASNs whether
they use route dampening before making this modification.
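In context, the command sits under the BGP process alongside the neighbor definition. The AS number
and neighbor address below are hypothetical example values:
router bgp 65000
 neighbor 192.0.2.1 remote-as 65000
 ! iBGP peer: send updates without waiting for the MRAI timer
 neighbor 192.0.2.1 advertisement-interval 0
The interval in effect is reported in the output of show ip bgp neighbors 192.0.2.1 as the minimum
time between advertisement runs.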
As the processing power of routers has increased, research has revealed that the implementation of
MRAI, or rather the disparate implementation of MRAI between vendors and ASN operators, has actually
contributed to network instability.
While the Cisco default for an eBGP peer is 30 seconds, there is no consistency across vendor
implementations. Likewise, some ASN operators alter the default values. These differences mean that
update messages transiting different ASNs on different vendor equipment arrive at the target router
at different times. That router sees each of these messages and considers each one in its best path
selection. This may result in multiple best path announcements to its neighbors as each update
message is received and processed.
One potential result is that a simple update message from one ASN is seen as a multiple route flap
event a few ASN hops away, when in fact there was no instability whatsoever.
In actual measurements, a single prefix withdrawal has produced 41 BGP events a few hops away. The
MRAI timer is not the only potential source of such problems: differences in CPU load and CPU speed
also result in different update times for prefix announcements passing from router to router, and
contribute to the effects described above.
MRAI can also lead to substantial delays in convergence. In a 4-router fully meshed network, the
total MRAI delay for convergence would be 10 seconds for an iBGP mesh and 60 seconds for an eBGP mesh.
Fast Peering Session Deactivation

By default, when the route to a neighbor is lost, BGP waits for the hold timer to expire. The hold
timer expires when a peer does not receive a KEEPALIVE, UPDATE, or NOTIFICATION message within the
period specified in the Hold Time field of the OPEN message.
Fast Peering Session Deactivation uses the Address Tracking Filter to register the route to the peer. A
host route must be available for each peering session that is configured to use BGP fast session
deactivation. If the router loses the route to the peer, the session is deactivated without waiting
for the Hold Time to expire.
The ability to deactivate a peer immediately upon loss of the route to the peer is most helpful in
multi-hop BGP scenarios.
This feature is not recommended for BGP peers with redundant paths to the neighbor, because a
transient convergence event in the IGP would cause unwanted deactivation of the peer. In a large BGP
network, this could cause significant route churn.
This feature can be configured using the BGP configuration command:

neighbor x.x.x.x fall-over
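A multi-hop eBGP session between loopback addresses is the kind of scenario where this is most
useful. The following is a minimal sketch; the AS numbers, the neighbor address, and the hop count
are hypothetical example values:
router bgp 65000
 neighbor 192.0.2.1 remote-as 65001
 neighbor 192.0.2.1 ebgp-multihop 2
 neighbor 192.0.2.1 update-source Loopback0
 ! Tear the session down as soon as the host route to 192.0.2.1 is lost
 neighbor 192.0.2.1 fall-over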

BGP Support for Next Hop Address Tracking (NHT)
Every BGP route has a next-hop address. Typically, next-hop addresses are learned via an IGP. Events
such as internal network outages can cause some next-hop addresses to become unreachable. BGP routes
with unreachable next hops are nonfunctional but remain in place until the next BGP scanner run.
The BGP scanner runs every 60 seconds and invalidates BGP paths that have an unreachable next hop. It
can therefore take about 60 seconds for alternate BGP paths to be installed in the routing table,
extending the convergence time to 60 seconds.
The Next-Hop Address Tracking (NHT) feature monitors the reachability of the next-hop addresses of
BGP routes. When a next-hop address becomes unreachable (is deleted from the RIB), NHT schedules a BGP
next-hop scan after a configurable trigger delay (default 5 seconds). NHT thus invalidates the BGP
route without waiting up to 60 seconds for the BGP scanner, bringing convergence time down from the
60-second range to the 5-second range.
BGP Support for Next-Hop Address Tracking is enabled by default when a supporting Cisco IOS software
image is installed.
Configuration Example
In the following example, next-hop address tracking is disabled under the IPv4 address family:
router bgp 65000
 address-family ipv4 unicast
  no bgp nexthop trigger enable
In the following example, the trigger delay for next-hop tracking is set to 20 seconds under the
IPv4 address family:
router bgp 65000
 address-family ipv4 unicast
  bgp nexthop trigger delay 20
The default trigger delay is 5 seconds. NHT configuration is supported for the IPv4 unicast, IPv4
multicast, IPv4 tunnel, and VPNv4 address families, but not for IPv6.
BGP Selective Address Tracking

Resolving a BGP next hop through an undesirable route, such as an aggregate address, a BGP prefix, or a
default route, can lead to an invalid condition in the BGP RIB. This can cause next hops to oscillate,
which affects BGP convergence.
The purpose of selective next-hop route filtering is to avoid using such routes to reach the next hop. This
task uses prefix lists and route maps to match IP addresses or source protocols and to restrict the prefixes
that can be considered as next-hop routes.
Only the match ip address and match source-protocol commands are supported in the route map. No set
commands or other match commands are supported.
This feature is available only in more recent Cisco IOS releases.
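The filtering described above can be sketched as follows. This is a minimal illustration rather than a recommended policy: the names NHT-MIN-LEN and NHT-FILTER, the AS number 65000, and the /25 cut-off are assumptions chosen for the example. The route map rejects any route with a prefix length of 25 bits or shorter (including a default route or a short aggregate) as a candidate for resolving a BGP next hop.

```
ip prefix-list NHT-MIN-LEN seq 5 permit 0.0.0.0/0 le 25
!
! Sequence 10 denies short prefixes as next-hop routes;
! sequence 20 permits everything else.
route-map NHT-FILTER deny 10
 match ip address prefix-list NHT-MIN-LEN
route-map NHT-FILTER permit 20
!
router bgp 65000
 address-family ipv4 unicast
  bgp nexthop route-map NHT-FILTER
```

With this sketch, a next hop that is reachable only through a route of length /25 or shorter is treated as unreachable, so the cut-off should be chosen to match how next-hop addresses (typically loopbacks) are actually carried in the IGP.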

Soft Reconfiguration and Route Refresh
Background
When BGP policy is changed, the BGP session needs to be reset in order for the new policy to take effect.
Resetting a BGP session causes all prefixes received on that session to be removed and withdrawn from
any neighbors to which they had been advertised. In addition, all prefixes that had been advertised on
the session being reset are removed from the remote peer, potentially causing additional withdrawals to
its other peers.
The BGP Soft Reconfiguration feature and Route Refresh feature are both methods by which the route
flapping effect of resetting a session can be mitigated. The BGP Soft Reconfiguration feature pre-dates the
Route Refresh feature and is more resource intensive.
Benefits
• The benefit of BGP Soft Reconfiguration and Route Refresh features is that both methods initiate
routing policy changes without resetting the BGP session.
• The Route Refresh feature has the advantage of not having additional memory resource requirements
for storing all received prefixes, even those that are not selected by the BGP Best Path process.
• There is no configuration required to enable the Route Refresh feature.
The Route Refresh feature is a direct replacement for the BGP Soft Reconfiguration feature.
Guidelines
Hard reset of a BGP session is disruptive to an operational network. If a BGP session is reset repeatedly
over a short period of time due to multiple changes in BGP policy, it can result in other routers in the
network dampening prefixes, causing destinations to be unreachable and traffic to be black-holed.
• If both peers support the Route Refresh feature, it is recommended that this feature be used in place of
BGP Soft Reconfiguration to minimize memory requirements.
• If a peer does not support the route refresh capability, then the only soft reconfiguration option is to use
the neighbor soft-reconfiguration command, which initiates the storage of inbound routing table
updates and requires additional memory.

• The primary risk with the BGP Soft Reconfiguration feature is the increased memory requirements on a
BGP router that is performing inbound soft reconfiguration. The BGP Soft Reconfiguration feature
does not require both peers for a session to support the feature.
• If the BGP Soft Reconfiguration feature is configured on an already active BGP session, a hard reset
of the session is required once for it to take effect.
• The Route Refresh feature is available from Cisco IOS 12.0(7)T onwards.
• The Route Refresh feature requires both peers for the session to support this feature. This capability is
negotiated at session initialization.

• The BGP Soft Reconfiguration feature and the Route Refresh feature are mutually exclusive. If the soft
reconfiguration feature using stored routing table updates is configured for a neighbor, the Route
Refresh feature is not used.
Details
Outbound soft reconfiguration

Performing outbound soft reconfiguration does not require any special configuration or have any additional
memory requirements. When a session is soft cleared a new update is created. The delta between the old
update previously sent to the peer and the new update once the session was soft cleared is sent out to peers
in the form of advertisements and withdrawals.
Inbound soft reconfiguration

When a BGP router receives updates that are denied by inbound policy, they are not retained. This reduces
the resource requirements for memory, in some cases drastically. When the inbound policy is changed, all
of the BGP updates from the remote peer must be reprocessed. The inbound soft reconfiguration feature
enables a BGP router to retain all inbound prefixes, even if they are denied by the inbound policy. These
prefixes are marked (Received and Not Used) in the BGP RIB. These updates are not used in the BGP best
path selection.
BGP inbound soft reconfiguration must be explicitly enabled on each peer or peer group. The
configuration command to enable this feature is:
router bgp xxxxx
 neighbor <Address or Peer Group> soft-reconfiguration inbound
To soft clear a BGP neighbor, the following exec command is used:
clear ip bgp <Peer Address> soft <in|out>
If the BGP Soft Reconfiguration feature is configured, the Route Refresh feature will not be used.
Route-refresh
The Route Refresh feature allows a BGP speaker to request a routing update from the peer when a session
is soft cleared. This allows the BGP speaker to reprocess the updates from its neighbors through its
inbound policy. To verify that Route Refresh is supported for a particular BGP session, use the following
command, which shows the capabilities supported for the session:
show ip bgp neighbor <Peer Address>
Neighbor capabilities:
Route refresh: advertised and received (old & new)
If you configure inbound soft reconfiguration (for example, to verify which updates have been denied) on
a peer where Route Refresh was negotiated, this peer will send a Route Refresh request to fill its
adjacency RIB.
The Route Refresh capability is described in RFC 2918.

BGP Support for Local-AS
Background
By default, a BGP router belongs to a single autonomous system. The BGP “support for local-AS” feature
allows a router to appear to be a member of another autonomous system in addition to its real
autonomous system. This helps when autonomous systems are merged and the existing peering
configurations need to be kept intact.
Guidelines
• The local-AS feature can be configured only for eBGP peers. It does not work between two peers in
different sub-ASs of a confederation.
• The local-AS value cannot be the local BGP AS number or the AS number of the remote peer.
• Local-AS cannot be customized for individual peers in a peer group.
Details
The local-AS feature is typically used when merging two autonomous systems. A detailed example of the
usage of this feature is given at the following URL:
http://www.cisco.com/en/US/tech/tk365/technologies_configuration_example09186a00800949cd.shtml
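A minimal sketch of the basic case follows, assuming a router whose real AS is 100 and which must keep appearing as AS 200 to an existing eBGP peer in AS 300. All AS numbers and the 192.0.2.x addresses are placeholders chosen for illustration.

```
! Router that has moved to AS 100 but still presents AS 200 to this peer
router bgp 100
 neighbor 192.0.2.2 remote-as 300
 neighbor 192.0.2.2 local-as 200
!
! Peer in AS 300: its existing configuration stays unchanged
router bgp 300
 neighbor 192.0.2.1 remote-as 200
```

Without additional keywords, the peer in AS 300 sees both AS 200 and AS 100 in the AS path of routes it receives from this router.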

BGP Support for Dual-AS Configuration
Background
The BGP Support for Dual AS configuration feature extends the functionality of the “BGP support for the
local-AS” feature by providing additional configuration options for customizing the autonomous system
paths. The configuration of this feature is transparent to customer peering sessions, allowing the provider to
merge two autonomous systems without interrupting customer peering arrangements. Customer peering
sessions can be updated later during a maintenance window or during other scheduled downtime.
This feature comes with the following keywords:

neighbor ip-address local-as [as-number [no-prepend [replace-as [dual-as]]]]
local-as: The second autonomous system number, configured in addition to the real autonomous system.
no-prepend: This keyword causes the BGP process NOT to prepend the “local autonomous system”
number to any prefixes received from its eBGP peer. This keyword is optional.
replace-as: This keyword causes the BGP speaker to replace its real autonomous system with the
configured “local autonomous system” on all prefixes advertised to its eBGP peer. This keyword is
optional.
dual-as: Allows the eBGP neighbor to establish a peering session using either the real autonomous
system number (from the configured BGP routing process) or the autonomous system number configured
with “neighbor ip-address local-as”. This keyword is optional.
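Combining the keywords above, a migration-time configuration might look like the following sketch. The AS numbers (real AS 100, former AS 200, customer AS 300) and the neighbor address are assumptions for illustration only.

```
router bgp 100
 neighbor 192.0.2.2 remote-as 300
 ! Hide the real AS 100 from this customer and let the customer peer
 ! with either AS 200 (old) or AS 100 (new)
 neighbor 192.0.2.2 local-as 200 no-prepend replace-as dual-as
```

With dual-as in place, the customer can change its own remote-as statement from 200 to 100 at a time of its choosing, which is what makes the migration transparent to the customer session.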
Guidelines
• This feature can be configured for eBGP peers only. This feature cannot be configured for iBGP
peers or between different sub-autonomous systems of a confederation.
• The existing local BGP AS number or the AS number of the remote peer cannot be
configured as the local-AS.
• Local-AS can be configured for individual peers or applied through peer groups
and peer templates.
• If this feature is applied to a group of peers (using peer groups or peer templates), the peers
cannot be individually customized.
• BGP prepends the autonomous system number from each BGP network that a route traverses to
maintain network reachability information and to prevent routing loops. Since this feature has the
potential to modify or delete the AS_PATH information, as a general recommendation this feature
may be used as a temporary measure only during migration.
• If there is a requirement to use this feature for other purposes such as the one given in the
following example, care must be taken to make sure there are no loops in the AS PATH.

Details
The following CCO document explains the BGP dual AS migration in detail.
http://www.cisco.com/en/US/docs/ios/12_3t/12_3t11/feature/guide/gtbgpdas.html
In addition to dual-AS migration, this feature is also used by customers in situations where the same
AS number is present in two different network domains. This happens when large customers merge
networks that use the same autonomous system number, or when customers get private L3 VPN offerings
from providers who also use the same private autonomous system number, such as 65000.
In the following example, two customers (XYZ and ABC) that use the same private AS number in their
domains are merging their networks. The objective is full connectivity between the two customer
networks. Under normal circumstances, full connectivity between the two domains is not possible with
BGP because AS 64512 is present in both domains: a BGP speaker in AS 64512 on one customer side drops
the prefixes received from the other customer’s AS 64512 because it sees its own AS number in the
AS_PATH. To achieve full connectivity between the customers without changing AS numbers across a
whole domain, one option is the following:
Remove all private AS numbers when exchanging prefixes between the two customers at the border routers
of AS 1 and AS 3.
Note: Remove-private-as will only remove a private AS number if it is at the end of the AS path, and not
in the middle. For instance, if you have the AS path [1234,64512,4321], remove-private-as will not remove
64512 from the AS path.
The following steps are required to achieve this.
At Customer ABC domain:
1. On the border router in AS 1234, use the local-as command to replace AS 1234 with a private AS
such as 65000 when peering with the upstream AS 64521.
2. The AS 3 border router now receives prefixes from the ABC domain with only private AS numbers in the
AS_PATH. When the AS 3 border router sends these prefixes to the border router of AS 1 in
Customer XYZ, it can remove the private AS numbers using the “remove-private-as” command.
At Customer XYZ domain:
3. Since the XYZ domain also contains the overlapping private AS 64512, this AS number should be removed
using the “remove-private-as” command at the border router in AS 4321 peering with AS 1.
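Figure 1 BGP Support for Dual-AS example
[Figure: Customer XYZ (AS 64512, AS 4321, AS 1) and Customer ABC (AS 3, AS 64521, AS 1234, AS 64512),
interconnected between AS 1 and AS 3. Labeled loopbacks: 1.1.1.1 (AS 1), 3.3.3.3 (AS 3), 4.4.4.4
(AS 64521), 5.5.5.5 (AS 1234), 7.7.7.7 (AS 64512 in ABC).]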

Step 1:
On the BGP speakers peering between AS 1234 and AS 64521 in ABC, the following configurations would
be used:
On AS-1234 peering with AS-64521
router bgp 1234
 neighbor 10.1.1.13 remote-as 64521
 neighbor 10.1.1.13 local-as 65000 no-prepend replace-as
The keywords and values:
local-as 65000: AS 65000 is the ‘local’ AS number configured in addition to the real AS number 1234.
no-prepend: This keyword causes the BGP process NOT to prepend the “local autonomous system”
number 65000 to any prefixes received from its peer in AS 64521.
replace-as: This keyword causes the BGP speaker to replace the real autonomous system AS 1234 with
the ‘local’ autonomous system AS 65000 in BGP updates advertised to its neighbor.
On AS-64521 peering with AS-1234
router bgp 64521
 neighbor 10.1.1.14 remote-as 65000
AS-64521#show ip bgp
BGP table version is 6, local router ID is 4.4.4.4
   Network          Next Hop            Metric LocPrf Weight Path
*> 5.5.5.5/32       10.1.1.14                0             0 65000 i
*> 7.7.7.7/32       10.1.1.14                              0 65000 64512 i
With these options configured, we see the route to 7.7.7.7/32 coming from AS 64512 has a local-AS of
65000 and an originating AS of 64512.
AS-64512#show ip bgp
*> 1.1.1.1/32       10.1.1.26                              0 1234 64521 2 1 i
*> 3.3.3.3/32       10.1.1.26                              0 1234 64521 2 i
*> 4.4.4.4/32       10.1.1.26                              0 1234 64521 i
*> 5.5.5.5/32       10.1.1.26                0             0 1234 i
Step 2:
When the BGP speaker at the edge of AS 3 receives the routes from AS 64521, all the autonomous systems
in the AS path are private AS numbers.
In the following output, all prefixes received from AS 64521 show only private AS numbers in their paths.
AS-3#show ip bgp
*> 1.1.1.1/32       10.1.1.1                 0             0 1 i
*> 3.3.3.3/32       0.0.0.0                  0         32768 i
*> 4.4.4.4/32       10.1.2.2                 0             0 64521 i
*> 5.5.5.5/32       10.1.2.2                               0 64521 65000 i
*> 7.7.7.7/32       10.1.2.2                               0 64521 65000 64512 i
Since all the AS numbers in the AS path of the prefixes received from AS 64521 are private, the
“remove-private-as” command can strip all of them from the path when sending updates to
AS 1.
router bgp 3
 neighbor 10.1.1.1 remove-private-as
The following output shows that all the prefixes received from AS 3 have only autonomous system 3 in
the AS_PATH, because the private AS numbers are stripped at AS 3.
AS-1#show ip bgp
*> 1.1.1.1/32       0.0.0.0                  0         32768 i
*> 3.3.3.3/32       10.1.1.2                 0             0 3 i
*> 4.4.4.4/32       10.1.1.2                               0 3 i
*> 5.5.5.5/32       10.1.1.2                               0 3 i
*> 7.7.7.7/32       10.1.1.2                               0 3 i
Step 3:
Now that the private AS numbers are removed from the prefixes of Customer ABC’s domain, the
overlapping private AS 64512 must also be removed from the prefixes originating at Customer XYZ. This
can be done at the border router of either AS 4321 or AS 1. Here it is done at AS 4321 because it is
closer to the originating private AS.
On AS4321 peering with AS1:
router bgp 4321
 neighbor 10.2.1.1 remove-private-as

BGP Infrastructure Security
BGP TTL Security Hack (BTSH)
Background
BGP, by default, sends packets to external neighbors with a TTL of 1 and accepts packets from external
neighbors with a TTL of 0 or higher. Because BGP will accept packets with a TTL of 0 or higher, this
makes it possible for an attacker to send packets to a BGP router from many hops away, as long as the TTL
is still greater than 0 when it arrives. Since the TCP tuple is easy to discover and an attack doesn’t need the
TCP sequence number, this presents a relatively easy attack vector.
The Generalized TTL Security Mechanism (GTSM, described in RFC 3682) protects BGP routers from
attacks sourced from devices which are not directly attached to the same segment as the BGP speaker.
When configured to use GTSM, BGP originates packets with a TTL of 255, and only accepts packets with
a TTL of 254 or higher. This TTL value is evaluated after the receiving router has decremented the TTL.
Benefits
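Because a remote sender cannot prevent intermediate routers from decrementing the TTL, a packet that
arrives with a TTL of 254 or higher cannot have been forged from several hops away. Deploying GTSM
(BTSH) therefore protects directly connected eBGP peers from various TCP-based attacks.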
Guideline
BTSH should be implemented on all eBGP peering sessions, with consideration for the fact that multihop
scenarios reduce its effectiveness, depending on network diameter. As most eBGP peerings are one hop
away, the use of BTSH can be quite beneficial.
This feature is disabled by default. If enabled, it should be configured on a per-neighbor basis.
It is important to consider the impact of GTSM on processor utilization. In many cases, deploying GTSM
can itself open the router to TTL-based attacks, since every packet received must be punted to the
process level for TTL checking.

While this provides robust protection in directly connected peer scenarios, the usefulness declines in
multihop eBGP scenarios, as these require BTSH to be configured to accept a TTL lower than 254, exposing
the router to CPU-utilization attacks.
This feature is not supported for iBGP peers or iBGP peer groups. For detailed information refer to
http://www.cisco.com/en/US/docs/ios/12_3t/12_3t7/feature/guide/gt_btsh.html
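A minimal configuration sketch for a directly connected eBGP peer follows. The AS numbers and the neighbor address 192.0.2.2 are placeholders, not values from this document. With a hop count of 1, BGP packets from this neighbor are accepted only if they arrive with a TTL of 254 or higher.

```
router bgp 65000
 neighbor 192.0.2.2 remote-as 65001
 ! Accept packets from this peer only if TTL >= 255 - 1 = 254
 neighbor 192.0.2.2 ttl-security hops 1
```

The check needs to be enabled on both ends of the session, since a peer that still sends packets with a TTL of 1 would fail it. In Cisco IOS, neighbor ttl-security and neighbor ebgp-multihop cannot be configured together for the same neighbor; for a peer two hops away, a hop count of 2 lowers the accepted TTL to 253.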

BGP Authentication
Background
BGP uses TCP as its transport mechanism, which makes it susceptible to certain attacks using spoofed TCP
segments, including session resets caused by injected bogus TCP resets.
BGP Authentication uses an implementation of RFC 2385, which provides a TCP option for carrying an MD5
digest in a TCP segment. This digest functions as a signature for the segment, as it uses information
available only to the two endpoints.
Benefits
The implementation of an MD5 digest to authenticate each TCP segment significantly increases the
difficulty in successfully implementing a spoof attack against a BGP session.
Guideline
This feature is disabled by default.
BGP Authentication should be employed when feasible, taking the performance impact into consideration,
especially when BGP sessions use transport across “untrusted” network segments.
Work is underway within the IETF, currently at the draft stage, to replace MD5 with alternate methods
using HMAC/SHA. It is also possible to use IPSec, as an alternative to MD5, to secure BGP
peering sessions.

The performance hit associated with implementing this feature may inhibit its deployment. Testing has
shown measurable CPU impact associated with calculating the MD5 digest for BGP data segments.
Additional details are available in RFC 2385.
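As a minimal sketch, MD5 authentication is enabled per neighbor with a shared key. The AS numbers and neighbor address are placeholders, and <shared-secret> stands for a key agreed with the peer.

```
router bgp 65000
 neighbor 192.0.2.2 remote-as 65001
 ! The same key must be configured on the remote router
 neighbor 192.0.2.2 password <shared-secret>
```

If the keys on the two routers do not match, the TCP segments are discarded and the BGP session cannot establish, so the change is best coordinated with the peer.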
BGP Maximum-Prefix
Background
The default behavior of BGP is for a router to accept an unlimited number of prefixes advertised by a
neighbor.
This creates a vulnerability: a neighbor that sends too many unplanned prefixes can cause the BGP
process to consume enough memory and other router resources to impact the overall operation of the
router, potentially including crashing it.

Benefits
The maximum-prefix option provides a method to limit the number of prefixes the router will accept from
a BGP peer, with an option to print a warning (log the limit being broken) or to close the session.
Additionally, there will be a Syslog alert triggered when the number of prefixes received exceeds a
specified percentage of the maximum prefixes value.
Guideline
This feature is disabled by default. When enabled, the default behavior is for the peering session to be
disabled when the maximum-prefix value is exceeded. If the ‘restart-interval’ argument is not
configured, a disabled session stays down after the maximum-prefix value is exceeded. The default
‘threshold’ value is 75%: a Syslog alert is triggered once the number of received prefixes reaches 75% of
the value configured with the maximum-prefix command.
This feature should only be deployed on peering sessions where the number of prefixes being learned is
predictable and stable. It would be unwise, for example, to deploy this on a transit link in the global
Internet, as the number of prefixes being announced on the Internet continues to grow; keeping the
configured maximum current would create an ongoing administrative task for the network
team. Use of the restart option is not generally recommended, as it could lead to session flaps.
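The two behaviors described above (closing the session versus only logging) can be sketched as follows. The limit of 1000 prefixes, the 80% threshold, the AS numbers, and the neighbor addresses are illustrative assumptions.

```
router bgp 65000
 neighbor 192.0.2.2 remote-as 65001
 ! Warn at 80% (800 prefixes); disable the session above 1000 prefixes
 neighbor 192.0.2.2 maximum-prefix 1000 80
 !
 neighbor 192.0.2.6 remote-as 65002
 ! Log only: the session stays up even when the limit is exceeded
 neighbor 192.0.2.6 maximum-prefix 1000 80 warning-only
```

The limit is usually set with some headroom above the number of prefixes the peer is expected to announce, so that normal growth does not trip it.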

Route Flap Dampening
Background
Route flap dampening (RFD) is a mechanism developed in the 1990s to help stabilize BGP within the
Internet. It was believed, at the time RFD was developed, that widespread deployment of RFD
would improve the overall convergence characteristics of the global Internet, and encourage network
operators to clean up any routes they had which might rapidly flap, or routes which changed state on a
regular basis.
More recent research on the impact of RFD has shown that this mechanism can do more harm than the
problem it is meant to cure. When there are multiple eBGP sessions, a router receives multiple updates
pertaining to a destination network. When the destination network becomes unreachable, the same router
receives multiple withdraw messages. This can over-penalize the network prefix and suppress it
unnecessarily for an extended period of time.
Both the RIPE and NANOG communities have agreed that RFD is no longer a best practice.
http://www.ripe.net/ripe/docs/routeflap-damping.html
http://www.nanog.org/mtg-0210/ppt/flap.pdf
Guidelines
The RIPE and NANOG communities have agreed that RFD is no longer recommended. If you still want
to implement this feature, follow the guidelines below.
The behavior of route flap dampening is governed by four configurable parameters. Appropriate
parameter selection depends on various factors such as the severity (frequency and duration) of
flaps, prefix length, and availability of alternate routes. For example, the decay half-life parameter
should be set to a time considerably longer than the period of the route flap it is intended to address.
Note that a single network-down event can be seen as multiple flaps, depending on the number of eBGP
sessions over which the prefix is learned. This should be taken into consideration when deciding
dampening parameters.
• It is advisable to apply route-dampening as close to the prefix being advertised as possible.

For example, dampening should be applied at the customer peering points in addition to applying it at
upstream ISP peering points. This also helps ensure that after successfully repairing the problem related to
prefix flapping, dampening parameters can be cleared on access routers.
• Apply dampening to inbound announcements from eBGP peers only.
• It is recommended that certain longer prefixes that are critical for access be excluded from dampening.
For example, dampening should not be applied to DNS servers, especially root DNS servers. Applying
dampening in such cases could result in loss of connectivity to name resolution services.

The main drawback of RFD is well explained in the following study:
“Route Flap Damping Exacerbates Internet Routing Convergence”, SIGCOMM 2002
http://www.eecs.umich.edu/~zmao/Papers/sig02.pdf

In essence, it shows that a single route flap (a withdrawal followed by an announcement) can cause the
route to be dampened further into the backbone, because several update messages are generated as
upstream routers step through successive next-best paths (an attribute change is counted as a flap).
The fact that each AS may use a different MRAI (Minimum Route Advertisement Interval) reinforces this
behavior, which is called withdrawal-triggered suppression.
The following presentation and document also provide an overview of this issue:
http://www.nanog.org/mtg-0706/Presentations/PhilipSmith-BGP.pdf
http://www.ripe.net/ripe/docs/ripe-378.html
Details
The Route Flap Dampening feature of BGP causes a route that is flapping, i.e., being announced and
withdrawn constantly, to be ignored for a period of time that depends on the duration of the flap. Routes
that flap constantly are suppressed (ignored) longer than routes that flap occasionally.
Generally, when a route is withdrawn and announced in quick succession, the route will be tagged with a
penalty value. This penalty value will be increased for each flap. The show ip bgp <route> router
command will list the penalty and the state of the route. Initially, the route will be tagged with a history
marker to indicate that it is known to be flapping. When the penalty value crosses a threshold, the route
will be suppressed or dampened. The route will be tagged as suppressed when this happens and a
countdown timer will be started to indicate the amount of time the route will remain in the dampened state.
This time will be extended for each additional flap that occurs. When the timer expires, the route will no
longer be suppressed.
By default, a route is assigned a penalty value of 1000 (a constant) for each flap. If the value of the
route’s accumulated penalty exceeds 2000, the route is suppressed until the penalty drops below 750.
The accumulated penalty is decayed every 5 seconds, at a rate that halves it every 15 minutes.
Route dampening parameters can either be applied to all prefixes equally, or be graded so that longer
prefixes are dampened more aggressively than shorter ones. The latter is preferred since it is flexible
and gives more granular control.
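The two approaches can be sketched as follows, using the default values quoted above. The argument order of bgp dampening and set dampening is half-life (minutes), reuse limit, suppress limit, and maximum suppress time (minutes). The prefix-list and route-map names, the AS number, and the graded values for /24 and longer prefixes are illustrative assumptions, not recommended settings; only one form of the bgp dampening command would be configured at a time.

```
! Option 1: the same parameters (here, the defaults) for all prefixes
router bgp 65000
 bgp dampening 15 750 2000 60
!
! Option 2: graded dampening, with a longer half-life for /24 and longer
ip prefix-list LONG-PREFIXES seq 5 permit 0.0.0.0/0 ge 24
!
route-map GRADED-DAMPENING permit 10
 match ip address prefix-list LONG-PREFIXES
 set dampening 30 750 2000 120
route-map GRADED-DAMPENING permit 20
 set dampening 15 750 2000 60
!
router bgp 65000
 bgp dampening route-map GRADED-DAMPENING
```

As a worked example of the default decay: a route whose penalty has reached 3000 falls to 1500 after 15 minutes and to 750 after 30 minutes, so it is released from suppression roughly 30 minutes after its last flap, provided it does not flap again.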

Route Reflectors
Background
To ensure any BGP router has complete routing information, it is necessary for all BGP routers in an AS to
have iBGP peering sessions with each other. Because a BGP speaker will not send routes learned from one
iBGP peer to another, it follows that to have a complete set of routes, all BGP speakers within an AS must
peer directly with one another. This produces some real scaling problems once you have more than a few
BGP routers in your network. For larger ISPs, which may have hundreds or even thousands of routers in
their network, the number of peering sessions clearly becomes unmanageable and difficult to implement
given the limited CPU and memory on routers.
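As a quick illustration of the scaling problem (the router counts are arbitrary examples), a full mesh of n iBGP speakers needs one session for every pair of routers:

```latex
\text{iBGP sessions} = \frac{n(n-1)}{2}, \qquad
n = 10 \Rightarrow 45, \quad
n = 100 \Rightarrow 4950, \quad
n = 1000 \Rightarrow 499500
```

Each router must also maintain n - 1 sessions itself, which is what strains CPU and memory on individual devices.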
Route reflectors and Confederations address this problem by allowing a router to advertise or reflect iBGP
learned routes to other iBGP peers without requiring a full network mesh.
Route reflectors, then, help reduce the size of the iBGP mesh and the associated overhead. Normal BGP
speakers can coexist with route reflectors and route reflector clients, because only the route reflector must
support this feature. Once the best path is selected, it is reflected to all clients if received from a non-client.
If the best path was received from a client, it will be reflected to all non-clients and other clients.
Benefits
Route reflectors add additional attributes to BGP updates within an AS to allow a router to reflect iBGP
learned routes to other iBGP peers, which allows the network engineer to reduce the size of the iBGP
mesh.
Besides reducing the number of iBGP sessions, a route-reflector hierarchy serves to reduce the number of
routing updates transmitted through the network, and to reduce the size of the BGP tables of BGP speakers
within the network. A route reflector will only forward its best path to a client, and it will only forward the
best path from all of its clients to each of its normal iBGP peers.
A route reflector topology results in considerable savings in the number of paths stored on routers in the
network. This leads to lower memory utilization and savings in CPU and BGP update generation. This
comes, perhaps, with a small cost in convergence time, because BGP updates now have to propagate
through multiple routers (RRC-to-RRC relationships) to get from one BGP router to another. Typically,
the savings will be greater for routers lower in the network hierarchy.
Guidelines
• Divide the backbone into multiple clusters and use at least one route reflector and a few clients per
cluster.
• Start by migrating small parts of the network, one part at a time.
• Changing “next-hop” using a route-map on a route reflector can cause routing loops and it is not
recommended unless absolutely required.
• Route reflectors can be configured in a redundant fashion, with each client physically connected to two
route reflectors to ensure that BGP information continues to be reflected if there is a problem with one
route reflector. The redundant configuration increases the number of physical TCP connections
required compared to the single-route reflector configuration, but still significantly streamlines the
network topology compared with a full network mesh. Clients can also peer with route reflectors in
other clusters for redundancy. Clusters may be configured hierarchically: route reflectors in a cluster
can be clients of route reflectors at a higher level. This provides a natural method to limit routing
information sent to lower levels.
• Route-reflector redundancy is recommended in order to allow multiple RRs to peer with the same RR
Clients. However, particular attention must be given to the usage of same versus different cluster IDs
on these redundant RRs. The benefit of configuring multiple RRs with the same cluster ID (which means
they are in the same cluster) is that less memory is required in the router, because multiple
updates from within the same cluster are not kept after being received. The important drawback,
however, is that routing can be disrupted in case of link failure. This can happen in scenarios when a
link failure prevents an RR from learning the same route as other RRs in the cluster.
• When deploying or migrating to a route-reflector topology, it is important to follow the physical
topology when deciding where to place the route reflectors.
• When route-reflector clients (RRCs) peer with multiple route reflectors (RRs), it is important to assess
whether iBGP peering should be formed over physical interface addresses or over loopback addresses.
Peering over physical interface addresses allows a physical link failure to tear down the BGP session
thereby switching the RRC over to its “backup” RR. When RR-RRC peering is over loopback
addresses, a physical link failure may retain or re-establish the iBGP session from an RRC over to the
same RR via an alternate path, which may or may not be desirable depending upon the topology.
• As a rule of thumb, the route reflector should be used only as a “route server” and not for switching
high volumes of traffic or for other services.
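A minimal sketch of one route reflector and one of its clients follows, assuming peering over loopback addresses. The AS number, the addresses, and the cluster ID are placeholders chosen for illustration.

```
! Route reflector (one of two redundant RRs serving the same clients)
router bgp 65000
 bgp cluster-id 10.0.0.1
 neighbor 10.0.1.1 remote-as 65000
 neighbor 10.0.1.1 update-source Loopback0
 neighbor 10.0.1.1 route-reflector-client
 neighbor 10.0.1.2 remote-as 65000
 neighbor 10.0.1.2 update-source Loopback0
 neighbor 10.0.1.2 route-reflector-client
!
! Client: ordinary iBGP sessions to both route reflectors
router bgp 65000
 neighbor 10.0.0.1 remote-as 65000
 neighbor 10.0.0.1 update-source Loopback0
 neighbor 10.0.0.2 remote-as 65000
 neighbor 10.0.0.2 update-source Loopback0
```

Only the route reflector carries reflector-specific commands. If bgp cluster-id is omitted, the router ID is used as the cluster ID, which places each redundant RR in its own cluster; configuring the same value on both RRs places them in one cluster, with the trade-off discussed in the guidelines above.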

With earlier versions of IOS, if peer groups were used for clients of a route reflector, all the clients were
required to be fully meshed and “bgp client-to-client reflection” would need to be turned off on the route-
reflector. Clients inside a cluster do not have direct iBGP peers; instead, they exchange updates through the
route reflector. Configuring peer groups within such a cluster could cause a withdrawal to the source of a
route on the route reflector to be sent to all clients in the cluster.
BGP route reflection is specified in RFC 2796.

Confederations
Background
Some of the key characteristics of confederations are:
• They are visible to the outside world as single AS, the Confederation AS.
• The sub-autonomous systems within the confederation may use private AS numbers, rather than public
ones. This would generally be done when migrating a single AS to a confederation, but when merging
autonomous systems into a single AS by combining them into a confederation, the AS numbers would
remain the same.
• iBGP speakers within a sub-AS should be fully meshed unless a route-reflector model is used.
• The total number of neighbors is reduced by limiting the full mesh requirement to only the peers in the
sub-AS.
Guidelines
• Both confederations and route reflectors can be scaled to a good extent.
• Route reflectors are much easier to migrate as you don’t have to deal with sub-AS numbers.
• If you already have large scale deployment of confederations keep using them and use RR within sub-
AS to reduce iBGP mesh.
• Confederations are better to use in scenarios involving AS mergers.
Details
Route Propagation in Confederation

Route propagation decisions are fairly similar to those of route reflectors; the special treatment of
AS paths is the key distinction.
• For routes learned from other peers within the local sub-AS, send only to external peers. If the
external peer is in another sub-AS, prepend the local sub-AS number to the
CONFEDERATION_SEQUENCE part of the AS_PATH. If the external peer is outside the
confederation, remove the CONFEDERATION_SEQUENCE and CONFEDERATION_SET
segments, prepend the confederation ID to the AS_PATH attribute, and forward the route.
• According to the BGP RFC, a sub-AS can aggregate, resulting in a CONFEDERATION_SET, but Cisco
does not implement this.
• For routes learned from any external peer (another sub-AS, or outside the confederation), send the
routes to all neighbors.
• By default (i.e., in the absence of any configured policy), the LOCAL_PREF, MED, and NEXT_HOP
attributes are forwarded without change.
• Because the NEXT_HOP is unchanged, we can conclude that confederations are expected to run a
single IGP.

Confederations require more configuration work upfront and usually require network downtime to activate.
It is difficult to transition to or from a confederation because of the multiple AS numbers involved. All BGP
peers on a router must be reset if it needs to be assigned a new AS number. However, once the initial
configuration work is complete, confederations give network operators a greater degree of management
flexibility. While confederations help reduce iBGP mesh themselves, they can also be used with route
reflectors.
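
The points above can be illustrated with a minimal sketch of one router inside a confederation. The AS numbers and addresses are hypothetical: 100 is assumed to be the public confederation ID, 65001 and 65002 are assumed private sub-AS numbers, and 200 is an assumed outside AS.

```
router bgp 65001
 ! This router belongs to sub-AS 65001, a member of confederation AS 100
 bgp confederation identifier 100
 ! Other sub-ASs in the same confederation
 bgp confederation peers 65002
 ! iBGP peer inside the local sub-AS
 neighbor 10.0.0.2 remote-as 65001
 ! Peer in another sub-AS of the confederation
 neighbor 10.0.1.1 remote-as 65002
 ! Peer outside the confederation; it sees only AS 100
 neighbor 192.0.2.1 remote-as 200
```

Because the process number under router bgp is the sub-AS, moving a router into or out of a confederation changes its AS number, which is why every BGP peer on that router has to be reset.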

BGP-IGP Redistribution Policies
Background
Cisco IOS supports redistribution between all IP routing protocols. This also includes redistributing
between BGP and IGP protocols. The redistribution of BGP into an IGP is typically discouraged. When
BGP to IGP redistribution must be performed, it is usually recommended to filter the redistribution to
prevent uncontrolled prefix injection from destabilizing the entire network.
Benefits
The main benefit of redistributing BGP into the IGP is to provide dynamic route advertisement of BGP
learned prefixes throughout the network without requiring a pervasive BGP environment.
Guidelines
Avoid redistributing routes from BGP into an IGP whenever possible. This is especially the case when full
Internet tables are involved. If BGP must be redistributed into the IGP, it is highly recommended that
prefix filtering be configured to limit the prefixes allowed into the IGP.
Typically, only eBGP routes are redistributed into an IGP. Redistributing iBGP routes can create routing
loops, so as a protection mechanism iBGP learned prefixes are not redistributed by default. The
redistribution of iBGP learned prefixes must be explicitly enabled. The following command enables the
redistribution of iBGP learned prefixes:
router bgp xxxx
bgp redistribute-internal
• When doing mutual redistribution, ensure route propagation is fully controlled in both directions;
otherwise it can cause routing loops.
• It is recommended to configure a default metric for the routes that are redistributed.
• Be careful about routing loops that may occur when redistributing at multiple points.
• Using default-information originate is the preferred mechanism, unless you’re using an IGP that
doesn’t support it, such as EIGRP. In any case, you should limit this to the default route only.
• Do not redistribute BGP routes matching on communities as it is unreliable and not supported.
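
When redistribution cannot be avoided, the filtering recommended above might look like the following sketch. It assumes OSPF process 1 as the IGP, BGP AS 65000, and 192.0.2.0/24 as the only prefix allowed into the IGP; the names and numbers are illustrative and do not come from this document.

```
! Only this prefix may enter the IGP; everything else hits the implicit deny
ip prefix-list BGP-TO-IGP seq 5 permit 192.0.2.0/24
!
route-map BGP-TO-IGP permit 10
 match ip address prefix-list BGP-TO-IGP
!
router ospf 1
 ! Redistribute BGP routes through the filter with an explicit metric
 redistribute bgp 65000 metric 100 subnets route-map BGP-TO-IGP
```

Matching on an explicit prefix list keeps the set of redistributed routes bounded even if the BGP table grows unexpectedly.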

Interior Gateway Protocols were not designed to handle the large number of prefixes that are typically
found in BGP. The redistribution of BGP into the IGP can cause network-wide instability. Even if prefix
filtering is configured on the redistribution point, the number of prefixes may be more than the IGP is able
to cope with. The misconfiguration of a BGP to IGP redistribution point has been known to cause large-
scale network outages and should be avoided as a general rule.

BGP Next-hop
This section describes how next-hop reachability works in BGP, associated issues and best practices to
address them.
Background
The BGP next hop attribute is the next hop IP address to use in order to reach a destination prefix.
Default next-hop behaviors:
• When a router announces a prefix to an eBGP peer, the next-hop is modified to the local peering
interface – normally the connected interface address.
• When a router announces an eBGP learnt route to an iBGP peer, the next-hop is not modified.
• When a route-reflector reflects a route learnt from a route-reflector client (RRC) the next-hop is
not modified.
• When a route-reflector propagates a route learnt from an iBGP or eBGP peer the next-hop is not
modified.
In any peering scenario, the next hop must be reachable before a router will insert the destination prefix into
its routing table. When a prefix is learnt via eBGP and subsequently announced via iBGP, the default
behavior listed above requires the external next hop to be learnt via an IGP; otherwise the prefix will not
be installed because the next hop is unreachable.
Guidelines
Some methods for ensuring next-hop reachability for externally learnt routes are:
• Redistribute a static route to the eBGP peer into the local IGP
• Redistribute the connected or static route which is used to reach the eBGP peer into the local IGP
• Run the local IGP on the interface which is used to reach the eBGP peer advertising the route (normally
the IGP would be configured not to build any neighbor adjacencies on this interface)
• Set the next hop to the BGP peering address of the BGP speaker within the local AS which is receiving
the routes
When advertising to an iBGP peer, the BGP “next-hop-self” feature modifies the next hop only for eBGP
learnt prefixes, not for prefixes learnt from an RR client or from iBGP peers.
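
The last method in the list above can be sketched as follows on a border router. The AS number, the peer address, and the use of Loopback0 as the peering address are assumptions made for illustration.

```
router bgp 65000
 ! iBGP peer, sourced from the loopback that the IGP already advertises
 neighbor 10.0.0.2 remote-as 65000
 neighbor 10.0.0.2 update-source Loopback0
 ! Advertise eBGP learnt routes with this router as the next hop
 neighbor 10.0.0.2 next-hop-self
```

With this in place, the iBGP peer only needs an IGP route to the border router's loopback rather than to the external peering subnet.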
Next Hop Behavior with Route Reflection

By default, a Route Reflector will not modify the next-hop attribute of reflected (iBGP learnt) routes. Cisco
implementations allow an RR to modify the next hop for eBGP routes being advertised to route reflector
clients by configuring the neighbor next-hop-self command.
If there is a requirement to modify the next hop of reflected routes or iBGP learned routes this must be
accomplished with an outbound route-map.

• Modifying the next hop at a Route Reflector can cause routing loops and is only advised when the
network architecture requires it.
“Next-Hop-Unchanged” feature
To enable an eBGP peer to propagate the next hop unchanged, use the neighbor next-hop-unchanged
command in address family or router configuration mode. Except in certain inter-AS MPLS VPN
scenarios this command should not be configured on a route reflector, and the neighbor next-hop-self
command should not be used to modify the next hop attribute for a route reflector when this feature is
enabled for a route reflector client.
This command can be used to accomplish the following:
• Bring the route reflector into the forwarding path, which can be used with the “iBGP Multipath” Load
Sharing feature to configure load balancing.
• Configure inter-AS Multi Protocol Label Switching (MPLS) Virtual Private Networks (VPNs) by not
modifying the next hop attribute when advertising routes to an MP eBGP peer.
• Turn off the next hop calculation for an eBGP peer. This feature is useful for configuring the end-to-
end connection of a label-switched path.
To propagate the next hop of an iBGP learnt prefix without modification, use the optional allpaths
keyword.
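
As a rough illustration of the inter-AS MPLS VPN case, the sketch below shows a multihop MP-eBGP session that passes VPNv4 routes with the next hop unchanged. The AS numbers, the peer address, and the loopback source are hypothetical.

```
router bgp 65000
 neighbor 192.0.2.2 remote-as 65100
 neighbor 192.0.2.2 ebgp-multihop 255
 neighbor 192.0.2.2 update-source Loopback0
 !
 address-family vpnv4
  neighbor 192.0.2.2 activate
  neighbor 192.0.2.2 send-community extended
  ! Keep the original next hop when advertising to the MP-eBGP peer
  neighbor 192.0.2.2 next-hop-unchanged
```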
Incorrectly setting BGP attributes for a route reflector can cause inconsistent routing, routing loops, or a
loss of connectivity. Setting BGP attributes for a route reflector should be attempted only by an
experienced network operator.

Community Attribute
Background
BGP communities are used to simplify the control of routing information by providing a mechanism to
group prefixes and make routing decisions based on these groupings. In general, a BGP community is
defined as a group of prefixes which share some common property. The community scheme can
significantly simplify the configuration required to control distribution of routing information. A BGP
speaker may use this attribute to control which routing information it accepts, prefers or distributes to other
neighbors.
Unlike attributes such as local preference, AS path, and MED, communities do not themselves alter the
BGP decision-making process. Instead, they are used as flags to mark a set of prefixes.
Benefits
The biggest advantage of the BGP community attribute is that it provides a scalable method of
implementing routing policy.
• Building filters at an AS exit point based on the AS from which the prefixes are learned would
normally involve building a large and complex AS filter list, or a filter based on complex regular
expressions. If all the prefixes to be filtered at a specific exit point are marked with a single
community, however, this filtering becomes much simpler.
• Assigning multiple communities to a prefix can be used to build a community “string”. This allows
other routers to act based on one, some or all of the attributes. A router has the option to add or modify
a community attribute before it passes the attribute on to other peers.
• The community attribute provides more flexibility than many other prefix attributes for managing
policies because it does not form part of the BGP best-path algorithm. Communities are generally
leveraged by the routing policy engine to set and/or modify other attributes in order to influence best-
path decisions.
• The BGP Named Community Lists feature allows the network operator to assign meaningful names to
community lists. This feature also increases the number of community lists that can be configured,
because there is no limit on the number of named community lists.
Guidelines
It is considered a common practice to group prefixes into communities based on classifications such as the
following:
• Type of customer or peering AS. For example, those that receive full routes versus those that receive
partial (direct customer-only) routes.
• Prefixes learned from customers.
• Prefixes learned from ISPs or peers.
• Prefixes with identical routing policies.

• Prefixes in VPN (BGP community is fundamental to the operation of MPLS VPNs; communities play a
crucial role in identifying families of routes within VPNs.)
It is better to assign prefixes into communities at the edge of the network, and then build the outgoing
policy lists based on simple communities. When advertising routes to the ISPs it is advisable to configure
communities in agreement with the ISP.
For example, RFC 1998 describes a way for a multi-homed environment to indicate which of the ISPs is
primary and which is secondary for a particular set of routes.
When configuring the community attribute, verify whether any community tags already exist.
By default, if you use “set community AS:N” in a route map, the existing community string will be
OVERWRITTEN. To append to the existing community string, use the “additive” keyword. To remove the
existing community attribute, use “set community none”.
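
A short sketch of the two behaviors just described follows. The route-map names and the community value 65000:100 are hypothetical.

```
! Append 65000:100 to whatever communities the route already carries
route-map ADD-COMMUNITY permit 10
 set community 65000:100 additive
!
! Remove the community attribute entirely
route-map CLEAR-COMMUNITY permit 10
 set community none
```

Omitting the additive keyword in the first route map would replace any communities already attached to the route.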
The “well-known” communities should be understood and obeyed by all BGP4 implementations.
For historical reasons, the community attribute is not sent by default; it must be enabled on a per-neighbor
basis. Communities are carried across AS boundaries (transitive) only if send-community has been
configured for the neighbor. Proposals are under way to change the default so that communities are sent to
iBGP peers and not to eBGP peers.
An enterprise network can use primary outbound filters based on communities to send to its ISP all of the
routes originated in its network.
In smaller installations, where the number of prefixes in the network is small, it is advisable to use a prefix-
list as a “backup” for the primary community filter.
This does nothing more than to help protect the Internet against possible mistakes.
If a range of routes is to be aggregated and the resulting aggregate’s attribute section does not carry the
ATOMIC_AGGREGATE attribute, then the aggregate should have a COMMUNITIES path attribute that
contains all communities from all of the aggregated routes.
Details
RFC 1997 defines a BGP community as a transitive attribute with a 32-bit value that can be applied to
BGP routes. RFC 1997 suggests the first two octets be an AS number (presumably that of the originating
domain), and the second two octets may be defined by that autonomous system.
By default, the Cisco implementation uses a decimal value for the BGP community attribute. With release
12.0, a BGP community can be configured in 3 different formats: decimal, hexadecimal, and AA:NN, where
AA is the AS number and NN is a value assigned by the BGP AS administrator.
To use and display the newer AA:NN format, where the first part is the AS number and the second part is a
2-byte number, use the ip bgp-community new-format global configuration command.
Router(config)#ip bgp-community new-format
Cisco IOS recognizes the following well-known community keywords. The first three are defined by
RFC 1997:

• no-export: Do not advertise to eBGP peers. Keep this route within an AS.
• local-AS: Do not send outside the local AS in a confederation (special case of no-export).
• no-advertise: Do not advertise this route to any peer, internal or external.
• None: No community attribute. Useful to clear any communities associated with a route.
• Internet: Advertise this route to the Internet community; all routers belong to it.
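
As one possible use of these keywords, the sketch below tags routes received from a neighbor with no-export so that they stay inside the local AS. The AS numbers, neighbor addresses, and route-map name are hypothetical.

```
route-map TAG-NO-EXPORT permit 10
 ! Keep any existing communities and add the well-known no-export value
 set community no-export additive
!
router bgp 65000
 neighbor 192.0.2.1 remote-as 65010
 neighbor 192.0.2.1 route-map TAG-NO-EXPORT in
 ! iBGP peer must receive the community for it to be honored there
 neighbor 10.0.0.2 remote-as 65000
 neighbor 10.0.0.2 send-community
```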
Basic Configuration
The following basic configuration sets the BGP community attribute on routes advertised to a neighbor.

RTR-A#
!
router bgp 200
 network 160.10.0.0
 neighbor 3.3.3.1 send-community
 neighbor 3.3.3.1 route-map setcommunity out
!
route-map setcommunity permit 10
 match ip address 1
 set community 200:200
!
! Standard access list 1 matches every prefix
access-list 1 permit 0.0.0.0 255.255.255.255
It is important to note that without the neighbor x send-community command, the community attribute
will not be sent for those routes.
Community-lists
A community list is a group of communities that can be used in a match clause of a route-map; this
allows filtering or the setting of attributes based on different lists of community numbers. The earlier
configuration set the community attribute; the community lists in the following configuration apply policy
based on that attribute.
RTR-B#
!
router bgp 300
 neighbor 3.3.3.3 route-map check-community in
!
ip bgp-community new-format
!
route-map check-community permit 10
 match community 1
 set local-preference 200
!
route-map check-community permit 20
 match community 2 exact-match
!
route-map check-community permit 30
 match community 3
!
ip community-list 1 permit 100:20
ip community-list 2 permit 300:200
ip community-list 3 permit internet
!

In the above example, any prefix that has 100:20 in the community attribute matches list 1. Any prefix that
has only 300:200 as its community matches list 2. The keyword exact-match indicates that the community
attribute should consist of 300:200 and nothing else.
Note the last community-list command. Community lists work just like access lists, and a route-map ends
with an implicit deny, so all other routes must be explicitly permitted after the community lists are applied;
otherwise they will be filtered. Since the internet community means all routes, this command allows all
other routes to come through.
On the other hand, ISPs want tight control over the routes they take into their network. In that case, they
will not use a community list with the internet keyword and will only allow the routes that they choose. It is
important to know what the policy is and to use the appropriate implementation.
Extended BGP Communities

For applications that require larger communities to group routes within BGP, there are multiple drafts on
extended BGP communities. BGP extended communities are 64 bits instead of the current 32 bits. These
drafts also give the community attribute a structure that addresses needs such as route target and route
originator indicators, which help associate routes with an L3VPN. For more information, refer to the
extended communities draft on the IETF web site.
Named BGP Community-lists

With release 12.0(10)S, 12.1(9)E, 12.2(8)T and 12.0(16)ST, Cisco introduced the named community-list
feature, which adds a new type of community list called the named community list. BGP named
community lists allow the network operator to assign meaningful names to community lists and increase
the number of community lists that can be configured. A named community list can be configured with
regular expressions and with numbered community lists. All rules of numbered community lists apply to
named community lists, except that there is no limit on the number of community attributes that can be
configured for a named community list. Named community lists also do not have the limit of 100
community groups that exists for numbered standard and expanded community lists.
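
A minimal sketch of named community lists follows, one standard and one expanded (regular expression) list. The list names, the route-map name, and the community values are hypothetical and chosen only for illustration.

```
ip bgp-community new-format
!
! Standard named list: matches routes carrying community 10:200
ip community-list standard CUSTOMER-ROUTES permit 10:200
! Expanded named list: matches any community of the form 10:NN
ip community-list expanded ANY-AS10 permit _10:[0-9]+_
!
route-map FROM-CORE permit 10
 match community CUSTOMER-ROUTES
 set local-preference 95
```

The name is referenced in the match clause in the same way as a list number.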
BGP Community Design Case Study

This section covers a BGP community design case study to help better explain BGP community attribute
use in BGP deployments.
Consider an ISP, using AS 10 that services customers in single connection mode and multiple connection
mode with other connections used for backup or load balancing purposes. This ISP has many peers, both
customers and transit, and has the added complexities of aggregate routes and routes for internal networks
or affiliates.
The ISP has finalized the following community attributes for various route types:
Table 1 BGP Community Design
Route Type                     Community   BGP Policy *
ISP internal routes            10:50       Local Preference 100
Peer Routes                    10:100      Local Preference 85
Preferred Peers                10:150      Local Preference 90
Customer Routes - Specifics    10:200      Local Preference 95
Customer routes - Aggregates   10:400      Customer routes that are aggregated; advertised to peers; Local Preference 95
ISP assigned customer routes   10:500      Aggregated towards peers, local preference 100
* BGP Policy is unique to each router based on its BGP connections
The BGP routing policy for the ISP has the following salient points:
• All core routers should have all the routes including peer routes, customer routes and affiliate routes.
• Some customers that are single homed should get only the default route.
• Multi-homed customers have two options: either full Internet routing table or just selected ISP routes
and some other critical routes for optimal routing.
• Customers with IANA assigned AS numbers and IP addresses should be advertised toward peers.
Private AS’s and private IP addresses should not be advertised to peers.
• If customer routes can be aggregated, only aggregate should be advertised to peers.
• If customer routers are dual homed (connected to multiple ISPs) and have specific routes advertisement
and AS-path prepend requirements, then specific routes should be advertised to peers.
• ISP assigned customers routes should be aggregated towards peers.
• If the ISP has a preferred path through other ISPs for specific destinations, those peers should be
preferred for those destinations.
Configuration on a peering router

router bgp 10
 aggregate-address 129.168.203.0 255.255.255.0 as-set summary-only
 neighbor 202.168.79.6 route-map set-community in
 neighbor 202.168.79.6 route-map peer-community out
 !
 neighbor 199.9.200.33 route-map set-community in
 neighbor 199.9.200.33 route-map peer-community out
 exit
!
! AS-Path ACL 71 permits all routes from peering ASs
!
ip as-path access-list 71 permit ^201_
ip as-path access-list 71 permit ^651_
!
ip bgp-community new-format
! One community per entry, so a route carrying any one of them matches.
! Several communities on a single standard entry must all be present, and
! list numbers from 100 upward are expanded (regular expression) lists.
ip community-list 10 permit 10:50
ip community-list 10 permit 10:200
ip community-list 10 permit 10:400
ip community-list 10 permit 10:500
!
route-map set-community permit 10
 match as-path 71
!
route-map peer-community permit 10
 match community 10
end
Configuration on a customer router

The following example shows a scenario with the customer getting full Internet routing table and customer
routes getting aggregated.
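router bgp 10
 neighbor 144.70.7.1 route-map customer-in in
 neighbor 144.70.7.1 route-map customer-out out
 exit
!
! One community per entry, so a route carrying any one of them matches
ip community-list 10 permit 10:50
ip community-list 10 permit 10:100
ip community-list 10 permit 10:150
ip community-list 10 permit 10:500
!
access-list 1 permit 144.70.0.0
!
route-map customer-in permit 10
 match ip address 1
!
route-map customer-out permit 10
 match community 10
end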

Synchronization Rules
Background
The BGP synchronization feature refers specifically to prefix synchronization between BGP and the IGP. If it is
enabled, BGP speakers will not advertise routes learned from an iBGP peer unless the destination
described in the route is also reachable through the local IGP. A prefix is synchronized in BGP if there is a
matching prefix in the IGP. The purpose of the synchronization feature is to prevent traffic from being
black-holed in networks that have non-contiguous BGP speakers. If a BGP learned prefix is not
synchronized, the prefix will not be inserted into the routing table and will not be advertised to external
peers. The concept of a synchronized prefix is only relevant to iBGP learned prefixes.
Benefits
The BGP synchronization feature is only beneficial in very specific environments, in which two
requirements must be met.
• There are non-contiguous BGP speakers. This means that the transit path between two iBGP peers
contains routers that are not running BGP.
• BGP prefix routing information is being redistributed into the local IGP. If no BGP prefix information
is inserted into the IGP, it is unlikely the BGP prefix would ever be able to become synchronized.
A network seldom benefits from having BGP synchronization enabled.
Guidelines
It is common practice in virtually every BGP network to disable this feature, and Cisco recommends that
BGP synchronization be disabled. Synchronization is enabled by default in older versions of Cisco IOS
Software and disabled by default in more recent versions.
Synchronization should not be used to prevent the transiting of traffic through a non-transit autonomous
system. Some form of route filtering should be used instead.
The command to disable this feature is:
router bgp xxxxx
no synchronization

Multi-homing
Background
Many enterprises require continuous connectivity to the global Internet to provide email, web browsing,
and business to business applications support. In order to support this continuous connectivity, many
enterprises have turned to multi-homing.
The tight integration of day-to-day business communication with the Internet has made Internet
connectivity a mission critical service. The use of email and the web has become tightly integrated into
the world economy and therefore are critical to the way business is done. The requirement for Internet
connectivity is not just for connectivity, but highly redundant connectivity. It is in this capacity,
connecting to the Internet, that BGP is most commonly seen in enterprise environments.
The term multi-homing has become quite common. So what does it mean to be multi-homed with respect
to Internet connectivity?
A network is multi-homed when it has more than one path to reach the Internet. This may be accomplished
via multiple paths to a single provider, or multiple paths to different providers. Customers usually prefer
paths to different providers for improved redundancy.
The performance of the Internet connectivity can be enhanced through multi-homing. This is commonly
done through the use of different providers to provide a more diverse selection of paths to reach
destinations.
Keep in mind that multi-homing does not always provide physical path redundancy. This is highlighted in
the case of two Service Providers that may be using the same physical equipment to deliver a circuit to the
Enterprise. There would be logical redundancy but not physical.
Benefits
There are a few primary reasons for multi-homing:
• Reliability
Internet connectivity has become a mission critical service in many environments. Multi-homing
(when done correctly), will provide the redundancy needed to ensure reliable service delivery.
• Optimal Routing
The performance of Internet connectivity can be enhanced through multi-homing.
This is commonly done through the use of different providers to provide a more diverse selection of
paths to reach destinations.
• Increased Bandwidth requirements
As the traffic requirements grow, so does the need for more bandwidth from an enterprise to the outside
world. More bandwidth does not necessarily mean increasing the pipe. It can also be achieved by
increasing the number of links to the service providers and hence load share the traffic.
Multi-homing Scenarios
Depending on the criticality of the connectivity and bandwidth requirements, one of the following multi-
homing scenarios can be used to provide Internet connectivity to an enterprise network.

Stub Network Single-homed
Guidelines
A single-homed stub network is one that connects to the Internet using a single connection to the service
provider.
• A single-homed stub network does not require the use of BGP.
• The provider will configure a static route for the customer’s prefix. The enterprise will configure a
static default route.
• If multiple circuits are used between the provider router and customer router, multiple static routes are
used. The use of multiple connections in this design assumes they are all connected to the same
routers.

Router failure will result in full loss of connectivity.
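
The static routing described above can be sketched as follows. The customer prefix 192.0.2.0/24 and the link addresses 198.51.100.1 (provider side) and 198.51.100.2 (customer side) are hypothetical.

```
! Provider router: static route for the customer prefix via the customer link address
ip route 192.0.2.0 255.255.255.0 198.51.100.2
!
! Customer router: static default route via the provider link address
ip route 0.0.0.0 0.0.0.0 198.51.100.1
```

With multiple circuits between the same two routers, one such pair of routes would be configured per circuit.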
Stub Network Multi-homed: Single Border Router

This is a scenario where the enterprise connects to the Internet through a single service provider and
(preferably) through different physical infrastructures.
The enterprise uses a single router to achieve this connectivity.
Guidelines
BGP should be used for better traffic control in both directions thus achieving better load sharing across
both links.
• In the stub environment, the enterprise should typically receive a default route from the service
provider. At most, partial routes can be requested if needed. Receiving the full Internet routing table is
not necessary in a stub environment.
• Private AS numbers can be used since the enterprise is connecting to a single service provider.
• Some form of route filtering should be used to prevent the single router within the stub autonomous
system from becoming a transit path for the peering AS. This is commonly achieved by using an
as-path access list.
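
The transit filter in the last guideline might be sketched as follows. It assumes a private AS 65001 for the enterprise, provider AS 100, and two provider-facing neighbor addresses, all of which are hypothetical.

```
! Permit only routes originated in the local AS (empty AS_PATH)
ip as-path access-list 1 permit ^$
!
router bgp 65001
 neighbor 192.0.2.1 remote-as 100
 neighbor 192.0.2.1 filter-list 1 out
 neighbor 192.0.2.5 remote-as 100
 neighbor 192.0.2.5 filter-list 1 out
```

Because routes learned on one link carry the provider's AS in the AS_PATH, they fail the ^$ match and are never re-advertised on the other link.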

This setup provides protection against a link failure and covers for partial loss of connectivity in the service
provider network. However, if the Enterprise CPE or service provider Access Router fails, connectivity to
the Internet will be completely lost. This is due to the single point of failure associated with this design
strategy.
Stub Network Multi-homed: Multiple Border Routers

This is a scenario where an enterprise network connects to the Internet using multiple border routers via a
single service provider to achieve load sharing and resilience. This type of multi-homing may be done via
different POPs of the same ISP and preferably through two circuits which do not share the same physical
infrastructure (CO, fiber, or other elements).
Guidelines
• BGP should be used for better traffic control in both directions thus achieving better load sharing
across both links. The customer may accept a default route, partial routes, or a full routing table from
the service provider, depending on the level of outbound traffic control the customer would like to
achieve.
• Private AS numbers can be used because the enterprise is connecting to a single service provider.
• Each of the customer border routers should eBGP peer with the service provider.
• All customer border routers and all transit routers interconnecting the border routers should speak BGP
and establish an iBGP full mesh between each other. If this is not feasible, all border routers should
build iBGP adjacencies with each other through a direct L2 link or through virtual interfaces such as a
GRE tunnel. This way, even if the transit routers do not have partial or full BGP routes, the traffic
will not be black-holed or looped. (Note that tunnels may have other disadvantages on specific
routers, such as performance and fragmentation, which should be investigated thoroughly before
deciding to implement this solution.)
• The enterprise should then originate a default route from each border router. In order to prevent traffic
from following a default route to a border router with a failed upstream circuit, the default origination
should be conditional on the circuit being up and active. This conditional advertisement can be based
on a static default pointed at the interface, or can be received from BGP and redistributed into the IGP.
If additional prefix information is received from the upstream provider, do not redistribute this into any
IGP process running on the border routers.
• Some form of route filtering should be configured to prevent the customer’s autonomous system from
becoming a transit path for the service provider. This is commonly achieved by using an as-path access
list.
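
One way to make the default origination conditional, assuming OSPF process 1 as the IGP (an assumption; the document does not name one), is to originate the default without the always keyword:

```
router ospf 1
 ! Without "always", the default is advertised into OSPF only while a
 ! default route from another source (for example, one learned through
 ! eBGP from the provider) is present in the routing table
 default-information originate
```

If the upstream session fails and the BGP-learned default is withdrawn, the border router stops advertising the default into the IGP, so internal traffic follows the other border router.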

In the rare event that the service provider network goes down completely, for example due to a disaster,
Internet connectivity is lost.
Standard Multi-Homed Network: Single border router / Multiple ISP’s
This is a scenario where an enterprise network connects to the Internet using a single border router and
uses multiple links to multiple service providers.
Guidelines
This design scenario requires the enterprise to obtain their own ASN from their regional registry. The
enterprise will also need a block of address space that is large enough to pass standard peering filters.
Depending on the resources available one of the following options can be considered.
Option 1 –Static Defaults

The customer will use a static default route to each service provider.

This option allows for a very even sharing of outbound traffic. But it leads to sub-optimal routing because
traffic destined for provider A or its directly connected customers may go through the other service
providers.
The following options require eBGP peering with the service providers.
Option 2 –ISP Provided Defaults

The customer will use the default routes advertised by each service provider. This is preferred over static
default routes.
Option 3 – Receive Partial Routing

The next option is to receive partial routing information.
The customer should request a default route and a set of partial routes from the two service providers. The
customer should configure some form of route filtering to prevent the customer’s autonomous system from
becoming a transit path between the two service providers.
The enterprise can request the upstream providers send only their locally originated prefixes and those
prefixes for their customers.
This will result in the enterprise being able to correctly route traffic destined to either provider.
A default route directed at each provider would still be required to reach any destinations that are beyond
the immediate upstream providers and their customers. This can be resolved by asking the provider to send
default + partial routes.
This option has the advantage that outgoing traffic from the enterprise destined for the connected
providers’ locally originated networks and their directly connected networks is optimally routed. Traffic
destined for other networks may still be routed suboptimally because the router uses the default route for
these networks.
Option 4 – Full BGP Routes

The enterprise can also receive full tables from both providers. This will allow the enterprise border router
to send traffic to the upstream provider that is logically closest to the destination. This logical distance is
derived from the AS_PATH. If the AS_PATH lengths are equal and no intermediate tie-breaker applies, the
traffic will be sent to the upstream provider whose path has the lowest ROUTER_ID. This traffic can be
balanced between the two if eBGP Multi-Path is used. This method has the greatest resource requirements.
The customer should configure some form of route filtering to prevent the customer’s autonomous system
from becoming a transit path between the two service providers. This is commonly achieved by using an
as-path access list.

Accepting full routing tables requires more resources (memory / cpu) on the border router.
Again, this design still poses a single-point-of-failure scenario whereby if the enterprise border router fails
Internet connectivity is lost.
Consideration has to be taken with regard to how to advertise the routes within the enterprise AS.
Typically, you want to advertise a default via your IGP.
Refrain from implementing unfiltered redistribution of BGP routes into the enterprise IGP.

Standard Multi-Homed Network: Multiple border routers
In this scenario, the enterprise network connects to the Internet using multiple border routers and uses
multiple service providers for resilience and load sharing. Preferably, the two links to the two service
providers will not share the same physical infrastructure. This will prevent a layer 2 failure from impacting
both paths to the global Internet.
Guidelines
• This design scenario requires the enterprise to obtain their own ASN from their regional registry.
• The customer should obtain a publicly assigned block of addresses from the national registry, or work
with the two service providers to obtain a block from one of the two service providers which the other
service provider will agree to advertise.
o The customer should be aware that if the service provider from which they have obtained
an address block aggregates the supplied addresses into a larger block (a shorter prefix), traffic will
be directed to their network in an asymmetric way.
o The customer should be aware that some service providers will not advertise routes with
a prefix length longer than some local policy. Any address block obtained should fit into
the local policy for longest routable prefix length for both service providers.
• The enterprise will also need a block of address space that is large enough to pass standard peering
filters as mentioned above.
• eBGP should be configured with each of the service providers.
• All the enterprise border routers and the routers in transit between border routers should build a full
iBGP mesh.
The customer should configure a route filter of some type which prevents the customer’s autonomous
system from becoming a transit network between the two service providers. This is commonly achieved by
using an as-path access list.
Depending on the resources available one of the following options can be considered.
Option 1
Request upstream service providers to send partial routes (which will include their local prefixes and their
directly connected customer networks).
This option requires additional default routes on each of the border routers to reach all other destinations.
Option 2
Accept full routing updates from each upstream service provider.
If required, the attributes of the incoming updates can be altered to achieve required load sharing if the AS-
PATH is the same.
Static defaults can also be used in conjunction with full routes if needed depending on the eBGP
connectivity.

This option involves more complex configurations depending on the level of load sharing desired.

When accepting full routes, additional resources on the devices are required.
Service providers usually summarize the address space they assign to different connected customers and
advertise only a summary to their customers and their upstream providers. This can lead to suboptimal
inbound traffic when multi-homing to different service providers. Provider A, who owns the customer
address space, may summarize the address space to a shorter prefix length, while other providers
advertise the actual prefix (a longer prefix length than the summary). In such a case most of the inbound
traffic to the enterprise prefers provider B, because the longest match wins. This situation requires that
provider A also advertise the specific route along with the summary.
Almost all service providers use filters at their public or private peering points. These filters
allow only prefixes of a certain length or shorter. If the customer's prefix is longer than these filters
allow (that is, the address block is too small), suboptimal routing can occur or the enterprise may end up
using only one service provider for the incoming traffic.
Load-balancing scenarios can typically be achieved when routes come from the same ISP. It is important to
check resource utilization in such cases since memory usage increases significantly.
The concerns in “Stub Network Multi-homed: Multiple Border Routers” for how to advertise the eBGP
learned routes to the rest of the AS still apply.
Multi-homing Scenario Examples

The following examples demonstrate the techniques described in the above section. The illustration begins
with a simple stub network with a single ISP connection and builds on this setup to achieve more complex
scenarios.
Scenario 1: Enterprise stub network, single homed

Let us assume an enterprise network with a single BGP router that connects to the Internet using a single
SP. In this scenario, let us assume that there are two links from the border router to the SP router.
Figure 2 Single-homed Enterprise Stub Network with Border Router
This situation does not require the use of BGP. Default static routes out each link to the service provider
are sufficient. The service provider will also have static routes to the enterprise address space and
will advertise this network to its upstream providers and its customers.

Traffic is load balanced across both links. This meets the additional bandwidth requirements and also
protects against single link failure.
The caveat of this design is that the single border router becomes a single point of failure.
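As a rough sketch of the static approach, assuming two point-to-point serial links (the interface names are hypothetical and not taken from Figure 2), the enterprise border router could carry one default route per link:

```
! Illustrative only: interface names are assumptions
ip route 0.0.0.0 0.0.0.0 Serial0/0
ip route 0.0.0.0 0.0.0.0 Serial0/1
```

With both routes installed, the router shares outbound traffic across the two links, and a route is withdrawn when its interface goes down. On multi-access links such as Ethernet, a next-hop address would normally be used instead of an interface name.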
Scenario 2: Stub network multi-homed to single SP with single border router
Figure 3 Stub Network Multi-homed to Single SP and Border Router
This scenario assumes that the enterprise has a single border router with two connections to the same SP.
The connections are terminated at the service provider in different locations.
BGP should be configured to achieve load sharing and to achieve control of inbound and outbound traffic
patterns.
Bidirectional Forwarding Detection (BFD) on the circuits between the enterprise and SP may be required
to provide quicker link failure detection and BGP failover to the alternate path. There is an increasing
trend to use BFD on Ethernet circuits between a data center and the SP. Without BFD, link failure
detection can be delayed by up to 3 minutes (the default BGP hold time). BFD support on the SP and
customer equipment needs to be checked.
Use of aggressive BGP timers instead of BFD is not recommended due to the problems witnessed by
multiple customers and service providers.
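A minimal sketch of BFD-assisted BGP failure detection is shown below. The interface, addresses, AS numbers, and timer values are assumptions for illustration; the actual timers must be agreed with the SP and supported by both platforms.

```
! Illustrative only: interface, timers, AS numbers and addresses are assumptions
interface GigabitEthernet0/0
ip address 192.0.2.2 255.255.255.252
bfd interval 300 min_rx 300 multiplier 3
!
router bgp 65001
neighbor 192.0.2.1 remote-as 100
neighbor 192.0.2.1 fall-over bfd
```

With these example values a failure would be declared after roughly 900 ms (three missed 300 ms intervals), at which point BGP tears down the session without waiting for the hold timer.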

Scenario 3: Stub network multi-homed to single SP with multiple border routers
Figure 4 Stub Network Multi-homed to Single SP with Multiple Border Routers
This scenario assumes that the enterprise has multiple border routers that connect to the same SP at
different points of presence.
BGP should be configured to achieve traffic control and load sharing. Private AS numbers can be used
because only a single SP is used.
All the enterprise border routers including the ones in the transit path should run BGP and should be fully
meshed.
Subnets of the enterprise space can be advertised from each border router to achieve inbound traffic load
sharing.
The enterprise border routers should originate a default and this default should be generated on the
condition that the link to the SP is operational to avoid black holing and looping of traffic.
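One way to sketch a conditionally generated default, assuming the IGP is OSPF (the process ID is hypothetical) and that the border router learns a default route from the SP over eBGP:

```
! Illustrative only: OSPF process ID is an assumption
router ospf 1
default-information originate
```

Because the always keyword is omitted, the router advertises the default into OSPF only while a default route is present in its own routing table. If the SP link fails and the BGP-learned default is withdrawn, the OSPF default disappears as well, which avoids attracting traffic that the router can no longer forward.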
BFD may again be required in this scenario, subject to need and to support on both the SP
and enterprise equipment.

Scenario 4: Standard Network Multi-homed to Multiple ISPs with Multiple Border Routers
Figure 5 Standard Network Multi-homed to Multiple SPs with Multiple Border Routers
(Figure labels: MPLS SP A, AS 100; Internet SP, AS 500; MPLS SP B, AS 200; Inter-AS between AS100 and AS200; NAT at two points; Enterprise AS 65001.)
This scenario is the most complex in terms of connectivity. The customer has multiple border routers
connecting to multiple SPs. The MPLS and Internet SPs can also be the same SP. The large enterprise is
multi-homed to the same SP as well as to multiple SPs, depending upon its geography and the depth of its
network redundancy needs.
BGP should definitely be used to achieve load sharing and redundancy, at least at critical hub sites such as
data centers.
For this purpose, the enterprise should obtain its own AS number from the local registry, but it can use a
private AS number if it is only connecting over an MPLS network to link remote branch offices with its
data center and head office.
Private IP addresses would suffice for internal enterprise connectivity between the data center, head office,
remote branches, etc. However, for external Internet connectivity, the enterprise should obtain an address
block from the Internet SP and use NAT at its Internet gateways, unless the enterprise can afford to own a
large enough Internet address space that every device in the enterprise network has a public IP address.
The enterprise can obtain partial Internet routes or the full Internet routing table depending on the extent of
load sharing that needs to be achieved and the cost of the equipment required. The more routes accepted, the
more memory and processing power is needed at the enterprise router. Additionally, static defaults
may be required to achieve complete connectivity.
BFD may again be required at critical enterprise sites such as data centers, as explained under
scenario 2. This recommendation is subject to need and to support on both the SP and
enterprise equipment.

Load Balancing
Background
Load Balancing is the ability to divide the traffic over multiple links to achieve even distribution. Perfect
load balancing with any routing protocol is almost always impossible, and BGP is no exception. The goal
is to achieve load sharing as close to equal distribution of traffic as possible. In reality, the network only
controls the load sharing of outbound network traffic. Inbound load sharing can be achieved by
manipulation of certain BGP metrics, but this is still no guarantee of 50/50 load balancing. When load
sharing is discussed, what is really meant is managing configuration to attain optimal traffic patterns.
Guidelines
While optimizing traffic flows, inbound traffic and outbound traffic should be treated separately. The
strategies for load sharing inbound traffic and for load sharing outbound traffic are separate and often work
independent of each other.
Inbound Load Balancing

• If the enterprise is multi-homing to the same service provider, the address space can be broken down into
smaller subnets. Each subnet can then be advertised out separate links to achieve the load balancing goal.
However, this does not provide adequate level of redundancy. To achieve redundancy, these smaller subnets
can be advertised with different AS-PATH lengths to the ISP via different links.
• If the enterprise is multi-homing to different service providers, AS-PATH can be pre-pended while advertising
the address space to the service provider where heavy utilization is observed.
• Another option is to use communities to tag the enterprise address space prefixes in accordance with the
service provider’s policies so that the provider can take further action based on their values while advertising
them to their customers and upstream providers.
• Another way to influence inbound load balancing is to use an advertise-map. With an advertise-map, the customer
can monitor a prefix and, if the prefix vanishes, BGP will advertise a configured network to the service
provider.
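As an illustration of the AS-PATH prepending guideline above, the sketch below lengthens the path advertised over the more heavily used link. The AS number, neighbor address, route-map name, and the number of prepends are assumptions; the right number of prepends is usually found by observing inbound traffic after each change.

```
! Illustrative only: AS number, neighbor address and prepend count are assumptions
route-map PREPEND-OUT permit 10
set as-path prepend 65001 65001
!
router bgp 65001
neighbor 192.0.2.1 remote-as 100
neighbor 192.0.2.1 route-map PREPEND-OUT out
```

Only the router's own AS number should be prepended; remote networks that compare AS_PATH length will then tend to prefer the path through the other link.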
Outbound Load Balancing
Scenario 1
If the enterprise is multi-homed to a single service provider, simple default routes pointing out each of the
links may achieve the goal, as long as there are many distinct flows in the traffic mix.

Scenario 2
If the enterprise is multi-homed to different providers, there are several options.
- One is to accept partial routes from each of the providers and use defaults for other traffic. This may
achieve the required load sharing in some cases.
- The other option is to accept full routing updates from each of the service providers and filter routes to
receive mutually exclusive partial routes from each of the providers. AS-PATH filtering can be used to
achieve this.
- A third option is to set local preference based on some prefix distribution (finding a suitable
distribution usually takes trial and error).
In each of the above cases, it is generally good practice to have a default route along with the more specific
routes in order to achieve complete connectivity.
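The local preference option can be sketched as follows. The split shown here (the lower half of the IPv4 address space prefers ISP1) is purely an assumed starting point, as are the names, addresses, and AS numbers; in practice the distribution is adjusted after measuring link utilization.

```
! Illustrative only: prefix split, names, AS numbers and addresses are assumptions
ip prefix-list VIA-ISP1 permit 0.0.0.0/1 le 24
!
route-map ISP1-IN permit 10
match ip address prefix-list VIA-ISP1
set local-preference 200
route-map ISP1-IN permit 20
!
router bgp 65001
neighbor 192.0.2.1 remote-as 100
neighbor 192.0.2.1 route-map ISP1-IN in
```

Routes matched by the prefix list receive a local preference of 200 and are preferred over the same routes learned from the other provider, which keep the default of 100. The empty permit 20 clause accepts all remaining routes unchanged.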
Scenario 3
If multiple links are used to create multiple eBGP sessions between the enterprise router and the service
provider router for additional bandwidth and/or redundancy, there is a potential for using only one link
because eBGP sessions are tied to the physical interface address of the link to the provider.
One of the following options can be used to achieve outbound load balancing in such a scenario with
option (1) as a preferred solution.
Option 1: eBGP Multi-hop feature

eBGP multi-hop feature uses a single eBGP session between the two routers, with the eBGP session being
sourced from a loopback instead of a physical interface. A static route to the remote loopback is
configured for each interface. This provides the next-hop resolution and load balancing through recursive
routing to the next-hop.
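A minimal sketch of the enterprise side of this option follows, assuming two parallel links and hypothetical loopback and link addresses. The provider router needs the mirror-image configuration.

```
! Illustrative only: all addresses and AS numbers are assumptions
interface Loopback0
ip address 10.255.0.1 255.255.255.255
!
! One static route to the provider loopback per parallel link
ip route 10.255.0.2 255.255.255.255 192.0.2.1
ip route 10.255.0.2 255.255.255.255 192.0.2.5
!
router bgp 65001
neighbor 10.255.0.2 remote-as 100
neighbor 10.255.0.2 ebgp-multihop 2
neighbor 10.255.0.2 update-source Loopback0
```

BGP installs a single path whose next hop is the provider loopback; the two static routes then resolve that next hop over both links, which is where the load sharing takes place.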
Option 2: eBGP Multipath feature

The eBGP multipath feature provides another solution to this problem. An eBGP session is configured
between the two routers for each link. The eBGP sessions are tied directly to the interface addresses. The
result is both routers receive multiple paths, one for each link, which are identical except for neighbor
address from which the path was received. The eBGP multipath feature will allow the router to install all
paths up to the maximum-paths value configured.
Scenario 4 (iBGP Multipath)

Similar to the eBGP multi-path feature, iBGP multipath provides for load balancing iBGP traffic between
iBGP neighbors. The iBGP multipath load sharing feature enables the BGP speaking router to select
multiple iBGP paths as best paths to a destination. The best paths or multi-paths are then installed in the IP
routing table of the router.
For multiple paths to the same destination to be considered as multipaths, the following criteria must be
met:
- All attributes must be the same. The attributes include weight, local preference, autonomous system
path (entire attribute and not just length), origin code, Multi Exit Discriminator (MED), and Interior
Gateway Protocol (IGP) distance.

- The next hop router for each multipath must be different.
Even if the criteria are met and multiple paths are installed, the BGP speaking router will still designate one
of the multipaths as the best path and advertise only this best path to its neighbors.
BGP load sharing over an MPLS core does not work properly in 12.0S. For example, with two BGP next hops and
two core links, CEF does not use all four paths (two paths to each of the two next hops, each via two outgoing
interfaces), because this requires two levels of recursion in CEF, which is not supported in 12.0S. Also be
aware of CSCsb52253, in which CEF/LFIB picks an incorrect LDP label for the BGP next hop, leading to a black
hole. The suggested fix is to use "send-label" with BGP. This problem is not present in 12.2S and IOS XR.
Scenario 5 (Link Bandwidth)

The BGP Link Bandwidth feature is used to advertise the bandwidth of an autonomous system exit link as
an extended community. The BGP Link Bandwidth feature is supported by the internal BGP (iBGP) and
external BGP (eBGP) multi-path features. The link bandwidth extended community indicates the
preference of an autonomous system exit link in terms of bandwidth. The link bandwidth extended
community attribute may be propagated to all iBGP peers and used with the BGP multi-path features to
configure unequal cost load balancing. When a router receives a route from a directly connected external
neighbor and advertises this route to iBGP neighbors, the router may advertise the bandwidth of that link.
Hence the distribution of traffic occurs in proportion to the egress bandwidth.

When load balancing by multi-homing to multiple service providers, multiple iterations may be needed
to achieve the desired load sharing. Over time, traffic flows may change due to changing
applications and application connectivity requirements in the enterprise. At that time the whole exercise
may need to be repeated. Load balancing almost always brings some amount of asymmetric flow, with
traffic for a flow exiting one link while return traffic for the same flow enters the other link, so
this should be accounted for while designing networks.

Details
Using Advertise Map for Inbound Load Balancing

A way to influence inbound load balancing is to use an advertise-map. With an advertise-map, the customer can
monitor a prefix and, if the prefix vanishes, BGP will advertise a configured network to the service
provider. Using this feature, the enterprise router can be configured to advertise the address space of the
other ISP (ISP2) to ISP1. When multi-homing with two ISPs, a similar configuration can be done to
advertise one ISP's address space to the other and vice versa in case of a failure of either of the links. This
technique is also called conditional advertisement.
Figure 6 Use of advertise-map to influence inbound traffic
When all links are working, 1.10.0.0/16 is advertised to ISP1 and 10.15.7.0/24 is advertised to ISP2.
The advertise-map monitors the prefix of the link between R2 and ISP2. When the ISP2 link fails and that
prefix vanishes from the routing table, 10.15.7.0/24 is advertised to ISP1 as well. Make sure that 10.15.0.0 is
not learnt through ISP1.
Example
router bgp 500
neighbor <R1> advertise-map am non-exist-map bb
!
! Advertise this when...
access-list 1 permit 10.15.7.0
! ...this disappears
access-list 2 permit 10.15.0.0
route-map am permit 10
match ip address 1
route-map bb permit 10
match ip address 2
This scenario achieves the purpose of inbound load sharing and in case of a failure the address space is still
reachable via the other ISP.
The other way of doing this is to use AS-PATH pre-pending. The disadvantage of AS-PATH pre-pending
is that it doubles the prefixes in the upstream providers’ BGP tables.

Using EBGP Multipath for Outbound Load Balancing
EBGP multipath can be used to load balance traffic across two links from an enterprise to the same ISP.
By default BGP chooses only one of the paths, and the tie breaker is the router ID when all other attributes
happen to be the same.
To overcome this, BGP multipath, when enabled, installs both routes in the routing table so that
load balancing can be achieved.
When enabled, BGP can install up to eight equal paths in the IP routing table. This requires that all other
attributes for the paths be the same.
Figure 7 Using EBGP Multipath for Outbound Load Balancing
Sample Configuration
Enterprise Router
router bgp 65100
no synchronization
bgp log-neighbor-changes
network 172.18.0.0
maximum-paths 3
no auto-summary
!
ISP Router
router bgp 100
no synchronization
bgp log-neighbor-changes
network 172.19.0.0
maximum-paths 3
no auto-summary
!
Using BGP Link Bandwidth for Outbound Load Balancing

The BGP Link Bandwidth feature is used to advertise the bandwidth of an autonomous system exit link as
an extended community. The BGP Link Bandwidth feature is supported by the internal BGP (iBGP) and
external BGP (eBGP) multi-path features. The link bandwidth extended community indicates the
preference of an autonomous system exit link in terms of bandwidth. The link bandwidth extended
community attribute may be propagated to all iBGP peers and used with the BGP multi-path features to
configure unequal cost load balancing. When a router receives a route from a directly connected external
neighbor and advertises this route to iBGP neighbors, the router may advertise the bandwidth of that link.
Link Bandwidth is an extended community of type 0x0004.
Unequal cost load balancing with BGP using link bandwidth and BGP multi-path is supported in 12.2.
Link bandwidth is propagated to IBGP peers only. The attribute is stripped while sending to EBGP peers.
When used along with EBGP multi-path the bandwidth that is advertised to IBGP peers is the sum of DMZ
link bandwidths of all EBGP multi-paths.
Figure 8 Using BGP Link Bandwidth for Outbound Load Balancing
Sample Configuration
R1#
router bgp 100
bgp dmzlink-bw
maximum-paths ibgp 6

R2#
router bgp 100
bgp dmzlink-bw
maximum-paths 6
neighbor 10.10.10.1 send-community extended
neighbor 172.4.4.1 dmzlink-bw
R3#
router bgp 100
neighbor 20.1.1.1 send-community extended

Route Aggregation
Background
Aggregation is the process of combining the characteristics of several different routes in such a way that a
single route can be advertised. Typically, an address block with a shorter mask is advertised in order to
summarize the various sub-prefixes used in the autonomous system.
Benefits
Aggregation reduces the number of prefixes, which saves memory because the size of the
BGP and IP routing tables is reduced. Service providers aggregate the routes for address space assigned to
customers in order to minimize the size of the Internet route table.
Aggregation of routing information also increases stability: the aggregate stays even if specifics come and go.
This minimizes flapping of routes and reduces the need for route dampening.
When a route is generated as an aggregate route using aggregate-address, the router will fill in the
AGGREGATOR attribute of the route with its own AS number and router ID. This can be useful for
troubleshooting. If you find someone is mistakenly advertising your address space as part of an aggregate,
you can track down the AS and router that is doing it.
Guidelines
• An enterprise network’s address space should allow aggregation of its addresses to reduce the amount
of extra information to be carried by each upstream provider. Aggregation requires generating a stable
summary (or supernet) route that covers all of the subnets in the network. There is no need for the
Internet to know about more specific routes in a particular network or AS.
• The first step in configuring route aggregation with BGP is to specify the aggregate-address command
to define the aggregate prefix to be generated. The command includes both net and mask.
• ISPs can also use the network command to generate these summary routes. However, they may choose
to use the aggregate-address in order to include AS-SET and COMMUNITY summarization
information in the aggregate route. The network command will not do this. If you use a network
statement to generate your summary routes, the atomic aggregate is not set.
• When using the network command, you also need a static route to null0. The great advantage of using the
aggregate-address command is that the aggregate is dynamic (it may or may not exist and its attributes may
change); on the other hand, it is less stable.
• When using the aggregate-address command, if the as-set keyword is not used, AS-PATH
information from the more-specific routes is lost. To signify this, the route sets the ATOMIC-
AGGREGATE attribute in the aggregate BGP route. Once set, this attribute should never be removed
by other routers that forward the route throughout the Internet.
When the as-set keyword is included, it causes the router to generate an AS-SET within the AS-PATH.
An AS-SET includes AS-PATH information from all more specific routes that contributed to the
aggregate. Besides generating an AS-SET, the router will also include BGP community information from
all of the more specific routes in the community attribute of the aggregate route.
For aggregates, the AS sequence information associated with the contributing routes is meaningless.
However, the AS path is still needed for loop detection, so the aggregating router inserts all ASes that
contribute to the aggregate route into the (unordered) AS SET, say {1 2 3}, and then puts its own AS
number, say 4, into the AS SEQUENCE: 4 {1 2 3}.
For example, if a route has an AS PATH of 60 50 {10 20}, it means that it was learned from AS 60, the
aggregate was generated by AS 50, and it was contributed to by AS 10 and 20 (since 10 and 20 are in the AS set).
• By default, the aggregate address command does not filter out the more specific BGP routes. To do
this, the summary-only keyword needs to be used.
• More specific routes than the aggregate should be filtered using a prefix-list.
• Use a route-map to adjust other BGP attributes. For example, if a network connects to ISPs in two
different locations, and generates two aggregates, the MED attribute can be set so that traffic for one
aggregate will come in through one link, and traffic to the other will come in through the other link
(with both links providing a backup to the other).
• For the aggregate address to enter the BGP table there must be at least one more specific route in the BGP
table. The more specific route can be injected in the BGP routing table by incoming updates from other
autonomous systems, can be redistributed from an IGP, or can be established by the network router
configuration command.
• In certain instances the provider will do the aggregation for the network, also known as proxy
aggregation. When aggregating, the upstream provider must ensure that all the components of the
aggregate are contained within its administrative domain. This situation calls for specialized
configurations to ensure connectivity to all the components of the aggregate (for example, a /16) inside and
outside the provider AS. Specifically, the provider may have to leak the "holes" of the non-provider /24s back
into its AS and to all its customers to ensure reachability.
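The as-set and summary-only keywords described in the guidelines above can be combined on one aggregate. The following is a sketch only; the AS number and prefixes are assumptions chosen to match the style of the other examples in this document.

```
! Illustrative only: AS number and prefixes are assumptions
router bgp 65000
network 172.16.50.0 mask 255.255.255.0
aggregate-address 172.16.0.0 255.255.0.0 as-set summary-only
```

With as-set, the aggregate carries the AS numbers and communities of its contributing routes, so its attributes can change whenever a contributor changes. This is a trade-off against the stability that aggregation otherwise provides.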

BGP will automatically summarize on classful boundaries, but this default setting should be turned off
using no auto-summary in networks relying on CIDR and VLSM since it takes control of summarization
out of the hands of the engineer. Also, at least one valid sub-prefix must be present in the routing table for
the aggregate to be considered valid.
Advertising summarized information to an upstream provider could result in sub-optimal routing decisions
for inbound traffic through the provider into the network. This is applicable in a scenario where the
network is multi-homed to a single provider at different locations or multi-homed to different providers
when the enterprise has provider independent address space.
Details
By default, the aggregate address command does not filter out the more specific BGP routes. To do this,
the summary-only keyword needs to be used.
Different methods of aggregation can be used depending on how the summaries and specifics need to be
propagated. The way aggregates are formed and advertised and whether they carry with them more specific
routes will influence traffic patterns and sizes of BGP routing tables. As mentioned earlier, BGP
aggregation applies to routes that exist in the BGP table. An aggregate can be sent if at least one more
specific route of that aggregate exists in the BGP table.
The suppress-map is another form of route-map that can be used to indicate the more specific routes to be
suppressed or the more specific routes to be allowed. When a route is permitted through the suppress map,
the route is suppressed. If the route is not permitted (denied), the route is not suppressed--that is, allowed.
Note that the deny logic here does not prevent the route from being advertised; rather, it prevents it from
being suppressed. Using the suppress-map keyword creates the aggregate route but suppresses
advertisement of specified routes. You can use the match clauses of route maps to selectively suppress
some more-specific routes of the aggregate and leave others unsuppressed. IP access lists and autonomous
system path access lists match clauses are supported.
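A minimal suppress-map sketch follows, assuming that only 172.16.50.0/24 should be hidden while other more specific routes of 172.16.0.0/16 continue to be advertised alongside the aggregate. The access-list number, route-map name, and prefixes are hypothetical.

```
! Illustrative only: prefixes, ACL number and route-map name are assumptions
access-list 10 permit 172.16.50.0 0.0.0.255
!
route-map SUPPRESS-50 permit 10
match ip address 10
!
router bgp 65000
aggregate-address 172.16.0.0 255.255.0.0 suppress-map SUPPRESS-50
```

Routes permitted by the route map are suppressed; routes that fall through to the implicit deny remain advertised.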

Aggregation using the network command
router bgp 65000
network 172.16.0.0 mask 255.255.0.0
ip route 172.16.0.0 255.255.0.0 null0 254
The preceding configuration places a static instance of 172.16.0.0/16 in the routing table. Note that the
static entry is pointing to null0 (bit bucket). A high value for the administrative distance (254) can be
optionally used so that the static route gets used only as a last resort within the router. However, as long as
more specific routes are available, the router will always prefer a more specific route first.
Aggregation using the aggregate-address command

As explained, the aggregate-address command provides more control over propagation of summaries and
specifics. Example:
router bgp 65000
network 172.16.50.0 mask 255.255.255.0
aggregate-address 172.16.0.0 255.255.0.0 summary-only
The above configuration uses the aggregate-address command to aggregate all the more specific routes of
172.16.0.0/16 into a single address. The summary-only argument at the end of the aggregate-address
command tells the router to advertise the aggregate only. The network 172.16.50.0 mask 255.255.255.0
command allows a more specific route to be originated by the router so that aggregation can work.
It is important to note that the aggregate is created only if components are learned.

Other Miscellaneous Areas
Prefix Lists
Prefix lists are used to filter IP prefixes and can match both the prefix number and the prefix length (mask).
Compared to access lists, prefix lists provide a significant performance improvement because they require
fewer processing cycles. They also provide greater flexibility and support incremental list updates.
Background
The ip prefix-list command is used to configure IP prefix filtering. Prefix lists are configured with permit
or deny keywords to either permit or deny the prefix based on the matching condition. A prefix list consists
of an IP address and a bit mask. The IP address can be a classful network, a subnet, or a single host route.
The bit mask is entered as a number from 1 to 32. An implicit deny is applied to traffic that does not match
any prefix-list entry.
Prefix lists are configured to match an exact prefix length or a prefix range. The ge and le keywords are
used to specify a range of the prefix lengths to match, providing more flexible configuration than can be
configured with just the network/length argument. The prefix list is processed using an exact match when
neither the ge nor le keyword is entered. If only the ge value is entered, the range is from the value entered for
the ge ge-length argument to a full 32-bit length. If only the le value is entered, the range is from the value
entered for the network/length argument to the le le-length argument. If both the ge ge-length and le le-length
keywords and arguments are entered, the range falls between the values used for the ge-length and le-length
arguments. The following formula shows this behavior:
network/length < ge ge-length <= le le-length <= 32
A prefix list is configured with a name and/or sequence number. One or the other must be entered when
configuring this command. If a sequence number is not entered a default sequence number of 5 is applied
to the prefix list and subsequent prefix list entries will increment by 5 (for example, 5, 10, 15, and
onwards). If a sequence number is entered for the first prefix list entry but not subsequent entries, then the
subsequent entries will also be incremented by 5 (For example, if the first configured sequence number is
3, then subsequent entries will be 8, 13, 18, and onwards). Default sequence numbers can be suppressed by
entering the no form of this command with the seq keyword.
Prefix lists are evaluated starting with the lowest sequence number, and evaluation continues down the list
until a match is made. Once a match is made that covers the network, the permit or deny statement is applied to
that network and the rest of the list is not evaluated.
The prefix list is applied to inbound or outbound updates for specific peer by entering the neighbor prefix-
list command. Prefix list information and counters are displayed in the output of the show ip prefix-list
command. Prefix-list counters can be reset by entering the clear ip prefix-list command.
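The sketch below shows a named prefix list with explicit sequence numbers applied to inbound updates from one peer. The list name, AS numbers, and neighbor address are assumptions for illustration.

```
! Illustrative only: list name, AS numbers and neighbor address are assumptions
ip prefix-list FROM-ISP seq 5 deny 0.0.0.0/0
ip prefix-list FROM-ISP seq 10 permit 0.0.0.0/0 le 24
!
router bgp 65001
neighbor 192.0.2.1 remote-as 100
neighbor 192.0.2.1 prefix-list FROM-ISP in
```

Entry 5 rejects the default route, entry 10 accepts any other prefix of /24 or shorter, and everything longer is dropped by the implicit deny. Hit counts for each entry can then be checked with show ip prefix-list detail FROM-ISP and reset with clear ip prefix-list FROM-ISP.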
Examples
The following examples show how a prefix list can be used.
To deny the default route 0.0.0.0/0:
ip prefix-list abc deny 0.0.0.0/0

To permit the prefix 10.0.0.0/8:
ip prefix-list abc permit 10.0.0.0/8
The following examples show how to specify a group of prefixes.
To accept a mask length of up to 24 bits in routes with the prefix 192/16:
ip prefix-list abc permit 192.168.0.0/16 le 24
To deny mask lengths greater than 25 bits in routes with the prefix 192/16:
ip prefix-list abc deny 192.168.0.0/16 ge 25
To permit mask lengths from 8 to 24 bits in all address space:
ip prefix-list abc permit 0.0.0.0/0 ge 8 le 24
To deny mask lengths of 25 bits or greater in all address space:
ip prefix-list abc deny 0.0.0.0/0 ge 25
To deny all routes with a prefix of 10/8:
ip prefix-list abc deny 10.0.0.0/8 le 32
To deny all masks with a length of 25 bits or greater in routes with a prefix of 192.168.1/24:
ip prefix-list abc deny 192.168.1.0/24 ge 25
To permit all routes with a prefix of 0/0:
ip prefix-list abc permit 0.0.0.0/0 le 32
More information on prefix list configuration can be found at:
http://www.cisco.com/en/US/docs/ios/12_2/ip/configuration/guide/1cfbgp.html#wp1001470
Conditional Route Injection
Background
BGP uses conditional route injection to inject more specific prefixes into a BGP routing table over less
specific prefixes that were selected through normal route aggregation. These more specific prefixes can be
used to provide a finer granularity of traffic engineering or administrative control than is possible with
aggregated routes.
The BGP Conditional Route Injection feature allows originating a prefix into a BGP routing table without
the corresponding match. This feature allows more specific routes to be generated based on administrative
policy or traffic engineering information in order to provide more specific control over the forwarding of
packets to these more specific routes, which are injected into the BGP routing table only if the configured
conditions are met. Enabling this feature will improve the accuracy of common route aggregation by
conditionally injecting or replacing less specific prefixes with more specific prefixes. Only those prefixes
that are equal to or more specific than the original prefix may be injected. The BGP Conditional Route
Injection feature is enabled with the bgp inject-map exist-map command. This command uses two route
maps (inject-map and exist-map) to install one or more more-specific prefixes into a BGP routing table. The
exist-map specifies the prefixes that the BGP speaker will track. The inject-map defines the prefixes that
will be created and installed into the local BGP table.
BGP Conditional Route Injection should not be confused with the existing Conditional Advertisement
feature. There are fundamental differences between the two. Conditional Advertisement is neighbor
centric, advertising certain networks that are already in the BGP table to a given EBGP neighbor based on
the existence of another particular network in the BGP table. Conditional Route-Injection creates new
routes in the BGP table if a particular route does exist in the BGP table. Route-injection is not neighbor
centric.
Benefits
The BGP Conditional Route Injection feature allows injecting more specific prefixes into a BGP routing
table over less specific prefixes that were selected through normal route aggregation. These more specific
prefixes can be used to provide a finer granularity of traffic engineering or administrative control than is
possible with aggregated routes.
Guidelines
The BGP Conditional Route Injection feature is based on the injection of a more specific prefix into the
BGP routing table when a less specific prefix is present. If conditional route injection is not working
properly, check the following:
- If the aggregate prefix exists but conditional route injection does not occur, verify that the
aggregate prefix is being received from the correct neighbor and the prefix list identifying that
neighbor is a /32 match.
- Verify that the prefix that is being injected is not outside of the scope of the aggregate prefix.
- Ensure that the inject route map is configured with the set ip address command and not the match ip
address command.
The following example configures conditional route injection for the inject-map named
ORIGINATE and the exist-map named LEARNED_PATH.
router bgp 109
 bgp inject-map ORIGINATE exist-map LEARNED_PATH
!
route-map LEARNED_PATH permit 10
 match ip address prefix-list ROUTE
 match ip route-source prefix-list ROUTE_SOURCE
!
route-map ORIGINATE permit 10
 set ip address prefix-list ORIGINATED_ROUTES
 set community 14616:555 additive
!
ip prefix-list ROUTE permit 10.1.1.0/24
!
ip prefix-list ORIGINATED_ROUTES permit 10.1.1.0/25
ip prefix-list ORIGINATED_ROUTES permit 10.1.1.128/25
!
ip prefix-list ROUTE_SOURCE permit 10.2.1.1/32
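The decision made by this configuration can be modeled with a short Python sketch. It is an illustration only, assuming the BGP table is reduced to (prefix, route source) pairs; the function name routes_to_inject and the sample tables are hypothetical. The prefixes and the route source are taken from the example above.

```python
from ipaddress import ip_address, ip_network


def routes_to_inject(bgp_table, exist_prefix, route_source, inject_prefixes):
    """Return the prefixes to inject, or an empty list if the condition fails.

    bgp_table is a list of (prefix, source) string pairs standing in for
    the routes currently present in the local BGP table.
    """
    tracked = ip_network(exist_prefix)
    source = ip_address(route_source)
    # exist-map: the tracked prefix must be present AND learned from the
    # expected route source (the /32 match in ROUTE_SOURCE).
    condition_met = any(
        ip_network(prefix) == tracked and ip_address(src) == source
        for prefix, src in bgp_table
    )
    if not condition_met:
        return []
    # inject-map: only prefixes inside the scope of the tracked prefix qualify.
    return [p for p in inject_prefixes if ip_network(p).subnet_of(tracked)]


originated = ["10.1.1.0/25", "10.1.1.128/25"]
learned = [("10.1.1.0/24", "10.2.1.1")]
wrong_source = [("10.1.1.0/24", "10.2.1.2")]

print(routes_to_inject(learned, "10.1.1.0/24", "10.2.1.1", originated))
print(routes_to_inject(wrong_source, "10.1.1.0/24", "10.2.1.1", originated))
```

The second call returns an empty list, which corresponds to the first troubleshooting item above: the aggregate exists, but it was not received from the neighbor identified by the route-source prefix list.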
BGP Deterministic-med
Background
As stated in RFC 1771, MED is an optional non-transitive attribute that is a four-octet non-negative
integer. The value of this attribute may be used by a BGP speaker's decision process to discriminate
among multiple exit points to a neighboring autonomous system. Enabling the bgp deterministic-med
command ensures that the MED is compared when choosing among routes advertised by different
peers in the same autonomous system. It also removes any dependency of MED-based path decisions on
the order in which routes were received: the MED comparison is made across all routes
received from the same AS, and the best path is chosen to reach the routers in that AS.
Benefits
Using an example will illustrate the benefits of bgp deterministic-med. For this diagram R1 has a
loopback (BGP ID of 1.1.1.1), R2 has a loopback (BGP ID of 2.2.2.2), and so on.
Figure 9 BGP Deterministic-med Example
In some circumstances it is possible for a router to choose different paths as the "best" path depending on
the order in which the route advertisements were received. For the topology above, let's look at the routing
table from R1 for network 70.0.0.0/8. Note that the MED values indicated on the diagram are only
applied to the 70.0.0.0/8 advertisement.
R1# show ip bgp 70.0.0.0

BGP routing table entry for 70.0.0.0/8, version 4
Paths: (3 available, best #3, table Default-IP-Routing-Table)
Advertised to non peer-group peers:
2.2.2.2
200 400
3.3.3.3 from 3.3.3.3 (3.3.3.3)
Origin IGP, metric 0, localpref 100, valid, internal
300 400
2.2.2.2 from 2.2.2.2 (2.2.2.2)
Origin IGP, metric 30, localpref 100, valid, external
300 400
4.4.4.4 from 4.4.4.4 (4.4.4.4)
Origin IGP, metric 20, localpref 100, valid, internal, best
R1#
With the routes in this order R1 selects the route from 4.4.4.4 as best based on the following:
- The path from 2.2.2.2 is better than the path from 3.3.3.3 because the route through 2.2.2.2 is
external and the route through 3.3.3.3 is internal (Step 7). Note that because the next hop AS is not
the same for these two paths, MED is not compared.
- Now the path from 4.4.4.4 and 2.2.2.2 will be compared. The route from 4.4.4.4 will win because
it has a lower MED value (Step 6). MED is compared in this case because the two paths share the
same next hop AS.
Now let's clear our 4.4.4.4 neighbor so that the order the routes were received is changed:
R1#clear ip bgp 4.4.4.4

Flag: 0x240
3.3.3.3 4.4.4.4
300 400
4.4.4.4 from 4.4.4.4 (4.4.4.4)
200 400
3.3.3.3 from 3.3.3.3 (3.3.3.3)
300 400
2.2.2.2 from 2.2.2.2 (2.2.2.2)
Origin IGP, metric 30, localpref 100, valid, external, best
R1#
This is a problem because the 2.2.2.2 route is now selected as best. This inconsistency in selecting the
best route can lead to routing loops in the network. Let's take a look at why the 2.2.2.2 route was chosen
as best:
- The path from 3.3.3.3 is better than the path from 4.4.4.4 because 3.3.3.3 has a lower peer address
(Step 13). MED is not compared because the two paths are from different ASes.
- The path from 2.2.2.2 is better than the path from 3.3.3.3 because 2.2.2.2 is external and the route
through 3.3.3.3 is internal (Step 7). MED is not compared because the two paths are from different ASes.
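The order dependence described above can be reproduced with a small Python model. It is a sketch under simplifying assumptions: only three tie-breakers are modeled (MED within the same neighboring AS, external over internal, lowest router ID), paths are compared pairwise in table order, and the class and function names are hypothetical. The attribute values come from the R1 output shown earlier.

```python
from dataclasses import dataclass
from ipaddress import IPv4Address


@dataclass(frozen=True)
class Path:
    peer: str          # router ID of the advertising peer
    neighbor_as: int   # first AS in the AS path
    med: int
    external: bool     # True for an EBGP-learned path


def better(a, b):
    """Return the preferred path using a reduced set of tie-breakers."""
    # MED is only compared when both paths share the same neighboring AS.
    if a.neighbor_as == b.neighbor_as and a.med != b.med:
        return a if a.med < b.med else b
    # External paths are preferred over internal paths.
    if a.external != b.external:
        return a if a.external else b
    # Final tie-breaker in this model: lowest router ID.
    return a if IPv4Address(a.peer) < IPv4Address(b.peer) else b


def sequential_best(paths):
    """Compare paths pairwise in table order, carrying the winner forward."""
    best = paths[0]
    for candidate in paths[1:]:
        best = better(best, candidate)
    return best


p2 = Path("2.2.2.2", 300, 30, external=True)
p3 = Path("3.3.3.3", 200, 0, external=False)
p4 = Path("4.4.4.4", 300, 20, external=False)

first_order = sequential_best([p3, p2, p4])   # order of the first output
second_order = sequential_best([p4, p3, p2])  # order after clearing 4.4.4.4
print(first_order.peer, second_order.peer)    # 4.4.4.4 2.2.2.2
```

The same three paths produce two different winners, which is exactly the inconsistency that deterministic-med is meant to remove.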
In order for R1 to consistently choose the same best path we need to configure deterministic-med as
shown below:
R1#
!
router bgp 100
bgp deterministic-med
!
Deterministic-med forces BGP to do the following:
- Group the entries in the BGP table into sub-groups based on the next hop AS.
- For each sub-group, order the entries within that sub-group based on MED.
- For each sub-group, select the best route.
- Compare the best route from each sub-group in order to select the best route.
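The four steps above can be sketched in Python as follows. This is a simplified model, not the IOS implementation: it assumes that within a sub-group paths are ordered by MED, then external over internal, then lowest router ID, and that MED is ignored when the sub-group winners are compared. The names Path and deterministic_best are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass
from ipaddress import IPv4Address
from itertools import permutations


@dataclass(frozen=True)
class Path:
    peer: str          # router ID of the advertising peer
    neighbor_as: int   # first AS in the AS path
    med: int
    external: bool     # True for an EBGP-learned path


def group_key(p):
    # Inside one neighbor-AS sub-group: lowest MED, then external
    # (False sorts before True), then lowest router ID.
    return (p.med, not p.external, int(IPv4Address(p.peer)))


def cross_group_key(p):
    # Across sub-groups the neighboring AS differs, so MED is not compared.
    return (not p.external, int(IPv4Address(p.peer)))


def deterministic_best(paths):
    groups = defaultdict(list)
    for p in paths:                      # step 1: group by neighboring AS
        groups[p.neighbor_as].append(p)
    winners = [min(g, key=group_key)     # steps 2-3: best path per sub-group
               for g in groups.values()]
    return min(winners, key=cross_group_key)  # step 4: compare the winners


paths = [
    Path("2.2.2.2", 300, 30, external=True),
    Path("3.3.3.3", 200, 0, external=False),
    Path("4.4.4.4", 300, 20, external=False),
]

# Every arrival order yields the same winner.
results = {deterministic_best(list(order)).peer for order in permutations(paths)}
print(results)  # {'3.3.3.3'}
```

Because each sub-group is reduced to a single winner before any cross-AS comparison, the result no longer depends on the order in which the paths were received.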
Let's look at the BGP table in R1 after configuring deterministic-med:

2.2.2.2
200 400
3.3.3.3 from 3.3.3.3 (3.3.3.3)
300 400
4.4.4.4 from 4.4.4.4 (4.4.4.4)
300 400
2.2.2.2 from 2.2.2.2 (2.2.2.2)

R1#
The 3.3.3.3 entry becomes the best path due to the following:
- The 3.3.3.3 path is the best for all routes with 200 as the first hop AS.
- The 4.4.4.4 path is the best for all routes with 300 as the first hop AS. This path beats the 2.2.2.2
path because 4.4.4.4 has a lower MED (Step 6).
- Now we compare the 3.3.3.3 path to the 4.4.4.4 path. The 3.3.3.3 path wins because it has a lower
peer ID (Step 13).
If we clear our 2.2.2.2 peer, which would normally cause a change in what is considered the best path, we
can see that 3.3.3.3 is still the best path:
R1#clear ip bgp 2.2.2.2

R1#show ip bgp 70.0.0.0
2.2.2.2
200 400
3.3.3.3 from 3.3.3.3 (3.3.3.3)
300 400
4.4.4.4 from 4.4.4.4 (4.4.4.4)
300 400
2.2.2.2 from 2.2.2.2 (2.2.2.2)
R1#
The path from 3.3.3.3 is now considered the best every time, because the paths are ordered the same way
each time. Although the above situation is rare, it does sometimes happen. This should be enough proof
to convince you and your customers to ALWAYS run deterministic-med. Another requirement when
running deterministic-med is that every BGP router in the AS must also have deterministic-med
configured.
Guidelines
BGP deterministic-med is disabled by default and it is recommended to ALWAYS enable BGP
deterministic-med.
Configuration is shown below:
router bgp 100
bgp deterministic-med

BGP Router Identifier
Background
The BGP Router Identifier is a 4-byte field in the BGP Open message used to identify a BGP speaker.
Benefits
Deterministically setting the BGP Router Identifier is beneficial when considering the following scenarios.
- If the BGP router ID is not manually set, changing a router's IP address can cause a BGP router ID
change and reset the router's BGP peering sessions.
- When two equal-cost BGP routes are considered by the BGP Best Path Algorithm, one selection
criterion is the BGP Router Identifier. (The route with the lowest BGP RID is selected.)
- BGP synchronization requires that BGP and OSPF router identifiers match when redistributing
routes between protocols.
- A router running only IPv6 requires a BGP Router Identifier be manually set.
- BGP ‘ORIGINATOR_ID’ and ‘CLUSTER_LIST’ path attributes are set to the BGP Router
Identifier by default.
Guidelines
By default, the BGP Router Identifier is set to the highest IP address assigned to a loopback interface
on the router. If no loopback interface is configured, the highest IP address configured on a physical
interface is used.
The ‘bgp router-id <ip address>’ command should be used to deterministically set the router's BGP Router
ID to an IP address configured on a loopback interface.
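The selection order described above can be expressed as a short Python sketch. It is a model for illustration, assuming interfaces are given as a name-to-address mapping and that loopbacks are recognized by an interface name starting with "Loopback"; the function name select_router_id and the sample addresses are hypothetical.

```python
from ipaddress import IPv4Address


def select_router_id(interfaces, configured=None):
    """Pick a router ID following the order described in the text.

    interfaces maps interface name -> IPv4 address string.
    configured stands in for a manually set 'bgp router-id' value.
    """
    if configured is not None:
        return configured                      # manual setting always wins
    loopbacks = [addr for name, addr in interfaces.items()
                 if name.lower().startswith("loopback")]
    candidates = loopbacks or list(interfaces.values())
    if not candidates:
        return None                            # no IPv4 address available
    return str(max(IPv4Address(addr) for addr in candidates))


with_loopback = {"Loopback0": "1.1.1.1", "GigabitEthernet0/0": "192.0.2.1"}
physical_only = {"GigabitEthernet0/0": "192.0.2.1",
                 "GigabitEthernet0/1": "198.51.100.1"}

print(select_router_id(with_loopback))               # 1.1.1.1
print(select_router_id(physical_only))               # 198.51.100.1
print(select_router_id(physical_only, "10.0.0.1"))   # 10.0.0.1
```

The second call shows why a manual setting is recommended: without a loopback, readdressing a physical interface changes the result.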

Changing the BGP Router ID on an active BGP peer will force the session to be reset.
BGP Log Neighbor Changes
Background
BGP Log Neighbor Changes enables the logging of neighbor adjacency changes and/or resets. Messages
are written locally to the router's logging buffer or remotely to a Syslog server.
Benefits
Logging changes in BGP neighbor state is useful for identifying and troubleshooting BGP peering issues.

Guidelines
Logging of BGP neighbor adjacency changes is disabled by default.
Logging of neighbor adjacency state changes should be enabled using the ‘bgp log-neighbor-changes’
command.
