Border Gateway Protocol

Border Gateway Protocol (BGP) is a standardized exterior gateway protocol designed to exchange routing and reachability information among autonomous systems (AS) on the Internet. BGP is classified as a path-vector routing protocol, and it makes routing decisions based on paths, network policies, or rule-sets configured by a network administrator. BGP used for routing within an autonomous system is called Interior Border Gateway Protocol, or Internal BGP (iBGP). In contrast, the Internet application of the protocol is called Exterior Border Gateway Protocol, or External BGP (eBGP).


History

The Border Gateway Protocol was sketched out in 1989 by engineers on the back of "three ketchup-stained napkins", and is still known as the ''three-napkin protocol''. It was first described in 1989 in RFC 1105, and has been in use on the Internet since 1994. IPv6 BGP was first defined in 1994, and it was improved in 1998.

The current version of BGP is version 4 (BGP4), which was published as RFC 4271 in 2006. RFC 4271 corrected errors, clarified ambiguities and updated the specification with common industry practices. The major enhancement was the support for Classless Inter-Domain Routing (CIDR) and the use of route aggregation to decrease the size of routing tables. The new RFC allows BGP4 to carry a wide range of IPv4 and IPv6 "address families". This capability is also called the Multiprotocol Extensions, or Multiprotocol BGP (MP-BGP).


Operation

BGP neighbors, called peers, are established by manual configuration among routers to create a TCP session on port 179. A BGP speaker sends 19-byte keep-alive messages every 30 seconds (protocol default value, tunable) to maintain the connection. Among routing protocols, BGP is unique in using TCP as its transport protocol.

When BGP runs between two peers in the same autonomous system (AS), it is referred to as ''Internal BGP'' (''iBGP'' or ''Interior Border Gateway Protocol''). When it runs between different autonomous systems, it is called ''External BGP'' (''eBGP'' or ''Exterior Border Gateway Protocol''). Routers on the boundary of one AS exchanging information with another AS are called ''border'' or ''edge routers'' or simply ''eBGP peers'' and are typically connected directly, while ''iBGP peers'' can be interconnected through other intermediate routers. Other deployment topologies are also possible, such as running eBGP peering inside a VPN tunnel, allowing two remote sites to exchange routing information in a secure and isolated manner.

The main difference between iBGP and eBGP peering is in the way routes that were received from one peer are typically propagated by default to other peers:
* New routes learned from an eBGP peer are re-advertised to all iBGP and eBGP peers.
* New routes learned from an iBGP peer are re-advertised to all eBGP peers only.

These route-propagation rules effectively require that all iBGP peers inside an AS are interconnected in a full mesh with iBGP sessions.

How routes are propagated can be controlled in detail via the ''route-maps'' mechanism. This mechanism consists of a set of rules. Each rule describes, for routes matching some given criteria, what action should be taken. The action could be to drop the route, or it could be to modify some attributes of the route before inserting it in the routing table.
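The default propagation rules described above can be sketched as a small function. This is only an illustration: the `PeerType` enum and the `should_readvertise` helper are hypothetical names, and a real implementation would apply route-maps and other policy on top of this default.

```python
from enum import Enum


class PeerType(Enum):
    IBGP = "iBGP"
    EBGP = "eBGP"


def should_readvertise(learned_from: PeerType, send_to: PeerType) -> bool:
    """Default re-advertisement rule, before any route-map policy is applied."""
    if learned_from is PeerType.EBGP:
        # Routes learned from an eBGP peer go to all iBGP and eBGP peers.
        return True
    # Routes learned from an iBGP peer go to eBGP peers only.
    return send_to is PeerType.EBGP
```

Because an iBGP-learned route is never passed on to another iBGP peer under this rule, every iBGP speaker has to hear the route directly from the router that first learned it, which is where the full-mesh requirement comes from.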


Extensions negotiation

During the peering handshake, when OPEN messages are exchanged, BGP speakers can negotiate optional capabilities of the session, including multiprotocol extensions and various recovery modes. If the multiprotocol extensions to BGP are negotiated at the time of creation, the BGP speaker can prefix the Network Layer Reachability Information (NLRI) it advertises with an address family prefix. These families include the IPv4 (default), IPv6, IPv4/IPv6 Virtual Private Networks and multicast BGP. Increasingly, BGP is used as a generalized signaling protocol to carry information about routes that may not be part of the global Internet, such as VPNs.

In order to make decisions in its operations with peers, a BGP peer uses a simple finite state machine (FSM) that consists of six states: Idle; Connect; Active; OpenSent; OpenConfirm; and Established. For each peer-to-peer session, a BGP implementation maintains a state variable that tracks which of these six states the session is in. BGP defines the messages that each peer should exchange in order to change the session from one state to another.

The first state is the Idle state. In the Idle state, BGP initializes all resources, refuses all inbound BGP connection attempts and initiates a TCP connection to the peer. The second state is Connect. In the Connect state, the router waits for the TCP connection to complete and transitions to the OpenSent state if successful. If unsuccessful, it starts the ConnectRetry timer and transitions to the Active state upon expiration. In the Active state, the router resets the ConnectRetry timer to zero and returns to the Connect state. In the OpenSent state, the router sends an Open message and waits for one in return in order to transition to the OpenConfirm state. Keepalive messages are exchanged and, upon successful receipt, the router is placed into the Established state. In the Established state, the router can send and receive Keepalive, Update, and Notification messages to and from its peer.

* Idle State:
** Refuses all incoming BGP connections.
** Starts the initialization of event triggers.
** Initiates a TCP connection with its configured BGP peer.
** Listens for a TCP connection from its peer.
** Changes its state to Connect.
** If an error occurs at any state of the FSM process, the BGP session is terminated immediately and returned to the Idle state. Some of the reasons why a router does not progress from the Idle state are:
*** TCP port 179 is not open.
*** A random TCP port over 1023 is not open.
*** Peer address configured incorrectly on either router.
*** AS number configured incorrectly on either router.
* Connect State:
** Waits for successful TCP negotiation with peer.
** BGP does not spend much time in this state if the TCP session has been successfully established.
** Sends Open message to peer and changes state to OpenSent.
** If an error occurs, BGP moves to the Active state. Some reasons for the error are:
*** TCP port 179 is not open.
*** A random TCP port over 1023 is not open.
*** Peer address configured incorrectly on either router.
*** AS number configured incorrectly on either router.
* Active State:
** If the router was unable to establish a successful TCP session, then it ends up in the Active state.
** BGP FSM tries to restart another TCP session with the peer and, if successful, then it sends an Open message to the peer.
** If it is unsuccessful again, the FSM is reset to the Idle state.
** Repeated failures may result in a router cycling between the Idle and Active states. Some of the reasons for this include:
*** TCP port 179 is not open.
*** A random TCP port over 1023 is not open.
*** BGP configuration error.
*** Network congestion.
*** Flapping network interface.
* OpenSent State:
** BGP FSM listens for an Open message from its peer.
** Once the message has been received, the router checks the validity of the Open message.
** If there is an error, it is because one of the fields in the Open message does not match between the peers, e.g., BGP version mismatch, the peering router expects a different My AS, etc. The router then sends a Notification message to the peer indicating why the error occurred.
** If there is no error, a Keepalive message is sent, various timers are set and the state is changed to OpenConfirm.
* OpenConfirm State:
** The router is listening for a Keepalive message from its peer.
** If a Keepalive message is received and no timer has expired before reception of the Keepalive, BGP transitions to the Established state.
** If a timer expires before a Keepalive message is received, or if an error condition occurs, the router transitions back to the Idle state.
* Established State:
** In this state, the peers send Update messages to exchange information about each route being advertised to the BGP peer.
** If there is any error in the Update message, then a Notification message is sent to the peer, and BGP transitions back to the Idle state.
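The per-state summary above can be approximated with a transition table. The sketch below is a simplification under stated assumptions: the event names (`"start"`, `"tcp_established"`, and so on) are hypothetical labels rather than the event names of the BGP specification, timers are not modelled, and any event not listed for a state is treated as an error that returns the session to Idle.

```python
from enum import Enum, auto


class State(Enum):
    IDLE = auto()
    CONNECT = auto()
    ACTIVE = auto()
    OPEN_SENT = auto()
    OPEN_CONFIRM = auto()
    ESTABLISHED = auto()


# (current state, event) -> next state. Event names are illustrative only.
TRANSITIONS = {
    (State.IDLE, "start"): State.CONNECT,
    (State.CONNECT, "tcp_established"): State.OPEN_SENT,
    (State.CONNECT, "tcp_failed"): State.ACTIVE,
    (State.ACTIVE, "tcp_established"): State.OPEN_SENT,
    (State.ACTIVE, "tcp_failed"): State.IDLE,
    (State.OPEN_SENT, "open_received"): State.OPEN_CONFIRM,
    (State.OPEN_CONFIRM, "keepalive_received"): State.ESTABLISHED,
    (State.ESTABLISHED, "keepalive_received"): State.ESTABLISHED,
    (State.ESTABLISHED, "update_received"): State.ESTABLISHED,
}


def next_state(state: State, event: str) -> State:
    """Look up the next state; unlisted events count as errors and reset to Idle."""
    return TRANSITIONS.get((state, event), State.IDLE)


def run(events) -> State:
    """Feed a sequence of events to a session that starts in Idle."""
    state = State.IDLE
    for event in events:
        state = next_state(state, event)
    return state
```

For example, `run(["start", "tcp_established", "open_received", "keepalive_received"])` walks a session from Idle to Established, while two consecutive TCP failures after `"start"` leave it back in Idle.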


Router connectivity and learning routes

In the simplest arrangement, all routers within a single AS and participating in BGP routing must be configured in a full mesh: each router must be configured as a peer to every other router. This causes scaling problems, since the number of required connections grows quadratically with the number of routers involved. To alleviate the problem, BGP implements two options: route reflectors (RFC 4456) and BGP confederations (RFC 5065). The following discussion of basic update processing assumes a full iBGP mesh.

A given BGP router may accept network-layer reachability information (NLRI) updates from multiple neighbors and advertise NLRI to the same, or a different set, of neighbors. The BGP process maintains several routing information bases:
* RIB: the router's main routing information base table.
* Loc-RIB: the local routing information base. BGP maintains its own master routing table separate from the main routing table of the router.
* Adj-RIB-In: For each neighbor, the BGP process maintains a conceptual ''adjacent routing information base, incoming'', containing the NLRI received from the neighbor.
* Adj-RIB-Out: For each neighbor, the BGP process maintains a conceptual ''adjacent routing information base, outgoing'', containing the NLRI sent to the neighbor.

The physical storage and structure of these conceptual tables are decided by the implementer of the BGP code. Their structure is not visible to other BGP routers, although they usually can be interrogated with management commands on the local router. It is quite common, for example, to store the Adj-RIB-In, Adj-RIB-Out and the Loc-RIB together in the same data structure, with additional information attached to the RIB entries. The additional information tells the BGP process such things as whether individual entries belong in the Adj-RIBs for specific neighbors, whether the per-neighbor route selection process made received routes eligible for the Loc-RIB, and whether Loc-RIB entries are eligible to be submitted to the local router's routing table management process.

BGP submits the routes that it considers best to the main routing table process. Depending on the implementation of that process, the BGP route is not necessarily selected. For example, a directly connected prefix, learned from the router's own hardware, is usually most preferred. As long as that directly connected route's interface is active, the BGP route to the destination will not be put into the routing table. Once the interface goes down, and there are no more preferred routes, the Loc-RIB route would be installed in the main routing table.

BGP carries the information with which rules inside BGP-speaking routers can make policy decisions. Some of the information carried that is explicitly intended to be used in policy decisions is:
* Communities
* Multi-exit discriminators (MED)
* Autonomous systems (AS)


Route selection process

The BGP standard specifies a number of decision factors, more than the ones that are used by any other common routing process, for selecting NLRI to go into the Loc-RIB. The first decision point for evaluating NLRI is that its next-hop attribute must be reachable (or resolvable). Another way of saying the next-hop must be reachable is that there must be an active route, already in the main routing table of the router, to the prefix in which the next-hop address is reachable.

Next, for each neighbor, the BGP process applies various standard and implementation-dependent criteria to decide which routes conceptually should go into the Adj-RIB-In. The neighbor could send several possible routes to a destination, but the first level of preference is at the neighbor level. Only one route to each destination will be installed in the conceptual Adj-RIB-In. This process will also delete, from the Adj-RIB-In, any routes that are withdrawn by the neighbor.

Whenever a conceptual Adj-RIB-In changes, the main BGP process decides if any of the neighbor's new routes are preferred to routes already in the Loc-RIB. If so, it replaces them. If a given route is withdrawn by a neighbor, and there is no other route to that destination, the route is removed from the Loc-RIB and no longer sent by BGP to the main routing table manager. If the router does not have a route to that destination from any non-BGP source, the withdrawn route will be removed from the main routing table.

As long as there is a tie between candidate routes, the route selection process moves to the next step.

The local preference, weight, and other criteria can be manipulated by local configuration and software capabilities. Such manipulation, although commonly used, is outside the scope of the standard. For example, the ''community'' attribute (see below) is not directly used by the BGP selection process. The BGP neighbor process can, however, have a manually programmed rule that sets local preference or another factor if the community value matches some pattern-matching criterion. If the route was learned from an external peer, the per-neighbor BGP process computes a local preference value from local policy rules and then compares the local preference of all routes from the neighbor.


Communities

BGP communities are attribute tags that can be applied to incoming or outgoing prefixes to achieve some common goal. While it is common to say that BGP allows an administrator to set policies on how prefixes are handled by ISPs, this is generally not possible, strictly speaking. For instance, BGP natively has no concept to allow one AS to tell another AS to restrict advertisement of a prefix to only North American peering customers. Instead, an ISP generally publishes a list of well-known or proprietary communities with a description for each one, which essentially becomes an agreement of how prefixes are to be treated.

Examples of common communities include:
* local preference adjustments
* geographic restrictions
* peer type restrictions
* denial-of-service attack identification
* AS prepending options

An ISP might publish communities such as the following for routes received from customers:
* To Customers North America (East Coast) 3491:100
* To Customers North America (West Coast) 3491:200

The customer simply adjusts their configuration to include the correct community or communities for each route, and the ISP is responsible for controlling who the prefix is advertised to. The end user has no technical ability to enforce correct actions being taken by the ISP, though problems in this area are generally rare and accidental.

It is a common tactic for end customers to use BGP communities (usually ASN:70,80,90,100) to control the local preference the ISP assigns to advertised routes instead of using MED (the effect is similar). The community attribute is transitive, but communities applied by the customer are very rarely propagated outside the next-hop AS. Not all ISPs give out their communities to the public.
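A community written as `3491:100` is conventionally a 32-bit value whose upper 16 bits hold the AS number and whose lower 16 bits hold an operator-defined value. The following sketch converts between the text form and the 32-bit form; the function names `parse_community` and `format_community` are hypothetical, chosen only for this illustration.

```python
def parse_community(text: str) -> int:
    """Convert 'ASN:value' text into the 32-bit community value."""
    asn_text, value_text = text.split(":")
    asn, value = int(asn_text), int(value_text)
    if not (0 <= asn <= 0xFFFF and 0 <= value <= 0xFFFF):
        raise ValueError("both halves of a standard community must fit in 16 bits")
    # Upper 16 bits: AS number. Lower 16 bits: operator-defined value.
    return (asn << 16) | value


def format_community(raw: int) -> str:
    """Convert a 32-bit community value back into 'ASN:value' text."""
    return f"{raw >> 16}:{raw & 0xFFFF}"
```

The 16-bit limit on the AS half is also why the standard community attribute cannot carry a 32-bit AS number directly.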


BGP Extended Community Attribute

The BGP Extended Community Attribute was added in 2006, in order to extend the range of such attributes and to provide a community attribute structuring by means of a type field. The extended format consists of one or two octets for the type field followed by seven or six octets for the respective community attribute content. The definition of this Extended Community Attribute is documented in RFC 4360. The IANA administers the registry for BGP Extended Communities Types.

The Extended Communities Attribute itself is a transitive optional BGP attribute. A bit in the type field within the attribute decides whether the encoded extended community is of a transitive or non-transitive nature. The IANA registry therefore provides different number ranges for the attribute types. Due to the extended attribute range, its usage can be manifold. RFC 4360 defines, as examples, the "Two-Octet AS Specific Extended Community", the "IPv4 Address Specific Extended Community", the "Opaque Extended Community", the "Route Target Community", and the "Route Origin Community". A number of BGP QoS drafts also use this Extended Community Attribute structure for inter-domain QoS signalling.

With the introduction of 32-bit AS numbers, some issues were immediately obvious with the community attribute, which only defines a 16-bit ASN field and so prevents matching between this field and the real ASN value. Since RFC 7153, extended communities are compatible with 32-bit ASNs. RFC 8092 and RFC 8195 introduce a Large Community attribute of 12 bytes, divided into three fields of 4 bytes each (AS:function:parameter).
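The 12-byte Large Community layout (AS:function:parameter, three 4-byte fields) can be illustrated with Python's `struct` module. This is a minimal sketch that assumes network byte order; the helper names are hypothetical.

```python
import struct


def pack_large_community(asn: int, function: int, parameter: int) -> bytes:
    """Pack AS:function:parameter into 12 bytes (three unsigned 32-bit fields)."""
    return struct.pack("!III", asn, function, parameter)


def unpack_large_community(data: bytes) -> tuple:
    """Unpack 12 bytes back into (asn, function, parameter)."""
    return struct.unpack("!III", data)
```

Because the first field is a full 32 bits wide, a 32-bit AS number fits without the workaround needed for the 16-bit field of the original community attribute.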


Multi-exit discriminators

MEDs, defined in the main BGP standard, were originally intended to show to another neighbor AS the advertising AS's preference as to which of several links are preferred for inbound traffic. Another application of MEDs is to advertise the value, typically based on delay, of multiple ASs that have a presence at an IXP, that they impose to send traffic to some destination.

Some routers (like Juniper) will use the metric from OSPF to set the MED. The following example shows the MED used with BGP when OSPF routes are exported to BGP on a Juniper SRX:

    # run show ospf route
    Topology default Route Table:

    Prefix             Path   Route    NH    Metric    NextHop     Nexthop
                       Type   Type     Type            Interface   Address/LSP
    10.32.37.0/24      Inter  Discard  IP    16777215
    10.32.37.0/26      Intra  Network  IP    101       ge-0/0/1.0  10.32.37.241
    10.32.37.64/26     Intra  Network  IP    102       ge-0/0/1.0  10.32.37.241
    10.32.37.128/26    Intra  Network  IP    101       ge-0/0/1.0  10.32.37.241

    # show route advertising-protocol bgp 10.32.94.169
      Prefix             Nexthop   MED        Lclpref   AS path
    * 10.32.37.0/24      Self      16777215             I
    * 10.32.37.0/26      Self      101                  I
    * 10.32.37.64/26     Self      102                  I
    * 10.32.37.128/26    Self      101                  I


Packet format


Message header format

* Marker: Included for compatibility, must be set to all ones.
* Length: Total length of the message in octets, including the header.
* Type: Type of BGP message. The following values are defined:
** Open (1)
** Update (2)
** Notification (3)
** KeepAlive (4)
** Route-Refresh (5)

Note: "Marker" and "Length" are omitted from the examples.
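The header fields listed above can be decoded with a few lines of Python. The sketch assumes the field widths from RFC 4271 (a 16-octet marker, a 2-octet length and a 1-octet type, 19 octets in total); `parse_header` and `build_keepalive` are hypothetical helper names.

```python
import struct

HEADER_LEN = 19  # 16-octet marker + 2-octet length + 1-octet type
MESSAGE_TYPES = {
    1: "Open",
    2: "Update",
    3: "Notification",
    4: "KeepAlive",
    5: "Route-Refresh",
}


def parse_header(data: bytes) -> tuple:
    """Return (total length, message type name) from the start of a BGP message."""
    if len(data) < HEADER_LEN:
        raise ValueError("a BGP header needs at least 19 bytes")
    marker, length, msg_type = struct.unpack("!16sHB", data[:HEADER_LEN])
    if marker != b"\xff" * 16:
        raise ValueError("marker must be set to all ones")
    return length, MESSAGE_TYPES.get(msg_type, "Unknown")


def build_keepalive() -> bytes:
    """A KeepAlive carries no body, so the message is just the 19-byte header."""
    return struct.pack("!16sHB", b"\xff" * 16, HEADER_LEN, 4)
```

Parsing the output of `build_keepalive()` yields a length of 19 and the type name `KeepAlive`, which matches the 19-byte keep-alive size mentioned in the Operation section.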


Open Packet

* Version (8 bit): Version of BGP used.
* My AS (16 bit): Sender's autonomous system number.
* Hold Time (16 bit): Timeout timer, used to calculate KeepAlive messages. Default 90 seconds.
* BGP Identifier (32 bit): IP address of sender.
* Optional Parameters Length (8 bit): Total length of the Optional Parameters field.

Example of Open Message:

    Type: Open Message (1)
    Version: 4
    My AS: 64496
    Hold Time: 90
    BGP Identifier:
    Optional Parameters Length: 16
    Optional Parameters:
        Capability: Multiprotocol extensions capability (1)
        Capability: Route refresh capability (2)
        Capability: Route refresh capability (Cisco) (128)
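The fixed fields listed above can be packed as follows. This is a sketch under stated assumptions: it builds only the body of an Open message (the Marker, Length and Type header fields are left out, as in the examples), it carries no optional parameters, and the BGP Identifier `192.0.2.1` is a placeholder documentation address rather than a value from the example.

```python
import ipaddress
import struct


def build_open_body(my_as: int, hold_time: int, bgp_id: str, version: int = 4) -> bytes:
    """Pack the fixed Open fields, with an Optional Parameters Length of 0."""
    identifier = ipaddress.IPv4Address(bgp_id).packed  # 32-bit BGP Identifier
    # B = 8-bit version, H = 16-bit My AS, H = 16-bit Hold Time,
    # 4s = 32-bit BGP Identifier, B = 8-bit Optional Parameters Length.
    return struct.pack("!BHH4sB", version, my_as, hold_time, identifier, 0)


# Hypothetical values; only the AS number and hold time come from the example.
body = build_open_body(my_as=64496, hold_time=90, bgp_id="192.0.2.1")
```

The resulting body is 10 bytes long: 1 + 2 + 2 + 4 + 1 octets for the five fixed fields.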


Update Packet

After the initial exchange, only differences (added, changed, or removed routes) are sent.

Example of UPDATE Message:

    Type: UPDATE Message (2)
    Withdrawn Routes Length: 0
    Total Path Attribute Length: 25
    Path attributes
        ORIGIN: IGP
        AS_PATH: 64500
        NEXT_HOP: 192.0.2.254
        MULTI_EXIT_DISC: 0
    Network Layer Reachability Information (NLRI)


Notification

If there is an error, it is because one of the fields in the OPEN or UPDATE message does not match between the peers, e.g., BGP version mismatch, the peering router expects a different My AS, etc. The router then sends a Notification message to the peer indicating why the error occurred.

Example of NOTIFICATION Message:

    Type: NOTIFICATION Message (3)
    Major error Code: OPEN Message Error (2)
    Minor error Code (Open Message): Bad Peer AS (2)
    Bad Peer AS: 65200


KeepAlive

KeepAlive messages are sent periodically to verify that the remote peer is still alive. KeepAlives should be sent at intervals of one third of the hold time.

Example of KEEPALIVE Message:

    Type: KEEPALIVE Message (4)


Route-Refresh

Defined in RFC. Allows for soft updating of the Adj-RIB-In without resetting the connection.

Example of ROUTE-REFRESH Message:

    Type: ROUTE-REFRESH Message (5)
    Address family identifier (AFI): IPv4 (1)
    Subtype: Normal route refresh request RFC 2918 with/without ORF RFC 5291 (0)
    Subsequent address family identifier (SAFI): Unicast (1)
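The three fields in the example map onto a 4-octet message body. The sketch below assumes the layout of a 16-bit AFI, an 8-bit subtype (originally a reserved octet) and an 8-bit SAFI; the constant and function names are hypothetical, and the message header is omitted as in the other examples.

```python
import struct

AFI_IPV4 = 1       # Address family identifier: IPv4
SAFI_UNICAST = 1   # Subsequent address family identifier: Unicast


def build_route_refresh_body(afi: int = AFI_IPV4, safi: int = SAFI_UNICAST,
                             subtype: int = 0) -> bytes:
    """Pack AFI (16 bit), subtype (8 bit) and SAFI (8 bit) in network byte order."""
    return struct.pack("!HBB", afi, subtype, safi)
```

With the default arguments this produces the request shown in the example: IPv4, subtype 0 (normal route refresh), unicast.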


Internal scalability

BGP is "the most scalable of all routing protocols." An autonomous system with internal BGP (iBGP) must have all of its iBGP peers connect to each other in a full mesh (where everyone speaks to everyone directly). This full-mesh configuration requires that each router maintain a session with every other router. In large networks, this number of sessions may degrade the performance of routers, due to either a lack of memory, or high CPU process requirements.


Route reflectors

Route reflectors (RRs) reduce the number of connections required in an AS. A single router (or two for redundancy) can be made an RR: other routers in the AS need only be configured as peers to them. An RR offers an alternative to the logical full-mesh requirement of iBGP. The purpose of the RR is concentration. Multiple BGP routers can peer with a central point, the RR acting as an RR server, rather than peer with every other router in a full mesh. All the other iBGP routers become RR clients.

This approach, similar to OSPF's DR/BDR feature, provides large networks with added iBGP scalability. In a fully meshed iBGP network of 10 routers, 90 individual CLI statements (spread throughout all routers in the topology) are needed just to define the remote-AS of each peer: this quickly becomes a headache to manage. An RR topology can cut these 90 statements down to 18, offering a viable solution for the larger networks administered by ISPs.

An RR is a single point of failure, therefore at least a second RR may be configured in order to provide redundancy. As it is an additional peer for the other 10 routers, it approximately doubles the number of CLI statements, requiring additional statements in this case. In a BGP multipath environment, the additional RR can also benefit the network by adding local routing throughput if the RRs are acting as traditional routers instead of just a dedicated RR server role.

RRs and confederations both reduce the number of iBGP peers to each router and thus reduce processing overhead. RRs are a pure performance-enhancing technique, while confederations can also be used to implement more fine-grained policy.
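The figures of 90 and 18 statements for 10 routers follow from simple counting: in a full mesh every router configures each of the other routers as a peer, while with one RR each client configures the RR and the RR configures each client. A short sketch, with hypothetical function names, makes the arithmetic explicit.

```python
def full_mesh_sessions(routers: int) -> int:
    """Number of iBGP sessions when every router peers with every other router."""
    return routers * (routers - 1) // 2


def full_mesh_statements(routers: int) -> int:
    """Peer statements in a full mesh: each router lists all the others."""
    return routers * (routers - 1)


def single_rr_statements(routers: int) -> int:
    """Peer statements with one RR: one per client, plus one per client on the RR."""
    clients = routers - 1
    return 2 * clients
```

For 10 routers this gives 45 sessions and 90 statements in the full mesh, against 18 statements with a single route reflector. The full-mesh count grows quadratically with the number of routers, while the RR count grows linearly.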


Rules

RR servers propagate routes inside the AS based on the following rules:
* Routes are always reflected to eBGP peers.
* Routes are never reflected to the originator of the route.
* If a route is received from a non-client peer, reflect to client peers.
* If a route is received from a client peer, reflect to client and non-client peers.
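These rules can be sketched as a function that picks the peers a route reflector sends a route to. This is a simplified illustration, not a complete implementation: the `Peer` class and role strings are hypothetical, the "originator" is approximated by the peer the route was received from, and a route received from an eBGP peer is assumed to go to all other peers, following the default propagation rule in the Operation section.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Peer:
    name: str
    role: str  # "client", "non_client", or "ebgp"


def reflect_to(route_source: Peer, peers: List[Peer]) -> List[Peer]:
    """Return the peers that should receive a route learned from route_source."""
    targets = []
    for peer in peers:
        if peer == route_source:
            continue  # never send the route back to where it came from
        if peer.role == "ebgp" or route_source.role in ("ebgp", "client"):
            # eBGP peers always receive the route; routes from clients (and,
            # by assumption, from eBGP peers) go to clients and non-clients.
            targets.append(peer)
        elif route_source.role == "non_client" and peer.role == "client":
            # Routes from non-client peers are reflected to clients only.
            targets.append(peer)
    return targets
```

Note that a route from one non-client is not passed to another non-client, which is why non-client iBGP peers still need sessions among themselves.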


Cluster

An RR and its clients form a ''cluster''. The ''cluster ID'' is then attached to every route advertised by the RR to its client or nonclient peers. A cluster ID is a cumulative, non-transitive BGP attribute, and every RR must prepend the local cluster ID to the cluster list to avoid routing loops.
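The loop-avoidance role of the cluster list can be sketched in a few lines: before reflecting a route, an RR checks whether its own cluster ID already appears in the list and, if not, prepends it. The function name and the dotted cluster IDs below are hypothetical, and the sketch ignores every other attribute of the route.

```python
from typing import List, Optional


def reflect_with_cluster_list(cluster_list: List[str],
                              local_cluster_id: str) -> Optional[List[str]]:
    """Return the updated cluster list, or None if the route has looped."""
    if local_cluster_id in cluster_list:
        # The route has already passed through this cluster: drop it.
        return None
    # Prepend the local cluster ID before reflecting the route onward.
    return [local_cluster_id] + list(cluster_list)
```

A route that leaves cluster `10.0.0.1`, passes through cluster `10.0.0.2`, and comes back would carry `10.0.0.1` in its list and be discarded by the first RR.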


Confederation

Confederations are sets of autonomous systems. In common practice, only one of the confederation AS numbers is seen by the Internet as a whole. Confederations are used in very large networks where a large AS can be configured to encompass smaller, more manageable internal ASs.

The confederated AS is composed of multiple ASs. Each confederated AS alone has iBGP fully meshed and has connections to other ASs inside the confederation. Even though these ASs have eBGP peers to ASs within the confederation, the ASs exchange routing as if they used iBGP. In this way, the confederation preserves next hop, metric, and local preference information. To the outside world, the confederation appears to be a single AS. This resolves the iBGP transit AS problems that arise because iBGP requires a full mesh between all BGP routers: a large number of TCP sessions and unnecessary duplication of routing traffic.

Confederations can be used in conjunction with route reflectors. Both confederations and route reflectors can be subject to persistent oscillation unless specific design rules, affecting both BGP and the interior routing protocol, are followed.

These alternatives can introduce problems of their own, including the following:
* route oscillation
* sub-optimal routing
* increase of BGP convergence time

Additionally, route reflectors and BGP confederations were not designed to ease BGP router configuration. Nevertheless, these are common tools for experienced BGP network architects. These tools may be combined, for example, as a hierarchy of route reflectors.


Stability

The routing tables managed by a BGP implementation are adjusted continually to reflect actual changes in the network, such as links or routers going down and coming back up. In the network as a whole, it is normal for these changes to happen almost continuously, but for any particular router or link, changes are expected to be relatively infrequent. If a router is misconfigured or mismanaged, it may get into a rapid cycle between down and up states. This pattern of repeated withdrawal and re-announcement, known as route flapping, can cause excessive activity in all the other routers that know about the cycling entity, as the same route is continually injected into and withdrawn from the routing tables. The BGP design is such that delivery of traffic may not function while routes are being updated. On the Internet, a BGP routing change may cause outages for several minutes.

A feature known as ''route flap damping'' (RFC 2439) is built into many BGP implementations in an attempt to mitigate the effects of route flapping. Without damping, the excessive activity can cause a heavy processing load on routers, which may in turn delay updates on other routes, and so affect overall routing stability. With damping, the record of a route's flapping decays exponentially over time. At the first instance when a route becomes unavailable and quickly reappears, damping does not take effect, so as to maintain the normal fail-over times of BGP. At the second occurrence, BGP shuns that prefix for a certain length of time; subsequent occurrences are timed out exponentially. After the abnormalities have ceased and a suitable length of time has passed, the offending route can be reinstated and its slate wiped clean. Damping can also mitigate denial-of-service attacks; damping timings are highly customizable.

RFC 2439 also suggests (under "Design Choices -> Stability Sensitive Suppression of Route Advertisement") that route flap damping is more desirable when applied to exterior BGP sessions (eBGP sessions, or simply exterior peers) than to interior BGP sessions (iBGP sessions, or simply internal peers). With this approach, when a route flaps inside an autonomous system, the flap is not propagated to external ASs; flapping a route to an eBGP peer, by contrast, sets off a chain of flapping for that route throughout the backbone. This method also avoids the overhead of route flap damping for iBGP sessions.

Subsequent research has shown that flap damping can actually lengthen convergence times in some cases, and can cause interruptions in connectivity even when links are not flapping. Moreover, as backbone links and router processors have become faster, some network architects have suggested that flap damping may not be as important as it used to be, since changes to the routing table can be handled much faster by routers. This has led the RIPE Routing Working Group to write that "with the current implementations of BGP flap damping, the application of flap damping in ISP networks is NOT recommended. ... If flap damping is implemented, the ISP operating that network will cause side-effects to their customers and the Internet users of their customers' content and services ... . These side-effects would quite likely be worse than the impact caused by simply not running flap damping at all." Improving stability without the problems of flap damping is the subject of current research.
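The damping behaviour described in this section can be illustrated with a penalty that grows on every flap and halves after each half-life. This is a minimal sketch under stated assumptions: the class name, the parameter names and all numeric values are illustrative, chosen only so that a single flap is tolerated and a second quick flap triggers suppression; real implementations expose their own defaults.

```python
class FlapDamper:
    """Toy model of route flap damping for a single prefix."""

    def __init__(self, penalty_per_flap=1000.0, suppress_limit=1500.0,
                 reuse_limit=750.0, half_life=900.0):
        self.penalty_per_flap = penalty_per_flap
        self.suppress_limit = suppress_limit  # start suppressing at this penalty
        self.reuse_limit = reuse_limit        # reinstate below this penalty
        self.half_life = half_life            # seconds for the penalty to halve
        self.penalty = 0.0
        self.last_update = 0.0
        self.suppressed = False

    def _decay(self, now):
        # Exponential decay: the penalty halves every `half_life` seconds.
        # Timestamps are assumed to be non-decreasing.
        elapsed = now - self.last_update
        self.penalty *= 0.5 ** (elapsed / self.half_life)
        self.last_update = now
        if self.suppressed and self.penalty < self.reuse_limit:
            self.suppressed = False  # route may be advertised again

    def record_flap(self, now):
        self._decay(now)
        self.penalty += self.penalty_per_flap
        if self.penalty >= self.suppress_limit:
            self.suppressed = True

    def is_suppressed(self, now):
        self._decay(now)
        return self.suppressed
```

With these illustrative values, a flap at 0 s leaves the route usable, a second flap at 60 s suppresses it, and the route is reinstated once the decayed penalty drops below the reuse limit, roughly two half-lives later.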


Routing table growth

One of the largest problems faced by BGP, and indeed the Internet infrastructure as a whole, is the growth of the Internet routing table. If the global routing table grows to the point where some older, less capable routers cannot cope with the memory requirements or the CPU load of maintaining the table, these routers will cease to be effective gateways between the parts of the Internet they connect. In addition, and perhaps even more importantly, larger routing tables take longer to stabilize (see above) after a major connectivity change, leaving network service unreliable, or even unavailable, in the interim.

Until late 2001, the global routing table was growing exponentially, threatening an eventual widespread breakdown of connectivity. In an attempt to prevent this, ISPs cooperated in keeping the global routing table as small as possible, by using Classless Inter-Domain Routing (CIDR) and route aggregation. While this slowed the growth of the routing table to a linear process for several years, with the expanded demand for multihoming by end-user networks the growth was once again superlinear by the middle of 2004.


512k day

A Y2K-like overflow was triggered in 2014 on router models that had not been appropriately updated. The event became known as 512k day: the full IPv4 BGP table grew to more than 512,000 prefixes, while many older routers had a limit of 512k (512,000–524,288) routing table entries. On August 12, 2014, outages resulting from full tables hit eBay, LastPass and Microsoft Azure, among others. A number of Cisco routers commonly in use had TCAM, a form of high-speed content-addressable memory, for storing BGP advertised routes. On impacted routers, the TCAM was by default allocated as 512k IPv4 routes and 256k IPv6 routes. While the reported number of IPv6 advertised routes was only about 20k, the number of advertised IPv4 routes reached the default limit, causing a spillover effect as routers attempted to compensate by using slow software routing (as opposed to fast hardware routing via TCAM). The main method for dealing with this issue is for operators to change the TCAM allocation to allow more IPv4 entries by reallocating some of the TCAM reserved for IPv6 routes, which requires a reboot on most routers.

The 512k problem was predicted by a number of IT professionals. What actually pushed the number of routes above 512k was the announcement of about 15,000 new routes in short order, starting at 07:48 UTC. Almost all of these routes were to Verizon autonomous systems 701 and 705, created as a result of deaggregation of larger blocks, which introduced thousands of new routes and made the routing table reach 515,000 entries. The new routes appear to have been reaggregated within 5 minutes, but instability across the Internet apparently continued for a number of hours. Even if Verizon had not caused the routing table to exceed 512k entries in the short spike, it would have happened soon anyway through natural growth.

Route summarization is often used to improve aggregation of the BGP global routing table, thereby reducing the necessary table size in routers of an AS. Consider an AS1 that has been allocated a large address block. Announced as a single aggregate, the block counts as one route in the table, but because of customer requirements or traffic engineering, AS1 also wants to announce three smaller, more specific routes inside it. A fourth portion of the block does not have any hosts, so AS1 does not announce a specific route for it. This all counts as AS1 announcing four routes: the aggregate and the three more-specifics.

AS2 will see the four routes from AS1, and it is up to the routing policy of AS2 to decide whether to take a copy of all four routes or, as the aggregate overlaps all the specific routes, to store just the summary. If AS2 wants to send data to the unused portion of the block, it will be sent to the routers of AS1 on the aggregate route. At AS1's router, it will either be dropped or a destination unreachable ICMP message will be sent back, depending on the configuration of AS1's routers.

If AS1 later decides to drop the aggregate route, leaving only the three more-specific routes, the number of routes it announces drops to three. AS2 will see the three routes and, depending on its routing policy, will either store a copy of all three or aggregate two of the more-specific prefixes into one shorter prefix, thereby reducing the number of routes AS2 stores to only two: the merged prefix and the remaining more-specific. If AS2 then wants to send data to the unused portion, it will be dropped or a destination unreachable ICMP message will be sent back at the routers of AS2 (not AS1 as before), because no route covering that portion would be in the routing table.
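The summarization scenario can be tried out with Python's standard `ipaddress` module. The concrete prefixes below are assumptions chosen for illustration (a private /16 split into four /18s), and `longest_match` is a hypothetical helper that mimics longest-prefix lookup in AS2's table.

```python
import ipaddress

# Illustrative addressing only: AS1's aggregate and its four quarters.
aggregate = ipaddress.ip_network("172.16.0.0/16")
quarters = list(aggregate.subnets(prefixlen_diff=2))
announced_specifics = quarters[:3]  # the three more-specific routes
unused = quarters[3]                # no hosts, never announced on its own


def longest_match(table, destination):
    """Return the most specific network in `table` covering `destination`."""
    addr = ipaddress.ip_address(destination)
    matches = [net for net in table if addr in net]
    return max(matches, key=lambda net: net.prefixlen) if matches else None


# Case 1: AS2 stores only the summary, so traffic for the unused quarter
# still follows the aggregate towards AS1.
summary_only = [aggregate]

# Case 2: AS1 withdraws the aggregate; AS2 merges whichever more-specifics
# can be combined into a shorter prefix.
merged = list(ipaddress.collapse_addresses(announced_specifics))
```

In the first case a lookup for an address in the unused quarter matches the aggregate; in the second case AS2 holds two routes and the same lookup finds no route at all, so the traffic is dropped at AS2.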


AS numbers depletion and 32-bit ASNs

RFC 1771 (''A Border Gateway Protocol 4 (BGP-4)'') planned the coding of AS numbers on 16 bits, for 64510 possible public AS, since ASNs 64512 to 65534 were reserved for private use (0 and 65535 being forbidden). In 2011, only 15000 AS numbers were still available, and projections envisioned a complete depletion of available AS numbers in September 2013. RFC 6793 extends AS coding from 16 to 32 bits (keeping the 16-bit AS range 0 to 65535, and its reserved AS numbers), which now allows up to 4 billion available AS. An additional private AS range is also defined in RFC 6996 (from 4200000000 to 4294967294, 4294967295 being forbidden by RFC 7300). To allow the traversal of router groups not able to manage those new ASNs, the new optional transitive attribute AS4_PATH is used. 32-bit ASN assignments started in 2007.
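The ranges quoted above translate directly into a lookup. This is a sketch covering only the values named in this section; the function names are hypothetical, and other special-purpose AS numbers are deliberately ignored.

```python
def classify_asn(asn):
    """Classify an AS number using only the ranges named in this section."""
    if not 0 <= asn <= 4294967295:
        raise ValueError("AS number must fit in 32 bits")
    if asn in (0, 65535, 4294967295):
        return "reserved"  # the values described as forbidden
    if 64512 <= asn <= 65534 or 4200000000 <= asn <= 4294967294:
        return "private"   # 16-bit and 32-bit private-use ranges
    return "public"


def needs_four_octets(asn):
    """True if the AS number cannot be encoded in the original 16 bits."""
    return asn > 65535
```

For example, 64512 and 4200000000 are both private, 65535 is reserved, and any value above 65535 needs the four-octet encoding.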


Load balancing

Another factor causing this growth of the routing table is the need for load balancing of multi-homed networks. It is not a trivial task to balance the inbound traffic to a multi-homed network across its multiple inbound paths, due to limitations of the BGP route selection process. For a multi-homed network, if it announces the same network blocks across all of its BGP peers, the result may be that one or several of its inbound links become congested while the other links remain under-utilized, because external networks all picked that set of congested paths as optimal. Like most other routing protocols, BGP does not detect congestion.

To work around this problem, BGP administrators of that multi-homed network may divide a large contiguous IP address block into smaller blocks and tweak the route announcement to make different blocks look optimal on different paths, so that external networks will choose a different path to reach different blocks of that multi-homed network. Such cases will increase the number of routes as seen on the global BGP table.

One method growing in popularity to address the load balancing issue is to deploy BGP/LISP (Locator/Identifier Separation Protocol) gateways within an Internet exchange point to allow ingress traffic engineering across multiple links. This technique does not increase the number of routes seen on the global BGP table.
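The block-splitting workaround described in this section can be sketched with the standard `ipaddress` module. The function name, the link names and the example prefix are all hypothetical; the sketch only shows how one block becomes several announcements, which is exactly the table growth the text describes.

```python
import ipaddress


def split_across_links(block, links):
    """Split `block` into equal pieces and assign them to links round-robin.

    Returns a dict mapping each link name to the more-specific prefixes
    that would be made to look preferable on that link.
    """
    network = ipaddress.ip_network(block)
    # Extra prefix bits needed so that there are at least len(links) pieces.
    bits = (len(links) - 1).bit_length()
    pieces = list(network.subnets(prefixlen_diff=bits))
    plan = {link: [] for link in links}
    for index, piece in enumerate(pieces):
        plan[links[index % len(links)]].append(str(piece))
    return plan
```

Splitting an illustrative /22 across two upstream links yields two /23 announcements where one route existed before.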


Security

By design, routers running BGP accept advertised routes from other BGP routers by default. This allows for automatic and decentralized routing of traffic across the Internet, but it also leaves the Internet potentially vulnerable to accidental or malicious disruption, known as BGP hijacking. Due to the extent to which BGP is embedded in the core systems of the Internet, and the number of different networks operated by many different organizations which collectively make up the Internet, correcting this vulnerability (such as by introducing the use of cryptographic keys to verify the identity of BGP routers) is a technically and economically challenging problem.


Extensions

An extension to BGP is the use of multipathing. This typically requires identical MED, weight, origin, and AS-path, although some implementations provide the ability to relax the AS-path check so that only an equal path length is expected, rather than a match of the actual AS numbers in the path. This can be extended further with features like Cisco's dmzlink-bw, which enables a ratio of traffic sharing based on bandwidth values configured on individual links.

Multiprotocol Extensions for BGP (MBGP), sometimes referred to as Multiprotocol BGP or Multicast BGP and defined in IETF RFC 4760, is an extension to BGP that allows different types of addresses (known as address families) to be distributed in parallel. Whereas standard BGP supports only IPv4 unicast addresses, Multiprotocol BGP supports IPv4 and IPv6 addresses, and it supports unicast and multicast variants of each. Multiprotocol BGP allows information about the topology of IP multicast-capable routers to be exchanged separately from the topology of normal IPv4 unicast routers. Thus, it allows a multicast routing topology different from the unicast routing topology. Although MBGP enables the exchange of inter-domain multicast routing information, other protocols such as the Protocol Independent Multicast family are needed to build trees and forward multicast traffic.

Multiprotocol BGP is also widely deployed for MPLS L3 VPN, to exchange VPN labels learned for the routes from the customer sites over the MPLS network, in order to distinguish between different customer sites when traffic from the other customer sites arrives at the provider edge router (PE router) for routing.
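The multipath condition described at the start of this section can be expressed as a comparison of path attributes. This is a simplified sketch: the `Path` type, the `multipath_eligible` function and the `relax_as_path` flag are hypothetical names, and real implementations compare additional attributes.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Path:
    med: int
    weight: int
    origin: str
    as_path: Tuple[int, ...]


def multipath_eligible(best, other, relax_as_path=False):
    """Decide whether `other` may share traffic with the best path."""
    same_basics = (best.med, best.weight, best.origin) == (
        other.med, other.weight, other.origin)
    if not same_basics:
        return False
    if relax_as_path:
        # Relaxed mode: only the AS-path length has to match.
        return len(best.as_path) == len(other.as_path)
    return best.as_path == other.as_path
```

Two paths of equal length through different upstream ASs are rejected in strict mode and accepted once the AS-path check is relaxed.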


Uses

BGP4 is standard for Internet routing and required of most Internet service providers (ISPs) to establish routing between one another. Very large private IP networks use BGP internally. An example is the joining of a number of large Open Shortest Path First (OSPF) networks, when OSPF by itself does not scale to the size required. Another reason to use BGP is multihoming a network for better redundancy, either to multiple access points of a single ISP or to multiple ISPs.


Implementations

Routers, especially small ones intended for small office/home office (SOHO) use, may not include BGP software. Some SOHO routers simply are not capable of running BGP or using BGP routing tables of any size. Other commercial routers may need a specific software executable image that contains BGP, or a license that enables it. Devices marketed as Layer 3 switches are less likely to support BGP than devices marketed as routers, but high-end Layer 3 switches usually can run BGP.

Products marketed as switches may or may not have a size limitation on BGP tables, such as 20,000 routes, far smaller than a full Internet table plus internal routes. These devices may be perfectly reasonable and useful when used for BGP routing of some smaller part of the network, such as a confederation-AS representing one of several smaller enterprises that are linked by a BGP backbone of backbones, or a small enterprise that announces routes to an ISP but only accepts a default route and perhaps a small number of aggregated routes.

A BGP router used only for a network with a single point of entry to the Internet may have a much smaller routing table size (and hence RAM and CPU requirement) than a multihomed network. Even simple multihoming can have modest routing table size. See RFC 4098 for vendor-independent performance parameters for single-BGP-router convergence in the control plane. The actual amount of memory required in a BGP router depends on the amount of BGP information exchanged with other BGP speakers and the way in which the particular router stores BGP information. The router may have to keep more than one copy of a route, so it can manage different policies for route advertising and acceptance to a specific neighboring AS. The term ''view'' is often used for these different policy relationships on a running router.

If one router implementation takes more memory per route than another implementation, this may be a legitimate design choice, trading processing speed against memory. A full IPv4 BGP table is in excess of 590,000 prefixes. Large ISPs may add another 50% for internal and customer routes. Again depending on implementation, separate tables may be kept for each view of a different peer AS.

Notable free and open-source implementations of BGP include:
* BIRD, a GPL routing package for Unix-like systems.
* FRRouting, a fork of Quagga for Unix-like systems; and its ancestors:
** Quagga, a fork of GNU Zebra for Unix-like systems (no longer developed).
** GNU Zebra, a GPL routing suite supporting BGP4 (decommissioned).
* OpenBGPD, a BSD-licensed implementation by the OpenBSD team.
* XORP, the eXtensible Open Router Platform, a BSD-licensed suite of routing protocols.

Systems for testing BGP conformance, load or stress performance come from vendors such as:
* Agilent Technologies
* GNS3 open source network simulator
* Ixia
* Spirent Communications


Standards documents

* Application of the Border Gateway Protocol in the Internet Protocol (BGP-4) using SMIv2
* BGP Communities Attribute
* BGP Route Flap Damping
* Route Refresh Capability for BGP-4
* NOPEER Community for Border Gateway Protocol (BGP) Route Scope Control
* A Border Gateway Protocol 4 (BGP-4)
* BGP Security Vulnerabilities Analysis
* Definitions of Managed Objects for BGP-4
* BGP-4 Protocol Analysis
* BGP-4 MIB Implementation Survey
* BGP-4 Implementation Report
* Experience with the BGP-4 Protocol
* Standards Maturity Variance Regarding the TCP MD5 Signature Option (RFC 2385) and the BGP-4 Specification
* BGP Extended Communities Attribute
* BGP Route Reflection – An Alternative to Full Mesh Internal BGP (iBGP)
* Graceful Restart Mechanism for BGP
* Multiprotocol Extensions for BGP-4
* Autonomous System Confederations for BGP
* Capabilities Advertisement with BGP-4
* Dissemination of Flow Specification Rules
* IPv6 Address Specific BGP Extended Community Attribute
* BGP Support for Four-Octet Autonomous System (AS) Number Space
* IANA Registries for BGP Extended Communities
* Revised Error Handling for BGP UPDATE Messages
* North-Bound Distribution of Link-State and Traffic Engineering Information Using BGP
* Advertisement of Multiple Paths in BGP
* BGP Large Communities Attribute
* Use of BGP Large Communities
* Policy Behavior for Well-Known BGP Communities
* draft-ietf-idr-custom-decision-08 – BGP Custom Decision Process, Feb 3, 2017
* Selective Route Refresh for BGP, IETF draft
* Obsolete – Border Gateway Protocol (BGP)
* Obsolete – A Border Gateway Protocol 4 (BGP-4)
* Obsolete – Application of the Border Gateway Protocol in the Internet
* Obsolete – Definitions of Managed Objects for the Fourth Version of the Border Gateway
* Obsolete – A Border Gateway Protocol 4 (BGP-4)
* Obsolete – Autonomous System Confederations for BGP
* Obsolete – BGP Route Reflection An Alternative to Full Mesh iBGP
* Obsolete – Multiprotocol Extensions for BGP-4
* Obsolete – Autonomous System Confederations for BGP
* Obsolete – Capabilities Advertisement with BGP-4
* Obsolete – BGP Support for Four-octet AS Number Space


See also

* 2021 Facebook outage
* AS 7007 incident
* Internet Assigned Numbers Authority
* Packet forwarding
* Private IP
* QPPB
* Regional Internet registry
* Route analytics
* Route filtering
* Routing Assets Database


Further reading


* Chapter "Border Gateway Protocol (BGP)" in the Cisco "IOS Technology Handbook"


External links


* BGP Routing Resources (includes a dedicated section on BGP & ISP Core Security)
* BGP table statistics