Bufferbloat is high latency in packet-switched networks caused by excess buffering of packets. It can also cause packet delay variation (also known as jitter) and reduce overall network throughput. When a router or switch is configured to use excessively large buffers, even very high-speed networks can become practically unusable for many interactive applications like voice over IP (VoIP), audio streaming, online gaming, and even ordinary web browsing.

Some communications equipment manufacturers designed unnecessarily large buffers into some of their network products. In such equipment, bufferbloat occurs when a network link becomes congested, causing packets to become queued for long periods in these oversized buffers. In a first-in first-out queuing system, overly large buffers result in longer queues and higher latency, and do not improve network throughput. Bufferbloat can also be induced by specific slow-speed connections hindering the on-time delivery of other packets.

The bufferbloat phenomenon was described as early as 1985. It gained more widespread attention starting in 2009.

Buffering

An established rule of thumb for network equipment manufacturers was to provide buffers large enough to accommodate at least 250 ms of buffering for a stream of traffic passing through a device. For example, a router's Gigabit Ethernet interface would require a relatively large 32 MB buffer. Such sizing of the buffers can lead to failure of the TCP congestion control algorithm. The buffers then take some time to drain, before congestion control resets and the TCP connection ramps back up to speed and fills the buffers again. Bufferbloat thus causes problems such as high and variable latency, and chokes network bottlenecks for all other flows, as the buffer becomes full of the packets of one TCP stream and other packets are then dropped.

A bloated buffer has an effect only when it is actually used. In other words, oversized buffers have a damaging effect only when the link they buffer becomes a bottleneck. The size of the buffer serving a bottleneck can be measured using the ping utility provided by most operating systems. First, the other host should be pinged continuously; then, a several-seconds-long download from it should be started and stopped a few times. By design, the TCP congestion avoidance algorithm will rapidly fill up the bottleneck on the route. If downloading (or uploading) correlates with a direct and significant increase of the round-trip time reported by ping, then the buffer of the current bottleneck in the download (or upload) direction is bloated. Since the increase of the round-trip time is caused by the buffer on the bottleneck, the maximum increase gives a rough estimate of its size in milliseconds.

In the previous example, using an advanced traceroute tool instead of simple pinging (for example, MTR) will not only demonstrate the existence of a bloated buffer on the bottleneck, but will also pinpoint its location in the network. Traceroute achieves this by displaying the route (path) and measuring transit delays of packets across the network. The history of the route is recorded as round-trip times of the packets received from each successive host (remote node) in the route (path).
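The arithmetic behind the 250 ms rule of thumb and the ping-based estimate can be sketched in a few lines of Python. This is a minimal illustration, not a measurement tool: the function names, the round-trip-time samples, and the 10 Mbit/s bottleneck rate are hypothetical values chosen for the example.

```python
def buffer_bytes(line_rate_bps: float, delay_s: float) -> float:
    """Bytes needed to hold delay_s seconds of traffic at the given line rate."""
    return line_rate_bps * delay_s / 8


def queue_delay_ms(idle_rtts_ms, loaded_rtts_ms) -> float:
    """Rough queueing delay: worst RTT under load minus best RTT when idle."""
    return max(loaded_rtts_ms) - min(idle_rtts_ms)


# 250 ms of buffering on a Gigabit Ethernet interface.
print(buffer_bytes(1_000_000_000, 0.250) / 1e6)  # 31.25 (megabytes)

# Hypothetical ping samples, in milliseconds (assumed, not measured).
idle = [12.1, 11.8, 12.4]       # no other traffic
loaded = [310.5, 455.2, 398.0]  # during a download

delay_ms = queue_delay_ms(idle, loaded)
print(f"estimated buffer depth: {delay_ms:.1f} ms")

# With an assumed 10 Mbit/s bottleneck, that delay corresponds to this
# many bytes sitting in the queue.
print(f"{buffer_bytes(10_000_000, delay_ms / 1000):.0f} bytes queued")
```

The first result, about 31 MB, is in line with the roughly 32 MB figure quoted above.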

Mechanism

Most TCP congestion control algorithms rely on measuring the occurrence of packet drops to determine the available bandwidth between two ends of a connection. The algorithms speed up the data transfer until packets start to drop, then slow down the transmission rate. Ideally, they keep adjusting the transmission rate until it reaches an equilibrium speed of the link. So that the algorithms can select a suitable transfer speed, the feedback about packet drops must occur in a timely manner. With a large buffer that has been filled, the packets will arrive at their destination, but with a higher latency. The packets are not dropped, so TCP does not slow down once the uplink has been saturated, further filling the buffer. Newly arriving packets are dropped only when the buffer is fully saturated. Once this happens, TCP may even decide that the path of the connection has changed, and again go into the more aggressive search for a new operating point.

Packets are queued within a network buffer before being transmitted; in problematic situations, packets are dropped only if the buffer is full. On older routers, buffers were fairly small, so they filled quickly and packets began to drop shortly after the link became saturated; TCP could then adjust and the issue would not become apparent. On newer routers, buffers have become large enough to hold several seconds of buffered data. To TCP, a congested link can appear to be operating normally as the buffer fills. The TCP algorithm is unaware the link is congested and does not start to take corrective action until the buffer finally overflows and packets are dropped.

All packets passing through a simple buffer implemented as a single queue will experience similar delay, so the latency of any connection that passes through a filled buffer will be affected. Available channel bandwidth can also end up being unused, as some fast destinations may not be promptly reached due to buffers clogged with data awaiting delivery to slow destinations. These effects impair interactivity of applications using other network protocols, including UDP used in latency-sensitive applications like VoIP and online gaming.
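The interaction between a loss-driven sender and a drop-tail buffer can be illustrated with a toy simulation. This is a rough sketch under strong assumptions, not a model of any real TCP implementation: the sender adds a fixed amount to its rate each tick, halves it whenever the buffer overflows, and reacts to loss instantly. The function name, tick granularity, and parameter values are all invented for the example.

```python
def simulate(buffer_pkts, capacity=10, ticks=2000):
    """Toy loss-driven sender feeding a single FIFO drop-tail buffer.

    Each tick the sender offers `rate` packets and the link forwards up to
    `capacity` packets. Returns (mean_delay, max_delay, loss_events), with
    delays expressed in ticks of queueing time.
    """
    rate = 1.0
    queue = 0.0
    delays = []
    loss_events = 0
    for _ in range(ticks):
        queue += rate                  # packets arriving this tick
        queue -= min(queue, capacity)  # packets the link forwards
        if queue > buffer_pkts:        # buffer overflow: excess is dropped
            loss_events += 1
            queue = buffer_pkts
            rate = max(1.0, rate / 2)  # multiplicative decrease on loss
        else:
            rate += 0.1                # additive increase while no loss
        delays.append(queue / capacity)
    return sum(delays) / len(delays), max(delays), loss_events


for size in (20, 1000):
    mean_delay, max_delay, losses = simulate(size)
    print(f"buffer={size:5d} pkts  mean delay={mean_delay:6.2f} ticks  "
          f"max delay={max_delay:6.2f} ticks  loss events={losses}")
```

In this model the small buffer overflows soon after the link saturates, so the sender backs off while the queue is still short. The large buffer delays that signal, and by the time a loss occurs the queue is long enough that it never fully drains, leaving a standing queue that every packet must wait behind.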

Impact on applications

Regardless of bandwidth requirements, any type of service which requires consistently low latency or jitter-free transmission can be affected by bufferbloat. Examples include voice calls, online gaming, video chat, and other interactive applications such as instant messaging, radio streaming, video on demand, and remote login.

When the bufferbloat phenomenon is present and the network is under load, even normal web page loads can take many seconds to complete, or simple DNS queries can fail due to timeouts. In fact, any TCP connection can time out and disconnect, and UDP packets can get lost. Since the continuation of a TCP download stream depends on ACK packets in the upload stream, a bufferbloat problem in the upload direction can cause unrelated download applications to fail, because the client's ACK packets do not reach the internet server in time. For example, limiting the transmission rate of a OneDrive upload synchronisation can avoid disturbing other home network users, such as someone listening to internet radio.

Detection

The DSL Reports Speedtest is an easy-to-use test that includes a score for bufferbloat. The ICSI Netalyzr was another on-line tool that could be used for checking networks for the presence of bufferbloat, together with checking for many other common configuration problems. The service was shut down in March 2019. The bufferbloat.net web site lists tools and procedures for determining whether a connection has excess buffering that will slow it down.

Solutions and mitigations

Several technical solutions exist which can be broadly grouped into two categories: solutions that target the network and solutions that target the endpoints. The two types of solutions are often complementary. The problem sometimes arises with a combination of fast and slow network paths.

Network solutions generally take the form of queue management algorithms. This type of solution has been the focus of the IETF AQM working group. Notable examples include:
* Limiting the IP queue length, see TCP tuning.
* AQM algorithms such as CoDel and PIE.
* Hybrid AQM and packet scheduling algorithms such as FQ-CoDel.
* Amendments to the DOCSIS standard to enable smarter buffer control in cable modems.
* Integration of queue management (FQ-CoDel) into the WiFi subsystem of the Linux operating system, as Linux is commonly used in wireless access points.

Notable examples of solutions targeting the endpoints are:
* The BBR congestion control algorithm for TCP.
* The Micro Transport Protocol employed by many BitTorrent clients.
* Techniques for using fewer connections, such as HTTP pipelining or HTTP/2 instead of the plain HTTP protocol.

The problem may also be mitigated by reducing the buffer size on the OS and network hardware; however, this is often not configurable, and the optimal buffer size depends on the line rate, which may differ for different destinations.

Utilizing DiffServ (and employing multiple priority-based queues) helps in prioritizing transmission of low-latency traffic (such as VoIP, videoconferencing, gaming), relegating dealing with congestion and bufferbloat onto non-prioritized traffic.
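To give a feel for how an AQM algorithm such as CoDel differs from a plain drop-tail buffer, the following is a simplified sketch in the spirit of CoDel. It is not the reference implementation: it omits details of the published algorithm (for example the check for a near-empty backlog and the reuse of the previous drop count), and the class name, method names, and demo traffic are invented for illustration. The 5 ms target and 100 ms interval are CoDel's usual defaults.

```python
import math
from collections import deque

TARGET = 0.005    # acceptable standing queue delay, in seconds
INTERVAL = 0.100  # how long delay must stay above TARGET before dropping


class SimpleCoDel:
    """Simplified CoDel-style queue: drops on sojourn time, not queue length."""

    def __init__(self):
        self.queue = deque()          # entries are (enqueue_time, packet)
        self.first_above_time = None  # deadline after which dropping may start
        self.dropping = False
        self.drop_next = 0.0
        self.count = 0
        self.drops = 0

    def enqueue(self, packet, now):
        self.queue.append((now, packet))

    def _pop(self, now):
        """Return (packet, ok_to_drop) for the packet at the head of the queue."""
        if not self.queue:
            self.first_above_time = None
            return None, False
        enqueued_at, packet = self.queue.popleft()
        if now - enqueued_at < TARGET:
            self.first_above_time = None  # delay is fine again
            return packet, False
        if self.first_above_time is None:
            self.first_above_time = now + INTERVAL
            return packet, False
        return packet, now >= self.first_above_time

    def dequeue(self, now):
        packet, ok_to_drop = self._pop(now)
        if self.dropping:
            if not ok_to_drop:
                self.dropping = False
            while self.dropping and now >= self.drop_next:
                self.drops += 1       # discard the current head packet
                self.count += 1
                packet, ok_to_drop = self._pop(now)
                if not ok_to_drop:
                    self.dropping = False
                else:
                    # Drop more often the longer the delay persists.
                    self.drop_next += INTERVAL / math.sqrt(self.count)
        elif ok_to_drop:
            self.drops += 1           # first drop of a new dropping episode
            packet, ok_to_drop = self._pop(now)
            self.dropping = True
            self.count = 1
            self.drop_next = now + INTERVAL
        return packet


def run(arrivals_per_ms, ticks=1000):
    """Feed the queue for `ticks` milliseconds; the link sends one packet per ms."""
    q = SimpleCoDel()
    for ms in range(ticks):
        now = ms / 1000
        for i in range(arrivals_per_ms):
            q.enqueue((ms, i), now)
        q.dequeue(now)
    return q.drops


print("drops with link keeping up:", run(1))
print("drops with link overloaded:", run(2))
```

The demo sender is unresponsive, so the queue keeps growing regardless; with a real TCP flow, the early drops would act as the timely congestion signal that an oversized drop-tail buffer withholds.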

Optimal buffer size

For the longest delay TCP connections to still get their fair share of the bandwidth, the buffer size should be at least the bandwidth-delay product divided by the square root of the number of simultaneous streams. A typical rule of thumb is 50 ms of line rate data, but some popular consumer grade switches only have 1 ms, which may result in extra bandwidth loss on the longer delay connections in case of local contention with others.
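The sizing rule above can be worked through numerically. The sketch below is only an illustration of the formula; the function names and the example figures (a 1 Gbit/s link, a 200 ms round-trip time, 100 simultaneous streams) are assumptions chosen for the example.

```python
import math


def recommended_buffer_bytes(line_rate_bps, rtt_s, n_streams):
    """Bandwidth-delay product divided by the square root of the stream count."""
    bdp_bytes = line_rate_bps * rtt_s / 8
    return bdp_bytes / math.sqrt(n_streams)


def buffer_depth_ms(buffer_bytes, line_rate_bps):
    """Express a buffer size as milliseconds of traffic at line rate."""
    return buffer_bytes * 8 / line_rate_bps * 1000


# Assumed example: 1 Gbit/s link, 200 ms longest RTT, 100 streams.
size = recommended_buffer_bytes(1_000_000_000, 0.200, 100)
print(f"{size / 1e6:.2f} MB, or {buffer_depth_ms(size, 1_000_000_000):.0f} ms of line rate")
```

Under these assumptions the rule yields about 2.5 MB, or 20 ms of line-rate data, which sits between the 50 ms rule of thumb and the 1 ms found in some consumer switches.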

External links

* BufferBloat: What's Wrong with the Internet? A discussion with Vint Cerf, Van Jacobson, Nick Weaver, and Jim Gettys
* April, 2011, by Jim Gettys, introduction by Vint Cerf
* 21 minute demonstration and explanation of typical broadband bufferbloat
* LACNIC - BufferBloat (YouTube video uQ9ziQg_1zU), May 2012, by Fred Baker (IETF chair), in Spanish, English slides available
* TSO sizing and the FQ scheduler (Jonathan Corbet, LWN.net)