![Reverse proxy h2g2bob](https://upload.wikimedia.org/wikipedia/commons/6/67/Reverse_proxy_h2g2bob.svg)
In computer networks, a reverse proxy is an application that sits in front of back-end applications and forwards client (e.g. browser) requests to those applications. Reverse proxies help increase scalability, performance, resilience and security. The resources returned to the client appear as if they originated from the web server itself.
Large websites and content delivery networks use reverse proxies, together with other techniques, to balance the load between internal servers. Reverse proxies can keep a cache of static content, which further reduces the load on these internal servers and the internal network. It is also common for reverse proxies to add features such as compression or TLS encryption to the communication channel between the client and the reverse proxy.
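The load balancing and static caching described above can be illustrated with a toy model. This is a minimal sketch, not how any particular proxy implements it: the class name `RoundRobinProxy`, the `fetch` callable, and the rule that everything under `/static/` is cacheable are all assumptions made for the example.

```python
import itertools
from typing import Callable, Dict, List


class RoundRobinProxy:
    """Toy model of a reverse proxy that balances load and caches static content."""

    def __init__(self, backends: List[str], fetch: Callable[[str, str], bytes]):
        # Endless iterator that yields the backends in turn (round-robin).
        self._backends = itertools.cycle(backends)
        # Hypothetical callable (backend, path) -> response body.
        self._fetch = fetch
        self._cache: Dict[str, bytes] = {}

    def handle(self, path: str) -> bytes:
        # A cache hit is answered without contacting any internal server.
        if path in self._cache:
            return self._cache[path]
        backend = next(self._backends)
        body = self._fetch(backend, path)
        # Assumption for this sketch: paths under /static/ never change.
        if path.startswith("/static/"):
            self._cache[path] = body
        return body
```

A real proxy would also honour cache-control headers, expire entries, and skip unhealthy backends; the sketch only shows why repeated requests for static content stop reaching the internal servers.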
Reverse proxies are typically owned or managed by the web service, and they are accessed by clients from the public Internet. In contrast, a forward proxy is typically managed by a client (or their company) who is restricted to a private, internal network, except that the client can ask the forward proxy to retrieve resources from the public Internet on behalf of the client.
Reverse proxy servers are implemented in popular open-source web servers such as Apache, Nginx, and Caddy. This software can inspect HTTP headers, which, for example, allows it to present a single IP address to the Internet while relaying requests to different internal servers based on the domain name of the HTTP request. Dedicated reverse proxy servers such as the open-source software HAProxy and Squid are used by some of the biggest websites on the Internet.
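Routing by domain name, as described above, amounts to looking up the request's `Host` header in a table of internal servers. The following is a minimal sketch under stated assumptions: the `ROUTES` table, its addresses, and the function name `select_backend` are hypothetical and do not correspond to the configuration of Apache, Nginx, Caddy, or any other product.

```python
from typing import Mapping, Optional

# Hypothetical routing table: public domain name -> internal server address.
ROUTES = {
    "www.example.com": "10.0.0.10:8080",
    "api.example.com": "10.0.0.20:9000",
}


def select_backend(
    headers: Mapping[str, str], routes: Mapping[str, str] = ROUTES
) -> Optional[str]:
    """Return the internal address for the request's Host header, or None."""
    # HTTP header names are case-insensitive, so normalise them first.
    normalised = {name.lower(): value for name, value in headers.items()}
    host = normalised.get("host", "")
    # Drop an optional ":port" suffix. Assumption: no IPv6 literals in Host.
    hostname = host.partition(":")[0]
    # Domain names are compared case-insensitively.
    return routes.get(hostname.strip().lower())
```

For example, `select_backend({"Host": "api.example.com:443"})` returns the address registered for `api.example.com`, while an unknown or missing host yields `None`, which a proxy might answer with an error response.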
Uses
* Reverse proxies can hide the existence and characteristics of origin servers.
* Application firewall features can protect against common web-based attacks, such as denial-of-service (DoS) or distributed denial-of-service (DDoS) attacks. Without a reverse proxy, removing malware or initiating takedowns, for example, can be difficult.
* In the case of secure websites, a web server may not perform TLS encryption itself, but instead offload the task to a reverse proxy that may be equipped with TLS acceleration hardware. (See TLS termination proxy.)
* A reverse proxy can distribute the load from incoming requests to several servers, with each server supporting its own application area. In the case of reverse proxying web servers, the reverse proxy may have to rewrite the URL in each incoming request in order to match the relevant internal location of the requested resource.
* A reverse proxy can reduce load on its origin servers by caching static content and dynamic content, known as web acceleration. Proxy caches of this sort can often satisfy a considerable number of website requests, greatly reducing the load on the origin server(s).
* A reverse proxy can optimize content by compressing it in order to speed up loading times.
* In a technique named "spoon-feeding", a dynamically generated page can be produced all at once and served to the reverse proxy, which can then return it to the client a little bit at a time. The program that generates the page need not remain running, thus releasing server resources during the possibly extended time the client requires to complete the transfer.
* Reverse proxies can operate wherever multiple web servers must be accessible via a single public IP address. The web servers listen on different ports on the same machine, with the same local IP address, or possibly on different machines with different local IP addresses. The reverse proxy analyzes each incoming request and delivers it to the right server within the local area network.
* Reverse proxies can perform A/B testing and multivariate testing without placing JavaScript tags or code into pages.
* A reverse proxy can add access authentication to a web server that does not have any authentication.
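Two of the uses listed above, URL rewriting per application area and A/B testing at the proxy, can be sketched in a few lines. Everything named here is an assumption made for illustration: the `PREFIX_MAP` table, its addresses, the function names, and the idea of bucketing clients by hashing an identifier such as a cookie value.

```python
import hashlib
from typing import Optional, Tuple

# Hypothetical mapping: public path prefix -> (internal server, internal prefix).
PREFIX_MAP = {
    "/shop/": ("10.0.0.30:8000", "/"),
    "/blog/": ("10.0.0.40:8000", "/posts/"),
}


def rewrite(path: str) -> Optional[Tuple[str, str]]:
    """Map a public request path to (internal server, rewritten path)."""
    for prefix, (server, internal_prefix) in PREFIX_MAP.items():
        if path.startswith(prefix):
            # Replace the public prefix with the internal location.
            return server, internal_prefix + path[len(prefix):]
    return None  # no application area matches this path


def ab_variant(client_id: str, share_b: float = 0.5) -> str:
    """Deterministically assign a client to variant "A" or "B"."""
    digest = hashlib.sha256(client_id.encode("utf-8")).digest()
    # Map the first 8 bytes of the hash to a number in [0, 1).
    fraction = int.from_bytes(digest[:8], "big") / 2**64
    return "B" if fraction < share_b else "A"
```

Because the variant is derived from a hash rather than stored, the same client always lands in the same bucket, and the proxy can route it to the matching back end without any script in the page.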
Risks
* A reverse proxy can track all IP addresses making requests through it and it can also read and modify any non-encrypted traffic. Thus it can log passwords or inject malware, and might do so if compromised or run by a malicious party.
* When the transit traffic is encrypted and the reverse proxy needs to filter/cache/compress or otherwise modify or improve the traffic, the proxy first must decrypt and re-encrypt communications. This requires the proxy to possess the TLS certificate and its corresponding private key, extending the number of systems that can have access to non-encrypted data and making it a more valuable target for attackers.
* The vast majority of external data breaches happen either when hackers succeed in abusing an existing reverse proxy that was intentionally deployed by an organisation, or when hackers succeed in converting an existing Internet-facing server into a reverse proxy server. Compromised or converted systems allow external attackers to specify where they want their attacks proxied to, enabling their access to internal networks and systems.
* Applications that were developed for the internal use of a company are not typically hardened to public standards and are not necessarily designed to withstand all hacking attempts. When an organisation allows external access to such internal applications via a reverse proxy, they might unintentionally increase their own attack surface and invite hackers.
* If a reverse proxy is not configured to filter attacks or it does not receive daily updates to keep its attack signature database up to date, a zero-day vulnerability can pass through unfiltered, enabling attackers to gain control of the system(s) that are behind the reverse proxy server.
* Using the reverse proxy of a third party (e.g. Cloudflare, Imperva) places the entire triad of confidentiality, integrity and availability in the hands of the third party who operates the proxy.
* If a reverse proxy is fronting many different domains, its outage (e.g. by a misconfiguration or DDoS attack) could bring down all fronted domains.
* Reverse proxies can also become a single point of failure if there is no other way to access the back-end server.
See also
* Network address translation