Web caching

A Web cache (or HTTP cache) is a system for optimizing the World Wide Web. It is implemented both client-side and server-side. The caching of multimedia and other files can result in less overall delay when browsing the Web.


Parts of the system


Forward and reverse

A forward cache is a cache outside the web server's network, e.g. in the client's web browser, in an ISP, or within a corporate network. A network-aware forward cache only caches heavily accessed items. A proxy server sitting between the client and web server can evaluate HTTP headers and choose whether to store web content. A reverse cache sits in front of one or more web servers, accelerating requests from the Internet and reducing peak server load. This is usually a content delivery network (CDN) that retains copies of web content at various points throughout a network.
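
As an illustration of how a proxy might evaluate HTTP headers before storing a response, the following is a minimal Python sketch. It is deliberately simplified and is not the full storability logic of the HTTP caching specification: the function names are hypothetical, header names are assumed to be spelled exactly as shown, only successful GET responses are considered, and the naive comma split does not handle quoted directive arguments that themselves contain commas.

```python
def parse_cache_control(value):
    """Parse a Cache-Control header value into a dict of lowercase directives."""
    directives = {}
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue
        name, _, arg = part.partition("=")
        directives[name.strip().lower()] = arg.strip().strip('"')
    return directives


def shared_cache_may_store(method, status, headers):
    """Simplified check: may a shared (proxy) cache store this response?"""
    cc = parse_cache_control(headers.get("Cache-Control", ""))
    # Assumption for this sketch: only successful GET responses are cached.
    if method != "GET" or status != 200:
        return False
    # "no-store" forbids storing; "private" forbids storing in a shared cache.
    if "no-store" in cc or "private" in cc:
        return False
    # Require explicit freshness information instead of guessing a lifetime.
    return "max-age" in cc or "s-maxage" in cc or "Expires" in headers


print(shared_cache_may_store("GET", 200, {"Cache-Control": "public, max-age=3600"}))
print(shared_cache_may_store("GET", 200, {"Cache-Control": "private, max-age=3600"}))
```

A browser cache is a private cache, so it could store the second response even though this shared-cache check rejects it.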


HTTP options

The Hypertext Transfer Protocol (HTTP) defines three basic mechanisms for controlling caches: freshness, validation, and invalidation. This is specified in the header of HTTP response messages from the server.

Freshness allows a response to be used without re-checking it on the origin server, and can be controlled by both the server and the client. For example, the Expires response header gives a date when the document becomes stale, and the Cache-Control: max-age directive tells the cache how many seconds the response is fresh for.

Validation can be used to check whether a cached response is still good after it becomes stale. For example, if the response has a Last-Modified header, a cache can make a conditional request using the If-Modified-Since header to see if it has changed. The ETag (entity tag) mechanism also allows for both strong and weak validation.

Invalidation is usually a side effect of another request that passes through the cache. For example, if a URL associated with a cached response subsequently gets a POST, PUT or DELETE request, the cached response will be invalidated. Many CDNs and manufacturers of network equipment have replaced this standard HTTP cache control with dynamic caching.
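
The three mechanisms can be sketched in a few lines of Python. This is an illustrative simplification, not a complete cache: the function names are hypothetical, header names are assumed to be spelled exactly as shown, the time the response was stored stands in for the response's Date header, and an unparseable Expires value is treated as already stale.

```python
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime


def freshness_lifetime(headers, response_time):
    """Seconds the response stays fresh; max-age takes precedence over Expires."""
    for part in headers.get("Cache-Control", "").split(","):
        name, _, arg = part.strip().partition("=")
        if name.lower() == "max-age" and arg.isdigit():
            return int(arg)
    if "Expires" in headers:
        try:
            expires = parsedate_to_datetime(headers["Expires"])
            return max(0.0, (expires - response_time).total_seconds())
        except (TypeError, ValueError):
            return 0  # invalid or zone-less date: treat as already stale
    return 0


def is_fresh(headers, response_time, now):
    """Freshness: can the stored response be reused without contacting the server?"""
    age = (now - response_time).total_seconds()
    return age < freshness_lifetime(headers, response_time)


def conditional_headers(cached_headers):
    """Validation: request headers for revalidating a stale stored response."""
    request = {}
    if "ETag" in cached_headers:
        request["If-None-Match"] = cached_headers["ETag"]
    if "Last-Modified" in cached_headers:
        request["If-Modified-Since"] = cached_headers["Last-Modified"]
    return request


def invalidates(method):
    """Invalidation: does a request with this method invalidate the stored URL?"""
    return method in {"POST", "PUT", "DELETE"}


stored_at = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
cached = {
    "Cache-Control": "max-age=60",
    "ETag": '"abc"',
    "Last-Modified": "Mon, 01 Jan 2024 11:00:00 GMT",
}
print(is_fresh(cached, stored_at, stored_at + timedelta(seconds=30)))  # True
print(conditional_headers(cached))
```

If the server answers the conditional request with 304 Not Modified, the cache can keep serving its stored copy; any other successful response replaces it.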


Legality

In 1998, the DMCA added rules to the United States Code (17 U.S.C. § 512) that exempt system operators from copyright liability for the purposes of caching.


Server-side software

This is a list of server-side web caching software.


See also

* InterPlanetary File System - makes web caches redundant
* Cache Discovery Protocol
* Cache manifest in HTML5
* Content delivery network
* Harvest project
* Proxy server
* Web accelerator
* Search engine cache


Further reading

* Ari Luotonen, Web Proxy Servers (Prentice Hall, 1997).
* Duane Wessels, Web Caching (O'Reilly and Associates, 2001).
* Michael Rabinovich and Oliver Spatscheck, Web Caching and Replication (Addison Wesley, 2001).