computing Computing is any goal-oriented activity requiring, benefiting from, or creating computer, computing machinery. It includes the study and experimentation of algorithmic processes, and the development of both computer hardware, hardware and softw ...

, a cache ( ) is a hardware or software component that stores data so that future requests for that data can be served faster; the data stored in a cache might be the result of an earlier computation or a copy of data stored elsewhere. A cache hit occurs when the requested data can be found in a cache, while a cache miss occurs when it cannot. Cache hits are served by reading data from the cache, which is faster than recomputing a result or reading from a slower data store; thus, the more requests that can be served from the cache, the faster the system performs. To be cost-effective, caches must be relatively small. Nevertheless, caches are effective in many areas of computing because typical

computer applications Application software is any computer program that is intended for end-user use not operating, administering or programming the computer. An application (app, application program, software application) is any program that can be categorized as ...

access data with a high degree of

locality of reference In computer science, locality of reference, also known as the principle of locality, is the tendency of a processor to access the same set of memory locations repetitively over a short period of time. There are two basic types of reference localit ...

. Such access patterns exhibit temporal locality, where data is requested that has been recently requested, and spatial locality, where data is requested that is stored near data that has already been requested.

Motivation

In memory design, there is an inherent trade-off between capacity and speed because larger capacity implies larger size and thus greater physical distances for signals to travel causing

propagation delay Propagation delay is the time duration taken for a signal to reach its destination, for example in the electromagnetic field, a wire, speed of sound, gas, fluid or seismic wave, solid body. Physics * An electromagnetic wave travelling through ...

s. There is also a tradeoff between high-performance technologies such as SRAM and cheaper, easily mass-produced commodities such as

DRAM Dram, DRAM, or drams may refer to: Technology and engineering * Dram (unit), a unit of mass and volume, and an informal name for a small amount of liquor, especially whisky or whiskey * Dynamic random-access memory, a type of electronic semicondu ...

, flash, or

hard disk A hard disk drive (HDD), hard disk, hard drive, or fixed disk is an electro-mechanical data storage device that stores and retrieves digital data using magnetic storage with one or more rigid rapidly rotating hard disk drive platter, pla ...

s. The buffering provided by a cache benefits one or both of latency and

throughput Network throughput (or just throughput, when in context) refers to the rate of message delivery over a communication channel in a communication network, such as Ethernet or packet radio. The data that these messages contain may be delivered ov ...

(

bandwidth Bandwidth commonly refers to: * Bandwidth (signal processing) or ''analog bandwidth'', ''frequency bandwidth'', or ''radio bandwidth'', a measure of the width of a frequency range * Bandwidth (computing), the rate of data transfer, bit rate or thr ...

). A larger resource incurs a significant latency for access – e.g. it can take hundreds of clock cycles for a modern 4 GHz processor to reach DRAM. This is mitigated by reading large chunks into the cache, in the hope that subsequent reads will be from nearby locations and can be read from the cache. Prediction or explicit prefetching can be used to guess where future reads will come from and make requests ahead of time; if done optimally, the latency is bypassed altogether. The use of a cache also allows for higher throughput from the underlying resource, by assembling multiple fine-grain transfers into larger, more efficient requests. In the case of DRAM circuits, the additional throughput may be gained by using a wider data bus.

Operation

Hardware implements cache as a

block Block or blocked may refer to: Arts, entertainment and media Broadcasting * Block programming, the result of a programming strategy in broadcasting * W242BX, a radio station licensed to Greenville, South Carolina, United States known as ''96.3 ...

of memory for temporary storage of data likely to be used again.

Central processing unit A central processing unit (CPU), also called a central processor, main processor, or just processor, is the primary Processor (computing), processor in a given computer. Its electronic circuitry executes Instruction (computing), instructions ...

s (CPUs),

solid-state drive A solid-state drive (SSD) is a type of solid-state storage device that uses integrated circuits to store data persistently. It is sometimes called semiconductor storage device, solid-state device, or solid-state disk. SSDs rely on non- ...

s (SSDs) and hard disk drives (HDDs) frequently include hardware-based cache, while

web browser A web browser, often shortened to browser, is an application for accessing websites. When a user requests a web page from a particular website, the browser retrieves its files from a web server and then displays the page on the user's scr ...

s and

web server A web server is computer software and underlying Computer hardware, hardware that accepts requests via Hypertext Transfer Protocol, HTTP (the network protocol created to distribute web content) or its secure variant HTTPS. A user agent, co ...

s commonly rely on software caching. A cache is made up of a pool of entries. Each entry has associated ''data'', which is a copy of the same data in some ''backing store''. Each entry also has a ''tag'', which specifies the identity of the data in the backing store of which the entry is a copy. When the cache client (a CPU, web browser,

operating system An operating system (OS) is system software that manages computer hardware and software resources, and provides common daemon (computing), services for computer programs. Time-sharing operating systems scheduler (computing), schedule tasks for ...

) needs to access data presumed to exist in the backing store, it first checks the cache. If an entry can be found with a tag matching that of the desired data, the data in the entry is used instead. This situation is known as a cache hit. For example, a web browser program might check its local cache on disk to see if it has a local copy of the contents of a web page at a particular

URL A uniform resource locator (URL), colloquially known as an address on the Web, is a reference to a resource that specifies its location on a computer network and a mechanism for retrieving it. A URL is a specific type of Uniform Resource Identi ...

. In this example, the URL is the tag, and the content of the web page is the data. The percentage of accesses that result in cache hits is known as the hit rate or hit ratio of the cache. The alternative situation, when the cache is checked and found not to contain any entry with the desired tag, is known as a cache miss. This requires a more expensive access of data from the backing store. Once the requested data is retrieved, it is typically copied into the cache, ready for the next access. During a cache miss, some other previously existing cache entry is typically removed in order to make room for the newly retrieved data. The

heuristic A heuristic or heuristic technique (''problem solving'', '' mental shortcut'', ''rule of thumb'') is any approach to problem solving that employs a pragmatic method that is not fully optimized, perfected, or rationalized, but is nevertheless ...

used to select the entry to replace is known as the replacement policy. One popular replacement policy, least recently used (LRU), replaces the oldest entry, the entry that was accessed less recently than any other entry. More sophisticated caching algorithms also take into account the frequency of use of entries.

Write policies

Cache writes must eventually be propagated to the backing store. The timing for this is governed by the ''write policy''. The two primary write policies are: * ''Write-through'': Writes are performed synchronously to both the cache and the backing store. * ''Write-back'': Initially, writing is done only to the cache. The write to the backing store is postponed until the modified content is about to be replaced by another cache block. A write-back cache is more complex to implement since it needs to track which of its locations have been written over and mark them as ''dirty'' for later writing to the backing store. The data in these locations are written back to the backing store only when they are evicted from the cache, a process referred to as a ''lazy write''. For this reason, a read miss in a write-back cache may require two memory accesses to the backing store: one to write back the dirty data, and one to retrieve the requested data. Other policies may also trigger data write-back. The client may make many changes to data in the cache, and then explicitly notify the cache to write back the data. Write operations do not return data. Consequently, a decision needs to be made for write misses: whether or not to load the data into the cache. This is determined by these ''write-miss policies'': * ''Write allocate'' (also called ''fetch on write''): Data at the missed-write location is loaded to cache, followed by a write-hit operation. In this approach, write misses are similar to read misses. * ''No-write allocate'' (also called ''write-no-allocate'' or ''write around''): Data at the missed-write location is not loaded to cache, and is written directly to the backing store. In this approach, data is loaded into the cache on read misses only. While both write policies can Implement either write-miss policy, they are typically paired as follows: * A write-back cache typically employs write allocate, anticipating that subsequent writes or reads to the same location will benefit from having the data already in the cache. * A write-through cache uses no-write allocate. Here, subsequent writes have no advantage, since they still need to be written directly to the backing store. Entities other than the cache may change the data in the backing store, in which case the copy in the cache may become out-of-date or ''stale''. Alternatively, when the client updates the data in the cache, copies of that data in other caches will become stale. Communication protocols between the cache managers that keep the data consistent are associated with

cache coherence In computer architecture, cache coherence is the uniformity of shared resource data that is stored in multiple local caches. In a cache coherent system, if multiple clients have a cached copy of the same region of a shared memory resource, all ...

Prefetch

On a cache read miss, caches with a ''

demand paging In computer operating systems, demand paging (as opposed to anticipatory paging) is a method of virtual memory management. In a system that uses demand paging, the operating system copies a disk page into physical memory only when an attempt is m ...

policy'' read the minimum amount from the backing store. A typical demand-paging virtual memory implementation reads one page of virtual memory (often 4 KB) from disk into the disk cache in RAM. A typical CPU reads a single L2 cache line of 128 bytes from DRAM into the L2 cache, and a single L1 cache line of 64 bytes from the L2 cache into the L1 cache. Caches with a

prefetch input queue Fetching the instruction opcodes from program memory well in advance is known as prefetching and it is served by using a prefetch input queue (PIQ). The pre-fetched instructions are stored in a queue. The fetching of opcodes well in advance, pr ...

or more general ''anticipatory paging policy'' go further—they not only read the data requested, but guess that the next chunk or two of data will soon be required, and so prefetch that data into the cache ahead of time. Anticipatory paging is especially helpful when the backing store has a long latency to read the first chunk and much shorter times to sequentially read the next few chunks, such as

disk storage Disc or disk may refer to: * Disk (mathematics) In geometry, a disk (Spelling of disc, also spelled disc) is the region in a plane (geometry), plane bounded by a circle. A disk is said to be ''closed'' if it contains the circle that constitut ...

and DRAM. A few operating systems go further with a loader that always pre-loads the entire executable into RAM. A few caches go even further, not only pre-loading an entire file, but also starting to load other related files that may soon be requested, such as the

page cache In computing, a page cache, sometimes also called disk cache, is a transparent cache for the pages originating from a secondary storage device such as a hard disk drive (HDD) or a solid-state drive (SSD). The operating system keeps a page ca ...

associated with a

prefetcher The Prefetcher is a component of Microsoft Windows which was introduced in Windows XP. It is a component of the Memory management, Memory Manager that can speed up the Windows booting, boot Windows NT Startup Process, process and shorten the amount ...

or the

web cache A web cache (or HTTP cache) is a system for optimizing the World Wide Web. It is implemented both client-side and server-side. The caching of multimedia and other files can result in less overall delay when web browser, browsing the Web. Parts o ...

associated with link prefetching.

Examples of hardware caches

CPU cache

Small memories on or close to the CPU can operate faster than the much larger

main memory Computer data storage or digital data storage is a technology consisting of computer components and recording media that are used to retain digital data. It is a core function and fundamental component of computers. The central processin ...

. Most CPUs since the 1980s have used one or more caches, sometimes in cascaded levels; modern high-end embedded,

desktop A desktop traditionally refers to: * The surface of a desk (often to distinguish office appliances that fit on a desk, such as photocopiers and printers, from larger equipment covering its own area on the floor) Desktop may refer to various compu ...

and server

microprocessor A microprocessor is a computer processor (computing), processor for which the data processing logic and control is included on a single integrated circuit (IC), or a small number of ICs. The microprocessor contains the arithmetic, logic, a ...

s may have as many as six types of cache (between levels and functions). Some examples of caches with a specific function are the D-cache, I-cache and the

translation lookaside buffer A translation lookaside buffer (TLB) is a memory CPU cache, cache that stores the recent translations of virtual memory address to a physical memory Memory_address, location. It is used to reduce the time taken to access a user memory location. It ...

for the

memory management unit A memory management unit (MMU), sometimes called paged memory management unit (PMMU), is a computer hardware unit that examines all references to computer memory, memory, and translates the memory addresses being referenced, known as virtual mem ...

(MMU).

GPU cache

Earlier

graphics processing unit A graphics processing unit (GPU) is a specialized electronic circuit designed for digital image processing and to accelerate computer graphics, being present either as a discrete video card or embedded on motherboards, mobile phones, personal ...

s (GPUs) often had limited read-only texture caches and used swizzling to improve 2D

Cache miss In computing, a cache ( ) is a hardware or software component that stores data so that future requests for that data can be served faster; the data stored in a cache might be the result of an earlier computation or a copy of data stored elsew ...

es would drastically affect performance, e.g. if mipmapping was not used. Caching was important to leverage 32-bit (and wider) transfers for texture data that was often as little as 4 bits per pixel. As GPUs advanced, supporting

general-purpose computing on graphics processing units General-purpose computing on graphics processing units (GPGPU, or less often GPGP) is the use of a graphics processing unit (GPU), which typically handles computation only for computer graphics, to perform computation in applications traditional ...

and

compute kernel In computing, a compute kernel is a routine compiled for high throughput accelerators (such as graphics processing units (GPUs), digital signal processors (DSPs) or field-programmable gate arrays (FPGAs)), separate from but used by a main pro ...

s, they have developed progressively larger and increasingly general caches, including

instruction cache A CPU cache is a hardware cache used by the central processing unit (CPU) of a computer to reduce the average cost (time or energy) to access data from the main memory. A cache is a smaller, faster memory, located closer to a processor core, which ...

s for

shader In computer graphics, a shader is a computer program that calculates the appropriate levels of light, darkness, and color during the rendering of a 3D scene—a process known as '' shading''. Shaders have evolved to perform a variety of s ...

s, exhibiting functionality commonly found in CPU caches. These caches have grown to handle synchronization primitives between threads and

atomic operation In concurrent programming, an operation (or set of operations) is linearizable if it consists of an ordered list of Execution (computing), invocation and response Event (computing), events, that may be extended by adding response events such tha ...

s, and interface with a CPU-style MMU.

DSPs

Digital signal processor A digital signal processor (DSP) is a specialized microprocessor chip, with its architecture optimized for the operational needs of digital signal processing. DSPs are fabricated on metal–oxide–semiconductor (MOS) integrated circuit chips. ...

s have similarly generalized over the years. Earlier designs used scratchpad memory fed by

direct memory access Direct memory access (DMA) is a feature of computer systems that allows certain hardware subsystems to access main system computer memory, memory independently of the central processing unit (CPU). Without DMA, when the CPU is using programmed i ...

, but modern DSPs such as

Qualcomm Hexagon Hexagon is the brand name for a family of digital signal processor (DSP) and later neural processing unit (NPU) products by Qualcomm. Hexagon is also known as QDSP6, standing for “sixth generation digital signal processor.” According to Qua ...

often include a very similar set of caches to a CPU (e.g. Modified Harvard architecture with shared L2, split L1 I-cache and D-cache).

Translation lookaside buffer

A memory management unit (MMU) that fetches page table entries from main memory has a specialized cache, used for recording the results of virtual address to

physical address In computing, a physical address (also real address, or binary address), is a memory address that is represented in the form of a binary number on the address bus circuitry in order to enable the data bus to access a ''particular'' storage cell o ...

translations. This specialized cache is called a translation lookaside buffer (TLB).

In-network cache

Information-centric networking

Information-centric networking (ICN) is an approach to evolve the

Internet The Internet (or internet) is the Global network, global system of interconnected computer networks that uses the Internet protocol suite (TCP/IP) to communicate between networks and devices. It is a internetworking, network of networks ...

infrastructure away from a host-centric paradigm, based on perpetual connectivity and the

end-to-end principle The end-to-end principle is a design principle in computer networking that requires application-specific features (such as reliability and security) to be implemented in the communicating end nodes of the network, instead of in the network itse ...

, to a network architecture in which the focal point is identified information. Due to the inherent caching capability of the nodes in an ICN, it can be viewed as a loosely connected network of caches, which has unique requirements for caching policies. However, ubiquitous content caching introduces the challenge to content protection against unauthorized access, which requires extra care and solutions. Unlike proxy servers, in ICN the cache is a network-level solution. Therefore, it has rapidly changing cache states and higher request arrival rates; moreover, smaller cache sizes impose different requirements on the content eviction policies. In particular, eviction policies for ICN should be fast and lightweight. Various cache replication and eviction schemes for different ICN architectures and applications have been proposed.

Policies

=Time aware least recently used

= The time aware least recently used (TLRU) is a variant of LRU designed for the situation where the stored contents in cache have a valid lifetime. The algorithm is suitable in network cache applications, such as ICN,

content delivery network A content delivery network (CDN) or content distribution network is a geographically distributed network of proxy servers and their data centers. The goal is to provide high availability and performance ("speed") by distributing the service spat ...

s (CDNs) and distributed networks in general. TLRU introduces a new term: time to use (TTU). TTU is a time stamp on content which stipulates the usability time for the content based on the locality of the content and information from the content publisher. Owing to this locality-based time stamp, TTU provides more control to the local administrator to regulate in-network storage. In the TLRU algorithm, when a piece of content arrives, a cache node calculates the local TTU value based on the TTU value assigned by the content publisher. The local TTU value is calculated by using a locally-defined function. Once the local TTU value is calculated the replacement of content is performed on a subset of the total content stored in cache node. The TLRU ensures that less popular and short-lived content should be replaced with incoming content.

=Least frequent recently used

= The least frequent recently used (LFRU) cache replacement scheme combines the benefits of LFU and LRU schemes. LFRU is suitable for network cache applications, such as ICN, CDNs and distributed networks in general. In LFRU, the cache is divided into two partitions called privileged and unprivileged partitions. The privileged partition can be seen as a protected partition. If content is highly popular, it is pushed into the privileged partition. Replacement of the privileged partition is done by first evicting content from the unprivileged partition, then pushing content from the privileged partition to the unprivileged partition, and finally inserting new content into the privileged partition. In the above procedure, the LRU is used for the privileged partition and an approximated LFU (ALFU) scheme is used for the unprivileged partition. The basic idea is to cache the locally popular content with the ALFU scheme and push the popular content to the privileged partition.

Weather forecast

In 2011, the use of smartphones with weather forecasting options was overly taxing

AccuWeather AccuWeather, Inc. is a private-sector American media company that provides commercial weather forecasting services. AccuWeather was founded in 1962 by Joel N. Myers. The company adopted the name 'AccuWeather' in 1971. AccuWeather is headquar ...

servers; two requests from the same area would generate separate requests. An optimization by edge-servers to truncate the GPS coordinates to fewer decimal places meant that the cached results from a nearby query would be used. The number of to-the-server lookups per day dropped by half.

Software caches

Disk cache

While CPU caches are generally managed entirely by hardware, a variety of software manages other caches. The

in main memory is managed by the

operating system kernel A kernel is a computer program at the core of a computer's operating system that always has complete control over everything in the system. The kernel is also responsible for preventing and mitigating conflicts between different processes. It is ...

. While the

disk buffer In computer storage, a disk buffer (often ambiguously called a disk cache or a cache buffer) is the embedded memory in a hard disk drive (HDD) or solid-state drive (SSD) acting as a buffer (computer science), buffer between the rest of the com ...

, which is an integrated part of the hard disk drive or solid state drive, is sometimes misleadingly referred to as ''disk cache'', its main functions are write sequencing and read prefetching. High-end

disk controller A disk controller is a controller circuit that enables a CPU to communicate with a hard disk, floppy disk or other kind of disk drive. It also provides an interface between the disk drive and the bus connecting it to the rest of the system.{ ...

s often have their own on-board cache for the hard disk drive's data blocks. Finally, a fast local hard disk drive can also cache information held on even slower data storage devices, such as remote servers (

) or local

tape drive A tape drive is a data storage device that reads and writes data on a magnetic tape. Magnetic-tape data storage is typically used for offline, archival data storage. Tape media generally has a favorable unit cost and long archival stability. ...

s or optical jukeboxes; such a scheme is the main concept of

hierarchical storage management Hierarchical storage management (HSM), also known as tiered storage, is a Computer data storage, data storage and data management technique that automatically moves data between high-cost and low-cost data storage media, storage media. HSM systems ...

. Also, fast flash-based solid-state drives (SSDs) can be used as caches for slower rotational-media hard disk drives, working together as

hybrid drive A hybrid drive (solid state hybrid drive – SSHD, and dual-storage drive) is a logical or physical computer storage device that combines a faster storage medium such as solid-state drive (SSD) with a higher-capacity hard disk drive (HDD). Th ...

Web cache

Web browsers and web proxy servers, either locally or at the

Internet service provider An Internet service provider (ISP) is an organization that provides a myriad of services related to accessing, using, managing, or participating in the Internet. ISPs can be organized in various forms, such as commercial, community-owned, no ...

(ISP), employ web caches to store previous responses from web servers, such as

web page A web page (or webpage) is a World Wide Web, Web document that is accessed in a web browser. A website typically consists of many web pages hyperlink, linked together under a common domain name. The term "web page" is therefore a metaphor of pap ...

s and

image An image or picture is a visual representation. An image can be Two-dimensional space, two-dimensional, such as a drawing, painting, or photograph, or Three-dimensional space, three-dimensional, such as a carving or sculpture. Images may be di ...

s. Web caches reduce the amount of information that needs to be transmitted across the network, as information previously stored in the cache can often be re-used. This reduces bandwidth and processing requirements of the web server, and helps to improve

responsiveness Responsiveness as a concept of computer science refers to the specific ability of a system or functional unit to complete assigned tasks within a given time. For example, it would refer to the ability of an artificial intelligence system to und ...

for users of the web. Another form of cache is

P2P caching Peer-to-peer caching (P2P caching) is a computer network traffic management technology used by Internet Service Providers (ISPs) to accelerate content delivered over peer-to-peer (P2P) networks while reducing related bandwidth costs. P2P Cache (c ...

, where the files most sought for by

peer-to-peer Peer-to-peer (P2P) computing or networking is a distributed application architecture that partitions tasks or workloads between peers. Peers are equally privileged, equipotent participants in the network, forming a peer-to-peer network of Node ...

applications are stored in an ISP cache to accelerate P2P transfers. Similarly, decentralised equivalents exist, which allow communities to perform the same task for P2P traffic, for example, Corelli.

Memoization

A cache can store data that is computed on demand rather than retrieved from a backing store.

Memoization In computing, memoization or memoisation is an optimization technique used primarily to speed up computer programs by storing the results of expensive function calls to pure functions and returning the cached result when the same inputs occur ag ...

is an

optimization Mathematical optimization (alternatively spelled ''optimisation'') or mathematical programming is the selection of a best element, with regard to some criteria, from some set of available alternatives. It is generally divided into two subfiel ...

technique that stores the results of resource-consuming

function call In computer programming, a function (also procedure, method, subroutine, routine, or subprogram) is a callable unit of software logic that has a well-defined interface and behavior and can be invoked multiple times. Callable units provide a p ...

s within a lookup table, allowing subsequent calls to reuse the stored results and avoid repeated computation. It is related to the dynamic programming algorithm design methodology, which can also be thought of as a means of caching.

Content delivery network

(CDN) is a network of distributed servers that deliver pages and other

web content Web content is the text, visual or audio content that is made available online and user encountered as part of the online usage and experience on websites. It may include text, images, sounds and audio, online videos, among other items place ...

to a user, based on the geographic locations of the user, the origin of the web page and the content delivery server. CDNs were introduced in the late 1990s as a way to speed up the delivery of static content, such as HTML pages, images and videos. By replicating content on multiple servers around the world and delivering it to users based on their location, CDNs can significantly improve the speed and availability of a website or application. When a user requests a piece of content, the CDN will check to see if it has a copy of the content in its cache. If it does, the CDN will deliver the content to the user from the cache.

Cloud storage gateway

A cloud storage gateway is a hybrid cloud storage device that connects a local network to one or more

cloud storage service A file-hosting service, also known as cloud-storage service, online file-storage provider, or cyberlocker, is an internet hosting service specifically designed to host user files. These services allow users to upload files that can be accessed o ...

s, typically object storage services such as

Amazon S3 Amazon Simple Storage Service (S3) is a service offered by Amazon Web Services (AWS) that provides object storage through a web service interface. Amazon S3 uses the same scalable storage infrastructure that Amazon.com uses to run its e-commerc ...

. It provides a cache for frequently accessed data, providing high speed local access to frequently accessed data in the cloud storage service. Cloud storage gateways also provide additional benefits such as accessing cloud object storage through traditional file serving protocols as well as continued access to cached data during connectivity outages.

Other caches

The BIND DNS daemon caches a mapping of domain names to

IP address An Internet Protocol address (IP address) is a numerical label such as that is assigned to a device connected to a computer network that uses the Internet Protocol for communication. IP addresses serve two main functions: network interface i ...

es, as does a DNS resolver library. Write-through operation is common when operating over unreliable networks, because of the enormous complexity of the coherency protocol required between multiple write-back caches when communication is unreliable. For instance, web page caches and client-side caches for

distributed file system A clustered file system (CFS) is a file system which is shared by being simultaneously Mount (computing), mounted on multiple Server (computing), servers. There are several approaches to computer cluster, clustering, most of which do not emplo ...

s (like those in NFS or SMB) are typically read-only or write-through specifically to keep the network protocol simple and reliable.

Web search engine A search engine is a software system that provides hyperlinks to web pages, and other relevant information on World Wide Web, the Web in response to a user's web query, query. The user enters a query in a web browser or a mobile app, and the sea ...

s also frequently make web pages they have indexed available from their cache. This can prove useful when web pages from a web server are temporarily or permanently inaccessible. Database caching can substantially improve the throughput of

database In computing, a database is an organized collection of data or a type of data store based on the use of a database management system (DBMS), the software that interacts with end users, applications, and the database itself to capture and a ...

applications, for example in the processing of

indexes Index (: indexes or indices) may refer to: Arts, entertainment, and media Fictional entities * Index (A Certain Magical Index), Index (''A Certain Magical Index''), a character in the light novel series ''A Certain Magical Index'' * The Index, a ...

, data dictionaries, and frequently used subsets of data. A

distributed cache In computing, a distributed cache is an extension of the traditional concept of cache used in a single locale. A distributed cache may span multiple servers so that it can grow in size and in transactional capacity. It is mainly used to store ap ...

uses networked hosts to provide scalability, reliability and performance to the application. The hosts can be co-located or spread over different geographical regions.

Buffer vs. cache

The semantics of a "buffer" and a "cache" are not totally different; even so, there are fundamental differences in intent between the process of caching and the process of buffering. Fundamentally, caching realizes a performance increase for transfers of data that is being repeatedly transferred. While a caching system may realize a performance increase upon the initial (typically write) transfer of a data item, this performance increase is due to buffering occurring within the caching system. With read caches, a data item must have been fetched from its residing location at least once in order for subsequent reads of the data item to realize a performance increase by virtue of being able to be fetched from the cache's (faster) intermediate storage rather than the data's residing location. With write caches, a performance increase of writing a data item may be realized upon the first write of the data item by virtue of the data item immediately being stored in the cache's intermediate storage, deferring the transfer of the data item to its residing storage at a later stage or else occurring as a background process. Contrary to strict buffering, a caching process must adhere to a (potentially distributed) cache coherency protocol in order to maintain consistency between the cache's intermediate storage and the location where the data resides. Buffering, on the other hand, * reduces the number of transfers for otherwise novel data amongst communicating processes, which amortizes overhead involved for several small transfers over fewer, larger transfers, * provides an intermediary for communicating processes which are incapable of direct transfers amongst each other, or * ensures a minimum data size or representation required by at least one of the communicating processes involved in a transfer. With typical caching implementations, a data item that is read or written for the first time is effectively being buffered; and in the case of a write, mostly realizing a performance increase for the application from where the write originated. Additionally, the portion of a caching protocol where individual writes are deferred to a batch of writes is a form of buffering. The portion of a caching protocol where individual reads are deferred to a batch of reads is also a form of buffering, although this form may negatively impact the performance of at least the initial reads (even though it may positively impact the performance of the sum of the individual reads). In practice, caching almost always involves some form of buffering, while strict buffering does not involve caching. A buffer is a temporary memory location that is traditionally used because CPU instructions cannot directly address data stored in peripheral devices. Thus, addressable memory is used as an intermediate stage. Additionally, such a buffer may be feasible when a large block of data is assembled or disassembled (as required by a storage device), or when data may be delivered in a different order than that in which it is produced. Also, a whole buffer of data is usually transferred sequentially (for example to hard disk), so buffering itself sometimes increases transfer performance or reduces the variation or jitter of the transfer's latency as opposed to caching where the intent is to reduce the latency. These benefits are present even if the buffered data are written to the buffer once and read from the buffer once. A cache also increases transfer performance. A part of the increase similarly comes from the possibility that multiple small transfers will combine into one large block. But the main performance-gain occurs because there is a good chance that the same data will be read from cache multiple times, or that written data will soon be read. A cache's sole purpose is to reduce accesses to the underlying slower storage. Cache is also usually an

abstraction layer In computing, an abstraction layer or abstraction level is a way of hiding the working details of a subsystem. Examples of software models that use layers of abstraction include the OSI model for network protocols, OpenGL, and other graphics libra ...

that is designed to be invisible from the perspective of neighboring layers.