In computer architecture, the memory hierarchy separates computer storage into a hierarchy based on response time. Since response time, complexity, and capacity are related, the levels may also be distinguished by their performance and controlling technologies.
The memory hierarchy affects performance in computer architectural design, algorithm predictions, and lower-level programming constructs involving locality of reference.
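The effect of locality of reference can be illustrated with a small simulation. The following is a minimal sketch, not a model of any real processor: it assumes a hypothetical direct-mapped cache of 64 lines of 64 bytes each and a 128 × 128 matrix of 8-byte elements stored in row-major order. The function name `count_misses` and all sizes are chosen only for illustration.

```python
# Minimal direct-mapped cache model (hypothetical sizes, not a real CPU).
def count_misses(addresses, num_lines=64, line_size=64):
    """Return the number of misses for a sequence of byte addresses."""
    resident = [None] * num_lines      # block number currently held by each cache line
    misses = 0
    for addr in addresses:
        block = addr // line_size      # which memory block the byte belongs to
        index = block % num_lines      # the single cache line that block may occupy
        if resident[index] != block:
            resident[index] = block    # fetch the block, evicting the previous one
            misses += 1
    return misses


N = 128          # matrix is N x N (assumed)
ELEM = 8         # bytes per element (assumed)

# Row-major storage: element (i, j) lives at byte offset (i * N + j) * ELEM.
row_order = [(i * N + j) * ELEM for i in range(N) for j in range(N)]
col_order = [(i * N + j) * ELEM for j in range(N) for i in range(N)]

row_misses = count_misses(row_order)
col_misses = count_misses(col_order)
print(row_misses, col_misses)
```

Under these assumptions the row-order walk misses once per 64-byte line (2,048 misses for 16,384 accesses), while the column-order walk, which touches the same elements with a 1,024-byte stride, misses on every access.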
Designing for high performance requires considering the restrictions of the memory hierarchy, i.e. the size and capabilities of each component. Each of the various components can be viewed as part of a hierarchy of memories (m₁, m₂, ..., mₙ) in which each member mᵢ is typically smaller and faster than the next highest member mᵢ₊₁ of the hierarchy. To limit waiting by higher levels, a lower level will respond by filling a buffer and then signaling to activate the transfer.
There are four major storage levels.
* ''Internal'' – processor registers and cache.
* Main – the system RAM and controller cards.
* On-line mass storage – secondary storage.
* Off-line bulk storage – tertiary and off-line storage.
This is a general memory hierarchy structuring. Many other structures are useful. For example, a paging algorithm may be considered as a level for virtual memory when designing a computer architecture, and one can include a level of nearline storage between online and offline storage.
Properties of the technologies in the memory hierarchy
* Adding complexity slows down the ''memory hierarchy''.
* CMOx memory technology stretches the Flash space in the memory hierarchy.
* One of the main ways to increase system performance is minimising how far down the memory hierarchy one has to go to manipulate data.
* Latency and bandwidth are two metrics associated with caches. Neither is uniform; each is specific to a particular component of the memory hierarchy.
* Predicting where in the memory hierarchy the data resides is difficult.
* ...the location in the memory hierarchy dictates the time required for the prefetch to occur.
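The cost of going further down the hierarchy can be made concrete with the standard average-access-time calculation. The sketch below is an illustration only: the latencies (in CPU cycles) and miss rates are hypothetical values, and the function name `average_access_time` is not taken from any library.

```python
def average_access_time(levels):
    """levels: list of (latency, miss_rate) ordered from fastest to slowest.

    The last level must have miss_rate 0.0, since it always holds the data.
    """
    total = 0.0
    reach = 1.0                      # fraction of accesses that get this far down
    for latency, miss_rate in levels:
        total += reach * latency     # every access reaching this level pays its latency
        reach *= miss_rate           # only misses continue to the next level
    return total


# Hypothetical latencies in CPU cycles and miss rates, chosen only for illustration.
shallow = [(1, 0.10), (10, 0.20), (100, 0.0)]   # most accesses served near the top
deep = [(1, 0.30), (10, 0.50), (100, 0.0)]      # many accesses fall through

print(average_access_time(shallow))
print(average_access_time(deep))
```

With these assumed numbers the first configuration averages about 4 cycles per access and the second about 19, even though the latency of each individual level is identical in both.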
Examples
The number of levels in the memory hierarchy and the performance at each level have increased over time. The types of memory or storage components have also changed historically. For example, the memory hierarchy of an Intel Haswell Mobile processor circa 2013 is:
* Processor registers – the fastest possible access (usually 1 CPU cycle). A few thousand bytes in size
* Cache
** Level 0 (L0) Micro operations cache – 6,144 bytes (6 KiB) in size
** Level 1 (L1) Instruction cache – 128 KiB in size
** Level 1 (L1) Data cache – 128 KiB in size. Best access speed is around 700 GB/s
** Level 2 (L2) Instruction and data (shared) – 1 MiB in size. Best access speed is around 200 GB/s
** Level 3 (L3) Shared cache – 6 MiB in size. Best access speed is around 100 GB/s
** Level 4 (L4) Shared cache – 128 MiB in size. Best access speed is around 40 GB/s
* Main memory (primary storage) – GiB in size. Best access speed is around 10 GB/s. In the case of a NUMA machine, access times may not be uniform
* Disk storage (secondary storage) – terabytes in size. As of 2017, the best access speed, from a consumer solid-state drive, is about 2000 MB/s
* Nearline storage (tertiary storage) – up to exabytes in size. As of 2013, best access speed is about 160 MB/s
* Offline storage
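To put the bandwidth figures in the list above on a common scale, the following sketch computes how long it would take to stream one decimal gigabyte at each quoted peak rate. It is a rough lower bound only: it ignores latency and the fact that the smaller levels cannot hold a gigabyte at all. The dictionary keys and the function name `transfer_seconds` are illustrative.

```python
# Peak bandwidths in bytes per second, taken from the list above.
BANDWIDTH = {
    "L1 data cache": 700e9,
    "L2 cache": 200e9,
    "L3 cache": 100e9,
    "L4 cache": 40e9,
    "main memory": 10e9,
    "consumer SSD": 2000e6,
    "nearline storage": 160e6,
}


def transfer_seconds(num_bytes, level):
    """Lower bound on the time to stream num_bytes at the level's peak bandwidth."""
    return num_bytes / BANDWIDTH[level]


ONE_GB = 1e9  # decimal gigabyte, matching the GB/s units above
for level in BANDWIDTH:
    print(f"{level:>16}: {transfer_seconds(ONE_GB, level) * 1000:10.2f} ms")
```

On these figures, the same gigabyte that main memory could deliver in about a tenth of a second would take several seconds from nearline storage.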
The lower levels of the hierarchy – from disks downwards – are also known as tiered storage. The formal distinction between online, nearline, and offline storage is:
* Online storage is immediately available for I/O.
* Nearline storage is not immediately available, but can be made online quickly without human intervention.
* Offline storage is not immediately available, and requires some human intervention to bring online.
For example, always-on spinning disks are online, while spinning disks that spin down, such as a massive array of idle disks (MAID), are nearline. Removable media such as tape cartridges that can be automatically loaded, as in a tape library, are nearline, while cartridges that must be manually loaded are offline.
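The three definitions reduce to two yes/no questions, which can be sketched as a small decision function. The function name `classify_storage` and the example device labels are hypothetical; the classification rule itself follows the definitions given above.

```python
def classify_storage(immediately_available, needs_human):
    """Classify a storage device using the online/nearline/offline definitions."""
    if immediately_available:
        return "online"
    # Not immediately available: human intervention separates offline from nearline.
    return "offline" if needs_human else "nearline"


# (immediately available for I/O?, needs human intervention to bring online?)
examples = {
    "always-on spinning disk": (True, False),
    "MAID disk that has spun down": (False, False),
    "tape cartridge in a tape library": (False, False),
    "tape cartridge on a shelf": (False, True),
}
for name, flags in examples.items():
    print(f"{name}: {classify_storage(*flags)}")
```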
Most modern CPUs are so fast that for most program workloads, the bottleneck is the locality of reference of memory accesses and the efficiency of the caching and memory transfer between different levels of the hierarchy. As a result, the CPU spends much of its time idling, waiting for memory I/O to complete. This is sometimes called the ''space cost'', as a larger memory object is more likely to overflow a small/fast level and require use of a larger/slower level. The resulting load on memory use is known as ''pressure'' (respectively ''register pressure'', ''cache pressure'', and (main) ''memory pressure''). Terms for data being missing from a higher level and needing to be fetched from a lower level are, respectively: register spilling (due to register pressure: register to cache), cache miss (cache to main memory), and (hard) page fault (main memory to disk).
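The space cost can be seen in a toy model of a single level. The sketch below assumes a fully associative store with least-recently-used (LRU) replacement, which is a simplification rather than a description of any particular cache or paging system; the function name `lru_misses` and the sizes are illustrative. A program loops repeatedly over a working set, and the simulation counts how often an item has to be fetched from the next level down.

```python
from collections import OrderedDict


def lru_misses(working_set, capacity, passes):
    """Count misses when looping `passes` times over `working_set` items
    through a fully associative LRU store that holds `capacity` items."""
    store = OrderedDict()
    misses = 0
    for _ in range(passes):
        for item in range(working_set):
            if item in store:
                store.move_to_end(item)        # mark as most recently used
            else:
                misses += 1                    # must fetch from the slower level
                store[item] = True
                if len(store) > capacity:
                    store.popitem(last=False)  # evict the least recently used item
    return misses


fits = lru_misses(working_set=64, capacity=64, passes=10)
overflows = lru_misses(working_set=65, capacity=64, passes=10)
print(fits, overflows)
```

When the working set fits, only the first pass misses (64 misses in total). Making the working set one item larger than the capacity causes every one of the 650 accesses to miss, because under LRU each item is evicted just before it is needed again.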
Modern programming languages mainly assume two levels of memory, main memory and disk storage, though in assembly language and inline assemblers in languages such as C, registers can be directly accessed. Taking optimal advantage of the memory hierarchy requires the cooperation of programmers, hardware, and compilers (as well as underlying support from the operating system):
*''Programmers'' are responsible for moving data between disk and memory through file I/O.
*''Hardware'' is responsible for moving data between memory and caches.
*''Optimizing compilers'' are responsible for generating code that, when executed, will cause the hardware to use caches and registers efficiently.
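The programmer's share of this work, moving data between disk and memory through file I/O, might look like the following sketch. It processes a file in fixed-size chunks so that only one chunk is resident in main memory at a time. The chunk size, the function name `sum_bytes_chunked`, and the throwaway demonstration file are all assumptions made for illustration.

```python
import os
import tempfile

CHUNK = 64 * 1024   # bytes held in main memory at a time (assumed size)


def sum_bytes_chunked(path, chunk_size=CHUNK):
    """Sum every byte in a file while keeping only one chunk in memory."""
    total = 0
    chunks = 0
    with open(path, "rb") as f:
        while True:
            block = f.read(chunk_size)   # explicit disk-to-memory transfer
            if not block:                # empty bytes object means end of file
                break
            total += sum(block)
            chunks += 1
    return total, chunks


# Build a throwaway 1,000,000-byte file so the sketch is self-contained.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(bytes([1]) * 1_000_000)
    demo_path = tmp.name

total, chunks = sum_bytes_chunked(demo_path)
os.remove(demo_path)
print(total, chunks)
```

Movement between main memory, caches, and registers does not appear anywhere in this code; as the list above notes, that part is left to the hardware and the compiler.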
Many programmers assume one level of memory. This works fine until the application hits a performance wall. Then the memory hierarchy will be assessed during code refactoring.
See also
* Cache hierarchy
* Use of spatial and temporal locality: hierarchical memory
* Buffer vs. cache
* Cache hierarchy in a modern processor
* Memory wall
* Computer memory
* Hierarchical storage management
* Cloud storage
* Memory access pattern
* Communication-avoiding algorithm