
In computing, a memory address is a reference to a specific memory location used at various levels by software and hardware. Memory addresses are fixed-length sequences of digits conventionally displayed and manipulated as unsigned integers. This numerical treatment rests on features of the CPU (such as the instruction pointer and incremental address registers), as well as on the array-like use of memory endorsed by various programming languages.
Types
Physical addresses
A digital computer's main memory consists of many memory locations. Each memory location has a physical address, which is a code. The CPU (or other device) can use the code to access the corresponding memory location. Generally only system software, i.e. the BIOS, operating systems, and some specialized utility programs (e.g., memory testers), addresses physical memory using machine code operands or processor registers, instructing the CPU to direct a hardware device, called the memory controller, to use the memory bus or system bus, or separate control, address, and data busses, to execute the program's commands. The memory controller's bus consists of a number of parallel lines, each represented by a binary digit (bit). The width of the bus, and thus the number of addressable storage units, and the number of bits in each unit, vary among computers.
Logical addresses
A computer program uses memory addresses to execute machine code, and to store and retrieve data. In early computers logical and physical addresses corresponded, but since the introduction of virtual memory most application programs have no knowledge of physical addresses. Rather, they use logical addresses, or virtual addresses, which the computer's memory management unit and operating system memory mapping translate; see below.
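The translation from a logical address to a physical one can be illustrated with a minimal sketch. It assumes a single-level page table and a 4 KiB page size; the names `PAGE_SIZE`, `page_table` and `translate` are hypothetical, and real memory management units are considerably more elaborate.

```python
PAGE_SIZE = 4096  # assumed page size in bytes (4 KiB); real systems vary

# Hypothetical page table: virtual page number -> physical frame number.
page_table = {0: 5, 1: 2, 2: 9}


def translate(virtual_address: int) -> int:
    """Map a virtual address to a physical address via the page table."""
    page_number, offset = divmod(virtual_address, PAGE_SIZE)
    if page_number not in page_table:
        # Unmapped page; a real MMU would signal a page fault here.
        raise LookupError(f"page {page_number} is not mapped")
    return page_table[page_number] * PAGE_SIZE + offset


# Virtual page 1, offset 0x234 lands in physical frame 2.
print(hex(translate(0x1234)))  # 0x2234
```

The program only ever sees the virtual address `0x1234`; the physical location is decided by the table, which the operating system is free to change.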
Unit of address resolution
Most modern computers are ''byte-addressable''. Each address identifies a single byte (eight bits) of storage. Data larger than a single byte may be stored in a sequence of consecutive addresses. There exist ''word-addressable'' computers, where the minimal addressable storage unit is exactly the processor's word. For example, the Data General Nova minicomputer, and the Texas Instruments TMS9900 and National Semiconductor IMP-16 microcomputers used 16-bit words, and there were many 36-bit mainframe computers (e.g., PDP-10) which used 18-bit word addressing, not byte addressing, giving an address space of 2^18 36-bit words, approximately 1 megabyte of storage.
The amount of memory that can be addressed depends on the bit width of the bus used for addresses: the more bits used, the more addresses are available to the computer. For example, an 8-bit-byte-addressable machine with a 20-bit address bus (e.g. Intel 8086) can address 2^20 (1,048,576) memory locations, or one MiB of memory, while a 32-bit bus (e.g. Intel 80386) addresses 2^32 (4,294,967,296) locations, or a 4 GiB address space. In contrast, a 36-bit word-addressable machine with an 18-bit address bus addresses only 2^18 (262,144) 36-bit locations (9,437,184 bits), equivalent to 1,179,648 8-bit bytes, or 1152 KiB, or 1.125 MiB, slightly more than the 8086.
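The arithmetic behind these figures can be checked with a short sketch. The function name `addressable_bytes` is hypothetical; it simply multiplies the number of addresses by the size of each addressable unit.

```python
MIB = 2 ** 20  # bytes in one mebibyte


def addressable_bytes(address_bits: int, unit_bits: int = 8) -> int:
    """Return total addressable storage, expressed in 8-bit bytes."""
    locations = 2 ** address_bits          # number of distinct addresses
    return locations * unit_bits // 8      # bits per unit converted to bytes


print(addressable_bytes(20) / MIB)       # 20-bit, byte-addressable: 1.0
print(addressable_bytes(32) / MIB)       # 32-bit, byte-addressable: 4096.0
print(addressable_bytes(18, 36) / MIB)   # 18-bit, 36-bit words: 1.125
```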
Some older computers (decimal computers) were ''decimal digit-addressable''. For example, each address in the IBM 1620's magnetic-core memory identified a single six-bit binary-coded decimal digit, consisting of a parity bit, flag bit and four numerical bits. The 1620 used 5-digit decimal addresses, so in theory the highest possible address was 99,999. In practice, the CPU supported 20,000 memory locations, and up to two optional external memory units could be added, each supporting 20,000 addresses, for a total of 60,000 (00000–59999).
Word size versus address size
Word size is a characteristic of a computer architecture. It denotes the number of bits that a CPU can process at one time. Modern processors, including embedded systems, usually have a word size of 8, 16, 24, 32 or 64 bits; most current general purpose computers use 32 or 64 bits. Many different sizes have been used historically, including 8, 9, 10, 12, 18, 24, 36, 39, 40, 48 and 60 bits.
Very often, when referring to the ''word size'' of a modern computer, one is also describing the size of address space on that computer. For instance, a computer said to be "32-bit" also usually allows 32-bit memory addresses; a byte-addressable 32-bit computer can address 2^32 = 4,294,967,296 bytes of memory, or 4 gibibytes (GiB). This allows one memory address to be efficiently stored in one word.
However, this does not always hold true. Computers can have memory addresses larger or smaller than their word size. For instance, many 8-bit processors, such as the MOS Technology 6502, supported 16-bit addresses; otherwise they would have been limited to a mere 256 bytes of addressable memory. The 16-bit Intel 8088 and Intel 8086 supported 20-bit addressing via segmentation, allowing them to access 1 MiB rather than 64 KiB of memory. All Intel Pentium processors since the Pentium Pro include Physical Address Extensions (PAE), which map 32-bit virtual addresses onto 36-bit physical addresses.
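As an illustration of an address wider than the word size, the sketch below shows how an 8-bit machine can form a 16-bit address from two bytes held in consecutive memory locations. The low-byte-first order follows the 6502's convention; the helper name `make_address` and the memory contents are hypothetical.

```python
def make_address(low: int, high: int) -> int:
    """Combine two 8-bit values into one 16-bit address."""
    if not (0 <= low <= 0xFF and 0 <= high <= 0xFF):
        raise ValueError("each half must fit in 8 bits")
    return (high << 8) | low


memory = bytearray(2 ** 16)              # 64 KiB reachable with 16-bit addresses
memory[0x10:0x12] = bytes([0x34, 0x12])  # an address stored low byte first

target = make_address(memory[0x10], memory[0x11])
print(hex(target))  # 0x1234
```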
Many early processors, such as 36-bit processors, held two addresses per word.
In theory, modern byte-addressable 64-bit computers can address 2^64 bytes (16 exbibytes), but in practice the amount of memory is limited by the CPU, the memory controller, or the printed circuit board design (e.g. number of physical memory connectors or amount of soldered-on memory).
Contents of each memory location
Each memory location in a stored-program computer holds a binary number or decimal number ''of some sort''. Its interpretation, as data of some data type or as an instruction, and its use are determined by the instructions which retrieve and manipulate it.
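A small sketch can show how the same stored bits yield different values depending on how they are read. It uses Python's standard `struct` module and assumes little-endian byte order; the four byte values are chosen purely for illustration.

```python
import struct

# Four consecutive memory bytes (illustrative values).
raw = bytes([0x00, 0x00, 0x80, 0x3F])

as_uint32 = struct.unpack("<I", raw)[0]   # read as a 32-bit unsigned integer
as_float32 = struct.unpack("<f", raw)[0]  # read as a 32-bit IEEE 754 float
as_bytes = list(raw)                      # read as four separate 8-bit values

print(as_uint32)   # 1065353216
print(as_float32)  # 1.0
print(as_bytes)    # [0, 0, 128, 63]
```

Nothing in the bytes themselves marks them as an integer or a float; the choice lies entirely with the code that reads them.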
Some early programmers combined instructions and data in words as a way to save memory, when it was expensive: the Manchester Mark 1 had space in its 40-bit words to store little bits of data (its processor ignored a small section in the middle of a word), and that was often exploited as extra data storage. Self-replicating programs such as viruses treat themselves sometimes as data and sometimes as instructions. Self-modifying code is generally deprecated nowadays, as the difficulty it adds to testing and maintenance is out of proportion to the few bytes it saves, and it can also give incorrect results because of the compiler's or processor's assumptions about the machine's state, but it is still sometimes used deliberately, with great care.
Address space in application programming
In a modern multitasking environment, an application process usually has in its address space (or spaces) chunks of memory of the following types:
* Machine code, including:
** program's own code (historically known as ''code segment'' or ''text segment'');
** shared libraries.
* Data, including:
** initialized data (data segment);
** uninitialized (but allocated) variables;
** run-time stack;
** heap;
** shared memory and memory mapped files.
Some parts of the address space may not be mapped at all.
Some systems have a "split" memory architecture where machine code, constants, and data are in different locations, and may have different address sizes. For example, PIC18 microcontrollers have a 21-bit program counter to address machine code and constants in Flash memory, and 12-bit address registers to address data in SRAM.
Addressing schemes
A computer program can access an address given explicitly; in low-level programming this is usually called an absolute address, or sometimes a specific address, and is known as a pointer data type in higher-level languages. But a program can also use a relative address, which specifies a location in relation to somewhere else (the ''base address''). There are many more addressing modes.
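Relative addressing can be sketched as simple arithmetic on a base address. The example below assumes an array of 4-byte elements starting at the arbitrary base address 0x1000; the function name `element_address` is hypothetical.

```python
def element_address(base: int, index: int, element_size: int) -> int:
    """Return the address of an element located relative to a base address."""
    return base + index * element_size


base = 0x1000  # assumed base address of the array
addresses = [element_address(base, i, 4) for i in range(4)]
print([hex(a) for a in addresses])  # ['0x1000', '0x1004', '0x1008', '0x100c']
```

Only the base needs to be known in full; every element is then reached by a small offset from it.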
Mapping logical addresses to physical and virtual memory also adds several levels of indirection; see below.
Memory models
Many programmers prefer to address memory such that there is no distinction between code space and data space (cf. above), or between physical and virtual memory (see below); in other words, numerically identical pointers refer to exactly the same byte of RAM.
However, many early computers did not support such a ''flat memory model''; in particular, Harvard architecture machines force program storage to be completely separate from data storage.
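The consequence of separate program and data storage is that the same numeric address can name two different locations. The following is a toy model only; the class `HarvardMachine` and its methods are hypothetical and do not describe any particular processor.

```python
class HarvardMachine:
    """Toy model with separate program and data address spaces."""

    def __init__(self, program_words: int, data_bytes: int) -> None:
        self.program = [0] * program_words  # instruction storage
        self.data = bytearray(data_bytes)   # data storage

    def fetch(self, address: int) -> int:
        """Read an instruction word from program storage."""
        return self.program[address]

    def load(self, address: int) -> int:
        """Read a byte from data storage."""
        return self.data[address]

    def store(self, address: int, value: int) -> None:
        """Write a byte to data storage; program storage is untouched."""
        self.data[address] = value & 0xFF


machine = HarvardMachine(program_words=16, data_bytes=16)
machine.program[3] = 0xA9   # an arbitrary instruction word at program address 3
machine.store(3, 0x42)      # a data byte at data address 3

print(hex(machine.fetch(3)), hex(machine.load(3)))  # 0xa9 0x42
```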
Many modern DSPs (such as the Motorola 56000) have three separate storage areas: program storage, coefficient storage, and data storage. Some commonly used instructions fetch from all three areas simultaneously; fewer storage areas (even with the same total bytes of storage) would make those instructions run slower.
Memory models in x86 architecture
Early x86 computers use the segmented memory model, in which addresses are based on a combination of two numbers: a memory segment, and an offset within that segment.
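How a segment and an offset combine can be sketched for the 8086 in real mode, where the 16-bit segment value is shifted left by four bits and added to the 16-bit offset to give a 20-bit physical address. The function name `real_mode_address` is hypothetical, and later protected-mode segmentation works differently.

```python
def real_mode_address(segment: int, offset: int) -> int:
    """Combine a 16-bit segment and 16-bit offset into a 20-bit address."""
    if not (0 <= segment <= 0xFFFF and 0 <= offset <= 0xFFFF):
        raise ValueError("segment and offset must each fit in 16 bits")
    # Shift the segment by 4 bits, add the offset, keep the low 20 bits.
    return ((segment << 4) + offset) & 0xFFFFF


print(hex(real_mode_address(0x1234, 0x0010)))  # 0x12350
print(hex(real_mode_address(0x1235, 0x0000)))  # 0x12350
```

The two calls show that different segment and offset pairs can refer to the same physical location, since segments overlap every 16 bytes.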
Some segments are implicitly treated as ''code segments'', dedicated for instructions, ''stack segments'', or normal ''data segments''. Although the usages are different, the segments do not have different memory protections reflecting this.
In the flat memory model all segments (segment registers) are generally set to zero, and only offsets are variable.
See also
* Memory model (programming)
* Memory allocation
* Memory address register
* Base address
* Offset (computer science), also known as a ''displacement''
* Endianness
* Memory management unit (MMU)
* Page table
* Memory protection
* Memory segmentation
* Low-level programming language