pointer structure
   HOME

TheInfoList



OR:

In
computer science Computer science is the study of computation, automation, and information. Computer science spans theoretical disciplines (such as algorithms, theory of computation, information theory, and automation) to practical disciplines (includi ...
, a pointer is an object in many
programming language A programming language is a system of notation for writing computer programs. Most programming languages are text-based formal languages, but they may also be graphical. They are a kind of computer language. The description of a programming ...
s that stores a
memory address In computing, a memory address is a reference to a specific memory location used at various levels by software and hardware. Memory addresses are fixed-length sequences of digits conventionally displayed and manipulated as unsigned integers. ...
. This can be that of another value located in
computer memory In computing, memory is a device or system that is used to store information for immediate use in a computer or related computer hardware and digital electronic devices. The term ''memory'' is often synonymous with the term '' primary storag ...
, or in some cases, that of memory-mapped computer hardware. A pointer ''references'' a location in memory, and obtaining the value stored at that location is known as ''
dereferencing In computer programming, the dereference operator or indirection operator, sometimes denoted by "*" (i.e. an asterisk), is a unary operator (i.e. one with a single operand) found in List of C-family programming languages, C-like languages that in ...
'' the pointer. As an analogy, a page number in a book's index could be considered a pointer to the corresponding page; dereferencing such a pointer would be done by flipping to the page with the given page number and reading the text found on that page. The actual format and content of a pointer variable is dependent on the underlying
computer architecture In computer engineering, computer architecture is a description of the structure of a computer system made from component parts. It can sometimes be a high-level description that ignores details of the implementation. At a more detailed level, the ...
. Using pointers significantly improves performance for repetitive operations, like traversing iterable data
structures A structure is an arrangement and organization of interrelated elements in a material object or system, or the object or system so organized. Material structures include man-made objects such as buildings and machines and natural objects such as ...
(e.g. strings, lookup tables,
control table Control tables are tables that control the control flow or play a major part in program control. There are no rigid rules about the structure or content of a control table—its qualifying attribute is its ability to direct control flow in some w ...
s and
tree In botany, a tree is a perennial plant with an elongated stem, or trunk, usually supporting branches and leaves. In some usages, the definition of a tree may be narrower, including only woody plants with secondary growth, plants that are ...
structures). In particular, it is often much cheaper in time and space to copy and dereference pointers than it is to copy and access the data to which the pointers point. Pointers are also used to hold the addresses of entry points for
call Call or Calls may refer to: Arts, entertainment, and media Games * Call, a type of betting in poker * Call, in the game of contract bridge, a bid, pass, double, or redouble in the bidding stage Music and dance * Call (band), from Lahore, Paki ...
ed subroutines in
procedural programming Procedural programming is a programming paradigm, derived from imperative programming, based on the concept of the '' procedure call''. Procedures (a type of routine or subroutine) simply contain a series of computational steps to be carrie ...
and for run-time linking to dynamic link libraries (DLLs). In
object-oriented programming Object-oriented programming (OOP) is a programming paradigm based on the concept of "objects", which can contain data and code. The data is in the form of fields (often known as attributes or ''properties''), and the code is in the form of ...
, pointers to functions are used for binding methods, often using
virtual method table In computer programming, a virtual method table (VMT), virtual function table, virtual call table, dispatch table, vtable, or vftable is a mechanism used in a programming language to support dynamic dispatch (or run-time method binding). Whe ...
s. A pointer is a simple, more concrete implementation of the more abstract ''
reference Reference is a relationship between objects in which one object designates, or acts as a means by which to connect to or link to, another object. The first object in this relation is said to ''refer to'' the second object. It is called a '' name'' ...
'' data type. Several languages, especially low-level languages, support some type of pointer, although some have more restrictions on their use than others. While "pointer" has been used to refer to references in general, it more properly applies to
data structures In computer science, a data structure is a data organization, management, and storage format that is usually chosen for efficient access to data. More precisely, a data structure is a collection of data values, the relationships among them, a ...
whose
interface Interface or interfacing may refer to: Academic journals * ''Interface'' (journal), by the Electrochemical Society * '' Interface, Journal of Applied Linguistics'', now merged with ''ITL International Journal of Applied Linguistics'' * '' Int ...
explicitly allows the pointer to be manipulated (arithmetically via ') as a memory address, as opposed to a
magic cookie In computing, a magic cookie, or just cookie for short, is a token or short packet of data passed between communicating programs. The cookie is often used to identify a particular event or as "handle, transaction ID, or other token of agreement be ...
or capability which does not allow such. Because pointers allow both protected and unprotected access to memory addresses, there are risks associated with using them, particularly in the latter case. Primitive pointers are often stored in a format similar to an
integer An integer is the number zero (), a positive natural number (, , , etc.) or a negative integer with a minus sign ( −1, −2, −3, etc.). The negative numbers are the additive inverses of the corresponding positive numbers. In the languag ...
; however, attempting to dereference or "look up" such a pointer whose value is not a valid memory address could cause a program to
crash Crash or CRASH may refer to: Common meanings * Collision, an impact between two or more objects * Crash (computing), a condition where a program ceases to respond * Cardiac arrest, a medical condition in which the heart stops beating * Couch su ...
(or contain invalid data). To alleviate this potential problem, as a matter of
type safety In computer science, type safety and type soundness are the extent to which a programming language discourages or prevents type errors. Type safety is sometimes alternatively considered to be a property of facilities of a computer language; that i ...
, pointers are considered a separate type parameterized by the type of data they point to, even if the underlying representation is an integer. Other measures may also be taken (such as validation &
bounds checking In computer programming, bounds checking is any method of detecting whether a variable is within some bounds before it is used. It is usually used to ensure that a number fits into a given type (range checking), or that a variable being used as ...
), to verify that the pointer variable contains a value that is both a valid memory address and within the numerical range that the processor is capable of addressing.


History

In 1955, Soviet computer scientist
Kateryna Yushchenko Kateryna Mykhaylivna Yushchenko ( uk, Катерина Михайлівна Ющенко, née Chumachenko (Чумаченко); September 1, 1961 in Chicago) was the First Lady of Ukraine from 2005 to 2010. She is married to former Ukrainian Pr ...
invented the
Address programming language The Address programming language (russian: Адресный язык программирования uk, Адресна мова програмування) is one of the world's first high-level programming languages. It was created in 1955 by ...
that made possible indirect addressing and addresses of the highest rank – analogous to pointers. This language was widely used on the Soviet Union computers. However, it was unknown outside the Soviet Union and usually
Harold Lawson Harold W. "Bud" Lawson (1937–2019) was a software engineer, computer architect and systems engineer. Lawson is credited with the 1964 invention of the pointer in high-level programming languages (with "a lot of comments" from Donald Knuth an ...
is credited with the invention, in 1964, of the pointer. In 2000, Lawson was presented the Computer Pioneer Award by the
IEEE The Institute of Electrical and Electronics Engineers (IEEE) is a 501(c)(3) professional association for electronic engineering and electrical engineering (and associated disciplines) with its corporate office in New York City and its operat ...
" r inventing the pointer variable and introducing this concept into PL/I, thus providing for the first time, the capability to flexibly treat linked lists in a general-purpose high-level language". His seminal paper on the concepts appeared in the June 1967 issue of CACM entitled: PL/I List Processing. According to the
Oxford English Dictionary The ''Oxford English Dictionary'' (''OED'') is the first and foundational historical dictionary of the English language, published by Oxford University Press (OUP). It traces the historical development of the English language, providing a co ...
, the word ''pointer'' first appeared in print as a ''stack pointer'' in a technical memorandum by the
System Development Corporation System Development Corporation (SDC) was a computer software company based in Santa Monica, California. Founded in 1955, it is considered the first company of its kind. History SDC began as the systems engineering group for the SAGE air-defens ...
.


Formal description

In
computer science Computer science is the study of computation, automation, and information. Computer science spans theoretical disciplines (such as algorithms, theory of computation, information theory, and automation) to practical disciplines (includi ...
, a pointer is a kind of
reference Reference is a relationship between objects in which one object designates, or acts as a means by which to connect to or link to, another object. The first object in this relation is said to ''refer to'' the second object. It is called a '' name'' ...
. A ''data primitive'' (or just ''primitive'') is any datum that can be read from or written to
computer memory In computing, memory is a device or system that is used to store information for immediate use in a computer or related computer hardware and digital electronic devices. The term ''memory'' is often synonymous with the term '' primary storag ...
using one memory access (for instance, both a ''
byte The byte is a unit of digital information that most commonly consists of eight bits. Historically, the byte was the number of bits used to encode a single character of text in a computer and for this reason it is the smallest addressable uni ...
'' and a ''
word A word is a basic element of language that carries an objective or practical meaning, can be used on its own, and is uninterruptible. Despite the fact that language speakers often have an intuitive grasp of what a word is, there is no conse ...
'' are primitives). A ''data aggregate'' (or just ''aggregate'') is a group of primitives that are logically contiguous in memory and that are viewed collectively as one datum (for instance, an aggregate could be 3 logically contiguous bytes, the values of which represent the 3 coordinates of a point in space). When an aggregate is entirely composed of the same type of primitive, the aggregate may be called an ''array''; in a sense, a multi-byte ''word'' primitive is an array of bytes, and some programs use words in this way. In the context of these definitions, a ''byte'' is the smallest primitive; each
memory address In computing, a memory address is a reference to a specific memory location used at various levels by software and hardware. Memory addresses are fixed-length sequences of digits conventionally displayed and manipulated as unsigned integers. ...
specifies a different byte. The memory address of the initial byte of a datum is considered the memory address (or ''base memory address'') of the entire datum. A ''memory pointer'' (or just ''pointer'') is a primitive, the value of which is intended to be used as a memory address; it is said that ''a pointer points to a memory address''. It is also said that ''a pointer points to a datum n memory' when the pointer's value is the datum's memory address. More generally, a pointer is a kind of
reference Reference is a relationship between objects in which one object designates, or acts as a means by which to connect to or link to, another object. The first object in this relation is said to ''refer to'' the second object. It is called a '' name'' ...
, and it is said that ''a pointer references a datum stored somewhere in memory''; to obtain that datum is ''to dereference the pointer''. The feature that separates pointers from other kinds of reference is that a pointer's value is meant to be interpreted as a memory address, which is a rather low-level concept. References serve as a level of indirection: A pointer's value determines which memory address (that is, which datum) is to be used in a calculation. Because indirection is a fundamental aspect of algorithms, pointers are often expressed as a fundamental data type in
programming language A programming language is a system of notation for writing computer programs. Most programming languages are text-based formal languages, but they may also be graphical. They are a kind of computer language. The description of a programming ...
s; in statically (or strongly) typed programming languages, the type of a pointer determines the type of the datum to which the pointer points.


Architectural roots

Pointers are a very thin abstraction on top of the addressing capabilities provided by most modern
architecture Architecture is the art and technique of designing and building, as distinguished from the skills associated with construction. It is both the process and the product of sketching, conceiving, planning, designing, and constructing building ...
s. In the simplest scheme, an ''
address An address is a collection of information, presented in a mostly fixed format, used to give the location of a building, apartment, or other structure or a plot of land, generally using political boundaries and street names as references, along ...
'', or a numeric index, is assigned to each unit of memory in the system, where the unit is typically either a
byte The byte is a unit of digital information that most commonly consists of eight bits. Historically, the byte was the number of bits used to encode a single character of text in a computer and for this reason it is the smallest addressable uni ...
or a
word A word is a basic element of language that carries an objective or practical meaning, can be used on its own, and is uninterruptible. Despite the fact that language speakers often have an intuitive grasp of what a word is, there is no conse ...
– depending on whether the architecture is byte-addressable or
word-addressable In computer architecture, ''word addressing'' means that addresses of memory on a computer uniquely identify Word (computer architecture), words of memory. It is usually used in contrast with byte addressing, where addresses uniquely identify byt ...
– effectively transforming all of memory into a very large
array An array is a systematic arrangement of similar objects, usually in rows and columns. Things called an array include: {{TOC right Music * In twelve-tone and serial composition, the presentation of simultaneous twelve-tone sets such that the ...
. The system would then also provide an operation to retrieve the value stored in the memory unit at a given address (usually utilizing the machine's
general purpose register A processor register is a quickly accessible location available to a computer's processor. Registers usually consist of a small amount of fast storage, although some registers have specific hardware functions, and may be read-only or write-only. ...
s). In the usual case, a pointer is large enough to hold more addresses than there are units of memory in the system. This introduces the possibility that a program may attempt to access an address which corresponds to no unit of memory, either because not enough memory is installed (i.e. beyond the range of available memory) or the architecture does not support such addresses. The first case may, in certain platforms such as the
Intel x86 x86 (also known as 80x86 or the 8086 family) is a family of complex instruction set computer (CISC) instruction set architectures initially developed by Intel based on the Intel 8086 microprocessor and its 8088 variant. The 8086 was intr ...
architecture, be called a
segmentation fault In computing, a segmentation fault (often shortened to segfault) or access violation is a fault, or failure condition, raised by hardware with memory protection, notifying an operating system (OS) the software has attempted to access a restrict ...
(segfault). The second case is possible in the current implementation of
AMD64 x86-64 (also known as x64, x86_64, AMD64, and Intel 64) is a 64-bit version of the x86 instruction set, first released in 1999. It introduced two new modes of operation, 64-bit mode and compatibility mode, along with a new 4-level paging m ...
, where pointers are 64 bit long and addresses only extend to 48 bits. Pointers must conform to certain rules (canonical addresses), so if a non-canonical pointer is dereferenced, the processor raises a
general protection fault A general protection fault (GPF) in the x86 instruction set architectures (ISAs) is a fault (a type of interrupt) initiated by ISA-defined protection mechanisms in response to an access violation caused by some running code, either in the kernel ...
. On the other hand, some systems have more units of memory than there are addresses. In this case, a more complex scheme such as
memory segmentation Memory segmentation is an operating system memory management technique of division of a computer's primary memory into segments or sections. In a computer system using segmentation, a reference to a memory location includes a value that identifi ...
or
paging In computer operating systems, memory paging is a memory management scheme by which a computer stores and retrieves data from secondary storage for use in main memory. In this scheme, the operating system retrieves data from secondary storage ...
is employed to use different parts of the memory at different times. The last incarnations of the x86 architecture support up to 36 bits of physical memory addresses, which were mapped to the 32-bit linear address space through the PAE paging mechanism. Thus, only 1/16 of the possible total memory may be accessed at a time. Another example in the same computer family was the 16-bit protected mode of the
80286 The Intel 80286 (also marketed as the iAPX 286 and often called Intel 286) is a 16-bit microprocessor that was introduced on February 1, 1982. It was the first 8086-based CPU with separate, non- multiplexed address and data buses and also the ...
processor, which, though supporting only 16 MB of physical memory, could access up to 1 GB of virtual memory, but the combination of 16-bit address and segment registers made accessing more than 64 KB in one data structure cumbersome. In order to provide a consistent interface, some architectures provide
memory-mapped I/O Memory-mapped I/O (MMIO) and port-mapped I/O (PMIO) are two complementary methods of performing input/output (I/O) between the central processing unit (CPU) and peripheral devices in a computer. An alternative approach is using dedicated I/O pr ...
, which allows some addresses to refer to units of memory while others refer to
device register A Device Register is the view any device presents to a programmer. Each programmable bit in the device is presented with a logical address and it appears as a part of a byte The byte is a unit of digital information that most commonly consis ...
s of other devices in the computer. There are analogous concepts such as file offsets, array indices, and remote object references that serve some of the same purposes as addresses for other types of objects.


Uses

Pointers are directly supported without restrictions in languages such as
PL/I PL/I (Programming Language One, pronounced and sometimes written PL/1) is a procedural, imperative computer programming language developed and published by IBM. It is designed for scientific, engineering, business and system programming. I ...
, C,
C++ C++ (pronounced "C plus plus") is a high-level general-purpose programming language created by Danish computer scientist Bjarne Stroustrup as an extension of the C programming language, or "C with Classes". The language has expanded significan ...
, Pascal,
FreeBASIC FreeBASIC is a free and open source multiplatform compiler and programming language based on BASIC licensed under the GNU GPL for Microsoft Windows, protected-mode MS-DOS (DOS extender), Linux, FreeBSD and Xbox. The Xbox version is no longer m ...
, and implicitly in most assembly languages. They are primarily used for constructing
reference Reference is a relationship between objects in which one object designates, or acts as a means by which to connect to or link to, another object. The first object in this relation is said to ''refer to'' the second object. It is called a '' name'' ...
s, which in turn are fundamental to constructing nearly all data structures, as well as in passing data between different parts of a program. In functional programming languages that rely heavily on lists, data references are managed abstractly by using primitive constructs like
cons In computer programming, ( or ) is a fundamental function in most dialects of the Lisp programming language. ''constructs'' memory objects which hold two values or pointers to two values. These objects are referred to as (cons) cells, conses, ...
and the corresponding elements car and cdr, which can be thought of as specialised pointers to the first and second components of a cons-cell. This gives rise to some of the idiomatic "flavour" of functional programming. By structuring data in such cons-lists, these languages facilitate
recursive Recursion (adjective: ''recursive'') occurs when a thing is defined in terms of itself or of its type. Recursion is used in a variety of disciplines ranging from linguistics to logic. The most common application of recursion is in mathematics ...
means for building and processing data—for example, by recursively accessing the head and tail elements of lists of lists; e.g. "taking the car of the cdr of the cdr". By contrast, memory management based on pointer dereferencing in some approximation of an
array An array is a systematic arrangement of similar objects, usually in rows and columns. Things called an array include: {{TOC right Music * In twelve-tone and serial composition, the presentation of simultaneous twelve-tone sets such that the ...
of memory addresses facilitates treating variables as slots into which data can be assigned imperatively. When dealing with arrays, the critical
lookup In computer science, a lookup table (LUT) is an array that replaces runtime computation with a simpler array indexing operation. The process is termed as "direct addressing" and LUTs differ from hash tables in a way that, to retrieve a value v wi ...
operation typically involves a stage called ''address calculation'' which involves constructing a pointer to the desired data element in the array. In other data structures, such as linked lists, pointers are used as references to explicitly tie one piece of the structure to another. Pointers are used to pass parameters by reference. This is useful if the programmer wants a function's modifications to a parameter to be visible to the function's caller. This is also useful for returning multiple values from a function. Pointers can also be used to allocate and deallocate dynamic variables and arrays in memory. Since a variable will often become redundant after it has served its purpose, it is a waste of memory to keep it, and therefore it is good practice to deallocate it (using the original pointer reference) when it is no longer needed. Failure to do so may result in a ''
memory leak In computer science, a memory leak is a type of resource leak that occurs when a computer program incorrectly manages memory allocations in a way that memory which is no longer needed is not released. A memory leak may also happen when an object ...
'' (where available free memory gradually, or in severe cases rapidly, diminishes because of an accumulation of numerous redundant memory blocks).


C pointers

The basic syntax to define a pointer is: int *ptr; This declares ptr as the identifier of an object of the following type: * pointer that points to an object of type int This is usually stated more succinctly as "ptr is a pointer to int." Because the C language does not specify an implicit initialization for objects of automatic storage duration, care should often be taken to ensure that the address to which ptr points is valid; this is why it is sometimes suggested that a pointer be explicitly initialized to the null pointer value, which is traditionally specified in C with the standardized macro NULL: ISO/IEC 9899, clause 7.17, paragraph 3: ''NULL... which expands to an implementation-defined null pointer constant...'' int *ptr = NULL; Dereferencing a null pointer in C produces
undefined behavior In computer programming, undefined behavior (UB) is the result of executing a program whose behavior is prescribed to be unpredictable, in the language specification to which the computer code adheres. This is different from unspecified behavior ...
, which could be catastrophic. However, most implementations simply halt execution of the program in question, usually with a
segmentation fault In computing, a segmentation fault (often shortened to segfault) or access violation is a fault, or failure condition, raised by hardware with memory protection, notifying an operating system (OS) the software has attempted to access a restrict ...
. However, initializing pointers unnecessarily could hinder program analysis, thereby hiding bugs. In any case, once a pointer has been declared, the next logical step is for it to point at something: int a = 5; int *ptr = NULL; ptr = &a; This assigns the value of the address of a to ptr. For example, if a is stored at memory location of 0x8130 then the value of ptr will be 0x8130 after the assignment. To dereference the pointer, an asterisk is used again: *ptr = 8; This means take the contents of ptr (which is 0x8130), "locate" that address in memory and set its value to 8. If a is later accessed again, its new value will be 8. This example may be clearer if memory is examined directly. Assume that a is located at address 0x8130 in memory and ptr at 0x8134; also assume this is a 32-bit machine such that an int is 32-bits wide. The following is what would be in memory after the following code snippet is executed: int a = 5; int *ptr = NULL; : (The NULL pointer shown here is 0x00000000.) By assigning the address of a to ptr: ptr = &a; yields the following memory values: : Then by dereferencing ptr by coding: *ptr = 8; the computer will take the contents of ptr (which is 0x8130), 'locate' that address, and assign 8 to that location yielding the following memory: : Clearly, accessing a will yield the value of 8 because the previous instruction modified the contents of a by way of the pointer ptr.


Use in data structures

When setting up data structures like lists, queues and trees, it is necessary to have pointers to help manage how the structure is implemented and controlled. Typical examples of pointers are start pointers, end pointers, and stack pointers. These pointers can either be absolute (the actual
physical address In computing, a physical address (also real address, or binary address), is a memory address that is represented in the form of a binary number on the address bus circuitry in order to enable the data bus to access a ''particular'' storage cell ...
or a
virtual address In computing, a virtual address space (VAS) or address space is the set of ranges of virtual addresses that an operating system makes available to a process. The range of virtual addresses usually starts at a low address and can extend to the hig ...
in
virtual memory In computing, virtual memory, or virtual storage is a memory management technique that provides an "idealized abstraction of the storage resources that are actually available on a given machine" which "creates the illusion to users of a very ...
) or relative (an offset from an absolute start address ("base") that typically uses fewer bits than a full address, but will usually require one additional arithmetic operation to resolve). Relative addresses are a form of manual
memory segmentation Memory segmentation is an operating system memory management technique of division of a computer's primary memory into segments or sections. In a computer system using segmentation, a reference to a memory location includes a value that identifi ...
, and share many of its advantages and disadvantages. A two-byte offset, containing a 16-bit, unsigned integer, can be used to provide relative addressing for up to 64 KiB (216 bytes) of a data structure. This can easily be extended to 128, 256 or 512 KiB if the address pointed to is forced to be aligned on a half-word, word or double-word boundary (but, requiring an additional "shift left" bitwise operation—by 1, 2 or 3 bits—in order to adjust the offset by a factor of 2, 4 or 8, before its addition to the base address). Generally, though, such schemes are a lot of trouble, and for convenience to the programmer absolute addresses (and underlying that, a '' flat address space'') is preferred. A one byte offset, such as the hexadecimal
ASCII ASCII ( ), abbreviated from American Standard Code for Information Interchange, is a character encoding standard for electronic communication. ASCII codes represent text in computers, telecommunications equipment, and other devices. Because ...
value of a character (e.g. X'29') can be used to point to an alternative integer value (or index) in an array (e.g., X'01'). In this way, characters can be very efficiently translated from '
raw data Raw data, also known as primary data, are ''data'' (e.g., numbers, instrument readings, figures, etc.) collected from a source. In the context of examinations, the raw data might be described as a raw score (after test scores). If a scientist ...
' to a usable sequential index and then to an absolute address without a lookup table.


C arrays

In C, array indexing is formally defined in terms of pointer arithmetic; that is, the language specification requires that array /code> be equivalent to *(array + i). Thus in C, arrays can be thought of as pointers to consecutive areas of memory (with no gaps), and the syntax for accessing arrays is identical for that which can be used to dereference pointers. For example, an array array can be declared and used in the following manner: int array /* Declares 5 contiguous integers */ int *ptr = array; /* Arrays can be used as pointers */ ptr = 1; /* Pointers can be indexed with array syntax */ *(array + 1) = 2; /* Arrays can be dereferenced with pointer syntax */ *(1 + array) = 2; /* Pointer addition is commutative */ 2 rray= 4; /* Subscript operator is commutative */ This allocates a block of five integers and names the block array, which acts as a pointer to the block. Another common use of pointers is to point to dynamically allocated memory from
malloc C dynamic memory allocation refers to performing manual memory management for dynamic memory allocation in the C programming language via a group of functions in the C standard library, namely , , , and . The C++ programming language includes t ...
which returns a consecutive block of memory of no less than the requested size that can be used as an array. While most operators on arrays and pointers are equivalent, the result of the
sizeof sizeof is a unary operator in the programming languages C and C++. It generates the storage size of an expression or a data type, measured in the number of ''char''-sized units. Consequently, the construct ''sizeof (char)'' is guaranteed to be ' ...
operator differs. In this example, sizeof(array) will evaluate to 5*sizeof(int) (the size of the array), while sizeof(ptr) will evaluate to sizeof(int*), the size of the pointer itself. Default values of an array can be declared like: int array = ; If array is located in memory starting at address 0x1000 on a 32-bit
little-endian In computing, endianness, also known as byte sex, is the order or sequence of bytes of a word of digital data in computer memory. Endianness is primarily expressed as big-endian (BE) or little-endian (LE). A big-endian system stores the most si ...
machine then memory will contain the following (values are in hexadecimal, like the addresses): : Represented here are five integers: 2, 4, 3, 1, and 5. These five integers occupy 32 bits (4 bytes) each with the least-significant byte stored first (this is a little-endian
CPU architecture Processor design is a subfield of computer engineering and electronics engineering (fabrication) that deals with creating a processor, a key component of computer hardware. The design process involves choosing an instruction set and a certain ...
) and are stored consecutively starting at address 0x1000. The syntax for C with pointers is: * array means 0x1000; * array + 1 means 0x1004: the "+ 1" means to add the size of 1 int, which is 4 bytes; * *array means to dereference the contents of array. Considering the contents as a memory address (0x1000), look up the value at that location (0x0002); * array /code> means element number i, 0-based, of array which is translated into *(array + i). The last example is how to access the contents of array. Breaking it down: * array + i is the memory location of the (i)th element of array, starting at i=0; * *(array + i) takes that memory address and dereferences it to access the value.


C linked list

Below is an example definition of a linked list in C. /* the empty linked list is represented by NULL * or some other sentinel value */ #define EMPTY_LIST NULL struct link ; This pointer-recursive definition is essentially the same as the reference-recursive definition from the
Haskell programming language Haskell () is a general-purpose, statically-typed, purely functional programming language with type inference and lazy evaluation. Designed for teaching, research and industrial applications, Haskell has pioneered a number of programming la ...
: data Link a = Nil , Cons a (Link a) Nil is the empty list, and Cons a (Link a) is a
cons In computer programming, ( or ) is a fundamental function in most dialects of the Lisp programming language. ''constructs'' memory objects which hold two values or pointers to two values. These objects are referred to as (cons) cells, conses, ...
cell of type a with another link also of type a. The definition with references, however, is type-checked and does not use potentially confusing signal values. For this reason, data structures in C are usually dealt with via
wrapper function A wrapper function is a function (another word for a ''subroutine'') in a software library or a computer program whose main purpose is to call a second subroutine or a system call with little or no additional computation. Wrapper functions are u ...
s, which are carefully checked for correctness.


Pass-by-address using pointers

Pointers can be used to pass variables by their address, allowing their value to be changed. For example, consider the following C code: /* a copy of the int n can be changed within the function without affecting the calling code */ void passByValue(int n) /* a pointer m is passed instead. No copy of the value pointed to by m is created */ void passByAddress(int *m) int main(void)


Dynamic memory allocation

In some programs, the required amount of memory depends on what ''the user'' may enter. In such cases the programmer needs to allocate memory dynamically. This is done by allocating memory at the ''heap'' rather than on the ''stack'', where variables usually are stored (variables can also be stored in the CPU registers, but that's another matter). Dynamic memory allocation can only be made through pointers, and names (like with common variables) can't be given. Pointers are used to store and manage the addresses of dynamically allocated blocks of memory. Such blocks are used to store data objects or arrays of objects. Most structured and object-oriented languages provide an area of memory, called the ''heap'' or ''free store'', from which objects are dynamically allocated. The example C code below illustrates how structure objects are dynamically allocated and referenced. The
standard C library The C standard library or libc is the standard library for the C (programming language), C programming language, as specified in the ISO C standard.International Organization for Standardization, ISO/International Electrotechnical Commission, IEC ...
provides the function malloc() for allocating memory blocks from the heap. It takes the size of an object to allocate as a parameter and returns a pointer to a newly allocated block of memory suitable for storing the object, or it returns a null pointer if the allocation failed. /* Parts inventory item */ struct Item ; /* Allocate and initialize a new Item object */ struct Item * make_item(const char *name) The code below illustrates how memory objects are dynamically deallocated, i.e., returned to the heap or free store. The standard C library provides the function free() for deallocating a previously allocated memory block and returning it back to the heap. /* Deallocate an Item object */ void destroy_item(struct Item *item)


Memory-mapped hardware

On some computing architectures, pointers can be used to directly manipulate memory or memory-mapped devices. Assigning addresses to pointers is an invaluable tool when programming microcontrollers. Below is a simple example declaring a pointer of type int and initialising it to a hexadecimal address in this example the constant 0x7FFF: int *hardware_address = (int *)0x7FFF; In the mid 80s, using the BIOS to access the video capabilities of PCs was slow. Applications that were display-intensive typically used to access CGA video memory directly by casting the hexadecimal constant 0xB8000 to a pointer to an array of 80 unsigned 16-bit int values. Each value consisted of an
ASCII ASCII ( ), abbreviated from American Standard Code for Information Interchange, is a character encoding standard for electronic communication. ASCII codes represent text in computers, telecommunications equipment, and other devices. Because ...
code in the low byte, and a colour in the high byte. Thus, to put the letter 'A' at row 5, column 2 in bright white on blue, one would write code like the following: #define VID ((unsigned short (*) 00xB8000) void foo(void)


Use in control tables

Control table Control tables are tables that control the control flow or play a major part in program control. There are no rigid rules about the structure or content of a control table—its qualifying attribute is its ability to direct control flow in some w ...
s that are used to control
program flow In computer science, control flow (or flow of control) is the order in which individual statements, instructions or function calls of an imperative program are executed or evaluated. The emphasis on explicit control flow distinguishes an ''imp ...
usually make extensive use of pointers. The pointers, usually embedded in a table entry, may, for instance, be used to hold the entry points to subroutines to be executed, based on certain conditions defined in the same table entry. The pointers can however be simply indexes to other separate, but associated, tables comprising an array of the actual addresses or the addresses themselves (depending upon the programming language constructs available). They can also be used to point to earlier table entries (as in loop processing) or forward to skip some table entries (as in a
switch In electrical engineering, a switch is an electrical component that can disconnect or connect the conducting path in an electrical circuit, interrupting the electric current or diverting it from one conductor to another. The most common type of ...
or "early" exit from a loop). For this latter purpose, the "pointer" may simply be the table entry number itself and can be transformed into an actual address by simple arithmetic.


Typed pointers and casting

In many languages, pointers have the additional restriction that the object they point to has a specific type. For example, a pointer may be declared to point to an
integer An integer is the number zero (), a positive natural number (, , , etc.) or a negative integer with a minus sign ( −1, −2, −3, etc.). The negative numbers are the additive inverses of the corresponding positive numbers. In the languag ...
; the language will then attempt to prevent the programmer from pointing it to objects which are not integers, such as
floating-point number In computing, floating-point arithmetic (FP) is arithmetic that represents real numbers approximately, using an integer with a fixed precision, called the significand, scaled by an integer exponent of a fixed base. For example, 12.345 can be r ...
s, eliminating some errors. For example, in C int *money; char *bags; money would be an integer pointer and bags would be a char pointer. The following would yield a compiler warning of "assignment from incompatible pointer type" under GCC bags = money; because money and bags were declared with different types. To suppress the compiler warning, it must be made explicit that you do indeed wish to make the assignment by
typecasting In film, television, and theatre, typecasting is the process by which a particular actor becomes strongly identified with a specific character, one or more particular roles, or characters having the same traits or coming from the same social or ...
it bags = (char *)money; which says to cast the integer pointer of money to a char pointer and assign to bags. A 2005 draft of the C standard requires that casting a pointer derived from one type to one of another type should maintain the alignment correctness for both types (6.3.2.3 Pointers, par. 7): char *external_buffer = "abcdef"; int *internal_data; internal_data = (int *)external_buffer; // UNDEFINED BEHAVIOUR if "the resulting pointer // is not correctly aligned" In languages that allow pointer arithmetic, arithmetic on pointers takes into account the size of the type. For example, adding an integer number to a pointer produces another pointer that points to an address that is higher by that number times the size of the type. This allows us to easily compute the address of elements of an array of a given type, as was shown in the C arrays example above. When a pointer of one type is cast to another type of a different size, the programmer should expect that pointer arithmetic will be calculated differently. In C, for example, if the money array starts at 0x2000 and sizeof(int) is 4 bytes whereas sizeof(char) is 1 byte, then money + 1 will point to 0x2004, but bags + 1 would point to 0x2001. Other risks of casting include loss of data when "wide" data is written to "narrow" locations (e.g. bags = 65537;), unexpected results when bit-shifting values, and comparison problems, especially with signed vs unsigned values. Although it is impossible in general to determine at compile-time which casts are safe, some languages store
run-time type information In computer programming, run-time type information or run-time type identification (RTTI) is a feature of some programming languages (such as C++, Object Pascal, and Ada) that exposes information about an object's data type at runtime. Run-time typ ...
which can be used to confirm that these dangerous casts are valid at runtime. Other languages merely accept a conservative approximation of safe casts, or none at all.


Value of pointers

In C and C++, even if two pointers compare as equal that doesn't mean they are equivalent. In these languages ''and''
LLVM LLVM is a set of compiler and toolchain technologies that can be used to develop a front end for any programming language and a back end for any instruction set architecture. LLVM is designed around a language-independent intermediate repre ...
, the rule is interpreted to mean that "just because two pointers point to the same address, does not mean they are equal in the sense that they can be used interchangeably", the difference between the pointers referred to as their ''provenance''. Casting to an integer type such as uintptr_t is implementation-defined and the comparison it provides does not provide any more insight as to whether the two pointers are interchangeable. In addition, further conversion to bytes and arithmetic will throw off optimizers trying to keep track the use of pointers, a problem still being elucidated in academic research.


Making pointers safer

As a pointer allows a program to attempt to access an object that may not be defined, pointers can be the origin of a variety of programming errors. However, the usefulness of pointers is so great that it can be difficult to perform programming tasks without them. Consequently, many languages have created constructs designed to provide some of the useful features of pointers without some of their pitfalls, also sometimes referred to as ''pointer hazards''. In this context, pointers that directly address memory (as used in this article) are referred to as s, by contrast with smart pointers or other variants. One major problem with pointers is that as long as they can be directly manipulated as a number, they can be made to point to unused addresses or to data which is being used for other purposes. Many languages, including most
functional programming language In computer science, functional programming is a programming paradigm where programs are constructed by applying and composing functions. It is a declarative programming paradigm in which function definitions are trees of expressions that ...
s and recent imperative languages like
Java Java (; id, Jawa, ; jv, ꦗꦮ; su, ) is one of the Greater Sunda Islands in Indonesia. It is bordered by the Indian Ocean to the south and the Java Sea to the north. With a population of 151.6 million people, Java is the world's mos ...
, replace pointers with a more opaque type of reference, typically referred to as simply a ''reference'', which can only be used to refer to objects and not manipulated as numbers, preventing this type of error. Array indexing is handled as a special case. A pointer which does not have any address assigned to it is called a
wild pointer Wild, wild, wilds or wild may refer to: Common meanings * Wild animal * Wilderness, a wild natural environment * Wildness, the quality of being wild or untamed Art, media and entertainment Film and television * ''Wild'' (2014 film), a 2014 A ...
. Any attempt to use such uninitialized pointers can cause unexpected behavior, either because the initial value is not a valid address, or because using it may damage other parts of the program. The result is often a
segmentation fault In computing, a segmentation fault (often shortened to segfault) or access violation is a fault, or failure condition, raised by hardware with memory protection, notifying an operating system (OS) the software has attempted to access a restrict ...
,
storage violation In computing a storage violation is a hardware or software fault that occurs when a task attempts to access an area of computer storage which it is not permitted to access. Types of storage violation Storage violation can, for instance, consist ...
or wild branch (if used as a function pointer or branch address). In systems with explicit memory allocation, it is possible to create a
dangling pointer Dangling pointers and wild pointers in computer programming are pointers that do not point to a valid object of the appropriate type. These are special cases of memory safety violations. More generally, dangling references and wild references ar ...
by deallocating the memory region it points into. This type of pointer is dangerous and subtle because a deallocated memory region may contain the same data as it did before it was deallocated but may be then reallocated and overwritten by unrelated code, unknown to the earlier code. Languages with
garbage collection Waste collection is a part of the process of waste management. It is the transfer of solid waste from the point of use and disposal to the point of treatment or landfill. Waste collection also includes the curbside collection of recyclabl ...
prevent this type of error because deallocation is performed automatically when there are no more references in scope. Some languages, like
C++ C++ (pronounced "C plus plus") is a high-level general-purpose programming language created by Danish computer scientist Bjarne Stroustrup as an extension of the C programming language, or "C with Classes". The language has expanded significan ...
, support smart pointers, which use a simple form of reference counting to help track allocation of dynamic memory in addition to acting as a reference. In the absence of reference cycles, where an object refers to itself indirectly through a sequence of smart pointers, these eliminate the possibility of dangling pointers and memory leaks. Delphi strings support reference counting natively. The
Rust programming language Rust is a multi-paradigm, general-purpose programming language. Rust emphasizes performance, type safety, and concurrency. Rust enforces memory safety—that is, that all references point to valid memory—without requiring the use of a garb ...
introduces a ''borrow checker'', ''pointer lifetimes'', and an optimisation based around
optional type In programming languages (especially functional programming languages) and type theory, an option type or maybe type is a polymorphic type that represents encapsulation of an optional value; e.g., it is used as the return type of functions whic ...
s for null pointers to eliminate pointer bugs, without resorting to
garbage collection Waste collection is a part of the process of waste management. It is the transfer of solid waste from the point of use and disposal to the point of treatment or landfill. Waste collection also includes the curbside collection of recyclabl ...
.


Special kinds of pointers


Kinds defined by value


Null pointer

A null pointer has a value reserved for indicating that the pointer does not refer to a valid object. Null pointers are routinely used to represent conditions such as the end of a
list A ''list'' is any set of items in a row. List or lists may also refer to: People * List (surname) Organizations * List College, an undergraduate division of the Jewish Theological Seminary of America * SC Germania List, German rugby unio ...
of unknown length or the failure to perform some action; this use of null pointers can be compared to
nullable type Nullable types are a feature of some programming languages which allow a value to be set to the special value NULL instead of the usual possible values of the data type. In statically typed languages, a nullable type is an option type, while in ...
s and to the ''Nothing'' value in an
option type In programming languages (especially functional programming languages) and type theory, an option type or maybe type is a polymorphic type that represents encapsulation of an optional value; e.g., it is used as the return type of functions whic ...
.


Dangling pointer

A dangling pointer is a pointer that does not point to a valid object and consequently may make a program crash or behave oddly. In the Pascal or C programming languages, pointers that are not specifically initialized may point to unpredictable addresses in memory. The following example code shows a dangling pointer: int func(void) Here, p2 may point to anywhere in memory, so performing the assignment *p2 = 'b'; can corrupt an unknown area of memory or trigger a
segmentation fault In computing, a segmentation fault (often shortened to segfault) or access violation is a fault, or failure condition, raised by hardware with memory protection, notifying an operating system (OS) the software has attempted to access a restrict ...
.


Wild branch

Where a pointer is used as the address of the entry point to a program or start of a function which doesn't return anything and is also either uninitialized or corrupted, if a call or
jump Jumping is a form of locomotion or movement in which an organism or non-living (e.g., robotic) mechanical system propels itself through the air along a ballistic trajectory. Jump or Jumping also may refer to: Places * Jump, Kentucky or Jump S ...
is nevertheless made to this address, a " wild branch" is said to have occurred. In other words, a wild branch is a function pointer that is wild (dangling). The consequences are usually unpredictable and the error may present itself in several different ways depending upon whether or not the pointer is a "valid" address and whether or not there is (coincidentally) a valid instruction (opcode) at that address. The detection of a wild branch can present one of the most difficult and frustrating debugging exercises since much of the evidence may already have been destroyed beforehand or by execution of one or more inappropriate instructions at the branch location. If available, an
instruction set simulator An instruction set simulator (ISS) is a simulation model, usually coded in a high-level programming language, which mimics the behavior of a mainframe or microprocessor by "reading" instructions and maintaining internal variables which represent t ...
can usually not only detect a wild branch before it takes effect, but also provide a complete or partial trace of its history.


Kinds defined by structure


Autorelative pointer

An autorelative pointer is a pointer whose value is interpreted as an offset from the address of the pointer itself; thus, if a data structure has an autorelative pointer member that points to some portion of the data structure itself, then the data structure may be relocated in memory without having to update the value of the auto relative pointer. The cited patent also uses the term self-relative pointer to mean the same thing. However, the meaning of that term has been used in other ways: * to mean an offset from the address of a structure rather than from the address of the pointer itself; * to mean a pointer containing its own address, which can be useful for reconstructing in any arbitrary region of memory a collection of data structures that point to each other.


Based pointer

A based pointer is a pointer whose value is an offset from the value of another pointer. This can be used to store and load blocks of data, assigning the address of the beginning of the block to the base pointer.


Kinds defined by use or datatype


Multiple indirection

In some languages, a pointer can reference another pointer, requiring multiple dereference operations to get to the original value. While each level of indirection may add a performance cost, it is sometimes necessary in order to provide correct behavior for complex
data structures In computer science, a data structure is a data organization, management, and storage format that is usually chosen for efficient access to data. More precisely, a data structure is a collection of data values, the relationships among them, a ...
. For example, in C it is typical to define a linked list in terms of an element that contains a pointer to the next element of the list: struct element ; struct element *head = NULL; This implementation uses a pointer to the first element in the list as a surrogate for the entire list. If a new value is added to the beginning of the list, head has to be changed to point to the new element. Since C arguments are always passed by value, using double indirection allows the insertion to be implemented correctly, and has the desirable side-effect of eliminating special case code to deal with insertions at the front of the list: // Given a sorted list at *head, insert the element item at the first // location where all earlier elements have lesser or equal value. void insert(struct element **head, struct element *item) // Caller does this: insert(&head, item); In this case, if the value of item is less than that of head, the caller's head is properly updated to the address of the new item. A basic example is in the argv argument to the main function in C (and C++), which is given in the prototype as char **argv—this is because the variable argv itself is a pointer to an array of strings (an array of arrays), so *argv is a pointer to the 0th string (by convention the name of the program), and **argv is the 0th character of the 0th string.


Function pointer

In some languages, a pointer can reference executable code, i.e., it can point to a function, method, or procedure. A
function pointer A function pointer, also called a subroutine pointer or procedure pointer, is a pointer that points to a function. As opposed to referencing a data value, a function pointer points to executable code within memory. Dereferencing the function poi ...
will store the address of a function to be invoked. While this facility can be used to call functions dynamically, it is often a favorite technique of virus and other malicious software writers. int sum(int n1, int n2) int main(void)


Back pointer

In doubly linked lists or tree structures, a back pointer held on an element 'points back' to the item referring to the current element. These are useful for navigation and manipulation, at the expense of greater memory use.


Simulation using an array index

It is possible to simulate pointer behavior using an index to an (normally one-dimensional) array. Primarily for languages which do not support pointers explicitly but ''do'' support arrays, the
array An array is a systematic arrangement of similar objects, usually in rows and columns. Things called an array include: {{TOC right Music * In twelve-tone and serial composition, the presentation of simultaneous twelve-tone sets such that the ...
can be thought of and processed as if it were the entire memory range (within the scope of the particular array) and any index to it can be thought of as equivalent to a
general purpose register A processor register is a quickly accessible location available to a computer's processor. Registers usually consist of a small amount of fast storage, although some registers have specific hardware functions, and may be read-only or write-only. ...
in assembly language (that points to the individual bytes but whose actual value is relative to the start of the array, not its absolute address in memory). Assuming the array is, say, a contiguous 16 megabyte character data structure, individual bytes (or a string of contiguous bytes within the array) can be directly addressed and manipulated using the name of the array with a 31 bit unsigned
integer An integer is the number zero (), a positive natural number (, , , etc.) or a negative integer with a minus sign ( −1, −2, −3, etc.). The negative numbers are the additive inverses of the corresponding positive numbers. In the languag ...
as the simulated pointer (this is quite similar to the ''C arrays'' example shown above). Pointer arithmetic can be simulated by adding or subtracting from the index, with minimal additional overhead compared to genuine pointer arithmetic. It is even theoretically possible, using the above technique, together with a suitable
instruction set simulator An instruction set simulator (ISS) is a simulation model, usually coded in a high-level programming language, which mimics the behavior of a mainframe or microprocessor by "reading" instructions and maintaining internal variables which represent t ...
to simulate ''any''
machine code In computer programming, machine code is any low-level programming language, consisting of machine language instructions, which are used to control a computer's central processing unit (CPU). Each instruction causes the CPU to perform a ve ...
or the intermediate (
byte code Bytecode (also called portable code or p-code) is a form of instruction set designed for efficient execution by a software interpreter. Unlike human-readable source code, bytecodes are compact numeric codes, constants, and references (norma ...
) of ''any'' processor/language in another language that does not support pointers at all (for example
Java Java (; id, Jawa, ; jv, ꦗꦮ; su, ) is one of the Greater Sunda Islands in Indonesia. It is bordered by the Indian Ocean to the south and the Java Sea to the north. With a population of 151.6 million people, Java is the world's mos ...
/
JavaScript JavaScript (), often abbreviated as JS, is a programming language that is one of the core technologies of the World Wide Web, alongside HTML and CSS. As of 2022, 98% of websites use JavaScript on the client side for webpage behavior, of ...
). To achieve this, the
binary Binary may refer to: Science and technology Mathematics * Binary number, a representation of numbers using only two digits (0 and 1) * Binary function, a function that takes two arguments * Binary operation, a mathematical operation that ta ...
code can initially be loaded into contiguous bytes of the array for the simulator to "read", interpret and action entirely within the memory contained of the same array. If necessary, to completely avoid
buffer overflow In information security and programming, a buffer overflow, or buffer overrun, is an anomaly whereby a program, while writing data to a buffer, overruns the buffer's boundary and overwrites adjacent memory locations. Buffers are areas of memo ...
problems,
bounds checking In computer programming, bounds checking is any method of detecting whether a variable is within some bounds before it is used. It is usually used to ensure that a number fits into a given type (range checking), or that a variable being used as ...
can usually be actioned for the compiler (or if not, hand coded in the simulator).


Support in various programming languages


Ada

Ada Ada may refer to: Places Africa * Ada Foah, a town in Ghana * Ada (Ghana parliament constituency) * Ada, Osun, a town in Nigeria Asia * Ada, Urmia, a village in West Azerbaijan Province, Iran * Ada, Karaman, a village in Karaman Province, ...
is a strongly typed language where all pointers are typed and only safe type conversions are permitted. All pointers are by default initialized to null, and any attempt to access data through a null pointer causes an exception to be raised. Pointers in Ada are called ''
access type Ada is a structured programming, structured, statically typed, Imperative programming, imperative, and Object-oriented programming, object-oriented high-level programming language, extended from Pascal (programming language), Pascal and other lan ...
s''. Ada 83 did not permit arithmetic on access types (although many compiler vendors provided for it as a non-standard feature), but Ada 95 supports “safe” arithmetic on access types via the package System.Storage_Elements.


BASIC

Several old versions of BASIC for the Windows platform had support for STRPTR() to return the address of a string, and for VARPTR() to return the address of a variable. Visual Basic 5 also had support for OBJPTR() to return the address of an object interface, and for an ADDRESSOF operator to return the address of a function. The types of all of these are integers, but their values are equivalent to those held by pointer types. Newer dialects of BASIC, such as
FreeBASIC FreeBASIC is a free and open source multiplatform compiler and programming language based on BASIC licensed under the GNU GPL for Microsoft Windows, protected-mode MS-DOS (DOS extender), Linux, FreeBSD and Xbox. The Xbox version is no longer m ...
or BlitzMax, have exhaustive pointer implementations, however. In FreeBASIC, arithmetic on ANY pointers (equivalent to C's void*) are treated as though the ANY pointer was a byte width. ANY pointers cannot be dereferenced, as in C. Also, casting between ANY and any other type's pointers will not generate any warnings. dim as integer f = 257 dim as any ptr g = @f dim as integer ptr i = g assert(*i = 257) assert( (g + 4) = (@f + 1) )


C and C++

In C and
C++ C++ (pronounced "C plus plus") is a high-level general-purpose programming language created by Danish computer scientist Bjarne Stroustrup as an extension of the C programming language, or "C with Classes". The language has expanded significan ...
pointers are variables that store addresses and can be ''null''. Each pointer has a type it points to, but one can freely cast between pointer types (but not between a function pointer and an object pointer). A special pointer type called the “void pointer” allows pointing to any (non-function) object, but is limited by the fact that it cannot be dereferenced directly (it shall be cast). The address itself can often be directly manipulated by casting a pointer to and from an integral type of sufficient size, though the results are implementation-defined and may indeed cause undefined behavior; while earlier C standards did not have an integral type that was guaranteed to be large enough,
C99 C99 (previously known as C9X) is an informal name for ISO/IEC 9899:1999, a past version of the C programming language standard. It extends the previous version ( C90) with new features for the language and the standard library, and helps impl ...
specifies the uintptr_t ''
typedef typedef is a reserved keyword in the programming languages C, C++, and Objective-C. It is used to create an additional name (''alias'') for another data type, but does not create a new type, except in the obscure case of a qualified typedef of ...
name'' defined in , but an implementation need not provide it.
C++ C++ (pronounced "C plus plus") is a high-level general-purpose programming language created by Danish computer scientist Bjarne Stroustrup as an extension of the C programming language, or "C with Classes". The language has expanded significan ...
fully supports C pointers and C typecasting. It also supports a new group of typecasting operators to help catch some unintended dangerous casts at compile-time. Since
C++11 C++11 is a version of the ISO/ IEC 14882 standard for the C++ programming language. C++11 replaced the prior version of the C++ standard, called C++03, and was later replaced by C++14. The name follows the tradition of naming language versions b ...
, the
C++ standard library The C standard library or libc is the standard library for the C programming language, as specified in the ISO C standard. ISO/IEC (2018). '' ISO/IEC 9899:2018(E): Programming Languages - C §7'' Starting from the original ANSI C standard, it was ...
also provides smart pointers (unique_ptr, shared_ptr and weak_ptr) which can be used in some situations as a safer alternative to primitive C pointers. C++ also supports another form of reference, quite different from a pointer, called simply a ''
reference Reference is a relationship between objects in which one object designates, or acts as a means by which to connect to or link to, another object. The first object in this relation is said to ''refer to'' the second object. It is called a '' name'' ...
'' or ''reference type''. Pointer arithmetic, that is, the ability to modify a pointer's target address with arithmetic operations (as well as magnitude comparisons), is restricted by the language standard to remain within the bounds of a single array object (or just after it), and will otherwise invoke
undefined behavior In computer programming, undefined behavior (UB) is the result of executing a program whose behavior is prescribed to be unpredictable, in the language specification to which the computer code adheres. This is different from unspecified behavior ...
. Adding or subtracting from a pointer moves it by a multiple of the size of its datatype. For example, adding 1 to a pointer to 4-byte integer values will increment the pointer's pointed-to byte-address by 4. This has the effect of incrementing the pointer to point at the next element in a contiguous array of integers—which is often the intended result. Pointer arithmetic cannot be performed on void pointers because the
void type The void type, in several programming languages derived from C and Algol68, is the return type of a function that returns normally, but does not provide a result value to its caller. Usually such functions are called for their side effects, ...
has no size, and thus the pointed address can not be added to, although gcc and other compilers will perform byte arithmetic on void* as a non-standard extension, treating it as if it were char *. Pointer arithmetic provides the programmer with a single way of dealing with different types: adding and subtracting the number of elements required instead of the actual offset in bytes. (Pointer arithmetic with char * pointers uses byte offsets, because sizeof(char) is 1 by definition.) In particular, the C definition explicitly declares that the syntax a /code>, which is the n-th element of the array a, is equivalent to *(a + n), which is the content of the element pointed by a + n. This implies that n /code> is equivalent to a /code>, and one can write, e.g., a /code> or 3 /code> equally well to access the fourth element of an array a. While powerful, pointer arithmetic can be a source of computer bugs. It tends to confuse novice programmers, forcing them into different contexts: an expression can be an ordinary arithmetic one or a pointer arithmetic one, and sometimes it is easy to mistake one for the other. In response to this, many modern high-level computer languages (for example
Java Java (; id, Jawa, ; jv, ꦗꦮ; su, ) is one of the Greater Sunda Islands in Indonesia. It is bordered by the Indian Ocean to the south and the Java Sea to the north. With a population of 151.6 million people, Java is the world's mos ...
) do not permit direct access to memory using addresses. Also, the safe C dialect Cyclone addresses many of the issues with pointers. See C programming language for more discussion. The void pointer, or void*, is supported in ANSI C and C++ as a generic pointer type. A pointer to void can store the address of any object (not function), and, in C, is implicitly converted to any other object pointer type on assignment, but it must be explicitly cast if dereferenced. K&R C used char* for the “type-agnostic pointer” purpose (before ANSI C). int x = 4; void* p1 = &x; int* p2 = p1; // void* implicitly converted to int*: valid C, but not C++ int a = *p2; int b = *(int*)p1; // when dereferencing inline, there is no implicit conversion C++ does not allow the implicit conversion of void* to other pointer types, even in assignments. This was a design decision to avoid careless and even unintended casts, though most compilers only output warnings, not errors, when encountering other casts. int x = 4; void* p1 = &x; int* p2 = p1; // this fails in C++: there is no implicit conversion from void* int* p3 = (int*)p1; // C-style cast int* p4 = reinterpret_cast(p1); // C++ cast In C++, there is no void& (reference to void) to complement void* (pointer to void), because references behave like aliases to the variables they point to, and there can never be a variable whose type is void.


Pointer-to-member

In C++ pointers to non-static members of a class can be defined. If a class C has a member T a then &C::a is a pointer to the member a of type T C::*. This member can be an object or a
function Function or functionality may refer to: Computing * Function key, a type of key on computer keyboards * Function model, a structured representation of processes in a system * Function object or functor or functionoid, a concept of object-oriente ...
. They can be used on the right-hand side of operators .* and ->* to access the corresponding member. struct S ; S s1; S* ptrS = &s1; int S::* ptr = &S::a; // pointer to S::a int (S::* fp)()const = &S::f; // pointer to S::f s1.*ptr = 1; std::cout << (s1.*fp)() << "\n"; // prints 1 ptrS->*ptr = 2; std::cout << (ptrS->*fp)() << "\n"; // prints 2


Pointer declaration syntax overview

These pointer declarations cover most variants of pointer declarations. Of course it is possible to have triple pointers, but the main principles behind a triple pointer already exist in a double pointer. The naming used here is what the expression typeid(type).name() equals for each of these types when using g++ or
clang Clang is a compiler front end for the C, C++, Objective-C, and Objective-C++ programming languages, as well as the OpenMP, OpenCL, RenderScript, CUDA, and HIP frameworks. It acts as a drop-in replacement for the GNU Compiler Collection ...
. char A5_A5_c /* array of arrays of chars */ char *A5_Pc /* array of pointers to chars */ char **PPc; /* pointer to pointer to char ("double pointer") */ char (*PA5_c) /* pointer to array(s) of chars */ char *FPcvE(); /* function which returns a pointer to char(s) */ char (*PFcvE)(); /* pointer to a function which returns a char */ char (*FPA5_cvE()) /* function which returns pointer to an array of chars */ char (*A5_PFcvE (); /* an array of pointers to functions which return a char */ The following declarations involving pointers-to-member are valid only in C++: class C; class D; char C::* M1Cc; /* pointer-to-member to char */ char C::*A5_M1Cc /* array of pointers-to-member to char */ char* C::* M1CPc; /* pointer-to-member to pointer to char(s) */ char C::** PM1Cc; /* pointer to pointer-to-member to char */ char (*M1CA5_c) /* pointer-to-member to array(s) of chars */ char C::* FM1CcvE(); /* function which returns a pointer-to-member to char */ char D::* C::* M1CM1Dc; /* pointer-to-member to pointer-to-member to pointer to char(s) */ char C::* C::* M1CMS_c; /* pointer-to-member to pointer-to-member to pointer to char(s) */ char (C::* FM1CA5_cvE()) /* function which returns pointer-to-member to an array of chars */ char (C::* M1CFcvE)() /* pointer-to-member-function which returns a char */ char (C::* A5_M1CFcvE (); /* an array of pointers-to-member-functions which return a char */ The () and [] have a higher priority than *.


C#

In the C Sharp (programming language), C# programming language, pointers are supported only under certain conditions: any block of code including pointers must be marked with the unsafe keyword. Such blocks usually require higher security permissions to be allowed to run. The syntax is essentially the same as in C++, and the address pointed can be either managed or unmanaged memory. However, pointers to managed memory (any pointer to a managed object) must be declared using the fixed keyword, which prevents the garbage collector from moving the pointed object as part of memory management while the pointer is in scope, thus keeping the pointer address valid. An exception to this is from using the IntPtr structure, which is a safe managed equivalent to int*, and does not require unsafe code. This type is often returned when using methods from the System.Runtime.InteropServices, for example: // Get 16 bytes of memory from the process's unmanaged memory IntPtr pointer = System.Runtime.InteropServices.Marshal.AllocHGlobal(16); // Do something with the allocated memory // Free the allocated memory System.Runtime.InteropServices.Marshal.FreeHGlobal(pointer); The
.NET framework The .NET Framework (pronounced as "''dot net"'') is a proprietary software framework developed by Microsoft that runs primarily on Microsoft Windows. It was the predominant implementation of the Common Language Infrastructure (CLI) until bein ...
includes many classes and methods in the System and System.Runtime.InteropServices namespaces (such as the Marshal class) which convert .NET types (for example, System.String) to and from many unmanaged types and pointers (for example, LPWSTR or void*) to allow communication with
unmanaged code Managed code is computer program code that requires and will execute only under the management of a Common Language Infrastructure (CLI); Virtual Execution System (VES); virtual machine, e.g. .NET, CoreFX, or .NET Framework; Common Language Runt ...
. Most such methods have the same security permission requirements as unmanaged code, since they can affect arbitrary places in memory.


COBOL

The COBOL programming language supports pointers to variables. Primitive or group (record) data objects declared within the LINKAGE SECTION of a program are inherently pointer-based, where the only memory allocated within the program is space for the address of the data item (typically a single memory word). In program source code, these data items are used just like any other WORKING-STORAGE variable, but their contents are implicitly accessed indirectly through their LINKAGE pointers. Memory space for each pointed-to data object is typically allocated dynamically using external CALL statements or via embedded extended language constructs such as EXEC CICS or EXEC SQL statements. Extended versions of COBOL also provide pointer variables declared with USAGE IS POINTER clauses. The values of such pointer variables are established and modified using SET and SET ADDRESS statements. Some extended versions of COBOL also provide PROCEDURE-POINTER variables, which are capable of storing the addresses of executable code.


PL/I

The
PL/I PL/I (Programming Language One, pronounced and sometimes written PL/1) is a procedural, imperative computer programming language developed and published by IBM. It is designed for scientific, engineering, business and system programming. I ...
language provides full support for pointers to all data types (including pointers to structures),
recursion Recursion (adjective: ''recursive'') occurs when a thing is defined in terms of itself or of its type. Recursion is used in a variety of disciplines ranging from linguistics to logic. The most common application of recursion is in mathemati ...
, multitasking, string handling, and extensive built-in
function Function or functionality may refer to: Computing * Function key, a type of key on computer keyboards * Function model, a structured representation of processes in a system * Function object or functor or functionoid, a concept of object-oriente ...
s. PL/I was quite a leap forward compared to the programming languages of its time. PL/I pointers are untyped, and therefore no casting is required for pointer dereferencing or assignment. The declaration syntax for a pointer is DECLARE xxx POINTER;, which declares a pointer named "xxx". Pointers are used with BASED variables. A based variable can be declared with a default locator (DECLARE xxx BASED(ppp); or without (DECLARE xxx BASED;), where xxx is a based variable, which may be an element variable, a structure, or an array, and ppp is the default pointer). Such a variable can be address without an explicit pointer reference (xxx=1;, or may be addressed with an explicit reference to the default locator (ppp), or to any other pointer (qqq->xxx=1;). Pointer arithmetic is not part of the PL/I standard, but many compilers allow expressions of the form ptr = ptr±expression. IBM PL/I also has the builtin function PTRADD to perform the arithmetic. Pointer arithmetic is always performed in bytes. IBM ''Enterprise'' PL/I compilers have a new form of typed pointer called a HANDLE.


D

The
D programming language D, also known as dlang, is a multi-paradigm system programming language created by Walter Bright at Digital Mars and released in 2001. Andrei Alexandrescu joined the design and development effort in 2007. Though it originated as a re-engineer ...
is a derivative of C and C++ which fully supports C pointers and C typecasting.


Eiffel

The Eiffel object-oriented language employs value and reference semantics without pointer arithmetic. Nevertheless, pointer classes are provided. They offer pointer arithmetic, typecasting, explicit memory management, interfacing with non-Eiffel software, and other features.


Fortran

Fortran-90 introduced a strongly typed pointer capability. Fortran pointers contain more than just a simple memory address. They also encapsulate the lower and upper bounds of array dimensions, strides (for example, to support arbitrary array sections), and other metadata. An ''association operator'', => is used to associate a POINTER to a variable which has a TARGET attribute. The Fortran-90 ALLOCATE statement may also be used to associate a pointer to a block of memory. For example, the following code might be used to define and create a linked list structure: type real_list_t real :: sample_data(100) type (real_list_t), pointer :: next => null () end type type (real_list_t), target :: my_real_list type (real_list_t), pointer :: real_list_temp real_list_temp => my_real_list do read (1,iostat=ioerr) real_list_temp%sample_data if (ioerr /= 0) exit allocate (real_list_temp%next) real_list_temp => real_list_temp%next end do Fortran-2003 adds support for procedure pointers. Also, as part of the ''C Interoperability'' feature, Fortran-2003 supports intrinsic functions for converting C-style pointers into Fortran pointers and back.


Go

Go has pointers. Its declaration syntax is equivalent to that of C, but written the other way around, ending with the type. Unlike C, Go has garbage collection, and disallows pointer arithmetic. Reference types, like in C++, do not exist. Some built-in types, like maps and channels, are boxed (i.e. internally they are pointers to mutable structures), and are initialized using the make function. In an approach to unified syntax between pointers and non-pointers, the arrow (->) operator has been dropped: the dot operator on a pointer refers to the field or method of the dereferenced object. This, however, only works with 1 level of indirection.


Java

There is no explicit representation of pointers in
Java Java (; id, Jawa, ; jv, ꦗꦮ; su, ) is one of the Greater Sunda Islands in Indonesia. It is bordered by the Indian Ocean to the south and the Java Sea to the north. With a population of 151.6 million people, Java is the world's mos ...
. Instead, more complex data structures like objects and
arrays An array is a systematic arrangement of similar objects, usually in rows and columns. Things called an array include: {{TOC right Music * In twelve-tone and serial composition, the presentation of simultaneous twelve-tone sets such that the ...
are implemented using references. The language does not provide any explicit pointer manipulation operators. It is still possible for code to attempt to dereference a null reference (null pointer), however, which results in a run-time exception being thrown. The space occupied by unreferenced memory objects is recovered automatically by
garbage collection Waste collection is a part of the process of waste management. It is the transfer of solid waste from the point of use and disposal to the point of treatment or landfill. Waste collection also includes the curbside collection of recyclabl ...
at run-time.


Modula-2

Pointers are implemented very much as in Pascal, as are VAR parameters in procedure calls. Modula-2 is even more strongly typed than Pascal, with fewer ways to escape the type system. Some of the variants of Modula-2 (such as
Modula-3 Modula-3 is a programming language conceived as a successor to an upgraded version of Modula-2 known as Modula-2+. While it has been influential in research circles (influencing the designs of languages such as Java, C#, and Python) it has not ...
) include garbage collection.


Oberon

Much as with Modula-2, pointers are available. There are still fewer ways to evade the type system and so
Oberon Oberon () is a king of the fairies in medieval and Renaissance literature. He is best known as a character in William Shakespeare's play ''A Midsummer Night's Dream'', in which he is King of the Fairies and spouse of Titania, Queen of the Fairi ...
and its variants are still safer with respect to pointers than Modula-2 or its variants. As with
Modula-3 Modula-3 is a programming language conceived as a successor to an upgraded version of Modula-2 known as Modula-2+. While it has been influential in research circles (influencing the designs of languages such as Java, C#, and Python) it has not ...
, garbage collection is a part of the language specification.


Pascal

Unlike many languages that feature pointers, standard
ISO ISO is the most common abbreviation for the International Organization for Standardization. ISO or Iso may also refer to: Business and finance * Iso (supermarket), a chain of Danish supermarkets incorporated into the SuperBest chain in 2007 * Iso ...
Pascal only allows pointers to reference dynamically created variables that are anonymous and does not allow them to reference standard static or local variables. It does not have pointer arithmetic. Pointers also must have an associated type and a pointer to one type is not compatible with a pointer to another type (e.g. a pointer to a char is not compatible with a pointer to an integer). This helps eliminate the type security issues inherent with other pointer implementations, particularly those used for
PL/I PL/I (Programming Language One, pronounced and sometimes written PL/1) is a procedural, imperative computer programming language developed and published by IBM. It is designed for scientific, engineering, business and system programming. I ...
or C. It also removes some risks caused by dangling pointers, but the ability to dynamically let go of referenced space by using the dispose standard procedure (which has the same effect as the free library function found in C) means that the risk of dangling pointers has not been entirely eliminated. However, in some commercial and open source Pascal (or derivatives) compiler implementations —like
Free Pascal Free Pascal Compiler (FPC) is a compiler for the closely related programming-language dialects Pascal and Object Pascal. It is free software released under the GNU General Public License, witexception clausesthat allow static linking against its ...
,
Turbo Pascal Turbo Pascal is a software development system that includes a compiler and an integrated development environment (IDE) for the Pascal programming language running on CP/M, CP/M-86, and DOS. It was originally developed by Anders Hejlsberg at ...
or the
Object Pascal Object Pascal is an extension to the programming language Pascal that provides object-oriented programming (OOP) features such as classes and methods. The language was originally developed by Apple Computer as ''Clascal'' for the Lisa Worksh ...
in
Embarcadero Delphi Delphi is a general-purpose programming language and a software product that uses the Delphi dialect of the Object Pascal programming language and provides an integrated development environment (IDE) for rapid application development of desktop, ...
— a pointer is allowed to reference standard static or local variables and can be cast from one pointer type to another. Moreover, pointer arithmetic is unrestricted: adding or subtracting from a pointer moves it by that number of bytes in either direction, but using the Inc or Dec standard procedures with it moves the pointer by the size of the data type it is ''declared'' to point to. An untyped pointer is also provided under the name Pointer, which is compatible with other pointer types.


Perl

The
Perl Perl is a family of two high-level, general-purpose, interpreted, dynamic programming languages. "Perl" refers to Perl 5, but from 2000 to 2019 it also referred to its redesigned "sister language", Perl 6, before the latter's name was offic ...
programming language A programming language is a system of notation for writing computer programs. Most programming languages are text-based formal languages, but they may also be graphical. They are a kind of computer language. The description of a programming ...
supports pointers, although rarely used, in the form of the pack and unpack functions. These are intended only for simple interactions with compiled OS libraries. In all other cases, Perl uses references, which are typed and do not allow any form of pointer arithmetic. They are used to construct complex data structures.


See also

*
Address constant {{short description, Assembly language data type In IBM System/360 through present day z/Architecture, an address constant or "adcon" is an assembly language data type which contains the address of a location in computer memory. An address consta ...
*
Bounded pointer In computer science, a bounded pointer is a pointer (computer programming), pointer that is augmented with additional information that enable the storage bounds within which it may point to be deduced. This additional information sometimes takes th ...
*
Buffer overflow In information security and programming, a buffer overflow, or buffer overrun, is an anomaly whereby a program, while writing data to a buffer, overruns the buffer's boundary and overwrites adjacent memory locations. Buffers are areas of memo ...
*
Cray pointer In some programming languages, const is a type qualifier (a keyword applied to a data type) that indicates that the data is read-only. While this can be used to declare constants, in the C family of languages differs from similar constructs i ...
*
Fat pointer In computer science, dynamic dispatch is the process of selecting which implementation of a polymorphic operation (method or function) to call at run time. It is commonly employed in, and considered a prime characteristic of, object-oriented ...
*
Function pointer A function pointer, also called a subroutine pointer or procedure pointer, is a pointer that points to a function. As opposed to referencing a data value, a function pointer points to executable code within memory. Dereferencing the function poi ...
*
Hazard pointer In a Thread (computer science), multithreaded computer science, computing environment, hazard pointers are one approach to solving the problems posed by dynamic memory management of the nodes in a non-blocking algorithm, lock-free data structure. T ...
*
Iterator In computer programming, an iterator is an object that enables a programmer to traverse a container, particularly lists. Various types of iterators are often provided via a container's interface. Though the interface and semantics of a given iterat ...
*
Opaque pointer In computer programming, an opaque pointer is a special case of an opaque data type, a data type declared to be a pointer to a record or data structure of some unspecified type. Opaque pointers are present in several programming languages includi ...
* Pointee *
Pointer swizzling In computer science, pointer swizzling is the conversion of references based on name or position into direct pointer references (memory addresses). It is typically performed during deserialization or loading of a relocatable object from a dis ...
*
Reference (computer science) In computer programming, a reference is a value that enables a program to indirectly access a particular data, such as a variable's value or a record, in the computer's memory or in some other storage device. The reference is said to refer t ...
*
Static program analysis In computer science, static program analysis (or static analysis) is the analysis of computer programs performed without executing them, in contrast with dynamic program analysis, which is performed on programs during their execution. The term ...
*
Storage violation In computing a storage violation is a hardware or software fault that occurs when a task attempts to access an area of computer storage which it is not permitted to access. Types of storage violation Storage violation can, for instance, consist ...
*
Tagged pointer In computer science, a tagged pointer is a pointer (concretely a memory address) with additional data associated with it, such as an indirection bit or reference count. This additional data is often "folded" into the pointer, meaning stored inline ...
*
Variable (computer science) In computer programming, a variable is an abstract storage location paired with an associated symbolic name, which contains some known or unknown quantity of information referred to as a ''value''; or in simpler terms, a variable is a named cont ...
*
Zero-based numbering Zero-based numbering is a way of numbering in which the initial element of a sequence is assigned the index 0, rather than the index 1 as is typical in everyday ''non-mathematical'' or ''non-programming'' circumstances. Under zero-base ...


References


External links


PL/I List Processing
Paper from the June, 1967 issue of CACM
cdecl.org
A tool to convert pointer declarations to plain English
Over IQ.com
A beginner level guide describing pointers in a plain English.
Pointers and Memory
Introduction to pointers – Stanford Computer Science Education Library

A visual model for the beginners in C programming
0pointer.de
A terse list of minimum length source codes that dereference a null pointer in several different programming languages

*. {{DEFAULTSORT:Pointer (Computing) Articles with example C code Data types Primitive types American inventions sv:Datatyp#Pekare och referenstyper