Null Pointer
   HOME

TheInfoList



OR:

In
computing Computing is any goal-oriented activity requiring, benefiting from, or creating computing machinery. It includes the study and experimentation of algorithmic processes, and development of both hardware and software. Computing has scientific, e ...
, a null pointer or null reference is a value saved for indicating that the pointer or
reference Reference is a relationship between objects in which one object designates, or acts as a means by which to connect to or link to, another object. The first object in this relation is said to ''refer to'' the second object. It is called a ''name'' ...
does not refer to a valid
object Object may refer to: General meanings * Object (philosophy), a thing, being, or concept ** Object (abstract), an object which does not exist at any particular time or place ** Physical object, an identifiable collection of matter * Goal, an ...
. Programs routinely use null pointers to represent conditions such as the end of a
list A ''list'' is any set of items in a row. List or lists may also refer to: People * List (surname) Organizations * List College, an undergraduate division of the Jewish Theological Seminary of America * SC Germania List, German rugby union ...
of unknown length or the failure to perform some action; this use of null pointers can be compared to
nullable type Nullable types are a feature of some programming languages which allow a value to be set to the special value NULL instead of the usual possible values of the data type. In statically typed languages, a nullable type is an option type, while in ...
s and to the ''Nothing'' value in an
option type In programming languages (especially functional programming languages) and type theory, an option type or maybe type is a polymorphic type that represents encapsulation of an optional value; e.g., it is used as the return type of functions whic ...
. A null pointer should not be confused with an uninitialized pointer: a null pointer is guaranteed to compare unequal to any pointer that points to a valid object. However, depending on the language and implementation, an uninitialized pointer may not have any such guarantee. It might compare equal to other, valid pointers; or it might compare equal to null pointers. It might do both at different times; or the comparison might be
undefined behaviour In computer programming, undefined behavior (UB) is the result of executing a program whose behavior is prescribed to be unpredictable, in the language specification to which the computer code adheres. This is different from unspecified behavior ...
.


C

In C, two null pointers of any type are guaranteed to compare equal. The preprocessor macro NULL is defined as an implementation-defined null pointer constant, which in
C99 C99 (previously known as C9X) is an informal name for ISO/IEC 9899:1999, a past version of the C programming language standard. It extends the previous version ( C90) with new features for the language and the standard library, and helps impl ...
can be portably expressed as ((void *)0) which means that the integer value 0 converted to the type void* (pointer to
void Void may refer to: Science, engineering, and technology * Void (astronomy), the spaces between galaxy filaments that contain no galaxies * Void (composites), a pore that remains unoccupied in a composite material * Void, synonym for vacuum, a ...
). The C standard does not say that the null pointer is the same as the pointer to
memory address In computing, a memory address is a reference to a specific memory location used at various levels by software and hardware. Memory addresses are fixed-length sequences of digits conventionally displayed and manipulated as unsigned integers. Su ...
 0, though that may be the case in practice.
Dereferencing In computer programming, the dereference operator or indirection operator, sometimes denoted by "*" (i.e. an asterisk), is a unary operator (i.e. one with a single operand) found in List of C-family programming languages, C-like languages that in ...
a null pointer is
undefined behavior In computer programming, undefined behavior (UB) is the result of executing a program whose behavior is prescribed to be unpredictable, in the language specification to which the computer code adheres. This is different from unspecified behavior ...
in C, and a conforming implementation is allowed to assume that any pointer that is dereferenced is not null. In practice, dereferencing a null pointer may result in an attempted read or write from
memory Memory is the faculty of the mind by which data or information is encoded, stored, and retrieved when needed. It is the retention of information over time for the purpose of influencing future action. If past events could not be remembered, ...
that is not mapped, triggering a
segmentation fault In computing, a segmentation fault (often shortened to segfault) or access violation is a fault, or failure condition, raised by hardware with memory protection, notifying an operating system (OS) the software has attempted to access a restrict ...
or memory access violation. This may manifest itself as a program crash, or be transformed into a software exception that can be caught by program code. There are, however, certain circumstances where this is not the case. For example, in
x86 x86 (also known as 80x86 or the 8086 family) is a family of complex instruction set computer (CISC) instruction set architectures initially developed by Intel based on the Intel 8086 microprocessor and its 8088 variant. The 8086 was introd ...
real mode Real mode, also called real address mode, is an operating mode of all x86-compatible CPUs. The mode gets its name from the fact that addresses in real mode always correspond to real locations in memory. Real mode is characterized by a 20-bit seg ...
, the address 0000:0000 is readable and also usually writable, and dereferencing a pointer to that address is a perfectly valid but typically unwanted action that may lead to undefined but non-crashing behavior in the application. There are occasions when dereferencing the pointer to address zero ''is'' intentional and well-defined; for example,
BIOS In computing, BIOS (, ; Basic Input/Output System, also known as the System BIOS, ROM BIOS, BIOS ROM or PC BIOS) is firmware used to provide runtime services for operating systems and programs and to perform hardware initialization during the ...
code written in C for 16-bit real-mode x86 devices may write the
interrupt descriptor table The interrupt descriptor table (IDT) is a data structure used by the x86 architecture to implement an interrupt vector table. The IDT is used by the processor to determine the correct response to interrupts and exceptions. The details in the des ...
(IDT) at physical address 0 of the machine by dereferencing a null pointer for writing. It is also possible for the compiler to optimize away the null pointer dereference, avoiding a segmentation fault but causing othe
undesired behavior


C++

In C++, while the NULL macro was inherited from C, the integer literal for zero has been traditionally preferred to represent a null pointer constant. However,
C++11 C++11 is a version of the ISO/ IEC 14882 standard for the C++ programming language. C++11 replaced the prior version of the C++ standard, called C++03, and was later replaced by C++14. The name follows the tradition of naming language versions b ...
introduced the explicit null pointer constant nullptr to be used instead.


Other languages

In some programming language environments (at least one proprietary Lisp implementation, for example), the value used as the null pointer (called nil in
Lisp A lisp is a speech impairment in which a person misarticulates sibilants (, , , , , , , ). These misarticulations often result in unclear speech. Types * A frontal lisp occurs when the tongue is placed anterior to the target. Interdental lisping ...
) may actually be a pointer to a block of internal data useful to the implementation (but not explicitly reachable from user programs), thus allowing the same register to be used as a useful constant and a quick way of accessing implementation internals. This is known as the nil vector. In languages with a
tagged architecture In computer science, a tagged architecture is a particular type of computer architecture where every word of memory constitutes a tagged union, being divided into a number of bits of data, and a ''tag'' section that describes the type of the data: ...
, a possibly null pointer can be replaced with a
tagged union In computer science, a tagged union, also called a variant, variant record, choice type, discriminated union, disjoint union, sum type or coproduct, is a data structure used to hold a value that could take on several different, but fixed, types. O ...
which enforces explicit handling of the exceptional case; in fact, a possibly null pointer can be seen as a
tagged pointer In computer science, a tagged pointer is a pointer (concretely a memory address) with additional data associated with it, such as an indirection bit or reference count. This additional data is often "folded" into the pointer, meaning stored inline ...
with a computed tag. Programming languages use different literals for the ''null pointer''. In Python, for example, a null value is called None. In Pascal and
Swift Swift or SWIFT most commonly refers to: * SWIFT, an international organization facilitating transactions between banks ** SWIFT code * Swift (programming language) * Swift (bird), a family of birds It may also refer to: Organizations * SWIFT, ...
, a null pointer is called nil. In
Eiffel Eiffel may refer to: Places * Eiffel Peak, a summit in Alberta, Canada * Champ de Mars – Tour Eiffel station, Paris, France; a transit station Structures * Eiffel Tower, in Paris, France, designed by Gustave Eiffel * Eiffel Bridge, Ungheni, M ...
, it is called a void reference.


Null dereferencing

Because a null pointer does not point to a meaningful object, an attempt to
dereference In computer programming, the dereference operator or indirection operator, sometimes denoted by "*" (i.e. an asterisk), is a unary operator (i.e. one with a single operand) found in C-like languages that include pointer variables. It operates ...
(i.e., access the data stored at that memory location) a null pointer usually (but not always) causes a run-time error or immediate program crash. * In C, dereferencing a null pointer is
undefined behavior In computer programming, undefined behavior (UB) is the result of executing a program whose behavior is prescribed to be unpredictable, in the language specification to which the computer code adheres. This is different from unspecified behavior ...
. ISO/IEC 9899, clause 6.5.3.2, paragraph 4, esp. footnote 87. Many implementations cause such code to result in the program being halted with an
access violation Access may refer to: Companies and organizations * ACCESS (Australia), an Australian youth network * Access (credit card), a former credit card in the United Kingdom * Access Co., a Japanese software company * Access Healthcare, an Indian BPO se ...
, because the null pointer representation is chosen to be an address that is never allocated by the system for storing objects. However, this behavior is not universal. It's also not guaranteed, since compilers are permitted to optimize programs under the assumption that they're free of undefined behaviour. * In
Delphi Delphi (; ), in legend previously called Pytho (Πυθώ), in ancient times was a sacred precinct that served as the seat of Pythia, the major oracle who was consulted about important decisions throughout the ancient classical world. The oracle ...
and many other Pascal implementations, the constant nil represents a null pointer to the first address in memory which is also used to initialize managed variables. Dereferencing it raises an external OS exception which is being mapped onto a Pascal '' exception instance if the unit is linked in the uses clause. * In
Java Java (; id, Jawa, ; jv, ꦗꦮ; su, ) is one of the Greater Sunda Islands in Indonesia. It is bordered by the Indian Ocean to the south and the Java Sea to the north. With a population of 151.6 million people, Java is the world's List ...
, access to a null reference triggers a (NPE), which can be caught by error handling code, but the preferred practice is to ensure that such exceptions never occur. * In
Lisp A lisp is a speech impairment in which a person misarticulates sibilants (, , , , , , , ). These misarticulations often result in unclear speech. Types * A frontal lisp occurs when the tongue is placed anterior to the target. Interdental lisping ...
, is a
first class object In programming language design, a first-class citizen (also type, object, entity, or value) in a given programming language is an entity which supports all the operations generally available to other entities. These operations typically include b ...
. By convention, (first nil) is , as is (rest nil). So dereferencing in these contexts will not cause an error, but poorly written code can get into an infinite loop. * In .NET, access to null reference triggers a to be thrown. Although catching these is generally considered bad practice, this exception type can be caught and handled by the program. * In
Objective-C Objective-C is a general-purpose, object-oriented programming language that adds Smalltalk-style messaging to the C programming language. Originally developed by Brad Cox and Tom Love in the early 1980s, it was selected by NeXT for its NeXTS ...
, messages may be sent to a nil object (which is a null pointer) without causing the program to be interrupted; the message is simply ignored, and the return value (if any) is nil or 0, depending on the type. * Before the introduction of Supervisor Mode Access Prevention (SMAP), a null pointer dereference bug could be exploited by mapping pagezero into the attacker's
address space In computing, an address space defines a range of discrete addresses, each of which may correspond to a network host, peripheral device, disk sector, a memory cell or other logical or physical entity. For software programs to save and retrieve st ...
and hence causing the null pointer to point to that region. This could lead to code execution in some cases.


Mitigation

There are techniques to facilitate debugging null pointer dereferences. Bond et al. suggest to modify the Java Virtual Machine (JVM) in order to keep track of null propagation. The idea of the Casper system is to use source code transformation in order to track this propagation, without modifying the JVM. In some cases, it is possible to automatically generate a patch to fix null pointer exceptions. Pure
functional languages In computer science, functional programming is a programming paradigm where programs are constructed by applying and composing functions. It is a declarative programming paradigm in which function definitions are trees of expressions that m ...
and user code run in many interpreted or virtual-machine languages do not suffer the problem of null pointer dereferencing, since no direct access is provided to pointers and, in the case of pure functional languages, all code and data is immutable. Where a language does provide or utilise pointers which could otherwise become void, it may be possible to mitigate or avoid runtime null dereferences by providing compilation-time checking via
static analysis Static analysis, static projection, or static scoring is a simplified analysis wherein the effect of an immediate change to a system is calculated without regard to the longer-term response of the system to that change. If the short-term effect i ...
or other techniques, with a burgeoning movement toward syntactic assistance from language features such as those seen in modern versions of the
Eiffel programming language Eiffel is an object-oriented programming language designed by Bertrand Meyer (an object-orientation proponent and author of '' Object-Oriented Software Construction'') and Eiffel Software. Meyer conceived the language in 1985 with the goal of i ...
, D, and
Rust Rust is an iron oxide, a usually reddish-brown oxide formed by the reaction of iron and oxygen in the catalytic presence of water or air moisture. Rust consists of hydrous iron(III) oxides (Fe2O3·nH2O) and iron(III) oxide-hydroxide (FeO(OH ...
. Similar analysis can be performed using external tools, in some languages.


History

In 2009,
Tony Hoare Sir Charles Antony Richard Hoare (Tony Hoare or C. A. R. Hoare) (born 11 January 1934) is a British computer scientist who has made foundational contributions to programming languages, algorithms, operating systems, formal verification, and c ...
stated that he invented the null reference in 1965 as part of the
ALGOL W ALGOL W is a programming language. It is based on a proposal for ALGOL X by Niklaus Wirth and Tony Hoare as a successor to ALGOL 60. ALGOL W is a relatively simple upgrade of the original ALGOL 60, adding string, bitstring, complex number and ...
language. In that 2009 reference Hoare describes his invention as a "billion-dollar mistake":


See also

*
Memory debugger Memory is the faculty of the mind by which data or information is encoded, stored, and retrieved when needed. It is the retention of information over time for the purpose of influencing future action. If past events could not be remembered, ...
*
Zero page The zero page or base page is the block of memory at the very beginning of a computer's address space; that is, the page whose starting address is zero. The size of a page depends on the context, and the significance of zero page memory versus h ...


References


Citations


Sources

* {{Nulls Data types