Dangling pointers and wild pointers in
computer programming
Computer programming is the process of performing a particular computation (or more generally, accomplishing a specific computing result), usually by designing and building an executable computer program. Programming involves tasks such as ana ...
are
pointers
Pointer may refer to:
Places
* Pointer, Kentucky
* Pointers, New Jersey
* Pointers Airport, Wasco County, Oregon, United States
* The Pointers, a pair of rocks off Antarctica
People with the name
* Pointer (surname), a surname (including a l ...
that do not point to a valid object of the appropriate type. These are special cases of
memory safety
Memory safety is the state of being protected from various software bugs and Vulnerability (computing), security vulnerabilities when dealing with random-access memory, memory access, such as buffer overflows and dangling pointers. For example, J ...
violations. More generally, dangling references and wild references are
references
Reference is a relationship between objects in which one object designates, or acts as a means by which to connect to or link to, another object. The first object in this relation is said to ''refer to'' the second object. It is called a ''name'' ...
that do not resolve to a valid destination.
Dangling pointers arise during
object destruction
In object-oriented programming (OOP), the object lifetime (or life cycle) of an object is the time between an object's creation and its destruction. Rules for object lifetime vary significantly between languages, in some cases between implement ...
, when an object that has an incoming reference is deleted or deallocated, without modifying the value of the pointer, so that the pointer still points to the memory location of the deallocated memory. The system may reallocate the previously freed memory, and if the program then
dereferences the (now) dangling pointer, ''
unpredictable behavior may result'', as the memory may now contain completely different data. If the program writes to memory referenced by a dangling pointer, a silent corruption of unrelated data may result, leading to subtle
bugs that can be extremely difficult to find. If the memory has been reallocated to another process, then attempting to dereference the dangling pointer can cause
segmentation fault
In computing, a segmentation fault (often shortened to segfault) or access violation is a fault, or failure condition, raised by hardware with memory protection, notifying an operating system (OS) the software has attempted to access a restricte ...
s (UNIX, Linux) or
general protection fault
A general protection fault (GPF) in the x86 instruction set architectures (ISAs) is a fault (a type of interrupt) initiated by ISA-defined protection mechanisms in response to an access violation caused by some running code, either in the kernel ...
s (Windows). If the program has sufficient privileges to allow it to overwrite the bookkeeping data used by the kernel's memory allocator, the corruption can cause system instabilities. In
object-oriented language
Object-oriented programming (OOP) is a programming paradigm based on the concept of "objects", which can contain data and code. The data is in the form of fields (often known as attributes or ''properties''), and the code is in the form of ...
s with
garbage collection
Waste collection is a part of the process of waste management. It is the transfer of solid waste from the point of use and disposal to the point of treatment or landfill. Waste collection also includes the curbside collection of recyclabl ...
, dangling references are prevented by only destroying objects that are unreachable, meaning they do not have any incoming pointers; this is ensured either by tracing or
reference counting
In computer science, reference counting is a programming technique of storing the number of references, pointers, or handles to a resource, such as an object, a block of memory, disk space, and others.
In garbage collection algorithms, referenc ...
. However, a
finalizer may create new references to an object, requiring
object resurrection
In object-oriented programming languages with garbage collection, object resurrection is when an object comes back to life during the process of object destruction, as a side effect of a finalizer being executed.
Object resurrection causes a num ...
to prevent a dangling reference.
Wild pointers arise when a pointer is used prior to initialization to some known state, which is possible in some programming languages. They show the same erratic behavior as dangling pointers, though they are less likely to stay undetected because many compilers will raise a warning at compile time if declared variables are accessed before being initialized.
Cause of dangling pointers
In many languages (e.g., the
C programming language
''The C Programming Language'' (sometimes termed ''K&R'', after its authors' initials) is a computer programming book written by Brian Kernighan and Dennis Ritchie, the latter of whom originally designed and implemented the language, as well as ...
) deleting an object from memory explicitly or by destroying the
stack frame
In computer science, a call stack is a stack data structure that stores information about the active subroutines of a computer program. This kind of stack is also known as an execution stack, program stack, control stack, run-time stack, or mac ...
on return does not alter associated pointers. The pointer still points to the same location in memory even though it may now be used for other purposes.
A straightforward example is shown below:
If the operating system is able to detect run-time references to
null pointer
In computing, a null pointer or null reference is a value saved for indicating that the pointer or reference does not refer to a valid object. Programs routinely use null pointers to represent conditions such as the end of a list of unknown lengt ...
s, a solution to the above is to assign 0 (null) to dp immediately before the inner block is exited. Another solution would be to somehow guarantee dp is not used again without further initialization.
Another frequent source of dangling pointers is a jumbled combination of
malloc()
and
free()
library calls: a pointer becomes dangling when the block of memory it points to is freed. As with the previous example one way to avoid this is to make sure to reset the pointer to null after freeing its reference—as demonstrated below.
#include
void func()
An all too common misstep is returning addresses of a stack-allocated local variable: once a called function returns, the space for these variables gets deallocated and technically they have "garbage values".
int *func(void)
Attempts to read from the pointer may still return the correct value (1234) for a while after calling
func
, but any functions called thereafter may overwrite the stack storage allocated for
num
with other values and the pointer would no longer work correctly. If a pointer to
num
must be returned,
num
must have scope beyond the function—it might be declared as
static
Static may refer to:
Places
*Static Nunatak, a nunatak in Antarctica
United States
* Static, Kentucky and Tennessee
*Static Peak, a mountain in Wyoming
**Static Peak Divide, a mountain pass near the peak
Science and technology Physics
*Static el ...
.
Manual deallocation without dangling reference
(1945-1996) has created a complete object management system which is free of dangling reference phenomenon. A similar approach was proposed by Fisher and LeBlanc
[C.N. Fisher, R.J. Leblanc, ''The implementation of run-time diagnostics in Pascal '', IEEE Trans. Softw. Eng., 6(4):313-319, 1980] under the name ''
Locks-and-keys
Locks-and-keys is a solution to dangling pointers in computer programming languages.
The locks-and-keys approach represents pointers as ordered pairs (key, address) where the key is an integer value. Heap-dynamic variables are represented as the ...
''.
Cause of wild pointers
Wild pointers are created by omitting necessary initialization prior to first use. Thus, strictly speaking, every pointer in programming languages which do not enforce initialization begins as a wild pointer.
This most often occurs due to jumping over the initialization, not by omitting it. Most compilers are able to warn about this.
int f(int i)
Security holes involving dangling pointers
Like
buffer-overflow bugs, dangling/wild pointer bugs frequently become security holes. For example, if the pointer is used to make a
virtual function
In object-oriented programming, in languages such as C++, and Object Pascal, a virtual function or virtual method is an inheritable and overridable function or method for which dynamic dispatch is facilitated. This concept is an important part o ...
call, a different address (possibly pointing at exploit code) may be called due to the
vtable
In computer programming, a virtual method table (VMT), virtual function table, virtual call table, dispatch table, vtable, or vftable is a mechanism used in a programming language to support dynamic dispatch (or run-time method binding).
When ...
pointer being overwritten. Alternatively, if the pointer is used for writing to memory, some other data structure may be corrupted. Even if the memory is only read once the pointer becomes dangling, it can lead to information leaks (if interesting data is put in the next structure allocated there) or to
privilege escalation
Privilege escalation is the act of exploiting a bug, a design flaw, or a configuration oversight in an operating system or software application to gain elevated access to resources that are normally protected from an application or user. The res ...
(if the now-invalid memory is used in security checks). When a dangling pointer is used after it has been freed without allocating a new chunk of memory to it, this becomes known as a "use after free" vulnerability. For example, is a use-after-free vulnerability in Microsoft Internet Explorer 6 through 11 being used by
zero-day attack
A zero-day (also known as a 0-day) is a computer-software vulnerability previously unknown to those who should be interested in its mitigation, like the vendor of the target software. Until the vulnerability is mitigated, hackers can exploit ...
s by an
advanced persistent threat
An advanced persistent threat (APT) is a stealthy threat actor, typically a nation state or state-sponsored group, which gains unauthorized access to a computer network and remains undetected for an extended period. In recent times, the term may ...
.
Avoiding dangling pointer errors
In C, the simplest technique is to implement an alternative version of the
free()
(or alike) function which guarantees the reset of the pointer. However, this technique will not clear other pointer variables which may contain a copy of the pointer.
#include
#include
/* Alternative version for 'free()' */
static void safefree(void **pp)
int f(int i)
The alternative version can be used even to guarantee the validity of an empty pointer before calling
malloc()
:
safefree(&p); /* i'm not sure if chunk has been released */
p = malloc(1000); /* allocate now */
These uses can be masked through
#define
directives to construct useful macros (a common one being
#define XFREE(ptr) safefree((void **)&(ptr))
), creating something like a metalanguage or can be embedded into a tool library apart. In every case, programmers using this technique should use the safe versions in every instance where
free()
would be used; failing in doing so leads again to the problem. Also, this solution is limited to the scope of a single program or project, and should be properly documented.
Among more structured solutions, a popular technique to avoid dangling pointers in C++ is to use
smart pointer
In computer science, a smart pointer is an abstract data type that simulates a pointer while providing added features, such as automatic memory management or bounds checking. Such features are intended to reduce bugs caused by the misuse of poin ...
s. A smart pointer typically uses
reference counting
In computer science, reference counting is a programming technique of storing the number of references, pointers, or handles to a resource, such as an object, a block of memory, disk space, and others.
In garbage collection algorithms, referenc ...
to reclaim objects. Some other techniques include the
tombstones
A headstone, tombstone, or gravestone is a stele or marker, usually stone, that is placed over a grave. It is traditional for burials in the Christian, Jewish, and Muslim religions, among others. In most cases, it has the deceased's name, ...
method and the
locks-and-keys
Locks-and-keys is a solution to dangling pointers in computer programming languages.
The locks-and-keys approach represents pointers as ordered pairs (key, address) where the key is an integer value. Heap-dynamic variables are represented as the ...
method.
Another approach is to use the
Boehm garbage collector
The Boehm–Demers–Weiser garbage collector, often simply known as Boehm GC, is a conservative garbage collector for C and C++ developed by Hans Boehm, Alan Demers, and Mark Weiser.Hans BoehmA garbage collector for C and C++/ref> Andrew W. Appe ...
, a conservative
garbage collector
A waste collector, also known as a garbageman, garbage collector, trashman (in the US), binman or (rarely) dustman (in the UK), is a person employed by a public or private enterprise to collect and dispose of municipal solid waste (refuse) and r ...
that replaces standard memory allocation functions in C and
C++
C++ (pronounced "C plus plus") is a high-level general-purpose programming language created by Danish computer scientist Bjarne Stroustrup as an extension of the C programming language, or "C with Classes". The language has expanded significan ...
with a garbage collector. This approach completely eliminates dangling pointer errors by disabling frees, and reclaiming objects by garbage collection.
In languages like Java, dangling pointers cannot occur because there is no mechanism to explicitly deallocate memory. Rather, the garbage collector may deallocate memory, but only when the object is no longer reachable from any references.
In the language
Rust
Rust is an iron oxide, a usually reddish-brown oxide formed by the reaction of iron and oxygen in the catalytic presence of water or air moisture. Rust consists of hydrous iron(III) oxides (Fe2O3·nH2O) and iron(III) oxide-hydroxide (FeO(OH ...
, the
type system
In computer programming, a type system is a logical system comprising a set of rules that assigns a property called a type to every "term" (a word, phrase, or other set of symbols). Usually the terms are various constructs of a computer progra ...
has been extended to include also the variables lifetimes and
resource acquisition is initialization. Unless one disables the features of the language, dangling pointers will be caught at compile time and reported as programming errors.
Dangling pointer detection
To expose dangling pointer errors, one common programming technique is to set pointers to the
null pointer
In computing, a null pointer or null reference is a value saved for indicating that the pointer or reference does not refer to a valid object. Programs routinely use null pointers to represent conditions such as the end of a list of unknown lengt ...
or to an invalid address once the storage they point to has been released. When the null pointer is dereferenced (in most languages) the program will immediately terminate—there is no potential for data corruption or unpredictable behavior. This makes the underlying programming mistake easier to find and resolve. This technique does not help when there are multiple copies of the pointer.
Some debuggers will automatically overwrite and destroy data that has been freed, usually with a specific pattern, such as
0xDEADBEEF
(Microsoft's Visual C/C++ debugger, for example, uses
0xCC
,
0xCD
or
0xDD
depending on what has been freed). This usually prevents the data from being reused by making it useless and also very prominent (the pattern serves to show the programmer that the memory has already been freed).
Tools such as
Polyspace
Polyspace is a static code analysis tool for large-scale analysis by abstract interpretation to detect, or prove the absence of, certain run-time errors in source code for the C, C++, and Ada programming languages. The tool also checks source co ...
,
TotalView,
Valgrind
Valgrind () is a programming tool for memory debugging, memory leak detection, and profiling.
Valgrind was originally designed to be a free memory debugging tool for Linux on x86, but has since evolved to become a generic framework for creati ...
, Mudflap,
AddressSanitizer
AddressSanitizer (or ASan) is an open source programming tool that detects memory corruption bugs such as buffer overflows or accesses to a dangling pointer (use-after-free). AddressSanitizer is based on compiler instrumentation and directly mappe ...
, or tools based on
LLVM
LLVM is a set of compiler and toolchain technologies that can be used to develop a front end for any programming language and a back end for any instruction set architecture. LLVM is designed around a language-independent intermediate represen ...
[Dhurjati, D. and Adve, V]
Efficiently Detecting All Dangling Pointer Uses in Production Servers
/ref> can also be used to detect uses of dangling pointers.
Other tools
SoftBound
Insure++
Insure++ is a memory debugger computer program, used by software developers to detect various errors in programs written in C and C++. It is made by Parasoft, and is functionally similar to other memory debuggers, such as Purify, Valgrind an ...
, an
CheckPointer
instrument the source code to collect and track legitimate values for pointers ("metadata") and check each pointer access against the metadata for validity.
Another strategy, when suspecting a small set of classes, is to temporarily make all their member functions virtual: after the class instance has been destructed/freed, its pointer to the Virtual Method Table
In computer programming, a virtual method table (VMT), virtual function table, virtual call table, dispatch table, vtable, or vftable is a mechanism used in a programming language to support dynamic dispatch (or run-time method binding).
When ...
is set to NULL
, and any call to a member function will crash the program and it will show the guilty code in the debugger.
See also
*Common Vulnerabilities and Exposures
The Common Vulnerabilities and Exposures (CVE) system provides a reference-method for publicly known information-security vulnerabilities and exposures. The United States' National Cybersecurity FFRDC, operated by The MITRE Corporation, maintai ...
*Link rot
Link rot (also called link death, link breaking, or reference rot) is the phenomenon of hyperlinks tending over time to cease to point to their originally targeted file, web page, or server due to that resource being relocated to a new address ...
*Memory debugger
Memory is the faculty of the mind by which data or information is encoded, stored, and retrieved when needed. It is the retention of information over time for the purpose of influencing future action. If past events could not be remembered, ...
*Wild branch In computer programming, a wild branch is a GOTO instruction where the target address is indeterminate, random or otherwise unintended. It is usually the result of a software bug causing the accidental corruption of a pointer or array index. It is ...
References
{{DEFAULTSORT:Dangling Pointer
Software bugs
Computer security exploits