HOME

TheInfoList



OR:

Memory safety is the state of being protected from various
software bugs A software bug is an error, flaw or fault in the design, development, or operation of computer software that causes it to produce an incorrect or unexpected result, or to behave in unintended ways. The process of finding and correcting bugs ...
and
security vulnerabilities Vulnerabilities are flaws in a computer system that weaken the overall security of the device/system. Vulnerabilities can be weaknesses in either the hardware itself, or the software that runs on the hardware. Vulnerabilities can be exploited by ...
when dealing with
memory Memory is the faculty of the mind by which data or information is encoded, stored, and retrieved when needed. It is the retention of information over time for the purpose of influencing future action. If past events could not be remembered, ...
access, such as
buffer overflow In information security and programming, a buffer overflow, or buffer overrun, is an anomaly whereby a program, while writing data to a buffer, overruns the buffer's boundary and overwrites adjacent memory locations. Buffers are areas of memor ...
s and
dangling pointer Dangling pointers and wild pointers in computer programming are pointers that do not point to a valid object of the appropriate type. These are special cases of memory safety violations. More generally, dangling references and wild references ar ...
s. For example,
Java Java (; id, Jawa, ; jv, ꦗꦮ; su, ) is one of the Greater Sunda Islands in Indonesia. It is bordered by the Indian Ocean to the south and the Java Sea to the north. With a population of 151.6 million people, Java is the world's most ...
is said to be memory-safe because its
runtime error detection Runtime error detection is a software verification method that analyzes a software application as it executes and reports defects that are detected during that execution. It can be applied during unit testing, component testing, integration test ...
checks array bounds and pointer dereferences. In contrast, C and
C++ C, or c, is the third letter in the Latin alphabet, used in the modern English alphabet, the alphabets of other western European languages and others worldwide. Its name in English is ''cee'' (pronounced ), plural ''cees''. History "C" ...
allow arbitrary
pointer arithmetic In computer science, a pointer is an object in many programming languages that stores a memory address. This can be that of another value located in computer memory, or in some cases, that of memory-mapped computer hardware. A pointer ''refe ...
with pointers implemented as direct memory addresses with no provision for
bounds checking In computer programming, bounds checking is any method of detecting whether a variable is within some bounds before it is used. It is usually used to ensure that a number fits into a given type (range checking), or that a variable being used as ...
, and thus are potentially memory-unsafe.


History

Memory errors were first considered in the context of
resource management In organizational studies, resource management is the efficient and effective development of an organization's resources when they are needed. Such resources may include the financial resources, inventory, human skills, production resources, or i ...
and
time-sharing In computing, time-sharing is the sharing of a computing resource among many users at the same time by means of multiprogramming and multi-tasking.DEC Timesharing (1965), by Peter Clark, The DEC Professional, Volume 1, Number 1 Its emergence a ...
systems, in an effort to avoid problems such as
fork bomb In computing, a fork bomb (also called rabbit virus or wabbit) is a denial-of-service attack wherein a process continually replicates itself to deplete available system resources, slowing down or crashing the system due to resource starvation. ...
s. Developments were mostly theoretical until the Morris worm, which exploited a buffer overflow in
fingerd In computer networking, the Name/Finger protocol and the Finger user information protocol are simple network protocols for the exchange of human-oriented status and user information. Name/Finger protocol The Name/Finger protocol is based on Req ...
. The field of computer security developed quickly thereafter, escalating with multitudes of new attacks such as the
return-to-libc attack A "return-to-libc" attack is a computer security attack usually starting with a buffer overflow in which a subroutine return address on a call stack is replaced by an address of a subroutine that is already present in the process executable memory ...
and defense techniques such as the non-executable stack and
address space layout randomization Address space layout randomization (ASLR) is a computer security technique involved in preventing exploitation of memory corruption vulnerabilities. In order to prevent an attacker from reliably jumping to, for example, a particular exploited f ...
. Randomization prevents most
buffer overflow In information security and programming, a buffer overflow, or buffer overrun, is an anomaly whereby a program, while writing data to a buffer, overruns the buffer's boundary and overwrites adjacent memory locations. Buffers are areas of memor ...
attacks and requires the attacker to use
heap spraying In computer security, heap spraying is a technique used in exploits to facilitate arbitrary code execution. The part of the source code of an exploit that implements this technique is called a heap spray. In general, code that ''sprays the heap'' a ...
or other application-dependent methods to obtain addresses, although its adoption has been slow. However, deployments of the technology are typically limited to randomizing libraries and the location of the stack.


Impact

In 2019, a
Microsoft Microsoft Corporation is an American multinational technology corporation producing computer software, consumer electronics, personal computers, and related services headquartered at the Microsoft Redmond campus located in Redmond, Washingt ...
security engineer reported that 70 percent of all security vulnerabilities were caused by memory safety issues. In 2020, a team at
Google Google LLC () is an American multinational technology company focusing on search engine technology, online advertising, cloud computing, computer software, quantum computing, e-commerce, artificial intelligence, and consumer electronics. I ...
similarly reported that 70 percent of all "severe security bugs" in Google Chromium were caused by memory safety problems. Many other high-profile vulnerabilities and exploits in critical software have ultimately stemmed from a lack of memory safety, including
Heartbleed Heartbleed was a security bug in the OpenSSL cryptography library, which is a widely used implementation of the Transport Layer Security (TLS) protocol. It was introduced into the software in 2012 and publicly disclosed in April 2014. Heartble ...
and a long-standing privilege escalation bug in
sudo sudo ( or ) is a program for Unix-like computer operating systems that enables users to run programs with the security privileges of another user, by default the superuser. It originally stood for "superuser do", as that was all it did, and it ...
. The pervasiveness and severity of vulnerabilities and exploits arising from memory safety issues have led several security researchers to describe identifying memory safety issues as "shooting fish in a barrel".


Approaches

Most modern, high-level programming languages are memory-safe by default. Automatic
memory management Memory management is a form of resource management applied to computer memory. The essential requirement of memory management is to provide ways to dynamically allocate portions of memory to programs at their request, and free it for reuse whe ...
in the form of
garbage collection Waste collection is a part of the process of waste management. It is the transfer of solid waste from the point of use and disposal to the point of treatment or landfill. Waste collection also includes the curbside collection of recyclable m ...
is the most common technique for preventing memory safety problems, since it prevents common memory safety errors like use-after-free for all data allocated within the language runtime. When combined with automatic
bounds checking In computer programming, bounds checking is any method of detecting whether a variable is within some bounds before it is used. It is usually used to ensure that a number fits into a given type (range checking), or that a variable being used as ...
on all array accesses and no support for raw pointer arithmetic, garbage collected languages provide strong memory safety guarantees (though the guarantees may be weaker for low-level operations explicitly marked unsafe, such as use of a
foreign function interface A foreign function interface (FFI) is a mechanism by which a program written in one programming language can call routines or make use of services written in another. Naming The term comes from the specification for Common Lisp, which explicit ...
). However, the performance overhead of garbage collection makes these languages unsuitable for certain performance-critical applications. For languages that use
manual memory management In computer science, manual memory management refers to the usage of manual instructions by the programmer to identify and deallocate unused objects, or garbage. Up until the mid-1990s, the majority of programming languages used in industry supp ...
, memory safety cannot be guaranteed by the runtime. Instead, memory safety properties must either be guaranteed by the compiler via
static program analysis In computer science, static program analysis (or static analysis) is the analysis of computer programs performed without executing them, in contrast with dynamic program analysis, which is performed on programs during their execution. The term i ...
and
automated theorem proving Automated theorem proving (also known as ATP or automated deduction) is a subfield of automated reasoning and mathematical logic dealing with proving mathematical theorems by computer programs. Automated reasoning over mathematical proof was a m ...
or carefully managed by the programmer at runtime. For example, the
Rust programming language Rust is a multi-paradigm, general-purpose programming language. Rust emphasizes performance, type safety, and concurrency. Rust enforces memory safety—that is, that all references point to valid memory—without requiring the use of a garba ...
implements a borrow checker to ensure memory safety, while C and
C++ C, or c, is the third letter in the Latin alphabet, used in the modern English alphabet, the alphabets of other western European languages and others worldwide. Its name in English is ''cee'' (pronounced ), plural ''cees''. History "C" ...
provide no memory safety guarantees. The substantial amount of software written in C and C++ has motivated the development of external static analysis tools like
Coverity Coverity is a proprietary static code analysis tool from Synopsys. This product enables engineers and security teams to find and fix software defects. Coverity started as an independent software company in 2002 at the Computer Systems Laborator ...
, which offers static memory analysis for C. DieHard, its redesign DieHarder, and the Allinea Distributed Debugging Tool are special heap allocators that allocate objects in their own random virtual memory page, allowing invalid reads and writes to be stopped and debugged at the exact instruction that causes them. Protection relies upon hardware memory protection and thus overhead is typically not substantial, although it can grow significantly if the program makes heavy use of allocation. Randomization provides only probabilistic protection against memory errors, but can often be easily implemented in existing software by relinking the binary. The memcheck tool of
Valgrind Valgrind () is a programming tool for memory debugging, memory leak detection, and profiling. Valgrind was originally designed to be a free memory debugging tool for Linux on x86, but has since evolved to become a generic framework for creatin ...
uses an
instruction set simulator An instruction set simulator (ISS) is a simulation model, usually coded in a high-level programming language, which mimics the behavior of a mainframe or microprocessor by "reading" instructions and maintaining internal variables which represent ...
and runs the compiled program in a memory-checking virtual machine, providing guaranteed detection of a subset of runtime memory errors. However, it typically slows the program down by a factor of 40, and furthermore must be explicitly informed of custom memory allocators. With access to the source code, libraries exist that collect and track legitimate values for pointers ("metadata") and check each pointer access against the metadata for validity, such as the
Boehm garbage collector The Boehm–Demers–Weiser garbage collector, often simply known as Boehm GC, is a conservative garbage collector for C and C++ developed by Hans Boehm, Alan Demers, and Mark Weiser.Hans BoehmA garbage collector for C and C++/ref> Andrew W. Appe ...
. In general, memory safety can be safely assured using
tracing garbage collection In computer programming, tracing garbage collection is a form of automatic memory management that consists of determining which objects should be deallocated ("garbage collected") by tracing which objects are ''reachable'' by a chain of reference ...
and the insertion of runtime checks on every memory access; this approach has overhead, but less than that of Valgrind. All garbage-collected languages take this approach. For C and C++, many tools exist that perform a compile-time transformation of the code to do memory safety checks at runtime, such as CheckPointer and
AddressSanitizer AddressSanitizer (or ASan) is an open source programming tool that detects memory corruption bugs such as buffer overflows or accesses to a dangling pointer (use-after-free). AddressSanitizer is based on compiler instrumentation and directly mappe ...
which imposes an average slowdown factor of 2.


Types of memory errors

Many different types of memory errors can occur: * Access errors: invalid read/write of a pointer **
Buffer overflow In information security and programming, a buffer overflow, or buffer overrun, is an anomaly whereby a program, while writing data to a buffer, overruns the buffer's boundary and overwrites adjacent memory locations. Buffers are areas of memor ...
– out-of-bound writes can corrupt the content of adjacent objects, or internal data (like bookkeeping information for the heap) or return addresses. ** Buffer over-read – out-of-bound reads can reveal sensitive data or help attackers bypass
address space layout randomization Address space layout randomization (ASLR) is a computer security technique involved in preventing exploitation of memory corruption vulnerabilities. In order to prevent an attacker from reliably jumping to, for example, a particular exploited f ...
. **
Race condition A race condition or race hazard is the condition of an electronics, software, or other system where the system's substantive behavior is dependent on the sequence or timing of other uncontrollable events. It becomes a bug when one or more of ...
– concurrent reads/writes to shared memory **
Invalid page fault In computing, a page fault (sometimes called PF or hard fault) is an exception that the memory management unit (MMU) raises when a process accesses a memory page without proper preparations. Accessing the page requires a mapping to be added to ...
– accessing a pointer outside the virtual memory space. A null pointer dereference will often cause an exception or program termination in most environments, but can cause corruption in operating system
kernel Kernel may refer to: Computing * Kernel (operating system), the central component of most operating systems * Kernel (image processing), a matrix used for image convolution * Compute kernel, in GPGPU programming * Kernel method, in machine learni ...
s or systems without
memory protection Memory protection is a way to control memory access rights on a computer, and is a part of most modern instruction set architectures and operating systems. The main purpose of memory protection is to prevent a process from accessing memory that ha ...
, or when use of the null pointer involves a large or negative offset. ** Use after free – dereferencing a
dangling pointer Dangling pointers and wild pointers in computer programming are pointers that do not point to a valid object of the appropriate type. These are special cases of memory safety violations. More generally, dangling references and wild references ar ...
storing the address of an object that has been deleted. *
Uninitialized variable In computing, an uninitialized variable is a variable that is declared but is not set to a definite known value before it is used. It will have ''some'' value, but not a predictable one. As such, it is a programming error and a common source of b ...
s – a variable that has not been assigned a value is used. It may contain an undesired or, in some languages, a corrupt value. **
Null pointer In computing, a null pointer or null reference is a value saved for indicating that the pointer or reference does not refer to a valid object. Programs routinely use null pointers to represent conditions such as the end of a list of unknown leng ...
dereference – dereferencing an invalid pointer or a pointer to memory that has not been allocated **
Wild pointer Wild, wild, wilds or wild may refer to: Common meanings * Wild animal * Wilderness, a wild natural environment * Wildness, the quality of being wild or untamed Art, media and entertainment Film and television * ''Wild'' (2014 film), a 2014 A ...
s arise when a pointer is used prior to initialization to some known state. They show the same erratic behaviour as dangling pointers, though they are less likely to stay undetected. *
Memory leak In computer science, a memory leak is a type of resource leak that occurs when a computer program incorrectly manages memory allocations in a way that memory which is no longer needed is not released. A memory leak may also happen when an objec ...
– when memory usage is not tracked or is tracked incorrectly ** Stack exhaustion – occurs when a program runs out of stack space, typically because of too deep
recursion Recursion (adjective: ''recursive'') occurs when a thing is defined in terms of itself or of its type. Recursion is used in a variety of disciplines ranging from linguistics to logic. The most common application of recursion is in mathematic ...
. A
guard page Buffer overflow protection is any of various techniques used during software development to enhance the security of executable programs by detecting buffer overflows on stack-allocated variables, and preventing them from causing program misbehavior ...
typically halts the program, preventing memory corruption, but functions with large
stack frame In computer science, a call stack is a stack data structure that stores information about the active subroutines of a computer program. This kind of stack is also known as an execution stack, program stack, control stack, run-time stack, or ma ...
s may bypass the page. ** Heap exhaustion – the program tries to allocate more memory than the amount available. In some languages, this condition must be checked for manually after each allocation. ** Double free – repeated calls to
free Free may refer to: Concept * Freedom, having the ability to do something, without having to obey anyone/anything * Freethought, a position that beliefs should be formed only on the basis of logic, reason, and empiricism * Emancipate, to procure ...
may prematurely free a new object at the same address. If the exact address has not been reused, other corruption may occur, especially in allocators that use
free list A free list (or freelist) is a data structure used in a scheme for dynamic memory allocation. It operates by connecting unallocated regions of memory together in a linked list, using the first word of each unallocated region as a pointer to the ne ...
s. ** Invalid free – passing an invalid address to
free Free may refer to: Concept * Freedom, having the ability to do something, without having to obey anyone/anything * Freethought, a position that beliefs should be formed only on the basis of logic, reason, and empiricism * Emancipate, to procure ...
can corrupt the heap. ** Mismatched free – when multiple allocators are in use, attempting to free memory with a deallocation function of a different allocator ** Unwanted
aliasing In signal processing and related disciplines, aliasing is an effect that causes different signals to become indistinguishable (or ''aliases'' of one another) when sampled. It also often refers to the distortion or artifact that results when a ...
– when the same memory location is allocated and modified twice for unrelated purposes.


References

{{Memory management navbox Software bugs Computer security exploits Programming language implementation