In computer science, garbage collection (GC) is a form of automatic memory management. The ''garbage collector'' attempts to reclaim memory which was allocated by the program, but is no longer referenced; such memory is called ''garbage''. Garbage collection was invented by American computer scientist John McCarthy around 1959 to simplify manual memory management in Lisp.
Garbage collection relieves the programmer from doing manual memory management, where the programmer specifies what objects to de-allocate and return to the memory system and when to do so. Other, similar techniques include stack allocation, region inference, and memory ownership, and combinations thereof. Garbage collection may take a significant proportion of a program's total processing time, and affect performance as a result.
Resources other than memory, such as network sockets, database handles, windows, file descriptors, and device descriptors, are not typically handled by garbage collection, but rather by other methods (e.g. destructors). Some such methods de-allocate memory as well.
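As a minimal sketch of such a method, the following Python example releases a file descriptor deterministically through a context manager rather than leaving it to the garbage collector. The names `ManagedFile`, `read_first_bytes` and `demo` are hypothetical and chosen only for illustration.

```python
import os
import tempfile


class ManagedFile:
    """Hypothetical wrapper that releases its file descriptor deterministically."""

    def __init__(self, path):
        self.fd = os.open(path, os.O_RDONLY)
        self.closed = False

    def close(self):
        # Release the descriptor explicitly; safe to call more than once.
        if not self.closed:
            os.close(self.fd)
            self.closed = True

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        self.close()
        return False  # do not suppress exceptions


def read_first_bytes(path, n):
    with ManagedFile(path) as f:
        data = os.read(f.fd, n)
    # The descriptor is already closed here, regardless of when the
    # ManagedFile object itself is reclaimed by the memory manager.
    return data, f.closed


def demo():
    # Create a small temporary file to read back.
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"hello world")
        path = tmp.name
    try:
        return read_first_bytes(path, 5)
    finally:
        os.remove(path)
```

The point of the sketch is that the moment of release is tied to program structure (leaving the `with` block), not to the moment the wrapper object becomes garbage.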
Overview
Many programming languages require garbage collection, either as part of the language specification (e.g., RPL, Java, C#, D, Go, and most scripting languages) or effectively for practical implementation (e.g., formal languages like lambda calculus). These are said to be ''garbage-collected languages''. Other languages, such as C and C++, were designed for use with manual memory management, but have garbage-collected implementations available. Some languages, like Ada, Modula-3, and C++/CLI, allow both garbage collection and manual memory management to co-exist in the same application by using separate heaps for collected and manually managed objects. Still others, like D, are garbage-collected but allow the user to manually delete objects or even disable garbage collection entirely when speed is required.
Although many languages integrate GC into their compiler and runtime system, ''post-hoc'' GC systems also exist, such as Automatic Reference Counting (ARC). Some of these ''post-hoc'' GC systems do not require recompilation. ''Post-hoc'' GC is sometimes called ''litter collection'', to distinguish it from ordinary GC.
Advantages
GC frees the programmer from manually de-allocating memory. This helps avoid some kinds of errors:
* ''Dangling pointers'', which occur when a piece of memory is freed while there are still pointers to it, and one of those pointers is dereferenced. By then the memory may have been reassigned to another use, with unpredictable results.
* ''Double free bugs'', which occur when the program tries to free a region of memory that has already been freed, and perhaps already been allocated again.
* Certain kinds of ''memory leaks'', in which a program fails to free memory occupied by objects that have become unreachable, which can lead to memory exhaustion.
Disadvantages
GC uses computing resources to decide which memory to free. Therefore, the penalty for the convenience of not annotating object lifetime manually in the source code is overhead, which can impair program performance. A peer-reviewed paper from 2005 concluded that GC needs five times the memory to compensate for this overhead and to perform as fast as the same program using idealised explicit memory management. The comparison, however, is made against a program generated by inserting deallocation calls using an oracle, implemented by collecting traces from programs run under a profiler, and the resulting program is only correct for one particular execution. Interaction with memory hierarchy effects can make this overhead intolerable in circumstances that are hard to predict or to detect in routine testing. The impact on performance was given by Apple as a reason for not adopting garbage collection in iOS, despite it being the most desired feature.
The moment when the garbage is actually collected can be unpredictable, resulting in stalls (pauses to shift/free memory) scattered throughout a session. Unpredictable stalls can be unacceptable in real-time environments, in transaction processing, or in interactive programs. Incremental, concurrent, and real-time garbage collectors address these problems, with varying trade-offs.
Strategies
Tracing
Tracing garbage collection is the most common type of garbage collection, so much so that "garbage collection" often refers to tracing garbage collection, rather than other methods such as reference counting. The overall strategy consists of determining which objects should be garbage collected by tracing which objects are ''reachable'' by a chain of references from certain root objects, and considering the rest as garbage and collecting them. However, there are a large number of algorithms used in implementation, with widely varying complexity and performance characteristics.
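The overall strategy can be sketched with a toy mark-and-sweep pass, one of the simplest tracing algorithms. This is an illustration only, not the algorithm of any particular runtime; the `Obj` class, the explicit `heap` list and the function names are assumptions made for the example.

```python
class Obj:
    """Toy heap object holding outgoing references (names are illustrative)."""

    def __init__(self, name):
        self.name = name
        self.refs = []        # outgoing references to other Obj instances
        self.marked = False


def mark(roots):
    """Mark phase: flag every object reachable from the roots."""
    stack = list(roots)
    while stack:
        obj = stack.pop()
        if not obj.marked:
            obj.marked = True
            stack.extend(obj.refs)


def sweep(heap):
    """Sweep phase: keep marked objects, report the rest, reset the marks."""
    live = [o for o in heap if o.marked]
    garbage = [o for o in heap if not o.marked]
    for o in live:
        o.marked = False
    return live, garbage


def collect(heap, roots):
    mark(roots)
    return sweep(heap)


a, b, c, d = Obj("a"), Obj("b"), Obj("c"), Obj("d")
a.refs.append(b)      # a -> b
c.refs.append(d)      # c -> d
d.refs.append(c)      # d -> c (a cycle that no root can reach)
heap = [a, b, c, d]
live, garbage = collect(heap, roots=[a])
```

Here `a` and `b` survive because they are reachable from the root, while `c` and `d` are treated as garbage even though they refer to each other, since no chain of references from a root leads to them.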
Reference counting
In reference counting garbage collection, each object has a count of the number of references to it. Garbage is identified by having a reference count of zero. An object's reference count is incremented when a reference to it is created, and decremented when a reference is destroyed. When the count reaches zero, the object's memory is reclaimed.
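The bookkeeping described above can be sketched with a toy heap that tracks counts explicitly. The `Heap` class and its method names are hypothetical and serve only to make the increment, decrement and reclaim steps visible.

```python
class Heap:
    """Toy heap with explicit reference counts (all names are illustrative)."""

    def __init__(self):
        self.rc = {}        # object name -> reference count
        self.fields = {}    # object name -> names of objects it references
        self.freed = []     # order in which objects were reclaimed

    def alloc(self, name):
        self.rc[name] = 0
        self.fields[name] = []

    def add_ref(self, name):
        # A new reference to `name` was created.
        self.rc[name] += 1

    def store(self, owner, target):
        # `owner` now holds a reference to `target`.
        self.fields[owner].append(target)
        self.add_ref(target)

    def drop_ref(self, name):
        # A reference to `name` was destroyed.
        self.rc[name] -= 1
        if self.rc[name] == 0:
            self._free(name)

    def _free(self, name):
        # Reclaim the object and release the references it held.
        self.freed.append(name)
        children = self.fields.pop(name)
        del self.rc[name]
        for child in children:
            self.drop_ref(child)


rc_heap = Heap()
rc_heap.alloc("list")
rc_heap.alloc("node")
rc_heap.add_ref("list")          # a root variable refers to "list"
rc_heap.store("list", "node")    # "list" refers to "node"
rc_heap.drop_ref("list")         # the root variable goes away
```

Dropping the only reference to `"list"` reclaims it, which in turn drops the only reference to `"node"`, so both objects are reclaimed in that order.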
As with manual memory management, and unlike tracing garbage collection, reference counting guarantees that objects are destroyed as soon as their last reference is destroyed, and usually only accesses memory which is either in CPU caches, in objects to be freed, or directly pointed to by those, and thus tends to not have significant negative side effects on CPU cache and virtual memory operation.
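The prompt destruction described above can be observed in CPython, which uses reference counting as its primary mechanism. The sketch assumes CPython; other Python implementations that rely on tracing collectors may run the callback later. The `Resource` class and the `events` list are illustrative names.

```python
import weakref


class Resource:
    """Placeholder object; the class name is illustrative."""


events = []

r = Resource()
# Register a callback that runs when the object is reclaimed.
weakref.finalize(r, events.append, "reclaimed")

alias = r                 # second reference to the same object
del r                     # one reference remains, so nothing is reclaimed
events.append("after first del")
del alias                 # last reference destroyed
events.append("after second del")
```

Under CPython the callback fires between the two markers, that is, at the moment the last reference disappears rather than at some later collection.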
There are a number of disadvantages to reference counting; these can generally be solved or mitigated by more sophisticated algorithms:
; Cycles: If two or more objects refer to each other, they can create a cycle whereby neither will be collected as their mutual references never let their reference counts become zero. Some garbage collection systems using reference counting (like the one in CPython) use specific cycle-detecting algorithms to deal with this issue. Another strategy is to use weak references for the "backpointers" which create cycles. Under reference counting, a weak reference is similar to a weak reference under a tracing garbage collector. It is a special reference object whose existence does not increment the reference count of the referent object. Furthermore, a weak reference is safe in that when the referent object becomes garbage, any weak reference to it ''lapses'', rather than being permitted to remain dangling, meaning that it turns into a predictable value, such as a null reference.
; Space overhead (reference count): Reference counting requires space to be allocated for each object to store its reference count. The count may be stored adjacent to the object's memory or in a side table somewhere else, but in either case, every single reference-counted object requires additional storage for its reference count. Memory space with the size of an unsigned pointer is commonly used for this task, meaning that 32 or 64 bits of reference count storage must be allocated for each object. On some systems, it may be possible to mitigate this overhead by using a tagged pointer to store the reference count in unused areas of the object's memory. Often, an architecture does not actually allow programs to access the full range of memory addresses that could be stored in its native pointer size; a certain number of high bits in the address are either ignored or required to be zero. If an object reliably has a pointer at a certain location, the reference count can be stored in the unused bits of the pointer. For example, each object in Objective-C has a pointer to its class at the beginning of its memory; on the ARM64 architecture using iOS 7, 19 unused bits of this class pointer are used to store the object's reference count.
; Speed overhead (increment/decrement): In naive implementations, each assignment of a reference and each reference falling out of scope often require modifications of one or more reference counters. However, in the common case where a reference is copied from an outer scope variable into an inner scope variable, such that the lifetime of the inner variable is bounded by the lifetime of the outer one, the reference incrementing can be eliminated. The outer variable "owns" the reference. In the programming language C++, this technique is readily implemented and demonstrated with the use of const references. Reference counting in C++ is usually implemented using "smart pointers" whose constructors, destructors and assignment operators manage the references. A smart pointer can be passed by reference to a function, which avoids the need to copy-construct a new smart pointer (which would increase the reference count on entry into the function and decrease it on exit). Instead the function receives a reference to the smart pointer which is produced inexpensively. The Deutsch-Bobrow method of reference counting capitalizes on the fact that most reference count updates are in fact generated by references stored in local variables. It ignores these references, only counting references in the heap, but before an object with reference count zero can be deleted, the system must verify with a scan of the stack and registers that no other reference to it still exists. A further substantial decrease in the overhead on counter updates can be obtained by update coalescing introduced by Levanoni and Petrank. Consider a pointer that in a given interval of the execution is updated several times. It first points to an object O1, then to an object O2, and so forth until at the end of the interval it points to some object On. A reference counting algorithm would typically execute rc(O1)--, rc(O2)++, rc(O2)--, rc(O3)++, rc(O3)--, ..., rc(On)++. But most of these updates are redundant. In order to have the reference count properly evaluated at the end of the interval it is enough to perform rc(O1)-- and rc(On)++. Levanoni and Petrank measured an elimination of more than 99% of the counter updates in typical Java benchmarks.
; Requires atomicity: When used in a multithreaded environment, these modifications (increment and decrement) may need to be atomic operations such as compare-and-swap, at least for any objects which are shared, or potentially shared among multiple threads. Atomic operations are expensive on a multiprocessor, and even more expensive if they have to be emulated with software algorithms. It is possible to avoid this issue by adding per-thread or per-CPU reference counts and only accessing the global reference count when the local reference counts become or are no longer zero (or, alternatively, using a binary tree of reference counts, or even giving up deterministic destruction in exchange for not having a global reference count at all), but this adds significant memory overhead and thus tends to be only useful in special cases (it is used, for example, in the reference counting of Linux kernel modules). Update coalescing by Levanoni and Petrank can be used to eliminate all atomic operations from the write-barrier. Counters are never updated by the program threads in the course of program execution. They are only modified by the collector which executes as a single additional thread with no synchronization. This method can be used as a stop-the-world mechanism for parallel programs, and also with a concurrent reference counting collector.
; Not real-time: Naive implementations of reference counting do not generally provide real-time behavior, because any pointer assignment can potentially cause a number of objects bounded only by total allocated memory size to be recursively freed while the thread is unable to perform other work. It is possible to avoid this issue by delegating the freeing of unreferenced objects to other threads, at the cost of extra overhead.
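The cycle problem and the weak-reference remedy from the list above can be sketched in CPython using the standard `gc` and `weakref` modules. The example assumes CPython semantics; the `Node` class and the variable names are illustrative only.

```python
import gc
import weakref


class Node:
    """Illustrative node type with a strong link and a weak back-pointer."""

    def __init__(self, name):
        self.name = name
        self.child = None     # strong reference
        self._parent = None   # weak reference (back-pointer)

    @property
    def parent(self):
        # Dereference the weak reference; yields None once it has lapsed.
        return self._parent() if self._parent is not None else None

    def adopt(self, child):
        self.child = child
        child._parent = weakref.ref(self)


# 1. A cycle of strong references is not reclaimed by counting alone.
was_enabled = gc.isenabled()
gc.disable()                      # keep the cycle detector from running early
x, y = Node("x"), Node("y")
x.child, y.child = y, x           # x -> y -> x
probe = weakref.ref(x)
del x, y
alive_after_del = probe() is not None
gc.collect()                      # run CPython's cycle detector explicitly
alive_after_collect = probe() is not None
if was_enabled:
    gc.enable()

# 2. With a weak back-pointer there is no cycle of strong references.
parent, child = Node("parent"), Node("child")
parent.adopt(child)
had_parent = child.parent is parent
del parent                        # last strong reference to the parent
parent_after_del = child.parent   # the weak reference has lapsed
```

In the first part the two nodes outlive their last external reference until the cycle detector runs. In the second part the parent is reclaimed as soon as its last strong reference is deleted, and the child's back-pointer then reads as `None` instead of dangling.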
Escape analysis
Escape analysis is a compile-time technique that can convert heap allocations to stack allocations, thereby reducing the amount of garbage collection to be done. This analysis determines whether an object allocated inside a function is accessible outside of it. If a function-local allocation is found to be accessible to another function or thread, the allocation is said to "escape" and cannot be done on the stack. Otherwise, the object may be allocated directly on the stack and released when the function returns, bypassing the heap and associated memory management costs.
Availability
Generally speaking, higher-level programming languages are more likely to have garbage collection as a standard feature. In some languages that do not have built-in garbage collection, it can be added through a library, as with the Boehm garbage collector for C and C++.
Most functional programming languages, such as ML, Haskell, and APL, have garbage collection built in. Lisp is especially notable as both the first functional programming language and the first language to introduce garbage collection.
Other dynamic languages, such as Ruby and Julia (but not Perl 5 or PHP before version 5.3, which both use reference counting), JavaScript and ECMAScript also tend to use GC. Object-oriented programming languages such as Smalltalk, RPL and Java usually provide integrated garbage collection. Notable exceptions are C++ and Delphi, which have destructors.
BASIC
BASIC
BASIC (Beginners' All-purpose Symbolic Instruction Code) is a family of general-purpose, high-level programming languages designed for ease of use. The original version was created by John G. Kemeny and Thomas E. Kurtz at Dartmouth College ...
and
Logo
A logo (abbreviation of logotype; ) is a graphic mark, emblem, or symbol used to aid and promote public identification and recognition. It may be of an abstract or figurative design or include the text of the name it represents as in a wordma ...
have often used garbage collection for variable-length data types, such as strings and lists, so as not to burden programmers with memory management details. On the
Altair 8800
The Altair 8800 is a microcomputer designed in 1974 by MITS and based on the Intel 8080 CPU. Interest grew quickly after it was featured on the cover of the January 1975 issue of Popular Electronics and was sold by mail order through advertiseme ...
, programs with many string variables and little string space could cause long pauses due to garbage collection.
Similarly the
Applesoft BASIC
Applesoft BASIC is a dialect of Microsoft BASIC, developed by Marc McDonald and Ric Weiland, supplied with the Apple II series of computers. It supersedes Integer BASIC and is the BASIC in ROM in all Apple II series computers after the original ...
interpreter's garbage collection algorithm repeatedly scans the string descriptors for the string having the highest address in order to compact it toward high memory, resulting in
performance
and pauses anywhere from a few seconds to a few minutes.
A replacement garbage collector for Applesoft BASIC by
Randy Wigginton
Randy Wigginton was one of Apple Computer's first employees (#6), creator of MacWrite, Full Impact, and numerous other Mac applications. He used to work in development at eBay, Quigo, Inc and Move.com. In November 2010, he left his position as a ...
identifies a group of strings in every pass over the heap, reducing collection time dramatically.
BASIC.System, released with ProDOS in 1983, provides a windowing garbage collector for BASIC that is many times faster.
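The cost difference between moving one string per scan and moving a group of strings per scan can be sketched with a small simulation. This is a simplified model, not the actual Applesoft or ProDOS routines: the `compact_strings` function, the `[address, length]` descriptor layout, and the `batch` parameter are all assumptions made for illustration, and strings are assumed not to overlap.

```python
import heapq


def compact_strings(memory, descriptors, batch=1):
    """Compact live strings toward the top of `memory` (a bytearray).

    `descriptors` is a list of [address, length] pairs, updated in place.
    `batch` is how many strings are picked up per scan of the descriptors.
    Returns the number of descriptor visits, a rough proxy for run time.
    """
    top = len(memory)                      # free boundary, grows downward
    pending = set(range(len(descriptors)))
    visits = 0
    while pending:
        visits += len(descriptors)         # each scan looks at every descriptor
        # Pick the `batch` not-yet-moved strings with the highest addresses.
        chosen = heapq.nlargest(batch, pending, key=lambda i: descriptors[i][0])
        for i in chosen:                   # highest address first
            addr, length = descriptors[i]
            data = bytes(memory[addr:addr + length])
            top -= length
            memory[top:top + length] = data
            descriptors[i][0] = top        # descriptor now points at the new copy
            pending.discard(i)
    return visits
```

With `batch=1` the number of descriptor visits grows with the square of the number of strings, because every string costs one full scan. A larger `batch` divides the number of scans accordingly.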
Objective-C
While Objective-C traditionally had no garbage collection, with the release of OS X 10.5 in 2007 Apple introduced garbage collection for Objective-C 2.0, using an in-house developed runtime collector.
However, with the 2012 release of OS X 10.8, garbage collection was deprecated in favor of LLVM's automatic reference counting (ARC), which was introduced with OS X 10.7.
Furthermore, since May 2015 Apple has forbidden the use of garbage collection in new OS X applications in the App Store.
For iOS, garbage collection has never been introduced due to problems in application responsiveness and performance; instead, iOS uses ARC.
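Under ARC the compiler inserts retain and release operations on the programmer's behalf. The run-time bookkeeping behind those operations can be sketched as follows. This is an illustrative Python model, not Apple's runtime: the `Counted` class and its `retain` and `release` methods are hypothetical names, and the calls that a compiler would insert automatically are written out by hand.

```python
class Counted:
    """An object with an explicit reference count (hypothetical model)."""

    def __init__(self, name, log):
        self.name = name
        self.count = 1        # the creator holds the first reference
        self.children = []    # strong references owned by this object
        self.log = log        # records the order in which objects are freed

    def retain(self):
        self.count += 1
        return self

    def release(self):
        self.count -= 1
        if self.count == 0:
            self.log.append(self.name)   # stands in for deallocation
            for child in self.children:
                child.release()          # give up the references we owned
            self.children.clear()


log = []
parent = Counted("parent", log)
child = Counted("child", log)
parent.children.append(child.retain())   # parent takes a strong reference
child.release()                           # the creating scope lets go
parent.release()                          # last reference to parent goes away
```

An object is freed at the moment its count reaches zero, so there is no separate collection pause. The trade-off is that two objects holding strong references to each other never reach zero in this model.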
Limited environments
Garbage collection is rarely used on embedded or real-time systems because of the usual need for very tight control over the use of limited resources. However, garbage collectors compatible with many limited environments have been developed.
The Microsoft .NET Micro Framework, .NET nanoFramework and Java Platform, Micro Edition are embedded software platforms that, like their larger cousins, include garbage collection.
Java
Garbage collectors available in Java JDKs include:
* G1
* Parallel
* Concurrent mark sweep collector (CMS)
* Serial
* C4 (Continuously Concurrent Compacting Collector)
* Shenandoah
* ZGC
Compile-time use
Compile-time garbage collection is a form of static analysis allowing memory to be reused and reclaimed based on invariants known during compilation.
This form of garbage collection has been studied in the Mercury programming language, and it saw greater usage with the introduction of LLVM's automatic reference counting (ARC) into Apple's ecosystem (iOS and OS X) in 2011.
Real-time systems
Incremental, concurrent, and real-time garbage collectors have been developed, for example by Henry Baker and by Henry Lieberman.
In Baker's algorithm, allocation is done in one half of a single region of memory. When that half becomes full, a garbage collection is performed that moves the live objects into the other half, and the remaining objects are implicitly deallocated. The running program (the 'mutator') has to check that any object it references is in the correct half and, if not, move it across, while a background task finds all of the objects.
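The two-half scheme can be sketched as a stop-the-world copying collector. This is a deliberate simplification and not Baker's incremental algorithm: it omits the mutator check and the background task, and simply copies everything reachable in one go. The `Obj` and `SemispaceHeap` names, and the choice to measure each half in objects rather than bytes, are assumptions made for illustration.

```python
class Obj:
    """A heap object with a fixed number of fields (hypothetical model)."""

    def __init__(self, nfields):
        self.fields = [None] * nfields
        self.forward = None   # set to the copy once the object has been moved


class SemispaceHeap:
    def __init__(self, half_size):
        self.half_size = half_size   # capacity of each half, in objects
        self.space = []              # the half currently used for allocation
        self.roots = []              # references the program starts from

    def alloc(self, nfields):
        if len(self.space) >= self.half_size:
            self.collect()
            if len(self.space) >= self.half_size:
                raise MemoryError("live objects fill the whole half")
        obj = Obj(nfields)
        self.space.append(obj)
        return obj

    def collect(self):
        to_space = []

        def copy(obj):
            # Copy an object at most once; later visits follow the
            # forwarding pointer, which also keeps cycles intact.
            if obj.forward is None:
                clone = Obj(len(obj.fields))
                clone.fields = list(obj.fields)
                obj.forward = clone
                to_space.append(clone)
            return obj.forward

        self.roots = [copy(r) for r in self.roots]
        scan = 0
        while scan < len(to_space):      # objects copied but not yet scanned
            obj = to_space[scan]
            obj.fields = [copy(f) if isinstance(f, Obj) else f
                          for f in obj.fields]
            scan += 1
        self.space = to_space            # the old half is abandoned wholesale
```

Objects that were never copied are not freed one by one; they disappear when the old half is dropped. In this sketch, references held outside `roots` become stale after a collection, so callers should reach objects through `roots` afterwards.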
Generational garbage collection schemes are based on the empirical observation that most objects die young. In generational garbage collection, two or more allocation regions (generations) are kept, separated by object age. New objects are created in the "young" generation, which is regularly collected, and when a generation is full, the objects that are still referenced from older regions are copied into the next oldest generation. Occasionally a full scan is performed.
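A two-generation scheme of this kind can be sketched as follows. The sketch makes several assumptions that go beyond the description above: promotion is modelled as a flag change rather than a copy, every survivor of a minor collection is promoted immediately, and old-to-young references are tracked with a write barrier and a remembered set. The `Node` and `GenerationalHeap` names are hypothetical.

```python
class Node:
    def __init__(self, name):
        self.name = name
        self.refs = []     # outgoing references
        self.old = False   # True once promoted to the old generation


class GenerationalHeap:
    def __init__(self, young_limit):
        self.young_limit = young_limit
        self.young, self.old, self.roots = [], [], []
        self.remembered = set()   # old objects that may point at young ones

    def alloc(self, name):
        if len(self.young) >= self.young_limit:
            self.minor_collect()
        node = Node(name)
        self.young.append(node)
        return node

    def write(self, src, dst):
        """Store a reference; the barrier records old-to-young pointers."""
        src.refs.append(dst)
        if src.old and not dst.old:
            self.remembered.add(src)

    def minor_collect(self):
        # Trace only young objects, starting from the roots and from old
        # objects known to reference the young generation.
        stack = [r for r in self.roots if not r.old]
        for holder in self.remembered:
            stack.extend(r for r in holder.refs if not r.old)
        survivors = set()
        while stack:
            node = stack.pop()
            if node.old or node in survivors:
                continue
            survivors.add(node)
            stack.extend(node.refs)
        for node in self.young:
            if node in survivors:
                node.old = True          # promote
                self.old.append(node)
        self.young = []                  # unreached young objects are dropped
        self.remembered.clear()          # no young objects remain

    def full_collect(self):
        live, stack = set(), list(self.roots)
        while stack:
            node = stack.pop()
            if node not in live:
                live.add(node)
                stack.extend(node.refs)
        self.old = [n for n in self.old if n in live]
        self.young = [n for n in self.young if n in live]
        self.remembered &= live
```

A minor collection never examines the old generation as a whole, which is why it is cheap when most young objects are already dead. Garbage that has been promoted is only reclaimed by `full_collect`.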
Some high-level language computer architectures include hardware support for real-time garbage collection.
Most implementations of real-time garbage collectors use tracing. Such real-time garbage collectors meet hard real-time constraints when used with a real-time operating system.
See also
* Destructor (computer programming)
* Dynamic dead-code elimination
* Smart pointer
* Virtual memory compression
External links
* The Memory Management Reference
* TinyGC (http://tinygc.sourceforge.net/) - an independent implementation of the BoehmGC API
* Conservative Garbage Collection Implementation for C Language
* MeixnerGC - an incremental mark and sweep garbage collector for C++ using smart pointers