In
computer science
Computer science is the study of computation, information, and automation. Computer science spans Theoretical computer science, theoretical disciplines (such as algorithms, theory of computation, and information theory) to Applied science, ...
, abstract interpretation is a theory of
sound approximation of the
semantics of computer programs, based on
monotonic function
In mathematics, a monotonic function (or monotone function) is a function between ordered sets that preserves or reverses the given order. This concept first arose in calculus, and was later generalized to the more abstract setting of or ...
s over
ordered set
In mathematics, especially order theory, a partial order on a set is an arrangement such that, for certain pairs of elements, one precedes the other. The word ''partial'' is used to indicate that not every pair of elements needs to be comparable; ...
s, especially
lattices. It can be viewed as a partial
execution
Capital punishment, also known as the death penalty and formerly called judicial homicide, is the state-sanctioned killing of a person as punishment for actual or supposed misconduct. The sentence ordering that an offender be punished in ...
of a
computer program
A computer program is a sequence or set of instructions in a programming language for a computer to Execution (computing), execute. It is one component of software, which also includes software documentation, documentation and other intangibl ...
which gains information about its semantics (e.g.,
control-flow,
data-flow
In computing, dataflow is a broad concept, which has various meanings depending on the application and context. In the context of software architecture, data flow relates to stream processing or reactive programming.
Software architecture
Dataf ...
) without performing all the
calculation
A calculation is a deliberate mathematical process that transforms a plurality of inputs into a singular or plurality of outputs, known also as a result or results. The term is used in a variety of senses, from the very definite arithmetical ...
s.
Its main concrete application is formal
static analysis
Static analysis, static projection, or static scoring is a simplified analysis wherein the effect of an immediate change to a system is calculated without regard to the longer-term response of the system to that change. If the short-term effect i ...
, the automatic
extraction of information about the possible executions of computer programs; such analyses have two main usages:
* inside
compiler
In computing, a compiler is a computer program that Translator (computing), translates computer code written in one programming language (the ''source'' language) into another language (the ''target'' language). The name "compiler" is primaril ...
s, to analyse programs to decide whether certain
optimization
Mathematical optimization (alternatively spelled ''optimisation'') or mathematical programming is the selection of a best element, with regard to some criteria, from some set of available alternatives. It is generally divided into two subfiel ...
s or
transformations are applicable;
* for
debugging
In engineering, debugging is the process of finding the Root cause analysis, root cause, workarounds, and possible fixes for bug (engineering), bugs.
For software, debugging tactics can involve interactive debugging, control flow analysis, Logf ...
or even the certification of programs against classes of bugs.
Abstract interpretation was formalized by the French computer scientist working couple
Patrick Cousot and
Radhia Cousot
Radhia Cousot (6 August 1947 – 1 May 2014) was a French computer scientist known for inventing abstract interpretation.
Studies
Radhia Cousot was born on 6 August 1947, in Sakiet Sidi Youssef in Tunisia, where she survived the massacre of ...
in the late 1970s.
Intuition
This section illustrates abstract interpretation by means of real-world, non-computing examples.
Consider the people in a conference room. Assume a unique identifier for each person in the room, like a
social security number
In the United States, a Social Security number (SSN) is a nine-digit number issued to United States nationality law, U.S. citizens, Permanent residence (United States), permanent residents, and temporary (working) residents under section 205(c)(2 ...
in the United States. To prove that someone is not present, all one needs to do is see if their social security number is not on the list. Since two different people cannot have the same number, it is possible to prove or disprove the presence of a participant simply by looking up their number.
However it is possible that only the names of attendees were registered. If the name of a person is not found in the list, we may safely conclude that that person was not present; but if it is, we cannot conclude definitely without further inquiries, due to the possibility of
homonym
In linguistics, homonyms are words which are either; '' homographs''—words that mean different things, but have the same spelling (regardless of pronunciation), or '' homophones''—words that mean different things, but have the same pronunciat ...
s (for example, two people named John Smith). Note that this imprecise information will still be adequate for most purposes, because homonyms are rare in practice. However, in all rigor, we cannot say for sure that somebody was present in the room; all we can say is that they were ''possibly'' here. If the person we are looking up is a criminal, we will issue an ''alarm''; but there is of course the possibility of issuing a ''false alarm''. Similar phenomena will occur in the analysis of programs.
If we are only interested in some specific information, say, "was there a person of age
in the room?", keeping a list of all names and dates of births is unnecessary. We may safely and without loss of precision restrict ourselves to keeping a list of the participants' ages. If this is already too much to handle, we might keep only the age of the youngest,
and oldest person,
. If the question is about an age strictly lower than
or strictly higher than
, then we may safely respond that no such participant was present. Otherwise, we may only be able to say that we do not know.
In the case of computing, concrete, precise information is in general not computable within finite time and memory (see
Rice's theorem and the
halting problem
In computability theory (computer science), computability theory, the halting problem is the problem of determining, from a description of an arbitrary computer program and an input, whether the program will finish running, or continue to run for ...
).
Abstraction
Abstraction is a process where general rules and concepts are derived from the use and classifying of specific examples, literal (reality, real or Abstract and concrete, concrete) signifiers, first principles, or other methods.
"An abstraction" ...
is used to allow for generalized answers to questions (for example, answering "maybe" to a yes/no question, meaning "yes or no", when we (an algorithm of abstract interpretation) cannot compute the precise answer with certainty); this simplifies the problems, making them amenable to automatic solutions. One crucial requirement is to add enough vagueness so as to make problems manageable while still retaining enough precision for answering the important questions (such as "might the program crash?").
Abstract interpretation of computer programs
Given a programming or specification language, abstract interpretation consists of giving several semantics linked by relations of abstraction. A semantics is a mathematical characterization of a possible behavior of the program. The most precise semantics, describing very closely the actual execution of the program, are called the ''concrete semantics''. For instance, the concrete semantics of an
imperative programming
In computer science, imperative programming is a programming paradigm of software that uses Statement (computer science), statements that change a program's state (computer science), state. In much the same way that the imperative mood in natural ...
language may associate to each program the set of execution traces it may produce – an execution trace being a sequence of possible consecutive states of the execution of the program; a state typically consists of the value of the program counter and the memory locations (globals, stack and heap). More abstract semantics are then derived; for instance, one may consider only the set of reachable states in the executions (which amounts to considering the last states in finite traces).
The goal of static analysis is to derive a computable semantic interpretation at some point. For instance, one may choose to represent the state of a program manipulating integer variables by forgetting the actual values of the variables and only keeping their signs (+, − or 0). For some elementary operations, such as
multiplication
Multiplication is one of the four elementary mathematical operations of arithmetic, with the other ones being addition, subtraction, and division (mathematics), division. The result of a multiplication operation is called a ''Product (mathem ...
, such an abstraction does not lose any precision: to get the sign of a product, it is sufficient to know the sign of the operands. For some other operations, the abstraction may lose precision: for instance, it is impossible to know the sign of a sum whose operands are respectively positive and negative.
Sometimes a loss of precision is necessary to make the semantics decidable (see
Rice's theorem and the
halting problem
In computability theory (computer science), computability theory, the halting problem is the problem of determining, from a description of an arbitrary computer program and an input, whether the program will finish running, or continue to run for ...
). In general, there is a compromise to be made between the precision of the analysis and its decidability (
computability
Computability is the ability to solve a problem by an effective procedure. It is a key topic of the field of computability theory within mathematical logic and the theory of computation within computer science. The computability of a problem is c ...
), or tractability (
computational cost
A computation is any type of arithmetic or non-arithmetic calculation that is well-defined. Common examples of computation are mathematical equation solving and the execution of computer algorithms.
Mechanical or electronic devices (or, historic ...
).
In practice the abstractions that are defined are tailored to both the program properties one desires to analyze, and to the set of target programs. The first large scale automated analysis of computer programs with abstract interpretation was motivated by the accident that resulted in the destruction of the
first flight of the Ariane 5 rocket in 1996.
Formalization

Let
be an
ordered set
In mathematics, especially order theory, a partial order on a set is an arrangement such that, for certain pairs of elements, one precedes the other. The word ''partial'' is used to indicate that not every pair of elements needs to be comparable; ...
, called ''concrete set'', and let
be another ordered set, called ''abstract set''. These two sets are related to each other by defining
total function
In mathematics, a partial function from a set to a set is a function from a subset of (possibly the whole itself) to . The subset , that is, the '' domain'' of viewed as a function, is called the domain of definition or natural domain o ...
s that map elements from one to the other.
A function
is called an ''abstraction function'' if it maps an element
in the concrete set
to an element
in the abstract set
. That is, element
in
is the ''abstraction'' of
in
.
A function
is called a ''concretization function'' if it maps an element
in the abstract set
to an element
in the concrete set
. That is, element
in
is a ''concretization'' of
in
.
Let
,
,
, and
be ordered sets. The concrete semantics
is a monotonic function from
to
. A function
from
to
is said to be a ''valid abstraction'' of
if, for all
in
, we have
.
Program semantics are generally described using
fixed points in the presence of loops or recursive procedures. Suppose that
is a
complete lattice
In mathematics, a complete lattice is a partially ordered set in which all subsets have both a supremum ( join) and an infimum ( meet). A conditionally complete lattice satisfies at least one of these properties for bounded subsets. For compariso ...
and let
be a
monotonic function
In mathematics, a monotonic function (or monotone function) is a function between ordered sets that preserves or reverses the given order. This concept first arose in calculus, and was later generalized to the more abstract setting of or ...
from
into
. Then, any
such that
is an abstraction of the least fixed-point of
, which exists, according to the
Knaster–Tarski theorem.
The difficulty is now to obtain such an
. If
is of finite height, or at least verifies the
ascending chain condition
In mathematics, the ascending chain condition (ACC) and descending chain condition (DCC) are finiteness properties satisfied by some algebraic structures, most importantly Ideal (ring theory), ideals in certain commutative rings. These conditions p ...
(all ascending sequences are ultimately stationary), then such an
may be obtained as the stationary limit of the
ascending sequence defined by induction as follows:
(the least element of
) and
.
In other cases, it is still possible to obtain such an
through a (pair-)
widening operator, defined as a binary operator
which satisfies the following conditions:
# For all
and
, we have
and
, and
# For any ascending sequence
, the sequence defined by
and
is ultimately stationary. We can then take
.
In some cases, it is possible to define abstractions using
Galois connection
In mathematics, especially in order theory, a Galois connection is a particular correspondence (typically) between two partially ordered sets (posets). Galois connections find applications in various mathematical theories. They generalize the fun ...
s
where
is from
to
and
is from
to
. This supposes the existence of best abstractions, which is not necessarily the case. For instance, if we abstract sets of couples
of
real number
In mathematics, a real number is a number that can be used to measure a continuous one- dimensional quantity such as a duration or temperature. Here, ''continuous'' means that pairs of values can have arbitrarily small differences. Every re ...
s by enclosing convex
polyhedra
In geometry, a polyhedron (: polyhedra or polyhedrons; ) is a three-dimensional figure with flat polygonal faces, straight edges and sharp corners or vertices. The term "polyhedron" may refer either to a solid figure or to its boundary su ...
, there is no optimal abstraction to the disc defined by
.
Examples of abstract domains
Numerical abstract domains
One can assign to each variable
available at a given program point an interval