In computer science and mathematical logic, satisfiability modulo theories (SMT) is the problem of determining whether a mathematical formula is satisfiable. It generalizes the Boolean satisfiability problem (SAT) to more complex formulas involving real numbers, integers, and/or various data structures such as lists, arrays, bit vectors, and strings. The name is derived from the fact that these expressions are interpreted within ("modulo") a certain formal theory in first-order logic with equality (often disallowing quantifiers). SMT solvers are tools which aim to solve the SMT problem for a practical subset of inputs. SMT solvers such as Z3 and cvc5 have been used as a building block for a wide range of applications across computer science, including in automated theorem proving, program analysis, program verification, and software testing.
Since Boolean satisfiability is already NP-complete, the SMT problem is typically NP-hard, and for many theories it is undecidable. Researchers study which theories or subsets of theories lead to a decidable SMT problem and the computational complexity of decidable cases. The resulting decision procedures are often implemented directly in SMT solvers; see, for instance, the decidability of Presburger arithmetic. SMT can be thought of as a constraint satisfaction problem and thus a certain formalized approach to constraint programming.
Basic terminology
Formally speaking, an SMT instance is a formula in first-order logic, where some function and predicate symbols have additional interpretations, and SMT is the problem of determining whether such a formula is satisfiable. In other words, imagine an instance of the Boolean satisfiability problem (SAT) in which some of the binary variables are replaced by predicates over a suitable set of non-binary variables. A predicate is a binary-valued function of non-binary variables. Example predicates include linear inequalities (e.g., $3x + 2y - z \geq 4$) or equalities involving uninterpreted terms and function symbols (e.g., $f(f(u, v), v) = f(u, v)$ where $f$ is some unspecified function of two arguments). These predicates are classified according to the theory to which they belong. For instance, linear inequalities over real variables are evaluated using the rules of the theory of linear real arithmetic, whereas predicates involving uninterpreted terms and function symbols are evaluated using the rules of the theory of uninterpreted functions with equality (sometimes referred to as the empty theory). Other theories include the theories of arrays and list structures (useful for modeling and verifying computer programs), and the theory of bit vectors (useful in modeling and verifying hardware designs). Subtheories are also possible: for example, difference logic is a sub-theory of linear arithmetic in which each inequality is restricted to have the form $x - y > c$ for variables $x$ and $y$ and constant $c$.
Most SMT solvers support only quantifier-free fragments of their logics.
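As an illustration of how a restricted sub-theory can admit a simple decision procedure, the following minimal sketch decides a conjunction of integer difference constraints. It assumes the constraints are written in the non-strict form x - y <= c (over the integers, a strict constraint x - y > c can be rewritten as y - x <= -c - 1) and uses the standard reduction to negative-cycle detection with Bellman-Ford relaxation. The function name `difference_logic_model` and the tuple encoding are hypothetical choices made for this example, not part of any solver's interface.

```python
def difference_logic_model(constraints):
    """Decide a conjunction of constraints of the form x - y <= c.

    Each constraint is a tuple (x, y, c). Returns a satisfying
    assignment as a dict, or None if the conjunction is unsatisfiable.
    """
    variables = {v for x, y, _ in constraints for v in (x, y)}
    # Starting every variable at 0 plays the role of a virtual source
    # connected to all nodes by zero-weight edges.
    dist = {v: 0 for v in variables}
    for _ in range(len(variables) + 1):
        changed = False
        for x, y, c in constraints:
            # x - y <= c is an edge y -> x of weight c in the constraint graph.
            if dist[y] + c < dist[x]:
                dist[x] = dist[y] + c
                changed = True
        if not changed:
            return dist  # stable distances satisfy every constraint
    return None  # still relaxing: the constraint graph has a negative cycle


# x - y <= 3 and y - x <= -1 (i.e. x - y >= 1): satisfiable.
print(difference_logic_model([("x", "y", 3), ("y", "x", -1)]))
# x - y <= 3, y - z <= -2, z - x <= -2 sum to 0 <= -1: unsatisfiable.
print(difference_logic_model([("x", "y", 3), ("y", "z", -2), ("z", "x", -2)]))
```

The returned assignment is only one of many possible models; shifting every value by the same constant yields another.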
Expressive power
An SMT instance is a generalization of a Boolean SAT instance in which various sets of variables are replaced by predicates from a variety of underlying theories. SMT formulas provide a much richer modeling language than is possible with Boolean SAT formulas. For example, an SMT formula allows us to model the datapath operations of a microprocessor at the word rather than the bit level.
By comparison, answer set programming is also based on predicates (more precisely, on atomic sentences created from atomic formulas). Unlike SMT, answer-set programs do not have quantifiers, and cannot easily express constraints such as linear arithmetic or difference logic; ASP is at best suitable for Boolean problems that reduce to the free theory of uninterpreted functions. Implementing 32-bit integers as bitvectors in ASP suffers from most of the same problems that early SMT solvers faced: "obvious" identities such as $x + y = y + x$ are difficult to deduce.
Constraint logic programming does provide support for linear arithmetic constraints, but within a completely different theoretical framework. SMT solvers have also been extended to solve formulas in higher-order logic.
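The contrast between word-level and bit-level modelling can be made concrete with a small sketch. Here an addition is expressed purely through Boolean operations on individual bits (a ripple-carry adder), which is roughly what a bit-level encoding has to reason about. Confirming commutativity then requires examining every pair of bit patterns, whereas at the word level it is a single arithmetic identity. The helper names and the 4-bit width are assumptions chosen for illustration, and exhaustive enumeration stands in for a real SAT solver.

```python
from itertools import product


def to_bits(value, width):
    """Little-endian list of bits for a non-negative integer."""
    return [(value >> i) & 1 for i in range(width)]


def from_bits(bits):
    """Inverse of to_bits."""
    return sum(bit << i for i, bit in enumerate(bits))


def ripple_carry_add(a_bits, b_bits):
    """Addition modulo 2**width using only Boolean operations on bits."""
    carry = 0
    out = []
    for a, b in zip(a_bits, b_bits):
        out.append(a ^ b ^ carry)
        carry = (a & b) | (carry & (a ^ b))
    return out  # the final carry is dropped, as in fixed-width hardware


def commutes_at_bit_level(width):
    """Check x + y = y + x by enumerating all pairs of bit patterns."""
    patterns = list(product((0, 1), repeat=width))
    return all(
        ripple_carry_add(a, b) == ripple_carry_add(b, a)
        for a in patterns
        for b in patterns
    )


WIDTH = 4
print(from_bits(ripple_carry_add(to_bits(9, WIDTH), to_bits(5, WIDTH))))  # 14
print(commutes_at_bit_level(WIDTH))  # True, after checking 256 pairs
```

For 32-bit operands the same enumeration would cover 2**64 pairs, which is why solvers benefit from knowing the word-level meaning of addition.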
Solver approaches
Early attempts at solving SMT instances involved translating them to Boolean SAT instances (e.g., a 32-bit integer variable would be encoded by 32 single-bit variables with appropriate weights, and word-level operations such as 'plus' would be replaced by lower-level logic operations on the bits) and passing this formula to a Boolean SAT solver. This approach, which is referred to as ''the eager approach'', has its merits: by pre-processing the SMT formula into an equivalent Boolean SAT formula, existing Boolean SAT solvers can be used "as-is" and their performance and capacity improvements leveraged over time. On the other hand, the loss of the high-level semantics of the underlying theories means that the Boolean SAT solver has to work a lot harder than necessary to discover "obvious" facts (such as $x + y = y + x$ for integer addition). This observation led to the development of a number of SMT solvers that tightly integrate the Boolean reasoning of a DPLL-style search with theory-specific solvers (''T-solvers'') that handle conjunctions (ANDs) of predicates from a given theory. This approach is referred to as ''the lazy approach''.
Dubbed DPLL(T), this architecture gives the responsibility of Boolean reasoning to the DPLL-based SAT solver which, in turn, interacts with a solver for theory T through a well-defined interface. The theory solver only needs to worry about checking the feasibility of conjunctions of theory predicates passed on to it from the SAT solver as it explores the Boolean search space of the formula. For this integration to work well, however, the theory solver must be able to participate in propagation and conflict analysis, i.e., it must be able to infer new facts from already established facts, as well as to supply succinct explanations of infeasibility when theory conflicts arise. In other words, the theory solver must be incremental and backtrackable.
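The division of labour in the lazy approach can be sketched in a few lines, under strong simplifying assumptions. In this toy version, plain enumeration of truth assignments stands in for the DPLL search, the theory is integer difference logic (atoms of the form x - y <= c), and the theory solver only answers "consistent" or "inconsistent" for a conjunction of literals. It performs no propagation and produces no conflict explanations, so it omits exactly the features described above as necessary for good performance. All names are hypothetical.

```python
from itertools import product


def theory_consistent(constraints):
    """T-solver: is a conjunction of constraints x - y <= c satisfiable?"""
    nodes = {v for x, y, _ in constraints for v in (x, y)}
    dist = {v: 0 for v in nodes}
    for _ in range(len(nodes) + 1):
        changed = False
        for x, y, c in constraints:
            if dist[y] + c < dist[x]:
                dist[x] = dist[y] + c
                changed = True
        if not changed:
            return True
    return False  # a negative cycle makes the conjunction infeasible


def negate(atom):
    """Over the integers, not (x - y <= c) is y - x <= -c - 1."""
    x, y, c = atom
    return (y, x, -c - 1)


def lazy_smt(atoms, clauses):
    """atoms maps a Boolean name to a constraint (x, y, c).
    clauses is a CNF over those names; a literal is (name, polarity).
    Returns a theory-consistent Boolean assignment, or None."""
    names = sorted(atoms)
    for values in product((True, False), repeat=len(names)):
        assignment = dict(zip(names, values))
        # Boolean level: does the assignment satisfy every clause?
        if not all(any(assignment[n] == pol for n, pol in cl) for cl in clauses):
            continue
        # Theory level: is the implied conjunction of literals feasible?
        literals = [atoms[n] if assignment[n] else negate(atoms[n]) for n in names]
        if theory_consistent(literals):
            return assignment
    return None


atoms = {
    "p": ("x", "y", -1),  # x < y
    "q": ("y", "z", -1),  # y < z
    "r": ("z", "x", -1),  # z < x
}
# p and (q or r): satisfiable, but not with p, q and r all true.
print(lazy_smt(atoms, [[("p", True)], [("q", True), ("r", True)]]))
# p and q and r: propositionally fine, but a cyclic ordering in the theory.
print(lazy_smt(atoms, [[("p", True)], [("q", True)], [("r", True)]]))
```

A DPLL(T) implementation would instead turn each theory conflict into a learned clause so that the Boolean search never revisits the same infeasible combination.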
SMT for undecidable theories
Most of the common SMT approaches support decidable theories. However, many real-world systems, such as an aircraft and its behavior, can only be modelled by means of non-linear arithmetic over the real numbers involving transcendental functions. This fact motivates an extension of the SMT problem to non-linear theories, such as determining whether the following equation is satisfiable:
: $(\sin(x)^3 = \cos(\log(y) \cdot x) \lor b \lor -x^2 \geq 2.3y) \land \left(\neg b \lor y < -34.4 \lor \exp(x) > \frac{y}{x}\right)$
where
: $b \in \mathbb{B},\ x, y \in \mathbb{R}.$
Such problems are, however, undecidable in general. (On the other hand, the theory of real closed fields, and thus the full first-order theory of the real numbers, are decidable using quantifier elimination. This is due to Alfred Tarski.) The first-order theory of the natural numbers with addition (but not multiplication), called Presburger arithmetic, is also decidable. Since multiplication by constants can be implemented as nested additions, the arithmetic in many computer programs can be expressed using Presburger arithmetic, resulting in decidable formulas.
Examples of SMT solvers addressing Boolean combinations of theory atoms from undecidable arithmetic theories over the reals are ABsolver, which employs a classical DPLL(T) architecture with a non-linear optimization package as a (necessarily incomplete) subordinate theory solver, and iSAT, which builds on a unification of DPLL SAT-solving and interval constraint propagation called the iSAT algorithm.
Solvers
The table below summarizes some of the features of the many available SMT solvers. The column "SMT-LIB" indicates compatibility with the SMT-LIB language; many systems marked 'yes' may support only older versions of SMT-LIB, or offer only partial support for the language. The column "CVC" indicates support for the CVC language. The column "DIMACS" indicates support for the DIMACS format.
Projects differ not only in features and performance, but also in the viability of the surrounding community, its ongoing interest in a project, and its ability to contribute documentation, fixes, tests and enhancements.
Standardization and the SMT-COMP solver competition
There are multiple attempts to describe a standardized interface to SMT solvers (and automated theorem provers, a term often used synonymously). The most prominent is the SMT-LIB standard, which provides a language based on S-expressions. Other standardized formats commonly supported are the DIMACS format supported by many Boolean SAT solvers, and the CVC format used by the CVC automated theorem prover.
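Because SMT-LIB is based on S-expressions, a script can be read into a nested list structure with very little code. The sketch below is a deliberately simplified reader: it assumes the input contains no string literals, quoted symbols, or comments, all of which the actual standard permits. The sample script uses the SMT-LIB 2 commands `set-logic`, `declare-const`, `assert`, and `check-sat`; the function names `tokenize` and `parse` are illustrative.

```python
def tokenize(text):
    """Split a script into parentheses and atoms (simplified lexer)."""
    return text.replace("(", " ( ").replace(")", " ) ").split()


def parse(tokens):
    """Turn a token list into nested Python lists, one per top-level form."""
    stack = [[]]
    for tok in tokens:
        if tok == "(":
            stack.append([])
        elif tok == ")":
            if len(stack) == 1:
                raise SyntaxError("unbalanced ')'")
            finished = stack.pop()
            stack[-1].append(finished)
        else:
            stack[-1].append(tok)
    if len(stack) != 1:
        raise SyntaxError("missing ')'")
    return stack[0]


SCRIPT = """
(set-logic QF_LIA)
(declare-const x Int)
(declare-const y Int)
(assert (> (+ x y) 3))
(assert (< x 2))
(check-sat)
"""

commands = parse(tokenize(SCRIPT))
for command in commands:
    print(command)
```

Each top-level list begins with the command name, so a front end can dispatch on `command[0]`.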
The SMT-LIB format also comes with a number of standardized benchmarks and has enabled a yearly competition between SMT solvers called SMT-COMP. Initially, the competition took place during the Computer Aided Verification conference (CAV), but as of 2020 the competition is hosted as part of the SMT Workshop, which is affiliated with the International Joint Conference on Automated Reasoning (IJCAR).
Applications
SMT solvers are useful for verification (proving the correctness of programs), for software testing based on symbolic execution, and for synthesis (generating program fragments by searching over the space of possible programs). Outside of software verification, SMT solvers have also been used for type inference and for modelling theoretic scenarios, including modelling actor beliefs in nuclear arms control.
Verification
Computer-aided verification of computer programs often uses SMT solvers. A common technique is to translate preconditions, postconditions, loop conditions, and assertions into SMT formulas in order to determine if all properties can hold.
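The idea of turning a precondition, a program statement, and a postcondition into a single query can be sketched as follows. A verifier asks whether any input satisfies the precondition while the result violates the postcondition; if no such input exists, the property holds. In this sketch a bounded brute-force search over a small integer range stands in for the SMT solver, so a `None` result only shows the absence of counterexamples within that range. The function name and the example triple are assumptions made for illustration.

```python
def find_counterexample(pre, body, post, domain):
    """Search for an input satisfying `pre` whose result violates `post`.

    pre(x) -> bool, body(x) -> y, post(x, y) -> bool.
    Returns a violating input, or None if none exists in `domain`.
    """
    for x in domain:
        if pre(x) and not post(x, body(x)):
            return x
    return None


DOMAIN = range(-50, 51)

# {x >= 0}  y := x + 1  {y > 0}: no counterexample in the domain.
holds = find_counterexample(
    lambda x: x >= 0, lambda x: x + 1, lambda x, y: y > 0, DOMAIN
)

# {true}  y := x + 1  {y > 0}: fails for any x <= -1.
fails = find_counterexample(
    lambda x: True, lambda x: x + 1, lambda x, y: y > 0, DOMAIN
)

print(holds, fails)
```

An SMT solver answers the same question symbolically, without enumerating inputs, and therefore for unbounded domains as well.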
There are many verifiers built on top of the Z3 SMT solver. Boogie is an intermediate verification language that uses Z3 to automatically check simple imperative programs. The VCC verifier for concurrent C uses Boogie, as well as Dafny for imperative object-based programs, Chalice for concurrent programs, and Spec# for C#. F* is a dependently typed language that uses Z3 to find proofs; the compiler carries these proofs through to produce proof-carrying bytecode. The Viper verification infrastructure encodes verification conditions to Z3. The sbv library provides SMT-based verification of Haskell programs, and lets the user choose among a number of solvers such as Z3, ABC, Boolector, cvc5, MathSAT and Yices.
There are also many verifiers built on top of the Alt-Ergo SMT solver. Here is a list of mature applications:
* Why3, a platform for deductive program verification, uses Alt-Ergo as its main prover;
* CAVEAT, a C-verifier developed by CEA and used by Airbus; Alt-Ergo was included in the qualification DO-178C of one of its recent aircraft;
* Frama-C, a framework to analyse C-code, uses Alt-Ergo in the Jessie and WP plugins (dedicated to "deductive program verification");
* SPARK uses CVC4 and Alt-Ergo (behind GNATprove) to automate the verification of some assertions in SPARK 2014;
* Atelier-B can use Alt-Ergo instead of its main prover (increasing success from 84% to 98% on the ANR Bware project benchmarks);
* Rodin, a B-method framework developed by Systerel, can use Alt-Ergo as a back-end;
* Cubicle, an open source model checker for verifying safety properties of array-based transition systems;
* EasyCrypt, a toolset for reasoning about relational properties of probabilistic computations with adversarial code.
Many SMT solvers implement a common interface format called SMTLIB2 (such files usually have the extension ".smt2"). The LiquidHaskell tool implements a refinement type based verifier for Haskell that can use any SMTLIB2 compliant solver, e.g. cvc5, MathSAT, or Z3.
Symbolic-execution based analysis and testing
An important application of SMT solvers is symbolic execution for analysis and testing of programs (e.g., concolic testing), aimed particularly at finding security vulnerabilities. Example tools in this category include SAGE from Microsoft Research, KLEE, S2E, and Triton. SMT solvers that have been used for symbolic-execution applications include Z3, STP, the Z3str family of solvers, and Boolector.
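The role a solver plays in symbolic execution can be illustrated with a small sketch covering only the constraint-solving half of the technique. The path conditions of a toy program are written out by hand here, whereas a symbolic execution engine would derive them by interpreting the program. A brute-force search over a small integer range then stands in for the SMT solver to find one concrete input per path. The program, the labels, and the search range are all assumptions made for this example.

```python
from itertools import product


def program(x, y):
    """Toy program under test with three execution paths."""
    if x > y:
        if x + y == 10:
            return "bug"
        return "a"
    return "b"


# One list of branch conditions per path, written by hand for this sketch.
PATH_CONDITIONS = {
    "bug": [lambda x, y: x > y, lambda x, y: x + y == 10],
    "a": [lambda x, y: x > y, lambda x, y: x + y != 10],
    "b": [lambda x, y: not (x > y)],
}


def solve(path_condition, domain=range(-10, 11)):
    """Stand-in for a solver: find inputs satisfying every branch condition."""
    for x, y in product(domain, repeat=2):
        if all(condition(x, y) for condition in path_condition):
            return x, y
    return None


test_inputs = {label: solve(pc) for label, pc in PATH_CONDITIONS.items()}
for label, inputs in test_inputs.items():
    # Each generated input should drive the program down its intended path.
    print(label, inputs, program(*inputs) == label)
```

The input found for the "bug" path is the kind of result such tools report: a concrete test case that reaches a rarely executed branch.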
See also
* Answer set programming
* Automated theorem proving
* SAT solver
* First-order logic
* Theory of pure equality
References
* SMT-LIB: The Satisfiability Modulo Theories Library
* SMT-COMP: The Satisfiability Modulo Theories Competition
* Decision procedures - an algorithmic point of view
----
''This article is adapted from a column in the ACM SIGDA e-newsletter by Prof. Karem Sakallah.''