Logic programming
Logic programming is a programming paradigm which is largely based on formal logic
Logic is the study of correct reasoning. It includes both formal and informal logic. Formal logic is the science of deductively valid inferences or of log ...
is a
programming paradigm
Programming paradigms are a way to classify programming languages based on their features. Languages can be classified into multiple paradigms.
Some paradigms are concerned mainly with implications for the execution model of the language, s ...
that includes languages based on formal logic, including
Datalog
Datalog is a declarative logic programming language. While it is syntactically a subset of Prolog, Datalog generally uses a bottom-up rather than top-down evaluation model. This difference yields significantly different behavior and properties ...
and
Prolog
Prolog is a logic programming language associated with artificial intelligence and computational linguistics.
Prolog has its roots in first-order logic, a formal logic, and unlike many other programming languages, Prolog is intended primarily a ...
. This article describes the syntax and semantics of the purely
declarative subset of these languages. Confusingly, the name "logic programming" also refers to a programming language that roughly corresponds to the declarative subset of Prolog. Unfortunately, the term must be used in both senses in this article.
Declarative logic programs consist entirely of ''rules'' of the form
H :- B1, ..., BN.
Each such rule can be read as an
implication:
:
meaning "If each
is true, then
is true". Logic programs compute the set of facts that are implied by their rules.
Many implementations of Datalog, Prolog, and related languages add procedural features such as Prolog's
cut operator or extra-logical features such as a
foreign function interface
A foreign function interface (FFI) is a mechanism by which a program written in one programming language can call routines or make use of services written in another.
Naming
The term comes from the specification for Common Lisp, which explicit ...
. The formal semantics of such extensions are beyond the scope of this article.
Datalog
''Datalog'' is the simplest widely-studied logic programming language. There are three major definitions of the semantics of Datalog, and they are all equivalent. The syntax and semantics of other logic programming languages are extensions and generalizations of those of Datalog.
Syntax
A Datalog program consists of a list of ''rules'' (
Horn clause In mathematical logic and logic programming, a Horn clause is a logical formula of a particular rule-like form which gives it useful properties for use in logic programming, formal specification, and model theory. Horn clauses are named for the logi ...
s). If ''constant'' and ''variable'' are two
countable
In mathematics, a set is countable if either it is finite or it can be made in one to one correspondence with the set of natural numbers. Equivalently, a set is ''countable'' if there exists an injective function from it into the natural number ...
sets of constants and variables respectively and ''relation'' is a countable set of
predicate symbols, then the following
BNF grammar expresses the structure of a Datalog program:
::= , ""
::= ":-" "."
::= "(" ")"
::= , "," , ""
::= ,
::= , "," , ""
Atoms are also referred to as . The atom to the left of the
:-
symbol is called the of the rule; the atoms to the right are the . Every Datalog program must satisfy the condition that every variable that appears in the head of a rule also appears in the body (this condition is sometimes called the ).
Rules with empty bodies are called . For example, the following rule is a fact:
r(x) :- .
Syntactic sugar
Many implementations of logic programming extend the above grammar to allow writing facts without the
:-
, like so:
r(x).
Many also allow writing 0-ary relations without parentheses, like so:
p :- q.
These are merely abbreviations (
syntactic sugar
In computer science, syntactic sugar is syntax within a programming language that is designed to make things easier to read or to express. It makes the language "sweeter" for human use: things can be expressed more clearly, more concisely, or in a ...
); they have no impact on the semantics of the program.
Example
The following program computes the relation
path
, which is the
transitive closure
In mathematics, the transitive closure of a binary relation on a set is the smallest relation on that contains and is transitive. For finite sets, "smallest" can be taken in its usual sense, of having the fewest related pairs; for infinit ...
of the relation
edge
.
edge(x, y).
edge(y, z).
path(A, B) :-
edge(A, B).
path(A, C) :-
path(A, B),
edge(B, C).
Semantics
There are three widely-used approaches to the semantics of Datalog programs:
model-theoretic
In mathematical logic, model theory is the study of the relationship between formal theories (a collection of sentences in a formal language expressing statements about a mathematical structure), and their models (those structures in which th ...
,
fixed-point Fixed point may refer to:
* Fixed point (mathematics), a value that does not change under a given transformation
* Fixed-point arithmetic, a manner of doing arithmetic on computers
* Fixed point, a benchmark (surveying)
The term benchmark, be ...
, and
proof-theoretic
Proof theory is a major branchAccording to Wang (1981), pp. 3–4, proof theory is one of four domains mathematical logic, together with model theory, axiomatic set theory, and recursion theory. Barwise (1978) consists of four corresponding parts, ...
. These three approaches can be proven to be equivalent.
An atom is called if none of its subterms are variables. Intuitively, each of the semantics define the meaning of a program to be the set of all ground atoms that can be deduced from the rules of the program, starting from the facts.
Model theoretic

A rule is called ground if all of its atoms (head and body) are ground. A ground rule ''R''
1 is a ''ground instance'' of another rule ''R''
2 if ''R''
1 is the result of a
substitution
Substitution may refer to:
Arts and media
*Chord substitution, in music, swapping one chord for a related one within a chord progression
*Substitution (poetry), a variation in poetic scansion
* "Substitution" (song), a 2009 song by Silversun Pic ...
of constants for all the variables in ''R''
2.
The ''
Herbrand base
In first-order logic, a Herbrand structure ''S'' is a structure over a vocabulary σ that is defined solely by the syntactical properties of σ. The idea is to take the symbols of terms as their values, e.g. the denotation of a constant symbol ' ...
'' of a Datalog program is the set of all ground atoms that can be made with the constants appearing in the program. An ''interpretation'' (also known as a ''database instance'') is a subset of the Herbrand base. A ground atom is true in an interpretation if it is an element of . A rule is ''true in an interpretation '' if for each ground instance of that rule, if all the clauses in the body are true in , then the head of the rule is also true in .
A ''model'' of a Datalog program ''P'' is an interpretation of ''P'' which contains all the ground facts of ''P'', and makes all of the rules of ''P'' true in .
Model-theoretic
In mathematical logic, model theory is the study of the relationship between formal theories (a collection of sentences in a formal language expressing statements about a mathematical structure), and their models (those structures in which th ...
semantics state that the meaning of a Datalog program is its minimal model (equivalently, the intersection of all its models).
For example, this program:
edge(x, y).
edge(y, z).
path(A, B) :-
edge(A, B).
path(A, C) :-
path(A, B),
edge(B, C).
has this Herbrand universe:
x
,
y
,
z
and this Herbrand base:
edge(x, x)
,
edge(x, y)
, ...,
edge(z, z)
,
path(x, x)
, ...,
path(z, z)
and this minimal Herbrand model:
edge(x, y)
,
edge(y, z)
,
path(x, y)
,
path(y, z)
,
path(x, z)
Fixed-point
Let be the set of interpretations of a Datalog program ''P'', that is, , where ''H'' is the Herbrand base of ''P'' and P is the
powerset
In mathematics, the power set (or powerset) of a set is the set of all subsets of , including the empty set and itself. In axiomatic set theory (as developed, for example, in the ZFC axioms), the existence of the power set of any set is p ...
operator. The for ''P'' is the following map from to : For each ground instance of each rule in ''P'', if every clause in the body is in the input interpretation, then add the head of the ground instance to the output interpretation. This map is
monotonic
In mathematics, a monotonic function (or monotone function) is a function between ordered sets that preserves or reverses the given order. This concept first arose in calculus, and was later generalized to the more abstract setting of ord ...
with respect to the
partial order
In mathematics, especially order theory, a partially ordered set (also poset) formalizes and generalizes the intuitive concept of an ordering, sequencing, or arrangement of the elements of a set. A poset consists of a set together with a binary ...
given by subset inclusion on . By the
Knaster–Tarski theorem In the mathematical areas of order and lattice theory, the Knaster–Tarski theorem, named after Bronisław Knaster and Alfred Tarski, states the following:
:''Let'' (''L'', ≤) ''be a complete lattice and let f : L → L be an monotonic function ...
, this map has a least fixed point; by the
Kleene fixed-point theorem
In the mathematical areas of order and lattice theory, the Kleene fixed-point theorem, named after American mathematician Stephen Cole Kleene, states the following:
:Kleene Fixed-Point Theorem. Suppose (L, \sqsubseteq) is a directed-complete par ...
the fixed point is the
supremum
In mathematics, the infimum (abbreviated inf; plural infima) of a subset S of a partially ordered set P is a greatest element in P that is less than or equal to each element of S, if such an element exists. Consequently, the term ''greatest l ...
of the chain
. The least fixed point of coincides with the minimal Herbrand model of the program.
The
fixpoint
A fixed point (sometimes shortened to fixpoint, also known as an invariant point) is a value that does not change under a given transformation. Specifically, in mathematics, a fixed point of a function is an element that is mapped to itself by t ...
semantics suggest an algorithm for computing the minimal Herbrand model: Start with the set of ground facts in the program, then repeatedly add consequences of the rules until a fixpoint is reached. This algorithm is called
naïve evaluation.
Proof-theoretic
Given a program , a of a ground atom is a
tree
In botany, a tree is a perennial plant with an elongated stem, or trunk, usually supporting branches and leaves. In some usages, the definition of a tree may be narrower, including only woody plants with secondary growth, plants that are ...
with a root labeled by , leaves labeled by ground atoms from the heads of facts in , and branches with children
labeled by ground atoms such that there exists a ground instance
:
G :- A1, …, An.
of a rule in . The proof-theoretic semantics defines the meaning of a Datalog program to be the set of ground atoms that can be defived from such trees. This set coincides with the minimal Herbrand model.
One might be interested in knowing whether or not a particular ground atom appears in the minimal Herbrand model of a Datalog program, perhaps without caring much about the rest of the model. A top-down reading of the proof trees described above suggests an algorithm for computing the results of such ''queries'', such a reading informs the
SLD resolution SLD resolution (''Selective Linear Definite'' clause resolution) is the basic inference rule used in logic programming. It is a refinement of resolution, which is both sound and refutation complete for Horn clauses.
The SLD inference rule
Given ...
algorithm, which forms the basis for the evaluation of
Prolog
Prolog is a logic programming language associated with artificial intelligence and computational linguistics.
Prolog has its roots in first-order logic, a formal logic, and unlike many other programming languages, Prolog is intended primarily a ...
.
Other approaches
The semantics of Datalog have also been studied in the context of fixpoints over more general
semiring
In abstract algebra, a semiring is an algebraic structure similar to a ring, but without the requirement that each element must have an additive inverse.
The term rig is also used occasionally—this originated as a joke, suggesting that rigs a ...
s.
Logic programming
While the name "logic programming" is used to refer to the entire paradigm of programming languages including Datalog and Prolog, when discussing formal semantics, it generally refers to an extension of Datalog with
function symbols. Logic programs are also called . Logic programming as discussed in this article is closely related to the "pure" or
declarative subset of
Prolog
Prolog is a logic programming language associated with artificial intelligence and computational linguistics.
Prolog has its roots in first-order logic, a formal logic, and unlike many other programming languages, Prolog is intended primarily a ...
.
Syntax
The syntax of logic programming extends the syntax of Datalog with function symbols. Logic programming drops the range restriction, allowing variables to appear in the heads of rules that do not appear in their bodies.
Semantics
Due to the presence of function symbols, the Herbrand models of logic programs can be infinite. However, the semantics of a logic program is still defined to be its minimal Herbrand model. Relatedly, the fixpoint of the immediate consequence operator may not converge in a finite number of steps (or to a finite set). However, any ground atom in the minimal Herbrand model will have a finite proof tree. This is why Prolog is evaluated top-down. Just as in Datalog, the three semantics can be proven equivalent.
Negation
Logic programming has the desirable property that all three major definitions of the semantics of logic programs agree. In contrast, there are many conflicting proposals for the semantics of logic programs with negation. The source of the disagreement is that logic programs have a unique minimal Herbrand model, but in general, logic programming (or even Datalog) programs with negation do not.
Syntax
Negation is written
not
, and can appear in front of any atom in the body of a rule.
::= , "not" , "," , ""
Semantics
Stratified negation
A logic program with negation is ''stratified'' when it is possible to assign each relation to some ''stratum'', such that if a relation appears negated in the body of a relation , then is in a lower stratum than . The model-theoretic and fixed-point semantics of Datalog can be extended to handle stratified negation, and such extensions can be proved equivalent.
Many implementations of Datalog use a bottom-up evaluation model inspired by the fixed point semantics. Since this semantics can handle stratified negation, several implementations of Datalog implement stratified negation.
While stratified negation is a common extension to Datalog, there are reasonable programs that cannot be stratified. The following program describes a two-player game where a player wins if their opponent has no moves:
move(a, b).
win(X) :- move(X, Y), not win(Y).
This program is not stratified, but it seems reasonable to think that
a
should win the game.
Completion semantics
Perfect model semantics
Stable model semantics
The stable model semantics define a condition for calling certain Herbrand models of a program ''stable''. Intuitively, stable models are the "possible sets of beliefs that a rational agent might hold, given
he program
He or HE may refer to:
Language
* He (pronoun), an English pronoun
* He (kana), the romanization of the Japanese kana へ
* He (letter), the fifth letter of many Semitic alphabets
* He (Cyrillic), a letter of the Cyrillic script called ''He'' i ...
as premises.
A program with negation may have many stable models or no stable models. For instance, the program
p :- not q.
q :- not p.
has two stable models
,
. The one-rule program
p :- not p.
has no stable models.
Every stable model is a minimal Herbrand model. A Datalog program without negation has a single stable model, which is exactly its minimal Herbrand model.
The stable model semantics defines the meaning of a logic program with negation to be its stable model, if there is exactly one. However, it can be useful to investigate (or at least, several) of the stable models of a program; this is the goal of
answer set programming
Answer set programming (ASP) is a form of declarative programming oriented towards difficult (primarily NP-hard) search problems. It is based on the stable model (answer set) semantics of logic programming. In ASP, search problems are reduced ...
.
Well-founded semantics
Further extensions
Several other extensions of Datalog have been proposed and studied, including variants with support for
integer
An integer is the number zero (), a positive natural number (, , , etc.) or a negative integer with a minus sign ( −1, −2, −3, etc.). The negative numbers are the additive inverses of the corresponding positive numbers. In the language ...
constants and functions (including
Datalogℤ),
inequality
Inequality may refer to:
Economics
* Attention inequality, unequal distribution of attention across users, groups of people, issues in etc. in attention economy
* Economic inequality, difference in economic well-being between population groups
* ...
constraints in the bodies of rules, and
aggregate function
In database management, an aggregate function or aggregation function is a function where the values of multiple rows are grouped together to form a single summary value.
Common aggregate functions include:
* Average (i.e., arithmetic mean)
* ...
s.
Constraint logic programming
Constraint logic programming is a form of constraint programming, in which logic programming is extended to include concepts from constraint satisfaction. A constraint logic program is a logic program that contains constraints in the body of clau ...
allows for constraints over domains such as the
reals or
integer
An integer is the number zero (), a positive natural number (, , , etc.) or a negative integer with a minus sign ( −1, −2, −3, etc.). The negative numbers are the additive inverses of the corresponding positive numbers. In the language ...
s to appear in the bodies of rules.
See also
*
Herbrand structure
*
Logic programming
Logic programming is a programming paradigm which is largely based on formal logic
Logic is the study of correct reasoning. It includes both formal and informal logic. Formal logic is the science of deductively valid inferences or of log ...
*
Negation as failure Negation as failure (NAF, for short) is a non-monotonic inference rule in logic programming, used to derive \mathrm~p (i.e. that ~p is assumed not to hold) from failure to derive ~p. Note that \mathrm ~p can be different from the statement \neg p ...
*
Prolog syntax and semantics
The syntax and semantics of Prolog, a programming language, are the sets of rules that define how a Prolog program is written and how it is interpreted, respectively. The rules are laid out in ISO standard ISO/IEC 13211''ISO/IEC 13211: Informati ...
References
Notes
Sources
*{{Cite journal, last1=Ceri, first1=S., last2=Gottlob, first2=G., last3=Tanca, first3=L., date=March 1989, title=What you always wanted to know about Datalog (and never dared to ask), url=https://www2.cs.sfu.ca/CourseCentral/721/jim/DatalogPaper.pdf, journal=IEEE Transactions on Knowledge and Data Engineering, volume=1, issue=1, pages=146–166, doi=10.1109/69.43410, issn=1041-4347 , citeseerx = 10.1.1.210.1118
Programming language syntax
Logic programming