Probabilistic logic programming is a

programming paradigm Programming paradigms are a way to classify programming languages based on their features. Languages can be classified into multiple paradigms. Some paradigms are concerned mainly with implications for the execution model of the language, s ...

that combines

logic programming Logic programming is a programming paradigm which is largely based on formal logic Logic is the study of correct reasoning. It includes both formal and informal logic. Formal logic is the science of deductively valid inferences or of log ...

with probabilities. Most approaches to probabilistic logic programming are based on the ''distribution semantics,'' which splits a program into a set of probabilistic facts and a logic program. It defines a probability distribution on interpretations of the

Herbrand universe In first-order logic, a Herbrand structure ''S'' is a structure over a vocabulary σ that is defined solely by the syntactical properties of σ. The idea is to take the symbols of terms as their values, e.g. the denotation of a constant symbol ' ...

of the program.

Languages

Most approaches to probabilistic logic programming are based on the ''distribution semantics,'' which underlies many languages such as Probabilistic Horn Abduction, PRISM, Independent Choice Logic , probabilistic

Datalog Datalog is a declarative logic programming language. While it is syntactically a subset of Prolog, Datalog generally uses a bottom-up rather than top-down evaluation model. This difference yields significantly different behavior and properties ...

, Logic Programs with Annotated Disjunctions,

ProbLog ProbLog is a probabilistic logic programming language that extends Prolog with probabilities. It minimally extends Prolog by adding the notion of a probabilistic fact, which combines the idea of logical atoms and random variables. Similarly t ...

, P-log, and CP-logic. While the number of languages is large, many share a common approach so that there are transformations with linear complexity that can translate one language into another.

Semantics

Under the distribution semantics, a probabilistic logic program is interpreted as a set of independent probabilistic facts (

ground Ground may refer to: Geology * Land, the surface of the Earth not covered by water * Soil, a mixture of clay, sand and organic matter present on the surface of the Earth Electricity * Ground (electricity), the reference point in an electrical c ...

atomic formulas annotated with a probability) and a

logic program Logic programming is a programming paradigm which is largely based on formal logic. Any program written in a logic programming language is a set of sentences in logical form, expressing facts and rules about some problem domain. Major logic prog ...

which can use the probabilistic facts in the bodies of its clauses. The probability of any assignment of truth values to the groundings of the formulas associated with probabilistic facts is given by the product of their probabilities; this is equivalent to assuming the choices of probabilistic facts to be

independent random variables Independent or Independents may refer to: Arts, entertainment, and media Artist groups * Independents (artist group), a group of modernist painters based in the New Hope, Pennsylvania, area of the United States during the early 1930s * Independen ...

Stratified programs

If for any choice of truth values for the probabilistic facts, the resulting logic program is

stratified Stratification may refer to: Mathematics * Stratification (mathematics), any consistent assignment of numbers to predicate symbols * Data stratification in statistics Earth sciences * Stable and unstable stratification * Stratification, or s ...

, it has a unique minimal

Herbrand model In first-order logic, a Herbrand structure ''S'' is a structure over a vocabulary σ that is defined solely by the syntactical properties of σ. The idea is to take the symbols of terms as their values, e.g. the denotation of a constant symbol ' ...

which can be seen as the unique interpretation associated with that choice of truth values. Important subclasses of stratified programs are positive programs, which do not use negation, but may be recursive, and acyclic programs, which may use negation but have no recursive dependencies.

Answer set programs

The

stable model semantics The concept of a stable model, or answer set, is used to define a declarative semantics for logic programs with negation as failure. This is one of several standard approaches to the meaning of negation in logic programming, along with program co ...

underlying

answer set programming Answer set programming (ASP) is a form of declarative programming oriented towards difficult (primarily NP-hard) search problems. It is based on the stable model (answer set) semantics of logic programming. In ASP, search problems are reduced ...

gives meaning to unstratified programs by allocating potentially more than one answer set to every truth value assignment of the probabilistic facts. This raises the question of how to distribute the probability mass across the answer sets. The probabilistic logic programming language P-Log resolves this by dividing the probability mass equally between the answer sets, following the

principle of indifference The principle of indifference (also called principle of insufficient reason) is a rule for assigning epistemic probabilities. The principle of indifference states that in the absence of any relevant evidence, agents should distribute their cre ...

. Alternatively, probabilistic answer set programming under the credal semantics allocates a ''

credal set A credal set is a set of probability distributions or, more generally, a set of (possibly finitely additive) probability measures. A credal set is often assumed or constructed to be a closed convex set. It is intended to express uncertainty or ...

'' to every query. Its lower probability bound is defined by only considering those truth value assignments of the probabilistic facts for which the query is true in every answer set of the resulting program (cautious reasoning); its upper probability bound is defined by considering those assignments for which the query is true in some answer set (brave reasoning).

Inference

Under the distribution semantics, a probabilistic logic program defines a probability distribution over interpretations of its predicates on its

. The probability of a

query is then obtained from the

joint distribution Given two random variables that are defined on the same probability space, the joint probability distribution is the corresponding probability distribution on all possible pairs of outputs. The joint distribution can just as well be considered ...

of the query and the worlds: it is the sum of the probability of the worlds where the query is true. The problem of computing the probability of queries is called ''(marginal) inference''. Solving it by computing all the worlds and then identifying those that entail the query is impractical as the number of possible worlds is exponential in the number of ground probabilistic facts. In fact, already for acyclic programs and atomic queries, computing the conditional probability of a query given a conjunction of atoms as evidence is #P-complete.

Exact inference

Usually, exact inference is performed by resorting to

knowledge compilation Knowledge compilation is a family of approaches for addressing the intractability of a number of artificial intelligence problems. A propositional model is compiled in an off-line phase in order to support some queries in polynomial time. Many way ...

: according to this, a

propositional In logic and linguistics, a proposition is the meaning of a declarative sentence. In philosophy, "meaning" is understood to be a non-linguistic entity which is shared by all sentences with the same meaning. Equivalently, a proposition is the no ...

theory and a query are

compiled In computing, a compiler is a computer program that Translator (computing), translates computer code written in one programming language (the ''source'' language) into another language (the ''target'' language). The name "compiler" is primarily ...

into a “target language”, which is then used to answer queries in

polynomial time In computer science, the time complexity is the computational complexity that describes the amount of computer time it takes to run an algorithm. Time complexity is commonly estimated by counting the number of elementary operations performed by ...

. The compilation becomes the main computational bottleneck, but considerable effort has been devoted to the development of efficient compilers. The compilation methods differ in the compactness of the target language and the class of queries and transformations that they support in polynomial time.

Approximate inference

Since the cost of inference may be very high, approximate algorithms have been developed. They either compute subsets of possibly incomplete explanations or use random sampling. In the first approach, a subset of the explanations provides a lower bound and the set of partially expanded explanations provides an upper bound. In the second approach, the truth of the query is repeatedly checked in an ordinary

sampled from the probabilistic program. The probability of the query is then given by the fraction of the successes.

Learning

Probabilistic inductive logic programming aims to learn probabilistic logic programs from data. This includes parameter learning, which estimates the probability annotations of a program while the clauses themselves are given by the user, and structure learning, in which the clauses themselves are induced by the probabilistic inductive logic programming system. Common approaches to parameter learning are based on expectation–maximization or

gradient descent In mathematics, gradient descent (also often called steepest descent) is a first-order iterative optimization algorithm for finding a local minimum of a differentiable function. The idea is to take repeated steps in the opposite direction of ...

, while structure learning can be performed by searching the space of possible clauses under a variety of heuristics.

References

''As of 3 February 2024, this article is derived in whole or in part from'' ''The copyright holder has licensed the content in a manner that permits reuse under

CC BY-SA 3.0 A Creative Commons (CC) license is one of several public copyright licenses that enable the free distribution of an otherwise copyrighted "work".A "work" is any creative material made by a person. A painting, a graphic, a book, a song/lyrics ...

and

GFDL The GNU Free Documentation License (GNU FDL or simply GFDL) is a copyleft license for free documentation, designed by the Free Software Foundation (FSF) for the GNU Project. It is similar to the GNU General Public License, giving readers the r ...

. All relevant terms must be followed.'' {{Programming paradigms navbox Programming paradigms Logic programming Probabilistic models