Probabilistic Soft Logic (PSL) is a
statistical relational learning (SRL) framework for modeling probabilistic and relational domains.
It is applicable to a variety of
machine learning
problems, such as
collective classification,
entity resolution,
link prediction, and
ontology alignment
.
PSL combines two tools:
first-order logic
, with its ability to succinctly represent complex phenomena, and
probabilistic graphical models, which capture the uncertainty and incompleteness inherent in real-world knowledge.
More specifically, PSL uses
"soft" logic as its logical component and
Markov random fields
as its statistical model.
PSL provides sophisticated inference techniques for finding the most likely answer (i.e. the
maximum a posteriori (MAP) state).
The "softening" of the logical formulas makes inference a
polynomial time
operation rather than an
NP-hard operation.
Description
The
SRL community has introduced multiple approaches that combine
graphical model
s and
first-order logic
to allow the development of complex probabilistic models with relational structures.
A notable example of such approaches is
Markov logic network
s (MLNs).
Like MLNs, PSL is a modelling language (with an accompanying implementation
) for learning and predicting in relational domains.
Unlike MLNs, PSL uses soft truth values for predicates in the interval [0, 1].
This allows for the underlying inference to be solved quickly as a convex optimization problem.
This is useful in problems such as
collective classification,
link prediction,
social network
modelling, and
object identification/entity resolution/record linkage.
Probabilistic Soft Logic was first released in 2009 by
Lise Getoor
and Matthias Broecheler.
This first version focused heavily on reasoning about similarities between entities.
Later versions of PSL would still keep the ability to reason about similarities, but generalize the language to be more expressive.
In 2017, a
Journal of Machine Learning Research
article detailing PSL and the underlying graphical model was published along with the release of a new major version of PSL (2.0.0).
The major new features in PSL 2.0.0 were a new type of rule, mainly used in specifying constraints, and a
command-line interface
.
Syntax and Semantics
Terminology
* PSL Program — A collection of rules, each of which is a template for a potential in a graphical model.
* Rule — An expression relating atoms. Rules will typically take the form of either a
first-order logical implication or a
linear combination.
* Constant — A string or number that represents a real element in the universe over which a PSL program is defined. Constants can represent attributes or entire entities.
* Variable — An identifier for which constants can be substituted.
* Term — Either a constant or a variable.
* Predicate — A relation defined by a unique name and a number of arguments it accepts.
* Atom — A predicate along with its term arguments.
* Ground Atom — An atom where all arguments are constants.
Syntax
A PSL model is composed of a series of weighted rules and constraints.
PSL supports two types of rules: Logical and Arithmetic.
Logical rules are composed of an implication with only a single atom or a conjunction of atoms in the body and a single atom or a disjunction of atoms in the head.
Since PSL uses soft logic, hard logic operators are replaced with
Łukasiewicz soft logical operators.
An example of a logical rule expression is:
Similar(A, B) & HasLabel(A, X) -> HasLabel(B, X)
This rule can be interpreted to mean: ''If A and B are similar and A has the label X, then there is evidence that B also has the label X.''
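The Łukasiewicz relaxations of the Boolean connectives can be sketched in a few lines of Python. This is illustrative code, not part of the PSL implementation, and the truth values are arbitrary examples:

```python
# A minimal sketch of the Lukasiewicz relaxations of the Boolean
# connectives; illustrative only, not part of the PSL implementation.

def l_and(a, b):
    """Lukasiewicz t-norm, the soft relaxation of AND."""
    return max(0.0, a + b - 1.0)

def l_or(a, b):
    """Lukasiewicz t-conorm, the soft relaxation of OR."""
    return min(1.0, a + b)

def l_not(a):
    """Lukasiewicz negation."""
    return 1.0 - a

# Soft truth value of the rule body Similar(A, B) & HasLabel(A, X),
# using arbitrary example truth values 0.9 and 0.8:
body = l_and(0.9, 0.8)                      # roughly 0.7

# An implication body -> head is satisfied to degree min(1, 1 - body + head):
head = 0.6
satisfaction = min(1.0, 1.0 - body + head)  # roughly 0.9
```

Note that when both operands are fully Boolean (0 or 1), these operators agree exactly with their hard-logic counterparts.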
Arithmetic rules are relations of two
linear combinations of atoms.
Restricting each side to a linear combination ensures that the resulting potential is
convex.
The following relational operators are supported: =, <=, and >=.
Similar(A, B) = Similar(B, A)
This rule encodes the notion that similarity is symmetric in this model.
A commonly used feature of arithmetic rules is the summation operation.
The summation operation can be used to aggregate multiple atoms.
When used, the atom is replaced with the sum of all possible atoms where the non-summation variables are fixed.
Summation variables are made by prefixing a variable with a +.
For example:
HasLabel(A, +X) = 1.0
If the possible values for X are ''label1'', ''label2'', and ''label3'', then the above rule is equivalent to:
HasLabel(A, 'label1') + HasLabel(A, 'label2') + HasLabel(A, 'label3') = 1.0
Both of these rules force the truth values of all possible labels for an entity to sum to 1.0.
This type of rule is especially useful for
collective classification problems, where only one class can be selected.
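The expansion of a summation atom into ground atoms can be sketched as follows. The predicate name and label constants are the illustrative ones from the text, not a real dataset:

```python
# Sketch of how a summation atom such as HasLabel(A, +X) expands during
# grounding: the summation variable is replaced by one ground atom per
# constant that can be substituted for it.

def expand_summation(predicate, fixed_args, summation_values):
    """Return one ground atom per possible value of the summation variable."""
    return [f"{predicate}({', '.join(fixed_args + [repr(v)])})"
            for v in summation_values]

atoms = expand_summation("HasLabel", ["A"], ["label1", "label2", "label3"])
ground_rule = " + ".join(atoms) + " = 1.0"
print(ground_rule)
# prints: HasLabel(A, 'label1') + HasLabel(A, 'label2') + HasLabel(A, 'label3') = 1.0
```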
Semantics
HL-MRF
A PSL program defines a family of probabilistic
graphical model
s that are parameterized by data.
More specifically, the family of graphical models it defines belongs to a special class of Markov random field known as a Hinge-Loss Markov Random Field (HL-MRF).
An HL-MRF determines a density function over a set of continuous variables \(\mathbf{y} = (y_1, \ldots, y_n)\) with joint domain \([0, 1]^n\) using a set of evidence \(\mathbf{x}\), weights \(\mathbf{w} = (w_1, \ldots, w_m)\), and potential functions \(\phi = (\phi_1, \ldots, \phi_m)\) of the form

\[ \phi_j(\mathbf{x}, \mathbf{y}) = \max(\ell_j(\mathbf{x}, \mathbf{y}), 0)^{p_j} \]

where \(\ell_j\) is a linear function and \(p_j \in \{1, 2\}\).
The conditional distribution of \(\mathbf{y}\) given the observed data \(\mathbf{x}\) is defined as

\[ P(\mathbf{y} \mid \mathbf{x}) = \frac{1}{Z(\mathbf{w}, \mathbf{x})} \exp\left( -\sum_{j=1}^{m} w_j \phi_j(\mathbf{x}, \mathbf{y}) \right) \]

where \(Z(\mathbf{w}, \mathbf{x}) = \int_{\mathbf{y}} \exp\left( -\sum_{j=1}^{m} w_j \phi_j(\mathbf{x}, \mathbf{y}) \right) d\mathbf{y}\) is the partition function.
This density is a logarithmically concave function of \(\mathbf{y}\), and thus the common inference task in PSL of finding a maximum a posteriori (MAP) estimate of the joint state of \(\mathbf{y}\) is a convex optimization problem.
This allows inference in PSL to be solved in polynomial time.
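The density above can be sketched directly from its definition. This is an illustrative computation with made-up weights and linear values, omitting the partition function:

```python
import math

def hinge_potential(linear_value, p):
    """phi(y) = max(l(y), 0)^p, the hinge-loss potential with p in {1, 2}."""
    assert p in (1, 2)
    return max(linear_value, 0.0) ** p

def unnormalized_density(weights, linear_values, exponents):
    """exp(-sum_j w_j * phi_j); the partition function Z is omitted."""
    energy = sum(w * hinge_potential(l, p)
                 for w, l, p in zip(weights, linear_values, exponents))
    return math.exp(-energy)

# Two toy potentials: the first is violated (l = 0.3 > 0) and contributes
# a squared hinge loss; the second is satisfied (l = -0.2) and contributes 0.
d = unnormalized_density(weights=[2.0, 1.0],
                         linear_values=[0.3, -0.2],
                         exponents=[2, 1])
```

Because each potential is a convex function of the variables, the energy (the weighted sum of potentials) is convex, which is what makes MAP inference tractable.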
Open/Closed Predicates -- Closed World Assumption
Predicates in PSL can be labeled as open or closed.
When a predicate is labeled closed, PSL makes the
closed-world assumption: any ground atoms of that predicate that are not explicitly provided to PSL are assumed to be false.
In other words, the closed-world assumption presumes that any atom that is at least partially true is also explicitly known to be at least partially true.
For example, suppose the data contains constants representing people and a constant representing a movie, and we provide PSL with observed values for a predicate relating people to movies.
If that predicate is labeled closed, then PSL assumes that every person-movie atom absent from the observed data has a truth value of zero, even though this data was never explicitly provided to the system.
If a predicate is labeled as open, then PSL does not make the closed-world assumption. Instead, PSL will attempt to collectively infer the unobserved instances.
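The distinction can be sketched as a lookup rule over observed data. The predicate name and constants here are hypothetical, and this is not the pslpython API:

```python
# Sketch of open vs. closed predicates. Under the closed-world assumption,
# a ground atom of a closed predicate that is absent from the observed data
# defaults to a truth value of 0.0; for an open predicate its value is left
# to inference. "Likes" and the constants below are hypothetical.

observed = {("Likes", ("alice", "movie1")): 1.0}

def truth_value(predicate, args, closed=True):
    key = (predicate, args)
    if key in observed:
        return observed[key]
    if closed:
        return 0.0   # closed predicate: unobserved atoms are assumed false
    return None      # open predicate: value must be inferred

print(truth_value("Likes", ("bob", "movie1")))  # prints: 0.0
```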
Grounding
Data is used to instantiate several potential functions in a process called grounding.
The resulting potential functions are then used to define the HL-MRF.
Grounding predicates in PSL is the process of making all possible substitutions of the variables in each predicate with the existing constants in the data, resulting in a collection of ground atoms.
Then, all possible substitutions of the ground atoms for the predicates in the rules are made to create ground rules.
Each of the ground rules is interpreted as either a potential or a hard constraint in the induced HL-MRF.
A logical rule is translated as a continuous relaxation of Boolean connectives using
Łukasiewicz logic
.
A ground logical rule is transformed into its
disjunctive normal form
.
Let \(I^+\) be the set of indices of the variables that correspond to atoms that are not negated and, likewise, \(I^-\) the set of indices corresponding to atoms that are negated in the disjunctive clause.
Then the logical rule maps to the inequality:

\[ 1 - \sum_{i \in I^+} y_i - \sum_{i \in I^-} (1 - y_i) \le 0 \]

If the logical rule is weighted with a weight \(w\) and exponentiated with \(p \in \{1, 2\}\), then the potential

\[ \phi(\mathbf{y}) = \max\left( 1 - \sum_{i \in I^+} y_i - \sum_{i \in I^-} (1 - y_i),\; 0 \right)^p \]

is added to the HL-MRF with a weight parameter of \(w\).
An arithmetic rule is manipulated into the form \(\ell(\mathbf{y}) \le 0\), and the resulting potential takes the form \(\phi(\mathbf{y}) = \max(\ell(\mathbf{y}), 0)^p\).
Interfaces
PSL is available via three different language
interfaces:
CLI,
Java, and
Python.
PSL's command line interface (CLI) is the recommended way to use PSL.
It supports the commonly used features of PSL in a reproducible form that does not require compilation.
Since PSL is written in Java, the PSL Java interface is the most expansive and users can call directly into the core of PSL.
The Java interface is available through the
Maven central repository.
The PSL Python interface is available through
PyPI
and uses
pandas DataFrames to pass data between PSL and the user.
PSL previously provided a Groovy interface.
It was deprecated in the 2.2.1 release of PSL and is scheduled to be removed in the 2.3.0 release.
Examples
The LINQS lab, developers of the official PSL implementation, maintain a collection of PSL examples.
These examples cover both synthetic and real-world datasets and include examples from academic publications using PSL.
Below is a toy example from this repository that can be used to infer relations in a social network.
Along with each rule is a comment describing the motivating intuition behind the statements.
/* People living in the same location are more likely to know one another. */
20: Lived(P1, L) & Lived(P2, L) & (P1 != P2) -> Knows(P1, P2) ^2
/* People who have not lived in the same location are not likely to know one another. */
5: Lived(P1, L1) & Lived(P2, L2) & (P1 != P2) & (L1 != L2) -> !Knows(P1, P2) ^2
/* Two people with similar interests are more likely to know one another. */
10: Likes(P1, X) & Likes(P2, X) & (P1 != P2) -> Knows(P1, P2) ^2
/* People in the same circles tend to know one another (transitivity). */
5: Knows(P1, P2) & Knows(P2, P3) & (P1 != P3) -> Knows(P1, P3) ^2
/* Knowing one another is symmetric. */
Knows(P1, P2) = Knows(P2, P1) .
/* By default, assume that two arbitrary people do not know one another (negative prior). */
5: !Knows(P1, P2) ^2
See also
* Statistical relational learning
* Probabilistic logic network
* Markov logic network
* Fuzzy logic
References
* Bach, Stephen; Broecheler, Matthias; Huang, Bert; Getoor, Lise (2017). "Hinge-Loss Markov Random Fields and Probabilistic Soft Logic". Journal of Machine Learning Research. 18: 1–67.
* Broecheler, Matthias; Getoor, Lise (2009). "Probabilistic Similarity Logic". International Workshop on Statistical Relational Learning (SRL). https://linqs.github.io/linqs-website/publications/#id:broecheler-srl09
* Getoor, Lise; Taskar, Ben (2007). Introduction to Statistical Relational Learning. MIT Press. ISBN 978-0262072885. https://linqs.github.io/linqs-website/publications/#id:getoor-book07
* "Maven Repository: org.linqs » psl-groovy". mvnrepository.com. https://mvnrepository.com/artifact/org.linqs/psl-groovy
* "Maven Repository: org.linqs » psl-java". mvnrepository.com. Retrieved 15 July 2020. https://mvnrepository.com/artifact/org.linqs/psl-java
* "PSL API Reference". Probabilistic Soft Logic. Retrieved 15 July 2020. https://psl.linqs.org/api/
* Augustine, Eriq (6 December 2019). "PSL 2.2.1 Release". Probabilistic Soft Logic. Retrieved 15 July 2020. https://psl.linqs.org/blog/2019/12/06/psl-2.2.1-release.html
* Augustine, Eriq (15 July 2018). "Getting Started with PSL". Probabilistic Soft Logic. Retrieved 15 July 2020. https://psl.linqs.org/blog/2018/07/15/getting-started-with-psl.html
* "GitHub repository". Retrieved 26 March 2018. https://github.com/linqs/psl
* "linqs/psl-examples". GitHub. linqs. 19 June 2020. https://github.com/linqs/psl-examples
* "Rule Specification". psl.linqs.org. LINQS Lab. 6 December 2019. Retrieved 10 July 2020. https://psl.linqs.org/wiki/master/Rule-Specification.html
* "pslpython: A python interface to the PSL SRL/ML software". Python Package Index. Retrieved 15 July 2020. https://pypi.org/project/pslpython/
* "PSL 2.2.2". GitHub. Retrieved 16 July 2020. https://github.com/linqs/psl/tree/2.2.2
External links
Video lectures about PSL on YouTube