Variable elimination (VE) is a simple and general exact inference algorithm in

probabilistic graphical model A graphical model or probabilistic graphical model (PGM) or structured probabilistic model is a probabilistic model for which a graph expresses the conditional dependence structure between random variables. They are commonly used in probability ...

s, such as

Bayesian network A Bayesian network (also known as a Bayes network, Bayes net, belief network, or decision network) is a probabilistic graphical model that represents a set of variables and their conditional dependencies via a directed acyclic graph (DAG). Bay ...

s and

Markov random field In the domain of physics and probability, a Markov random field (MRF), Markov network or undirected graphical model is a set of random variables having a Markov property described by an undirected graph. In other words, a random field is said to b ...

s.Zhang, N.L., Poole, D.:A Simple Approach to Bayesian Network Computations.In: 7th Canadian Conference on Artificial Intelligence,pp. 171--178. Springer, New York (1994) It can be used for inference of

maximum a posteriori In Bayesian statistics, a maximum a posteriori probability (MAP) estimate is an estimate of an unknown quantity, that equals the mode of the posterior distribution. The MAP can be used to obtain a point estimate of an unobserved quantity on the b ...

(MAP) state or estimation of conditional or

marginal distribution In probability theory and statistics, the marginal distribution of a subset of a collection of random variables is the probability distribution of the variables contained in the subset. It gives the probabilities of various values of the variables ...

s over a subset of variables. The algorithm has exponential time complexity, but could be efficient in practice for low-

treewidth In graph theory, the treewidth of an undirected graph is an integer number which specifies, informally, how far the graph is from being a tree. The smallest treewidth is 1; the graphs with treewidth 1 are exactly the trees and the forests. The gra ...

graphs, if the proper elimination order is used.

Factors

Enabling a key reduction in algorithmic complexity, a factor

f

, also known as a potential, of variables

V

is a relation between each instantiation of

v

of variables

f

to a non-negative number, commonly denoted as

f(x)

. A factor does not necessarily have a set interpretation. One may perform operations on factors of different representations such as a probability distribution or conditional distribution. Joint distributions often become too large to handle as the complexity of this operation is exponential. Thus variable elimination becomes more feasible when computing factorized entities.

Basic Operations

Variable Summation

Algorithm 1, called sum-out (SO), or marginalization, eliminates a single variable

v

from a set

\phi

of factors,Koller, D., Friedman, N.: Probabilistic Graphical Models: Principles and Techniques. MIT Press, Cambridge, MA (2009) and returns the resulting set of factors. The algorithm collect-relevant simply returns those factors in

\phi

involving variable

v

. Algorithm 1 sum-out(

v

\phi

) :

\Phi

= collect factors relevant to

v

\Psi

= the product of all factors in

\Phi

\tau = \sum_ \Psi

return

(\phi - \Phi)   \cup \

Example Here we have a

joint probability distribution Given two random variables that are defined on the same probability space, the joint probability distribution is the corresponding probability distribution on all possible pairs of outputs. The joint distribution can just as well be considered ...

. A variable,

v

can be summed out between a set of instantiations where the set

V-v

at minimum must agree over the remaining variables. The value of

v

is irrelevant when it is the variable to be summed out. After eliminating

V_1

, its reference is excluded and we are left with a distribution only over the remaining variables and the sum of each instantiation. The resulting distribution which follows the sum-out operation only helps to answer queries that do not mention

V_1

. Also worthy to note, the summing-out operation is commutative.

Factor Multiplication

Computing a product between multiple factors results in a factor compatible with a single instantiation in each factor. Algorithm 2 mult-factors(

v

\phi

) :

Z

= Union of all variables between product of factors

f_1(X_1),...,f_m(X_m)

f

= a factor over

f

where

f

for all

f

:For each instantiation

z

::For 1 to

m

:::

x_1=

instantiation of variables

X_1

consistent with

z

:::

f(z) = f(z)f_i(x_i)

:return

f

Factor multiplication is not only commutative but also associative.

Inference

The most common query type is in the form

p(X, E = e)

where

X

and

E

are disjoint subsets of

U

, and

E

is observed taking value

e

. A basic algorithm to computing p(X, E = e) is called ''variable elimination'' (VE), first put forth in. Taken from, this algorithm computes

p(X, E = e)

from a discrete Bayesian network B. VE calls SO to eliminate variables one by one. More specifically, in Algorithm 2,

\phi

is the set C of conditional probability tables (henceforth "CPTs") for B,

X

is a list of query variables,

E

is a list of observed variables,

e

is the corresponding list of observed values, and

\sigma

is an elimination ordering for variables

U - XE

, where

XE

denotes

X \cup E

. Variable Elimination Algorithm VE(

\phi, X, E, e, \sigma

) :Multiply factors with appropriate CPTs while σ is not empty :Remove the first variable

v

from

\sigma

\phi

= sum-out

(v,\phi)

p(X, E = e)

= the product of all factors

\Psi \in \phi

return

p(X,E = e)/ \sum_ p(X,E = e)

Ordering

Finding the optimal order in which to eliminate variables is an NP-hard problem. As such there are heuristics one may follow to better optimize performance by order: # Minimum Degree: Eliminate the variable which results in constructing the smallest factor possible. # Minimum Fill: By constructing an undirected graph showing variable relations expressed by all CPTs, eliminate the variable which would result in the least edges to be added post elimination.

References

{{Reflist Graphical models