In statistics and machine learning, a Markov blanket of a random variable is a set of variables that renders the variable conditionally independent of all other variables in the system. This concept is central in probabilistic graphical models and feature selection. If a Markov blanket is minimal, meaning that no variable in it can be removed without losing this conditional independence, it is called a Markov boundary. Identifying a Markov blanket or boundary allows for efficient inference and helps isolate relevant variables for prediction or causal reasoning. The terms Markov blanket and Markov boundary were coined by Judea Pearl in 1988. A Markov blanket may be derived from the structure of a probabilistic graphical model such as a Bayesian network or Markov random field.

Markov blanket

A Markov blanket of a random variable $Y$ in a random variable set $\mathcal{S}=\{X_1,\ldots,X_n\}$ is any subset $\mathcal{S}_1$ of $\mathcal{S}$, conditioned on which the other variables are independent of $Y$:

$$Y \perp\!\!\!\perp \mathcal{S} \setminus \mathcal{S}_1 \mid \mathcal{S}_1.$$

This means that $\mathcal{S}_1$ contains at least all the information one needs to infer $Y$, so the variables in $\mathcal{S} \setminus \mathcal{S}_1$ are redundant. In general, a Markov blanket is not unique: any subset of $\mathcal{S}$ that contains a Markov blanket is itself a Markov blanket. In particular, $\mathcal{S}$ is a Markov blanket of $Y$ in $\mathcal{S}$.
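The definition and the superset property can be written out as a short derivation. This is a sketch only: the intermediate set $\mathcal{T}$ is a symbol introduced here for illustration, and the first line assumes the conditional distributions involved are well defined. The implication uses the weak union property of conditional independence.

```latex
% Conditional-distribution form of the blanket property
P(Y \mid \mathcal{S}_1,\ \mathcal{S} \setminus \mathcal{S}_1) = P(Y \mid \mathcal{S}_1)

% Supersets of a blanket are blankets.
% Assume \mathcal{S}_1 \subseteq \mathcal{T} \subseteq \mathcal{S}, so that
% \mathcal{S} \setminus \mathcal{S}_1
%   = (\mathcal{S} \setminus \mathcal{T}) \cup (\mathcal{T} \setminus \mathcal{S}_1).
\begin{align*}
Y \perp\!\!\!\perp (\mathcal{S} \setminus \mathcal{T}) \cup (\mathcal{T} \setminus \mathcal{S}_1) \mid \mathcal{S}_1
\;&\Longrightarrow\;
Y \perp\!\!\!\perp \mathcal{S} \setminus \mathcal{T} \mid \mathcal{S}_1 \cup (\mathcal{T} \setminus \mathcal{S}_1)
&& \text{(weak union)} \\
\;&\Longleftrightarrow\;
Y \perp\!\!\!\perp \mathcal{S} \setminus \mathcal{T} \mid \mathcal{T}
&& \text{(since } \mathcal{S}_1 \subseteq \mathcal{T}\text{)}
\end{align*}
```

The last line is exactly the statement that $\mathcal{T}$ is a Markov blanket of $Y$ in $\mathcal{S}$.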

Example

In a Bayesian network, the Markov blanket of a node consists of its parents, its children, and its children's other parents (i.e., co-parents). Knowing the values of these nodes makes the target node conditionally independent of the rest of the network. In a Markov random field, the Markov blanket of a node consists simply of its immediate neighbors.
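The graphical rules above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function names `markov_blanket` and `mrf_neighbors`, the parent-set encoding of the directed graph, and the example graphs are all assumptions made here.

```python
def markov_blanket(parents, node):
    """Markov blanket of `node` in a DAG encoded as {child: set_of_parents}.

    Collects the node's parents, its children, and its children's other
    parents (co-parents), following the rule stated in the text.
    """
    node_parents = set(parents.get(node, ()))
    children = {child for child, ps in parents.items() if node in ps}
    co_parents = set()
    for child in children:
        co_parents |= set(parents[child])
    return (node_parents | children | co_parents) - {node}


def mrf_neighbors(edges, node):
    """Markov blanket of `node` in an undirected graph given as (u, v) pairs."""
    return ({v for u, v in edges if u == node}
            | {u for u, v in edges if v == node})


# Hypothetical Bayesian network: A -> C <- B, C -> D -> G, C -> E <- F
dag = {
    "A": set(),
    "B": set(),
    "C": {"A", "B"},
    "D": {"C"},
    "E": {"C", "F"},
    "F": set(),
    "G": {"D"},
}

# Hypothetical Markov random field: a simple path A - B - C - D
edges = [("A", "B"), ("B", "C"), ("C", "D")]

print(sorted(markov_blanket(dag, "C")))   # ['A', 'B', 'D', 'E', 'F']
print(sorted(markov_blanket(dag, "A")))   # ['B', 'C']
print(sorted(mrf_neighbors(edges, "B")))  # ['A', 'C']
```

In the example, `G` is not in the blanket of `C` because it is a grandchild, while `F` is included because it shares the child `E` with `C`.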

Markov condition

The concept of a Markov blanket is rooted in the Markov condition, which states that in a directed probabilistic graphical model such as a Bayesian network, each variable is conditionally independent of its non-descendants given its parents. This condition implies the existence of a separating set, the Markov blanket, that shields a variable from the rest of the network; a minimal such set is a Markov boundary.
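The Markov condition can be checked numerically on a small example. The sketch below assumes a three-node chain A → B → C over binary variables with made-up probability tables (none of these numbers come from the article) and verifies that C is independent of its non-descendant A given its parent B.

```python
from itertools import product

# Hypothetical conditional probability tables for the chain A -> B -> C.
p_a = {0: 0.6, 1: 0.4}
p_b_given_a = {0: {0: 0.7, 1: 0.3}, 1: {0: 0.2, 1: 0.8}}  # indexed [a][b]
p_c_given_b = {0: {0: 0.9, 1: 0.1}, 1: {0: 0.5, 1: 0.5}}  # indexed [b][c]

# Joint distribution factorised according to the graph.
joint = {
    (a, b, c): p_a[a] * p_b_given_a[a][b] * p_c_given_b[b][c]
    for a, b, c in product((0, 1), repeat=3)
}


def marginal(dist, keep):
    """Sum out every position of the assignment tuple not listed in `keep`."""
    out = {}
    for assignment, p in dist.items():
        key = tuple(assignment[i] for i in keep)
        out[key] = out.get(key, 0.0) + p
    return out


p_ab = marginal(joint, (0, 1))
p_b = marginal(joint, (1,))
p_bc = marginal(joint, (1, 2))

# Largest difference between P(c | a, b) and P(c | b) over all assignments.
max_gap = max(
    abs(joint[(a, b, c)] / p_ab[(a, b)] - p_bc[(b, c)] / p_b[(b,)])
    for a, b, c in joint
)
print(max_gap < 1e-12)  # True: conditioning on B makes A irrelevant for C
```

Here the parent set {B} acts as the shield between C and the rest of the chain, which is the property the Markov blanket generalises.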

Markov boundary

A Markov boundary of $Y$ in $\mathcal{S}$ is a subset $\mathcal{S}_2$ of $\mathcal{S}$, such that $\mathcal{S}_2$ itself is a Markov blanket of $Y$, but any proper subset of $\mathcal{S}_2$ is not a Markov blanket of $Y$. In other words, a Markov boundary is a minimal Markov blanket. The Markov boundary of a node $A$ in a Bayesian network is the set of nodes composed of $A$'s parents, $A$'s children, and $A$'s children's other parents. In a Markov random field, the Markov boundary for a node is the set of its neighboring nodes. In a dependency network, the Markov boundary for a node is the set of its parents.

Uniqueness of Markov boundary

The Markov boundary always exists. Under some mild conditions, the Markov boundary is unique. However, in most practical and theoretical scenarios, multiple Markov boundaries may exist and provide alternative solutions. When there are multiple Markov boundaries, quantities measuring causal effect may fail (Wang, Yue; Wang, Linbo (2020). "Causal inference in degenerate systems: An impossibility result". Proceedings of the 23rd International Conference on Artificial Intelligence and Statistics, pp. 3383-3392. http://proceedings.mlr.press/v108/wang20i.html).
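Non-uniqueness can be illustrated with a brute-force search over a small joint distribution. The sketch below is an assumption-laden toy, not a method taken from the cited paper: the variables `X1`, `X2`, `X3`, the probabilities, and all function names are hypothetical. `X2` is an exact copy of `X1`, `X3` is independent noise, and `Y` depends on `X1` only. Conditional independence is tested through the product form P(x, y) P(s) = P(s, y) P(x), which avoids dividing by zero probabilities.

```python
from itertools import combinations, product

NAMES = ("X1", "X2", "X3")  # candidate variables; Y is stored last in each tuple


def build_joint():
    """Joint over (X1, X2, X3, Y) where X2 duplicates X1 (a degenerate system)."""
    p_y1_given_x1 = {0: 0.2, 1: 0.9}
    joint = {}
    for x1, x3, y in product((0, 1), repeat=3):
        p_y = p_y1_given_x1[x1] if y == 1 else 1.0 - p_y1_given_x1[x1]
        joint[(x1, x1, x3, y)] = 0.5 * 0.5 * p_y
    return joint


def marginal(joint, keep):
    """Sum out every position of the assignment tuple not listed in `keep`."""
    out = {}
    for assignment, p in joint.items():
        key = tuple(assignment[i] for i in keep)
        out[key] = out.get(key, 0.0) + p
    return out


def is_markov_blanket(joint, subset, tol=1e-12):
    """True if Y is independent of the remaining variables given `subset`."""
    n = len(NAMES)
    p_x = marginal(joint, tuple(range(n)))   # all candidate variables
    p_s = marginal(joint, subset)            # conditioning set only
    p_sy = marginal(joint, subset + (n,))    # conditioning set together with Y
    for x in p_x:                            # only assignments with P(x) > 0
        s = tuple(x[i] for i in subset)
        for y in (0, 1):
            lhs = joint.get(x + (y,), 0.0) * p_s[s]
            rhs = p_sy.get(s + (y,), 0.0) * p_x[x]
            if abs(lhs - rhs) > tol:
                return False
    return True


def markov_boundaries(joint):
    """Enumerate all blankets, then keep those with no smaller blanket inside."""
    n = len(NAMES)
    blankets = [s for r in range(n + 1)
                for s in combinations(range(n), r)
                if is_markov_blanket(joint, s)]
    minimal = [s for s in blankets
               if not any(set(t) < set(s) for t in blankets)]
    return [tuple(NAMES[i] for i in s) for s in minimal]


boundaries = markov_boundaries(build_joint())
print(boundaries)  # [('X1',), ('X2',)]
```

Both `{X1}` and `{X2}` shield `Y` from everything else, and neither can be shrunk further, so this toy system has two distinct Markov boundaries. The exhaustive search is exponential in the number of variables and is only meant for illustration.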

Notes