Markov Kernel
   HOME

TheInfoList



OR:

In
probability theory Probability theory is the branch of mathematics concerned with probability. Although there are several different probability interpretations, probability theory treats the concept in a rigorous mathematical manner by expressing it through a set o ...
, a Markov kernel (also known as a stochastic kernel or probability kernel) is a map that in the general theory of
Markov process A Markov chain or Markov process is a stochastic model describing a sequence of possible events in which the probability of each event depends only on the state attained in the previous event. Informally, this may be thought of as, "What happe ...
es plays the role that the transition matrix does in the theory of Markov processes with a
finite Finite is the opposite of infinite. It may refer to: * Finite number (disambiguation) * Finite set, a set whose cardinality (number of elements) is some natural number * Finite verb, a verb form that has a subject, usually being inflected or marked ...
state space A state space is the set of all possible configurations of a system. It is a useful abstraction for reasoning about the behavior of a given system and is widely used in the fields of artificial intelligence and game theory. For instance, the toy ...
.


Formal definition

Let (X,\mathcal A) and (Y,\mathcal B) be
measurable space In mathematics, a measurable space or Borel space is a basic object in measure theory. It consists of a set and a σ-algebra, which defines the subsets that will be measured. Definition Consider a set X and a σ-algebra \mathcal A on X. Then the ...
s. A ''Markov kernel'' with source (X,\mathcal A) and target (Y,\mathcal B) is a map \kappa : \mathcal B \times X \to ,1/math> with the following properties: # For every (fixed) B \in \mathcal B, the map x \mapsto \kappa(B, x) is \mathcal A-measurable # For every (fixed) x \in X, the map B \mapsto \kappa(B, x) is a
probability measure In mathematics, a probability measure is a real-valued function defined on a set of events in a probability space that satisfies measure properties such as ''countable additivity''. The difference between a probability measure and the more gener ...
on (Y, \mathcal B) In other words it associates to each point x \in X a
probability measure In mathematics, a probability measure is a real-valued function defined on a set of events in a probability space that satisfies measure properties such as ''countable additivity''. The difference between a probability measure and the more gener ...
\kappa(dy, x): B \mapsto \kappa(B, x) on (Y,\mathcal B) such that, for every measurable set B\in\mathcal B, the map x\mapsto \kappa(B, x) is measurable with respect to the \sigma-algebra \mathcal A.


Examples


Simple random walk In mathematics, a random walk is a random process that describes a path that consists of a succession of random steps on some mathematical space. An elementary example of a random walk is the random walk on the integer number line \mathbb Z ...
on the integers

Take X=Y=\Z, and \mathcal A = \mathcal B = \mathcal P(\Z) (the
power set In mathematics, the power set (or powerset) of a set is the set of all subsets of , including the empty set and itself. In axiomatic set theory (as developed, for example, in the ZFC axioms), the existence of the power set of any set is po ...
of \Z). Then a Markov kernel is fully determined by the probability it assigns to singletons \,\, m \in Y = \Z for each n \in X = \Z: :\kappa(B, n )=\sum_\kappa(\, n), \qquad \forall n \in \mathbb, \, \forall B \in \mathcal B. Now the random walk \kappa that goes to the right with probability p and to the left with probability 1 - p is defined by :\kappa(\, n)= p \delta_+ (1-p) \delta_, \quad \forall n,m \in \Z where \delta is the
Kronecker delta In mathematics, the Kronecker delta (named after Leopold Kronecker) is a function of two variables, usually just non-negative integers. The function is 1 if the variables are equal, and 0 otherwise: \delta_ = \begin 0 &\text i \neq j, \\ 1 &\ ...
. The transition probabilities P(m, n) = \kappa(\, n) for the random walk are equivalent to the Markov kernel.


General

Markov process A Markov chain or Markov process is a stochastic model describing a sequence of possible events in which the probability of each event depends only on the state attained in the previous event. Informally, this may be thought of as, "What happe ...
es with countable state space

More generally take X and Y both countable and \mathcal A = \mathcal P(X),\ \mathcal B = \mathcal P(Y). Again a Markov kernel is defined by the probability it assigns to singleton sets for each i \in X :\kappa(B, i)=\sum_\kappa(\, i), \qquad \forall i \in X, \, \forall B \in \mathcal B, We define a Markov process by defining a transition probability P(j, i) = K_ where the numbers K_ define a (countable)
stochastic matrix In mathematics, a stochastic matrix is a square matrix used to describe the transitions of a Markov chain. Each of its entries is a nonnegative real number representing a probability. It is also called a probability matrix, transition matrix, ...
(K_) i.e. :\begin K_ &\ge 0, \qquad &\forall (j,i) \in Y\times X, \\ \sum_K_&=1, \qquad &\forall i \in X.\\ \end We then define : \kappa(\ , i) = K_ = P(j, i), \qquad \forall i \in X, \quad \forall B \in \mathcal B. Again the transition probability, the stochastic matrix and the Markov kernel are equivalent reformulations.


Markov kernel defined by a kernel function and a measure

Let \nu be a
measure Measure may refer to: * Measurement, the assignment of a number to a characteristic of an object or event Law * Ballot measure, proposed legislation in the United States * Church of England Measure, legislation of the Church of England * Mea ...
on (Y, \mathcal B), and k: Y \times X\to , \infty/math> a
measurable function In mathematics and in particular measure theory, a measurable function is a function between the underlying sets of two measurable spaces that preserves the structure of the spaces: the preimage of any measurable set is measurable. This is in di ...
with respect to the product \sigma-algebra \mathcal A \otimes \mathcal B such that : \int_Y k(y, x)\nu(\mathrm y) = 1, \qquad \forall x \in X , then \kappa(dy , x) = k(y, x)\nu(dy) i.e. the mapping :\begin \kappa:\mathcal B \times X \to ,1\\ \kappa(B, x)=\int_k(y, x)\nu(\mathrm y) \end defines a Markov kernel. This example generalises the countable Markov process example where \nu was the
counting measure In mathematics, specifically measure theory, the counting measure is an intuitive way to put a measure on any set – the "size" of a subset is taken to be the number of elements in the subset if the subset has finitely many elements, and infinity ...
. Moreover it encompasses other important examples such as the convolution kernels, in particular the Markov kernels defined by the heat equation. The latter example includes the
Gaussian kernel In mathematics, a Gaussian function, often simply referred to as a Gaussian, is a function of the base form f(x) = \exp (-x^2) and with parametric extension f(x) = a \exp\left( -\frac \right) for arbitrary real constants , and non-zero . It is n ...
on X = Y = \mathbb R with \nu(dx) = dx standard Lebesgue measure and :k_t(y, x) = \frace^.


Measurable functions

Take (X, \mathcal) and (Y, \mathcal) arbitrary measurable spaces, and let f:X \to Y be a measurable function. Now define \kappa(dy, x) = \delta_(dy) i.e. : \kappa(B, x) = \mathbf_B(f(x)) = \mathbf_(x) = \begin1 & \text f(x) \in B\\ 0 & \text\end for all B \in \mathcal. Note that the indicator function \mathbf_ is \mathcal-measurable for all B \in \mathcal iff f is measurable. This example allows us to think of a Markov kernel as a generalised function with a (in general) random rather than certain value. That is, it is a
multivalued function In mathematics, a multivalued function, also called multifunction, many-valued function, set-valued function, is similar to a function, but may associate several values to each input. More precisely, a multivalued function from a domain to a ...
where the values are not equally weighted.


Galton–Watson process The Galton–Watson process is a branching stochastic process arising from Francis Galton's statistical investigation of the extinction of family names. The process models family names as patrilineal (passed from father to son), while offspri ...

As a less obvious example, take X = \N, \mathcal A = \mathcal P(\N), and (Y, \mathcal B) the real numbers \R with the standard sigma algebra of
Borel set In mathematics, a Borel set is any set in a topological space that can be formed from open sets (or, equivalently, from closed sets) through the operations of countable union, countable intersection, and relative complement. Borel sets are named ...
s. Then :\kappa(B, n)=\begin \mathbf_B(0) & n=0\\ \Pr(\xi_1 + \cdots + \xi_x \in B) & n \neq 0 \\ \end with
i.i.d. In probability theory and statistics, a collection of random variables is independent and identically distributed if each random variable has the same probability distribution as the others and all are mutually independent. This property is us ...
random variable A random variable (also called random quantity, aleatory variable, or stochastic variable) is a mathematical formalization of a quantity or object which depends on random events. It is a mapping or a function from possible outcomes (e.g., the po ...
s \xi_i (usually with mean 0) and where \mathbf_B is the indicator function. For the simple case of coin flips this models the different levels of a
Galton board The Galton board, also known as the Galton box or quincunx or bean machine, is a device invented by Sir Francis Galton to demonstrate the central limit theorem, in particular that with sufficient sample size the binomial distribution approxim ...
.


Composition of Markov Kernels and the Markov Category

Given measurable spaces (X, \mathcal A), (Y, \mathcal B) we consider a Markov kernel \kappa: \mathcal B \times X \to ,1/math> as a morphism \kappa: X \to Y. Intuitively, rather than assigning to each x \in X a sharply defined point y \in Y the kernel assigns a "fuzzy" point in Y which is only known with some level of uncertainty, much like actual physical measurements. If we have a third measurable space (Z, \mathcal C), and probability kernels \kappa: X \to Y and \lambda: Y \to Z, we can define a composition \lambda \circ \kappa : X \to Z by :(\lambda \circ \kappa) (dz, x) = \int_Y \lambda(dz , y)\kappa(dy, x). The composition is associative by the Monotone Convergence Theorem and the identity function considered as a Markov kernel (i.e. the delta measure \kappa_(dx', x) = \delta_x(dx')) is the unit for this composition. This composition defines the structure of a
category Category, plural categories, may refer to: Philosophy and general uses * Categorization, categories in cognitive science, information science and generally *Category of being * ''Categories'' (Aristotle) *Category (Kant) *Categories (Peirce) * ...
on the measurable spaces with Markov kernels as morphisms first defined by Lawvere. The category has the empty set as initial object and the one point set * as the terminal object. From this point of view a probability space (\Omega, \mathcal A, \mathbb P) is the same thing as a pointed space * \to \Omega in the
Markov category Markov (Bulgarian, russian: Марков), Markova, and Markoff are common surnames used in Russia and Bulgaria. Notable people with the name include: Academics *Ivana Markova (born 1938), Czechoslovak-British emeritus professor of psychology at t ...
.


Probability Space defined by Probability Distribution and a Markov Kernel

A composition of a probability space (X, \mathcal A, P_X) and a probability kernel \kappa: (X, \mathcal A) \to (Y, \mathcal B) defines a probability space (Y, \mathcal B, P_Y = \kappa \circ P_X), where the probability measure is given by : P_Y(B) = \int_X \int_B \kappa(dy, x) P_X(dx) = \int_X \kappa(B, x)P_X(dx) = \mathbb_\kappa(B, \cdot).


Properties


Semidirect product

Let (X, \mathcal A, P) be a probability space and \kappa a Markov kernel from (X, \mathcal A) to some (Y, \mathcal B). Then there exists a unique measure Q on (X \times Y, \mathcal A \otimes \mathcal B), such that: : Q(A \times B) = \int_A \kappa(B, x)\,P(dx), \quad \forall A \in \mathcal A, \quad \forall B \in\mathcal B.


Regular conditional distribution

Let (S,Y) be a Borel space, X a (S,Y)-valued random variable on the measure space (\Omega, \mathcal, P) and \mathcal G \subseteq \mathcal F a sub-\sigma-algebra. Then there exists a Markov kernel \kappa from (\Omega, \mathcal G) to (S,Y), such that \kappa(\cdot,B) is a version of the
conditional expectation In probability theory, the conditional expectation, conditional expected value, or conditional mean of a random variable is its expected value – the value it would take “on average” over an arbitrarily large number of occurrences – give ...
\mathbb mathbf 1_ \mid \mathcal G/math> for every B \in Y, i.e. :P(X \in B\mid\mathcal G)=\mathbb \left mathbf 1_\mid\mathcal G \right = \kappa(\cdot,B), \qquad P\text\,\, \forall B \in \mathcal G. It is called regular conditional distribution of X given \mathcal G and is not uniquely defined.


Generalizations

Transition kernel In the mathematics of probability, a transition kernel or kernel is a function in mathematics that has different applications. Kernels can for example be used to define random measures or stochastic processes. The most important example of kernels ...
s generalize Markov kernels in the sense that for all x \in X, the map : B \mapsto \kappa(B, x) can be any type of (non negative) measure, not necessarily a probability measure.


References

* {{citation, first1=Heinz, last1=Bauer, title=Probability Theory, publisher=de Gruyter, year=1996, isbn=3-11-013935-9 :: §36. Kernels and semigroups of kernels Markov processes