Entropy rate
In the mathematical theory of probability, the entropy rate or source information rate of a stochastic process is, informally, the time density of the average information in the process. For stochastic processes with a countable index, the entropy rate H(X) is the limit of the joint entropy of n members of the process X_k divided by n, as n tends to infinity:

:H(X) = \lim_{n \to \infty} \frac{1}{n} H(X_1, X_2, \dots, X_n)

when the limit exists. An alternative, related quantity is:

:H'(X) = \lim_{n \to \infty} H(X_n \mid X_{n-1}, X_{n-2}, \dots, X_1)

For strongly stationary stochastic processes, H(X) = H'(X). The entropy rate can be thought of as a general property of stochastic sources; this is the asymptotic equipartition property. The entropy rate may be used to estimate the complexity of stochastic processes. It is used in diverse applications, ranging from characterizing the complexity of languages and blind source separation to optimizing quantizers and data compression algorithms. For example, a maximum entropy rate criterion may be used for feature selection in machine learning.
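
The defining limit can be checked numerically. The following is a minimal sketch (not part of the article; the Bernoulli parameter p = 0.3 and the sample length are arbitrary choices) that estimates the per-symbol block entropies H(X_1, ..., X_n)/n of a simulated i.i.d. binary source and compares them with the single-symbol entropy, which for an i.i.d. process equals the entropy rate:

 import math
 import random
 from collections import Counter
 
 def block_entropy(seq, n):
     """Empirical joint entropy, in bits, of the length-n blocks of seq."""
     blocks = [tuple(seq[i:i + n]) for i in range(len(seq) - n + 1)]
     total = len(blocks)
     return -sum((c / total) * math.log2(c / total)
                 for c in Counter(blocks).values())
 
 random.seed(0)
 p = 0.3  # P(X = 1) for the simulated i.i.d. Bernoulli source
 sample = [1 if random.random() < p else 0 for _ in range(200_000)]
 
 # H(X_1, ..., X_n)/n should stay close to the single-symbol entropy.
 for n in (1, 2, 4, 8):
     print(f"H_{n}/{n} = {block_entropy(sample, n) / n:.4f}")
 print(f"single-symbol entropy = {-(p * math.log2(p) + (1 - p) * math.log2(1 - p)):.4f}")

For a stationary process with memory, such as a Markov chain, the same estimates decrease with n toward the entropy rate rather than staying flat.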


Entropy rates for Markov chains

Since a stochastic process defined by a Markov chain that is irreducible, aperiodic and positive recurrent has a stationary distribution, the entropy rate is independent of the initial distribution. For example, for such a Markov chain Y_k defined on a countable number of states, given the transition matrix P_{ij}, H(Y) is given by:

:H(Y) = -\sum_{ij} \mu_i P_{ij} \log P_{ij}

where \mu_i is the asymptotic distribution of the chain. A simple consequence of this definition is that an i.i.d. stochastic process has an entropy rate that is the same as the entropy of any individual member of the process: independence gives H(X_1, X_2, \dots, X_n) = n H(X_1), so the limit of the joint entropy divided by n is simply H(X_1).
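
The formula is straightforward to evaluate. Below is a minimal sketch (the two-state transition matrix is a made-up example, not from the article) that obtains the stationary distribution \mu as the left eigenvector of the transition matrix for eigenvalue 1 and then applies the formula above, working in bits:

 import numpy as np
 
 # Hypothetical transition matrix: P[i, j] = Pr(next state j | current state i).
 P = np.array([[0.9, 0.1],
               [0.4, 0.6]])
 
 # The stationary distribution mu solves mu P = mu: it is the left
 # eigenvector of P (eigenvector of P.T) for eigenvalue 1, normalized.
 eigvals, eigvecs = np.linalg.eig(P.T)
 mu = np.real(eigvecs[:, np.argmin(np.abs(eigvals - 1))])
 mu /= mu.sum()
 
 # H(Y) = -sum_ij mu_i P_ij log2 P_ij, with the convention 0 log 0 = 0.
 H = -sum(mu[i] * P[i, j] * np.log2(P[i, j])
          for i in range(P.shape[0])
          for j in range(P.shape[1])
          if P[i, j] > 0)
 
 print("stationary distribution:", mu)          # [0.8, 0.2] for this matrix
 print(f"entropy rate: {H:.4f} bits per step")  # ~0.569 for this matrix

Each term of the sum weights the uncertainty of the next transition out of state i by how often the chain occupies state i in the long run, which is why the initial distribution plays no role.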


See also

* Information source (mathematics)
* Markov information source
* Asymptotic equipartition property
* Maximal entropy random walk - chosen to maximize entropy rate


