In mathematics, specifically in large deviations theory, a rate function is a function used to quantify the probabilities of rare events. It is required to have several properties which assist in the formulation of the large deviation principle. In some sense, the large deviation principle is an analogue of weak convergence of probability measures, but one which takes account of how well the rare events behave.
A rate function is also called a Cramér function, after the Swedish probabilist Harald Cramér.
Definitions
Rate function
An extended real-valued function ''I'' : ''X'' → [0, +∞] defined on a Hausdorff topological space ''X'' is said to be a rate function if it is not identically +∞ and is lower semi-continuous, i.e. all the sub-level sets
: $\{ x \in X \mid I(x) \leq c \} \quad \text{for } c \geq 0$
are closed in ''X''. If, furthermore, they are compact, then ''I'' is said to be a good rate function.
A family of probability measures $(\mu_\delta)_{\delta > 0}$ on ''X'' is said to satisfy the large deviation principle with rate function ''I'' : ''X'' → [0, +∞] (and rate 1 ⁄ ''δ'') if, for every closed set ''F'' ⊆ ''X'' and every open set ''G'' ⊆ ''X'',
: $\limsup_{\delta \downarrow 0} \delta \log \mu_\delta(F) \leq -\inf_{x \in F} I(x), \qquad \text{(U)}$
: $\liminf_{\delta \downarrow 0} \delta \log \mu_\delta(G) \geq -\inf_{x \in G} I(x). \qquad \text{(L)}$
If the upper bound (U) holds only for compact (instead of closed) sets ''F'', then $(\mu_\delta)_{\delta > 0}$ is said to satisfy the weak large deviation principle (with rate 1 ⁄ ''δ'' and weak rate function ''I'').
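As a rough numerical illustration of the definition, the following sketch checks the scaling in the large deviation bounds for one concrete family. The choice of Gaussian measures N(0, ''δ'') on the real line, the candidate rate function ''I''(''x'') = ''x''²/2, and the function names `rate` and `scaled_log_tail` are assumptions made for this example and do not come from the article. For the closed half-line [''a'', +∞) with ''a'' > 0, the infimum of ''I'' equals ''a''²/2, so ''δ'' log ''μ''(''[a'', +∞)) should approach −''a''²/2 as ''δ'' decreases.

```python
import math

def rate(x):
    """Assumed rate function I(x) = x^2 / 2 for the family mu_delta = N(0, delta)."""
    return 0.5 * x * x

def scaled_log_tail(a, delta):
    """Return delta * log mu_delta([a, +inf)) for mu_delta = N(0, delta)."""
    # For X ~ N(0, delta): P(X >= a) = 0.5 * erfc(a / sqrt(2 * delta)).
    tail = 0.5 * math.erfc(a / math.sqrt(2.0 * delta))
    return delta * math.log(tail)

a = 1.0
for delta in (0.1, 0.01, 0.001):
    # The second column should approach the third as delta shrinks.
    print(delta, scaled_log_tail(a, delta), -rate(a))
```

The values of ''δ'' are kept moderate because the tail probability underflows in double precision for much smaller ''δ''; a log-domain tail evaluation would be needed to go further.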
Remarks
The role of the open and closed sets in the large deviation principle is similar to their role in the weak convergence of probability measures: recall that $(\mu_\delta)_{\delta > 0}$ is said to converge weakly to ''μ'' if, for every closed set ''F'' ⊆ ''X'' and every open set ''G'' ⊆ ''X'',
: $\limsup_{\delta \downarrow 0} \mu_\delta(F) \leq \mu(F),$
: $\liminf_{\delta \downarrow 0} \mu_\delta(G) \geq \mu(G).$
There is some variation in the nomenclature used in the literature: for example, den Hollander (2000) uses simply "rate function" where this article, following Dembo & Zeitouni (1998), uses "good rate function" and "weak rate function". Whatever the nomenclature, checking whether the upper bound inequality (U) is required to hold for closed sets or only for compact sets tells one whether the large deviation principle in use is strong or weak.
Properties
Uniqueness
A natural question to ask, given the somewhat abstract setting of the general framework above, is whether the rate function is unique. This turns out to be the case: given a sequence of probability measures $(\mu_\delta)_{\delta > 0}$ on ''X'' satisfying the large deviation principle for two rate functions ''I'' and ''J'', it follows that ''I''(''x'') = ''J''(''x'') for all ''x'' ∈ ''X''.
Exponential tightness
It is possible to convert a weak large deviation principle into a strong one if the measures concentrate on compact sets sufficiently quickly. The sequence of measures $(\mu_\delta)_{\delta > 0}$ is called exponentially tight if, for every ''M'' < +∞, there is a compact set ''K'' ⊆ ''X'' with $\limsup_{\delta \downarrow 0} \delta \log \mu_\delta(X \setminus K) < -M$. If the upper bound holds for compact sets ''F'' and the sequence is exponentially tight, then the upper bound also holds for closed sets ''F''.
Continuity
Naïvely, one might try to replace the two inequalities (U) and (L) by the single requirement that, for all Borel sets ''S'' ⊆ ''X'',
: $\lim_{\delta \downarrow 0} \delta \log \mu_\delta(S) = -\inf_{x \in S} I(x). \qquad \text{(E)}$
The equality (E) is far too restrictive, since many interesting examples satisfy (U) and (L) but not (E). For example, the measure $\mu_\delta$ might be non-atomic for all ''δ'', so the equality (E) could hold for ''S'' = {''x''} only if ''I'' were identically +∞, which is not permitted in the definition. However, the inequalities (U) and (L) do imply the equality (E) for so-called ''I''-continuous sets ''S'' ⊆ ''X'', those for which
: $\inf_{x \in S^\circ} I(x) = \inf_{x \in \bar{S}} I(x),$
where $S^\circ$ and $\bar{S}$ denote the interior and closure of ''S'' in ''X'' respectively. In many examples, the sets (events) of interest are ''I''-continuous. For example, if ''I'' is a continuous function, then all sets ''S'' such that
: $S \subseteq \overline{S^\circ}$
are ''I''-continuous; all open sets, for example, satisfy this containment.
Transformation of large deviation principles
Given a large deviation principle on one space, it is often of interest to be able to construct a large deviation principle on another space. There are several results in this area:
* the contraction principle tells one how a large deviation principle on one space "pushes forward" (via the pushforward of a probability measure) to a large deviation principle on another space ''via'' a continuous function;
* the Dawson-Gärtner theorem tells one how a sequence of large deviation principles on a sequence of spaces passes to the projective limit;
* the tilted large deviation principle gives a large deviation principle for integrals of exponential functionals;
* exponentially equivalent measures have the same large deviation principles.
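The contraction principle in the list above can be illustrated with a small numerical sketch. Everything specific here is an assumption chosen for the example, not something stated in the article: the family N(0, ''δ'') on the real line with good rate function ''I''(''x'') = ''x''²/2, the continuous map ''f''(''x'') = ''x''², and the helper names `rate_I`, `contracted_rate` and `scaled_log_prob`. Under these assumptions the pushed-forward rate function is ''J''(''y'') = inf{''I''(''x'') : ''f''(''x'') = ''y''}, which equals ''y''/2 for ''y'' ≥ 0 and +∞ otherwise.

```python
import math

def rate_I(x):
    """Assumed good rate function I(x) = x^2 / 2 for mu_delta = N(0, delta)."""
    return 0.5 * x * x

def contracted_rate(y):
    """J(y) = inf{ I(x) : x^2 = y }; the preimage of y >= 0 is {+sqrt(y), -sqrt(y)}."""
    if y < 0:
        return math.inf  # empty preimage: infimum over the empty set
    r = math.sqrt(y)
    return min(rate_I(r), rate_I(-r))

def scaled_log_prob(b, delta):
    """Return delta * log P(X^2 >= b) for X ~ N(0, delta) and b > 0."""
    # P(X^2 >= b) = P(|X| >= sqrt(b)) = erfc(sqrt(b / (2 * delta))).
    return delta * math.log(math.erfc(math.sqrt(b / (2.0 * delta))))

b = 1.0
for delta in (0.1, 0.01, 0.001):
    # J is increasing on [0, +inf), so its infimum over [b, +inf) is J(b).
    print(delta, scaled_log_prob(b, delta), -contracted_rate(b))
```

The printed values suggest that the image measures obey the same exponential scaling with the contracted rate ''J'', which is what the contraction principle predicts for this particular map.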
History and basic development
The notion of a rate function emerged in the 1930s with the Swedish mathematician Harald Cramér's study of a sequence of i.i.d. random variables $(Z_i)_{i \in \mathbb{N}}$. Namely, among some considerations of scaling, Cramér studied the behavior of the distribution of the average $X_n = \frac{1}{n} \sum_{i=1}^n Z_i$ as ''n''→∞.
He found that the tails of the distribution of $X_n$ decay exponentially as $e^{-n\lambda(x)}$, where the factor ''λ''(''x'') in the exponent is the Legendre–Fenchel transform (a.k.a. the convex conjugate) of the cumulant-generating function $\Psi_Z(t) = \log \operatorname{E} e^{tZ}$.
For this reason this particular function ''λ''(''x'') is sometimes called the Cramér function. The rate function defined above in this article is a broad generalization of this notion of Cramér's, defined more abstractly on a probability space, rather than the state space of a random variable.
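To make the Cramér function concrete, the sketch below computes the Legendre–Fenchel transform of a cumulant-generating function numerically and compares it with an exact tail probability. The fair coin (Bernoulli(1/2)) example, the ternary-search approach, and the function names are all assumptions made for illustration; they are not taken from Cramér's work or from the article. For a fair coin the transform has the closed form ''λ''(''x'') = log 2 + ''x'' log ''x'' + (1 − ''x'') log(1 − ''x'') for 0 < ''x'' < 1.

```python
import math

def cgf(t):
    """Cumulant-generating function log E[e^{tZ}] for Z ~ Bernoulli(1/2)."""
    return math.log(0.5 * (1.0 + math.exp(t)))

def legendre_fenchel(x, lo=-30.0, hi=30.0, iters=200):
    """Approximate sup_t (t*x - cgf(t)) by ternary search; the objective is concave in t."""
    def objective(t):
        return t * x - cgf(t)
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if objective(m1) < objective(m2):
            lo = m1
        else:
            hi = m2
    return objective(0.5 * (lo + hi))

def closed_form(x):
    """Closed-form Cramér function of a fair coin, valid for 0 < x < 1."""
    return math.log(2.0) + x * math.log(x) + (1.0 - x) * math.log(1.0 - x)

def empirical_rate(x, n):
    """Return -(1/n) * log P(X_n >= x), computed exactly from binomial counts."""
    k0 = math.ceil(n * x)
    count = sum(math.comb(n, k) for k in range(k0, n + 1))
    return -(math.log(count) - n * math.log(2.0)) / n

x = 0.75
print(legendre_fenchel(x), closed_form(x))
for n in (100, 400, 2000):
    # Should decrease toward the Cramér function value as n grows.
    print(n, empirical_rate(x, n))
```

The exact tail rate stays above ''λ''(''x'') for every ''n'' (this is the Chernoff bound) and the gap shrinks roughly like (log ''n'')/''n''. The search interval [−30, 30] is an arbitrary choice that suffices here because the maximiser is log(''x''/(1 − ''x'')).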
See also
* Extreme value theory
References
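* Dembo, Amir; Zeitouni, Ofer (1998). ''Large deviations techniques and applications''. New York: Springer-Verlag.
* den Hollander, Frank (2000). ''Large deviations''. Fields Institute Monographs 14. Providence, RI: American Mathematical Society. MR 1739680.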