In probability theory, Bernstein inequalities give bounds on the probability that the sum of random variables deviates from its mean. In the simplest case, let <math>X_1, \ldots, X_n</math> be independent Bernoulli random variables taking values +1 and −1 with probability 1/2 (this distribution is also known as the Rademacher distribution). Then for every positive <math>\varepsilon</math>,
:<math>\mathbb{P}\left(\left|\frac{1}{n}\sum_{i=1}^n X_i\right| > \varepsilon\right) \le 2\exp\left(-\frac{n\varepsilon^2}{2\left(1+\frac{\varepsilon}{3}\right)}\right).</math>
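A quick Monte Carlo check of this bound is straightforward; the Python sketch below (with arbitrary choices of <math>n</math>, <math>\varepsilon</math>, and the number of trials) estimates the left-hand side empirically and compares it with the right-hand side:

<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(0)
n, trials, eps = 100, 50_000, 0.3

# Rademacher samples: +1 and -1, each with probability 1/2
x = rng.choice([-1.0, 1.0], size=(trials, n))
means = x.mean(axis=1)

empirical = np.mean(np.abs(means) > eps)
bound = 2 * np.exp(-n * eps**2 / (2 * (1 + eps / 3)))
print(f"empirical tail: {empirical:.5f}, Bernstein bound: {bound:.5f}")
</syntaxhighlight>

The empirical frequency should come out well below the bound, since the inequality is not tight in this symmetric case.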
Bernstein inequalities were proved and published by Sergei Bernstein in the 1920s and 1930s. [J. V. Uspensky, ''Introduction to Mathematical Probability'', McGraw-Hill Book Company, 1937] Later, these inequalities were rediscovered several times in various forms. Thus, special cases of the Bernstein inequalities are also known as the Chernoff bound, Hoeffding's inequality and Azuma's inequality.
Some of the inequalities
1. Let <math>X_1, \ldots, X_n</math> be independent zero-mean random variables. Suppose that <math>|X_i| \le M</math> almost surely, for all <math>i</math>. Then, for all positive <math>t</math>,
:<math>\mathbb{P}\left(\sum_{i=1}^n X_i \ge t\right) \le \exp\left(-\frac{\tfrac{1}{2}t^2}{\sum_{i=1}^n \mathbb{E}\left[X_i^2\right] + \tfrac{1}{3}Mt}\right).</math>
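As a concrete illustration, take i.i.d. <math>X_i \sim \mathrm{Uniform}(-1, 1)</math>, so that <math>M = 1</math> and <math>\mathbb{E}\left[X_i^2\right] = 1/3</math>; the following sketch (with arbitrary <math>n</math>, <math>t</math>, and trial count) compares the empirical tail with the bound:

<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(1)
n, trials, t = 100, 50_000, 15.0
M = 1.0                        # |X_i| <= M almost surely

# Zero-mean variables bounded by M, with E[X_i^2] = 1/3 each
x = rng.uniform(-1.0, 1.0, size=(trials, n))
sums = x.sum(axis=1)

var_sum = n / 3.0              # sum of E[X_i^2]
empirical = np.mean(sums >= t)
bound = np.exp(-0.5 * t**2 / (var_sum + M * t / 3))
print(f"empirical: {empirical:.5f}, bound: {bound:.5f}")
</syntaxhighlight>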
2. Let <math>X_1, \ldots, X_n</math> be independent zero-mean random variables. Suppose that for some positive real <math>L</math> and every integer <math>k \ge 2</math>,
:<math>\mathbb{E}\left[\left|X_i^k\right|\right] \le \tfrac{1}{2}\,\mathbb{E}\left[X_i^2\right] L^{k-2} k!</math>
Then
:<math>\mathbb{P}\left(\sum_{i=1}^n X_i \ge 2t\sqrt{\sum_{i=1}^n \mathbb{E}\left[X_i^2\right]}\right) < \exp\left(-t^2\right), \qquad \text{for } 0 \le t \le \frac{1}{2L}\sqrt{\sum_{i=1}^n \mathbb{E}\left[X_i^2\right]}.</math>
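For the same <math>\mathrm{Uniform}(-1, 1)</math> example, the moment condition holds with <math>L = 1</math>, since <math>\mathbb{E}\left[\left|X_i\right|^k\right] = \tfrac{1}{k+1}</math> while <math>\tfrac{1}{2}\,\mathbb{E}\left[X_i^2\right]k! = \tfrac{k!}{6}</math>. The sketch below verifies the condition numerically for a few values of <math>k</math> and then checks the tail bound:

<syntaxhighlight lang="python">
import numpy as np
from math import factorial

rng = np.random.default_rng(2)
n, trials, L = 100, 50_000, 1.0

# Moment condition for Uniform(-1,1): E|X|^k = 1/(k+1), E[X^2] = 1/3
for k in range(2, 10):
    assert 1.0 / (k + 1) <= 0.5 * (1.0 / 3.0) * L ** (k - 2) * factorial(k)

x = rng.uniform(-1.0, 1.0, size=(trials, n))
sums = x.sum(axis=1)
sigma = np.sqrt(n / 3.0)       # sqrt of the sum of E[X_i^2]

t = 2.0                        # must satisfy 0 <= t <= sigma / (2 L)
empirical = np.mean(sums >= 2 * t * sigma)
print(f"empirical: {empirical:.5f}, bound exp(-t^2): {np.exp(-t**2):.5f}")
</syntaxhighlight>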
3. Let <math>X_1, \ldots, X_n</math> be independent zero-mean random variables. Suppose that
:<math>\mathbb{E}\left[\left|X_i^k\right|\right] \le \frac{k!}{4!}\left(\frac{L}{5}\right)^{k-4}</math>
for all integer <math>k \ge 4</math>. Denote
:<math>A_k = \sum_{i=1}^n \mathbb{E}\left[X_i^k\right].</math>
Then,
:<math>\mathbb{P}\left(\left|\sum_{i=1}^n X_i - \frac{A_3 t^2}{3 A_2}\right| \ge \sqrt{2 A_2}\,t\left[1 + \frac{A_4 t^2}{6 A_2^2}\right]\right) < 2\exp\left(-t^2\right), \qquad \text{for } 0 < t \le \frac{5\sqrt{2 A_2}}{4L}.</math>
4. Bernstein also proved generalizations of the inequalities above to weakly dependent random variables. For example, inequality (2) can be extended as follows. Let <math>X_1, \ldots, X_n</math> be possibly non-independent random variables. Suppose that for all integers <math>i > 0</math>,
:<math>\begin{align}
\mathbb{E}\left[X_i \mid X_1, \ldots, X_{i-1}\right] &= 0, \\
\mathbb{E}\left[X_i^2 \mid X_1, \ldots, X_{i-1}\right] &\le R_i\,\mathbb{E}\left[X_i^2\right], \\
\mathbb{E}\left[X_i^k \mid X_1, \ldots, X_{i-1}\right] &\le \tfrac{1}{2}\,\mathbb{E}\left[X_i^2 \mid X_1, \ldots, X_{i-1}\right] L^{k-2} k!
\end{align}</math>
Then
:<math>\mathbb{P}\left(\sum_{i=1}^n X_i \ge 2t\sqrt{\sum_{i=1}^n R_i\,\mathbb{E}\left[X_i^2\right]}\right) < \exp\left(-t^2\right), \qquad \text{for } 0 < t \le \frac{1}{2L}\sqrt{\sum_{i=1}^n R_i\,\mathbb{E}\left[X_i^2\right]}.</math>
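One concrete sequence satisfying these conditions (with <math>L = 1</math>, since the increments are bounded by 1 in absolute value) multiplies a Rademacher sign by a coefficient that depends only on the past partial sum. The sketch below simulates it and checks the tail bound, using the crude but valid substitution of the almost-sure upper bound 1 for each <math>R_i\,\mathbb{E}\left[X_i^2\right]</math>:

<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(3)
n, trials, t = 100, 50_000, 2.0

# Martingale differences X_i = eps_i * c_i: eps_i is Rademacher and c_i is a
# predictable coefficient in (0, 1), so E[X_i | past] = 0 and
# E[X_i^2 | past] = c_i^2 <= 1.
eps = rng.choice([-1.0, 1.0], size=(trials, n))
S = np.zeros(trials)
for i in range(n):
    c = 0.5 + 0.5 * np.tanh(S)
    S += eps[:, i] * c

# Replace each R_i E[X_i^2] by the almost-sure bound 1 on the conditional
# variance; the threshold 2 t sqrt(n) is then at least as large as required.
empirical = np.mean(S >= 2 * t * np.sqrt(n))
print(f"empirical: {empirical:.5f}, bound exp(-t^2): {np.exp(-t**2):.5f}")
</syntaxhighlight>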
More general results for martingales can be found in Fan et al. (2015).
Proofs
The proofs are based on an application of Markov's inequality to the random variable
:<math>\exp\left(\lambda \sum_{j=1}^n X_j\right),</math>
for a suitable choice of the parameter <math>\lambda > 0</math>.
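Explicitly, for any <math>\lambda > 0</math>,
:<math>\mathbb{P}\left(\sum_{j=1}^n X_j \ge t\right) = \mathbb{P}\left(e^{\lambda \sum_{j=1}^n X_j} \ge e^{\lambda t}\right) \le e^{-\lambda t}\,\mathbb{E}\left[e^{\lambda \sum_{j=1}^n X_j}\right] = e^{-\lambda t}\prod_{j=1}^n \mathbb{E}\left[e^{\lambda X_j}\right],</math>
where the last equality uses independence. Each factor <math>\mathbb{E}\left[e^{\lambda X_j}\right]</math> is then bounded using the moment assumptions, and minimizing the resulting expression over <math>\lambda</math> yields the stated inequalities.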
Generalizations
The Bernstein inequality can be generalized to Gaussian random matrices. Let <math>G = g^H A g + 2\,\mathrm{Re}\left(g^H a\right)</math> be a scalar, where <math>A</math> is a complex Hermitian matrix and <math>a</math> is a complex vector of size <math>N</math>. The vector <math>g \sim \mathcal{CN}(0, I_N)</math> is a complex Gaussian vector of size <math>N</math>. Then for any <math>\sigma \ge 0</math>, we have
:<math>\mathbb{P}\left(G \le \mathrm{tr}(A) - \sqrt{2\sigma}\sqrt{\left\|\mathrm{vec}(A)\right\|^2 + 2\left\|a\right\|^2} - \sigma\, s^{-}(A)\right) \le \exp(-\sigma),</math>
where <math>\mathrm{vec}</math> is the vectorization operation. Also, <math>s^{-}(A) = \max\left(\lambda_{\max}(-A), 0\right)</math>, and <math>\lambda_{\max}(A)</math> is the maximum eigenvalue of <math>A</math>. The proof is detailed in the references.
Another similar inequality is formulated as
:<math>\mathbb{P}\left(G \ge \mathrm{tr}(A) + \sqrt{2\sigma}\sqrt{\left\|\mathrm{vec}(A)\right\|^2 + 2\left\|a\right\|^2} + \sigma\, s^{+}(A)\right) \le \exp(-\sigma),</math>
where <math>s^{+}(A) = \max\left(\lambda_{\max}(A), 0\right)</math> and <math>\lambda_{\max}(A)</math> is the maximum eigenvalue of <math>A</math>.
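Both tail bounds can be checked by direct simulation. The sketch below draws <math>g \sim \mathcal{CN}(0, I_N)</math> for an arbitrary fixed Hermitian <math>A</math> and vector <math>a</math>, uses the identity <math>\left\|\mathrm{vec}(A)\right\| = \left\|A\right\|_F</math>, and estimates the lower-tail probability from the first inequality:

<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(4)
N, trials, sigma = 8, 50_000, 2.0

# Arbitrary fixed Hermitian matrix A and complex vector a
B = rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N))
A = (B + B.conj().T) / 2
a = (rng.standard_normal(N) + 1j * rng.standard_normal(N)) / np.sqrt(2)

# g ~ CN(0, I_N): real and imaginary parts are i.i.d. N(0, 1/2)
g = (rng.standard_normal((trials, N)) + 1j * rng.standard_normal((trials, N))) / np.sqrt(2)
G = np.einsum('ti,ij,tj->t', g.conj(), A, g).real + 2 * (g.conj() @ a).real

s_minus = max(np.max(np.linalg.eigvalsh(-A)), 0.0)   # s^-(A)
threshold = (np.trace(A).real
             - np.sqrt(2 * sigma) * np.sqrt(np.linalg.norm(A, 'fro')**2
                                            + 2 * np.linalg.norm(a)**2)
             - sigma * s_minus)
empirical = np.mean(G <= threshold)
print(f"empirical: {empirical:.5f}, bound exp(-sigma): {np.exp(-sigma):.5f}")
</syntaxhighlight>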
See also
* Concentration inequality – a summary of tail bounds on random variables.
* Hoeffding's inequality
References
(according to: S. N. Bernstein, ''Collected Works'', Nauka, 1964)
A modern translation of some of these results can also be found in
{{DEFAULTSORT:Bernstein Inequalities (Probability Theory)}}
Probabilistic inequalities