In probability theory, Hoeffding's inequality provides an upper bound on the probability that the sum of bounded independent random variables deviates from its expected value by more than a certain amount. Hoeffding's inequality was proven by Wassily Hoeffding in 1963.
Hoeffding's inequality is a special case of the Azuma–Hoeffding inequality and McDiarmid's inequality. It is similar to the Chernoff bound, but tends to be less sharp, in particular when the variance of the random variables is small. It is similar to, but incomparable with, one of Bernstein's inequalities.
Statement
Let X_1, \ldots, X_n be independent random variables such that a_i \le X_i \le b_i almost surely. Consider the sum of these random variables,
: S_n = X_1 + \cdots + X_n.
Then Hoeffding's theorem states that, for all t > 0,
: \operatorname{P}\left(S_n - \mathrm{E}\left[S_n\right] \ge t\right) \le \exp\left(-\frac{2t^2}{\sum_{i=1}^n (b_i - a_i)^2}\right)
: \operatorname{P}\left(\left|S_n - \mathrm{E}\left[S_n\right]\right| \ge t\right) \le 2\exp\left(-\frac{2t^2}{\sum_{i=1}^n (b_i - a_i)^2}\right)
Here \mathrm{E}[S_n] is the expected value of S_n.
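As a worked numeric instance (added here for illustration, not part of Hoeffding's original statement): take n = 100 independent fair coin flips, so a_i = 0, b_i = 1, \mathrm{E}[S_n] = 50 and \sum_{i=1}^{100}(b_i - a_i)^2 = 100. For t = 10 the theorem gives
: \operatorname{P}\left(S_{100} - 50 \ge 10\right) \le \exp\left(-\frac{2 \cdot 10^2}{100}\right) = e^{-2} \approx 0.135,
while the exact binomial probability \operatorname{P}(S_{100} \ge 60) is roughly 0.028, so the bound is valid but not tight here.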
Note that the inequalities also hold when the X_i have been obtained using sampling without replacement; in this case the random variables are not independent anymore. A proof of this statement can be found in Hoeffding's paper. For slightly better bounds in the case of sampling without replacement, see for instance the paper by Serfling (1974).
Example
Suppose a_i = 0 and b_i = 1 for all i. This can occur when the X_i are independent Bernoulli random variables, though they need not be identically distributed. Then we get the inequality
: \operatorname{P}\left(S_n - \mathrm{E}\left[S_n\right] \ge t\right) \le \exp\left(-\frac{2t^2}{n}\right)
for all t \ge 0. This is a version of the additive Chernoff bound that is more general, since it allows random variables taking values between zero and one, but also weaker, since the Chernoff bound gives a better tail bound when the random variables have small variance.
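The Bernoulli case is easy to check numerically. The following Python sketch (an illustrative addition; the sample size, number of trials, threshold, and use of NumPy are arbitrary choices, not from the original text) estimates the tail probability by simulation and compares it with the Hoeffding bound.
<syntaxhighlight lang="python">
import numpy as np

# Monte Carlo check of Hoeffding's bound for independent Bernoulli(1/2) variables.
rng = np.random.default_rng(0)
n, trials, t = 100, 100_000, 10.0

samples = rng.integers(0, 2, size=(trials, n))   # each X_i takes values in {0, 1}
deviations = samples.sum(axis=1) - n * 0.5       # S_n - E[S_n] for every trial

empirical = np.mean(deviations >= t)             # estimate of P(S_n - E[S_n] >= t)
bound = np.exp(-2 * t**2 / n)                    # exp(-2 t^2 / n), since sum (b_i - a_i)^2 = n

print(f"empirical tail probability ~ {empirical:.4f}")   # typically around 0.028
print(f"Hoeffding upper bound      = {bound:.4f}")       # exp(-2) ~ 0.1353
</syntaxhighlight>
The simulated tail probability sits well below the bound, consistent with the remark above that the bound is loose when the variance is not taken into account.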
General case of sub-Gaussian random variables
The proof of Hoeffding's inequality can be generalized to any sub-Gaussian distribution. In fact, the main lemma used in the proof,
Hoeffding's lemma, implies that bounded random variables are sub-Gaussian. A random variable X is called sub-Gaussian if
: \operatorname{P}\left(|X| \ge t\right) \le 2 e^{-ct^2} \quad \text{for all } t > 0,
for some c > 0. For a random variable X, the following norm is finite if and only if X is sub-Gaussian:
: \Vert X \Vert_{\psi_2} := \inf\left\{ c \ge 0 : \mathrm{E}\left[ e^{X^2/c^2} \right] \le 2 \right\}.
Then let X_1, \ldots, X_n be zero-mean independent sub-Gaussian random variables; the general version of Hoeffding's inequality states that
: \operatorname{P}\left( \left| \sum_{i=1}^n X_i \right| \ge t \right) \le 2 \exp\left(-\frac{c t^2}{\sum_{i=1}^n \Vert X_i \Vert_{\psi_2}^2}\right),
where c > 0 is an absolute constant.
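As a small worked example (a standard computation added here for illustration, not part of the original text), consider a Rademacher variable X taking the values +1 and -1 with probability 1/2 each. Since X^2 = 1,
: \mathrm{E}\left[ e^{X^2/c^2} \right] = e^{1/c^2} \le 2 \iff c \ge \frac{1}{\sqrt{\ln 2}},
so \Vert X \Vert_{\psi_2} = 1/\sqrt{\ln 2}. For X_1, \ldots, X_n i.i.d. Rademacher variables, the general bound above therefore gives
: \operatorname{P}\left( \left| \sum_{i=1}^n X_i \right| \ge t \right) \le 2 \exp\left(-\frac{c\,(\ln 2)\, t^2}{n}\right),
a Gaussian-type tail in t^2/n, as one would expect for a sum of n bounded, centered variables.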
Proof
The proof of Hoeffding's inequality follows similarly to concentration inequalities like Chernoff bounds.
The main difference is the use of
Hoeffding's Lemma:
::Suppose X is a real random variable such that