In
probability theory
Probability theory or probability calculus is the branch of mathematics concerned with probability. Although there are several different probability interpretations, probability theory treats the concept in a rigorous mathematical manner by expre ...
, Hoeffding's lemma is an
inequality that bounds the
moment-generating function
In probability theory and statistics, the moment-generating function of a real-valued random variable is an alternative specification of its probability distribution. Thus, it provides the basis of an alternative route to analytical results compare ...
of any
bounded random variable
A random variable (also called random quantity, aleatory variable, or stochastic variable) is a Mathematics, mathematical formalization of a quantity or object which depends on randomness, random events. The term 'random variable' in its mathema ...
, implying that such variables are
subgaussian. It is named after the
Finnish–
American mathematical statistician Wassily Hoeffding
Wassily Hoeffding (June 12, 1914 – February 28, 1991) was a Finnish-born American statistician and probabilist. Hoeffding was one of the founders of nonparametric statistics, in which Hoeffding contributed the idea and basic results on U-stat ...
.
The proof of Hoeffding's lemma uses
Taylor's theorem
In calculus, Taylor's theorem gives an approximation of a k-times differentiable function around a given point by a polynomial of degree k, called the k-th-order Taylor polynomial. For a smooth function, the Taylor polynomial is the truncation a ...
and
Jensen's inequality
In mathematics, Jensen's inequality, named after the Danish mathematician Johan Jensen, relates the value of a convex function of an integral to the integral of the convex function. It was proved by Jensen in 1906, building on an earlier p ...
. Hoeffding's lemma is itself used in the proof of
Hoeffding's inequality as well as the generalization
McDiarmid's inequality.
Statement
Let ''X'' be any real-valued random variable such that
almost surely
In probability theory, an event is said to happen almost surely (sometimes abbreviated as a.s.) if it happens with probability 1 (with respect to the probability measure). In other words, the set of outcomes on which the event does not occur ha ...
, i.e. with probability one. Then, for all
,
:
or equivalently,
:
Proof
The following proof is direct but somewhat ad-hoc. Another proof with a slightly worse constant are also available using symmetrization.
Statement
This statement and proof uses the language of
subgaussian variables and
exponential tilting, and is less ad-hoc.
Let
be any real-valued random variable such that
almost surely
In probability theory, an event is said to happen almost surely (sometimes abbreviated as a.s.) if it happens with probability 1 (with respect to the probability measure). In other words, the set of outcomes on which the event does not occur ha ...
, i.e. with probability one. Then it is subgaussian with
variance proxy norm
.
Given this general case, the formula
is a mere corollary of a general property of variance proxy.
See also
*
Hoeffding's inequality
*
Bennett's inequality
Notes
Probabilistic inequalities
{{probability-stub