In probability theory, a Chernoff bound is an exponentially decreasing upper bound on the tail of a random variable based on its moment generating function. The minimum of all such exponential bounds forms ''the'' Chernoff or Chernoff–Cramér bound, which may decay faster than exponential (e.g. sub-Gaussian). It is especially useful for sums of independent random variables, such as sums of Bernoulli random variables.
The bound is commonly named after Herman Chernoff, who described the method in a 1952 paper, though Chernoff himself attributed it to Herman Rubin. In 1938 Harald Cramér had published an almost identical concept now known as Cramér's theorem.
It is a sharper bound than the first- or second-moment-based tail bounds such as Markov's inequality or Chebyshev's inequality, which only yield power-law bounds on tail decay. However, when applied to sums the Chernoff bound requires the random variables to be independent, a condition that is not required by either Markov's inequality or Chebyshev's inequality.
The Chernoff bound is related to the Bernstein inequalities. It is also used to prove Hoeffding's inequality, Bennett's inequality, and McDiarmid's inequality.
Generic Chernoff bounds

The generic Chernoff bound for a random variable <math>X</math> is attained by applying Markov's inequality to <math>e^{tX}</math> (which is why it is sometimes called the ''exponential Markov'' or ''exponential moments'' bound). For positive <math>t</math> this gives a bound on the right tail of <math>X</math> in terms of its moment-generating function <math>M(t) = \operatorname{E}\!\left[e^{tX}\right]</math>:

: <math>\Pr(X \ge a) = \Pr\!\left(e^{tX} \ge e^{ta}\right) \le \frac{\operatorname{E}\!\left[e^{tX}\right]}{e^{ta}} = M(t)\, e^{-ta} \qquad (t > 0).</math>

Since this bound holds for every positive <math>t</math>, we may take the infimum:

: <math>\Pr(X \ge a) \le \inf_{t > 0} M(t)\, e^{-ta}.</math>
Performing the same analysis with negative <math>t</math> we get a similar bound on the left tail:

: <math>\Pr(X \le a) = \Pr\!\left(e^{tX} \ge e^{ta}\right) \le \frac{\operatorname{E}\!\left[e^{tX}\right]}{e^{ta}} = M(t)\, e^{-ta} \qquad (t < 0)</math>

and

: <math>\Pr(X \le a) \le \inf_{t < 0} M(t)\, e^{-ta}.</math>

The quantity <math>M(t)\, e^{-ta}</math> can be expressed as the expected value <math>\operatorname{E}\!\left[e^{tX}\right] e^{-ta}</math>, or equivalently <math>\operatorname{E}\!\left[e^{t(X - a)}\right]</math>.
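As a minimal numerical sketch (the helper name is illustrative and SciPy's one-dimensional minimizer is assumed), the infimum can be evaluated for a standard normal variable, working with its cumulant generating function <math>\log M(t) = t^2/2</math> to keep the computation stable:

<syntaxhighlight lang="python">
import numpy as np
from scipy.optimize import minimize_scalar

# Illustrative sketch: evaluate the generic Chernoff bound
#   P(X >= a) <= inf_{t > 0} M(t) * exp(-t * a)
# for a standard normal variable, using K(t) = log M(t) = t**2 / 2.

def chernoff_right_tail_bound(a, log_mgf, t_max=50.0):
    """Numerically minimize K(t) - t*a over t in (0, t_max] and exponentiate."""
    result = minimize_scalar(lambda t: log_mgf(t) - t * a,
                             bounds=(1e-9, t_max), method="bounded")
    return np.exp(result.fun)

normal_log_mgf = lambda t: t ** 2 / 2   # CGF of the standard normal

a = 2.0
print(chernoff_right_tail_bound(a, normal_log_mgf))   # ~0.1353 = exp(-a**2 / 2), optimum at t = a
</syntaxhighlight>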
Properties
The exponential function is convex, so by Jensen's inequality <math>\operatorname{E}\!\left[e^{tX}\right] \ge e^{t \operatorname{E}[X]}</math>. It follows that the bound on the right tail is greater or equal to one when <math>a \le \operatorname{E}[X]</math>, and therefore trivial; similarly, the left bound is trivial for <math>a \ge \operatorname{E}[X]</math>. We may therefore combine the two infima and define the two-sided Chernoff bound:

: <math>C(a) = \inf_{t} M(t)\, e^{-ta},</math>

which provides an upper bound on the folded cumulative distribution function of <math>X</math> (folded at the mean, not the median).
The logarithm of the two-sided Chernoff bound is known as the rate function (or ''Cramér transform'') <math>I = -\log C</math>. It is equivalent to the Legendre–Fenchel transform or convex conjugate of the cumulant generating function <math>K</math>, defined as:

: <math>K(t) = \log M(t) = \log \operatorname{E}\!\left[e^{tX}\right].</math>
The moment generating function is log-convex, so by a property of the convex conjugate, the Chernoff bound must be log-concave. The Chernoff bound attains its maximum at the mean, <math>C(\operatorname{E}[X]) = 1</math>, and is invariant under translation: <math>C_{X + k}(a) = C_X(a - k)</math>.
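For example, for a standard normal random variable the cumulant generating function is <math>K(t) = t^2/2</math>, so the rate function is its convex conjugate <math>I(a) = \sup_t \left( at - t^2/2 \right) = a^2/2</math>, and the two-sided Chernoff bound is <math>C(a) = e^{-a^2/2}</math>, which indeed attains its maximum <math>C(0) = 1</math> at the mean.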
The Chernoff bound is exact if and only if <math>X</math> is a single concentrated mass (degenerate distribution). The bound is tight only at or beyond the extremes of a bounded random variable, where the infima are attained for infinite <math>t</math>. For unbounded random variables the bound is nowhere tight, though it is asymptotically tight up to sub-exponential factors ("exponentially tight"). Individual moments can provide tighter bounds, at the cost of greater analytical complexity.
In practice, the exact Chernoff bound may be unwieldy or difficult to evaluate analytically, in which case a suitable upper bound on the moment (or cumulant) generating function may be used instead (e.g. a sub-parabolic CGF giving a sub-Gaussian Chernoff bound).
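For example, if the cumulant generating function of <math>X - \operatorname{E}[X]</math> satisfies <math>K(t) \le \sigma^2 t^2 / 2</math> for all <math>t</math>, then for every <math>a > 0</math>

: <math>\Pr\left(X - \operatorname{E}[X] \ge a\right) \le \inf_{t > 0} e^{\sigma^2 t^2 / 2 - ta} = e^{-a^2 / (2\sigma^2)},</math>

which is the sub-Gaussian tail bound mentioned above.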
Bounds from below from the MGF
Using only the moment generating function, a bound from below on the tail probabilities can be obtained by applying the Paley–Zygmund inequality to <math>e^{tX}</math>, yielding, for any <math>t > 0</math> with <math>e^{ta} \le M(t)</math>:

: <math>\Pr(X > a) \ge \left(1 - \frac{e^{ta}}{M(t)}\right)^{\!2} \frac{M(t)^2}{M(2t)}</math>

(a bound on the left tail is obtained for negative <math>t</math>). Unlike the Chernoff bound, however, this result is not exponentially tight.
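A minimal numerical sketch (with the same illustrative assumptions as above) maximizes this lower bound over the valid range of <math>t</math> for a standard normal variable and compares it with the corresponding Chernoff upper bound <math>e^{-a^2/2}</math>:

<syntaxhighlight lang="python">
import numpy as np
from scipy.optimize import minimize_scalar

# Illustrative sketch: Paley-Zygmund-based lower bound on P(X > a) for a
# standard normal variable, maximized over the t for which the bound is
# non-vacuous (e^{ta} <= M(t)), and compared with the Chernoff upper bound.

def pz_lower_bound_normal(a, t_max=20.0):
    """Maximize (1 - e^{ta}/M(t))**2 * M(t)**2 / M(2t) over t > 0 for X ~ N(0, 1)."""
    def neg_bound(t):
        log_ratio = t * a - t ** 2 / 2        # log(e^{ta} / M(t)) with M(t) = exp(t**2 / 2)
        if log_ratio >= 0.0:                  # e^{ta} > M(t): bound is vacuous here
            return 0.0
        return -((1.0 - np.exp(log_ratio)) ** 2 * np.exp(-t ** 2))   # M(t)**2 / M(2t) = exp(-t**2)
    result = minimize_scalar(neg_bound, bounds=(1e-9, t_max), method="bounded")
    return -result.fun

a = 1.0
print(pz_lower_bound_normal(a))   # ~4.6e-4, a lower bound on P(X > 1) ~ 0.159
print(np.exp(-a ** 2 / 2))        # ~0.61, the Chernoff upper bound
</syntaxhighlight>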
Theodosopoulos constructed a tight(er) MGF-based bound from below using an exponential tilting procedure.
For particular distributions (such as the binomial) bounds from below of the same exponential order as the Chernoff bound are often available.
Sums of independent random variables
When <math>X</math> is the sum of independent random variables <math>X_1, \dots, X_n</math>, the moment generating function of <math>X</math> is the product of the individual moment generating functions, giving that:

: <math>\Pr(X \ge a) \le \inf_{t > 0} \frac{\operatorname{E}\!\left[\prod_i e^{t X_i}\right]}{e^{ta}} = \inf_{t > 0} e^{-ta} \prod_i \operatorname{E}\!\left[e^{t X_i}\right]</math>

and:

: <math>\Pr(X \le a) \le \inf_{t < 0} e^{-ta} \prod_i \operatorname{E}\!\left[e^{t X_i}\right].</math>
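As a minimal numerical sketch (again assuming SciPy's one-dimensional minimizer; the function name is illustrative), the bound for a sum of independent Bernoulli variables can be computed by minimizing the logarithm of the right-hand side:

<syntaxhighlight lang="python">
import numpy as np
from scipy.optimize import minimize_scalar

# Illustrative sketch: Chernoff bound on the right tail of a sum of n
# independent Bernoulli(p) variables.  Each term has MGF 1 - p + p*exp(t),
# so the sum's log-MGF is n * log(1 - p + p*exp(t)).

def chernoff_bound_bernoulli_sum(n, p, a, t_max=50.0):
    """Upper bound on P(X_1 + ... + X_n >= a) with X_i ~ Bernoulli(p), independent."""
    def objective(t):
        # log of ( prod_i E[e^{t X_i}] ) * e^{-t a}; minimizing the log avoids overflow
        return n * np.log(1 - p + p * np.exp(t)) - t * a
    result = minimize_scalar(objective, bounds=(1e-9, t_max), method="bounded")
    return np.exp(result.fun)

# Example: 100 fair coin flips; bound the probability of at least 70 heads.
print(chernoff_bound_bernoulli_sum(n=100, p=0.5, a=70))   # ~2.7e-4
</syntaxhighlight>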