In probability theory, the central limit theorem (CLT) establishes that, in many situations, when independent random variables are summed up, their properly normalized sum tends toward a normal distribution even if the original variables themselves are not normally distributed.
The theorem is a key concept in probability theory because it implies that probabilistic and statistical methods that work for normal distributions are applicable to many problems involving other types of distributions.
This theorem has seen many changes during the formal development of probability theory. Previous versions of the theorem date back to 1811, but in its modern general form, this fundamental result in probability theory was precisely stated as late as 1920, thereby serving as a bridge between classical and modern probability theory.
If X_1, X_2, \dots, X_n are random samples drawn from a population with overall mean \mu and finite variance \sigma^2, and if \bar{X}_n is the sample mean of the first n samples, then the limiting form of the distribution,
: Z = \lim_{n\to\infty} \sqrt{n} \left( \frac{\bar{X}_n - \mu}{\sigma} \right),
is a standard normal distribution.
For example, suppose that a sample is obtained containing many observations, each observation being randomly generated in a way that does not depend on the values of the other observations, and that the arithmetic mean of the observed values is computed. If this procedure is performed many times, the central limit theorem says that the probability distribution of the average will closely approximate a normal distribution.
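This repeated-averaging procedure is easy to simulate. The sketch below (Python standard library only; the Uniform(0, 1) source distribution and the sample sizes are arbitrary illustrative choices) draws many samples from a distinctly non-normal distribution and checks that the sample means behave like the normal distribution the CLT predicts.

```python
import math
import random
import statistics

random.seed(0)

n = 400        # observations per sample
trials = 5000  # number of repeated samples

# Uniform(0, 1): mean 1/2, variance 1/12 -- clearly not normal.
means = [statistics.fmean(random.random() for _ in range(n)) for _ in range(trials)]

mu = 0.5
sigma = math.sqrt(1.0 / 12.0 / n)   # CLT prediction for the sd of the sample mean

print(statistics.fmean(means))   # close to 0.5
print(statistics.stdev(means))   # close to sigma
# Fraction of means within one predicted standard deviation (~68% for a normal).
print(sum(abs(m - mu) < sigma for m in means) / trials)
```

The third printout is a crude normality check: for a normal distribution, about 68.3% of the mass lies within one standard deviation of the mean.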
The central limit theorem has several variants. In its common form, the random variables must be independent and identically distributed (i.i.d.). In variants, convergence of the mean to the normal distribution also occurs for non-identical distributions or for non-independent observations, if they comply with certain conditions.
The earliest version of this theorem, that the normal distribution may be used as an approximation to the binomial distribution, is the de Moivre–Laplace theorem.
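As a sketch of that approximation (Python standard library only; the parameters n = 100, p = 0.5 and the cutoff k = 55 are arbitrary illustrative choices), the exact binomial CDF can be compared with the normal CDF after a continuity correction:

```python
import math

def binom_cdf(k, n, p):
    """Exact Pr[X <= k] for X ~ Binomial(n, p), summing the pmf."""
    return sum(math.comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

def normal_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

n, p, k = 100, 0.5, 55
exact = binom_cdf(k, n, p)
# de Moivre-Laplace approximation, with a continuity correction of 0.5
approx = normal_cdf((k + 0.5 - n * p) / math.sqrt(n * p * (1 - p)))

print(exact, approx)   # both close to 0.86
```

The continuity correction (evaluating the normal CDF at k + 0.5 rather than k) accounts for the binomial distribution being discrete.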
Independent sequences
Classical CLT
Let \{X_1, \dots, X_n\} be a sequence of random samples — that is, a sequence of i.i.d. random variables drawn from a distribution of expected value given by \mu and finite variance given by \sigma^2. Suppose we are interested in the sample average
: \bar{X}_n \equiv \frac{X_1 + \cdots + X_n}{n}
of the first n samples.
By the law of large numbers, the sample averages converge almost surely (and therefore also converge in probability) to the expected value \mu as n \to \infty.
The classical central limit theorem describes the size and the distributional form of the stochastic fluctuations around the deterministic number \mu during this convergence. More precisely, it states that as n gets larger, the distribution of the difference between the sample average \bar{X}_n and its limit \mu, when multiplied by the factor \sqrt{n} (that is, \sqrt{n}(\bar{X}_n - \mu)), approximates the normal distribution with mean 0 and variance \sigma^2. For large enough n, the distribution of \bar{X}_n gets arbitrarily close to the normal distribution with mean \mu and variance \sigma^2 / n.

The usefulness of the theorem is that the distribution of \sqrt{n}(\bar{X}_n - \mu) approaches normality regardless of the shape of the distribution of the individual X_i. Formally, the theorem can be stated as follows:

Lindeberg–Lévy CLT. Suppose \{X_1, \dots, X_n\} is a sequence of i.i.d. random variables with \operatorname{E}[X_i] = \mu and \operatorname{Var}[X_i] = \sigma^2 < \infty. Then, as n approaches infinity, the random variables \sqrt{n}(\bar{X}_n - \mu) converge in distribution to a normal N(0, \sigma^2):
: \sqrt{n}\left(\bar{X}_n - \mu\right) \xrightarrow{d} N\left(0, \sigma^2\right).
In the case \sigma > 0, convergence in distribution means that the cumulative distribution functions of \sqrt{n}(\bar{X}_n - \mu) converge pointwise to the cdf of the N(0, \sigma^2) distribution: for every real number z,
: \lim_{n\to\infty} \Pr\left[\sqrt{n}(\bar{X}_n - \mu) \le z\right] = \Phi\left(\frac{z}{\sigma}\right),
where \Phi(x) is the standard normal cdf evaluated at x. The convergence is uniform in z in the sense that
: \lim_{n\to\infty} \sup_{z \in \R} \left| \Pr\left[\sqrt{n}(\bar{X}_n - \mu) \le z\right] - \Phi\left(\frac{z}{\sigma}\right) \right| = 0,
where \sup denotes the least upper bound (or supremum) of the set.
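This uniform convergence can be probed numerically. The sketch below (Python standard library only; the Exponential(1) source distribution, for which \mu = \sigma = 1, and the sample sizes are arbitrary choices) estimates the supremum above, i.e. the Kolmogorov distance between the simulated distribution of \sqrt{n}(\bar{X}_n - \mu) and the limiting normal cdf, and shows it shrinking as n grows.

```python
import math
import random

random.seed(1)

def normal_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def kolmogorov_distance(n, trials=20000):
    """Estimate sup_z |Pr[sqrt(n)(Xbar_n - mu) <= z] - Phi(z / sigma)|
    for Exponential(1) observations, where mu = sigma = 1."""
    zs = sorted(
        math.sqrt(n) * (sum(random.expovariate(1.0) for _ in range(n)) / n - 1.0)
        for _ in range(trials)
    )
    # The empirical CDF jumps at each sorted z; check both sides of each jump.
    return max(
        max(abs((i + 1) / trials - normal_cdf(z)), abs(i / trials - normal_cdf(z)))
        for i, z in enumerate(zs)
    )

distances = {n: kolmogorov_distance(n) for n in (4, 16, 64)}
for n, d in distances.items():
    print(n, round(d, 3))   # the distance shrinks as n grows
```

The decay is consistent with the Berry–Esseen theorem, which bounds this supremum by a constant times 1/\sqrt{n}.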
Lyapunov CLT
The theorem is named after Russian mathematician Aleksandr Lyapunov. In this variant of the central limit theorem the random variables X_i have to be independent, but not necessarily identically distributed. The theorem also requires that the random variables |X_i| have moments of some order (2 + \delta), and that the rate of growth of these moments is limited by the Lyapunov condition given below.

Suppose \{X_1, \dots, X_n\} is a sequence of independent random variables, each with finite expected value \mu_i and variance \sigma_i^2. Define
: s_n^2 = \sum_{i=1}^{n} \sigma_i^2 .
If for some \delta > 0, Lyapunov's condition
: \lim_{n\to\infty} \frac{1}{s_n^{2+\delta}} \sum_{i=1}^{n} \operatorname{E}\left[|X_i - \mu_i|^{2+\delta}\right] = 0
is satisfied, then \frac{1}{s_n} \sum_{i=1}^{n} (X_i - \mu_i) converges in distribution to a standard normal random variable as n goes to infinity.
In practice it is usually easiest to check Lyapunov's condition for \delta = 1. If a sequence of random variables satisfies Lyapunov's condition, then it also satisfies Lindeberg's condition. The converse implication, however, does not hold.
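The first implication has a one-line proof, sketched below: on the event |X_i - \mu_i| > \varepsilon s_n one has 1 \le |X_i - \mu_i|^\delta / (\varepsilon s_n)^\delta, so the Lindeberg sum is dominated by the Lyapunov sum:

```latex
\frac{1}{s_n^2} \sum_{i=1}^{n}
  \operatorname{E}\!\left[ (X_i - \mu_i)^2 \,
    \mathbf{1}_{\{|X_i - \mu_i| > \varepsilon s_n\}} \right]
\;\le\;
\frac{1}{s_n^2} \sum_{i=1}^{n}
  \operatorname{E}\!\left[ (X_i - \mu_i)^2 \,
    \frac{|X_i - \mu_i|^{\delta}}{(\varepsilon s_n)^{\delta}} \right]
\;=\;
\frac{1}{\varepsilon^{\delta}} \cdot
\frac{1}{s_n^{2+\delta}} \sum_{i=1}^{n}
  \operatorname{E}\!\left[ |X_i - \mu_i|^{2+\delta} \right]
\;\xrightarrow[n \to \infty]{}\; 0 .
```

The right-hand side tends to 0 for every fixed \varepsilon > 0 exactly when Lyapunov's condition holds, which gives Lindeberg's condition.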
Lindeberg CLT
In the same setting and with the same notation as above, the Lyapunov condition can be replaced with the following weaker one (from Lindeberg in 1920).

Suppose that for every \varepsilon > 0
: \lim_{n\to\infty} \frac{1}{s_n^2} \sum_{i=1}^{n} \operatorname{E}\left[(X_i - \mu_i)^2 \cdot \mathbf{1}_{\{|X_i - \mu_i| > \varepsilon s_n\}}\right] = 0,
where \mathbf{1}_{\{\cdots\}} is the indicator function. Then the distribution of the standardized sums
: \frac{1}{s_n} \sum_{i=1}^{n} \left(X_i - \mu_i\right)
converges towards the standard normal distribution N(0, 1).
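As a numerical illustration (a sketch; the choice of independent Uniform(-c_i, c_i) variables with bounded, non-identical scales is arbitrary): such a sequence satisfies the Lindeberg condition, since each |X_i - \mu_i| is bounded while s_n grows without bound, so for large n the indicator in the condition is identically zero. The standardized sum is then close to standard normal:

```python
import math
import random
import statistics

random.seed(2)

n = 500
scales = [1.0 + (i % 5) for i in range(n)]          # non-identical scales: 1, 2, ..., 5
s_n = math.sqrt(sum(c * c / 3.0 for c in scales))   # Var[Uniform(-c, c)] = c^2 / 3

def standardized_sum():
    """One draw of (1 / s_n) * sum_i X_i, with X_i ~ Uniform(-c_i, c_i), mu_i = 0."""
    return sum(random.uniform(-c, c) for c in scales) / s_n

draws = [standardized_sum() for _ in range(10000)]
print(statistics.fmean(draws))   # close to 0
print(statistics.stdev(draws))   # close to 1
# ~68% of draws should fall within one standard deviation of 0.
print(sum(abs(z) < 1.0 for z in draws) / len(draws))
```

Here the variables are neither identically distributed nor normal, yet the standardized sum matches the N(0, 1) moments and tail fractions closely.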
Multidimensional CLT
Proofs that use characteristic functions can be extended to cases where each individual X_i is a random vector in \R^k, with mean vector