Berry–Esseen theorem

In probability theory, the central limit theorem states that, under certain circumstances, the probability distribution of the scaled mean of a random sample converges to a normal distribution as the sample size increases to infinity. Under stronger assumptions, the Berry–Esseen theorem, or Berry–Esseen inequality, gives a more quantitative result: it also specifies the rate at which this convergence takes place, by giving a bound on the maximal error of approximation between the normal distribution and the true distribution of the scaled sample mean. The approximation is measured by the Kolmogorov–Smirnov distance. In the case of independent samples, the convergence rate is of order ''n''^(−1/2), where ''n'' is the sample size, and the constant is estimated in terms of the third absolute normalized moment.


Statement of the theorem

Statements of the theorem vary, as it was independently discovered by two mathematicians, Andrew C. Berry (in 1941) and Carl-Gustav Esseen (1942), who then, along with other authors, refined it repeatedly over subsequent decades.


Identically distributed summands

One version, sacrificing generality somewhat for the sake of clarity, is the following:

:There exists a positive constant ''C'' such that if ''X''₁, ''X''₂, ..., are i.i.d. random variables with E(''X''₁) = 0, E(''X''₁²) = σ² > 0, and E(|''X''₁|³) = ρ < ∞ (since the random variables are identically distributed, ''X''₂, ''X''₃, ... all have the same moments as ''X''₁), and if we define

::Y_n = \frac{X_1 + X_2 + \cdots + X_n}{n},

:the sample mean, with ''F''ₙ the cumulative distribution function of

::\frac{Y_n \sqrt{n}}{\sigma},

:and Φ the cumulative distribution function of the standard normal distribution, then for all ''x'' and ''n'',

::\left| F_n(x) - \Phi(x) \right| \le \frac{C\rho}{\sigma^3 \sqrt{n}}. \ \ \ \ (1)

That is: given a sequence of independent and identically distributed random variables, each having mean zero and positive variance, if additionally the third absolute moment is finite, then the cumulative distribution functions of the standardized sample mean and the standard normal distribution differ (vertically, on a graph) by no more than the specified amount. Note that the approximation error for all ''n'' (and hence the rate of convergence for sufficiently large ''n'') is bounded by the order of ''n''^(−1/2).

Calculated values of the constant ''C'' have decreased markedly over the years, from Esseen's original value of 7.59 (1942), through successive refinements of 0.7882, 0.7655, 0.7056, 0.7005, 0.5894, and 0.5129, down to 0.4785. The best current estimate, ''C'' < 0.4748, follows from the inequality

::\sup_x \left| F_n(x) - \Phi(x) \right| \le \frac{0.33554 (\rho + 0.415\sigma^3)}{\sigma^3 \sqrt{n}},

since σ³ ≤ ρ and 0.33554 · 1.415 < 0.4748. However, if ρ ≥ 1.286σ³, then the estimate

::\sup_x \left| F_n(x) - \Phi(x) \right| \le \frac{0.3328 (\rho + 0.429\sigma^3)}{\sigma^3 \sqrt{n}}

gives an even tighter upper bound. Esseen (1956) proved that the constant also satisfies the lower bound

::C \ge \frac{\sqrt{10} + 3}{6\sqrt{2\pi}} \approx 0.40973 \approx \frac{1}{\sqrt{2\pi}} + 0.01079.
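Inequality (1) can be checked numerically. The sketch below (an illustration, not part of the original article; the function names are hypothetical) computes the exact Kolmogorov distance between the standardized sum of ''n'' Rademacher (±1) variables, for which σ = ρ = 1, and the standard normal cdf, and compares it against the bound 0.4748/√''n''.

```python
import math

def phi(x):
    """Standard normal cdf via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def kolmogorov_distance_rademacher(n):
    """Exact sup_x |F_n(x) - Phi(x)| for the standardized sum of n
    Rademacher (+/-1) variables; here sigma = rho = 1."""
    d = 0.0
    cdf = 0.0
    for k in range(n + 1):
        p = math.comb(n, k) / 2.0 ** n   # P(S = 2k - n)
        x = (2 * k - n) / math.sqrt(n)   # standardized atom location
        d = max(d, abs(cdf - phi(x)))    # just below the jump
        cdf += p
        d = max(d, abs(cdf - phi(x)))    # at the jump
    return d

for n in (4, 16, 64, 256):
    print(n, kolmogorov_distance_rademacher(n), 0.4748 / math.sqrt(n))
```

For even ''n'' the distance in this symmetric case is close to 1/√(2π''n'') ≈ 0.399/√''n'', comfortably inside the bound.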


Non-identically distributed summands

:Let ''X''₁, ''X''₂, ..., be independent random variables with E(''X''ᵢ) = 0, E(''X''ᵢ²) = σᵢ² > 0, and E(|''X''ᵢ|³) = ρᵢ < ∞. Also, let

::S_n = \frac{X_1 + X_2 + \cdots + X_n}{\sqrt{\sigma_1^2 + \sigma_2^2 + \cdots + \sigma_n^2}}

:be the normalized ''n''-th partial sum. Denote by ''F''ₙ the cdf of ''S''ₙ, and by Φ the cdf of the standard normal distribution. For the sake of convenience denote

::\vec{\sigma} = (\sigma_1, \ldots, \sigma_n), \ \vec{\rho} = (\rho_1, \ldots, \rho_n).

:In 1941, Andrew C. Berry proved that for all ''n'' there exists an absolute constant ''C''₁ such that

::\sup_x \left| F_n(x) - \Phi(x) \right| \le C_1 \cdot \psi_1, \ \ \ \ (2)

:where

::\psi_1 = \psi_1\big(\vec{\sigma}, \vec{\rho}\big) = \Big( \sum_{i=1}^n \sigma_i^2 \Big)^{-1/2} \cdot \max_{1 \le i \le n} \frac{\rho_i}{\sigma_i^2}.

:Independently, in 1942, Carl-Gustav Esseen proved that for all ''n'' there exists an absolute constant ''C''₀ such that

::\sup_x \left| F_n(x) - \Phi(x) \right| \le C_0 \cdot \psi_0, \ \ \ \ (3)

:where

::\psi_0 = \psi_0\big(\vec{\sigma}, \vec{\rho}\big) = \Big( \sum_{i=1}^n \sigma_i^2 \Big)^{-3/2} \cdot \sum_{i=1}^n \rho_i.

It is easy to check that ψ₀ ≤ ψ₁. Because of this, inequality (3) is conventionally called the Berry–Esseen inequality, and the quantity ψ₀ is called the Lyapunov fraction of the third order. Moreover, in the case where the summands ''X''₁, ..., ''X''ₙ have identical distributions,

::\psi_0 = \psi_1 = \frac{\rho}{\sigma^3 \sqrt{n}},

and thus the bounds stated by inequalities (1), (2) and (3) coincide apart from the constant.

Regarding ''C''₀, the lower bound established above clearly remains valid:

::C_0 \ge \frac{\sqrt{10} + 3}{6\sqrt{2\pi}} = 0.4097\ldots

The upper bounds for ''C''₀ were subsequently lowered from the original estimate 7.59 to (considering recent results only) 0.9051, 0.7975, 0.7915, 0.6379, and 0.5606; the best current estimate is 0.5600.
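The relation ψ₀ ≤ ψ₁ and the collapse of both quantities to ρ/(σ³√''n'') in the i.i.d. case can be verified directly. In this sketch the helper names are hypothetical and the moment values are made up for illustration (chosen so that ρᵢ ≥ σᵢ³, as Lyapunov's inequality requires).

```python
import math

def lyapunov_fraction(sigmas, rhos):
    """psi_0: the Lyapunov fraction of the third order (Esseen's bound)."""
    b2 = sum(s * s for s in sigmas)
    return sum(rhos) / b2 ** 1.5

def berry_fraction(sigmas, rhos):
    """psi_1: the quantity appearing in Berry's bound."""
    b2 = sum(s * s for s in sigmas)
    return max(r / (s * s) for s, r in zip(sigmas, rhos)) / math.sqrt(b2)

# Heterogeneous summands: psi_0 <= psi_1 always holds
sigmas, rhos = [1.0, 2.0, 0.5], [1.5, 9.0, 0.2]
assert lyapunov_fraction(sigmas, rhos) <= berry_fraction(sigmas, rhos)

# Identical distributions: both fractions equal rho / (sigma^3 sqrt(n))
n, sigma, rho = 4, 2.0, 8.0
expected = rho / (sigma ** 3 * math.sqrt(n))
assert math.isclose(lyapunov_fraction([sigma] * n, [rho] * n), expected)
assert math.isclose(berry_fraction([sigma] * n, [rho] * n), expected)
```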


Multidimensional version

As with the multidimensional central limit theorem, there is a multidimensional version of the Berry–Esseen theorem.Bentkus, Vidmantas. "A Lyapunov-type bound in R^d." Theory of Probability & Its Applications 49.2 (2005): 311–323. Let X_1, \dots, X_n be independent \mathbb{R}^d-valued random vectors, each having mean zero. Write S = \sum_{i=1}^n X_i and assume \Sigma = \operatorname{Cov}[S] is invertible. Let Z \sim \operatorname{N}(0, \Sigma) be a ''d''-dimensional Gaussian with the same mean and covariance matrix as S. Then for all convex sets U \subseteq \mathbb{R}^d,

:\big| \Pr[S \in U] - \Pr[Z \in U] \big| \le C d^{1/4} \gamma,

where C is a universal constant and \gamma = \sum_{i=1}^n \operatorname{E}\big[ \|\Sigma^{-1/2} X_i\|_2^3 \big] (the expected third power of the L2 norm). The dependency on d^{1/4} is conjectured to be optimal, but might not be.


See also

* Chernoff's inequality
* Edgeworth series
* List of inequalities
* List of mathematical theorems
* Concentration inequality



References

* Durrett, Richard (1991). ''Probability: Theory and Examples''. Pacific Grove, CA: Wadsworth & Brooks/Cole.
* Feller, William (1972). ''An Introduction to Probability Theory and Its Applications, Volume II'' (2nd ed.). New York: John Wiley & Sons.
* Manoukian, Edward B. (1986). ''Modern Concepts and Theorems of Mathematical Statistics''. New York: Springer-Verlag.
* Serfling, Robert J. (1980). ''Approximation Theorems of Mathematical Statistics''. New York: John Wiley & Sons.


External links

* Gut, Allan & Holst, Lars: "Carl-Gustav Esseen", retrieved Mar. 15, 2004.