In statistics, the delta method is a method of deriving the asymptotic distribution of a random variable. It is applicable when the random variable being considered can be defined as a differentiable function of a random variable which is asymptotically Gaussian.
History
The delta method was derived from propagation of error, and the idea behind it was known in the early 20th century. Its statistical application can be traced as far back as 1928 to T. L. Kelley. A formal description of the method was presented by J. L. Doob in 1935. Robert Dorfman also described a version of it in 1938.
Univariate delta method
While the delta method generalizes easily to a multivariate setting, careful motivation of the technique is more easily demonstrated in univariate terms. Roughly, if there is a sequence of random variables ''X_n'' satisfying

: \sqrt{n}\,[X_n - \theta] \;\xrightarrow{D}\; \mathcal{N}(0, \sigma^2),

where ''θ'' and ''σ''² are finite valued constants and \xrightarrow{D} denotes convergence in distribution, then

: \sqrt{n}\,[g(X_n) - g(\theta)] \;\xrightarrow{D}\; \mathcal{N}(0, \sigma^2\,[g'(\theta)]^2)

for any function ''g'' satisfying the property that its first derivative, evaluated at ''θ'', ''g''′(''θ'') exists and is non-zero valued.
The intuition of the delta method is that any such function ''g'', over a "small enough" range, can be approximated via a first order Taylor series (which is basically a linear function). If the random variable is roughly normal, then a linear transformation of it is also normal. A small range can be achieved when approximating the function around the mean, provided the variance is "small enough". When ''g'' is applied to a random variable such as the mean, the delta method tends to work better as the sample size increases, since a larger sample reduces the variance, so the Taylor approximation is applied over a smaller range of ''g'' around the point of interest.
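This intuition can be checked numerically. The sketch below (the values of ''θ'', ''σ'', ''n'' and the choice ''g''(''x'') = exp(''x'') are illustrative assumptions, not from the source) compares the Monte Carlo variance of ''g''(''X_n'') against the delta-method prediction (σ²/n)·[''g''′(''θ'')]²:

```python
import numpy as np

# Illustrative check of the univariate delta method (parameter values are
# assumptions for this sketch): X_n is the mean of n iid N(theta, sigma^2)
# draws, and g(x) = exp(x), so g'(theta) = exp(theta).
rng = np.random.default_rng(0)
theta, sigma, n, reps = 1.0, 2.0, 500, 20000

# Draw the sampling distribution of the sample mean directly.
xbar = rng.normal(theta, sigma / np.sqrt(n), size=reps)

empirical_var = np.exp(xbar).var()               # Monte Carlo variance of g(X_n)
delta_var = (sigma**2 / n) * np.exp(theta) ** 2  # delta-method approximation

print(empirical_var, delta_var)  # the two agree closely when sigma^2/n is small
```

Increasing ''n'' shrinks σ²/''n'', so the linear Taylor approximation, and hence the agreement between the two variances, improves with sample size.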
Proof in the univariate case
Demonstration of this result is fairly straightforward under the assumption that ''g'' is differentiable in a neighborhood of ''θ'', and that ''g''′ is continuous at ''θ'' with ''g''′(''θ'') ≠ 0. To begin, we use the mean value theorem (i.e.: the first order approximation of a Taylor series using Taylor's theorem):

: g(X_n) = g(\theta) + g'(\tilde{\theta})\,(X_n - \theta),

where \tilde{\theta} lies between ''X_n'' and ''θ''. Note that since X_n \xrightarrow{P} \theta and \tilde{\theta} lies between ''X_n'' and ''θ'', it must be that \tilde{\theta} \xrightarrow{P} \theta, and since ''g''′ is continuous, applying the continuous mapping theorem yields

: g'(\tilde{\theta}) \;\xrightarrow{P}\; g'(\theta),

where \xrightarrow{P} denotes convergence in probability.

Rearranging the terms and multiplying by \sqrt{n} gives

: \sqrt{n}\,[g(X_n) - g(\theta)] = g'(\tilde{\theta})\,\sqrt{n}\,[X_n - \theta].

Since

: \sqrt{n}\,[X_n - \theta] \;\xrightarrow{D}\; \mathcal{N}(0, \sigma^2)

by assumption, it follows immediately from appeal to Slutsky's theorem that

: \sqrt{n}\,[g(X_n) - g(\theta)] \;\xrightarrow{D}\; \mathcal{N}(0, \sigma^2\,[g'(\theta)]^2).

This concludes the proof.
Proof with an explicit order of approximation
Alternatively, one can add one more step at the end, to obtain the order of approximation:

: \sqrt{n}\,[g(X_n) - g(\theta)] = g'(\theta)\,\sqrt{n}\,[X_n - \theta] + \sqrt{n}\,[X_n - \theta]\,[g'(\tilde{\theta}) - g'(\theta)] = g'(\theta)\,\sqrt{n}\,[X_n - \theta] + O_p(1)\,o_p(1) = g'(\theta)\,\sqrt{n}\,[X_n - \theta] + o_p(1).

This suggests that the error in the approximation converges to 0 in probability.
Multivariate delta method
By definition, a consistent estimator ''B'' converges in probability to its true value ''β'', and often a central limit theorem can be applied to obtain asymptotic normality:

: \sqrt{n}\,(B - \beta) \;\xrightarrow{D}\; \mathcal{N}(0, \Sigma),

where ''n'' is the number of observations and Σ is a (symmetric positive semi-definite) covariance matrix. Suppose we want to estimate the variance of a scalar-valued function ''h'' of the estimator ''B''. Keeping only the first two terms of the Taylor series, and using vector notation for the gradient, we can estimate ''h''(''B'') as

: h(B) \approx h(\beta) + \nabla h(\beta)^T \cdot (B - \beta),

which implies the variance of ''h''(''B'') is approximately

: \operatorname{Var}\big(h(B)\big) \approx \nabla h(\beta)^T \cdot \frac{\Sigma}{n} \cdot \nabla h(\beta).

One can use the mean value theorem (for real-valued functions of many variables) to see that this does not rely on taking a first order approximation.

The delta method therefore implies that

: \sqrt{n}\,\big(h(B) - h(\beta)\big) \;\xrightarrow{D}\; \mathcal{N}\big(0, \nabla h(\beta)^T \cdot \Sigma \cdot \nabla h(\beta)\big),

or in univariate terms,

: \sqrt{n}\,\big(h(B) - h(\beta)\big) \;\xrightarrow{D}\; \mathcal{N}\big(0, \sigma^2 \cdot [h'(\beta)]^2\big).
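A minimal numerical sketch of the multivariate formula (the particular ''β'', Σ, and ratio function ''h'' below are illustrative assumptions): it compares ∇''h''(''β'')ᵀ(Σ/''n'')∇''h''(''β'') against the Monte Carlo variance of ''h''(''B'') when ''B'' is drawn from its asymptotic normal distribution.

```python
import numpy as np

# Sketch of the multivariate delta method (beta, Sigma, and h are
# illustrative assumptions): B ~ N(beta, Sigma / n), h(b) = b1 / b2.
rng = np.random.default_rng(1)
beta = np.array([2.0, 1.0])
Sigma = np.array([[1.0, 0.3],
                  [0.3, 0.5]])
n = 2000

# Gradient of h(b) = b1 / b2 at beta: (1/b2, -b1/b2^2).
grad = np.array([1.0 / beta[1], -beta[0] / beta[1] ** 2])
delta_var = grad @ Sigma @ grad / n        # approximate Var(h(B))

# Monte Carlo check against the asymptotic distribution of B.
B = rng.multivariate_normal(beta, Sigma / n, size=50000)
empirical_var = (B[:, 0] / B[:, 1]).var()
print(empirical_var, delta_var)
```

The gradient here is computed analytically; in practice it can equally be obtained by numerical differentiation of ''h'' at the estimate.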
Example: the binomial proportion
Suppose ''X_n'' is binomial with parameters p \in (0,1] and ''n''. Since

: \sqrt{n}\left[\frac{X_n}{n} - p\right] \;\xrightarrow{D}\; \mathcal{N}(0, p(1-p)),

we can apply the delta method with g(\theta) = \log(\theta) to see

: \sqrt{n}\left[\log\!\left(\frac{X_n}{n}\right) - \log(p)\right] \;\xrightarrow{D}\; \mathcal{N}\big(0, p(1-p)\,[1/p]^2\big).

Hence, even though for any finite ''n'' the variance of \log(X_n/n) does not actually exist (since ''X_n'' can be zero), the asymptotic variance of \log(X_n/n) does exist and is equal to

: \frac{1-p}{p}.

Note that since ''p'' > 0, \frac{X_n}{n} \xrightarrow{P} p as n \to \infty, so with probability converging to one, \log(X_n/n) is finite for large ''n''.

Moreover, if \hat{p} and \hat{q} are estimates of different group rates from independent samples of sizes ''n'' and ''m'' respectively, then the logarithm of the estimated relative risk \hat{p}/\hat{q} has asymptotic variance equal to

: \frac{1-p}{p\,n} + \frac{1-q}{q\,m}.

This is useful to construct a hypothesis test or to make a confidence interval for the relative risk.
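The binomial case lends itself to a quick simulation check. In this sketch (the values of ''p'' and ''n'' are illustrative assumptions), the simulated variance of log(''X_n''/''n'') is compared with the delta-method value (1 − ''p'')/(''pn''):

```python
import numpy as np

# Binomial-proportion example with g(p) = log(p); p and n are illustrative.
# Delta method: Var(log(X_n / n)) is approximately (1 - p) / (p * n).
rng = np.random.default_rng(2)
p, n, reps = 0.3, 2000, 40000

phat = rng.binomial(n, p, size=reps) / n   # sample proportions; X_n = 0 is
log_phat = np.log(phat)                    # astronomically unlikely at this n

empirical_var = log_phat.var()
delta_var = (1 - p) / (p * n)
print(empirical_var, delta_var)
```

At small ''n'' or ''p'' near zero the event ''X_n'' = 0 is no longer negligible, which is exactly why the finite-sample variance of the logarithm fails to exist even though the asymptotic variance is finite.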
Alternative form
The delta method is often used in a form that is essentially identical to that above, but without the assumption that ''X_n'' or ''B'' is asymptotically normal. Often the only context is that the variance is "small". The results then just give approximations to the means and covariances of the transformed quantities. For example, the formulae presented in Klein (1953, p. 258) are:

: \operatorname{Var}(h_r) = \sum_i \left(\frac{\partial h_r}{\partial B_i}\right)^2 \operatorname{Var}(B_i) + \sum_i \sum_{j \neq i} \frac{\partial h_r}{\partial B_i}\,\frac{\partial h_r}{\partial B_j}\,\operatorname{Cov}(B_i, B_j)

: \operatorname{Cov}(h_r, h_s) = \sum_i \frac{\partial h_r}{\partial B_i}\,\frac{\partial h_s}{\partial B_i}\,\operatorname{Var}(B_i) + \sum_i \sum_{j \neq i} \frac{\partial h_r}{\partial B_i}\,\frac{\partial h_s}{\partial B_j}\,\operatorname{Cov}(B_i, B_j),

where h_r is the ''r''th element of ''h''(''B'') and ''B_i'' is the ''i''th element of ''B''.
Second-order delta method
When g'(\theta) = 0 the delta method cannot be applied. However, if g''(\theta) exists and is not zero, the second-order delta method can be applied. By the Taylor expansion,

: n\,[g(X_n) - g(\theta)] \;\xrightarrow{D}\; \sigma^2\,\frac{g''(\theta)}{2}\,\chi^2_1,

so that the variance of g(X_n) relies on up to the 4th moment of X_n.

The second-order delta method is also useful in conducting a more accurate approximation of the distribution of g(X_n) when the sample size is small. For example, when X_n follows the standard normal distribution, g(X_n) can be approximated as the weighted sum of a standard normal and a chi-square with 1 degree of freedom.
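The degenerate case can be illustrated with an assumed setup: taking ''θ'' = 0, ''σ'' = 1 and ''g''(''x'') = ''x''², we have ''g''′(''θ'') = 0 and ''g''″(''θ'') = 2, so ''n''·''g''(''X_n'') should behave like a χ²₁ variable (mean 1, variance 2):

```python
import numpy as np

# Second-order delta method sketch (illustrative values): theta = 0,
# sigma = 1, g(x) = x**2, so n * g(X_n) is asymptotically chi-square(1).
rng = np.random.default_rng(3)
n, reps = 1000, 50000

xbar = rng.normal(0.0, 1.0 / np.sqrt(n), size=reps)  # sampling distribution of the mean
scaled = n * xbar ** 2                               # second-order limit

print(scaled.mean(), scaled.var())  # chi-square(1) has mean 1, variance 2
```

The first-order delta method would wrongly predict a degenerate (zero-variance) limit here, since the leading linear term vanishes at ''θ'' = 0.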
Nonparametric delta method
A version of the delta method exists in nonparametric statistics. Let X_i be an independent and identically distributed random variable with a sample of size ''n'' with an empirical distribution function \hat{F}_n, and let ''T'' be a functional. If ''T'' is Hadamard differentiable with respect to the Chebyshev metric, then

: \frac{T(\hat{F}_n) - T(F)}{\widehat{se}} \;\xrightarrow{D}\; \mathcal{N}(0, 1),

where \widehat{se} = \hat{\tau}/\sqrt{n} and \hat{\tau}^2 = \frac{1}{n}\sum_{i=1}^n \hat{L}^2(X_i), with \hat{L}(x) denoting the empirical influence function for ''T''. A nonparametric pointwise asymptotic 1 - \alpha confidence interval for T(F) is therefore given by

: T(\hat{F}_n) \pm z_{\alpha/2}\,\widehat{se},

where z_{\alpha/2} denotes the \alpha/2-quantile of the standard normal. See Wasserman (2006) p. 19f. for details and examples.
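As a concrete instance of this recipe (the mean functional and the data-generating choice below are assumed for illustration): for the mean, T(F) = \int x\,dF(x), the empirical influence function is \hat{L}(x) = x - \bar{X}, so \hat{\tau}^2 is the (biased) sample variance and the interval reduces to the familiar normal interval for a mean.

```python
import numpy as np
from statistics import NormalDist

# Nonparametric delta method for the mean functional T(F) = E[X]
# (the exponential data below are an illustrative assumption).
rng = np.random.default_rng(4)
x = rng.exponential(scale=2.0, size=400)

t_hat = x.mean()                    # plug-in estimate T(F_n)
tau2 = ((x - t_hat) ** 2).mean()    # average squared empirical influence
se = np.sqrt(tau2 / len(x))         # estimated standard error

z = NormalDist().inv_cdf(0.975)     # standard normal quantile, alpha = 0.05
ci = (t_hat - z * se, t_hat + z * se)
print(ci)
```

For functionals other than the mean (quantiles, trimmed means, and so on), only the influence function \hat{L} changes; the interval construction is identical.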
See also
*Taylor expansions for the moments of functions of random variables
*Variance-stabilizing transformation
References
External links
*Feiveson, Alan H. "Explanation of the delta method". Stata Corp. https://www.stata.com/support/faqs/statistics/delta-method/
Estimation methods
Statistical approximations