In statistics, the delta method is a result concerning the approximate probability distribution for a function of an asymptotically normal statistical estimator, derived from knowledge of the limiting variance of that estimator.
History
The delta method was derived from propagation of error, and the idea behind it was known in the early 19th century. Its statistical application can be traced as far back as 1928, to T. L. Kelley. A formal description of the method was presented by J. L. Doob in 1935. Robert Dorfman also described a version of it in 1938.
Univariate delta method
While the delta method generalizes easily to a multivariate setting, careful motivation of the technique is more easily demonstrated in univariate terms. Roughly, if there is a
sequence
In mathematics, a sequence is an enumerated collection of objects in which repetitions are allowed and order matters. Like a set, it contains members (also called ''elements'', or ''terms''). The number of elements (possibly infinite) is calle ...
of random variables satisfying
:
where ''θ'' and ''σ''
2 are finite valued constants and
denotes
convergence in distribution
In probability theory, there exist several different notions of convergence of random variables. The convergence of sequences of random variables to some limit random variable is an important concept in probability theory, and its applications t ...
, then
:
for any function ''g'' satisfying the property that exists and is non-zero valued.
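This limit can be checked by simulation. The sketch below uses illustrative choices that are assumptions, not part of the statement: exponential data with rate ''λ'' (so ''θ'' = 1/''λ'' and ''σ''<sup>2</sup> = 1/''λ''<sup>2</sup>) and ''g''(''x'') = 1/''x'', and compares the empirical variance of ''g'' of the sample mean against the delta-method prediction ''σ''<sup>2</sup>[''g′''(''θ'')]<sup>2</sup>/''n'' = ''λ''<sup>2</sup>/''n''.

```python
import numpy as np

# Monte Carlo sanity check of the univariate delta method (illustrative
# parameter choices assumed, not taken from the article).
# X_bar is the mean of n Exp(rate=lam) draws: theta = 1/lam, sigma^2 = 1/lam^2.
# For g(x) = 1/x, g'(theta) = -lam^2, so the predicted variance of g(X_bar)
# is sigma^2 * g'(theta)^2 / n = lam^2 / n.
rng = np.random.default_rng(0)
lam, n, reps = 2.0, 500, 20000

samples = rng.exponential(scale=1.0 / lam, size=(reps, n))
g_of_mean = 1.0 / samples.mean(axis=1)   # g(X_bar) for each replication

empirical = g_of_mean.var()
predicted = lam**2 / n
print(empirical, predicted)              # the two should agree closely
```

With ''n'' = 500 the higher-order terms are negligible, so the two numbers should agree to within a few percent.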
Proof in the univariate case
Demonstration of this result is fairly straightforward under the assumption that ''g′'' is continuous. To begin, we use the mean value theorem (i.e.: the first order approximation of a Taylor series using Taylor's theorem):
:<math>g(X_n)=g(\theta)+g'(\tilde{\theta})(X_n-\theta),</math>
where <math>\tilde{\theta}</math> lies between ''X<sub>n</sub>'' and ''θ''.
Note that since <math>X_n\,\xrightarrow{P}\,\theta</math> and <math>|\tilde{\theta}-\theta|<|X_n-\theta|</math>, it must be that <math>\tilde{\theta}\,\xrightarrow{P}\,\theta</math>, and since ''g′'' is continuous, applying the continuous mapping theorem yields
:<math>g'(\tilde{\theta})\,\xrightarrow{P}\,g'(\theta),</math>
where <math>\xrightarrow{P}</math> denotes convergence in probability.
Rearranging the terms and multiplying by <math>\sqrt{n}</math> gives
:<math>\sqrt{n}[g(X_n)-g(\theta)]=g'\left(\tilde{\theta}\right)\sqrt{n}[X_n-\theta].</math>
Since
:<math>\sqrt{n}[X_n-\theta]\,\xrightarrow{D}\,\mathcal{N}(0,\sigma^2)</math>
by assumption, it follows immediately from appeal to Slutsky's theorem that
:<math>\sqrt{n}[g(X_n)-g(\theta)]\,\xrightarrow{D}\,\mathcal{N}\left(0,\sigma^2[g'(\theta)]^2\right).</math>
This concludes the proof.
Proof with an explicit order of approximation
Alternatively, one can add one more step at the end, to obtain the order of approximation:
:<math>\begin{align}
\sqrt{n}[g(X_n)-g(\theta)]&=g'\left(\tilde{\theta}\right)\sqrt{n}[X_n-\theta]\\
&=\sqrt{n}[X_n-\theta]\left[g'(\tilde{\theta})-g'(\theta)+g'(\theta)\right]\\
&=\sqrt{n}[X_n-\theta]g'(\theta)+\sqrt{n}[X_n-\theta]\left[g'(\tilde{\theta})-g'(\theta)\right]\\
&=\sqrt{n}[X_n-\theta]g'(\theta)+O_p(1)\cdot o_p(1)\\
&=\sqrt{n}[X_n-\theta]g'(\theta)+o_p(1)
\end{align}</math>
This suggests that the error in the approximation converges to 0 in probability.
Multivariate delta method
By definition, a consistent estimator ''B'' converges in probability to its true value ''β'', and often a central limit theorem can be applied to obtain asymptotic normality:
:<math>\sqrt{n}\left(B-\beta\right)\,\xrightarrow{D}\,N\left(0,\Sigma\right),</math>
where ''n'' is the number of observations and Σ is a (symmetric positive semi-definite) covariance matrix. Suppose we want to estimate the variance of a scalar-valued function ''h'' of the estimator ''B''. Keeping only the first two terms of the Taylor series, and using vector notation for the gradient, we can estimate ''h(B)'' as
:<math>h(B)\approx h(\beta)+\nabla h(\beta)^T\cdot(B-\beta),</math>
which implies the variance of ''h(B)'' is approximately
:<math>\operatorname{Var}\left(h(B)\right)\approx\nabla h(\beta)^T\cdot\operatorname{Cov}(B)\cdot\nabla h(\beta)=\nabla h(\beta)^T\cdot\frac{\Sigma}{n}\cdot\nabla h(\beta).</math>
One can use the mean value theorem (for real-valued functions of many variables) to see that this does not rely on taking a first order approximation.
The delta method therefore implies that
:<math>\sqrt{n}\left(h(B)-h(\beta)\right)\,\xrightarrow{D}\,N\left(0,\nabla h(\beta)^T\cdot\Sigma\cdot\nabla h(\beta)\right),</math>
or in univariate terms,
:<math>\sqrt{n}\left(g(X_n)-g(\theta)\right)\,\xrightarrow{D}\,N\left(0,\sigma^2\left[g'(\theta)\right]^2\right).</math>
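The multivariate variance formula can also be checked numerically. In the sketch below, everything concrete (the bivariate normal mean, the covariance matrix, and the ratio functional ''h'') is an illustrative assumption: ''B'' is the vector of sample means, and the simulated variance of ''h''(''B'') is compared to the delta-method prediction.

```python
import numpy as np

# Monte Carlo check of the multivariate delta method (illustrative setup):
# B = vector of sample means of a bivariate normal with mean mu and
# covariance Sigma, and h(b) = b[0] / b[1] (a ratio of means).
# The gradient at beta = mu is (1/mu2, -mu1/mu2^2), and the prediction is
#   Var(h(B)) ~ grad^T (Sigma / n) grad.
rng = np.random.default_rng(4)
mu = np.array([1.0, 2.0])
Sigma = np.array([[1.0, 0.3],
                  [0.3, 0.5]])
n, reps = 500, 5000

data = rng.multivariate_normal(mu, Sigma, size=(reps, n))
B = data.mean(axis=1)                 # shape (reps, 2): one estimate per run
h = B[:, 0] / B[:, 1]                 # the scalar-valued function h(B)

grad = np.array([1.0 / mu[1], -mu[0] / mu[1] ** 2])
predicted = grad @ (Sigma / n) @ grad
empirical = h.var()
print(empirical, predicted)
```

The empirical variance should match the prediction to within Monte Carlo error, since at this sample size the linearization error of ''h'' is small.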
Example: the binomial proportion
Suppose ''X<sub>n</sub>'' is binomial with parameters <math>p\in(0,1]</math> and ''n''. Since
:<math>\sqrt{n}\left[\frac{X_n}{n}-p\right]\,\xrightarrow{D}\,N(0,p(1-p)),</math>
we can apply the delta method with <math>g(\theta)=\log\theta</math> to see
:<math>\sqrt{n}\left[\log\left(\frac{X_n}{n}\right)-\log p\right]\,\xrightarrow{D}\,N\left(0,p(1-p)[1/p]^2\right).</math>
Hence, even though for any finite ''n'' the variance of <math>\log\left(\frac{X_n}{n}\right)</math> does not actually exist (since ''X<sub>n</sub>'' can be zero), the asymptotic variance of <math>\log\left(\frac{X_n}{n}\right)</math> does exist and is equal to
:<math>\frac{1-p}{p}.</math>
Note that since ''p'' > 0, <math>\frac{X_n}{n}\xrightarrow{P}p</math> as <math>n\to\infty</math>, so with probability converging to one, <math>\log\left(\frac{X_n}{n}\right)</math> is finite for large ''n''.
Moreover, if <math>\hat p</math> and <math>\hat q</math> are estimates of different group rates from independent samples of sizes ''n'' and ''m'' respectively, then the logarithm of the estimated relative risk <math>\frac{\hat p}{\hat q}</math> has asymptotic variance equal to
:<math>\frac{1-p}{p\,n}+\frac{1-q}{q\,m}.</math>
This is useful to construct a hypothesis test or to make a confidence interval for the relative risk.
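The binomial example is easy to verify by simulation. The values of ''p'' and ''n'' below are assumed for illustration; the simulated variance of log(''X<sub>n</sub>''/''n'') is compared to the delta-method value (1 − ''p'')/(''pn'').

```python
import numpy as np

# Simulation of the binomial-proportion example (p and n chosen for
# illustration): for X_n ~ Binomial(n, p), the delta method with g = log
# predicts Var(log(X_n / n)) ~ (1 - p) / (p * n) for large n.
rng = np.random.default_rng(1)
p, n, reps = 0.3, 1000, 50000

x = rng.binomial(n, p, size=reps)
x = x[x > 0]                 # log is undefined at 0; P(X_n = 0) is negligible here
log_phat = np.log(x / n)

empirical = log_phat.var()
predicted = (1 - p) / (p * n)
print(empirical, predicted)
```

The filter on zero counts mirrors the remark above: the finite-sample variance does not exist, but with probability converging to one the statistic is finite, and its distribution matches the asymptotic prediction.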
Alternative form
The delta method is often used in a form that is essentially identical to that above, but without the assumption that ''X<sub>n</sub>'' or ''B'' is asymptotically normal. Often the only context is that the variance is "small". The results then just give approximations to the means and covariances of the transformed quantities. For example, the formulae presented in Klein (1953, p. 258) are:
:<math>\begin{align}
\operatorname{Var}\left(h_r\right)&=\sum_i\left(\frac{\partial h_r}{\partial B_i}\right)^2\operatorname{Var}\left(B_i\right)+\sum_i\sum_{j\neq i}\left(\frac{\partial h_r}{\partial B_i}\right)\left(\frac{\partial h_r}{\partial B_j}\right)\operatorname{Cov}\left(B_i,B_j\right)\\
\operatorname{Cov}\left(h_r,h_s\right)&=\sum_i\left(\frac{\partial h_r}{\partial B_i}\right)\left(\frac{\partial h_s}{\partial B_i}\right)\operatorname{Var}\left(B_i\right)+\sum_i\sum_{j\neq i}\left(\frac{\partial h_r}{\partial B_i}\right)\left(\frac{\partial h_s}{\partial B_j}\right)\operatorname{Cov}\left(B_i,B_j\right)
\end{align}</math>
where ''h<sub>r</sub>'' is the ''r''th element of ''h''(''B'') and ''B<sub>i</sub>'' is the ''i''th element of ''B''.
Second-order delta method
When ''g′''(''θ'') = 0, the delta method cannot be applied. However, if ''g′′''(''θ'') exists and is not zero, the second-order delta method can be applied. By the Taylor expansion,
:<math>n[g(X_n)-g(\theta)]\,\xrightarrow{D}\,\frac{\sigma^2 g''(\theta)}{2}\chi^2_1,</math>
so that the variance of <math>g(X_n)</math> relies on up to the 4th moment of <math>X_n</math>.
The second-order delta method is also useful in conducting a more accurate approximation of the distribution of <math>g(X_n)</math> when the sample size is small. For example, when <math>X_n</math> follows the standard normal distribution, <math>g(X_n)</math> can be approximated as the weighted sum of a standard normal and a chi-square with one degree of freedom.
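A minimal sketch of the second-order case, under an assumed toy setup: take ''X<sub>n</sub>'' to be the mean of ''n'' standard normals, ''θ'' = 0 and ''g''(''x'') = ''x''<sup>2</sup>, so that ''g′''(0) = 0 and ''g′′''(0) = 2. The limit above then says ''n''·''g''(''X<sub>n</sub>'') behaves like a chi-square with one degree of freedom.

```python
import numpy as np

# Second-order delta method sketch (toy setup assumed): X_bar is the mean of
# n standard normals, theta = 0, g(x) = x^2, so g'(0) = 0 and g''(0) = 2.
# Prediction: n * [g(X_bar) - g(0)] --> (sigma^2 * g''(0) / 2) * chi2_1 = chi2_1.
rng = np.random.default_rng(2)
n, reps = 100, 50000

xbar = rng.standard_normal((reps, n)).mean(axis=1)
scaled = n * xbar**2          # should behave like a chi-square with 1 d.o.f.

# chi2_1 has mean 1 and variance 2
print(scaled.mean(), scaled.var())
```

Here the agreement is in fact exact in distribution, since the square root of ''n'' times a mean of standard normals is itself standard normal, so the simulation only exhibits Monte Carlo noise.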
Nonparametric delta method
A version of the delta method exists in nonparametric statistics. Let <math>X_1,\ldots,X_n</math> be an independent and identically distributed sample of size <math>n</math> with empirical distribution function <math>\hat{F}_n</math>, and let <math>T</math> be a functional. If <math>T</math> is Hadamard differentiable with respect to the Chebyshev metric, then
:<math>\frac{T(\hat{F}_n)-T(F)}{\widehat{\operatorname{se}}}\,\xrightarrow{D}\,N(0,1),</math>
where <math>\widehat{\operatorname{se}}=\frac{\hat{\tau}}{\sqrt{n}}</math> and <math>\hat{\tau}^2=\frac{1}{n}\sum_{i=1}^{n}\hat{L}^2(X_i)</math>, with <math>\hat{L}(x)</math> denoting the empirical influence function for <math>T</math>. A nonparametric <math>(1-\alpha)</math> pointwise asymptotic confidence interval for <math>T(F)</math> is therefore given by
:<math>T(\hat{F}_n)\pm z_{\alpha/2}\,\widehat{\operatorname{se}},</math>
where <math>z_q</math> denotes the <math>q</math>-quantile of the standard normal. See Wasserman (2006) p. 19f. for details and examples.
See also
* Taylor expansions for the moments of functions of random variables
* Variance-stabilizing transformation
References
External links
*{{cite web |title=Using the Delta Method to Construct Confidence Intervals for Predicted Probabilities, Rates, and Discrete Changes |first1=Jun |last1=Xu |first2=J. Scott |last2=Long |author2-link=J. Scott Long |publisher=Indiana University |date=August 22, 2005 |work=Lecture notes |url=http://www.indiana.edu/~jslsoc/stata/ci_computations/spost_deltaci.pdf}}