In statistics, Basu's theorem states that any boundedly complete and sufficient statistic is independent of any ancillary statistic. This is a 1955 result of Debabrata Basu.
It is often used in statistics as a tool to prove the independence of two statistics: one first shows that one statistic is complete and sufficient and that the other is ancillary, then appeals to the theorem. An example, given in the Example section below, is the proof that the sample mean and sample variance of a normal distribution are independent statistics. This property (independence of sample mean and sample variance) characterizes normal distributions.
Statement
Let <math>(P_\theta;\ \theta \in \Theta)</math> be a family of distributions on a measurable space <math>(X, \mathcal{A})</math>, and let <math>T</math> be a statistic that maps from <math>(X, \mathcal{A})</math> to some measurable space <math>(Y, \mathcal{B})</math>. If <math>T</math> is a boundedly complete sufficient statistic for <math>\theta</math>, and <math>A</math> is ancillary to <math>\theta</math>, then conditional on <math>\theta</math>, <math>T</math> is independent of <math>A</math>. That is, <math>T \perp\!\!\!\perp A \mid \theta</math>.
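The three properties used in the statement can be written out as formulas. The block below is a sketch under assumed notation: <math>T</math> is the sufficient statistic, <math>A</math> the ancillary statistic, and <math>P_\theta</math>, <math>E_\theta</math> denote probability and expectation under the parameter value <math>\theta</math>. These symbols are chosen here for illustration.

```latex
% Assumed notation: T, A are statistics; P_\theta, E_\theta are probability
% and expectation under \theta; g is a real-valued function on the range of T.
\begin{align*}
&\text{Sufficiency of } T: &&
  P_\theta(\,\cdot \mid T = t) \text{ is the same for all } \theta \in \Theta, \\
&\text{Bounded completeness of } T: &&
  \text{for every bounded measurable } g:\quad
  E_\theta[g(T)] = 0 \ \ \forall \theta
  \implies g(T) = 0 \ \ P_\theta\text{-a.s.}\ \forall \theta, \\
&\text{Ancillarity of } A: &&
  P_\theta(A \in B) \text{ is the same for all } \theta \in \Theta,
  \text{ for every measurable } B.
\end{align*}
```

Completeness in the usual sense drops the word "bounded" from the condition on <math>g</math>, so every complete statistic is also boundedly complete.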
Proof
Let <math>P_\theta^T</math> and <math>P_\theta^A</math> be the marginal distributions of <math>T</math> and <math>A</math> respectively.
Denote by <math>A^{-1}(B)</math> the preimage of a set <math>B</math> under the map <math>A</math>. For any measurable set <math>B \in \mathcal{B}</math> we have
:<math>P_\theta^A(B) = P_\theta(A^{-1}(B)) = \int_Y P_\theta(A^{-1}(B) \mid T = t)\, P_\theta^T(dt).</math>
The distribution <math>P_\theta^A</math> does not depend on <math>\theta</math> because <math>A</math> is ancillary. Likewise, <math>P_\theta(\cdot \mid T = t)</math> does not depend on <math>\theta</math> because <math>T</math> is sufficient. Therefore
:<math>\int_Y \big[ P(A^{-1}(B) \mid T = t) - P^A(B) \big]\, P_\theta^T(dt) = 0.</math>
Note the integrand (the function inside the integral) is a function of <math>t</math> and not <math>\theta</math>. Therefore, since <math>T</math> is boundedly complete, the function
:<math>g(t) = P(A^{-1}(B) \mid T = t) - P^A(B)</math>
is zero for <math>P_\theta^T</math>-almost all values of <math>t</math>, and thus
:<math>P(A^{-1}(B) \mid T = t) = P^A(B)</math>
for almost all <math>t</math>. Therefore, <math>A</math> is independent of <math>T</math>.
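Two steps of the argument are left implicit: why bounded completeness applies, and how the almost-sure identity gives independence. A hedged sketch follows, with <math>T</math> the sufficient statistic, <math>A</math> the ancillary statistic, <math>P^T_\theta</math> and <math>P^A</math> their marginal distributions, <math>g</math> the difference between the conditional probability of <math>A^{-1}(B)</math> given <math>T = t</math> and <math>P^A(B)</math>, and <math>C</math> an arbitrary measurable set in the range of <math>T</math>. The set <math>C</math> is introduced here for illustration.

```latex
% Assumed notation: C is any measurable set in the range of T;
% P_\theta^T and P^A are the marginal distributions of T and A.
\begin{align*}
|g(t)| &= \left| P(A^{-1}(B) \mid T = t) - P^A(B) \right| \le 1
  && \text{(so $g$ is bounded)}, \\
P_\theta(A \in B,\ T \in C)
  &= \int_C P(A^{-1}(B) \mid T = t)\, P_\theta^T(dt) \\
  &= \int_C P^A(B)\, P_\theta^T(dt)
   = P^A(B)\, P_\theta^T(C).
\end{align*}
```

The joint probability factors into the product of the marginals for every pair of measurable sets, which is the definition of independence.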
Example
Independence of sample mean and sample variance of a normal distribution
Let ''X''<sub>1</sub>, ''X''<sub>2</sub>, ..., ''X''<sub>''n''</sub> be independent, identically distributed normal random variables with mean ''μ'' and variance ''σ''<sup>2</sup>.
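Then with respect to the parameter ''μ'' (with ''σ''<sup>2</sup> held fixed), one can show that
:<math>\widehat{\mu} = \frac{\sum X_i}{n},</math>
the sample mean, is a complete and sufficient statistic (it carries all the information the sample provides about ''μ'', and no more), and
:<math>\widehat{\sigma}^2 = \frac{\sum \left(X_i - \widehat{\mu}\right)^2}{n - 1},</math>
the sample variance, is an ancillary statistic: its distribution does not depend on ''μ''.
Therefore, from Basu's theorem it follows that these statistics are independent for each ''μ'', conditional on the fixed value of ''σ''<sup>2</sup>.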
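The ancillarity of the sample variance with respect to ''μ'' can be checked by a short calculation. The sketch below assumes standardized variables ''Z''<sub>''i''</sub> = (''X''<sub>''i''</sub> − ''μ'')/''σ'', which are notation introduced here and not part of the text above.

```latex
% Assumed notation: Z_i = (X_i - \mu)/\sigma are i.i.d. N(0, 1);
% \bar{X} and \bar{Z} are the sample means of the X_i and the Z_i.
\begin{align*}
X_i &= \mu + \sigma Z_i, \qquad Z_i \overset{\text{iid}}{\sim} N(0, 1), \\
X_i - \bar{X} &= \sigma\,(Z_i - \bar{Z}), \\
\frac{1}{n-1} \sum_{i=1}^{n} (X_i - \bar{X})^2
  &= \frac{\sigma^2}{n-1} \sum_{i=1}^{n} (Z_i - \bar{Z})^2 .
\end{align*}
```

The right-hand side involves only ''σ''<sup>2</sup> and the ''Z''<sub>''i''</sub>, whose distribution is free of ''μ'', so the distribution of the sample variance does not change with ''μ''.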
This independence result can also be proven by Cochran's theorem.
Further, this property (that the sample mean and sample variance of the normal distribution are independent) ''characterizes'' the normal distribution; no other distribution has this property.
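A small simulation can make the contrast concrete. The sketch below uses only the Python standard library and estimates the correlation between sample mean and sample variance for normal samples and, for comparison, for exponential samples. The function names, sample size, replication count, and parameter values are all illustrative assumptions. Zero correlation is only a necessary consequence of independence, so this is a sanity check, not a proof.

```python
import random
import statistics


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def mean_variance_correlation(draw, n=10, reps=5000, seed=0):
    """Estimate corr(sample mean, sample variance) over `reps` samples of size n.

    `draw` (hypothetical helper argument) takes a random.Random instance
    and returns a single observation.
    """
    rng = random.Random(seed)
    means, variances = [], []
    for _ in range(reps):
        sample = [draw(rng) for _ in range(n)]
        means.append(statistics.fmean(sample))
        variances.append(statistics.variance(sample))  # divides by n - 1
    return pearson(means, variances)


# Normal data with illustrative parameters mu = 2, sigma = 3.
corr_normal = mean_variance_correlation(lambda rng: rng.gauss(2.0, 3.0))
# Exponential data with rate 1: a skewed, non-normal comparison case.
corr_exponential = mean_variance_correlation(lambda rng: rng.expovariate(1.0))

print(f"normal:      {corr_normal:+.3f}")
print(f"exponential: {corr_exponential:+.3f}")
```

Under these assumptions the estimated correlation for normal data should be close to zero, while for the skewed exponential data it should be clearly positive, consistent with the characterization stated above.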
References
* Mukhopadhyay, Nitis (2000). ''Probability and Statistical Inference''. Statistics: A Series of Textbooks and Monographs. 162. Florida: CRC Press USA.