In statistics, a pivotal quantity or pivot is a function of observations and unobservable parameters such that the function's probability distribution does not depend on the unknown parameters (including nuisance parameters). A pivotal quantity need not be a statistic: the function and its ''value'' can depend on the parameters of the model, but its ''distribution'' must not. If it is a statistic, then it is known as an ''ancillary statistic''.

More formally, let X = (X_1,X_2,\ldots,X_n) be a random sample from a distribution that depends on a parameter (or vector of parameters) \theta. Let g(X,\theta) be a random variable whose distribution is the same for all \theta. Then g is called a ''pivotal quantity'' (or simply a ''pivot'').

Pivotal quantities are commonly used for normalization to allow data from different data sets to be compared. It is relatively easy to construct pivots for location and scale parameters: for the former we form differences so that location cancels, for the latter ratios so that scale cancels.

Pivotal quantities are fundamental to the construction of test statistics, as they allow the statistic to not depend on parameters – for example, Student's t-statistic is for a normal distribution with unknown variance (and mean). They also provide one method of constructing confidence intervals, and the use of pivotal quantities improves the performance of the bootstrap. In the form of ancillary statistics, they can be used to construct frequentist prediction intervals (predictive confidence intervals).
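The defining property – the function depends on the parameters, but its distribution does not – can be checked empirically. The following NumPy sketch (with illustrative parameter values chosen here, not taken from the text) simulates the location-scale pivot (\overline{X} - \mu)/(\sigma/\sqrt{n}) under two very different parameter settings and shows that both empirical distributions match N(0,1):

```python
import numpy as np

rng = np.random.default_rng(0)

def pivot(sample, mu, sigma):
    """Location-scale pivot (x_bar - mu) / (sigma / sqrt(n)).

    Its *value* depends on the parameters mu and sigma (so it is not a
    statistic), but its N(0, 1) distribution does not.
    """
    n = len(sample)
    return (sample.mean() - mu) / (sigma / np.sqrt(n))

# Draw many samples under two very different (mu, sigma) settings.
n, reps = 20, 100_000
z_a = np.array([pivot(rng.normal(0.0, 1.0, n), 0.0, 1.0) for _ in range(reps)])
z_b = np.array([pivot(rng.normal(50.0, 9.0, n), 50.0, 9.0) for _ in range(reps)])

# Both empirical distributions are standard normal regardless of (mu, sigma).
print(z_a.mean(), z_a.std())
print(z_b.mean(), z_b.std())
```

In both cases the empirical mean is near 0 and the empirical standard deviation near 1, illustrating that the pivot's distribution is free of the parameters.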


Examples


Normal distribution

One of the simplest pivotal quantities is the z-score. Given a normal distribution with mean \mu and variance \sigma^2, and an observation ''x'', the z-score

: z = \frac{x - \mu}{\sigma}

has distribution N(0,1) – a normal distribution with mean 0 and variance 1. Similarly, since the ''n''-sample sample mean has sampling distribution N(\mu,\sigma^2/n), the z-score of the mean

: z = \frac{\overline{X} - \mu}{\sigma/\sqrt{n}}

also has distribution N(0,1). Note that while these functions depend on the parameters – and thus one can only compute them if the parameters are known (they are not statistics) – the distribution is independent of the parameters.

Given n independent, identically distributed (i.i.d.) observations X = (X_1, X_2, \ldots, X_n) from the normal distribution with unknown mean \mu and variance \sigma^2, a pivotal quantity can be obtained from the function

: g(x,X) = \sqrt{n}\,\frac{x - \overline{X}}{s}

where

: \overline{X} = \frac{1}{n}\sum_{i=1}^n X_i

and

: s^2 = \frac{1}{n-1}\sum_{i=1}^n (X_i - \overline{X})^2

are unbiased estimates of \mu and \sigma^2, respectively. The function g(x,X) is the Student's t-statistic for a new value x, to be drawn from the same population as the already observed set of values X. Using x=\mu, the function g(\mu,X) becomes a pivotal quantity, which is distributed according to the Student's t-distribution with \nu = n-1 degrees of freedom. As required, even though \mu appears as an argument to the function g, the distribution of g(\mu,X) does not depend on the parameters \mu or \sigma of the normal probability distribution that governs the observations X_1,\ldots,X_n. This can be used to compute a prediction interval for the next observation X_{n+1}; see Prediction interval: Normal distribution.
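As a sketch of how this pivot is used in practice (NumPy/SciPy, with an illustrative simulated sample rather than real data), pivoting on the t-distributed quantity yields a confidence interval for \mu, and the closely related pivot (X_{n+1} - \overline{X})/(s\sqrt{1+1/n}), also t-distributed with n-1 degrees of freedom, yields a prediction interval for the next observation:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
X = rng.normal(10.0, 2.0, size=15)   # observed sample; mu, sigma treated as unknown

n = len(X)
xbar = X.mean()
s = X.std(ddof=1)                    # square root of the unbiased variance estimate

# The pivot sqrt(n) * (mu - xbar) / s has a t-distribution with n-1 degrees
# of freedom; inverting it gives a 95% confidence interval for mu.
t = stats.t.ppf(0.975, df=n - 1)
ci = (xbar - t * s / np.sqrt(n), xbar + t * s / np.sqrt(n))

# The pivot (X_{n+1} - xbar) / (s * sqrt(1 + 1/n)) is likewise t-distributed,
# and inverting it gives a 95% prediction interval for the next observation.
pi = (xbar - t * s * np.sqrt(1 + 1 / n), xbar + t * s * np.sqrt(1 + 1 / n))

print(ci)
print(pi)
```

The prediction interval is wider than the confidence interval because it must cover the variability of a single new draw, not just the uncertainty in \overline{X}.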


Bivariate normal distribution

In more complicated cases, it is impossible to construct exact pivots. However, having approximate pivots improves convergence to asymptotic normality.

Suppose a sample of size n of vectors (X_i,Y_i)' is taken from a bivariate normal distribution with unknown correlation \rho. An estimator of \rho is the sample (Pearson, moment) correlation

: r = \frac{\frac{1}{n-1}\sum_{i=1}^n (X_i - \overline{X})(Y_i - \overline{Y})}{s_X s_Y}

where s_X^2, s_Y^2 are sample variances of X and Y. The sample statistic r has an asymptotically normal distribution:

: \sqrt{n}\,\frac{r - \rho}{1 - \rho^2} \Rightarrow N(0,1).

However, a variance-stabilizing transformation

: z = \operatorname{artanh} r = \frac12 \ln \frac{1+r}{1-r},

known as Fisher's ''z'' transformation of the correlation coefficient, makes the distribution of z asymptotically independent of unknown parameters:

: \sqrt{n}\,(z-\zeta) \Rightarrow N(0,1)

where \zeta = \operatorname{artanh} \rho is the corresponding distribution parameter. For finite sample sizes n, the random variable z will have a distribution closer to normal than that of r. An even closer approximation to the standard normal distribution is obtained by using a better approximation for the exact variance: the usual form is

: \operatorname{Var}(z) \approx \frac{1}{n-3}.


Robustness

From the point of view of robust statistics, pivotal quantities are robust to changes in the parameters – indeed, independent of the parameters – but not in general robust to changes in the model, such as violations of the assumption of normality. This is fundamental to the robust critique of non-robust statistics, often derived from pivotal quantities: such statistics may be robust within the assumed family of distributions, but are not robust outside it.


See also

* Normalization (statistics)

