In probability and statistics, Student's ''t''-distribution (or simply the ''t''-distribution) is any member of a family of continuous probability distributions that arise when estimating the mean of a normally distributed population in situations where the sample size is small and the population's standard deviation is unknown. It was developed by English statistician William Sealy Gosset under the pseudonym "Student".
The ''t''-distribution plays a role in a number of widely used statistical analyses, including Student's ''t''-test for assessing the statistical significance of the difference between two sample means, the construction of confidence intervals for the difference between two population means, and linear regression analysis. Student's ''t''-distribution also arises in the Bayesian analysis of data from a normal family.
If we take a sample of ''n'' observations from a normal distribution, then the ''t''-distribution with ''ν'' = ''n'' − 1 degrees of freedom can be defined as the distribution of the location of the sample mean relative to the true mean, divided by the sample standard deviation, after multiplying by the standardizing term <math>\sqrt{n}</math>. In this way, the ''t''-distribution can be used to construct a confidence interval for the true mean.
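As a minimal sketch of that construction in Python (standard library only): the data values below are invented for illustration, and the critical value 2.262 is the tabulated 97.5th percentile of the ''t''-distribution with 9 degrees of freedom, hard-coded here as an assumption rather than computed.

```python
import math
import statistics

def t_confidence_interval(sample, t_crit):
    """Two-sided interval for the mean: xbar +/- t_crit * s / sqrt(n)."""
    n = len(sample)
    xbar = statistics.mean(sample)
    s = statistics.stdev(sample)  # Bessel-corrected (divides by n - 1)
    half_width = t_crit * s / math.sqrt(n)
    return xbar - half_width, xbar + half_width

# Hypothetical measurements (n = 10, so nu = n - 1 = 9 degrees of freedom).
data = [4.9, 5.1, 5.0, 4.8, 5.3, 5.2, 4.7, 5.0, 5.1, 4.9]
T_CRIT_95_NU9 = 2.262  # tabulated 97.5th percentile of t with nu = 9

low, high = t_confidence_interval(data, T_CRIT_95_NU9)
print(f"95% confidence interval for the mean: ({low:.3f}, {high:.3f})")
```

The interval is wider than the one a normal critical value of about 1.96 would give, which reflects the extra uncertainty from estimating the standard deviation from only ten observations.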
The ''t''-distribution is symmetric and bell-shaped, like the normal distribution. However, the ''t''-distribution has heavier tails, meaning that it is more prone to producing values that fall far from its mean. This makes it useful for understanding the statistical behavior of certain types of ratios of random quantities, in which variation in the denominator is amplified and may produce outlying values when the denominator of the ratio falls close to zero. The Student's ''t''-distribution is a special case of the
generalized hyperbolic distribution.
History and etymology
In statistics, the ''t''-distribution was first derived as a posterior distribution in 1876 by Helmert and Lüroth. The ''t''-distribution also appeared in a more general form as the Pearson Type IV distribution in Karl Pearson's 1895 paper.
In the English-language literature, the distribution takes its name from William Sealy Gosset's 1908 paper in ''Biometrika'', published under the pseudonym "Student". One version of the origin of the pseudonym is that Gosset's employer preferred staff to use pen names when publishing scientific papers instead of their real names, so he used the name "Student" to hide his identity. Another version is that Guinness did not want their competitors to know that they were using the ''t''-test to determine the quality of raw material.
Gosset worked at the Guinness Brewery in Dublin, Ireland, and was interested in the problems of small samples, for example the chemical properties of barley, where sample sizes might be as few as 3. Gosset's paper refers to the distribution as the "frequency distribution of standard deviations of samples drawn from a normal population". It became well known through the work of Ronald Fisher, who called the distribution "Student's distribution" and represented the test value with the letter ''t''.
How Student's distribution arises from sampling
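Let <math>X_1, \ldots, X_n</math> be independently and identically drawn from the distribution <math>\mathcal{N}(\mu, \sigma^2)</math>, i.e. this is a sample of size ''n'' from a normally distributed population with expected mean value ''μ'' and variance <math>\sigma^2</math>.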
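Let
:<math>\bar X = \frac{1}{n} \sum_{i=1}^n X_i</math>
be the sample mean and let
:<math>S^2 = \frac{1}{n-1} \sum_{i=1}^n \left(X_i - \bar X\right)^2</math>
be the (Bessel-corrected) sample variance. Then the random variable
:<math>\frac{\bar X - \mu}{\sigma/\sqrt{n}}</math>
has a standard normal distribution (i.e. normal with expected mean 0 and variance 1), and the random variable
:<math>\frac{\bar X - \mu}{S/\sqrt{n}},</math>
i.e. where ''S'' has been substituted for ''σ'', has a Student's ''t''-distribution with ''n'' − 1 degrees of freedom. Since ''S'' has replaced ''σ'', the only unobservable quantity in this expression is ''μ'', so this can be used to derive confidence intervals for ''μ''.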
The numerator and the denominator in the preceding expression are statistically independent random variables despite being based on the same sample <math>X_1, \ldots, X_n</math>. This can be seen by observing that <math>\operatorname{cov}(\bar X, X_i - \bar X) = 0</math>, and recalling that <math>\bar X</math> and <math>X_i - \bar X</math> are both linear combinations of the same set of i.i.d. normally distributed random variables. They are therefore jointly normal, and uncorrelated jointly normal variables are independent.
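The independence of the sample mean and sample variance is specific to normal samples, and a small simulation can make this plausible. The sketch below (Python, standard library only) estimates the correlation between the two statistics over many samples of size 5. The sample size, seed, distribution parameters and helper names are illustrative choices, and a correlation near zero is only consistent with independence; it does not prove it.

```python
import math
import random

def mean_var(xs):
    """Sample mean and Bessel-corrected sample variance."""
    n = len(xs)
    m = sum(xs) / n
    v = sum((x - m) ** 2 for x in xs) / (n - 1)
    return m, v

def correlation(a, b):
    """Pearson correlation of two equal-length sequences."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb)

def mean_variance_correlation(draw, n=5, reps=20000, seed=0):
    """Correlation between sample mean and sample variance over many samples."""
    rng = random.Random(seed)
    pairs = [mean_var([draw(rng) for _ in range(n)]) for _ in range(reps)]
    means, variances = zip(*pairs)
    return correlation(means, variances)

corr_normal = mean_variance_correlation(lambda rng: rng.gauss(10.0, 2.0))
corr_expo = mean_variance_correlation(lambda rng: rng.expovariate(1.0))
print(f"normal samples:      corr(mean, variance) = {corr_normal:+.3f}")
print(f"exponential samples: corr(mean, variance) = {corr_expo:+.3f}")
```

For the normal samples the estimated correlation should be close to zero, while for the skewed exponential samples it is clearly positive, so the ''t''-distribution result cannot be expected to carry over exactly to non-normal populations.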
Definition
Probability density function
Student's ''t''-distribution has the probability density function (PDF) given by
:<math>f(t) = \frac{\Gamma\left(\frac{\nu+1}{2}\right)}{\sqrt{\nu\pi}\,\Gamma\left(\frac{\nu}{2}\right)} \left(1 + \frac{t^2}{\nu}\right)^{-\frac{\nu+1}{2}},</math>
where ''ν'' is the number of ''degrees of freedom'' and Γ is the gamma function. This may also be written as
:<math>f(t) = \frac{1}{\sqrt{\nu}\,\mathrm{B}\left(\frac{1}{2}, \frac{\nu}{2}\right)} \left(1 + \frac{t^2}{\nu}\right)^{-\frac{\nu+1}{2}},</math>
where B is the Beta function. In particular, for integer-valued degrees of freedom ''ν'' we have:
For ''ν'' > 1 even,
:<math>\frac{\Gamma\left(\frac{\nu+1}{2}\right)}{\sqrt{\nu\pi}\,\Gamma\left(\frac{\nu}{2}\right)} = \frac{(\nu-1)(\nu-3)\cdots 5 \cdot 3}{2\sqrt{\nu}\,(\nu-2)(\nu-4)\cdots 4 \cdot 2}.</math>
For ''ν'' > 1 odd,
:<math>\frac{\Gamma\left(\frac{\nu+1}{2}\right)}{\sqrt{\nu\pi}\,\Gamma\left(\frac{\nu}{2}\right)} = \frac{(\nu-1)(\nu-3)\cdots 4 \cdot 2}{\pi\sqrt{\nu}\,(\nu-2)(\nu-4)\cdots 5 \cdot 3}.</math>
The probability density function is symmetric, and its overall shape resembles the bell shape of a normally distributed variable with mean 0 and variance 1, except that it is a bit lower and wider. As the number of degrees of freedom grows, the ''t''-distribution approaches the normal distribution with mean 0 and variance 1. For this reason ''ν'' is also known as the normality parameter.
The following images show the density of the ''t''-distribution for increasing values of ''ν''. The normal distribution is shown as a blue line for comparison. Note that the ''t''-distribution (red line) becomes closer to the normal distribution as ''ν'' increases.
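The convergence towards the normal density can also be checked numerically. The following Python sketch evaluates the gamma-function form of the PDF, f(t) = Γ((ν+1)/2) / (√(νπ) Γ(ν/2)) · (1 + t²/ν)^(−(ν+1)/2), using `math.lgamma` for numerical stability. The function names and the chosen values of ''ν'' are illustrative assumptions.

```python
import math

def t_pdf(t, nu):
    """Density of Student's t-distribution with nu > 0 degrees of freedom."""
    log_norm = (math.lgamma((nu + 1) / 2) - math.lgamma(nu / 2)
                - 0.5 * math.log(nu * math.pi))
    return math.exp(log_norm - (nu + 1) / 2 * math.log1p(t * t / nu))

def normal_pdf(t):
    """Density of the standard normal distribution."""
    return math.exp(-0.5 * t * t) / math.sqrt(2 * math.pi)

for nu in (1, 2, 5, 30, 1000):
    print(f"nu={nu:>4}: f(0)={t_pdf(0.0, nu):.4f}  f(4)={t_pdf(4.0, nu):.6f}")
print(f"normal : f(0)={normal_pdf(0.0):.4f}  f(4)={normal_pdf(4.0):.6f}")
```

For small ''ν'' the peak at 0 is lower than the normal peak and the density at ''t'' = 4 is much larger, which is the "lower and wider" shape with heavier tails described above.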
Cumulative distribution function
The cumulative distribution function (CDF) can be written in terms of ''I'', the regularized incomplete beta function. For ''t'' > 0,
:<math>F(t) = \int_{-\infty}^t f(u)\,du = 1 - \tfrac{1}{2} I_{x(t)}\left(\tfrac{\nu}{2}, \tfrac{1}{2}\right),</math>
where
:<math>x(t) = \frac{\nu}{t^2 + \nu}.</math>
Other values would be obtained by symmetry. An alternative formula, valid for <math>t^2 < \nu</math>, is
:<math>\int_{-\infty}^t f(u)\,du = \tfrac{1}{2} + t\,\frac{\Gamma\left(\tfrac{1}{2}(\nu+1)\right)}{\sqrt{\pi\nu}\,\Gamma\left(\tfrac{\nu}{2}\right)}\, {}_2F_1\left(\tfrac{1}{2}, \tfrac{1}{2}(\nu+1); \tfrac{3}{2}; -\tfrac{t^2}{\nu}\right),</math>
where <math>{}_2F_1</math> is a particular case of the hypergeometric function.
For information on its inverse cumulative distribution function, see quantile function.
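When no incomplete beta function routine is at hand, the CDF can be approximated by integrating the PDF directly. The Python sketch below uses the composite Simpson rule on the interval from 0 to ''t'' and the symmetry F(0) = 1/2. This is a swapped-in numerical technique rather than the incomplete-beta formula, and the step count and function names are illustrative assumptions.

```python
import math

def t_pdf(t, nu):
    """Density of Student's t-distribution with nu > 0 degrees of freedom."""
    log_norm = (math.lgamma((nu + 1) / 2) - math.lgamma(nu / 2)
                - 0.5 * math.log(nu * math.pi))
    return math.exp(log_norm - (nu + 1) / 2 * math.log1p(t * t / nu))

def t_cdf(t, nu, steps=2000):
    """F(t) = 1/2 + integral of the PDF from 0 to t (composite Simpson rule).

    `steps` must be even. For t < 0 the step size is negative, so the
    signed integral is subtracted from 1/2, as symmetry requires.
    """
    if t == 0:
        return 0.5
    h = t / steps
    total = t_pdf(0.0, nu) + t_pdf(t, nu)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, nu)
    return 0.5 + total * h / 3

# Closed forms for small nu provide a check on the quadrature:
# nu = 1 (Cauchy): F(t) = 1/2 + atan(t) / pi
# nu = 2:          F(t) = 1/2 + t / (2 * sqrt(2 + t^2))
print(t_cdf(1.0, 1), 0.5 + math.atan(1.0) / math.pi)
print(t_cdf(1.0, 2), 0.5 + 1.0 / (2 * math.sqrt(3.0)))
print(t_cdf(2.776, 4))  # close to 0.975 (tabulated critical value for nu = 4)
```

The fixed step count is adequate for moderate |''t''|; very large arguments would need more steps or the incomplete-beta form.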
Special cases
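Certain values of ''ν'' give a simple form for Student's ''t''-distribution. For example, ''ν'' = 1 gives the standard Cauchy distribution, and in the limit ''ν'' → ∞ the distribution becomes the standard normal distribution.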
How the ''t''-distribution arises
Sampling distribution
Let <math>x_1, \ldots, x_n</math> be the numbers observed in a sample from a continuously distributed population with expected value ''μ''. The sample mean and sample variance are given by:
:<math>\begin{align} \bar{x} &= \frac{x_1 + \cdots + x_n}{n}, \\ s^2 &= \frac{1}{n-1} \sum_{i=1}^n (x_i - \bar{x})^2. \end{align}</math>
The resulting ''t-value'' is
:<math>t = \frac{\bar{x} - \mu}{s/\sqrt{n}}.</math>
The ''t''-distribution with ''n'' − 1 degrees of freedom is the sampling distribution of the ''t''-value when the samples consist of independent identically distributed observations from a normally distributed population. Thus for inference purposes ''t'' is a useful "pivotal quantity" in the case when the mean and variance <math>(\mu, \sigma^2)</math> are unknown population parameters, in the sense that the ''t''-value has then a probability distribution that depends on neither ''μ'' nor <math>\sigma^2</math>.
Bayesian inference
In Bayesian statistics, a (scaled, shifted) ''t''-distribution arises as the marginal distribution of the unknown mean of a normal distribution, when the dependence on an unknown variance has been marginalized out:
:<math>\begin{align} p(\mu \mid D, I) &= \int p(\mu, \sigma^2 \mid D, I)\, d\sigma^2 \\ &= \int p(\mu \mid D, \sigma^2, I)\, p(\sigma^2 \mid D, I)\, d\sigma^2, \end{align}</math>
where ''D'' stands for the data <math>\{x_i\}</math>, and ''I'' represents any other information that may have been used to create the model. The distribution is thus the compounding of the conditional distribution of ''μ'' given the data and <math>\sigma^2</math> with the marginal distribution of <math>\sigma^2</math> given the data.
With ''n'' data points, if the uninformative (flat) location prior <math>p(\mu \mid \sigma^2, I) = \text{const}</math> is taken for ''μ'' and the scale prior <math>p(\sigma^2 \mid I) \propto 1/\sigma^2</math> is taken for <math>\sigma^2</math>, then Bayes' theorem gives
:<math>\begin{align} p(\mu \mid D, \sigma^2, I) &\sim N(\bar{x}, \sigma^2/n), \\ p(\sigma^2 \mid D, I) &\sim \operatorname{Scale-inv-}\chi^2(\nu, s^2), \end{align}</math>
a normal distribution and a scaled inverse chi-squared distribution respectively, where <math>\nu = n - 1</math> and
:<math>s^2 = \sum \frac{(x_i - \bar{x})^2}{n - 1}.</math>
The marginalization integral thus becomes
:<math>\begin{align} p(\mu \mid D, I) &\propto \int_0^\infty \frac{1}{\sqrt{\sigma^2}} \exp\left(-\frac{n(\mu - \bar{x})^2}{2\sigma^2}\right) \cdot \sigma^{-\nu-2} \exp\left(-\frac{\nu s^2}{2\sigma^2}\right) d\sigma^2 \\ &\propto \int_0^\infty \sigma^{-\nu-3} \exp\left(-\frac{n(\mu - \bar{x})^2 + \nu s^2}{2\sigma^2}\right) d\sigma^2. \end{align}</math>
This can be evaluated by substituting <math>z = A/(2\sigma^2)</math>, where <math>A = n(\mu - \bar{x})^2 + \nu s^2</math>, giving
:<math>dz = -\frac{A}{2\sigma^4}\, d\sigma^2,</math>
so
:<math>p(\mu \mid D, I) \propto A^{-\frac{\nu+1}{2}} \int_0^\infty z^{(\nu-1)/2} \exp(-z)\, dz.</math>
But the ''z'' integral is now a standard Gamma integral, which evaluates to a constant, leaving
:<math>\begin{align} p(\mu \mid D, I) &\propto A^{-\frac{\nu+1}{2}} \\ &\propto \left(1 + \frac{n(\mu - \bar{x})^2}{\nu s^2}\right)^{-\frac{\nu+1}{2}}. \end{align}</math>
This is a form of the ''t''-distribution with an explicit scaling and shifting that will be explored in more detail in a further section below. It can be related to the standardized ''t''-distribution by the substitution
:<math>t = \frac{\mu - \bar{x}}{s/\sqrt{n}}.</math>
The derivation above has been presented for the case of uninformative priors for ''μ'' and <math>\sigma^2</math>; but it will be apparent that any priors that lead to a normal distribution being compounded with a scaled inverse chi-squared distribution will lead to a ''t''-distribution with scaling and shifting for <math>P(\mu \mid D, I)</math>, although the scaling parameter corresponding to <math>s^2/n</math> above will then be influenced both by the prior information and the data, rather than just by the data as above.
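The marginalization step can be checked numerically. Writing ''A'' = ''n''(''μ'' − x̄)² + ''ν''s², the integral over σ² of σ^(−ν−3) exp(−''A''/(2σ²)) should be proportional to ''A''^(−(ν+1)/2). The Python sketch below evaluates that integral with the trapezoid rule after the change of variable w = ln σ² and compares it with the Gamma-integral result. The sample summary (''n'' = 5, x̄ = 2.0, s² = 1.5), the integration limits and the function names are hypothetical choices for illustration.

```python
import math

def marginal_integral(A, nu, w_min=-15.0, w_max=40.0, steps=11000):
    """Trapezoid rule for the integral of sigma^(-nu-3) * exp(-A / (2 sigma^2))
    with respect to sigma^2 over (0, inf), using w = ln(sigma^2)."""
    h = (w_max - w_min) / steps
    total = 0.0
    for i in range(steps + 1):
        w = w_min + i * h
        # Integrand in w: (sigma^2)^(-(nu+3)/2) * exp(-A/(2 sigma^2)) * sigma^2
        value = math.exp(-(nu + 1) / 2 * w - 0.5 * A * math.exp(-w))
        total += value if 0 < i < steps else 0.5 * value
    return total * h

def closed_form(A, nu):
    """Gamma-integral result: Gamma((nu+1)/2) * (A/2)^(-(nu+1)/2)."""
    return math.gamma((nu + 1) / 2) * (A / 2) ** (-(nu + 1) / 2)

n, xbar, s2 = 5, 2.0, 1.5  # hypothetical sample size, mean and variance
nu = n - 1

def A(mu):
    return n * (mu - xbar) ** 2 + nu * s2

base = marginal_integral(A(xbar), nu)
for mu in (2.0, 2.5, 3.5):
    numeric = marginal_integral(A(mu), nu) / base
    t_form = (1 + n * (mu - xbar) ** 2 / (nu * s2)) ** (-(nu + 1) / 2)
    print(f"mu={mu}: numeric={numeric:.8f}  t-form={t_form:.8f}")
```

The two columns agree, showing that the unnormalized posterior for ''μ'' has the kernel of a shifted and scaled ''t''-distribution.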
Characterization
As the distribution of a test statistic
Student's ''t''-distribution with ''ν'' degrees of freedom can be defined as the distribution of the random variable ''T'' with
:<math>T = \frac{Z}{\sqrt{V/\nu}} = Z \sqrt{\frac{\nu}{V}},</math>
where
* ''Z'' is a standard normal with expected value 0 and variance 1;
* ''V'' has a chi-squared distribution (<math>\chi^2</math>-distribution) with ''ν'' degrees of freedom;
* ''Z'' and ''V'' are independent.
A different distribution is defined as that of the random variable defined, for a given constant ''μ'', by
:<math>(Z + \mu) \sqrt{\frac{\nu}{V}}.</math>
This random variable has a noncentral ''t''-distribution with noncentrality parameter ''μ''. This distribution is important in studies of the power of Student's ''t''-test.
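This characterization translates directly into a sampling recipe. The Python sketch below draws ''Z'' as a standard normal and ''V'' as a sum of ''ν'' squared independent standard normals (so it assumes integer ''ν''), then forms (''Z'' + ''μ'')√(''ν''/''V''). With ''μ'' = 0 this is the central case. The seed, replication counts and the tabulated critical value 2.776 for ''ν'' = 4 are illustrative assumptions.

```python
import math
import random

def draw_t(nu, rng, mu=0.0):
    """One draw of (Z + mu) * sqrt(nu / V); mu = 0 gives Student's t."""
    z = rng.gauss(0.0, 1.0)
    v = sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(nu))  # chi-squared, nu d.o.f.
    return (z + mu) * math.sqrt(nu / v)

rng = random.Random(1)
nu, reps = 4, 50000

central = [draw_t(nu, rng) for _ in range(reps)]
tail = sum(abs(t) > 2.776 for t in central) / reps
print(f"P(|T| > 2.776) for nu=4: about {tail:.3f} (nominal 0.05)")

noncentral = [draw_t(nu, rng, mu=2.0) for _ in range(reps)]
print(f"mean of noncentral draws (mu=2): about {sum(noncentral) / reps:.2f}")
```

A standard normal variable would exceed 2.776 in absolute value far less often, so the simulated tail frequency of roughly 5% illustrates the heavier tails that dividing by √(''V''/''ν'') introduces.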
Derivation
Suppose <math>X_1, \ldots, X_n</math> are independent realizations of the normally distributed random variable ''X'', which has an expected value ''μ'' and variance <math>\sigma^2</math>. Let
:<math>\overline{X}_n = \frac{1}{n}(X_1 + \cdots + X_n)</math>
be the sample mean, and
:<math>S_n^2 = \frac{1}{n-1} \sum_{i=1}^n \left(X_i - \overline{X}_n\right)^2</math>
be an unbiased estimate of the variance from the sample. It can be shown that the random variable
:<math>V = (n-1) \frac{S_n^2}{\sigma^2}</math>
has a chi-squared distribution with <math>\nu = n - 1</math> degrees of freedom (by Cochran's theorem). It is readily shown that the quantity
:<math>Z = \left(\overline{X}_n - \mu\right) \frac{\sqrt{n}}{\sigma}</math>
is normally distributed with mean 0 and variance 1, since the sample mean <math>\overline{X}_n</math> is normally distributed with mean ''μ'' and variance <math>\sigma^2/n</math>. Moreover, it is possible to show that these two random variables (the normally distributed one ''Z'' and the chi-squared-distributed one ''V'') are independent. Consequently the pivotal quantity
:<math>T \equiv \frac{Z}{\sqrt{V/\nu}} = \left(\overline{X}_n - \mu\right) \frac{\sqrt{n}}{S_n},</math>
which differs from ''Z'' in that the exact standard deviation ''σ'' is replaced by the random variable <math>S_n</math>, has a Student's ''t''-distribution as defined above. Notice that the unknown population variance <math>\sigma^2</math> does not appear in ''T'', since it was in both the numerator and the denominator, so it canceled. Gosset intuitively obtained the probability density function stated above, with ''ν'' equal to ''n'' − 1, and Fisher proved it in 1925.
The distribution of the test statistic ''T'' depends on ''ν'', but not ''μ'' or ''σ''; the lack of dependence on ''μ'' and ''σ'' is what makes the ''t''-distribution important in both theory and practice.
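The lack of dependence on ''μ'' and ''σ'' can be illustrated by simulation. The Python sketch below computes the statistic (x̄ − ''μ'')√''n''/''s'' for many normal samples of size 5 under two very different parameter settings, using independent random streams, and compares an empirical 97.5% quantile. The parameter values, seeds and replication count are arbitrary illustrative choices; the tabulated value for ''ν'' = 4 is about 2.776.

```python
import math
import random

def t_value(sample, mu):
    """t = (xbar - mu) / (s / sqrt(n)) with the Bessel-corrected s."""
    n = len(sample)
    xbar = sum(sample) / n
    s2 = sum((x - xbar) ** 2 for x in sample) / (n - 1)
    return (xbar - mu) / math.sqrt(s2 / n)

def simulated_quantile(mu, sigma, seed, n=5, reps=20000, p=0.975):
    """Empirical p-quantile of the t-value over many normal samples."""
    rng = random.Random(seed)
    values = sorted(
        t_value([rng.gauss(mu, sigma) for _ in range(n)], mu)
        for _ in range(reps)
    )
    return values[int(p * reps)]

q_a = simulated_quantile(mu=0.0, sigma=1.0, seed=1)
q_b = simulated_quantile(mu=250.0, sigma=40.0, seed=2)
print(f"mu=0,   sigma=1 : 97.5% quantile of T is about {q_a:.2f}")
print(f"mu=250, sigma=40: 97.5% quantile of T is about {q_b:.2f}")
```

Both estimates land near the same value up to simulation noise, which is what allows a single table of critical values, indexed only by ''ν'', to serve every normal population.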
As a maximum entropy distribution
Student's ''t''-distribution is the maximum entropy probability distribution for a random variate ''X'' for which <math>\operatorname{E}(\ln(\nu + X^2))</math> is fixed.
Properties
Moments
For ''ν'' > 1, the raw moments of the ''t''-distribution are
:<math>\operatorname{E}(T^k) = \begin{cases} 0 & k \text{ odd},\quad 0 < k < \nu, \\ \dfrac{1}{\sqrt{\pi}\,\Gamma\left(\frac{\nu}{2}\right)} \left[\Gamma\left(\frac{k+1}{2}\right) \Gamma\left(\frac{\nu-k}{2}\right) \nu^{\frac{k}{2}}\right] & k \text{ even},\quad 0 < k < \nu. \end{cases}</math>
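As a worked check of the raw moments, the Python sketch below implements the standard expression for 0 < ''k'' < ''ν'': zero for odd ''k'', and Γ((''k''+1)/2) Γ((''ν''−''k'')/2) ''ν''^(''k''/2) / (√π Γ(''ν''/2)) for even ''k''. It then compares the second moment with the familiar variance ''ν''/(''ν'' − 2). The function name is an illustrative choice.

```python
import math

def t_raw_moment(k, nu):
    """E[T^k] for Student's t with nu degrees of freedom, valid for 0 < k < nu."""
    if not 0 < k < nu:
        raise ValueError("the moment of order k exists only for 0 < k < nu")
    if k % 2 == 1:
        return 0.0  # odd moments vanish by symmetry
    return (math.gamma((k + 1) / 2) * math.gamma((nu - k) / 2) * nu ** (k / 2)
            / (math.sqrt(math.pi) * math.gamma(nu / 2)))

for nu in (3, 5, 10, 30):
    print(f"nu={nu:>2}: E[T^2]={t_raw_moment(2, nu):.4f}  nu/(nu-2)={nu / (nu - 2):.4f}")
```

The second moment exceeds 1 for every finite ''ν'' and approaches 1, the variance of the standard normal, as ''ν'' grows.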