In statistics, the multivariate ''t''-distribution (or multivariate Student distribution) is a multivariate probability distribution. It is a generalization to random vectors of the Student's ''t''-distribution, which is a distribution applicable to univariate random variables. While the case of a random matrix could be treated within this structure, the matrix ''t''-distribution is distinct and makes particular use of the matrix structure.
Definition
One common method of construction of a multivariate ''t''-distribution, for the case of <math>p</math> dimensions, is based on the observation that if <math>\mathbf{y}</math> and <math>u</math> are independent and distributed as <math>N(\mathbf{0}, \boldsymbol\Sigma)</math> and <math>\chi^2_\nu</math> (i.e. multivariate normal and chi-squared distributions) respectively, the matrix <math>\boldsymbol\Sigma</math> is a ''p'' × ''p'' positive-definite matrix, and <math>\mathbf{y}/\sqrt{u/\nu} = \mathbf{x} - \boldsymbol\mu</math>, then <math>\mathbf{x}</math> has the density
:<math>\frac{\Gamma\left[(\nu+p)/2\right]}{\Gamma(\nu/2)\,\nu^{p/2}\pi^{p/2}\left|\boldsymbol\Sigma\right|^{1/2}}\left[1+\frac{1}{\nu}(\mathbf{x}-\boldsymbol\mu)^T\boldsymbol\Sigma^{-1}(\mathbf{x}-\boldsymbol\mu)\right]^{-(\nu+p)/2}</math>
and is said to be distributed as a multivariate ''t''-distribution with parameters <math>\boldsymbol\Sigma, \boldsymbol\mu, \nu</math>. Note that <math>\boldsymbol\Sigma</math> is not the covariance matrix since the covariance is given by <math>\frac{\nu}{\nu-2}\boldsymbol\Sigma</math> (for <math>\nu > 2</math>).
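The density above can be evaluated numerically. The following is a minimal sketch in Python using only the standard library; the function names <code>cholesky</code> and <code>mvt_logpdf</code> are illustrative choices, not part of any particular library. It assumes the scale matrix is symmetric positive-definite and works on the log scale, obtaining the quadratic form and the determinant from a Cholesky factor.

```python
import math


def cholesky(a):
    """Lower-triangular L with L L^T = a (a: symmetric positive-definite list of lists)."""
    n = len(a)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L


def mvt_logpdf(x, mu, sigma, nu):
    """Log-density of the multivariate t with location mu, scale matrix sigma, nu d.o.f."""
    p = len(x)
    L = cholesky(sigma)
    # Forward substitution: solve L z = x - mu, so z.z = (x-mu)^T sigma^{-1} (x-mu).
    z = []
    for i in range(p):
        s = (x[i] - mu[i]) - sum(L[i][k] * z[k] for k in range(i))
        z.append(s / L[i][i])
    quad = sum(v * v for v in z)
    # log|sigma| = 2 * sum(log L_ii)
    log_det = 2.0 * sum(math.log(L[i][i]) for i in range(p))
    return (math.lgamma((nu + p) / 2.0) - math.lgamma(nu / 2.0)
            - 0.5 * p * math.log(nu * math.pi) - 0.5 * log_det
            - 0.5 * (nu + p) * math.log1p(quad / nu))
```

For <math>p = 1</math> and unit scale this reduces to the univariate Student density, which gives a simple consistency check.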
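The constructive definition of a multivariate ''t''-distribution simultaneously serves as a sampling algorithm:
# Generate <math>u \sim \chi^2_\nu</math> and <math>\mathbf{y} \sim N(\mathbf{0}, \boldsymbol\Sigma)</math>, independently.
# Compute <math>\mathbf{x} \gets \sqrt{\nu/u}\,\mathbf{y} + \boldsymbol\mu</math>.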
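The two sampling steps can be sketched in Python as follows. This is an illustrative standard-library implementation, not a reference one: <code>sample_mvt</code> and <code>cholesky</code> are hypothetical names, the chi-squared draw uses the fact that a chi-squared variable with <math>\nu</math> degrees of freedom is a gamma variable with shape <math>\nu/2</math> and scale 2, and the normal vector is obtained from a Cholesky factor of the scale matrix.

```python
import math
import random


def cholesky(a):
    """Lower-triangular L with L L^T = a (a: symmetric positive-definite list of lists)."""
    n = len(a)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L


def sample_mvt(mu, sigma, nu, rng):
    """Draw one vector from the multivariate t with location mu, scale sigma, nu d.o.f."""
    p = len(mu)
    L = cholesky(sigma)
    # Step 1a: u ~ chi-squared(nu), drawn as Gamma(shape=nu/2, scale=2).
    u = rng.gammavariate(nu / 2.0, 2.0)
    # Step 1b: y ~ N(0, sigma), drawn independently as L g with g standard normal.
    g = [rng.gauss(0.0, 1.0) for _ in range(p)]
    y = [sum(L[i][k] * g[k] for k in range(i + 1)) for i in range(p)]
    # Step 2: x = sqrt(nu / u) * y + mu.
    scale = math.sqrt(nu / u)
    return [mu[i] + scale * y[i] for i in range(p)]


rng = random.Random(0)  # fixed seed only for reproducibility of the illustration
draw = sample_mvt([1.0, -2.0], [[1.0, 0.5], [0.5, 2.0]], 10.0, rng)
```

With many draws and <math>\nu > 2</math>, the sample covariance should approach <math>\frac{\nu}{\nu-2}\boldsymbol\Sigma</math> rather than <math>\boldsymbol\Sigma</math> itself.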
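This formulation gives rise to the hierarchical representation of a multivariate ''t''-distribution as a scale-mixture of normals: <math>u \sim \mathrm{Ga}(\nu/2, \nu/2)</math>, where <math>\mathrm{Ga}(a, b)</math> indicates a gamma distribution with density proportional to <math>x^{a-1}e^{-bx}</math>, and <math>\mathbf{x} \mid u</math> conditionally follows <math>N(\boldsymbol\mu, u^{-1}\boldsymbol\Sigma)</math>.
In the special case <math>\nu = 1</math>, the distribution is a multivariate Cauchy distribution.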
Derivation
There are in fact many candidates for the multivariate generalization of Student's ''t''-distribution. An extensive survey of the field has been given by Kotz and Nadarajah (2004). The essential issue is to define a probability density function of several variables that is the appropriate generalization of the formula for the univariate case. In one dimension (<math>p = 1</math>), with <math>t = x - \mu</math> and <math>\Sigma = 1</math>, we have the probability density function
:<math>f(t) = \frac{\Gamma[(\nu+1)/2]}{\sqrt{\nu\pi}\,\Gamma[\nu/2]}\left(1 + t^2/\nu\right)^{-(\nu+1)/2}</math>
and one approach is to write down a corresponding function of several variables. This is the basic idea of elliptical distribution theory, where one writes down a corresponding function of <math>p</math> variables <math>t_i</math> that replaces <math>t^2</math> by a quadratic function of all the <math>t_i</math>. It is clear that this only makes sense when all the marginal distributions have the same degrees of freedom <math>\nu</math>. With <math>\mathbf{A} = \boldsymbol\Sigma^{-1}</math>, one has a simple choice of multivariate density function
:<math>f(\mathbf{t}) = \frac{\Gamma((\nu+p)/2)\left|\mathbf{A}\right|^{1/2}}{\sqrt{\nu^p\pi^p}\,\Gamma(\nu/2)}\left(1 + \sum_{i,j=1}^{p} A_{ij} t_i t_j/\nu\right)^{-(\nu+p)/2}</math>
which is the standard but not the only choice.
An important special case is the standard bivariate ''t''-distribution, ''p'' = 2:
:<math>f(t_1, t_2) = \frac{\left|\mathbf{A}\right|^{1/2}}{2\pi}\left(1 + \sum_{i,j=1}^{2} A_{ij} t_i t_j/\nu\right)^{-(\nu+2)/2}</math>
Note that <math>\frac{\Gamma\left(\frac{\nu+2}{2}\right)}{\pi\,\nu\,\Gamma\left(\frac{\nu}{2}\right)} = \frac{1}{2\pi}</math>.
Now, if <math>\mathbf{A}</math> is the identity matrix, the density is
:<math>f(t_1, t_2) = \frac{1}{2\pi}\left(1 + (t_1^2 + t_2^2)/\nu\right)^{-(\nu+2)/2}.</math>
The difficulty with the standard representation is revealed by this formula, which does not factorize into the product of the marginal one-dimensional distributions. When <math>\boldsymbol\Sigma</math> is diagonal the standard representation can be shown to have zero correlation, but the marginal distributions are still not statistically independent.
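The failure to factorize can be checked numerically. The sketch below (illustrative function names, standard library only) evaluates the standard bivariate density with identity scale matrix and compares it with the product of two univariate Student densities with the same degrees of freedom, which is what the joint density would be if the components were independent.

```python
import math


def bivariate_t_pdf(t1, t2, nu):
    """Standard bivariate t density with identity scale matrix."""
    return (1.0 / (2.0 * math.pi)) * (1.0 + (t1 * t1 + t2 * t2) / nu) ** (-(nu + 2.0) / 2.0)


def univariate_t_pdf(t, nu):
    """Univariate Student t density (the marginal of each component)."""
    c = math.gamma((nu + 1.0) / 2.0) / (math.sqrt(nu * math.pi) * math.gamma(nu / 2.0))
    return c * (1.0 + t * t / nu) ** (-(nu + 1.0) / 2.0)


nu = 3.0
joint = bivariate_t_pdf(1.0, 1.0, nu)
product_of_marginals = univariate_t_pdf(1.0, nu) ** 2
# The two values differ, although the components are uncorrelated.
print(joint, product_of_marginals)
```

The gap between the two values shrinks as <math>\nu</math> grows, consistent with the multivariate normal limit in which uncorrelated components are independent.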
Cumulative distribution function
The definition of the cumulative distribution function (cdf) in one dimension can be extended to multiple dimensions by defining the following probability (here <math>\mathbf{x}</math> is a real vector and the inequality is understood componentwise):
:<math>F(\mathbf{x}) = \mathbb{P}(\mathbf{X} \leq \mathbf{x}), \quad \text{where } \mathbf{X} \sim t_\nu(\boldsymbol\mu, \boldsymbol\Sigma).</math>
There is no simple formula for <math>F(\mathbf{x})</math>, but it can be approximated numerically via Monte Carlo integration.
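A plain Monte Carlo estimate of the cdf can be sketched as follows. For brevity the example is restricted to the standard case with zero location and identity scale matrix, which is an assumption of this sketch rather than a limitation of the method; <code>mvt_cdf_mc</code> is a hypothetical name. Samples are drawn by the normal over chi-squared construction and the estimate is the fraction of draws that fall componentwise below the evaluation point.

```python
import math
import random


def mvt_cdf_mc(x, nu, n_samples, rng):
    """Monte Carlo estimate of P(X <= x) componentwise, X ~ t_p(0, I, nu)."""
    p = len(x)
    hits = 0
    for _ in range(n_samples):
        u = rng.gammavariate(nu / 2.0, 2.0)      # chi-squared(nu)
        scale = math.sqrt(nu / u)
        draw = [scale * rng.gauss(0.0, 1.0) for _ in range(p)]
        if all(draw[i] <= x[i] for i in range(p)):
            hits += 1
    return hits / n_samples


rng = random.Random(1)  # fixed seed only for reproducibility of the illustration
estimate = mvt_cdf_mc([0.0, 0.0], 4.0, 20000, rng)
```

By spherical symmetry the exact value at the origin in two dimensions is 1/4, so the estimate should be close to that value, with a standard error of order <math>n^{-1/2}</math> in the number of samples.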
Conditional Distribution
The conditional distribution was demonstrated by Muirhead, though it had previously been derived by Cornish using the simpler ratio representation above. Let vector <math>X</math> follow the multivariate ''t'' distribution and partition into two subvectors of <math>p_1, p_2</math> elements:
:<math>X_p = \begin{bmatrix} X_1 \\ X_2 \end{bmatrix} \sim t_p\left(\mu_p, \Sigma_{p \times p}, \nu\right)</math>
where <math>p_1 + p_2 = p</math>, the known mean vector is <math>\mu_p = \begin{bmatrix} \mu_1 \\ \mu_2 \end{bmatrix}</math> and the scale matrix is <math>\Sigma_{p \times p} = \begin{bmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{bmatrix}</math>.
Then
:<math>X_2 \mid X_1 \sim t_{p_2}\left(\mu_{2|1}, \frac{\nu + d_1}{\nu + p_1}\Sigma_{22|1}, \nu + p_1\right)</math>
where
:<math>\mu_{2|1} = \mu_2 + \Sigma_{21}\Sigma_{11}^{-1}\left(X_1 - \mu_1\right)</math> is the conditional mean where it exists or median otherwise.
:<math>\Sigma_{22|1} = \Sigma_{22} - \Sigma_{21}\Sigma_{11}^{-1}\Sigma_{12}</math> is the Schur complement of <math>\Sigma_{11}</math> in <math>\Sigma</math>.
:<math>d_1 = (X_1 - \mu_1)^T\Sigma_{11}^{-1}(X_1 - \mu_1)</math> is the squared Mahalanobis distance of <math>X_1</math> from <math>\mu_1</math> with scale matrix <math>\Sigma_{11}</math>.
A simple proof of the above conditional distribution is available in the literature.
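As a worked illustration, the conditional parameters can be computed directly in the bivariate case, where both subvectors are scalars (<math>p_1 = p_2 = 1</math>) and the matrix inverses reduce to divisions. The function name <code>conditional_t_params</code> is hypothetical and the sketch covers only this scalar partition.

```python
def conditional_t_params(x1, mu, sigma, nu):
    """Location, scale and d.o.f. of X2 | X1 = x1 for bivariate X ~ t_2(mu, sigma, nu)."""
    (s11, s12), (s21, s22) = sigma
    p1 = 1                                      # dimension of the conditioning block
    mu_cond = mu[1] + s21 / s11 * (x1 - mu[0])  # conditional location
    schur = s22 - s21 * s12 / s11               # Schur complement of s11 in sigma
    d1 = (x1 - mu[0]) ** 2 / s11                # squared Mahalanobis distance of x1
    scale = (nu + d1) / (nu + p1) * schur       # conditional scale
    return mu_cond, scale, nu + p1


# Example: zero location, unit scales, scale-matrix off-diagonal 0.5, nu = 3, observed x1 = 1.
params = conditional_t_params(1.0, [0.0, 0.0], [[1.0, 0.5], [0.5, 1.0]], 3.0)
```

Unlike the multivariate normal case, the conditional scale depends on the observed value through <math>d_1</math>: an observation far from <math>\mu_1</math> inflates the conditional scale, while the degrees of freedom increase by <math>p_1</math>.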
Copulas based on the multivariate ''t''
The use of such distributions is enjoying renewed interest due to applications in mathematical finance, especially through the use of the Student's ''t'' copula.
Elliptical Representation
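Constructed as an elliptical distribution, and in the simplest centralised case with spherical symmetry and without scaling, <math>\Sigma = \operatorname{I}</math>, the multivariate ''t'' PDF takes the form
:<math>f_X(X) = g(X^T X) = \frac{\Gamma\big(\frac{1}{2}(\nu + p)\big)}{(\nu\pi)^{p/2}\,\Gamma\big(\frac{1}{2}\nu\big)}\bigg(1 + \nu^{-1} X^T X\bigg)^{-(\nu + p)/2}</math>
where <math>X = (x_1, \cdots, x_p)^T</math> is a <math>p</math>-vector and <math>\nu</math> = degrees of freedom. The expected covariance of <math>X</math> is
:<math>\operatorname{E}\left(XX^T\right) = \int_{-\infty}^\infty \cdots \int_{-\infty}^\infty f_X(x_1, \dots, x_p)\, XX^T \, dx_1 \dots dx_p = \frac{\nu}{\nu - 2}\operatorname{I}, \quad \nu > 2.</math>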
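The aim is to convert the Cartesian PDF to a radial one. Kibria and Joarder, in a tutorial-style paper, define radial measure <math>r_2 = \frac{X^T X}{p}</math> such that <math>\operatorname{E}[r_2] = \int_{-\infty}^\infty \cdots \int_{-\infty}^\infty f_X(x_1, \dots, x_p)\, \frac{X^T X}{p}\, dx_1 \dots dx_p = \frac{\nu}{\nu - 2}</math>, which is equivalent to the expected variance of <math>p</math>-element vector <math>X</math> treated as a univariate zero-mean random sequence. They note that <math>r_2</math> follows the Fisher-Snedecor or <math>F</math> distribution:
:<math>r_2 \sim f_F(p, \nu) = B\bigg(\frac{p}{2}, \frac{\nu}{2}\bigg)^{-1}\bigg(\frac{p}{\nu}\bigg)^{p/2} r_2^{\,p/2 - 1}\bigg(1 + \frac{p}{\nu} r_2\bigg)^{-(p + \nu)/2}</math>
having mean value <math>\operatorname{E}[r_2] = \frac{\nu}{\nu - 2}</math>.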
By a change of random variable to <math>y = \frac{p}{\nu} r_2 = \frac{X^T X}{\nu}</math> in the equation above, retaining <math>p</math>-vector <math>X</math>, we have <math>\operatorname{E}[y] = \frac{p}{\nu - 2}</math> and probability distribution
:<math>f_Y(y \mid p, \nu) = B\bigg(\frac{p}{2}, \frac{\nu}{2}\bigg)^{-1} y^{\,p/2 - 1}(1 + y)^{-(\nu + p)/2}</math>
which is a regular Beta-prime distribution <math>y \sim \beta'\bigg(y; \frac{p}{2}, \frac{\nu}{2}\bigg)</math> having mean value <math>\frac{\frac{1}{2}p}{\frac{1}{2}\nu - 1} = \frac{p}{\nu - 2}</math>. The cumulative distribution function of <math>y</math> is thus known to be <math>F_Y(y) = I\bigg(\frac{y}{1 + y}; \frac{p}{2}, \frac{\nu}{2}\bigg) B\bigg(\frac{p}{2}, \frac{\nu}{2}\bigg)^{-1}</math> where <math>I</math> is the incomplete Beta function.
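The Beta-prime law of the radial variable can be checked by simulation. The sketch below takes <math>p = 2</math>, an assumption made so that the cdf has an elementary closed form: with first parameter <math>p/2 = 1</math> the Beta-prime density is <math>\frac{\nu}{2}(1 + y)^{-(\nu + 2)/2}</math>, which integrates to <math>1 - (1 + y)^{-\nu/2}</math>. The function names are illustrative.

```python
import math
import random


def radial_y_samples(nu, n, rng, p=2):
    """Draw y = X^T X / nu for X ~ t_p(0, I, nu) via the normal / chi-squared construction."""
    ys = []
    for _ in range(n):
        u = rng.gammavariate(nu / 2.0, 2.0)                       # chi-squared(nu)
        norm_sq = sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(p))  # ||z||^2, z ~ N(0, I)
        ys.append((nu / u) * norm_sq / nu)                        # X^T X / nu
    return ys


def beta_prime_cdf_p2(y, nu):
    """Closed-form Beta-prime(1, nu/2) cdf, valid for the p = 2 case only."""
    return 1.0 - (1.0 + y) ** (-nu / 2.0)


rng = random.Random(2)  # fixed seed only for reproducibility of the illustration
ys = radial_y_samples(4.0, 20000, rng)
empirical = sum(1 for y in ys if y <= 1.0) / len(ys)
theoretical = beta_prime_cdf_p2(1.0, 4.0)
```

For other values of <math>p</math> the comparison requires a numerical incomplete Beta function, which the Python standard library does not provide.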
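These results can be derived by straightforward transformation of coordinates from Cartesian to spherical. A constant radius surface at <math>R = (X^T X)^{1/2}</math> with PDF <math>p_X(X) \propto \bigg(1 + \nu^{-1} R^2\bigg)^{-(\nu + p)/2}</math> is an iso-density surface. The quantum of probability in a surface shell of area <math>A_R</math> and thickness <math>\delta R</math> at <math>R</math> is <math>\delta P = p_X(R)\, A_R\, \delta R</math>.
The enclosed sphere in <math>p</math> dimensions has surface area <math>A_R = \frac{2\pi^{p/2} R^{\,p-1}}{\Gamma(p/2)}</math> and substitution into <math>\delta P</math> shows that the shell has element of probability <math>\delta P = p_X(R)\, \frac{2\pi^{p/2} R^{\,p-1}}{\Gamma(p/2)}\, \delta R</math>. This is equivalent to a radial density function
:<math>f_R(R) = \frac{\Gamma\big(\frac{1}{2}(\nu + p)\big)}{\nu^{p/2}\pi^{p/2}\,\Gamma\big(\frac{1}{2}\nu\big)}\, \frac{2\pi^{p/2} R^{\,p-1}}{\Gamma(p/2)}\bigg(1 + \frac{R^2}{\nu}\bigg)^{-(\nu + p)/2}</math>
which simplifies to <math>f_R(R) = \frac{2}{\nu^{1/2} B\big(\frac{1}{2}p, \frac{1}{2}\nu\big)}\bigg(\frac{R^2}{\nu}\bigg)^{(p-1)/2}\bigg(1 + \frac{R^2}{\nu}\bigg)^{-(\nu + p)/2}</math> where <math>B(\cdot, \cdot)</math> is the Beta function. Changing the radial variable to <math>z = R^2</math> gets
:<math>f_Z(z) = \frac{1}{\nu\, B\big(\frac{1}{2}p, \frac{1}{2}\nu\big)}\bigg(\frac{z}{\nu}\bigg)^{p/2 - 1}\bigg(1 + \frac{z}{\nu}\bigg)^{-(\nu + p)/2}</math>
Finally, scaling to <math>y = z/\nu = R^2/\nu</math> returns the previous Beta Prime distribution
:<math>f_Y(y) = \frac{1}{B\big(\frac{1}{2}p, \frac{1}{2}\nu\big)}\, y^{\,p/2 - 1}\bigg(1 + y\bigg)^{-(\nu + p)/2}</math>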
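To scale the radial variables without changing the radial shape function, define scale matrix <math>\Sigma = \alpha \operatorname{I}_{p \times p}</math>, yielding a 3-parameter Cartesian density function, i.e. the probability <math>\Delta_P</math> in volume element <math>dx_1 \dots dx_p</math> is
:<math>\Delta_P\big(f_X(X \mid \alpha, p, \nu)\big) = \frac{\Gamma\big(\frac{1}{2}(\nu + p)\big)}{(\nu\pi)^{p/2}\alpha^{p/2}\,\Gamma\big(\frac{1}{2}\nu\big)}\bigg(1 + \frac{X^T X}{\alpha\nu}\bigg)^{-(\nu + p)/2} dx_1 \dots dx_p</math>
or, in terms of scalar radial variable <math>R</math>,
:<math>f_R(R \mid \alpha, p, \nu) = \frac{2}{\alpha^{1/2}\nu^{1/2} B\big(\frac{1}{2}p, \frac{1}{2}\nu\big)}\bigg(\frac{R^2}{\alpha\nu}\bigg)^{(p-1)/2}\bigg(1 + \frac{R^2}{\alpha\nu}\bigg)^{-(\nu + p)/2}</math>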
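The moments of all the radial variables can be derived from the Beta Prime distribution. If <math>Z \sim \beta'(a, b)</math> then <math>\operatorname{E}(Z^m) = \frac{B(a + m, b - m)}{B(a, b)}</math>, a known result. Thus, for variable <math>y = \frac{p}{\nu} r_2</math>, proportional to <math>r_2</math>, we have
:<math>\operatorname{E}(y^m) = \frac{B\big(\frac{1}{2}p + m, \frac{1}{2}\nu - m\big)}{B\big(\frac{1}{2}p, \frac{1}{2}\nu\big)} = \frac{\Gamma\big(\frac{1}{2}p + m\big)\,\Gamma\big(\frac{1}{2}\nu - m\big)}{\Gamma\big(\frac{1}{2}p\big)\,\Gamma\big(\frac{1}{2}\nu\big)}, \quad \nu/2 > m</math>
The moments of <math>r_2 = \frac{\nu}{p}\, y</math> are
:<math>\operatorname{E}(r_2^m) = \bigg(\frac{\nu}{p}\bigg)^m \operatorname{E}(y^m)</math>
while introducing the scale matrix <math>\alpha \operatorname{I}</math> yields
:<math>\operatorname{E}(r_2^m \mid \alpha) = \alpha^m \bigg(\frac{\nu}{p}\bigg)^m \operatorname{E}(y^m)</math>
Moments relating to radial variable <math>R</math> are found by setting <math>R = (\alpha\nu y)^{1/2}</math> and <math>M = 2m</math> whereupon
:<math>\operatorname{E}(R^M) = (\alpha\nu)^{M/2}\operatorname{E}(y^{M/2}) = (\alpha\nu)^{M/2}\, \frac{B\big(\frac{1}{2}(p + M), \frac{1}{2}(\nu - M)\big)}{B\big(\frac{1}{2}p, \frac{1}{2}\nu\big)}</math>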
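The moment formulas are straightforward to evaluate with log-gamma functions. The following sketch uses hypothetical function names and assumes a scale matrix of the form <math>\alpha \operatorname{I}</math>, with <math>R</math> the Euclidean length of <math>X</math>; it computes Beta-prime moments as a ratio of Beta functions and from them the moments of <math>R</math>.

```python
import math


def beta_prime_moment(m, a, b):
    """E[Z^m] = B(a + m, b - m) / B(a, b) for Z ~ Beta-prime(a, b); needs -a < m < b."""
    if not (-a < m < b):
        raise ValueError("moment exists only for -a < m < b")
    # The common factor Gamma(a + b) cancels in the ratio of Beta functions.
    return math.exp(math.lgamma(a + m) + math.lgamma(b - m)
                    - math.lgamma(a) - math.lgamma(b))


def radial_moment(M, p, nu, alpha=1.0):
    """E[R^M] for R = ||X||, X ~ t_p(0, alpha * I, nu); needs nu > M."""
    return (alpha * nu) ** (M / 2.0) * beta_prime_moment(M / 2.0, p / 2.0, nu / 2.0)


# First moment of y for p = 3, nu = 5: p / (nu - 2).
mean_y = beta_prime_moment(1.0, 1.5, 2.5)
# Second radial moment: the trace of the covariance, alpha * p * nu / (nu - 2).
mean_r2 = radial_moment(2.0, 3, 5.0, alpha=2.0)
```

The second radial moment equals the trace of the covariance matrix <math>\frac{\nu}{\nu - 2}\alpha \operatorname{I}</math>, which provides a check against the covariance stated earlier.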
Linear Combinations and Affine Transformation
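Following section 3.3 of Kibria et al., let <math>Z</math> be a <math>p</math>-vector sampled from a central spherical multivariate ''t'' distribution with <math>\nu</math> degrees of freedom: <math>Z \sim t_p(0, \operatorname{I}, \nu)</math>. <math>X</math> is derived from <math>Z</math> via a linear transformation:
:<math>X = \mu + \Sigma^{1/2} Z</math>
where <math>\Sigma</math> has full rank, then
:<math>f_X(X) = \frac{\Gamma\big(\frac{1}{2}(\nu + p)\big)}{\left|\Sigma\right|^{1/2}(\nu\pi)^{p/2}\,\Gamma\big(\frac{1}{2}\nu\big)}\bigg(1 + \nu^{-1}(X - \mu)^T\Sigma^{-1}(X - \mu)\bigg)^{-(\nu + p)/2}</math>
That is <math>X \sim t_p(\mu, \Sigma, \nu)</math> and the covariance of <math>X</math> is <math>\operatorname{E}\left[(X - \mu)(X - \mu)^T\right] = \frac{\nu}{\nu - 2}\Sigma</math> (for <math>\nu > 2</math>).
Furthermore, if <math>A</math> is a non-singular matrix then
:<math>Y = AX + b \sim t_p\left(A\mu + b, A\Sigma A^T, \nu\right)</math>
with mean <math>A\mu + b</math> and covariance <math>\frac{\nu}{\nu - 2}A\Sigma A^T</math>.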
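Roth notes that if <math>A</math> is an <math>m \times p</math> squat matrix of full row rank with <math>m < p</math> then <math>Y</math> has distribution <math>Y = AX + b \sim t_m\left(A\mu + b, A\Sigma A^T, \nu\right)</math>.
If <math>A</math> takes the form <math>\begin{bmatrix} \operatorname{I}_{m \times m} & 0_{m \times (p - m)} \end{bmatrix}</math> then the PDF of <math>Y</math> is the marginal distribution of the leading <math>m</math> elements of <math>X</math>.
The degrees of freedom parameter <math>\nu</math> is invariant throughout.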
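The parameter mapping under an affine transformation can be sketched with a few lines of list-based matrix arithmetic. The helper names below are illustrative; the sketch simply returns the location, scale matrix and degrees of freedom of <math>Y = AX + b</math>, and shows how a selection matrix extracts a marginal.

```python
def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]


def transpose(a):
    """Transpose of a matrix stored as a list of rows."""
    return [list(row) for row in zip(*a)]


def affine_t_params(A, b, mu, sigma, nu):
    """Location, scale matrix and d.o.f. of Y = A X + b for X ~ t_p(mu, sigma, nu)."""
    loc = [sum(A[i][k] * mu[k] for k in range(len(mu))) + b[i] for i in range(len(A))]
    scale = matmul(matmul(A, sigma), transpose(A))
    return loc, scale, nu  # the degrees of freedom are unchanged


mu = [1.0, 2.0, 3.0]
sigma = [[2.0, 0.5, 0.1],
         [0.5, 1.0, 0.3],
         [0.1, 0.3, 4.0]]
# Selecting the leading two components gives their marginal distribution.
select = [[1.0, 0.0, 0.0],
          [0.0, 1.0, 0.0]]
marginal = affine_t_params(select, [0.0, 0.0], mu, sigma, 5.0)
```

The marginal keeps the leading sub-block of the scale matrix and the original degrees of freedom, in contrast to conditioning, which changes both.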
Related concepts
In univariate statistics, the Student's ''t''-test makes use of Student's ''t''-distribution. Hotelling's ''T''-squared distribution is a distribution that arises in multivariate statistics. The matrix ''t''-distribution is a distribution for random variables arranged in a matrix structure.
See also
* Multivariate normal distribution, which is the limiting case of the multivariate Student's ''t''-distribution when <math>\nu \to \infty</math>.
* Chi distribution, the distribution of the scaling factor in the construction of the Student's ''t''-distribution and also the 2-norm (or Euclidean norm) of a multivariate normally distributed vector (centered at zero).
** Rayleigh distribution#Student's t, random vector length of multivariate ''t''-distribution
* Mahalanobis distance
External links
Copula Methods vs Canonical Multivariate Distributions: the multivariate Student T distribution with general degrees of freedom