In statistics, the multivariate ''t''-distribution (or multivariate Student distribution) is a multivariate probability distribution. It is a generalization to random vectors of the Student's ''t''-distribution, which is a distribution applicable to univariate random variables. While the case of a random matrix could be treated within this structure, the matrix ''t''-distribution is distinct and makes particular use of the matrix structure.


Definition

One common method of construction of a multivariate ''t''-distribution, for the case of p dimensions, is based on the observation that if \mathbf{y} and u are independent and distributed as N(\mathbf{0},\boldsymbol\Sigma) and \chi^2_\nu (i.e. multivariate normal and chi-squared distributions) respectively, where \boldsymbol\Sigma is a ''p'' × ''p'' matrix, and \mathbf{y}/\sqrt{u/\nu} = \mathbf{x}-\boldsymbol\mu, then \mathbf{x} has the density

: \frac{\Gamma\big((\nu+p)/2\big)}{\Gamma(\nu/2)\,\nu^{p/2}\pi^{p/2}\left|\boldsymbol\Sigma\right|^{1/2}}\left[1+\frac{1}{\nu}(\mathbf{x}-\boldsymbol\mu)^T\boldsymbol\Sigma^{-1}(\mathbf{x}-\boldsymbol\mu)\right]^{-(\nu+p)/2}

and is said to be distributed as a multivariate ''t''-distribution with parameters \boldsymbol\Sigma, \boldsymbol\mu, \nu. Note that \boldsymbol\Sigma is not the covariance matrix, since the covariance is given by \nu/(\nu-2)\,\boldsymbol\Sigma (for \nu>2).

The constructive definition of a multivariate ''t''-distribution simultaneously serves as a sampling algorithm:
# Generate u \sim \chi^2_\nu and \mathbf{y} \sim N(\mathbf{0}, \boldsymbol\Sigma), independently.
# Compute \mathbf{x} \gets \sqrt{\nu/u}\,\mathbf{y} + \boldsymbol\mu.

This formulation gives rise to the hierarchical representation of a multivariate ''t''-distribution as a scale-mixture of normals: u \sim \mathrm{Ga}(\nu/2,\nu/2), where \mathrm{Ga}(a,b) indicates a gamma distribution with density proportional to x^{a-1}e^{-bx}, and \mathbf{x}\mid u conditionally follows N(\boldsymbol\mu, u^{-1}\boldsymbol\Sigma).

In the special case \nu=1, the distribution is a multivariate Cauchy distribution.
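The two-step recipe above is straightforward to implement. The following is a minimal sketch in Python with NumPy (the function name and example parameters are illustrative, not part of the original text):

```python
import numpy as np

def sample_multivariate_t(mu, Sigma, nu, size, rng=None):
    """Sample from t_nu(mu, Sigma) via the constructive definition:
    x = y / sqrt(u / nu) + mu, with y ~ N(0, Sigma) and u ~ chi^2_nu."""
    rng = np.random.default_rng(rng)
    p = len(mu)
    y = rng.multivariate_normal(np.zeros(p), Sigma, size=size)  # y ~ N(0, Sigma)
    u = rng.chisquare(nu, size=size)                            # u ~ chi^2_nu
    return mu + y / np.sqrt(u / nu)[:, None]

# Sanity check: the empirical covariance should approach nu/(nu-2) * Sigma.
mu = np.array([1.0, -2.0])
Sigma = np.array([[2.0, 0.5], [0.5, 1.0]])
x = sample_multivariate_t(mu, Sigma, nu=5.0, size=100_000)
print(np.cov(x.T))  # ~ (5/3) * Sigma
```

Dividing the normal draw by \sqrt{u/\nu} is exactly the scale-mixture representation: conditionally on u, the output is normal with covariance (\nu/u)\boldsymbol\Sigma.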


Derivation

There are in fact many candidates for the multivariate generalization of Student's ''t''-distribution. An extensive survey of the field has been given by Kotz and Nadarajah (2004). The essential issue is to define a probability density function of several variables that is the appropriate generalization of the formula for the univariate case. In one dimension (p=1), with t=x-\mu and \Sigma=1, we have the probability density function

:f(t) = \frac{\Gamma\big((\nu+1)/2\big)}{\sqrt{\nu\pi}\,\Gamma(\nu/2)} (1+t^2/\nu)^{-(\nu+1)/2}

and one approach is to write down a corresponding function of several variables. This is the basic idea of elliptical distribution theory, where one writes down a corresponding function of p variables t_i that replaces t^2 by a quadratic function of all the t_i. It is clear that this only makes sense when all the marginal distributions have the same degrees of freedom \nu. With \mathbf{A} = \boldsymbol\Sigma^{-1}, one has a simple choice of multivariate density function

:f(\mathbf{t}) = \frac{\Gamma\big((\nu+p)/2\big)\left|\mathbf{A}\right|^{1/2}}{(\nu\pi)^{p/2}\,\Gamma(\nu/2)} \left(1+\sum_{i,j=1}^{p} A_{ij} t_i t_j/\nu\right)^{-(\nu+p)/2}

which is the standard but not the only choice. An important special case is the standard bivariate ''t''-distribution, ''p'' = 2:

:f(t_1,t_2) = \frac{\left|\mathbf{A}\right|^{1/2}}{2\pi} \left(1+\sum_{i,j=1}^{2} A_{ij} t_i t_j/\nu\right)^{-(\nu+2)/2}

Note that \frac{\Gamma\big((\nu+2)/2\big)}{\Gamma(\nu/2)\,\nu\pi}= \frac{1}{2\pi}, since \Gamma\big((\nu+2)/2\big) = (\nu/2)\,\Gamma(\nu/2). Now, if \mathbf{A} is the identity matrix, the density is

:f(t_1,t_2) = \frac{1}{2\pi} \left(1+(t_1^2 + t_2^2)/\nu\right)^{-(\nu+2)/2}.

The difficulty with the standard representation is revealed by this formula, which does not factorize into the product of the marginal one-dimensional distributions. When \Sigma is diagonal the standard representation can be shown to have zero correlation, but the marginal distributions do not agree with statistical independence.
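The non-factorization claim is easy to check numerically. Below is a small sketch (assuming SciPy is available; the helper name is ours) comparing the standard bivariate density with \mathbf{A} the identity against the product of its two univariate t_\nu marginals:

```python
import numpy as np
from scipy.stats import t as student_t

def bivariate_t_density(t1, t2, nu):
    """Standard bivariate t density with A = identity (formula above)."""
    return (1 + (t1**2 + t2**2) / nu) ** (-(nu + 2) / 2) / (2 * np.pi)

nu = 4.0
t1, t2 = 0.8, -1.3
joint = bivariate_t_density(t1, t2, nu)
product = student_t.pdf(t1, df=nu) * student_t.pdf(t2, df=nu)
print(joint, product)  # the values differ: the joint does not factorize
```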


Cumulative distribution function

The definition of the cumulative distribution function (cdf) in one dimension can be extended to multiple dimensions by defining the following probability (here \mathbf{x} is a real vector):

: F(\mathbf{x}) = \mathbb{P}(\mathbf{X}\leq \mathbf{x}), \quad \textrm{where}\;\; \mathbf{X}\sim t_\nu(\boldsymbol\mu,\boldsymbol\Sigma).

There is no simple formula for F(\mathbf{x}), but it can be approximated numerically via Monte Carlo integration.
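As a concrete illustration, a crude Monte Carlo estimate of F(\mathbf{x}) can be formed by drawing samples with the constructive definition from the Definition section and counting the fraction that fall below \mathbf{x} componentwise. This is a sketch; specialised quasi-Monte Carlo methods give far better accuracy per sample:

```python
import numpy as np

def mvt_cdf_mc(x, mu, Sigma, nu, n=200_000, rng=None):
    """Monte Carlo estimate of F(x) = P(X <= x componentwise)
    for X ~ t_nu(mu, Sigma)."""
    rng = np.random.default_rng(rng)
    p = len(mu)
    y = rng.multivariate_normal(np.zeros(p), Sigma, size=n)
    u = rng.chisquare(nu, size=n)
    samples = mu + y / np.sqrt(u / nu)[:, None]
    return np.mean(np.all(samples <= x, axis=1))

# For a centred spherical bivariate t, P(X1 <= 0, X2 <= 0) = 1/4 by symmetry:
print(mvt_cdf_mc(np.zeros(2), np.zeros(2), np.eye(2), nu=5.0))  # ~ 0.25
```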


Conditional distribution

The following conditional result was demonstrated by Muirhead, though it was previously derived by Cornish using the simpler ratio representation above. Let the vector X follow a multivariate ''t'' distribution and partition it into two subvectors of p_1 and p_2 elements:

: X_p = \begin{bmatrix} X_1 \\ X_2 \end{bmatrix} \sim t_p \left(\mu_p, \Sigma_{pp}, \nu \right)

where p_1 + p_2 = p, the known mean vector is \mu_p = \begin{bmatrix} \mu_1 \\ \mu_2 \end{bmatrix} and the scale matrix is \Sigma_{pp} = \begin{bmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{bmatrix}. Then

: p(X_2 \mid X_1) \sim t_{p_2} \left( \mu_{2|1},\, \frac{\nu + d_1}{\nu + p_1} \Sigma_{22|1},\, \nu + p_1 \right)

where

: \mu_{2|1} = \mu_2 + \Sigma_{21} \Sigma_{11}^{-1} \left(X_1 - \mu_1 \right) is the conditional mean where it exists, or median otherwise,

: \Sigma_{22|1} = \Sigma_{22} - \Sigma_{21} \Sigma_{11}^{-1} \Sigma_{12} is the Schur complement of \Sigma_{11} \text{ in } \Sigma, and

: d_1 = (X_1 - \mu_1)^T \Sigma_{11}^{-1} (X_1 - \mu_1) is the squared Mahalanobis distance of X_1 from \mu_1 with scale matrix \Sigma_{11}.

See the references for a simple proof of the above conditional distribution.
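The conditional parameters above are simple matrix expressions, so a direct implementation is short. A sketch in Python (the helper name and the convention of partitioning after the first p_1 elements are ours):

```python
import numpy as np

def conditional_t_params(x1, mu, Sigma, nu, p1):
    """Location, scale, and degrees of freedom of X2 | X1 = x1
    for X ~ t_nu(mu, Sigma), partitioned after the first p1 elements."""
    mu1, mu2 = mu[:p1], mu[p1:]
    S11, S12 = Sigma[:p1, :p1], Sigma[:p1, p1:]
    S21, S22 = Sigma[p1:, :p1], Sigma[p1:, p1:]
    S11_inv = np.linalg.inv(S11)
    d1 = (x1 - mu1) @ S11_inv @ (x1 - mu1)       # squared Mahalanobis distance
    mu_cond = mu2 + S21 @ S11_inv @ (x1 - mu1)   # conditional location
    schur = S22 - S21 @ S11_inv @ S12            # Schur complement of S11 in Sigma
    return mu_cond, (nu + d1) / (nu + p1) * schur, nu + p1
```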


Copulas based on the multivariate ''t''

The use of such distributions is enjoying renewed interest due to applications in mathematical finance, especially through the use of the Student's ''t'' copula.
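A Student's ''t'' copula couples uniform marginals through the dependence structure of a multivariate ''t''. A standard construction is to draw \mathbf{x} \sim t_\nu(\mathbf{0}, R) for a correlation matrix R and push each coordinate through the univariate t_\nu CDF (a sketch assuming SciPy; the function name is illustrative):

```python
import numpy as np
from scipy.stats import t as student_t

def sample_t_copula(R, nu, n, rng=None):
    """Draw n observations from a Student's t copula with correlation
    matrix R and nu degrees of freedom; each column is uniform on (0, 1)."""
    rng = np.random.default_rng(rng)
    p = R.shape[0]
    y = rng.multivariate_normal(np.zeros(p), R, size=n)
    u = rng.chisquare(nu, size=n)
    x = y / np.sqrt(u / nu)[:, None]   # x ~ t_nu(0, R)
    return student_t.cdf(x, df=nu)     # probability integral transform

U = sample_t_copula(np.array([[1.0, 0.7], [0.7, 1.0]]), nu=4.0, n=10_000)
```

Unlike the Gaussian copula, the ''t'' copula retains tail dependence, which is one reason for its popularity in finance.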


Elliptical representation

Constructed as an elliptical distribution, and in the simplest centralised case with spherical symmetry and without scaling, \Sigma = \operatorname{I}, the multivariate ''t'' PDF takes the form

: f_X(X)= g(X^T X) = \frac{\Gamma\big(\frac{1}{2}(\nu+p)\big)}{(\nu\pi)^{p/2}\,\Gamma\big(\frac{1}{2}\nu\big)} \bigg( 1 + \nu^{-1} X^T X \bigg)^{-(\nu+p)/2}

where X =(x_1, \cdots ,x_p )^T is a p-vector and \nu = degrees of freedom. The expected covariance of X is

: \int_{-\infty}^\infty \cdots \int_{-\infty}^\infty f_X(x_1,\dots, x_p)\, XX^T \, dx_1 \dots dx_p = \frac{\nu}{\nu-2} \operatorname{I}

The aim is to convert the Cartesian PDF to a radial one. Kibria and Joarder, in a tutorial-style paper, define the radial measure r_2 = R^2 = \frac{X^T X}{p} such that

: \operatorname{E}[r_2] = \int_{-\infty}^\infty \cdots \int_{-\infty}^\infty f_X(x_1,\dots, x_p) \frac{X^T X}{p} \, dx_1 \dots dx_p = \frac{\nu}{\nu-2}

which is equivalent to the expected variance of the p-element vector X treated as a univariate zero-mean random sequence. They note that r_2 follows the Fisher-Snedecor or F distribution:

: r_2 \sim f_F( p,\nu) = B \bigg( \frac{p}{2} , \frac{\nu}{2} \bigg)^{-1} \bigg(\frac{p}{\nu} \bigg)^{p/2} r_2^{p/2-1} \bigg( 1 + \frac{p}{\nu} r_2 \bigg)^{-(p+\nu)/2}

having mean value \operatorname{E}[r_2] = \frac{\nu}{\nu-2}. By a change of random variable to y = \frac{p}{\nu} r_2 = \frac{X^T X}{\nu} in the equation above, retaining the p-vector X, we have

: \operatorname{E}[y] = \int_{-\infty}^\infty \cdots \int_{-\infty}^\infty f_X(X) \frac{X^T X}{\nu} \, dx_1 \dots dx_p = \frac{p}{\nu-2}

and probability distribution

: \begin{aligned} f_Y(y \mid p,\nu) & = \frac{\nu}{p}\, B \bigg( \frac{p}{2} , \frac{\nu}{2} \bigg)^{-1} \bigg(\frac{p}{\nu} \bigg)^{p/2} \bigg(\frac{\nu}{p} \bigg)^{p/2-1} y^{p/2-1} \big( 1 + y \big)^{-(\nu+p)/2} \\ & = B \bigg( \frac{p}{2} , \frac{\nu}{2} \bigg)^{-1} y^{p/2-1}(1+ y )^{-(\nu+p)/2} \end{aligned}

which is a regular Beta-prime distribution y \sim \beta' \big(y; \frac{p}{2} , \frac{\nu}{2} \big) having mean value \frac{\frac{1}{2}p}{\frac{1}{2}\nu-1} = \frac{p}{\nu-2}. The cumulative distribution function of y is thus known to be

: F_Y(y) = I \, \bigg(\frac{y}{1+y} ; \, \frac{p}{2} , \frac{\nu}{2} \bigg)

where I is the incomplete Beta function.

These results can be derived by straightforward transformation of coordinates from Cartesian to spherical. A constant radius surface at R = (X^TX)^{1/2} with PDF p_X(X) \propto \big( 1 + \nu^{-1} R^2 \big)^{-(\nu+p)/2} is an iso-density surface. The quantum of probability in a surface shell of area A_R and thickness \delta R at R is \delta P = p_X(R) \, A_R \, \delta R. The enclosed sphere in p dimensions has surface area A_R = \frac{2\pi^{p/2}R^{p-1}}{\Gamma(p/2)} and substitution into \delta P shows that the shell has element of probability \delta P = p_X(R) \frac{2\pi^{p/2}R^{p-1}}{\Gamma(p/2)} \delta R. This is equivalent to a radial density function

: f_R(R) = \frac{\Gamma\big(\frac{1}{2}(\nu+p)\big)}{(\nu\pi)^{p/2}\,\Gamma\big(\frac{1}{2}\nu\big)} \frac{2 \pi^{p/2} R^{p-1}}{\Gamma(p/2)} \bigg( 1 + \frac{R^2}{\nu} \bigg)^{-(\nu+p)/2}

which simplifies to

: f_R(R) = \frac{2}{\nu^{1/2} \, B\big(\frac{1}{2}p, \frac{1}{2}\nu\big)} \bigg( \frac{R^2}{\nu} \bigg)^{(p-1)/2} \bigg( 1 + \frac{R^2}{\nu} \bigg)^{-(\nu+p)/2}

where B(*,*) is the Beta function. Changing the radial variable to r_2=R^2 (so that here r_2 = X^T X, without the 1/p normalisation used above) gets

: f_{r_2}(r_2) = \frac{1}{\nu \, B\big(\frac{1}{2}p, \frac{1}{2}\nu\big)} \bigg( \frac{r_2}{\nu} \bigg)^{p/2-1} \bigg( 1 + \frac{r_2}{\nu} \bigg)^{-(\nu+p)/2}

Finally, scaling to y= r_2 / \nu returns the previous Beta Prime distribution

: f_Y(y) = \frac{1}{B\big(\frac{1}{2}p, \frac{1}{2}\nu\big)} y^{p/2-1} \big( 1 + y \big)^{-(\nu+p)/2}

To scale the radial variables without changing the radial shape function, define the scale matrix \Sigma = \alpha \operatorname{I}, yielding a 3-parameter Cartesian density function, i.e. the probability \Delta P in volume element dx_1 \dots dx_p is

: \Delta P = f_X(X \mid \alpha, p, \nu)\, dx_1 \dots dx_p = \frac{\Gamma\big(\frac{1}{2}(\nu+p)\big)}{(\nu\pi)^{p/2} \alpha^{p/2}\,\Gamma\big(\frac{1}{2}\nu\big)} \bigg( 1 + \frac{X^T X}{\alpha\nu} \bigg)^{-(\nu+p)/2} \; dx_1 \dots dx_p

or, in terms of the scalar radial variable R,

: f_R(R \mid \alpha, p, \nu) = \frac{2}{\alpha^{1/2}\,\nu^{1/2} \, B\big(\frac{1}{2}p, \frac{1}{2}\nu\big)} \bigg( \frac{R^2}{\alpha\nu} \bigg)^{(p-1)/2} \bigg( 1 + \frac{R^2}{\alpha\nu} \bigg)^{-(\nu+p)/2}

The moments of all the radial variables can be derived from the Beta Prime distribution. If Z \sim \beta'(a,b) then \operatorname{E}(Z^m) = \frac{B(a+m,\, b-m)}{B(a,\, b)}, a known result. Thus, for variable y, proportional to R^2, we have

: \operatorname{E}(y^m) = \frac{B\big(\frac{p}{2}+m,\, \frac{\nu}{2}-m\big)}{B\big(\frac{p}{2},\, \frac{\nu}{2}\big)} = \frac{\Gamma\big(\frac{p}{2}+m\big)\,\Gamma\big(\frac{\nu}{2}-m\big)}{\Gamma\big(\frac{p}{2}\big)\,\Gamma\big(\frac{\nu}{2}\big)}

The moments of r_2 = \nu \, y are

: \operatorname{E}(r_2^m) = \nu^m \operatorname{E}(y^m)

while introducing the scale matrix \Sigma = \alpha \operatorname{I} yields

: \operatorname{E}(r_2^m \mid \alpha) = \alpha^m \nu^m \operatorname{E}(y^m)

Moments relating to the radial variable R are found by setting R =(\alpha\nu y)^{1/2} and M=2m, whereupon

: \operatorname{E}(R^M) = \operatorname{E}\big((\alpha \nu y)^{M/2} \big) = (\alpha \nu )^{M/2} \operatorname{E}(y^{M/2}) = (\alpha \nu )^{M/2} \frac{B\big(\frac{p}{2}+\frac{M}{2},\, \frac{\nu}{2}-\frac{M}{2}\big)}{B\big(\frac{p}{2},\, \frac{\nu}{2}\big)}
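The radial results are easy to verify empirically: for spherical samples, the measure r_2 = X^T X / p should behave like an F(p,\nu) variate with mean \nu/(\nu-2). A sketch (parameter values are illustrative):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
p, nu, n = 3, 8.0, 100_000
y = rng.multivariate_normal(np.zeros(p), np.eye(p), size=n)
u = rng.chisquare(nu, size=n)
x = y / np.sqrt(u / nu)[:, None]              # spherical multivariate t
r2 = np.sum(x**2, axis=1) / p                 # radial measure X^T X / p
print(r2.mean(), nu / (nu - 2))               # both ~ 4/3
print(stats.kstest(r2, stats.f(p, nu).cdf))   # should not reject F(p, nu)
```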


Linear combinations and affine transformation

Following section 3.3 of Kibria et al., let Z be a p-vector sampled from a central spherical multivariate ''t'' distribution with \nu degrees of freedom: Z_p \sim mvt_p(0, \operatorname{I}, \nu). If X is derived from Z via the linear transformation

: X = \mu + \Sigma^{1/2} Z

where \Sigma has full rank, then

: X \sim mvt_p(\mu, \Sigma, \nu)

That is, \operatorname{E}(X) = \mu and the covariance of X is \operatorname{E}\big[(X-\mu)(X-\mu)^T \big] = \frac{\nu}{\nu-2} \Sigma. Furthermore, if A is a non-singular matrix then

: Y = AX + b \sim mvt_p(A \mu + b, A \Sigma A^T, \nu)

with mean \operatorname{E}(Y) = A \mu + b and covariance \operatorname{E}\big[(Y- A \mu -b)(Y- A \mu -b)^T \big] = \frac{\nu}{\nu-2} A\Sigma A^T. Roth (see the references) notes that if A is a p_1 \times p_2 "squat" (wide) matrix with p_1 < p_2 then Y has distribution Y_{p_1} \sim mvt_{p_1}(A \mu + b, A \Sigma A^T, \nu). If A takes the form

: Y_{p_1} = \begin{bmatrix} \operatorname{I}_{p_1 \times p_1} & 0_{p_1 \times p_2} \end{bmatrix} X_p

then the PDF of Y_{p_1} is the marginal distribution of the leading p_1 elements of X_p. The degrees of freedom parameter \nu is invariant throughout.
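The covariance identity under an affine map can likewise be checked by simulation. A sketch (the matrices and parameter values below are illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)
p, nu, n = 3, 7.0, 200_000
mu = np.array([1.0, 0.0, -1.0])
Sigma = np.array([[2.0, 0.3, 0.0],
                  [0.3, 1.0, 0.2],
                  [0.0, 0.2, 0.5]])

# X ~ t_nu(mu, Sigma) via the constructive definition
z = rng.multivariate_normal(np.zeros(p), Sigma, size=n)
x = mu + z / np.sqrt(rng.chisquare(nu, size=n) / nu)[:, None]

A = np.array([[1.0, -1.0, 0.0],
              [0.5,  0.5, 2.0]])   # a 2 x 3 "squat" matrix, p1 < p2
b = np.array([0.0, 1.0])
y = x @ A.T + b                    # Y = A X + b

print(np.cov(y.T))                        # empirical covariance of Y
print(nu / (nu - 2) * A @ Sigma @ A.T)    # theoretical nu/(nu-2) A Sigma A^T
```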


Related concepts

In univariate statistics, the Student's ''t''-test makes use of Student's ''t''-distribution. Hotelling's ''T''-squared distribution is a distribution that arises in multivariate statistics. The matrix ''t''-distribution is a distribution for random variables arranged in a matrix structure.


See also

* Multivariate normal distribution, which is the limiting case of the multivariate Student's ''t''-distribution when \nu\uparrow\infty.
* Chi distribution, the pdf of the scaling factor in the construction of the Student's ''t''-distribution and also the 2-norm (or Euclidean norm) of a multivariate normally distributed vector (centered at zero).
** Rayleigh distribution#Student's t, random vector length of multivariate ''t''-distribution
* Mahalanobis distance


References




External links


Copula Methods vs Canonical Multivariate Distributions: the multivariate Student T distribution with general degrees of freedom