In probability theory and statistics, the normal-inverse-Wishart distribution (or Gaussian-inverse-Wishart distribution) is a multivariate four-parameter family of continuous probability distributions. It is the conjugate prior of a multivariate normal distribution with unknown mean and covariance matrix (the inverse of the precision matrix); see Murphy (2007).

Definition

Suppose

$$\boldsymbol\mu \mid \boldsymbol\mu_0,\lambda,\boldsymbol\Sigma \sim \mathcal{N}\left(\boldsymbol\mu \,\Big|\, \boldsymbol\mu_0,\frac{1}{\lambda}\boldsymbol\Sigma\right)$$

has a multivariate normal distribution with mean $\boldsymbol\mu_0$ and covariance matrix $\tfrac{1}{\lambda}\boldsymbol\Sigma$, where

$$\boldsymbol\Sigma \mid \boldsymbol\Psi,\nu \sim \mathcal{W}^{-1}(\boldsymbol\Sigma \mid \boldsymbol\Psi,\nu)$$

has an inverse Wishart distribution. Then $(\boldsymbol\mu,\boldsymbol\Sigma)$ has a normal-inverse-Wishart distribution, denoted as

$$(\boldsymbol\mu,\boldsymbol\Sigma) \sim \mathrm{NIW}(\boldsymbol\mu_0,\lambda,\boldsymbol\Psi,\nu).$$

Characterization

Probability density function

$$f(\boldsymbol\mu,\boldsymbol\Sigma \mid \boldsymbol\mu_0,\lambda,\boldsymbol\Psi,\nu) = \mathcal{N}\left(\boldsymbol\mu \,\Big|\, \boldsymbol\mu_0,\frac{1}{\lambda}\boldsymbol\Sigma\right) \mathcal{W}^{-1}(\boldsymbol\Sigma \mid \boldsymbol\Psi,\nu)$$

The full version of the PDF is as follows:

$$f(\boldsymbol\mu,\boldsymbol\Sigma \mid \boldsymbol\mu_0,\lambda,\boldsymbol\Psi,\nu) = \frac{\lambda^{D/2}\,|\boldsymbol\Psi|^{\nu/2}\,|\boldsymbol\Sigma|^{-\frac{\nu+D+2}{2}}}{(2\pi)^{D/2}\,2^{\frac{\nu D}{2}}\,\Gamma_D\!\left(\frac{\nu}{2}\right)} \exp\left\{-\frac{1}{2}\operatorname{Tr}\left(\boldsymbol\Psi\boldsymbol\Sigma^{-1}\right) - \frac{\lambda}{2}(\boldsymbol\mu-\boldsymbol\mu_0)^T\boldsymbol\Sigma^{-1}(\boldsymbol\mu-\boldsymbol\mu_0)\right\}$$

Here $D$ is the dimension of $\boldsymbol\mu$, $\Gamma_D[\cdot]$ is the multivariate gamma function and $\operatorname{Tr}(\cdot)$ is the trace of the given matrix.
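The two forms of the density can be checked against each other numerically. The following is a minimal sketch, assuming SciPy's `invwishart` (parameters `df`, `scale`) corresponds to the inverse Wishart with parameters $\nu$ and $\boldsymbol\Psi$; the function name `niw_logpdf` and all numerical values are illustrative and do not come from the article.

```python
import numpy as np
from scipy.special import multigammaln
from scipy.stats import invwishart, multivariate_normal


def niw_logpdf(mu, Sigma, mu0, lam, Psi, nu):
    """Log-density of NIW(mu0, lam, Psi, nu) at (mu, Sigma), from the closed form."""
    D = mu0.shape[0]
    diff = mu - mu0
    Sigma_inv = np.linalg.inv(Sigma)
    _, logdet_Sigma = np.linalg.slogdet(Sigma)
    _, logdet_Psi = np.linalg.slogdet(Psi)
    # Logarithm of the normalising constant in front of the exponential.
    log_norm = (
        0.5 * D * np.log(lam)
        + 0.5 * nu * logdet_Psi
        - 0.5 * D * np.log(2.0 * np.pi)
        - 0.5 * nu * D * np.log(2.0)
        - multigammaln(0.5 * nu, D)
    )
    # Determinant factor of Sigma plus the exponent.
    log_kernel = (
        -0.5 * (nu + D + 2) * logdet_Sigma
        - 0.5 * np.trace(Psi @ Sigma_inv)
        - 0.5 * lam * (diff @ Sigma_inv @ diff)
    )
    return log_norm + log_kernel


# Illustrative values (assumptions, not from the article).
mu0 = np.zeros(2)
lam, nu = 2.0, 5
Psi = np.array([[2.0, 0.3], [0.3, 1.0]])
mu = np.array([0.5, -0.2])
Sigma = np.array([[1.0, 0.2], [0.2, 0.8]])

closed_form = niw_logpdf(mu, Sigma, mu0, lam, Psi, nu)

# Same quantity from the factorisation N(mu | mu0, Sigma / lam) * W^{-1}(Sigma | Psi, nu).
factored = (
    multivariate_normal.logpdf(mu, mean=mu0, cov=Sigma / lam)
    + invwishart.logpdf(Sigma, df=nu, scale=Psi)
)
```

Working on the log scale avoids underflow when $D$ or $\nu$ is large; `closed_form` and `factored` should agree up to floating-point error.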

Properties

Scaling

Marginal distributions

By construction, the marginal distribution over $\boldsymbol\Sigma$ is an inverse Wishart distribution, and the conditional distribution over $\boldsymbol\mu$ given $\boldsymbol\Sigma$ is a multivariate normal distribution. The marginal distribution over $\boldsymbol\mu$ is a multivariate t-distribution.
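For reference, the marginal over $\boldsymbol\mu$ can be written out explicitly. The expression below is the standard result as given, for example, in Murphy (2007); it assumes that $D$ denotes the dimension of $\boldsymbol\mu$, that $\nu > D - 1$, and that the second argument of $t$ is a scale matrix rather than a covariance matrix.

```latex
\boldsymbol\mu \sim t_{\nu - D + 1}\!\left(\boldsymbol\mu_0,\; \frac{\boldsymbol\Psi}{\lambda\,(\nu - D + 1)}\right)
```

That is, a multivariate t-distribution with $\nu - D + 1$ degrees of freedom, location $\boldsymbol\mu_0$ and scale matrix $\boldsymbol\Psi / (\lambda(\nu - D + 1))$.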

Posterior distribution of the parameters

Suppose the sampling density is a multivariate normal distribution

$$\boldsymbol{y}_i \mid \boldsymbol\mu,\boldsymbol\Sigma \sim \mathcal{N}_p(\boldsymbol\mu,\boldsymbol\Sigma)$$

where $\boldsymbol{y}$ is an $n\times p$ matrix and $\boldsymbol{y}_i$ (of length $p$) is row $i$ of the matrix. When the mean and covariance matrix of the sampling distribution are unknown, we can place a normal-inverse-Wishart prior on the mean and covariance parameters jointly:

$$(\boldsymbol\mu,\boldsymbol\Sigma) \sim \mathrm{NIW}(\boldsymbol\mu_0,\lambda,\boldsymbol\Psi,\nu).$$

The resulting posterior distribution for the mean and covariance matrix will also be a normal-inverse-Wishart:

$$(\boldsymbol\mu,\boldsymbol\Sigma \mid \boldsymbol{y}) \sim \mathrm{NIW}(\boldsymbol\mu_n,\lambda_n,\boldsymbol\Psi_n,\nu_n),$$

where

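$$\boldsymbol\mu_n = \frac{\lambda\boldsymbol\mu_0 + n\bar{\boldsymbol{y}}}{\lambda + n}$$

$$\lambda_n = \lambda + n$$

$$\nu_n = \nu + n$$

$$\boldsymbol\Psi_n = \boldsymbol\Psi + \boldsymbol{S} + \frac{\lambda n}{\lambda + n}(\bar{\boldsymbol{y}} - \boldsymbol\mu_0)(\bar{\boldsymbol{y}} - \boldsymbol\mu_0)^T \quad \mathrm{with} \quad \boldsymbol{S} = \sum_{i=1}^{n} (\boldsymbol{y}_i - \bar{\boldsymbol{y}})(\boldsymbol{y}_i - \bar{\boldsymbol{y}})^T$$

and $\bar{\boldsymbol{y}} = \tfrac{1}{n}\sum_{i=1}^{n} \boldsymbol{y}_i$ is the sample mean.

To sample from the joint posterior of $(\boldsymbol\mu,\boldsymbol\Sigma)$, one first draws

$$\boldsymbol\Sigma \mid \boldsymbol{y} \sim \mathcal{W}^{-1}(\boldsymbol\Psi_n,\nu_n),$$

then draws

$$\boldsymbol\mu \mid \boldsymbol\Sigma, \boldsymbol{y} \sim \mathcal{N}_p(\boldsymbol\mu_n,\boldsymbol\Sigma/\lambda_n).$$

To draw from the posterior predictive of a new observation, draw

$$\tilde{\boldsymbol{y}} \mid \boldsymbol\mu, \boldsymbol\Sigma, \boldsymbol{y} \sim \mathcal{N}_p(\boldsymbol\mu,\boldsymbol\Sigma),$$

given the already drawn values of $\boldsymbol\mu$ and $\boldsymbol\Sigma$ (Gelman, Andrew, et al. Bayesian data analysis. Vol. 2, p. 73. Boca Raton, FL, USA: Chapman & Hall/CRC, 2014).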
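The update and the two-stage sampling scheme above can be sketched in a few lines. This is an illustrative sketch only, assuming NumPy and SciPy are available and that SciPy's `invwishart` (`df`, `scale`) matches the $(\nu, \boldsymbol\Psi)$ parameterization used here; the helper name `niw_posterior`, the simulated data and the prior values are hypothetical.

```python
import numpy as np
from scipy.stats import invwishart


def niw_posterior(y, mu0, lam, Psi, nu):
    """Return (mu_n, lam_n, Psi_n, nu_n) for an (n, p) data matrix y."""
    n = y.shape[0]
    y_bar = y.mean(axis=0)                      # sample mean
    centered = y - y_bar
    S = centered.T @ centered                   # scatter matrix about the sample mean
    d = (y_bar - mu0)[:, None]                  # column vector (y_bar - mu0)
    mu_n = (lam * mu0 + n * y_bar) / (lam + n)
    lam_n = lam + n
    nu_n = nu + n
    Psi_n = Psi + S + (lam * n / (lam + n)) * (d @ d.T)
    return mu_n, lam_n, Psi_n, nu_n


# Simulated data and prior values, chosen only for illustration.
rng = np.random.default_rng(0)
y = rng.multivariate_normal([1.0, -1.0], [[1.0, 0.4], [0.4, 2.0]], size=50)
mu_n, lam_n, Psi_n, nu_n = niw_posterior(
    y, mu0=np.zeros(2), lam=1.0, Psi=np.eye(2), nu=4
)

# One joint posterior draw: Sigma first, then mu given Sigma.
Sigma = invwishart.rvs(df=nu_n, scale=Psi_n, random_state=rng)
mu = rng.multivariate_normal(mu_n, Sigma / lam_n)

# One posterior predictive draw given the sampled (mu, Sigma).
y_new = rng.multivariate_normal(mu, Sigma)
```

Repeating the last three sampling lines gives independent draws from the joint posterior and the posterior predictive; no Markov chain is needed because the posterior is available in closed form.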

Generating normal-inverse-Wishart random variates

Generation of random variates is straightforward:

1. Sample $\boldsymbol\Sigma$ from an inverse Wishart distribution with parameters $\boldsymbol\Psi$ and $\nu$.
2. Sample $\boldsymbol\mu$ from a multivariate normal distribution with mean $\boldsymbol\mu_0$ and variance $\tfrac{1}{\lambda}\boldsymbol\Sigma$.
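The two steps translate directly into code. The sketch below is one possible implementation, assuming SciPy's `invwishart.rvs` for the first step and NumPy's `Generator.multivariate_normal` for the second; the function name `sample_niw` and the parameter values are hypothetical.

```python
import numpy as np
from scipy.stats import invwishart


def sample_niw(mu0, lam, Psi, nu, size, rng):
    """Draw `size` pairs (mu, Sigma) from NIW(mu0, lam, Psi, nu)."""
    p = mu0.shape[0]
    mus = np.empty((size, p))
    Sigmas = np.empty((size, p, p))
    for k in range(size):
        # Step 1: Sigma from the inverse Wishart with scale Psi and nu degrees of freedom.
        Sigmas[k] = invwishart.rvs(df=nu, scale=Psi, random_state=rng)
        # Step 2: mu from a normal with mean mu0 and covariance Sigma / lam.
        mus[k] = rng.multivariate_normal(mu0, Sigmas[k] / lam)
    return mus, Sigmas


# Illustrative parameter values (assumptions, not from the article).
rng = np.random.default_rng(42)
mu0 = np.array([1.0, -2.0])
Psi = np.array([[2.0, 0.5], [0.5, 1.0]])
mus, Sigmas = sample_niw(mu0, lam=3.0, Psi=Psi, nu=7, size=2000, rng=rng)
```

As a rough sanity check, the average of `Sigmas` should be close to $\boldsymbol\Psi/(\nu - p - 1)$, the mean of the inverse Wishart distribution for $\nu > p + 1$, and the average of `mus` should be close to $\boldsymbol\mu_0$.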

Related distributions

* The normal-Wishart distribution is essentially the same distribution parameterized by precision rather than variance. If $(\boldsymbol\mu,\boldsymbol\Sigma) \sim \mathrm{NIW}(\boldsymbol\mu_0,\lambda,\boldsymbol\Psi,\nu)$ then $(\boldsymbol\mu,\boldsymbol\Sigma^{-1}) \sim \mathrm{NW}(\boldsymbol\mu_0,\lambda,\boldsymbol\Psi^{-1},\nu)$.
* The normal-inverse-gamma distribution is the one-dimensional equivalent.
* The multivariate normal distribution and inverse Wishart distribution are the component distributions out of which this distribution is made.

Notes

References

* Bishop, Christopher M. (2006). ''Pattern Recognition and Machine Learning.'' Springer Science+Business Media.
* Murphy, Kevin P. (2007). "Conjugate Bayesian analysis of the Gaussian distribution."
