Inverse-Wishart Distribution
In statistics, the inverse Wishart distribution, also called the inverted Wishart distribution, is a probability distribution defined on real-valued positive-definite matrices. In Bayesian statistics it is used as the conjugate prior for the covariance matrix of a multivariate normal distribution. We say \mathbf{X} follows an inverse Wishart distribution, denoted as \mathbf{X} \sim \mathcal{W}^{-1}(\mathbf{\Psi}, \nu), if its inverse \mathbf{X}^{-1} has a Wishart distribution \mathcal{W}(\mathbf{\Psi}^{-1}, \nu). Important identities have been derived for the inverse-Wishart distribution.

Density

The probability density function of the inverse Wishart is:

: f_{\mathbf{X}}(\mathbf{X}; \mathbf{\Psi}, \nu) = \frac{|\mathbf{\Psi}|^{\nu/2}}{2^{\nu p/2}\,\Gamma_p(\nu/2)}\, |\mathbf{X}|^{-(\nu+p+1)/2}\, e^{-\frac{1}{2}\operatorname{tr}(\mathbf{\Psi}\mathbf{X}^{-1})}

where \mathbf{X} and \mathbf{\Psi} are p \times p positive definite matrices, |\cdot| is the determinant, and \Gamma_p(\cdot) is the multivariate gamma function.

Theorems

Distribution of the inverse of a Wishart-distributed matrix

If \mathbf{A} \sim \mathcal{W}(\mathbf{\Sigma}, \nu) and \mathbf{A} is of size p \times p, then \mathbf{X} = \mathbf{A}^{-1} has an inverse Wishart distribution \mathbf{X} \sim \mathcal{W}^{-1}(\mathbf{\Sigma}^{-1}, \nu).
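The defining relation is easy to check numerically (a minimal sketch using SciPy's wishart and invwishart; the names df and scale follow SciPy's conventions, and the values of p, nu and Psi are arbitrary illustrative choices):

    import numpy as np
    from scipy.stats import invwishart, wishart

    p, nu = 3, 7
    Psi = np.diag([2.0, 1.0, 0.5])  # positive-definite scale matrix
    rng = np.random.default_rng(0)

    # Draw W ~ Wishart(Psi^{-1}, nu); X = W^{-1} then follows W^{-1}(Psi, nu).
    W = wishart(df=nu, scale=np.linalg.inv(Psi)).rvs(random_state=rng)
    X = np.linalg.inv(W)

    # The inverse-Wishart density above can be evaluated directly:
    print(invwishart(df=nu, scale=Psi).pdf(X))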
Degrees Of Freedom (statistics)
In statistics, the number of degrees of freedom is the number of values in the final calculation of a statistic that are free to vary. Estimates of statistical parameters can be based upon different amounts of information or data. The number of independent pieces of information that go into the estimate of a parameter is called the degrees of freedom. In general, the degrees of freedom of an estimate of a parameter are equal to the number of independent scores that go into the estimate minus the number of parameters used as intermediate steps in the estimation of the parameter itself. For example, if the variance is to be estimated from a random sample of ''N'' independent scores, then the degrees of freedom are equal to the number of independent scores (''N'') minus the number of parameters estimated as intermediate steps (one, namely, the sample mean), and are therefore equal to ''N'' − 1. Mathematically, degrees of freedom is the number of dimensions of the domain of a random vector.
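The ''N'' − 1 correction described above is exactly what the ddof ("delta degrees of freedom") argument controls in NumPy (a minimal sketch; the data values are arbitrary):

    import numpy as np

    x = np.array([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])
    n = x.size

    # Dividing by N (ddof=0) is NumPy's default and gives the biased estimator.
    print(x.var())
    # Dividing by N - 1 (ddof=1) accounts for the one parameter (the sample
    # mean) estimated as an intermediate step.
    print(x.var(ddof=1))
    print(((x - x.mean()) ** 2).sum() / (n - 1))  # same value, computed by hand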
Conformable Matrix
In mathematics, a matrix is conformable if its dimensions are suitable for defining some operation (''e.g.'' addition, multiplication, etc.).

Examples

* If two matrices have the same dimensions (number of rows and number of columns), they are ''conformable for addition''.
* Multiplication of two matrices is defined if and only if the number of columns of the left matrix is the same as the number of rows of the right matrix. That is, if A is an m × n matrix and B is an s × p matrix, then n needs to be equal to s for the matrix product AB to be defined. In this case, we say that A and B are ''conformable for multiplication'' (in that sequence); a sketch of this check appears after the list.
* Since squaring a matrix involves multiplying it by itself (A² = AA), a matrix must be m × m (that is, it must be a square matrix) to be ''conformable for squaring''. Thus, for example, only a square matrix can be idempotent.
* Only a square matrix is ''conformable for matrix inversion''. However, the Moore–Penrose pseudoinverse and other generalized inverses do not have this requirement.
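A minimal conformability check for multiplication (plain NumPy; the helper name can_multiply is purely illustrative):

    import numpy as np

    def can_multiply(A: np.ndarray, B: np.ndarray) -> bool:
        # A (m x n) and B (s x p) are conformable for the product AB iff n == s.
        return A.shape[1] == B.shape[0]

    A = np.zeros((2, 3))
    B = np.zeros((3, 5))
    print(can_multiply(A, B))  # True: (2 x 3)(3 x 5) gives a 2 x 5 product
    print(can_multiply(B, A))  # False: 5 != 2, so BA is undefined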
Normal-inverse-Wishart Distribution
In probability theory and statistics, the normal-inverse-Wishart distribution (or Gaussian-inverse-Wishart distribution) is a multivariate four-parameter family of continuous probability distributions. It is the conjugate prior of a multivariate normal distribution with unknown mean and covariance matrix (the inverse of the precision matrix) (Murphy, Kevin P. (2007), "Conjugate Bayesian analysis of the Gaussian distribution").

Definition

Suppose

: \boldsymbol\mu \mid \boldsymbol\mu_0, \lambda, \boldsymbol\Sigma \sim \mathcal{N}\left(\boldsymbol\mu \,\Big|\, \boldsymbol\mu_0, \frac{1}{\lambda}\boldsymbol\Sigma\right)

has a multivariate normal distribution with mean \boldsymbol\mu_0 and covariance matrix \tfrac{1}{\lambda}\boldsymbol\Sigma, where

: \boldsymbol\Sigma \mid \boldsymbol\Psi, \nu \sim \mathcal{W}^{-1}(\boldsymbol\Sigma \mid \boldsymbol\Psi, \nu)

has an inverse Wishart distribution. Then (\boldsymbol\mu, \boldsymbol\Sigma) has a normal-inverse-Wishart distribution, denoted as

: (\boldsymbol\mu, \boldsymbol\Sigma) \sim \mathrm{NIW}(\boldsymbol\mu_0, \lambda, \boldsymbol\Psi, \nu).
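Sampling from this hierarchy is a two-step composition: draw \boldsymbol\Sigma from the inverse Wishart, then draw \boldsymbol\mu from the normal given \boldsymbol\Sigma (a minimal sketch with SciPy, which has no built-in NIW distribution; all parameter values are illustrative):

    import numpy as np
    from scipy.stats import invwishart, multivariate_normal

    mu0 = np.zeros(2)
    lam, nu = 2.0, 5.0
    Psi = np.eye(2)
    rng = np.random.default_rng(0)

    # Step 1: Sigma | Psi, nu ~ W^{-1}(Psi, nu)
    Sigma = invwishart(df=nu, scale=Psi).rvs(random_state=rng)
    # Step 2: mu | mu0, lambda, Sigma ~ N(mu0, Sigma / lambda)
    mu = multivariate_normal(mean=mu0, cov=Sigma / lam).rvs(random_state=rng)
    print(mu, Sigma)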
Inverse Matrix Gamma Distribution
In statistics, the inverse matrix gamma distribution is a generalization of the inverse gamma distribution to positive-definite matrices. It is a more general version of the inverse Wishart distribution, and is used similarly, e.g. as the conjugate prior of the covariance matrix of a multivariate normal distribution or matrix normal distribution. The compound distribution resulting from compounding a matrix normal with an inverse matrix gamma prior over the covariance matrix is a generalized matrix t-distribution. This reduces to the inverse Wishart distribution with \nu degrees of freedom when \beta = 2, \alpha = \frac{\nu}{2} (a sketch of the substitution appears after the list below).

See also

* inverse Wishart distribution
* matrix gamma distribution
* matrix normal distribution
* matrix t-distribution
* Wishart distribution, a generalization to multiple dimensions of the gamma distribution, named in honor of John Wishart, who first formulated it in 1928; it is a family of probability distributions defined over symmetric, nonnegative-definite matrix-valued random variables.
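To see the reduction, substitute \beta = 2 and \alpha = \nu/2 into the inverse matrix gamma density (a sketch assuming the standard parametrization with positive-definite scale matrix \mathbf{\Psi}; compare with the inverse-Wishart density in the first entry):

: \frac{|\mathbf{\Psi}|^{\alpha}}{\beta^{p\alpha}\,\Gamma_p(\alpha)}\, |\mathbf{X}|^{-\alpha-(p+1)/2}\, \exp\left(-\tfrac{1}{\beta}\operatorname{tr}(\mathbf{\Psi}\mathbf{X}^{-1})\right) \;\xrightarrow{\;\beta=2,\ \alpha=\nu/2\;}\; \frac{|\mathbf{\Psi}|^{\nu/2}}{2^{\nu p/2}\,\Gamma_p(\nu/2)}\, |\mathbf{X}|^{-(\nu+p+1)/2}\, e^{-\frac{1}{2}\operatorname{tr}(\mathbf{\Psi}\mathbf{X}^{-1})}

which is exactly the inverse-Wishart density given in the first entry.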
Gamma Function
In mathematics, the gamma function (represented by \Gamma, the capital letter gamma from the Greek alphabet) is one commonly used extension of the factorial function to complex numbers. The gamma function is defined for all complex numbers except the non-positive integers. For every positive integer n,

: \Gamma(n) = (n-1)!\,.

Derived by Daniel Bernoulli, for complex numbers with a positive real part, the gamma function is defined via a convergent improper integral:

: \Gamma(z) = \int_0^\infty t^{z-1} e^{-t}\,dt, \qquad \Re(z) > 0\,.

The gamma function is then defined as the analytic continuation of this integral function to a meromorphic function that is holomorphic in the whole complex plane except zero and the negative integers, where the function has simple poles. The gamma function has no zeroes, so the reciprocal gamma function 1/\Gamma(z) is an entire function. In fact, the gamma function corresponds to the Mellin transform of the negative exponential function:

: \Gamma(z) = \mathcal{M}\{e^{-x}\}(z).
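Both the factorial identity and the defining integral are easy to verify numerically (a small sketch; math.gamma is in the Python standard library and quad is SciPy's adaptive integrator):

    import math
    from scipy.integrate import quad

    # Gamma(n) = (n - 1)! for positive integers n.
    print(math.gamma(5), math.factorial(4))  # 24.0 24

    # The convergent improper integral, evaluated numerically at z = 5.
    z = 5
    value, _ = quad(lambda t: t ** (z - 1) * math.exp(-t), 0, math.inf)
    print(value)  # approximately 24.0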
Inverse-gamma Distribution
In probability theory and statistics, the inverse gamma distribution is a two-parameter family of continuous probability distributions on the positive real line, which is the distribution of the reciprocal of a variable distributed according to the gamma distribution. Perhaps the chief use of the inverse gamma distribution is in Bayesian statistics, where the distribution arises as the marginal posterior distribution for the unknown variance of a normal distribution, if an uninformative prior is used, and as an analytically tractable conjugate prior, if an informative prior is required. It is common among some Bayesians to consider an alternative parametrization of the normal distribution in terms of the precision, defined as the reciprocal of the variance, which allows the gamma distribution to be used directly as a conjugate prior. Other Bayesians prefer to parametrize the inverse gamma distribution differently, as a scaled inverse chi-squared distribution.
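The reciprocal relationship with the gamma distribution can be checked by simulation (a minimal sketch with SciPy; the shape a and rate b are arbitrary illustrative values):

    import numpy as np
    from scipy.stats import gamma, invgamma, kstest

    a, b = 3.0, 2.0  # shape and rate
    rng = np.random.default_rng(0)

    # If Y ~ Gamma(shape=a, rate=b), then 1/Y ~ InvGamma(shape=a, scale=b).
    y = gamma(a, scale=1.0 / b).rvs(size=10_000, random_state=rng)
    stat, pvalue = kstest(1.0 / y, invgamma(a, scale=b).cdf)
    print(pvalue)  # a large p-value is consistent with the inverse gamma law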
Univariate
In mathematics, a univariate object is an expression, equation, function or polynomial involving only one variable. Objects involving more than one variable are multivariate. In some cases, the distinction between the univariate and multivariate cases is fundamental; for example, the fundamental theorem of algebra and Euclid's algorithm for polynomials are fundamental properties of univariate polynomials that cannot be generalized to multivariate polynomials. In statistics, a univariate distribution characterizes one variable, although it can be applied in other ways as well. For example, univariate data are composed of a single scalar component. In time series analysis, the whole time series is the "variable": a univariate time series is the series of values over time of a single quantity. Correspondingly, a "multivariate time series" characterizes the changing values over time of several quantities. In some cases, the terminology is ambiguous, since the values within a univariate time series may be treated using certain types of multivariate statistical analyses.
Inverse-chi-squared Distribution
In probability and statistics, the inverse-chi-squared distribution (or inverted-chi-square distribution; Bernardo, J.M. and Smith, A.F.M. (1993), ''Bayesian Theory'', Wiley, pp. 119, 431) is a continuous probability distribution of a positive-valued random variable. It is closely related to the chi-squared distribution. It arises in Bayesian inference, where it can be used as the prior and posterior distribution for an unknown variance of the normal distribution.

Definition

The inverse-chi-squared distribution (or inverted-chi-square distribution) is the probability distribution of a random variable whose multiplicative inverse (reciprocal) has a chi-squared distribution. It is also often defined as the distribution of a random variable whose reciprocal divided by its degrees of freedom is a chi-squared distribution. That is, if X has the chi-squared distribution with \nu degrees of freedom, then according to the first definition, 1/X has the inverse-chi-squared distribution with \nu degrees of freedom; according to the second definition, 1/(\nu X) does.
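The first definition can be checked by simulation (a minimal sketch; it uses the fact that the inverse-chi-squared law with \nu degrees of freedom coincides with an inverse gamma with shape \nu/2 and scale 1/2 in SciPy's parametrization):

    import numpy as np
    from scipy.stats import chi2, invgamma, kstest

    nu = 6.0
    rng = np.random.default_rng(0)

    x = chi2(nu).rvs(size=10_000, random_state=rng)
    # 1/X should follow the inverse-chi-squared distribution with nu degrees
    # of freedom, i.e. an inverse gamma with shape nu/2 and scale 1/2.
    stat, pvalue = kstest(1.0 / x, invgamma(nu / 2, scale=0.5).cdf)
    print(pvalue)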
Vectorization (mathematics)
In mathematics, especially in linear algebra and matrix theory, the vectorization of a matrix is a linear transformation which converts the matrix into a column vector. Specifically, the vectorization of an m × n matrix ''A'', denoted vec(''A''), is the mn × 1 column vector obtained by stacking the columns of the matrix ''A'' on top of one another:

: \operatorname{vec}(A) = [a_{1,1}, \ldots, a_{m,1}, a_{1,2}, \ldots, a_{m,2}, \ldots, a_{1,n}, \ldots, a_{m,n}]^\mathrm{T}

Here, a_{i,j} represents A(i,j) and the superscript ^\mathrm{T} denotes the transpose. Vectorization expresses, through coordinates, the isomorphism \mathbf{R}^{m \times n} := \mathbf{R}^m \otimes \mathbf{R}^n \cong \mathbf{R}^{mn} between these (i.e., of matrices and vectors) as vector spaces. For example, for the 2×2 matrix A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}, the vectorization is \operatorname{vec}(A) = \begin{pmatrix} a \\ c \\ b \\ d \end{pmatrix}. The connection between the vectorization of ''A'' and the vectorization of its transpose is given by the commutation matrix.

Compatibility with Kronecker products

The vectorization is frequently used together with the Kronecker product to express matrix multiplication as a linear transformation on matrices; in particular, \operatorname{vec}(ABC) = (C^\mathrm{T} \otimes A)\operatorname{vec}(B) for conformable matrices A, B, C.
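Column-stacking corresponds to Fortran-order ("column-major") flattening in NumPy (a minimal sketch; vec here is an illustrative one-line helper):

    import numpy as np

    def vec(M: np.ndarray) -> np.ndarray:
        # Stack the columns of M on top of one another.
        return M.flatten(order='F')

    A = np.array([[1, 2],
                  [3, 4]])
    print(vec(A))  # [1 3 2 4], i.e. (a, c, b, d) for [[a, b], [c, d]]

    # Check the Kronecker identity vec(ABC) = (C^T kron A) vec(B).
    B = np.arange(4).reshape(2, 2)
    C = np.arange(4, 8).reshape(2, 2)
    print(np.allclose(vec(A @ B @ C), np.kron(C.T, A) @ vec(B)))  # True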
Commutation Matrix
In mathematics, especially in linear algebra and matrix theory, the commutation matrix is used for transforming the vectorized form of a matrix into the vectorized form of its transpose. Specifically, the commutation matrix K^{(m,n)} is the nm × mn matrix which, for any m × n matrix A, transforms vec(A) into vec(A^T):

: K^{(m,n)} \operatorname{vec}(A) = \operatorname{vec}(A^\mathrm{T}).

Here vec(A) is the mn × 1 column vector obtained by stacking the columns of A on top of one another:

: \operatorname{vec}(A) = [a_{1,1}, \ldots, a_{m,1}, a_{1,2}, \ldots, a_{m,2}, \ldots, a_{1,n}, \ldots, a_{m,n}]^\mathrm{T}

where a_{i,j} is the (i,j) entry of A. In other words, vec(A) is the vector obtained by vectorizing A in column-major order. Similarly, vec(A^T) is the vector obtained by vectorizing A in row-major order. In the context of quantum information theory, the commutation matrix is sometimes referred to as the swap matrix or swap operator.

Properties

* The commutation matrix is a special type of permutation matrix, and is therefore orthogonal.
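The defining property pins down K^{(m,n)} entry by entry: row i·n + j of K picks out position j·m + i of vec(A) (a minimal sketch; commutation_matrix is an illustrative helper, and the loop construction is one of several equivalent ones):

    import numpy as np

    def commutation_matrix(m: int, n: int) -> np.ndarray:
        # vec(A) stores a_{i,j} at index j*m + i; vec(A^T) stores it at i*n + j.
        K = np.zeros((m * n, m * n))
        for i in range(m):
            for j in range(n):
                K[i * n + j, j * m + i] = 1.0
        return K

    A = np.arange(6).reshape(2, 3)  # m = 2, n = 3
    K = commutation_matrix(2, 3)
    vecA = A.flatten(order='F')
    print(np.array_equal(K @ vecA, A.T.flatten(order='F')))  # True
    # K is a permutation matrix, hence orthogonal: K^T K = I.
    print(np.allclose(K.T @ K, np.eye(6)))  # True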
Prior Knowledge For Pattern Recognition
Pattern recognition is a very active field of research intimately bound to machine learning. Also known as classification or statistical classification, pattern recognition aims at building a classifier that can determine the class of an input pattern. This procedure, known as training, corresponds to learning an unknown decision function based only on a set of input-output pairs (\boldsymbol{x}_i, y_i) that form the training data (or training set). Nonetheless, in real-world applications such as character recognition, a certain amount of information about the problem is usually known beforehand. The incorporation of this prior knowledge into the training is the key element that allows an increase of performance in many applications.

Prior Knowledge

Prior knowledge (B. Schölkopf and A. Smola, ''Learning with Kernels'', MIT Press, 2002) refers to all information about the problem available in addition to the training data. However, in this most general form, determining a model from a finite set of samples without prior knowledge is an ill-posed problem.
Marginalize Out
In probability theory and statistics, the marginal distribution of a subset of a collection of random variables is the probability distribution of the variables contained in the subset. It gives the probabilities of various values of the variables in the subset without reference to the values of the other variables. This contrasts with a conditional distribution, which gives the probabilities contingent upon the values of the other variables. Marginal variables are those variables in the subset of variables being retained. These concepts are "marginal" because they can be found by summing values in a table along rows or columns, and writing the sum in the margins of the table. The distribution of the marginal variables (the marginal distribution) is obtained by marginalizing (that is, focusing on the sums in the margin) over the distribution of the variables being discarded, and the discarded variables are said to have been marginalized out. The context here is that the theoretical studies being undertaken, or the data analysis being done, involve a wider set of random variables, but attention is limited to a reduced number of those variables.
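Summing a joint probability table along one axis is marginalization in its most concrete form (a minimal sketch; the joint table values are made up for illustration):

    import numpy as np

    # Joint distribution P(X, Y) with X in {0, 1} (rows) and Y in {0, 1, 2} (columns).
    joint = np.array([[0.10, 0.20, 0.10],
                      [0.25, 0.15, 0.20]])

    # Marginalize out Y: sum across each row and write the result "in the margin".
    p_x = joint.sum(axis=1)
    # Marginalize out X: sum down each column.
    p_y = joint.sum(axis=0)

    print(p_x)        # [0.4 0.6]
    print(p_y)        # [0.35 0.35 0.3]

    # Contrast with a conditional distribution, e.g. P(Y | X = 0):
    print(joint[0] / p_x[0])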