Scatter Matrix

: ''For the notion in quantum mechanics, see scattering matrix.''

In multivariate statistics and probability theory, the scatter matrix is a statistic that is used to make estimates of the covariance matrix, for instance of the multivariate normal distribution.


Definition

Given ''n'' samples of ''m''-dimensional data, represented as the ''m''-by-''n'' matrix <math>X=[\mathbf{x}_1,\mathbf{x}_2,\ldots,\mathbf{x}_n]</math>, the sample mean is
:<math>\overline{\mathbf{x}} = \frac{1}{n}\sum_{j=1}^n \mathbf{x}_j</math>
where <math>\mathbf{x}_j</math> is the ''j''-th column of <math>X</math>.

The scatter matrix is the ''m''-by-''m'' positive semi-definite matrix
:<math>S = \sum_{j=1}^n (\mathbf{x}_j-\overline{\mathbf{x}})(\mathbf{x}_j-\overline{\mathbf{x}})^T = \sum_{j=1}^n (\mathbf{x}_j-\overline{\mathbf{x}})\otimes(\mathbf{x}_j-\overline{\mathbf{x}}) = \left( \sum_{j=1}^n \mathbf{x}_j \mathbf{x}_j^T \right) - n \overline{\mathbf{x}}\, \overline{\mathbf{x}}^T</math>
where <math>(\cdot)^T</math> denotes matrix transpose, and each product of a column vector with a transposed column vector is an outer product. The scatter matrix may be expressed more succinctly as
:<math>S = X\,C_n\,X^T</math>
where <math>C_n</math> is the ''n''-by-''n'' centering matrix.
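
The three expressions for the scatter matrix can be checked against each other numerically. The following is a minimal sketch in plain Python, not a reference implementation: the function names and the sample data are made up for illustration, and the centering matrix is taken to be the standard <math>C_n = I_n - \tfrac{1}{n}\mathbf{1}\mathbf{1}^T</math>, which the text above does not spell out.

```python
# Scatter matrix of samples stored as the columns of X (a list of m rows,
# each of length n), computed three ways. All names are illustrative.

def column_mean(X):
    """Sample mean vector: the average of the n columns of X."""
    n = len(X[0])
    return [sum(row) / n for row in X]

def scatter_definition(X):
    """S = sum_j (x_j - mean)(x_j - mean)^T, built entry by entry."""
    m, n = len(X), len(X[0])
    mean = column_mean(X)
    return [[sum((X[a][j] - mean[a]) * (X[b][j] - mean[b]) for j in range(n))
             for b in range(m)] for a in range(m)]

def scatter_expanded(X):
    """S = (sum_j x_j x_j^T) - n * mean mean^T."""
    m, n = len(X), len(X[0])
    mean = column_mean(X)
    return [[sum(X[a][j] * X[b][j] for j in range(n)) - n * mean[a] * mean[b]
             for b in range(m)] for a in range(m)]

def matmul(A, B):
    """Plain matrix product of two lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def scatter_centering(X):
    """S = X C_n X^T, assuming C_n = I_n - (1/n) * ones(n, n)."""
    n = len(X[0])
    C = [[(1.0 if i == j else 0.0) - 1.0 / n for j in range(n)]
         for i in range(n)]
    Xt = [list(col) for col in zip(*X)]  # transpose of X
    return matmul(matmul(X, C), Xt)

# Hypothetical data: m = 2 variables (rows), n = 4 samples (columns).
X = [[1.0, 2.0, 3.0, 4.0],
     [2.0, 1.0, 4.0, 3.0]]

S = scatter_definition(X)
S_expanded = scatter_expanded(X)
S_centering = scatter_centering(X)
print(S)  # [[5.0, 3.0], [3.0, 5.0]]
```

The entry-by-entry definition is usually the safest of the three numerically. The expanded form subtracts two potentially large numbers and can lose precision when the mean is large relative to the spread of the data.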


Application

The maximum likelihood estimate, given ''n'' samples, for the covariance matrix of a multivariate normal distribution can be expressed as the normalized scatter matrix
:<math>C_{ML}=\frac{1}{n}S.</math>
When the columns of <math>X</math> are independently sampled from a multivariate normal distribution, then <math>S</math> has a Wishart distribution.
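
As a rough illustration of the normalization, the sketch below divides a scatter matrix by ''n'' and compares the diagonal of the result with the population variance from Python's standard `statistics` module. The helper names and the data are hypothetical. The comparison with `statistics.variance`, which divides by ''n'' − 1 as in the usual unbiased sample covariance, is included only for contrast and is not part of the text above.

```python
import statistics

def scatter_matrix(X):
    """Scatter matrix of the columns of X (a list of m rows of length n)."""
    m, n = len(X), len(X[0])
    mean = [sum(row) / n for row in X]
    return [[sum((X[a][j] - mean[a]) * (X[b][j] - mean[b]) for j in range(n))
             for b in range(m)] for a in range(m)]

def ml_covariance(X):
    """Maximum likelihood covariance estimate: the scatter matrix over n."""
    n = len(X[0])
    return [[entry / n for entry in row] for row in scatter_matrix(X)]

# Hypothetical data: m = 2 variables (rows), n = 4 samples (columns).
X = [[1.0, 2.0, 3.0, 4.0],
     [2.0, 1.0, 4.0, 3.0]]

C_ml = ml_covariance(X)
print(C_ml)  # [[1.25, 0.75], [0.75, 1.25]]

# Each diagonal entry of C_ml is the population variance (divisor n)
# of the corresponding row of X.
diag_matches_pvariance = all(
    abs(C_ml[a][a] - statistics.pvariance(X[a])) < 1e-12
    for a in range(len(X))
)

# For contrast: dividing the same scatter entry by n - 1 gives the
# unbiased sample variance, which is larger for any finite n.
n = len(X[0])
unbiased_first = scatter_matrix(X)[0][0] / (n - 1)
```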


See also

* Estimation of covariance matrices
* Sample covariance matrix
* Wishart distribution
* Outer product: <math>XX^\top</math> or <math>X \otimes X</math> is the outer product of <math>X</math> with itself.
* Gram matrix

