In the mathematical theory of random matrices, the Marchenko–Pastur distribution, or Marchenko–Pastur law, describes the asymptotic behavior of singular values of large rectangular random matrices. The theorem is named after the Ukrainian mathematicians Vladimir Marchenko and Leonid Pastur, who proved this result in 1967. If
$X$ denotes an $m \times n$ random matrix whose entries are independent identically distributed random variables with mean 0 and variance $\sigma^2 < \infty$, let

$$Y_n = \frac{1}{n} X X^T$$

and let $\lambda_1, \lambda_2, \dots, \lambda_m$ be the eigenvalues of $Y_n$ (viewed as random variables). Finally, consider the random measure

$$\mu_m(A) = \frac{1}{m} \# \left\{ \lambda_j \in A \right\}, \quad A \subset \mathbb{R},$$

counting the number of eigenvalues in the subset $A$ included in $\mathbb{R}$.

Theorem. Assume that
$m, n \to \infty$ so that the ratio $m/n \to \lambda \in (0, +\infty)$. Then $\mu_m \to \mu$ (in the weak* topology, that is, in distribution), where

$$\mu(A) = \begin{cases} \left(1 - \frac{1}{\lambda}\right) \mathbf{1}_{0 \in A} + \nu(A), & \text{if } \lambda > 1 \\ \nu(A), & \text{if } 0 \leq \lambda \leq 1, \end{cases}$$

and

$$d\nu(x) = \frac{1}{2\pi\sigma^2} \frac{\sqrt{(\lambda_+ - x)(x - \lambda_-)}}{\lambda x} \, \mathbf{1}_{x \in [\lambda_-, \lambda_+]} \, dx$$

with

$$\lambda_\pm = \sigma^2 \left(1 \pm \sqrt{\lambda}\right)^2.$$
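
As a rough numerical illustration of the theorem, the sketch below compares the first two moments of the empirical measure with those of the limiting measure. It uses the identity that the k-th moment of the empirical eigenvalue measure equals tr(Y^k)/m, so no eigenvalue solver is needed. The function names, the Gaussian entries, the matrix sizes and the midpoint quadrature are illustrative assumptions, not part of the original result; the density coded here is sqrt((lambda_+ - x)(x - lambda_-)) / (2 pi sigma^2 lambda x) on [lambda_-, lambda_+].

```python
import math
import random


def mp_bounds(lam, sigma2=1.0):
    """Support edges lambda_- and lambda_+ of the continuous part nu."""
    root = math.sqrt(lam)
    return sigma2 * (1.0 - root) ** 2, sigma2 * (1.0 + root) ** 2


def mp_density(x, lam, sigma2=1.0):
    """Density of nu at x (zero outside the open interval between the edges)."""
    lo, hi = mp_bounds(lam, sigma2)
    if x <= lo or x >= hi:
        return 0.0
    return math.sqrt((hi - x) * (x - lo)) / (2.0 * math.pi * sigma2 * lam * x)


def mp_moment(k, lam, sigma2=1.0, steps=20000):
    """k-th moment (k >= 1) of mu by midpoint quadrature.

    The atom at 0 (present when lam > 1) contributes nothing for k >= 1.
    """
    lo, hi = mp_bounds(lam, sigma2)
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        x = lo + (i + 0.5) * h
        total += x ** k * mp_density(x, lam, sigma2)
    return total * h


def empirical_moments(m, n, sigma2=1.0, seed=0):
    """First two moments of mu_m: tr(Y)/m and tr(Y^2)/m with Y = X X^T / n.

    Entries of X are drawn as Gaussians here (an assumption; the theorem
    only needs i.i.d. entries with mean 0 and variance sigma2).
    """
    rng = random.Random(seed)
    sd = math.sqrt(sigma2)
    X = [[rng.gauss(0.0, sd) for _ in range(n)] for _ in range(m)]
    first = 0.0
    second = 0.0
    for i in range(m):
        for j in range(i, m):
            y_ij = sum(a * b for a, b in zip(X[i], X[j])) / n
            if i == j:
                first += y_ij
                second += y_ij * y_ij
            else:
                # Y is symmetric, so tr(Y^2) is the sum of all squared entries.
                second += 2.0 * y_ij * y_ij
    return first / m, second / m


if __name__ == "__main__":
    m, n = 60, 120  # ratio m/n = 0.5
    emp1, emp2 = empirical_moments(m, n)
    print("empirical moments:", emp1, emp2)
    print("limiting moments: ", mp_moment(1, m / n), mp_moment(2, m / n))
```

For moderate sizes the two pairs of numbers should already be close; the agreement is only approximate at finite m and n, and the moment comparison is a weaker check than comparing full histograms.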

The Marchenko–Pastur law also arises as the free Poisson law in free probability theory, having rate $1/\lambda$ and jump size $\lambda\sigma^2$.

Cumulative distribution function

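Using the same notation, the cumulative distribution function reads

$$F_\lambda(x) = \begin{cases} \frac{\lambda - 1}{\lambda} \mathbf{1}_{x \in [0, \lambda_-)} + \left( \frac{\lambda - 1}{2\lambda} + F(x) \right) \mathbf{1}_{x \in [\lambda_-, \lambda_+)} + \mathbf{1}_{x \in [\lambda_+, \infty)}, & \text{if } \lambda > 1 \\ F(x) \mathbf{1}_{x \in [\lambda_-, \lambda_+)} + \mathbf{1}_{x \in [\lambda_+, \infty)}, & \text{if } 0 \leq \lambda \leq 1, \end{cases}$$

where

$$F(x) = \frac{1}{2\pi\lambda} \left( \pi\lambda + \sigma^{-2} \sqrt{(\lambda_+ - x)(x - \lambda_-)} - (1 + \lambda) \arctan \frac{r(x)^2 - 1}{2 r(x)} + (1 - \lambda) \arctan \frac{\lambda_- r(x)^2 - \lambda_+}{2\sigma^2 (1 - \lambda) r(x)} \right)$$

and

$$r(x) = \sqrt{\frac{\lambda_+ - x}{x - \lambda_-}}.$$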
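
One way to sanity-check a closed form like this is to compare it with a direct numerical integral of the density. The sketch below does that under stated assumptions: the arctangent arguments are taken to be (r^2 - 1)/(2r) and (lambda_- r^2 - lambda_+)/(2 sigma^2 (1 - lambda) r), the helper names are hypothetical, and the integral uses a plain midpoint rule.

```python
import math


def mp_bounds(lam, sigma2=1.0):
    """Support edges lambda_- and lambda_+."""
    root = math.sqrt(lam)
    return sigma2 * (1.0 - root) ** 2, sigma2 * (1.0 + root) ** 2


def mp_density(x, lam, sigma2=1.0):
    """Density of the continuous part nu."""
    lo, hi = mp_bounds(lam, sigma2)
    if x <= lo or x >= hi:
        return 0.0
    return math.sqrt((hi - x) * (x - lo)) / (2.0 * math.pi * sigma2 * lam * x)


def mp_cdf(x, lam, sigma2=1.0):
    """Closed-form cumulative distribution function (assumed form, see lead-in)."""
    lo, hi = mp_bounds(lam, sigma2)
    atom = max(0.0, 1.0 - 1.0 / lam)  # mass at 0, nonzero only when lam > 1
    if x < 0.0:
        return 0.0
    if x <= lo:
        return atom
    if x >= hi:
        return 1.0
    r = math.sqrt((hi - x) / (x - lo))
    inner = (math.pi * lam
             + math.sqrt((hi - x) * (x - lo)) / sigma2
             - (1.0 + lam) * math.atan((r * r - 1.0) / (2.0 * r)))
    if lam != 1.0:
        # This term carries a factor (1 - lam), so it is skipped at lam = 1.
        inner += (1.0 - lam) * math.atan(
            (lo * r * r - hi) / (2.0 * sigma2 * (1.0 - lam) * r))
    f = inner / (2.0 * math.pi * lam)
    return f if lam <= 1.0 else (lam - 1.0) / (2.0 * lam) + f


def mp_cdf_numeric(x, lam, sigma2=1.0, steps=20000):
    """Atom at 0 plus a midpoint-rule integral of the density up to x."""
    lo, hi = mp_bounds(lam, sigma2)
    if x < 0.0:
        return 0.0
    atom = max(0.0, 1.0 - 1.0 / lam)
    upper = min(x, hi)
    if upper <= lo:
        return atom
    h = (upper - lo) / steps
    total = sum(mp_density(lo + (i + 0.5) * h, lam, sigma2) for i in range(steps))
    return atom + total * h


if __name__ == "__main__":
    for lam in (0.5, 2.0):
        for x in (0.5, 1.0, 2.0):
            print(lam, x, mp_cdf(x, lam), mp_cdf_numeric(x, lam))
```

The two columns printed for each pair should agree to several decimal places; the case lambda = 1 is left out of the loop because the density is unbounded near 0 there and the simple midpoint rule converges slowly.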

Some transforms of this law

The Cauchy transform (which is the negative of the Stieltjes transformation), when $\sigma^2 = 1$, is given by

$$G_\mu(z) = \frac{z + \lambda - 1 - \sqrt{(z - \lambda - 1)^2 - 4\lambda}}{2\lambda z}.$$

This gives an $R$-transform of

$$R_\mu(z) = \frac{1}{1 - \lambda z}.$$
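
The two transforms can be cross-checked numerically. The sketch below, for sigma^2 = 1, assumes the closed form G(z) = (z + lambda - 1 - sqrt((z - lambda - 1)^2 - 4 lambda)) / (2 lambda z) and R(z) = 1 / (1 - lambda z), compares G with a direct integral of 1/(z - x) against the measure, and tests the defining relation R(G(z)) + 1/G(z) = z. The function names are hypothetical, and the square-root branch is an explicit assumption: it is written as a product of two principal square roots so that it behaves like z for large |z|.

```python
import cmath
import math


def mp_bounds(lam):
    """Support edges for sigma^2 = 1."""
    root = math.sqrt(lam)
    return (1.0 - root) ** 2, (1.0 + root) ** 2


def cauchy_closed_form(z, lam):
    """Assumed closed form of the Cauchy transform for sigma^2 = 1."""
    lo, hi = mp_bounds(lam)
    # (z - lam - 1)^2 - 4*lam factors as (z - lo) * (z - hi); taking the
    # product of principal roots selects the branch with sqrt(...) ~ z.
    root = cmath.sqrt(z - lo) * cmath.sqrt(z - hi)
    return (z + lam - 1.0 - root) / (2.0 * lam * z)


def cauchy_numeric(z, lam, steps=20000):
    """Integral of 1/(z - x) against mu: midpoint rule plus the atom at 0."""
    lo, hi = mp_bounds(lam)
    h = (hi - lo) / steps
    total = 0j
    for i in range(steps):
        x = lo + (i + 0.5) * h
        dens = math.sqrt((hi - x) * (x - lo)) / (2.0 * math.pi * lam * x)
        total += dens / (z - x)
    atom = max(0.0, 1.0 - 1.0 / lam)  # mass at 0 when lam > 1
    return total * h + atom / z


def r_transform(w, lam):
    """Assumed R-transform for sigma^2 = 1."""
    return 1.0 / (1.0 - lam * w)


if __name__ == "__main__":
    z = complex(1.0, 1.0)  # any point off the real axis works
    for lam in (0.5, 2.0):
        g = cauchy_closed_form(z, lam)
        print(lam, g, cauchy_numeric(z, lam), r_transform(g, lam) + 1.0 / g)
```

The last printed value should reproduce z, since the R-transform is defined by inverting the Cauchy transform and subtracting 1/w.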

Application to correlation matrices

When applied to correlation matrices, $\sigma^2 = 1$ and $\lambda = m/n$, which leads to the bounds

$$\lambda_\pm = \left(1 \pm \sqrt{\frac{m}{n}}\right)^2.$$

Hence, it is often assumed that eigenvalues of correlation matrices lower than $\lambda_+$ are due to chance, and that the values higher than $\lambda_+$ are the significant common factors. For instance, a correlation matrix obtained from a year-long series (i.e. 252 trading days) of 10 stock returns would give $\lambda_+ = \left(1 + \sqrt{\frac{10}{252}}\right)^2 \approx 1.44$. Of the 10 eigenvalues of the correlation matrix, only the values higher than 1.44 would be considered significant.
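
The threshold rule can be written in a few lines. This is a minimal sketch with hypothetical function names; the eigenvalue list in the usage part is made up purely for illustration (it is not data from any real portfolio) and was chosen to sum to 10, as the eigenvalues of a 10 x 10 correlation matrix must.

```python
import math


def mp_correlation_bounds(m, n):
    """Bounds (1 -/+ sqrt(m/n))^2 for m series observed over n periods."""
    root = math.sqrt(m / n)
    return (1.0 - root) ** 2, (1.0 + root) ** 2


def significant_eigenvalues(eigenvalues, m, n):
    """Keep only eigenvalues strictly above the upper bound."""
    _, upper = mp_correlation_bounds(m, n)
    return [ev for ev in eigenvalues if ev > upper]


if __name__ == "__main__":
    lower, upper = mp_correlation_bounds(10, 252)
    print(round(lower, 3), round(upper, 3))  # upper is roughly 1.438

    # Made-up eigenvalues, for illustration only.
    hypothetical_eigs = [3.9, 1.6, 0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.35, 0.25]
    print(significant_eigenvalues(hypothetical_eigs, 10, 252))
```

With these illustrative values only the two largest eigenvalues clear the upper bound, so only they would be read as common factors under this rule of thumb.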
