In probability and statistics, a multivariate random variable or random vector is a list of mathematical variables each of whose value is unknown, either because the value has not yet occurred or because there is imperfect knowledge of its value. The individual variables in a random vector are grouped together because they are all part of a single mathematical system; often they represent different properties of an individual statistical unit. For example, while a given person has a specific age, height and weight, the representation of these features of ''an unspecified person'' from within a group would be a random vector. Normally each element of a random vector is a real number.
Random vectors are often used as the underlying implementation of various types of aggregate random variables, e.g. a random matrix, random tree, random sequence, stochastic process, etc.
More formally, a multivariate random variable is a column vector \mathbf{X} = (X_1,\ldots,X_n)^T (or its transpose, which is a row vector) whose components are scalar-valued random variables on the same probability space as each other, (\Omega, \mathcal{F}, P), where \Omega is the sample space, \mathcal{F} is the sigma-algebra (the collection of all events), and P is the probability measure (a function returning each event's probability).
Probability distribution
Every random vector gives rise to a probability measure on \mathbb{R}^n with the Borel algebra as the underlying sigma-algebra. This measure is also known as the joint probability distribution, the joint distribution, or the multivariate distribution of the random vector.
The distributions of each of the component random variables X_i are called marginal distributions. The conditional probability distribution of X_i given X_j is the probability distribution of X_i when X_j is known to be a particular value.
The cumulative distribution function F_{\mathbf{X}} \colon \mathbb{R}^n \to [0,1] of a random vector \mathbf{X}=(X_1,\ldots,X_n)^T is defined as
:F_{\mathbf{X}}(\mathbf{x}) = \operatorname{P}(X_1 \le x_1,\ldots,X_n \le x_n)
where \mathbf{x} = (x_1,\ldots,x_n)^T.
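As an illustration, the joint CDF can be estimated empirically from samples. The following Python sketch (using NumPy, with a two-component standard normal vector chosen only as an assumed example) approximates F_{\mathbf{X}} at the origin by Monte Carlo:

```python
import numpy as np

# Hypothetical example: estimate the joint CDF of a 2-dimensional random
# vector X = (X1, X2) by Monte Carlo, using independent standard normals.
rng = np.random.default_rng(0)
samples = rng.standard_normal((100_000, 2))  # each row is one draw of X

def empirical_cdf(samples, x):
    """Fraction of draws with every component <= the corresponding x_i."""
    return np.mean(np.all(samples <= x, axis=1))

# For independent components, F_X(x1, x2) = Phi(x1) * Phi(x2); at the
# origin this is 0.5 * 0.5 = 0.25.
value = empirical_cdf(samples, np.array([0.0, 0.0]))
print(value)  # close to 0.25
```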
Operations on random vectors
Random vectors can be subjected to the same kinds of
algebraic operations
Algebraic may refer to any subject related to algebra in mathematics and related branches like algebraic number theory and algebraic topology. The word algebra itself has several meanings.
Algebraic may also refer to:
* Algebraic data type, a dat ...
as can non-random vectors: addition, subtraction, multiplication by a
scalar, and the taking of
inner products
In mathematics, an inner product space (or, rarely, a Hausdorff pre-Hilbert space) is a real vector space or a complex vector space with an operation called an inner product. The inner product of two vectors in the space is a scalar, often d ...
.
Affine transformations
Similarly, a new random vector \mathbf{Y} can be defined by applying an affine transformation g\colon \mathbb{R}^n \to \mathbb{R}^n to a random vector \mathbf{X}:
:\mathbf{Y}=\mathcal{A}\mathbf{X}+b, where \mathcal{A} is an n \times n matrix and b is an n \times 1 column vector.
If \mathcal{A} is an invertible matrix and \mathbf{X} has a probability density function f_{\mathbf{X}}, then the probability density of \mathbf{Y} is
:f_{\mathbf{Y}}(y)=\frac{f_{\mathbf{X}}(\mathcal{A}^{-1}(y-b))}{|\det \mathcal{A}|}.
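The density formula can be sanity-checked numerically. In the following Python sketch, \mathbf{X} is taken to be a standard normal vector (an assumption made only for illustration), so that \mathbf{Y} = \mathcal{A}\mathbf{X} + b has a known normal density to compare against:

```python
import numpy as np

# Check the change-of-variables formula f_Y(y) = f_X(A^{-1}(y - b)) / |det A|
# against the closed form: if X is standard normal, Y = A X + b is normal
# with mean b and covariance A A^T.  (A, b, y are arbitrary choices.)
A = np.array([[2.0, 1.0], [0.0, 1.0]])
b = np.array([1.0, -1.0])
y = np.array([0.5, 0.3])

def std_normal_pdf(x):
    # density of an n-dimensional standard normal vector
    n = x.size
    return np.exp(-0.5 * x @ x) / (2 * np.pi) ** (n / 2)

# left-hand side: the change-of-variables formula
x = np.linalg.solve(A, y - b)
lhs = std_normal_pdf(x) / abs(np.linalg.det(A))

# right-hand side: N(b, A A^T) density evaluated directly
S = A @ A.T
d = y - b
rhs = np.exp(-0.5 * d @ np.linalg.solve(S, d)) / np.sqrt(
    (2 * np.pi) ** 2 * np.linalg.det(S))

print(lhs, rhs)  # the two agree
```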
Invertible mappings
More generally we can study invertible mappings of random vectors.
Let g be a one-to-one mapping from an open subset \mathcal{D} of \mathbb{R}^n onto a subset \mathcal{R} of \mathbb{R}^n, let g have continuous partial derivatives in \mathcal{D} and let the Jacobian determinant of g be zero at no point of \mathcal{D}. Assume that the real random vector \mathbf{X} has a probability density function f_{\mathbf{X}}(\mathbf{x}) and satisfies P(\mathbf{X} \in \mathcal{D}) = 1. Then the random vector \mathbf{Y}=g(\mathbf{X}) is of probability density
:f_{\mathbf{Y}}(\mathbf{y})=\left. \frac{f_{\mathbf{X}}(\mathbf{x})}{\left|\det\frac{\partial g(\mathbf{x})}{\partial \mathbf{x}}\right|}\right|_{\mathbf{x}=g^{-1}(\mathbf{y})} \mathbf{1}(\mathbf{y} \in R_{\mathbf{Y}})
where \mathbf{1} denotes the indicator function and the set R_{\mathbf{Y}} = \{\mathbf{y}=g(\mathbf{x}): f_{\mathbf{X}}(\mathbf{x})>0\} \subseteq \mathcal{R} denotes the support of \mathbf{Y}.
Expected value
The expected value or mean of a random vector \mathbf{X} is a fixed vector \operatorname{E}[\mathbf{X}] whose elements are the expected values of the respective random variables.
Covariance and cross-covariance
Definitions
The covariance matrix (also called second central moment or variance-covariance matrix) of an n \times 1 random vector is an n \times n matrix whose (''i,j'')th element is the covariance between the ''i''th and the ''j''th random variables. The covariance matrix is the expected value, element by element, of the n \times n matrix computed as [\mathbf{X}-\operatorname{E}[\mathbf{X}]][\mathbf{X}-\operatorname{E}[\mathbf{X}]]^T, where the superscript T refers to the transpose of the indicated vector:
:\operatorname{K}_{\mathbf{X}\mathbf{X}} = \operatorname{Var}[\mathbf{X}] = \operatorname{E}[(\mathbf{X}-\operatorname{E}[\mathbf{X}])(\mathbf{X}-\operatorname{E}[\mathbf{X}])^T]
By extension, the cross-covariance matrix between two random vectors \mathbf{X} and \mathbf{Y} (\mathbf{X} having n elements and \mathbf{Y} having p elements) is the n \times p matrix
:\operatorname{K}_{\mathbf{X}\mathbf{Y}} = \operatorname{Cov}[\mathbf{X},\mathbf{Y}] = \operatorname{E}[(\mathbf{X}-\operatorname{E}[\mathbf{X}])(\mathbf{Y}-\operatorname{E}[\mathbf{Y}])^T]
where again the matrix expectation is taken element-by-element in the matrix. Here the (''i,j'')th element is the covariance between the ''i''th element of \mathbf{X} and the ''j''th element of \mathbf{Y}.
Properties
The covariance matrix is a symmetric matrix, i.e.
:\operatorname{K}_{\mathbf{X}\mathbf{X}}^T = \operatorname{K}_{\mathbf{X}\mathbf{X}}.
The covariance matrix is a positive semidefinite matrix, i.e.
:\mathbf{a}^T \operatorname{K}_{\mathbf{X}\mathbf{X}} \mathbf{a} \ge 0 \quad \text{for all } \mathbf{a} \in \mathbb{R}^n.
The cross-covariance matrix \operatorname{Cov}[\mathbf{Y},\mathbf{X}] is simply the transpose of the matrix \operatorname{Cov}[\mathbf{X},\mathbf{Y}], i.e.
:\operatorname{K}_{\mathbf{Y}\mathbf{X}} = \operatorname{K}_{\mathbf{X}\mathbf{Y}}^T.
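Both properties can be observed directly on a sample covariance matrix. The following Python sketch uses arbitrarily chosen simulated data, so the numbers themselves carry no significance:

```python
import numpy as np

# Illustrative check: the sample covariance matrix of draws from any
# random vector is symmetric and positive semidefinite.
rng = np.random.default_rng(1)
X = rng.normal(size=(5000, 3)) @ np.array([[1.0, 0.5, 0.0],
                                           [0.0, 1.0, 0.3],
                                           [0.0, 0.0, 1.0]])
K = np.cov(X, rowvar=False)          # 3 x 3 covariance matrix estimate

symmetric = np.allclose(K, K.T)
eigenvalues = np.linalg.eigvalsh(K)  # all >= 0 for a PSD matrix
print(symmetric, eigenvalues.min() >= -1e-12)
```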
Uncorrelatedness
Two random vectors \mathbf{X}=(X_1,...,X_m)^T and \mathbf{Y}=(Y_1,...,Y_n)^T are called uncorrelated if
:\operatorname{E}[\mathbf{X} \mathbf{Y}^T] = \operatorname{E}[\mathbf{X}]\operatorname{E}[\mathbf{Y}]^T.
They are uncorrelated if and only if their cross-covariance matrix \operatorname{K}_{\mathbf{X}\mathbf{Y}} is zero.
Correlation and cross-correlation
Definitions
The correlation matrix (also called second moment) of an n \times 1 random vector is an n \times n matrix whose (''i,j'')th element is the correlation between the ''i''th and the ''j''th random variables. The correlation matrix is the expected value, element by element, of the n \times n matrix computed as \mathbf{X} \mathbf{X}^T, where the superscript T refers to the transpose of the indicated vector:
:\operatorname{R}_{\mathbf{X}\mathbf{X}} = \operatorname{E}[\mathbf{X}\mathbf{X}^T]
By extension, the cross-correlation matrix between two random vectors \mathbf{X} and \mathbf{Y} (\mathbf{X} having n elements and \mathbf{Y} having p elements) is the n \times p matrix
:\operatorname{R}_{\mathbf{X}\mathbf{Y}} = \operatorname{E}[\mathbf{X}\mathbf{Y}^T].
Properties
The correlation matrix is related to the covariance matrix by
:\operatorname{R}_{\mathbf{X}\mathbf{X}} = \operatorname{K}_{\mathbf{X}\mathbf{X}} + \operatorname{E}[\mathbf{X}]\operatorname{E}[\mathbf{X}]^T.
Similarly for the cross-correlation matrix and the cross-covariance matrix:
:\operatorname{R}_{\mathbf{X}\mathbf{Y}} = \operatorname{K}_{\mathbf{X}\mathbf{Y}} + \operatorname{E}[\mathbf{X}]\operatorname{E}[\mathbf{Y}]^T.
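A minimal numerical check of the first relation, using a toy distribution chosen only for illustration:

```python
import numpy as np

# Numerical check of R_XX = K_XX + E[X] E[X]^T on simulated data.
rng = np.random.default_rng(2)
X = rng.normal(loc=[1.0, -2.0], scale=[1.0, 3.0], size=(100_000, 2))

mu = X.mean(axis=0)                                 # estimate of E[X]
R = (X[:, :, None] * X[:, None, :]).mean(axis=0)    # E[X X^T], element-wise
K = R - np.outer(mu, mu)                            # implied covariance matrix

# compare against numpy's own covariance estimate
ok = np.allclose(K, np.cov(X, rowvar=False), atol=1e-2)
print(ok)
```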
Orthogonality
Two random vectors of the same size \mathbf{X}=(X_1,...,X_n)^T and \mathbf{Y}=(Y_1,...,Y_n)^T are called orthogonal if
:\operatorname{E}[\mathbf{X}^T \mathbf{Y}] = 0.
Independence
Two random vectors \mathbf{X} and \mathbf{Y} are called independent if for all \mathbf{x} and \mathbf{y}
:F_{\mathbf{X},\mathbf{Y}}(\mathbf{x},\mathbf{y}) = F_{\mathbf{X}}(\mathbf{x}) \cdot F_{\mathbf{Y}}(\mathbf{y})
where F_{\mathbf{X}}(\mathbf{x}) and F_{\mathbf{Y}}(\mathbf{y}) denote the cumulative distribution functions of \mathbf{X} and \mathbf{Y} and F_{\mathbf{X},\mathbf{Y}}(\mathbf{x},\mathbf{y}) denotes their joint cumulative distribution function. Independence of \mathbf{X} and \mathbf{Y} is often denoted by \mathbf{X} \perp\!\!\!\perp \mathbf{Y}.
Written component-wise, \mathbf{X} and \mathbf{Y} are called independent if for all x_1,\ldots,x_m,y_1,\ldots,y_n
:F_{X_1,\ldots,X_m,Y_1,\ldots,Y_n}(x_1,\ldots,x_m,y_1,\ldots,y_n) = F_{X_1,\ldots,X_m}(x_1,\ldots,x_m) \cdot F_{Y_1,\ldots,Y_n}(y_1,\ldots,y_n).
Characteristic function
The characteristic function of a random vector \mathbf{X} with n components is a function \mathbb{R}^n \to \mathbb{C} that maps every vector \mathbf{\omega} = (\omega_1,\ldots,\omega_n)^T to a complex number. It is defined by
:\varphi_{\mathbf{X}}(\mathbf{\omega}) = \operatorname{E}\left[e^{i(\mathbf{\omega}^T \mathbf{X})}\right] = \operatorname{E}\left[e^{i(\omega_1 X_1 + \ldots + \omega_n X_n)}\right].
Further properties
Expectation of a quadratic form
One can take the expectation of a quadratic form in the random vector \mathbf{X} as follows:
:\operatorname{E}[\mathbf{X}^T A\mathbf{X}] = \operatorname{E}[\mathbf{X}]^T A\operatorname{E}[\mathbf{X}] + \operatorname{tr}(A K_{\mathbf{X}\mathbf{X}}),
where K_{\mathbf{X}\mathbf{X}} is the covariance matrix of \mathbf{X} and \operatorname{tr} refers to the trace of a matrix, that is, to the sum of the elements on its main diagonal (from upper left to lower right). Since the quadratic form is a scalar, so is its expectation.
Proof: Let \mathbf{z} be an m \times 1 random vector with \operatorname{E}[\mathbf{z}] = \mu and \operatorname{Cov}[\mathbf{z}] = V and let A be an m \times m non-stochastic matrix.
Then based on the formula for the covariance, if we denote \mathbf{z}^T = \mathbf{X} and \mathbf{z}^T A^T = \mathbf{Y}, we see that:
:\operatorname{Cov}[\mathbf{X},\mathbf{Y}] = \operatorname{E}[\mathbf{X}\mathbf{Y}^T] - \operatorname{E}[\mathbf{X}]\operatorname{E}[\mathbf{Y}]^T
Hence
:\begin{align}
\operatorname{E}[XY^T] &= \operatorname{Cov}[X,Y] + \operatorname{E}[X]\operatorname{E}[Y]^T \\
\operatorname{E}[z^T Az] &= \operatorname{Cov}[z^T, z^T A^T] + \operatorname{E}[z^T]\operatorname{E}[z^T A^T]^T \\
&= \operatorname{Cov}[z^T, z^T A^T] + \mu^T (\mu^T A^T)^T \\
&= \operatorname{Cov}[z^T, z^T A^T] + \mu^T A \mu,
\end{align}
which leaves us to show that
:\operatorname{Cov}[z^T, z^T A^T] = \operatorname{tr}(AV).
This is true based on the fact that one can cyclically permute matrices when taking a trace without changing the end result (e.g.: \operatorname{tr}(AB) = \operatorname{tr}(BA)).
We see that
:\begin{align}
\operatorname{Cov}[z^T, z^T A^T] &= \operatorname{E}\left[\left(z^T - \operatorname{E}(z^T)\right)\left(z^T A^T - \operatorname{E}\left(z^T A^T\right)\right)^T\right] \\
&= \operatorname{E}\left[(z^T - \mu^T)(z^T A^T - \mu^T A^T)^T\right] \\
&= \operatorname{E}\left[(z - \mu)^T (Az - A\mu)\right].
\end{align}
And since
:\left(z - \mu\right)^T \left(Az - A\mu\right)
is a scalar, then
:(z - \mu)^T (Az - A\mu) = \operatorname{tr}\left((z - \mu)^T (Az - A\mu)\right) = \operatorname{tr}\left((z - \mu)^T A(z - \mu)\right)
trivially. Using the permutation we get:
:\operatorname{tr}\left((z - \mu)^T A(z - \mu)\right) = \operatorname{tr}\left(A(z - \mu)(z - \mu)^T\right),
and by plugging this into the original formula we get:
:\begin{align}
\operatorname{Cov}\left[z^T, z^T A^T\right] &= \operatorname{E}\left[(z - \mu)^T A(z - \mu)\right] \\
&= \operatorname{E}\left[\operatorname{tr}\left(A(z - \mu)(z - \mu)^T\right)\right] \\
&= \operatorname{tr}\left(A \operatorname{E}\left[(z - \mu)(z - \mu)^T\right]\right) \\
&= \operatorname{tr}(AV).
\end{align}
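The identity \operatorname{E}[\mathbf{X}^T A\mathbf{X}] = \operatorname{E}[\mathbf{X}]^T A\operatorname{E}[\mathbf{X}] + \operatorname{tr}(A K_{\mathbf{X}\mathbf{X}}) can also be checked by Monte Carlo. In the Python sketch below, the mean, covariance, and matrix A are arbitrary illustrative choices:

```python
import numpy as np

# Monte Carlo check of E[X^T A X] = E[X]^T A E[X] + tr(A K_XX).
rng = np.random.default_rng(3)
mu = np.array([1.0, 2.0])
K = np.array([[2.0, 0.6], [0.6, 1.0]])
A = np.array([[1.0, 0.2], [0.4, 3.0]])

X = rng.multivariate_normal(mu, K, size=200_000)
# compute x^T A x for every draw, then average
empirical = np.mean(np.einsum('ni,ij,nj->n', X, A, X))
theoretical = mu @ A @ mu + np.trace(A @ K)
print(empirical, theoretical)  # close
```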
Expectation of the product of two different quadratic forms
One can take the expectation of the product of two different quadratic forms in a zero-mean Gaussian random vector \mathbf{X} as follows:
:\operatorname{E}\left[(\mathbf{X}^T A\mathbf{X})(\mathbf{X}^T B\mathbf{X})\right] = 2\operatorname{tr}(A K_{\mathbf{X}\mathbf{X}} B K_{\mathbf{X}\mathbf{X}}) + \operatorname{tr}(A K_{\mathbf{X}\mathbf{X}})\operatorname{tr}(B K_{\mathbf{X}\mathbf{X}})
where again K_{\mathbf{X}\mathbf{X}} is the covariance matrix of \mathbf{X}. Again, since both quadratic forms are scalars and hence their product is a scalar, the expectation of their product is also a scalar.
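A Monte Carlo sketch of this Gaussian identity, with arbitrarily chosen symmetric matrices A and B and covariance matrix K:

```python
import numpy as np

# Check E[(X^T A X)(X^T B X)] = 2 tr(A K B K) + tr(A K) tr(B K)
# for a zero-mean Gaussian X with covariance K (toy values).
rng = np.random.default_rng(4)
K = np.array([[1.5, 0.4], [0.4, 1.0]])
A = np.array([[1.0, 0.3], [0.3, 2.0]])    # symmetric
B = np.array([[0.5, -0.2], [-0.2, 1.0]])  # symmetric

X = rng.multivariate_normal(np.zeros(2), K, size=500_000)
qa = np.einsum('ni,ij,nj->n', X, A, X)    # X^T A X per draw
qb = np.einsum('ni,ij,nj->n', X, B, X)    # X^T B X per draw
empirical = np.mean(qa * qb)
theoretical = 2 * np.trace(A @ K @ B @ K) + np.trace(A @ K) * np.trace(B @ K)
print(empirical, theoretical)
```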
Applications
Portfolio theory
In portfolio theory in finance, an objective often is to choose a portfolio of risky assets such that the distribution of the random portfolio return has desirable properties. For example, one might want to choose the portfolio return having the lowest variance for a given expected value. Here the random vector is the vector \mathbf{r} of random returns on the individual assets, and the portfolio return ''p'' (a random scalar) is the inner product of the vector of random returns with a vector ''w'' of portfolio weights (the fractions of the portfolio placed in the respective assets). Since ''p'' = ''w''^T\mathbf{r}, the expected value of the portfolio return is ''w''^T\operatorname{E}[\mathbf{r}] and the variance of the portfolio return can be shown to be ''w''^TC''w'', where C is the covariance matrix of \mathbf{r}.
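A minimal numerical illustration with made-up asset data (the return means, covariances, and weights below are invented for the example):

```python
import numpy as np

# Toy portfolio: p = w^T r has mean w^T E[r] and variance w^T C w.
mu = np.array([0.05, 0.08, 0.12])          # assumed expected asset returns
C = np.array([[0.010, 0.002, 0.001],
              [0.002, 0.020, 0.004],
              [0.001, 0.004, 0.040]])      # assumed covariance of returns
w = np.array([0.5, 0.3, 0.2])              # portfolio weights, sum to 1

expected_return = w @ mu                   # w^T E[r]
variance = w @ C @ w                       # w^T C w
print(expected_return, variance)
```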
Regression theory
In linear regression theory, we have data on ''n'' observations on a dependent variable ''y'' and ''n'' observations on each of ''k'' independent variables ''x_j''. The observations on the dependent variable are stacked into a column vector ''y''; the observations on each independent variable are also stacked into column vectors, and these latter column vectors are combined into a design matrix ''X'' (not denoting a random vector in this context) of observations on the independent variables. Then the following regression equation is postulated as a description of the process that generated the data:
:y = X \beta + e,
where β is a postulated fixed but unknown vector of ''k'' response coefficients, and ''e'' is an unknown random vector reflecting random influences on the dependent variable. By some chosen technique such as ordinary least squares, a vector \hat \beta is chosen as an estimate of β, and the estimate of the vector ''e'', denoted \hat e, is computed as
:\hat e = y - X \hat \beta.
Then the statistician must analyze the properties of \hat \beta and \hat e, which are viewed as random vectors since a randomly different selection of ''n'' cases to observe would have resulted in different values for them.
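The estimation step can be sketched on synthetic data (the coefficient values and noise level below are arbitrary choices):

```python
import numpy as np

# Generate y = X beta + e, then compute the OLS estimate
# beta_hat = (X^T X)^{-1} X^T y and the residual vector e_hat.
rng = np.random.default_rng(5)
n, k = 200, 3
X = rng.normal(size=(n, k))                    # design matrix
beta = np.array([2.0, -1.0, 0.5])              # true coefficients (assumed)
e = 0.1 * rng.normal(size=n)                   # random influences on y
y = X @ beta + e

beta_hat = np.linalg.solve(X.T @ X, X.T @ y)   # OLS estimate of beta
e_hat = y - X @ beta_hat                       # estimated error vector

# OLS residuals are orthogonal to the columns of X by construction
print(beta_hat, np.abs(X.T @ e_hat).max())
```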
Vector time series
The evolution of a ''k''×1 random vector \mathbf{X} through time can be modelled as a vector autoregression (VAR) as follows:
:\mathbf{X}_t = c + A_1 \mathbf{X}_{t-1} + A_2 \mathbf{X}_{t-2} + \cdots + A_p \mathbf{X}_{t-p} + \mathbf{e}_t, \,
where the ''i''-periods-back vector observation \mathbf{X}_{t-i} is called the ''i''-th lag of \mathbf{X}, ''c'' is a ''k'' × 1 vector of constants (intercepts), ''A_i'' is a time-invariant ''k'' × ''k'' matrix and \mathbf{e}_t is a ''k'' × 1 random vector of error terms.
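A minimal simulation of a VAR(1) process with made-up coefficients, checking the implied long-run mean, which solves (I - A_1)\mu = c for a stable process:

```python
import numpy as np

# Simulate x_t = c + A1 x_{t-1} + e_t with invented, stable coefficients
# (the eigenvalues of A1 lie inside the unit circle).
rng = np.random.default_rng(6)
k, T = 2, 10_000
c = np.array([1.0, 0.5])
A1 = np.array([[0.5, 0.1], [0.0, 0.3]])

x = np.zeros(k)
history = np.empty((T, k))
for t in range(T):
    x = c + A1 @ x + 0.1 * rng.normal(size=k)   # one step of the VAR(1)
    history[t] = x

# the long-run mean solves (I - A1) mu = c
mu_theory = np.linalg.solve(np.eye(k) - A1, c)
mu_empirical = history[T // 10:].mean(axis=0)   # drop burn-in samples
print(mu_theory, mu_empirical)
```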
References
Further reading
*{{cite book |first=Henry |last=Stark |first2=John W. |last2=Woods |title=Probability, Statistics, and Random Processes for Engineers |publisher=Pearson |edition=Fourth |year=2012 |chapter=Random Vectors |pages=295–339 |isbn=978-0-13-231123-6}}