In mathematics, the square root of a matrix extends the notion of square root from numbers to matrices. A matrix ''B'' is said to be a square root of ''A'' if the matrix product ''BB'' is equal to ''A''.
Some authors use the name ''square root'' or the notation ''A''^{1/2} only for the specific case when ''A'' is positive semidefinite, to denote the unique matrix ''B'' that is positive semidefinite and such that ''BB'' = ''B''^{T}''B'' = ''A'' (for real-valued matrices, where ''B''^{T} is the transpose of ''B'').
Less frequently, the name ''square root'' may be used for any factorization of a positive semidefinite matrix ''A'' as ''B''^{T}''B'' = ''A'', as in the Cholesky factorization, even if ''BB'' ≠ ''A''. This distinct meaning is discussed below under ''Square roots of positive operators''.
Examples
In general, a matrix can have several square roots. In particular, if ''A'' = ''B''^2 then ''A'' = (−''B'')^2 as well.
For example, the 2×2 identity matrix has infinitely many square roots. They are given by
: \pm I \quad\text{and}\quad \begin{pmatrix} a & b \\ c & -a \end{pmatrix},
where (''a'', ''b'', ''c'') are any numbers (real or complex) such that ''a''^2 + ''bc'' = 1.
In particular if (''a'', ''b'', ''c'') is any Pythagorean triple, that is, any set of positive integers such that ''a''^2 + ''b''^2 = ''c''^2, then
: \frac{1}{c} \begin{pmatrix} a & b \\ b & -a \end{pmatrix}
is one of the matrix square roots of the identity matrix, which happens to be symmetric and has rational entries. Thus
: \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} \tfrac35 & \tfrac45 \\ \tfrac45 & -\tfrac35 \end{pmatrix}^2.
Minus identity also has a square root, for example:
: -I = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}^2,
which can be used to represent the imaginary unit ''i'' and hence all complex numbers using 2×2 real matrices; see matrix representation of complex numbers.
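These examples are easy to verify numerically. A minimal sketch, assuming NumPy is available (the variable names are chosen here for illustration):

```python
import numpy as np

# Square root of I from the Pythagorean triple (3, 4, 5): symmetric, rational.
R = np.array([[3, 4], [4, -3]]) / 5
print(np.allclose(R @ R, np.eye(2)))   # True

# A real square root of -I: rotation by 90 degrees, playing the role of i.
J = np.array([[0.0, -1.0], [1.0, 0.0]])
print(np.allclose(J @ J, -np.eye(2)))  # True
```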
Just as with the real numbers, a real matrix may fail to have a real square root, but have a square root with complex-valued entries.
Some matrices have no square root. An example is the matrix
: \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}.
Notice that some ideas from number theory do not carry over to matrices: the square root of a nonnegative integer must either be another integer or an irrational number, excluding non-integer rationals. By contrast, a matrix of integers can have a square root whose entries are all non-integer rational numbers, as demonstrated in some of the above examples.
Positive semidefinite matrices
A symmetric real ''n'' × ''n'' matrix ''A'' is called ''positive semidefinite'' if ''x''^{T}''Ax'' ≥ 0 for all column vectors ''x'' (here ''x''^{T} denotes the transpose, changing a column vector into a row vector).
A square real matrix ''A'' is positive semidefinite if and only if ''A'' = ''B''^{T}''B'' for some matrix ''B''. There can be many different such matrices ''B''. A positive semidefinite matrix ''A'' can also have many matrices ''B'' such that ''A'' = ''BB''. However, ''A'' always has precisely one square root ''B'' that is ''both'' positive semidefinite and symmetric. In particular, since ''B'' is required to be symmetric, ''B'' = ''B''^{T}, so the two conditions ''A'' = ''BB'' or ''A'' = ''B''^{T}''B'' are equivalent.
For complex-valued matrices, the conjugate transpose ''B''^{*} is used instead, and positive semidefinite matrices are Hermitian, meaning ''B''^{*} = ''B''. This unique matrix is called the principal, non-negative, or positive square root (the latter in the case of positive definite matrices).
The principal square root of a real positive semidefinite matrix is real. The principal square root of a positive definite matrix is positive definite; more generally, the rank of the principal square root of ''A'' is the same as the rank of ''A''. The operation of taking the principal square root is continuous on this set of matrices. These properties are consequences of the holomorphic functional calculus applied to matrices.
The existence and uniqueness of the principal square root can be deduced directly from the Jordan normal form (see below).
Matrices with distinct eigenvalues
An ''n'' × ''n'' matrix with ''n'' ''distinct nonzero eigenvalues'' has 2^{''n''} square roots. Such a matrix, ''A'', has an eigendecomposition ''A'' = ''VDV''^{−1}, where ''V'' is the matrix whose columns are eigenvectors of ''A'' and ''D'' is the diagonal matrix whose diagonal elements are the corresponding ''n'' eigenvalues ''λ_i''. Thus the square roots of ''A'' are given by ''VD''^{1/2}''V''^{−1}, where ''D''^{1/2} is any square root matrix of ''D'', which, for distinct eigenvalues, must be diagonal with diagonal elements equal to square roots of the diagonal elements of ''D''; since there are two possible choices for a square root of each diagonal element of ''D'', there are 2^{''n''} choices for the matrix ''D''^{1/2}.
This also leads to a proof of the above observation that a positive-definite matrix has precisely one positive-definite square root: a positive definite matrix has only positive eigenvalues, and each of these eigenvalues has only one positive square root; since the eigenvalues of the square root matrix are the diagonal elements of ''D''^{1/2}, for the square root matrix to be itself positive definite it must consist of only the unique positive square roots of the original eigenvalues.
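As a concrete illustration, the following sketch (assuming NumPy; `all_square_roots` is a name chosen here, not a library function) enumerates all 2^{''n''} square roots of a small matrix with distinct nonzero eigenvalues by flipping the sign of each eigenvalue's square root:

```python
import itertools
import numpy as np

def all_square_roots(A):
    """Enumerate the 2**n square roots of A = V D V^{-1}, assuming the n
    eigenvalues of A are distinct and nonzero."""
    eigvals, V = np.linalg.eig(A)
    Vinv = np.linalg.inv(V)
    roots = []
    for signs in itertools.product([1, -1], repeat=len(eigvals)):
        D_half = np.diag(np.array(signs) * np.sqrt(eigvals.astype(complex)))
        roots.append(V @ D_half @ Vinv)
    return roots

A = np.array([[1.0, 1.0], [0.0, 4.0]])   # eigenvalues 1 and 4: 2**2 = 4 roots
for R in all_square_roots(A):
    assert np.allclose(R @ R, A)
```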
Solutions in closed form
If a matrix ''A'' is idempotent, meaning ''A''^2 = ''A'', then by definition one of its square roots is the matrix itself.
Diagonal and triangular matrices
If ''D'' is a diagonal ''n'' × ''n'' matrix ''D'' = diag(''λ''_1, ..., ''λ_n''), then some of its square roots are the diagonal matrices diag(''μ''_1, ..., ''μ_n''), where ''μ_i'' = ±√''λ_i''.
If the diagonal elements of ''D'' are real and non-negative then it is positive semidefinite, and if the square roots are taken with the (+) sign (i.e. all non-negative), the resulting matrix is the principal root of ''D''.
A diagonal matrix may have additional non-diagonal roots if some entries on the diagonal are equal, as exemplified by the identity matrix above.
If ''U'' is an upper triangular matrix (meaning its entries are ''u''_{ij} = 0 for ''i'' > ''j'') and at most one of its diagonal entries is zero, then one upper triangular solution ''B'' of the equation ''B''^2 = ''U'' can be found as follows. Since the equation ''b''_{ii}^2 = ''u''_{ii} should be satisfied, let ''b''_{ii} be the principal square root of the complex number ''u''_{ii}. By the assumption that at most one diagonal entry is zero, this guarantees that ''b''_{ii} + ''b''_{jj} ≠ 0 for all ''i'', ''j'' (because the principal square roots of complex numbers all lie on one half of the complex plane). From the equation
: u_{ij} = \sum_{k=i}^{j} b_{ik} b_{kj}
we deduce that ''b''_{ij} can be computed recursively for ''j'' − ''i'' increasing from 1 to ''n'' − 1 as:
: b_{ij} = \frac{1}{b_{ii} + b_{jj}} \left( u_{ij} - \sum_{k=i+1}^{j-1} b_{ik} b_{kj} \right).
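A direct transcription of this recursion, as a sketch assuming NumPy (`sqrtm_upper_triangular` is an illustrative name), fills in ''B'' one superdiagonal at a time:

```python
import numpy as np

def sqrtm_upper_triangular(U):
    """Upper triangular square root of an upper triangular U with at most
    one zero diagonal entry, via the recursion above."""
    n = U.shape[0]
    B = np.zeros((n, n), dtype=complex)
    for i in range(n):
        B[i, i] = np.sqrt(complex(U[i, i]))   # principal square roots
    for d in range(1, n):                     # d = j - i runs from 1 to n-1
        for i in range(n - d):
            j = i + d
            s = U[i, j] - B[i, i + 1:j] @ B[i + 1:j, j]
            B[i, j] = s / (B[i, i] + B[j, j])
    return B

U = np.array([[4.0, 1.0], [0.0, 9.0]])
B = sqrtm_upper_triangular(U)
assert np.allclose(B @ B, U)
```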
If ''U'' is upper triangular but has multiple zeroes on the diagonal, then a square root might not exist, as exemplified by
: \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}.
Note the diagonal entries of a triangular matrix are precisely its eigenvalues (see Triangular matrix § Properties).
By diagonalization
An ''n'' × ''n'' matrix ''A'' is diagonalizable if there is a matrix ''V'' and a diagonal matrix ''D'' such that ''A'' = ''VDV''^{−1}. This happens if and only if ''A'' has ''n'' eigenvectors which constitute a basis for '''C'''^{''n''}. In this case, ''V'' can be chosen to be the matrix with the ''n'' eigenvectors as columns, and thus a square root of ''A'' is
: R = V D^{1/2} V^{-1},
where ''D''^{1/2} is any square root of ''D''. Indeed,
: \left( V D^{1/2} V^{-1} \right)^2 = V D^{1/2} \left( V^{-1} V \right) D^{1/2} V^{-1} = V D V^{-1} = A.
For example, the matrix
: A = \begin{pmatrix} 33 & 24 \\ 48 & 57 \end{pmatrix}
can be diagonalized as ''VDV''^{−1}, where
: V = \begin{pmatrix} 1 & 1 \\ 2 & -1 \end{pmatrix} \quad\text{and}\quad D = \begin{pmatrix} 81 & 0 \\ 0 & 9 \end{pmatrix}.
''D'' has principal square root
: D^{1/2} = \begin{pmatrix} 9 & 0 \\ 0 & 3 \end{pmatrix},
giving the square root
: A^{1/2} = V D^{1/2} V^{-1} = \begin{pmatrix} 5 & 2 \\ 4 & 7 \end{pmatrix}.
When ''A'' is symmetric, the diagonalizing matrix ''V'' can be made an orthogonal matrix by suitably choosing the eigenvectors (see spectral theorem). Then the inverse of ''V'' is simply the transpose, so that
: A^{1/2} = V D^{1/2} V^{\mathsf{T}}.
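A sketch of this construction in NumPy (the function name is illustrative; it assumes ''A'' is diagonalizable with eigenvalues off the closed negative real axis, and reproduces the worked example above):

```python
import numpy as np

def sqrtm_diagonalizable(A):
    """Principal square root via the eigendecomposition A = V D V^{-1}."""
    eigvals, V = np.linalg.eig(A)
    D_half = np.diag(np.sqrt(eigvals.astype(complex)))   # principal roots
    return V @ D_half @ np.linalg.inv(V)

A = np.array([[33.0, 24.0], [48.0, 57.0]])
R = sqrtm_diagonalizable(A)
print(np.round(R.real, 6))    # [[5. 2.] [4. 7.]]
assert np.allclose(R @ R, A)
```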
By Schur decomposition
Every complex-valued square matrix ''A'', regardless of diagonalizability, has a Schur decomposition given by ''A'' = ''QUQ''^{*}, where ''U'' is upper triangular and ''Q'' is unitary (meaning ''Q''^{*}''Q'' = ''I''). The eigenvalues of ''A'' are exactly the diagonal entries of ''U''; if at most one of them is zero, then the following is a square root
: A^{1/2} = Q U^{1/2} Q^*,
where a square root ''U''^{1/2} of the upper triangular matrix ''U'' can be found as described above.
If ''A'' is positive definite, then the eigenvalues are all positive reals, so the chosen diagonal of ''U''^{1/2} also consists of positive reals. Hence the eigenvalues of ''A''^{1/2} are positive reals, which means the resulting matrix is the principal root of ''A''.
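This is essentially how library routines compute matrix square roots. A minimal sketch with SciPy (whose `scipy.linalg.sqrtm` itself uses a Schur-based method, and is also used here to take the root of the triangular factor):

```python
import numpy as np
from scipy.linalg import schur, sqrtm

A = np.array([[4.0, 1.0, 0.0],
              [0.5, 5.0, 1.0],
              [0.0, 0.5, 6.0]])
U, Q = schur(A, output='complex')   # A = Q U Q*, U upper triangular, Q unitary
B = Q @ sqrtm(U) @ Q.conj().T       # square root of the triangular factor
assert np.allclose(B @ B, A)
assert np.allclose(B, sqrtm(A))     # agrees with the principal root directly
```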
By Jordan decomposition
Similarly as for the Schur decomposition, every square matrix ''A'' can be decomposed as ''A'' = ''PJP''^{−1}, where ''P'' is invertible and ''J'' is in Jordan normal form.
To see that any complex matrix with positive eigenvalues has a square root of the same form, it suffices to check this for a Jordan block. Any such block has the form λ(''I'' + ''N'') with λ > 0 and ''N'' nilpotent. If (1 + ''z'')^{1/2} = 1 + ''a''_1 ''z'' + ''a''_2 ''z''^2 + ⋯ is the binomial expansion for the square root (valid in |''z''| < 1), then as a formal power series its square equals 1 + ''z''. Substituting ''N'' for ''z'', only finitely many terms will be non-zero and
: S = \sqrt{\lambda} \left( I + a_1 N + a_2 N^2 + \cdots \right)
gives a square root of the Jordan block with eigenvalue √λ.
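Because ''N'' is nilpotent, the series terminates, so the square root of a Jordan block can be computed exactly with finitely many terms. A sketch assuming NumPy and SciPy (`sqrt_jordan_block` is an illustrative name):

```python
import numpy as np
from scipy.special import binom   # generalized binomial coefficient C(1/2, k)

def sqrt_jordan_block(lam, n):
    """Square root of the n x n Jordan block lam*I + E (lam > 0) via the
    finite binomial series sqrt(lam) * sum_k C(1/2, k) N**k with N = E/lam."""
    N = np.eye(n, k=1) / lam          # nilpotent part: N**n = 0
    S, Nk = np.zeros((n, n)), np.eye(n)
    for k in range(n):                # only n terms survive
        S += binom(0.5, k) * Nk
        Nk = Nk @ N
    return np.sqrt(lam) * S

J = 4.0 * np.eye(3) + np.eye(3, k=1)  # Jordan block with lambda = 4
S = sqrt_jordan_block(4.0, 3)
assert np.allclose(S @ S, J)
```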
It suffices to check uniqueness for a Jordan block with λ = 1. The square root constructed above has the form ''S'' = ''I'' + ''L'', where ''L'' is a polynomial in ''N'' without constant term. Any other square root ''T'' with positive eigenvalues has the form ''T'' = ''I'' + ''M'' with ''M'' nilpotent, commuting with ''N'' and hence with ''L''. But then 0 = ''S''^2 − ''T''^2 = 2(''L'' − ''M'')(''I'' + (''L'' + ''M'')/2). Since ''L'' and ''M'' commute, the matrix ''L'' − ''M'' is nilpotent and ''I'' + (''L'' + ''M'')/2 is invertible with inverse given by a Neumann series. Hence ''L'' = ''M''.
If ''A'' is a matrix with positive eigenvalues and minimal polynomial ''p''(''t''), then the Jordan decomposition into generalized eigenspaces of ''A'' can be deduced from the partial fraction expansion of ''p''(''t'')^{−1}. The corresponding projections onto the generalized eigenspaces are given by real polynomials in ''A''. On each eigenspace, ''A'' has the form λ(''I'' + ''N'') as above. The power series expression for the square root on the eigenspace shows that the principal square root of ''A'' has the form ''q''(''A''), where ''q''(''t'') is a polynomial with real coefficients.
Power series
Recall the formal power series
: (1 - z)^{1/2} = \sum_{n=0}^{\infty} \binom{1/2}{n} (-z)^n,
which converges provided \|z\| \le 1 (since the coefficients of the power series are summable). Plugging in ''z'' = ''I'' − ''A'' into this expression yields
: A^{1/2} := \sum_{n=0}^{\infty} \binom{1/2}{n} (I - A)^n
provided that \limsup_n \|(I - A)^n\|^{1/n} < 1. By virtue of the Gelfand formula, that condition is equivalent to the requirement that the spectrum of ''A'' is contained within the disk ''D''(1, 1) ⊆ '''C'''. This method of defining or computing ''A''^{1/2} is especially useful in the case where ''A'' is positive semi-definite. In that case, we have \left\| I - A/\|A\| \right\| \le 1 and therefore \left\| \left( I - A/\|A\| \right)^n \right\| \le 1, so that the expression
: A^{1/2} = \|A\|^{1/2} \left( \frac{A}{\|A\|} \right)^{1/2}
defines a square root of ''A'', which moreover turns out to be the unique positive semi-definite root. This method remains valid to define square roots of operators on infinite-dimensional Banach or Hilbert spaces or certain elements of (C*) Banach algebras.
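A sketch of the series in NumPy (illustrative function name; it assumes the spectrum of ''A'' lies in the disk ''D''(1, 1), as for the rescaled positive semidefinite case above):

```python
import numpy as np
from scipy.special import binom

def sqrtm_series(A, terms=200):
    """Binomial-series square root: sum_n C(1/2, n) (A - I)^n."""
    n = A.shape[0]
    S, P = np.zeros_like(A), np.eye(n)   # P holds successive powers (A - I)^k
    for k in range(terms):
        S += binom(0.5, k) * P
        P = P @ (A - np.eye(n))
    return S

A = np.array([[1.0, 0.3], [0.3, 0.8]])   # PSD, spectrum inside D(1, 1)
S = sqrtm_series(A)
assert np.allclose(S @ S, A)
```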
Iterative solutions
By Denman–Beavers iteration
Another way to find the square root of an ''n'' × ''n'' matrix ''A'' is the Denman–Beavers square root iteration.
Let ''Y''_0 = ''A'' and ''Z''_0 = ''I'', where ''I'' is the identity matrix. The iteration is defined by
: Y_{k+1} = \frac12 \left( Y_k + Z_k^{-1} \right), \qquad Z_{k+1} = \frac12 \left( Z_k + Y_k^{-1} \right).
As this uses a pair of sequences of matrix inverses whose later elements change comparatively little, only the first elements have a high computational cost, since the remainder can be computed from earlier elements with only a few passes of a variant of Newton's method for computing inverses,
: X_{n+1} = 2 X_n - X_n B X_n.
With this, for later values of ''k'' one would set ''X''_0 = ''Z''_{''k''−1}^{−1} and ''B'' = ''Z_k'', and then use ''X_n'' ≈ ''Z_k''^{−1} for some small ''n'' (perhaps just 1), and similarly for ''Y_k''^{−1}.
Convergence is not guaranteed, even for matrices that do have square roots, but if the process converges, the matrix ''Y_k'' converges quadratically to a square root ''A''^{1/2}, while ''Z_k'' converges to its inverse, ''A''^{−1/2}.
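A compact sketch of the iteration in NumPy (illustrative name; it inverts both matrices directly at every step rather than using the Newton refinement described above):

```python
import numpy as np

def denman_beavers(A, iters=20):
    """Denman-Beavers iteration: Y_k -> A^(1/2) and Z_k -> A^(-1/2)
    when the iteration converges."""
    Y, Z = A.astype(float), np.eye(A.shape[0])
    for _ in range(iters):
        # Both updates use the previous Y and Z (tuple assignment).
        Y, Z = (Y + np.linalg.inv(Z)) / 2, (Z + np.linalg.inv(Y)) / 2
    return Y, Z

A = np.array([[33.0, 24.0], [48.0, 57.0]])
Y, Z = denman_beavers(A)
assert np.allclose(Y @ Y, A)            # Y ~ A^(1/2) = [[5, 2], [4, 7]]
assert np.allclose(Y @ Z, np.eye(2))    # Z ~ A^(-1/2)
```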
By the Babylonian method
Yet another iterative method is obtained by taking the well-known formula of the Babylonian method for computing the square root of a real number and applying it to matrices. Let ''X''_0 = ''I'', where ''I'' is the identity matrix. The iteration is defined by
: X_{k+1} = \frac12 \left( X_k + A X_k^{-1} \right).
Again, convergence is not guaranteed, but if the process converges, the matrix ''X_k'' converges quadratically to a square root ''A''^{1/2}. Compared to Denman–Beavers iteration, an advantage of the Babylonian method is that only one matrix inverse need be computed per iteration step. On the other hand, as Denman–Beavers iteration uses a pair of sequences of matrix inverses whose later elements change comparatively little, only the first elements have a high computational cost, since the remainder can be computed from earlier elements with only a few passes of a variant of Newton's method for computing inverses (see Denman–Beavers iteration above); of course, the same approach can be used to get the single sequence of inverses needed for the Babylonian method. However, unlike Denman–Beavers iteration, the Babylonian method is numerically unstable and more likely to fail to converge.
The Babylonian method follows from Newton's method for the equation ''X''^2 − ''A'' = 0 and using ''AX_k'' = ''X_k A'' for all ''k''.
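A sketch of the Babylonian iteration (illustrative name; one inverse per step, with the stability caveat noted above):

```python
import numpy as np

def babylonian_sqrtm(A, iters=30):
    """Babylonian iteration X_{k+1} = (X_k + A X_k^{-1}) / 2 with X_0 = I.
    Converges, when it does, to a square root of A."""
    X = np.eye(A.shape[0])
    for _ in range(iters):
        X = (X + A @ np.linalg.inv(X)) / 2
    return X

A = np.array([[33.0, 24.0], [48.0, 57.0]])
X = babylonian_sqrtm(A)
assert np.allclose(X @ X, A)
```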
Square roots of positive operators
In linear algebra and operator theory, given a bounded positive semidefinite operator (a non-negative operator) ''T'' on a complex Hilbert space, ''B'' is a square root of ''T'' if ''T'' = ''B''*''B'', where ''B''* denotes the Hermitian adjoint of ''B''. According to the spectral theorem, the continuous functional calculus can be applied to obtain an operator ''T''^{1/2} such that ''T''^{1/2} is itself positive and (''T''^{1/2})^2 = ''T''. The operator ''T''^{1/2} is the unique non-negative square root of ''T''.
A bounded non-negative operator on a complex Hilbert space is self-adjoint by definition, so ''T'' = (''T''^{1/2})* ''T''^{1/2}. Conversely, it is trivially true that every operator of the form ''B''*''B'' is non-negative. Therefore, an operator ''T'' is non-negative if and only if ''T'' = ''B''*''B'' for some ''B'' (equivalently, ''T'' = ''CC''* for some ''C'').
The
Cholesky factorization provides another particular example of square root, which should not be confused with the unique non-negative square root.
Unitary freedom of square roots
If ''T'' is a non-negative operator on a finite-dimensional Hilbert space, then all square roots of ''T'' are related by unitary transformations. More precisely, if ''T'' = ''A''*''A'' = ''B''*''B'', then there exists a unitary ''U'' such that ''A'' = ''UB''.
Indeed, take ''B'' = ''T''^{1/2} to be the unique non-negative square root of ''T''. If ''T'' is strictly positive, then ''B'' is invertible, and so ''U'' = ''AB''^{−1} is unitary:
: U^* U = \left( B^{-1} \right)^* A^* A \, B^{-1} = B^{-1} T B^{-1} = B^{-1} B^2 B^{-1} = I.
If ''T'' is non-negative without being strictly positive, then the inverse of ''B'' cannot be defined, but the Moore–Penrose pseudoinverse ''B''^{+} can be. In that case, the operator ''AB''^{+} is a partial isometry, that is, a unitary operator from the range of ''T'' onto the range of ''A''. This can then be extended to a unitary operator ''U'' on the whole space by setting it equal to the identity on the kernel of ''T''. More generally, this is true on an infinite-dimensional Hilbert space if, in addition, ''T'' has closed range. In general, if ''A'', ''B'' are closed and densely defined operators on a Hilbert space ''H'', and ''A''*''A'' = ''B''*''B'', then ''A'' = ''UB'' where ''U'' is a partial isometry.
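For matrices, this unitary relation is easy to exhibit numerically. The sketch below (assuming SciPy) relates two different square roots of the same positive definite ''T'': the unique positive semidefinite root and the Cholesky factor:

```python
import numpy as np
from scipy.linalg import cholesky, sqrtm

T = np.array([[2.0, 1.0], [1.0, 2.0]])   # positive definite
B = sqrtm(T)                   # unique positive semidefinite root, T = B*B
C = cholesky(T)                # upper triangular Cholesky factor, T = C*C
U = C @ np.linalg.inv(B)       # the unitary relating the two roots
assert np.allclose(U.conj().T @ U, np.eye(2))   # U is unitary
assert np.allclose(U @ B, C)                    # C = U B
```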
Some applications
Square roots, and the unitary freedom of square roots, have applications throughout functional analysis and linear algebra.
Polar decomposition
If ''A'' is an invertible operator on a finite-dimensional Hilbert space, then there is a unique unitary operator ''U'' and positive operator ''P'' such that
: A = UP;
this is the polar decomposition of ''A''. The positive operator ''P'' is the unique positive square root of the positive operator ''A''*''A'', and ''U'' is defined by ''U'' = ''AP''^{−1}.
If ''A'' is not invertible, then it still has a polar decomposition in which ''P'' is defined in the same way (and is unique). The unitary operator ''U'' is not unique. Rather, it is possible to determine a "natural" unitary operator as follows: ''AP''^{+} is a partial isometry, unitary from the range of ''A''* onto the range of ''A'', which can be extended to the whole space by a unitary map from the kernel of ''A'' onto the kernel of ''A''* (in finite dimensions these kernels have the same dimension). The resulting unitary operator ''U'' then yields the polar decomposition of ''A''.
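A sketch of the invertible case (the `polar` helper here is illustrative; SciPy also ships `scipy.linalg.polar` as a library version):

```python
import numpy as np
from scipy.linalg import sqrtm

def polar(A):
    """Polar decomposition A = U P of an invertible A:
    P = (A*A)^(1/2) and U = A P^{-1}."""
    P = sqrtm(A.conj().T @ A)
    U = A @ np.linalg.inv(P)
    return U, P

A = np.array([[1.0, 2.0], [-1.0, 1.0]])
U, P = polar(A)
assert np.allclose(U @ P, A)
assert np.allclose(U.conj().T @ U, np.eye(2))   # U is unitary
```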
Kraus operators
By Choi's result, a linear map
: \Phi : \mathbb{C}^{n \times n} \to \mathbb{C}^{m \times m}
is completely positive if and only if it is of the form
: \Phi(A) = \sum_{i=1}^{k} V_i A V_i^*,
where ''k'' ≤ ''nm''. Let {''E''_{pq}} ⊂ '''C'''^{''n'' × ''n''} be the ''n''^2 elementary matrix units. The positive matrix
: M_\Phi = \sum_{p,q} E_{pq} \otimes \Phi(E_{pq})
is called the ''Choi matrix'' of Φ. The Kraus operators correspond to the, not necessarily square, square roots of ''M''_Φ: for any square root ''B'' of ''M''_Φ, one can obtain a family of Kraus operators ''V_i'' by undoing the Vec operation to each column ''b_i'' of ''B''. Thus all sets of Kraus operators are related by partial isometries.
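The sketch below (assuming NumPy; the function name, reshape convention, and the single-Kraus example are chosen here for illustration) builds the Choi matrix of a map, takes a square root via the eigendecomposition, and unvectorizes its columns back into Kraus operators:

```python
import numpy as np

def kraus_from_choi(M, n, m):
    """Kraus operators from a square root of the Choi matrix
    M = sum_pq E_pq (x) Phi(E_pq): each column of the root, cut into n
    blocks of length m, becomes an m x n operator V_i."""
    w, vecs = np.linalg.eigh(M)
    ops = []
    for i in range(len(w)):
        if w[i] > 1e-12:
            col = np.sqrt(w[i]) * vecs[:, i]   # column of a square root B
            ops.append(col.reshape(n, m).T)    # undo the Vec operation
    return ops

n = m = 2
V = np.array([[0.0, 1.0], [1.0, 0.0]])         # example map: Phi(A) = V A V*
E = lambda p, q: np.outer(np.eye(n)[p], np.eye(n)[q])
M = sum(np.kron(E(p, q), V @ E(p, q) @ V.conj().T)
        for p in range(n) for q in range(n))
ops = kraus_from_choi(M, n, m)
A = np.array([[1.0, 2.0], [3.0, 4.0]])
assert np.allclose(sum(W @ A @ W.conj().T for W in ops), V @ A @ V.conj().T)
```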
Mixed ensembles
In quantum physics, a density matrix for an ''n''-level quantum system is an ''n'' × ''n'' complex matrix ''ρ'' that is positive semidefinite with trace 1. If ''ρ'' can be expressed as
: \rho = \sum_i p_i v_i v_i^*,
where ''p_i'' > 0 and Σ ''p_i'' = 1, the set
: \{ p_i, v_i \}
is said to be an ensemble that describes the mixed state ''ρ''. Notice {''v_i''} is not required to be orthogonal. Different ensembles describing the state ''ρ'' are related by unitary operators, via the square roots of ''ρ''. For instance, suppose
: \rho = \sum_j a_j a_j^*.
The trace 1 condition means
: \sum_j a_j^* a_j = 1.
Let
: p_i = a_i^* a_i,
and let ''v_i'' be the normalized ''a_i''. We see that
: \{ p_i, v_i \}
gives the mixed state ''ρ''.
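A small numerical sketch (assuming SciPy; the density matrix here is an arbitrary example) extracts such an ensemble from the columns of the positive square root of ''ρ'':

```python
import numpy as np
from scipy.linalg import sqrtm

rho = np.array([[0.7, 0.2], [0.2, 0.3]])   # positive definite, trace 1
B = sqrtm(rho)                              # rho = B B*, columns a_i
ensemble = []
for a in B.T:                               # iterate over columns of B
    p = np.vdot(a, a).real                  # weight p_i = a_i* a_i
    ensemble.append((p, a / np.sqrt(p)))    # normalized state v_i
assert np.isclose(sum(p for p, _ in ensemble), 1.0)   # trace condition
assert np.allclose(sum(p * np.outer(v, v.conj()) for p, v in ensemble), rho)
```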
See also
* Matrix function
* Holomorphic functional calculus
* Logarithm of a matrix
* Sylvester's formula
* Square root of a 2 by 2 matrix