In numerical analysis, one of the most important problems is designing efficient and

stable A stable is a building in which livestock, especially horses, are kept. It most commonly means a building that is divided into separate stalls for individual animals and livestock. There are many different types of stables in use today; the ...

algorithms for finding the eigenvalues of a

matrix Matrix most commonly refers to: * ''The Matrix'' (franchise), an American media franchise ** ''The Matrix'', a 1999 science-fiction action film ** "The Matrix", a fictional setting, a virtual reality environment, within ''The Matrix'' (franchis ...

. These eigenvalue algorithms may also find eigenvectors.

Eigenvalues and eigenvectors

Given an

square matrix In mathematics, a square matrix is a matrix with the same number of rows and columns. An ''n''-by-''n'' matrix is known as a square matrix of order Any two square matrices of the same order can be added and multiplied. Square matrices are often ...

of real or complex numbers, an '' eigenvalue'' and its associated '' generalized eigenvector'' are a pair obeying the relation :

\left(A - \lambda I\right)^k  = 0,

where is a nonzero column vector, is the

identity matrix In linear algebra, the identity matrix of size n is the n\times n square matrix with ones on the main diagonal and zeros elsewhere. Terminology and notation The identity matrix is often denoted by I_n, or simply by I if the size is immaterial o ...

, is a positive integer, and both and are allowed to be complex even when is real. When , the vector is called simply an '' eigenvector'', and the pair is called an ''eigenpair''. In this case, . Any eigenvalue of has ordinaryThe term "ordinary" is used here only to emphasize the distinction between "eigenvector" and "generalized eigenvector". eigenvectors associated to it, for if is the smallest integer such that for a generalized eigenvector , then is an ordinary eigenvector. The value can always be taken as less than or equal to . In particular, for all generalized eigenvectors associated with . For each eigenvalue of , the kernel consists of all eigenvectors associated with (along with 0), called the '' eigenspace'' of , while the vector space consists of all generalized eigenvectors, and is called the ''

generalized eigenspace In linear algebra, a generalized eigenvector of an n\times n matrix A is a vector which satisfies certain criteria which are more relaxed than those for an (ordinary) eigenvector. Let V be an n-dimensional vector space; let \phi be a linear map ...

''. The '' geometric multiplicity'' of is the dimension of its eigenspace. The '' algebraic multiplicity'' of is the dimension of its generalized eigenspace. The latter terminology is justified by the equation :

p_A\left(z\right) = \det\left( zI - A \right) = \prod_^k (z - \lambda_i)^,

where is the determinant function, the are all the distinct eigenvalues of and the are the corresponding algebraic multiplicities. The function is the ''

characteristic polynomial In linear algebra, the characteristic polynomial of a square matrix is a polynomial which is invariant under matrix similarity and has the eigenvalues as roots. It has the determinant and the trace of the matrix among its coefficients. The chara ...

'' of . So the algebraic multiplicity is the multiplicity of the eigenvalue as a zero of the characteristic polynomial. Since any eigenvector is also a generalized eigenvector, the geometric multiplicity is less than or equal to the algebraic multiplicity. The algebraic multiplicities sum up to , the degree of the characteristic polynomial. The equation is called the ''characteristic equation'', as its roots are exactly the eigenvalues of . By the Cayley–Hamilton theorem, itself obeys the same equation: .where the constant term is multiplied by the identity matrix . As a consequence, the columns of the matrix

\prod_ (A - \lambda_iI)^

must be either 0 or generalized eigenvectors of the eigenvalue , since they are annihilated by

(A - \lambda_jI)^

. In fact, the column space is the generalized eigenspace of . Any collection of generalized eigenvectors of distinct eigenvalues is linearly independent, so a basis for all of can be chosen consisting of generalized eigenvectors. More particularly, this basis can be chosen and organized so that * if and have the same eigenvalue, then so does for each between and , and * if is not an ordinary eigenvector, and if is its eigenvalue, then (in particular, must be an ordinary eigenvector). If these basis vectors are placed as the column vectors of a matrix , then can be used to convert to its Jordan normal form: :

V^AV = \begin \lambda_1 & \beta_1 & 0 & \ldots & 0 \\ 0 & \lambda_2 & \beta_2 & \ldots & 0 \\ 0 & 0 & \lambda_3 & \ldots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \ldots & \lambda_n \end,

where the are the eigenvalues, if and otherwise. More generally, if is any invertible matrix, and is an eigenvalue of with generalized eigenvector , then . Thus is an eigenvalue of with generalized eigenvector . That is, similar matrices have the same eigenvalues.

Normal, Hermitian, and real-symmetric matrices

The adjoint of a complex matrix is the transpose of the conjugate of : . A square matrix is called '' normal'' if it commutes with its adjoint: . It is called '' Hermitian'' if it is equal to its adjoint: . All Hermitian matrices are normal. If has only real elements, then the adjoint is just the transpose, and is Hermitian if and only if it is symmetric. When applied to column vectors, the adjoint can be used to define the canonical inner product on : .This ordering of the inner product (with the conjugate-linear position on the left), is preferred by physicists. Algebraists often place the conjugate-linear position on the right: . Normal, Hermitian, and real-symmetric matrices have several useful properties: * Every generalized eigenvector of a normal matrix is an ordinary eigenvector. * Any normal matrix is similar to a diagonal matrix, since its Jordan normal form is diagonal. * Eigenvectors of distinct eigenvalues of a normal matrix are orthogonal. * The null space and the image (or column space) of a normal matrix are orthogonal to each other. * For any normal matrix , has an orthonormal basis consisting of eigenvectors of . The corresponding matrix of eigenvectors is

unitary Unitary may refer to: Mathematics * Unitary divisor * Unitary element * Unitary group * Unitary matrix * Unitary morphism * Unitary operator * Unitary transformation * Unitary representation * Unitarity (physics) * ''E''-unitary inverse semigroup ...

. * The eigenvalues of a Hermitian matrix are real, since for a non-zero eigenvector . * If is real, there is an orthonormal basis for consisting of eigenvectors of if and only if is symmetric. It is possible for a real or complex matrix to have all real eigenvalues without being Hermitian. For example, a real triangular matrix has its eigenvalues along its diagonal, but in general is not symmetric.

Condition number

Any problem of numeric calculation can be viewed as the evaluation of some function for some input . The

condition number In numerical analysis, the condition number of a function measures how much the output value of the function can change for a small change in the input argument. This is used to measure how sensitive a function is to changes or errors in the input ...

of the problem is the ratio of the relative error in the function's output to the relative error in the input, and varies with both the function and the input. The condition number describes how error grows during the calculation. Its base-10 logarithm tells how many fewer digits of accuracy exist in the result than existed in the input. The condition number is a best-case scenario. It reflects the instability built into the problem, regardless of how it is solved. No algorithm can ever produce more accurate results than indicated by the condition number, except by chance. However, a poorly designed algorithm may produce significantly worse results. For example, as mentioned below, the problem of finding eigenvalues for normal matrices is always well-conditioned. However, the problem of finding the roots of a polynomial can be very ill-conditioned. Thus eigenvalue algorithms that work by finding the roots of the characteristic polynomial can be ill-conditioned even when the problem is not. For the problem of solving the linear equation where is invertible, the matrix condition number is given by , where is the operator norm subordinate to the normal

Euclidean norm Euclidean space is the fundamental space of geometry, intended to represent physical space. Originally, that is, in Euclid's ''Elements'', it was the three-dimensional space of Euclidean geometry, but in modern mathematics there are Euclidean s ...

on . Since this number is independent of and is the same for and , it is usually just called the condition number of the matrix . This value is also the absolute value of the ratio of the largest eigenvalue of to its smallest. If is

, then , so . For general matrices, the operator norm is often difficult to calculate. For this reason, other

matrix norms In mathematics, a matrix norm is a vector norm in a vector space whose elements (vectors) are matrices (of given dimensions). Preliminaries Given a field K of either real or complex numbers, let K^ be the -vector space of matrices with m rows ...

are commonly used to estimate the condition number. For the eigenvalue problem, Bauer and Fike proved that if is an eigenvalue for a diagonalizable matrix with eigenvector matrix , then the absolute error in calculating is bounded by the product of and the absolute error in . As a result, the condition number for finding is . If is normal, then is unitary, and . Thus the eigenvalue problem for all normal matrices is well-conditioned. The condition number for the problem of finding the eigenspace of a normal matrix corresponding to an eigenvalue has been shown to be inversely proportional to the minimum distance between and the other distinct eigenvalues of . In particular, the eigenspace problem for normal matrices is well-conditioned for isolated eigenvalues. When eigenvalues are not isolated, the best that can be hoped for is to identify the span of all eigenvectors of nearby eigenvalues.

Algorithms

Any monic polynomial is the characteristic polynomial of its companion matrix. Therefore, a general algorithm for finding eigenvalues could also be used to find the roots of polynomials. The Abel–Ruffini theorem shows that any such algorithm for dimensions greater than 4 must either be infinite, or involve functions of greater complexity than elementary arithmetic operations and fractional powers. For this reason algorithms that exactly calculate eigenvalues in a finite number of steps only exist for a few special classes of matrices. For general matrices, algorithms are iterative, producing better approximate solutions with each iteration. Some algorithms produce every eigenvalue, others will produce a few, or only one. However, even the latter algorithms can be used to find all eigenvalues. Once an eigenvalue of a matrix has been identified, it can be used to either direct the algorithm towards a different solution next time, or to reduce the problem to one that no longer has as a solution. Redirection is usually accomplished by shifting: replacing with for some constant . The eigenvalue found for must have added back in to get an eigenvalue for . For example, for power iteration, . Power iteration finds the largest eigenvalue in absolute value, so even when is only an approximate eigenvalue, power iteration is unlikely to find it a second time. Conversely,

inverse iteration In numerical analysis, inverse iteration (also known as the ''inverse power method'') is an Iterative method, iterative eigenvalue algorithm. It allows one to find an approximate eigenvector when an approximation to a corresponding eigenvalue is alr ...

based methods find the lowest eigenvalue, so is chosen well away from and hopefully closer to some other eigenvalue. Reduction can be accomplished by restricting to the column space of the matrix , which carries to itself. Since is singular, the column space is of lesser dimension. The eigenvalue algorithm can then be applied to the restricted matrix. This process can be repeated until all eigenvalues are found. If an eigenvalue algorithm does not produce eigenvectors, a common practice is to use an inverse iteration based algorithm with set to a close approximation to the eigenvalue. This will quickly converge to the eigenvector of the closest eigenvalue to . For small matrices, an alternative is to look at the column space of the product of for each of the other eigenvalues . A formula for the norm of unit eigenvector components of normal matrices was discovered by Robert Thompson in 1966 and rediscovered independently by several others. If is an

n \times n

normal matrix with eigenvalues and corresponding unit eigenvectors whose component entries are , let be the

n - 1 \times n - 1

matrix obtained by removing the -th row and column from , and let be its -th eigenvalue. Then

, v_, ^2 \prod_^n (\lambda_i(A) - \lambda_k(A)) = \prod_^(\lambda_i(A) - \lambda_k(A_j))

p, p_j

are the characteristic polynomials of

A

and

A_j

, the formula can be re-written as

, v_, ^2 = \frac

assuming the derivative

p'

is not zero at

\lambda_i(A)

Hessenberg and tridiagonal matrices

Because the eigenvalues of a triangular matrix are its diagonal elements, for general matrices there is no finite method like

gaussian elimination In mathematics, Gaussian elimination, also known as row reduction, is an algorithm for solving systems of linear equations. It consists of a sequence of operations performed on the corresponding matrix of coefficients. This method can also be used ...

to convert a matrix to triangular form while preserving eigenvalues. But it is possible to reach something close to triangular. An upper Hessenberg matrix is a square matrix for which all entries below the

subdiagonal In geometry, a diagonal is a line segment joining two vertices of a polygon or polyhedron, when those vertices are not on the same edge. Informally, any sloping line is called diagonal. The word ''diagonal'' derives from the ancient Greek δ� ...

are zero. A lower Hessenberg matrix is one for which all entries above the

superdiagonal In geometry, a diagonal is a line segment joining two vertices of a polygon or polyhedron, when those vertices are not on the same edge. Informally, any sloping line is called diagonal. The word ''diagonal'' derives from the ancient Greek δ� ...

are zero. Matrices that are both upper and lower Hessenberg are

tridiagonal In linear algebra, a tridiagonal matrix is a band matrix that has nonzero elements only on the main diagonal, the subdiagonal/lower diagonal (the first diagonal below this), and the supradiagonal/upper diagonal (the first diagonal above the main di ...

. Hessenberg and tridiagonal matrices are the starting points for many eigenvalue algorithms because the zero entries reduce the complexity of the problem. Several methods are commonly used to convert a general matrix into a Hessenberg matrix with the same eigenvalues. If the original matrix was symmetric or Hermitian, then the resulting matrix will be tridiagonal. When only eigenvalues are needed, there is no need to calculate the similarity matrix, as the transformed matrix has the same eigenvalues. If eigenvectors are needed as well, the similarity matrix may be needed to transform the eigenvectors of the Hessenberg matrix back into eigenvectors of the original matrix. For symmetric tridiagonal eigenvalue problems all eigenvalues (without eigenvectors) can be computed numerically in time O(n log(n)), using bisection on the characteristic polynomial.

Iterative algorithms

Iterative algorithms solve the eigenvalue problem by producing sequences that converge to the eigenvalues. Some algorithms also produce sequences of vectors that converge to the eigenvectors. Most commonly, the eigenvalue sequences are expressed as sequences of similar matrices which converge to a triangular or diagonal form, allowing the eigenvalues to be read easily. The eigenvector sequences are expressed as the corresponding similarity matrices.

Direct calculation

While there is no simple algorithm to directly calculate eigenvalues for general matrices, there are numerous special classes of matrices where eigenvalues can be directly calculated. These include:

Triangular matrices

Since the determinant of a triangular matrix is the product of its diagonal entries, if ''T'' is triangular, then

\det(\lambda I - T) = \prod_i (\lambda - T_)

. Thus the eigenvalues of ''T'' are its diagonal entries.

Factorable polynomial equations

If is any polynomial and then the eigenvalues of also satisfy the same equation. If happens to have a known factorization, then the eigenvalues of lie among its roots. For example, a

projection Projection, projections or projective may refer to: Physics * Projection (physics), the action/process of light, heat, or sound reflecting from a surface to another in a different direction * The display of images by a projector Optics, graphic ...

is a square matrix satisfying . The roots of the corresponding scalar polynomial equation, , are 0 and 1. Thus any projection has 0 and 1 for its eigenvalues. The multiplicity of 0 as an eigenvalue is the nullity of , while the multiplicity of 1 is the rank of . Another example is a matrix that satisfies for some scalar . The eigenvalues must be . The projection operators :

P_+=\frac\left(I+\frac\right)

P_-=\frac\left(I-\frac\right)

satisfy :

AP_+=\alpha P_+ \quad AP_-=-\alpha P_-

and :

P_+P_+=P_+ \quad P_-P_-=P_- \quad P_+P_-=P_-P_+=0.

The column spaces of and are the eigenspaces of corresponding to and , respectively.

2×2 matrices

For dimensions 2 through 4, formulas involving radicals exist that can be used to find the eigenvalues. While a common practice for 2×2 and 3×3 matrices, for 4×4 matrices the increasing complexity of the root formulas makes this approach less attractive. For the 2×2 matrix :

A = \begin a  & b \\ c & d \end,

the characteristic polynomial is :

\det \begin \lambda - a & -b \\ -c & \lambda - d \end = \lambda^2\, -\, \left( a + d \right )\lambda\, +\, \left ( ad - bc \right ) = \lambda^2\, -\, \lambda\, (A)\, +\, \det(A).

Thus the eigenvalues can be found by using the quadratic formula: :

\lambda = \frac.

Defining

\left ( A \right ) = \sqrt

to be the distance between the two eigenvalues, it is straightforward to calculate :

\frac = \frac\left ( 1 \pm \frac \right ),\qquad \frac =  \frac

with similar formulas for and . From this it follows that the calculation is well-conditioned if the eigenvalues are isolated. Eigenvectors can be found by exploiting the Cayley–Hamilton theorem. If are the eigenvalues, then , so the columns of are annihilated by and vice versa. Assuming neither matrix is zero, the columns of each must include eigenvectors for the other eigenvalue. (If either matrix is zero, then is a multiple of the identity and any non-zero vector is an eigenvector.) For example, suppose :

A = \begin 4 & 3 \\ -2 & -3 \end,

then and , so the characteristic equation is :

0 = \lambda^2 - \lambda - 6 = (\lambda - 3)(\lambda + 2),

and the eigenvalues are 3 and -2. Now, :

A - 3I = \begin 1 & 3 \\ -2 & -6 \end, \qquad  A + 2I = \begin 6 & 3 \\ -2 & -1 \end.

In both matrices, the columns are multiples of each other, so either column can be used. Thus, can be taken as an eigenvector associated with the eigenvalue −2, and as an eigenvector associated with the eigenvalue 3, as can be verified by multiplying them by .

3×3 matrices

The characteristic equation of a symmetric 3×3 matrix is: :

\det \left( \alpha I - A \right) = \alpha^3 - \alpha^2 (A) - \alpha \frac\left( (A^2) - ^2(A) \right) - \det(A) = 0.

This equation may be solved using the methods of Cardano or

Lagrange Joseph-Louis Lagrange (born Giuseppe Luigi Lagrangiatrigonometric solution. If , then and have the same eigenvectors, and is an eigenvalue of if and only if is an eigenvalue of . Letting

q = (A)/3

and

p =\left(\left((A - qI)^2\right)/ 6\right)^

, gives :

\det \left( \beta I - B \right) = \beta^3 - 3 \beta - \det(B) = 0.

The substitution and some simplification using the identity reduces the equation to . Thus :

\beta = 2\left(\frac\left( \det(B)/2 \right) + \frac\right), \quad k = 0, 1, 2.

If is complex or is greater than 2 in absolute value, the arccosine should be taken along the same branch for all three values of . This issue doesn't arise when is real and symmetric, resulting in a simple algorithm: % Given a real symmetric 3x3 matrix A, compute the eigenvalues % Note that acos and cos operate on angles in radians p1 = A(1,2)^2 + A(1,3)^2 + A(2,3)^2 if (p1

0) % A is diagonal. eig1 = A(1,1) eig2 = A(2,2) eig3 = A(3,3) else q = trace(A)/3 % trace(A) is the sum of all diagonal values p2 = (A(1,1) - q)^2 + (A(2,2) - q)^2 + (A(3,3) - q)^2 + 2 * p1 p = sqrt(p2 / 6) B = (1 / p) * (A - q * I) % I is the identity matrix r = det(B) / 2 % In exact arithmetic for a symmetric matrix -1 <= r <= 1 % but computation error can leave it slightly outside this range. if (r <= -1) phi = pi / 3 elseif (r >= 1) phi = 0 else phi = acos(r) / 3 end % the eigenvalues satisfy eig3 <= eig2 <= eig1 eig1 = q + 2 * p * cos(phi) eig3 = q + 2 * p * cos(phi + (2*pi/3)) eig2 = 3 * q - eig1 - eig3 % since trace(A) = eig1 + eig2 + eig3 end Once again, the eigenvectors of can be obtained by recourse to the Cayley–Hamilton theorem. If are distinct eigenvalues of , then . Thus the columns of the product of any two of these matrices will contain an eigenvector for the third eigenvalue. However, if , then and . Thus the ''generalized'' eigenspace of is spanned by the columns of while the ordinary eigenspace is spanned by the columns of . The ordinary eigenspace of is spanned by the columns of . For example, let :

A = \begin 3 & 2 & 6 \\ 2 & 2 & 5 \\ -2 & -1 & -4 \end.

The characteristic equation is :

0 = \lambda^3 - \lambda^2 - \lambda + 1 = (\lambda - 1)^2(\lambda + 1),

with eigenvalues 1 (of multiplicity 2) and -1. Calculating, :

A - I = \begin 2 & 2 & 6 \\ 2 & 1 & 5 \\ -2 & -1 & -5 \end, \qquad A + I = \begin 4 & 2 & 6 \\ 2 & 3 & 5 \\ -2 & -1 & -3 \end

and :

(A - I)^2 = \begin -4 & 0 & -8 \\ -4 & 0 & -8 \\ 4 & 0 & 8 \end, \qquad (A - I)(A + I) = \begin 0 & 4 & 4 \\ 0 & 2 & 2 \\ 0 & -2 & -2 \end

Thus is an eigenvector for −1, and is an eigenvector for 1. and are both generalized eigenvectors associated with 1, either one of which could be combined with and to form a basis of generalized eigenvectors of . Once found, the eigenvectors can be normalized if needed.

Eigenvectors of normal 3×3 matrices

If a 3×3 matrix

A

is normal, then the cross-product can be used to find eigenvectors. If

\lambda

is an eigenvalue of

A

, then the null space of

A - \lambda I

is perpendicular to its column space. The

cross product In mathematics, the cross product or vector product (occasionally directed area product, to emphasize its geometric significance) is a binary operation on two vectors in a three-dimensional oriented Euclidean vector space (named here E), and is ...

of two independent columns of

A - \lambda I

will be in the null space. That is, it will be an eigenvector associated with

\lambda

. Since the column space is two dimensional in this case, the eigenspace must be one dimensional, so any other eigenvector will be parallel to it. If

A - \lambda I

does not contain two independent columns but is not , the cross-product can still be used. In this case

\lambda

is an eigenvalue of multiplicity 2, so any vector perpendicular to the column space will be an eigenvector. Suppose

\mathbf v

is a non-zero column of

A - \lambda I

. Choose an arbitrary vector

\mathbf u

not parallel to

\mathbf v

. Then

\mathbf v\times \mathbf u

and

(\mathbf v\times \mathbf u)\times \mathbf v

will be perpendicular to

\mathbf v

and thus will be eigenvectors of

\lambda

. This does not work when

A

is not normal, as the null space and column space do not need to be perpendicular for such matrices.