Arnoldi Iteration

In numerical linear algebra, the Arnoldi iteration is an eigenvalue algorithm and an important example of an iterative method. Arnoldi finds an approximation to the eigenvalues and eigenvectors of general (possibly non-Hermitian) matrices by constructing an orthonormal basis of the Krylov subspace, which makes it particularly useful when dealing with large sparse matrices.

The Arnoldi method belongs to a class of linear algebra algorithms that give a partial result after a small number of iterations, in contrast to so-called ''direct methods'', which must run to completion to give any useful result (see, for example, the Householder transformation). The partial result in this case is the first few vectors of the basis the algorithm is building. When applied to Hermitian matrices it reduces to the Lanczos algorithm. The Arnoldi iteration was invented by W. E. Arnoldi in 1951.


Krylov subspaces and the power iteration

An intuitive method for finding the largest (in absolute value) eigenvalue of a given ''m'' × ''m'' matrix A is the power iteration: starting with an arbitrary initial vector b, calculate Ab, A^2 b, A^3 b, \ldots, normalizing the result after every application of the matrix ''A''. This sequence converges to the eigenvector corresponding to the eigenvalue with the largest absolute value, \lambda_{\max}. However, much potentially useful computation is wasted by using only the final result, A^{n-1} b. This suggests that instead, we form the so-called ''Krylov matrix'':

:K_n = \begin{bmatrix} b & Ab & A^2 b & \cdots & A^{n-1} b \end{bmatrix}.

The columns of this matrix are not in general orthogonal, but we can extract an orthogonal basis via a method such as Gram–Schmidt orthogonalization. The resulting set of vectors is thus an orthogonal basis of the ''Krylov subspace'' \mathcal{K}_n. We may expect the vectors of this basis to span good approximations of the eigenvectors corresponding to the n largest eigenvalues, for the same reason that A^{n-1} b approximates the dominant eigenvector.
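Both ideas can be sketched in a few lines of NumPy. This is an illustrative sketch; the function names `power_iteration` and `krylov_matrix` are made up here, not taken from any library:

```python
import numpy as np

def power_iteration(A, b, num_iters=100):
    """Repeatedly apply A and normalize; under mild assumptions the
    iterate converges to the dominant eigenvector."""
    q = b / np.linalg.norm(b)
    for _ in range(num_iters):
        q = A @ q
        q = q / np.linalg.norm(q)
    # The Rayleigh quotient estimates the corresponding eigenvalue
    return q, q @ (A @ q)

def krylov_matrix(A, b, n):
    """Collect the intermediate vectors b, Ab, ..., A^{n-1} b as columns
    (the very vectors the plain power iteration throws away)."""
    cols = [b]
    for _ in range(n - 1):
        cols.append(A @ cols[-1])
    return np.column_stack(cols)
```

In practice the columns of K_n quickly become nearly parallel, so forming K_n explicitly and orthogonalizing afterwards is numerically fragile; the Arnoldi iteration below orthogonalizes each new vector as it is generated instead.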


The Arnoldi iteration

The Arnoldi iteration uses the modified Gram–Schmidt process to produce a sequence of orthonormal vectors, ''q''1, ''q''2, ''q''3, ..., called the ''Arnoldi vectors'', such that for every ''n'', the vectors ''q''1, ..., ''q''''n'' span the Krylov subspace \mathcal{K}_n. Explicitly, the algorithm is as follows:

 Start with an arbitrary vector ''q''1 with norm 1.
 Repeat for ''k'' = 2, 3, ...
   ''q''''k'' := ''A'' ''q''''k''−1
   for ''j'' from 1 to ''k'' − 1
     ''h''''j'',''k''−1 := ''q''''j''* ''q''''k''
     ''q''''k'' := ''q''''k'' − ''h''''j'',''k''−1 ''q''''j''
   ''h''''k'',''k''−1 := ||''q''''k''||
   ''q''''k'' := ''q''''k'' / ''h''''k'',''k''−1

The ''j''-loop projects out the component of q_k in the directions of q_1, \dots, q_{k-1}. This ensures the orthogonality of all the generated vectors. The algorithm breaks down when ''q''''k'' is the zero vector. This happens when the minimal polynomial of ''A'' is of degree ''k''. In most applications of the Arnoldi iteration, including the eigenvalue algorithm below and GMRES, the algorithm has converged at this point.

Every step of the ''k''-loop takes one matrix–vector product and approximately 4''mk'' floating point operations.

In the programming language Python with support of the NumPy library:

```python
import numpy as np

def arnoldi_iteration(A, b, n: int):
    """Compute a basis of the (n + 1)-Krylov subspace of A: the space
    spanned by {b, Ab, ..., A^n b}.

    Arguments
      A: m × m array
      b: initial vector (length m)
      n: dimension of Krylov subspace, must be >= 1

    Returns
      Q: m x (n + 1) array, the columns are an orthonormal basis of the
         Krylov subspace.
      h: (n + 1) x n array, A on basis Q. It is upper Hessenberg.
    """
    eps = 1e-12
    h = np.zeros((n + 1, n))
    Q = np.zeros((A.shape[0], n + 1))
    # Normalize the input vector and use it as the first Krylov vector
    Q[:, 0] = b / np.linalg.norm(b, 2)
    for k in range(1, n + 1):
        v = np.dot(A, Q[:, k - 1])  # Generate a new candidate vector
        for j in range(k):  # Subtract the projections on previous vectors
            h[j, k - 1] = np.dot(Q[:, j].conj(), v)
            v = v - h[j, k - 1] * Q[:, j]
        h[k, k - 1] = np.linalg.norm(v, 2)
        if h[k, k - 1] > eps:  # Add the produced vector to the list, unless
            Q[:, k] = v / h[k, k - 1]
        else:  # the norm vanished. If that happens, stop iterating.
            return Q, h
    return Q, h
```


Properties of the Arnoldi iteration

Let ''Q''''n'' denote the ''m''-by-''n'' matrix formed by the first ''n'' Arnoldi vectors ''q''1, ''q''2, ..., ''q''''n'', and let ''H''''n'' be the (upper Hessenberg) matrix formed by the numbers ''h''''j'',''k'' computed by the algorithm:

:H_n = Q_n^* A Q_n.

The orthogonalization method has to be specifically chosen such that the lower Arnoldi/Krylov components are removed from higher Krylov vectors. As A q_i can be expressed in terms of ''q''1, ..., ''q''''i''+1 by construction, the vectors A q_i are orthogonal to ''q''''i''+2, ..., ''q''''n''. We then have

:H_n = \begin{bmatrix} h_{1,1} & h_{1,2} & h_{1,3} & \cdots & h_{1,n} \\ h_{2,1} & h_{2,2} & h_{2,3} & \cdots & h_{2,n} \\ 0 & h_{3,2} & h_{3,3} & \cdots & h_{3,n} \\ \vdots & \ddots & \ddots & \ddots & \vdots \\ 0 & \cdots & 0 & h_{n,n-1} & h_{n,n} \end{bmatrix}.

The matrix ''H''''n'' can be viewed as ''A'' in the subspace \mathcal{K}_n with the Arnoldi vectors as an orthogonal basis; ''A'' is orthogonally projected onto \mathcal{K}_n. The matrix ''H''''n'' can be characterized by the following optimality condition. The characteristic polynomial of ''H''''n'' minimizes ||''p''(''A'')''q''1||2 among all monic polynomials of degree ''n''. This optimality problem has a unique solution if and only if the Arnoldi iteration does not break down.

The relation between the ''Q'' matrices in subsequent iterations is given by

:A Q_n = Q_{n+1} \tilde{H}_n

where

:\tilde{H}_n = \begin{bmatrix} h_{1,1} & h_{1,2} & h_{1,3} & \cdots & h_{1,n} \\ h_{2,1} & h_{2,2} & h_{2,3} & \cdots & h_{2,n} \\ 0 & h_{3,2} & h_{3,3} & \cdots & h_{3,n} \\ \vdots & \ddots & \ddots & \ddots & \vdots \\ \vdots & & 0 & h_{n,n-1} & h_{n,n} \\ 0 & \cdots & \cdots & 0 & h_{n+1,n} \end{bmatrix}

is an (''n''+1)-by-''n'' matrix formed by adding an extra row to ''H''''n''.
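These identities are easy to check numerically. The sketch below restates the Arnoldi recurrence compactly (real arithmetic, no breakdown handling, purely for illustration) and verifies both H_n = Q_n^* A Q_n and A Q_n = Q_{n+1} \tilde{H}_n on a random matrix:

```python
import numpy as np

def arnoldi(A, b, n):
    """Compact Arnoldi sketch (modified Gram-Schmidt); assumes no breakdown."""
    m = A.shape[0]
    Q = np.zeros((m, n + 1))
    H = np.zeros((n + 1, n))
    Q[:, 0] = b / np.linalg.norm(b)
    for k in range(n):
        v = A @ Q[:, k]
        for j in range(k + 1):          # subtract projections on q_1..q_{k+1}
            H[j, k] = Q[:, j] @ v
            v = v - H[j, k] * Q[:, j]
        H[k + 1, k] = np.linalg.norm(v)
        Q[:, k + 1] = v / H[k + 1, k]
    return Q, H

rng = np.random.default_rng(0)
A = rng.standard_normal((9, 9))
b = rng.standard_normal(9)
n = 5
Q, H = arnoldi(A, b, n)                 # Q is Q_{n+1}, H is the (n+1)-by-n H~_n
Qn = Q[:, :n]                           # Q_n: first n Arnoldi vectors
Hn = H[:n, :]                           # square upper Hessenberg H_n
err_projection = np.linalg.norm(Qn.T @ A @ Qn - Hn)   # H_n = Q_n^* A Q_n
err_recurrence = np.linalg.norm(A @ Qn - Q @ H)       # A Q_n = Q_{n+1} H~_n
```

Both errors come out at roundoff level, reflecting that the two relations hold exactly in exact arithmetic.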


Finding eigenvalues with the Arnoldi iteration

The idea of the Arnoldi iteration as an eigenvalue algorithm is to compute the eigenvalues in the Krylov subspace. The eigenvalues of ''H''''n'' are called the ''Ritz eigenvalues''. Since ''H''''n'' is a Hessenberg matrix of modest size, its eigenvalues can be computed efficiently, for instance with the QR algorithm, or the somewhat related Francis's algorithm. Francis's algorithm itself can also be considered to be related to power iterations, operating on a nested Krylov subspace. In fact, the most basic form of Francis's algorithm appears to be to choose ''b'' equal to ''Ae''1, and to extend ''n'' to the full dimension of ''A''. Improved versions include one or more shifts, and higher powers of ''A'' may be applied in a single step.

This is an example of the Rayleigh–Ritz method. It is often observed in practice that some of the Ritz eigenvalues converge to eigenvalues of ''A''. Since ''H''''n'' is ''n''-by-''n'', it has at most ''n'' eigenvalues, and not all eigenvalues of ''A'' can be approximated. Typically, the Ritz eigenvalues converge to the largest eigenvalues of ''A''. To get the smallest eigenvalues of ''A'', the inverse of ''A'' should be used instead.

This can be related to the characterization of ''H''''n'' as the matrix whose characteristic polynomial minimizes ||''p''(''A'')''q''1|| in the following way. A good way to get ''p''(''A'') small is to choose the polynomial ''p'' such that ''p''(''x'') is small whenever ''x'' is an eigenvalue of ''A''. Hence, the zeros of ''p'' (and thus the Ritz eigenvalues) will be close to the eigenvalues of ''A''. However, the details are not fully understood yet. This is in contrast to the case where ''A'' is Hermitian. In that situation, the Arnoldi iteration becomes the Lanczos iteration, for which the theory is more complete.
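A small experiment illustrates this Rayleigh–Ritz behaviour. The test matrix and its spectrum below are made up for illustration, and the Krylov basis is obtained by QR-factorizing a column-normalized Krylov matrix rather than by the Arnoldi recurrence itself, which gives the same subspace:

```python
import numpy as np

rng = np.random.default_rng(1)
# Symmetric 20 x 20 test matrix with one well-separated eigenvalue, 100
evals = np.concatenate(([100.0], np.linspace(1.0, 10.0, 19)))
X, _ = np.linalg.qr(rng.standard_normal((20, 20)))
A = X @ np.diag(evals) @ X.T

# Orthonormal basis of the order-8 Krylov subspace, then project A onto it
b = rng.standard_normal(20)
cols = [b / np.linalg.norm(b)]
for _ in range(7):
    v = A @ cols[-1]
    cols.append(v / np.linalg.norm(v))  # normalize to tame the scaling
Q, _ = np.linalg.qr(np.column_stack(cols))
H = Q.T @ A @ Q                          # projection of A onto the subspace
ritz = np.sort(np.linalg.eigvalsh(H))    # the Ritz eigenvalues
```

The largest Ritz value approximates the dominant eigenvalue 100 very closely after only 8 steps, while the clustered interior eigenvalues are resolved far more slowly.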


Implicitly restarted Arnoldi method (IRAM)

Due to practical storage considerations, common implementations of Arnoldi methods typically restart after some number of iterations. One major innovation in restarting was due to Lehoucq and Sorensen, who proposed the Implicitly Restarted Arnoldi Method. They also implemented the algorithm in a freely available software package called ARPACK. This has spurred a number of other variations, including the Implicitly Restarted Lanczos method. It also influenced how other restarted methods are analyzed. Theoretical results have shown that convergence improves with an increase in the Krylov subspace dimension ''n''. However, an a priori value of ''n'' which would lead to optimal convergence is not known. Recently a dynamic switching strategy has been proposed which fluctuates the dimension ''n'' before each restart and thus leads to acceleration in the rate of convergence.
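ARPACK is accessible from SciPy, whose `scipy.sparse.linalg.eigs` wraps the implicitly restarted Arnoldi method. A minimal sketch on a sparse diagonal test matrix (chosen so the true eigenvalues are known):

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import eigs

# Sparse 1000 x 1000 diagonal matrix with eigenvalues 1, 2, ..., 1000
A = sp.diags(np.arange(1.0, 1001.0))

# Ask ARPACK's implicitly restarted Arnoldi for the 3 largest-magnitude
# eigenvalues; only matrix-vector products with A are ever performed
vals, vecs = eigs(A, k=3, which='LM')
top = np.sort(vals.real)
```

Note that `eigs` never factorizes or even stores A densely; it only requires the action of A on a vector, which is exactly the matrix-free setting in which Arnoldi-type methods shine.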


See also

The generalized minimal residual method (GMRES) is a method for solving ''Ax'' = ''b'' based on the Arnoldi iteration.
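SciPy exposes such a solver as `scipy.sparse.linalg.gmres`; a minimal sketch on a diagonally dominant sparse system (the matrix here is a made-up example chosen so GMRES converges quickly):

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import gmres

# Diagonally dominant tridiagonal system: 4 on the diagonal, -1 off it
A = sp.diags([4.0, -1.0, -1.0], [0, -1, 1], shape=(100, 100), format='csr')
b = np.ones(100)

# GMRES builds the same Arnoldi basis internally and minimizes the
# residual over the Krylov subspace; info == 0 signals convergence
x, info = gmres(A, b, atol=1e-10)
residual = np.linalg.norm(A @ x - b)
```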


References

* W. E. Arnoldi, "The principle of minimized iterations in the solution of the matrix eigenvalue problem," ''Quarterly of Applied Mathematics'', volume 9, pages 17–29, 1951.
* Yousef Saad, ''Numerical Methods for Large Eigenvalue Problems'', Manchester University Press, 1992.
* Lloyd N. Trefethen and David Bau, III, ''Numerical Linear Algebra'', Society for Industrial and Applied Mathematics, 1997.
* Jaschke, Leonhard: ''Preconditioned Arnoldi Methods for Systems of Nonlinear Equations''. (2004).
* Implementation: Matlab comes with ARPACK built-in. Both stored and implicit matrices can be analyzed through the eigs() function.