The Rayleigh–Ritz method is a direct numerical method of approximating
eigenvalues
In linear algebra, an eigenvector ( ) or characteristic vector is a vector that has its direction unchanged (or reversed) by a given linear transformation. More precisely, an eigenvector \mathbf v of a linear transformation T is scaled by a ...
, originated in the context of solving physical
boundary value problems and named after
Lord Rayleigh and
Walther Ritz.
In this method, an infinite-dimensional
linear operator
In mathematics, and more specifically in linear algebra, a linear map (also called a linear mapping, linear transformation, vector space homomorphism, or in some contexts linear function) is a mapping V \to W between two vector spaces that pr ...
is approximated by a finite-dimensional
compression, on which we can use an
eigenvalue algorithm.
It is used in all applications that involve approximating
eigenvalues and eigenvectors
In linear algebra, an eigenvector ( ) or characteristic vector is a vector that has its direction unchanged (or reversed) by a given linear transformation. More precisely, an eigenvector \mathbf v of a linear transformation T is scaled by a ...
, often under different names. In
quantum mechanics
Quantum mechanics is the fundamental physical Scientific theory, theory that describes the behavior of matter and of light; its unusual characteristics typically occur at and below the scale of atoms. Reprinted, Addison-Wesley, 1989, It is ...
, where a system of particles is described using a
Hamiltonian
Hamiltonian may refer to:
* Hamiltonian mechanics, a function that represents the total energy of a system
* Hamiltonian (quantum mechanics), an operator corresponding to the total energy of that system
** Dyall Hamiltonian, a modified Hamiltonian ...
, the Ritz method uses
trial wave functions to approximate the ground state eigenfunction with the lowest energy. In the
finite element method
Finite element method (FEM) is a popular method for numerically solving differential equations arising in engineering and mathematical modeling. Typical problem areas of interest include the traditional fields of structural analysis, heat tran ...
context, mathematically the same algorithm is commonly called the
Ritz-Galerkin method. The Rayleigh–Ritz method or Ritz method terminology is typical in mechanical and structural engineering to approximate the
eigenmodes and
resonant frequencies of a structure.
Naming and attribution
The name of the method and its origin story have been debated by historians.
It has been called Ritz method after
Walther Ritz, since the numerical procedure has been published by
Walther Ritz in 1908-1909. According to A. W. Leissa,
Lord Rayleigh wrote a paper congratulating Ritz on his work in 1911, but stating that he himself had used Ritz's method in many places in his book and in another publication. This statement, although later disputed, and the fact that the method in the trivial case of a single vector results in the
Rayleigh quotient make the case for the name ''Rayleigh–Ritz'' method. According to S. Ilanko,
citing
Richard Courant, both
Lord Rayleigh and
Walther Ritz independently conceived the idea of utilizing the equivalence between
boundary value problems of
partial differential equations
In mathematics, a partial differential equation (PDE) is an equation which involves a multivariable function and one or more of its partial derivatives.
The function is often thought of as an "unknown" that solves the equation, similar to how ...
on the one hand and problems of the
calculus of variations
The calculus of variations (or variational calculus) is a field of mathematical analysis that uses variations, which are small changes in Function (mathematics), functions
and functional (mathematics), functionals, to find maxima and minima of f ...
on the other hand for numerical calculation of the solutions, by substituting for the variational problems simpler approximating extremum problems in which a finite number of parameters need to be determined. Ironically for the debate, the modern justification of the algorithm drops the
calculus of variations
The calculus of variations (or variational calculus) is a field of mathematical analysis that uses variations, which are small changes in Function (mathematics), functions
and functional (mathematics), functionals, to find maxima and minima of f ...
in favor of the simpler and more general approach of
orthogonal projection as in
Galerkin method named after
Boris Galerkin, thus leading also to the
Ritz-Galerkin method naming.
Method
Let
be a
linear operator
In mathematics, and more specifically in linear algebra, a linear map (also called a linear mapping, linear transformation, vector space homomorphism, or in some contexts linear function) is a mapping V \to W between two vector spaces that pr ...
on a
Hilbert space
In mathematics, a Hilbert space is a real number, real or complex number, complex inner product space that is also a complete metric space with respect to the metric induced by the inner product. It generalizes the notion of Euclidean space. The ...
, with
inner product
In mathematics, an inner product space (or, rarely, a Hausdorff pre-Hilbert space) is a real vector space or a complex vector space with an operation called an inner product. The inner product of two vectors in the space is a scalar, ofte ...
. Now consider a finite set of functions
. Depending on the application these functions may be:
* A subset of the
orthonormal basis
In mathematics, particularly linear algebra, an orthonormal basis for an inner product space V with finite Dimension (linear algebra), dimension is a Basis (linear algebra), basis for V whose vectors are orthonormal, that is, they are all unit vec ...
of the original operator;
* A space of
splines (as in the
Galerkin method);
* A set of functions which approximate the
eigenfunctions
In mathematics, an eigenfunction of a linear map, linear operator ''D'' defined on some function space is any non-zero function (mathematics), function f in that space that, when acted upon by ''D'', is only multiplied by some scaling factor calle ...
of the operator.
One could use the orthonormal basis generated from the eigenfunctions of the operator, which will produce
diagonal
In geometry, a diagonal is a line segment joining two vertices of a polygon or polyhedron, when those vertices are not on the same edge. Informally, any sloping line is called diagonal. The word ''diagonal'' derives from the ancient Greek � ...
approximating matrices, but in this case we would have already had to calculate the spectrum.
We now approximate
by
, which is defined as the matrix with entries
and solve the eigenvalue problem
. It can be shown that the matrix
is the
compression of
to
.
For
differential operators (such as
Sturm-Liouville operators), the inner product
can be replaced by the
weak formulation .
If a subset of the orthonormal basis was used to find the matrix, the eigenvectors of
will be
linear combinations of orthonormal basis functions, and as a result they will be approximations of the eigenvectors of
.
Properties
Spectral pollution
It is possible for the Rayleigh–Ritz method to produce values which do not converge to actual values in the spectrum of the operator as the truncation gets large. These values are known as spectral pollution.
In some cases (such as for the
Schrödinger equation
The Schrödinger equation is a partial differential equation that governs the wave function of a non-relativistic quantum-mechanical system. Its discovery was a significant landmark in the development of quantum mechanics. It is named after E ...
), there is no approximation which both includes all eigenvalues of the equation, and contains no pollution.
The spectrum of the compression (and thus pollution) is bounded by the
numerical range In the mathematics, mathematical field of linear algebra and convex analysis, the numerical range or field of values of a complex number, complex n \times n square matrix, matrix ''A'' is the set
:W(A)
= \left\
= \left\
where \mathbf^* denotes t ...
of the operator; in many cases it is bounded by a subset of the numerical range known as the
essential numerical range.
For matrix eigenvalue problems
In numerical linear algebra, the Rayleigh–Ritz method is commonly
applied to approximate an eigenvalue problem
for the matrix
of size
using a projected matrix of a smaller size
, generated from a given matrix
with
orthonormal columns. The matrix version of the algorithm is the most simple:
# Compute the
matrix
, where
denotes the complex-conjugate transpose of
# Solve the eigenvalue problem
# Compute the Ritz vectors
and the Ritz value
# Output approximations
, called the Ritz pairs, to
eigenvalues and eigenvectors
In linear algebra, an eigenvector ( ) or characteristic vector is a vector that has its direction unchanged (or reversed) by a given linear transformation. More precisely, an eigenvector \mathbf v of a linear transformation T is scaled by a ...
of the original matrix
.
If the subspace with the orthonormal basis given by the columns of the matrix
contains
vectors that are close to eigenvectors of the matrix
, the Rayleigh–Ritz method above finds
Ritz vectors that well approximate these eigenvectors. The easily computable quantity
determines the accuracy of such an approximation for every Ritz pair.
In the easiest case
, the
matrix
turns into a unit column-vector
, the
matrix
is a scalar that is equal to the
Rayleigh quotient , the only
solution to the eigenvalue problem is
and
, and the only one Ritz vector is
itself. Thus, the Rayleigh–Ritz method turns into computing of the
Rayleigh quotient if
.
Another useful connection to the
Rayleigh quotient is that
for every Ritz pair
, allowing to derive some properties of Ritz values
from the corresponding theory for the
Rayleigh quotient. For example, if
is a
Hermitian matrix
In mathematics, a Hermitian matrix (or self-adjoint matrix) is a complex square matrix that is equal to its own conjugate transpose—that is, the element in the -th row and -th column is equal to the complex conjugate of the element in the ...
, its
Rayleigh quotient (and thus its every Ritz value) is real and takes values within the closed interval of the smallest and largest eigenvalues of
.
Example
The matrix
has eigenvalues
and the corresponding eigenvectors
Let us take
then
with eigenvalues
and the corresponding eigenvectors
so that the Ritz values are
and the Ritz vectors are
We observe that each one of the Ritz vectors is exactly one of the eigenvectors of
for the given
as well as the Ritz values give exactly two of the three eigenvalues of
. A mathematical explanation for the exact approximation is based on the fact that the
column space of the matrix
happens to be exactly the same as the subspace spanned by the two eigenvectors
and
in this example.
For matrix singular value problems
Truncated singular value decomposition (SVD) in numerical linear algebra can also use the Rayleigh–Ritz method to find approximations to left and right singular vectors of the matrix
of size
in given subspaces by turning the singular value problem into an eigenvalue problem.
Using the normal matrix
The definition of the singular value
and the corresponding left and right singular vectors is
and
. Having found one set (left of right) of approximate singular vectors and singular values by applying naively the Rayleigh–Ritz method to the
Hermitian normal matrix
or
, whichever one is smaller size, one could determine the other set of left of right singular vectors simply by dividing by the singular values, i.e.,
and
. However, the division is unstable or fails for small or zero singular values.
An alternative approach, e.g., defining the normal matrix as
of size
, takes advantage of the fact that for a given
matrix
with
orthonormal columns the eigenvalue problem of the Rayleigh–Ritz method for the
matrix
can be interpreted as a singular value problem for the
matrix
. This interpretation allows simple simultaneous calculation of both left and right approximate singular vectors as follows.
# Compute the
matrix
.
# Compute the
thin, or economy-sized, SVD with
matrix
,
diagonal matrix
, and
matrix
.
# Compute the matrices of the Ritz left
and right
singular vectors.
# Output approximations
, called the Ritz singular triplets, to selected singular values and the corresponding left and right singular vectors of the original matrix
representing an approximate
Truncated singular value decomposition (SVD) with left singular vectors restricted to the column-space of the matrix
.
The algorithm can be used as a post-processing step where the matrix
is an output of an eigenvalue solver, e.g., such as
LOBPCG, approximating numerically selected eigenvectors of the normal matrix
.
Example
The matrix
has its normal matrix
singular values
and the corresponding
thin SVD
where the columns of the first multiplier from the complete set of the left singular vectors of the matrix
, the diagonal entries of the middle term are the singular values, and the columns of the last multiplier transposed (although the transposition does not change it)
are the corresponding right singular vectors.
Let us take
with the column-space that is spanned by the two exact right singular vectors
corresponding to the singular values 1 and 2.
Following the algorithm step 1, we compute
and on step 2 its
thin SVD with
Thus we already obtain the singular values 2 and 1 from
and from
the corresponding two left singular vectors
as
and
, which span the column-space of the matrix
, explaining why the approximations are exact for the given
.
Finally, step 3 computes the matrix
recovering from its rows the two right singular vectors
as
and
.
We validate the first vector:
and
Thus, for the given matrix
with its column-space that is spanned by two exact right singular vectors, we determine these right singular vectors, as well as the corresponding left singular vectors and the singular values, all exactly. For an arbitrary matrix
, we obtain approximate singular triplets which are optimal given
in the sense of optimality of the Rayleigh–Ritz method.
Applications and examples
In quantum physics
In quantum physics, where the spectrum of the
Hamiltonian
Hamiltonian may refer to:
* Hamiltonian mechanics, a function that represents the total energy of a system
* Hamiltonian (quantum mechanics), an operator corresponding to the total energy of that system
** Dyall Hamiltonian, a modified Hamiltonian ...
is the set of discrete energy levels allowed by a quantum mechanical system, the Rayleigh–Ritz method is used to approximate the energy states and wavefunctions of a complicated atomic or nuclear system.
In fact, for any system more complicated than a single hydrogen atom, there is no known exact solution for the spectrum of the Hamiltonian.
In this case, a
trial wave function,
, is tested on the system. This trial function is selected to meet boundary conditions (and any other physical constraints). The exact function is not known; the trial function contains one or more adjustable parameters, which are varied to find a lowest energy configuration.
It can be shown that the ground state energy,
, satisfies an inequality:
That is, the ground-state energy is less than this value.
The trial wave-function will always give an expectation value larger than or equal to the ground-energy.
If the trial wave function is known to be
orthogonal
In mathematics, orthogonality (mathematics), orthogonality is the generalization of the geometric notion of ''perpendicularity''. Although many authors use the two terms ''perpendicular'' and ''orthogonal'' interchangeably, the term ''perpendic ...
to the ground state, then it will provide a boundary for the energy of some excited state.
The Ritz ansatz function is a linear combination of ''N'' known basis functions
, parametrized by unknown coefficients:
With a known Hamiltonian, we can write its expected value as
The basis functions are usually not orthogonal, so that the
overlap matrix ''S'' has nonzero nondiagonal elements. Either
or
(the conjugation of the first) can be used to minimize the expectation value. For instance, by making the partial derivatives of
over
zero, the following equality is obtained for every ''k'' = 1, 2, ..., ''N'':
which leads to a set of ''N''
secular equations:
In the above equations, energy
and the coefficients
are unknown. With respect to ''c'', this is a homogeneous set of linear equations, which has a solution when the
determinant
In mathematics, the determinant is a Scalar (mathematics), scalar-valued function (mathematics), function of the entries of a square matrix. The determinant of a matrix is commonly denoted , , or . Its value characterizes some properties of the ...
of the coefficients to these unknowns is zero:
which in turn is true only for ''N'' values of
. Furthermore, since the Hamiltonian is a
hermitian operator, the ''H'' matrix is also
hermitian and the values of
will be real. The lowest value among
(i=1,2,..,N),
, will be the best approximation to the ground state for the basis functions used. The remaining ''N-1'' energies are estimates of excited state energies. An approximation for the wave function of state ''i'' can be obtained by finding the coefficients
from the corresponding secular equation.
In mechanical engineering
The Rayleigh–Ritz method is often used in
mechanical engineering
Mechanical engineering is the study of physical machines and mechanism (engineering), mechanisms that may involve force and movement. It is an engineering branch that combines engineering physics and engineering mathematics, mathematics principl ...
for finding the approximate real
resonant frequencies of multi
degree of freedom systems, such as
spring mass systems or
flywheels on a shaft with varying
cross section. It is an extension of Rayleigh's method. It can also be used for finding buckling loads and post-buckling behaviour for columns.
Consider the case whereby we want to find the resonant frequency of oscillation of a system. First, write the oscillation in the form,
with an unknown mode shape
. Next, find the total energy of the system, consisting of a kinetic energy term and a potential energy term. The kinetic energy term involves the square of the time derivative of
and thus gains a factor of
. Thus, we can calculate the total energy of the system and express it in the following form:
By conservation of energy, the average kinetic energy must be equal to the average potential energy. Thus,