In mathematics, an eigenvalue perturbation problem is that of finding the eigenvectors and eigenvalues of a system K\mathbf{x} = \lambda M\mathbf{x} that is perturbed from one with known eigenvectors and eigenvalues K_0\mathbf{x}_0 = \lambda_0 M_0\mathbf{x}_0. This is useful for studying how sensitive the original system's eigenvectors and eigenvalues are to changes in the system.
This type of analysis was popularized by Lord Rayleigh, in his investigation of harmonic vibrations of a string perturbed by small inhomogeneities.
The derivations in this article are essentially self-contained and can be found in many texts on numerical linear algebra or numerical functional analysis.
This article focuses on the case of the perturbation of a simple eigenvalue (see multiplicity of eigenvalues).
Why generalized eigenvalues?
In the entry applications of eigenvalues and eigenvectors we find numerous scientific fields in which eigenvalues are used to obtain solutions. Generalized eigenvalue problems are less widespread but are a key in the study of vibrations.
They are useful when we use the Galerkin method or Rayleigh–Ritz method to find approximate solutions of partial differential equations modeling vibrations of structures such as strings and plates; the paper of Courant (1943) is fundamental. The finite element method is a widespread particular case.
In classical mechanics, we may find generalized eigenvalues when we look for vibrations of multiple-degrees-of-freedom systems close to equilibrium; the kinetic energy provides the mass matrix M, and the potential strain energy provides the stiffness (rigidity) matrix K. For details, see for example the first section of the article of Weinstein (1941, in French).
With both methods, we obtain a system of differential equations (a matrix differential equation)
: M\ddot{\mathbf{x}} + B\dot{\mathbf{x}} + K\mathbf{x} = 0
with the mass matrix M, the damping matrix B and the stiffness matrix K. If we neglect the damping effect, we use B = 0; we can look for a solution of the following form:
: \mathbf{x} = e^{i\omega t}\,\mathbf{u};
we obtain that \mathbf{u} and \lambda = \omega^2 are solutions of the generalized eigenvalue problem
: K\mathbf{u} = \lambda M\mathbf{u}.
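As a concrete sketch of this last step, the generalized eigenvalue problem can be solved numerically with `scipy.linalg.eigh`, which handles a symmetric-definite pencil (K, M); the mass and stiffness matrices below are made up for the example, not taken from the text:

```python
import numpy as np
from scipy.linalg import eigh

# Illustrative 2-degree-of-freedom system (made-up SPD matrices)
M = np.diag([2.0, 1.0])          # mass matrix
K = np.array([[3.0, -1.0],
              [-1.0, 1.0]])      # stiffness matrix

# Solve K u = lambda M u for the pencil (K, M)
lam, U = eigh(K, M)

# lambda = omega^2 gives the natural angular frequencies
omega = np.sqrt(lam)
print("eigenvalues:", lam)
print("frequencies:", omega)
```

Here `eigh` exploits symmetry and returns the eigenvalues in ascending order, which is convenient when tracking the smallest vibration modes.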
Setting of perturbation for a generalized eigenvalue problem
Suppose we have solutions to the generalized eigenvalue problem,
: K_0 \mathbf{x}_{0i} = \lambda_{0i} M_0 \mathbf{x}_{0i}, \qquad i = 1, \dots, N, \qquad (0)
where K_0 and M_0 are matrices. That is, we know the eigenvalues \lambda_{0i} and eigenvectors \mathbf{x}_{0i} for i = 1, \dots, N. It is also required that ''the eigenvalues are distinct.''
Now suppose we want to change the matrices by a small amount. That is, we want to find the eigenvalues and eigenvectors of
: K \mathbf{x}_i = \lambda_i M \mathbf{x}_i \qquad (1)
where
: K = K_0 + \delta K, \qquad M = M_0 + \delta M
with the perturbations \delta K and \delta M much smaller than K_0 and M_0 respectively. Then we expect the new eigenvalues and eigenvectors to be similar to the original, plus small perturbations:
: \lambda_i = \lambda_{0i} + \delta\lambda_i, \qquad \mathbf{x}_i = \mathbf{x}_{0i} + \delta\mathbf{x}_i
Steps
We assume that the matrices are symmetric and positive definite, and assume we have scaled the eigenvectors such that
: \mathbf{x}_{0j}^\top M_0\, \mathbf{x}_{0i} = \delta_{ij}, \qquad (2)
where \delta_{ij} is the Kronecker delta (equal to 1 if i = j and to 0 otherwise).
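This is exactly the normalization that `scipy.linalg.eigh` returns for a symmetric-definite pencil; a quick check on made-up matrices:

```python
import numpy as np
from scipy.linalg import eigh

# Made-up symmetric positive definite K0 and M0
K0 = np.array([[6.0, 2.0, 0.0],
               [2.0, 5.0, 1.0],
               [0.0, 1.0, 4.0]])
M0 = np.diag([1.0, 2.0, 3.0])

lam0, X0 = eigh(K0, M0)   # column X0[:, i] is the eigenvector x_0i

# Check the scaling x_0j^T M0 x_0i = delta_ij
G = X0.T @ M0 @ X0
print(np.allclose(G, np.eye(3)))
```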
Now we want to solve the equation
: K\mathbf{x}_i = \lambda_i M \mathbf{x}_i.
In this article we restrict the study to first order perturbation.
First order expansion of the equation
Substituting in (1), we get
: (K_0 + \delta K)(\mathbf{x}_{0i} + \delta\mathbf{x}_i) = (\lambda_{0i} + \delta\lambda_i)(M_0 + \delta M)(\mathbf{x}_{0i} + \delta\mathbf{x}_i),
which expands to
: K_0\mathbf{x}_{0i} + \delta K\,\mathbf{x}_{0i} + K_0\,\delta\mathbf{x}_i + \delta K\,\delta\mathbf{x}_i = \lambda_{0i}M_0\mathbf{x}_{0i} + \lambda_{0i}M_0\,\delta\mathbf{x}_i + \lambda_{0i}\,\delta M\,\mathbf{x}_{0i} + \delta\lambda_i\,M_0\mathbf{x}_{0i} + \text{higher-order terms}.
Canceling from (0) (K_0\mathbf{x}_{0i} = \lambda_{0i}M_0\mathbf{x}_{0i}) leaves
: \delta K\,\mathbf{x}_{0i} + K_0\,\delta\mathbf{x}_i + \delta K\,\delta\mathbf{x}_i = \lambda_{0i}M_0\,\delta\mathbf{x}_i + \lambda_{0i}\,\delta M\,\mathbf{x}_{0i} + \delta\lambda_i\,M_0\mathbf{x}_{0i} + \text{higher-order terms}.
Removing the higher-order terms, this simplifies to
: K_0\,\delta\mathbf{x}_i + \delta K\,\mathbf{x}_{0i} = \lambda_{0i}M_0\,\delta\mathbf{x}_i + \lambda_{0i}\,\delta M\,\mathbf{x}_{0i} + \delta\lambda_i\,M_0\,\mathbf{x}_{0i}. \qquad (3)
:In other words, \delta\lambda_i no longer denotes the exact variation of the eigenvalue but its first-order approximation.
As the matrix M_0 is symmetric, the unperturbed eigenvectors are M_0-orthogonal and so we use them as a basis for the perturbed eigenvectors. That is, we want to construct
: \delta\mathbf{x}_i = \sum_{j=1}^N \varepsilon_{ij}\,\mathbf{x}_{0j} \qquad (4)
with \varepsilon_{ij} \ll 1, where the \varepsilon_{ij} are small constants that are to be determined. In the same way, substituting in (2), and removing higher-order terms, we get
: \delta\mathbf{x}_j^\top M_0\,\mathbf{x}_{0i} + \mathbf{x}_{0j}^\top M_0\,\delta\mathbf{x}_i + \mathbf{x}_{0j}^\top\,\delta M\,\mathbf{x}_{0i} = 0. \qquad (5)
The derivation can then proceed along two forks.
First fork: get first eigenvalue perturbation
Eigenvalue perturbation
:We start with (3); we left-multiply with \mathbf{x}_{0i}^\top and use (2) as well as its first-order variation (5); we get
: \mathbf{x}_{0i}^\top K_0\,\delta\mathbf{x}_i + \mathbf{x}_{0i}^\top\,\delta K\,\mathbf{x}_{0i} = \lambda_{0i}\,\mathbf{x}_{0i}^\top M_0\,\delta\mathbf{x}_i + \lambda_{0i}\,\mathbf{x}_{0i}^\top\,\delta M\,\mathbf{x}_{0i} + \delta\lambda_i
or, since \mathbf{x}_{0i}^\top K_0 = \lambda_{0i}\,\mathbf{x}_{0i}^\top M_0 by symmetry,
: \delta\lambda_i = \mathbf{x}_{0i}^\top\left(\delta K - \lambda_{0i}\,\delta M\right)\mathbf{x}_{0i}.
We notice that it is the first-order perturbation of the generalized Rayleigh quotient with fixed \mathbf{x}_{0i}:
: R(K, M; \mathbf{x}_{0i}) = \frac{\mathbf{x}_{0i}^\top K\,\mathbf{x}_{0i}}{\mathbf{x}_{0i}^\top M\,\mathbf{x}_{0i}}.
Moreover, for M = I, the formula \delta\lambda_i = \mathbf{x}_{0i}^\top\,\delta K\,\mathbf{x}_{0i} should be compared with the Bauer–Fike theorem, which provides a bound for eigenvalue perturbation.
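The first-order formula is easy to check numerically. The sketch below uses a made-up SPD pencil and tiny symmetric perturbations, and compares the exact eigenvalue shifts with the prediction \delta\lambda_i = \mathbf{x}_{0i}^\top(\delta K - \lambda_{0i}\,\delta M)\mathbf{x}_{0i}:

```python
import numpy as np
from scipy.linalg import eigh

rng = np.random.default_rng(0)
sym = lambda A: (A + A.T) / 2   # symmetrize a random matrix

# Made-up SPD pencil with distinct eigenvalues
K0 = np.array([[4.0, 1.0], [1.0, 3.0]])
M0 = np.array([[2.0, 0.5], [0.5, 1.0]])
lam0, X0 = eigh(K0, M0)          # x_0i^T M0 x_0j = delta_ij

# Tiny symmetric perturbations
eps = 1e-6
dK = eps * sym(rng.standard_normal((2, 2)))
dM = eps * sym(rng.standard_normal((2, 2)))

lam, _ = eigh(K0 + dK, M0 + dM)  # exact perturbed eigenvalues

# First-order prediction for each eigenvalue shift
pred = np.array([X0[:, i] @ (dK - lam0[i] * dM) @ X0[:, i] for i in range(2)])
print(lam - lam0)  # exact shifts
print(pred)        # first-order predictions, agreeing to O(eps^2)
```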
Eigenvector perturbation
We left-multiply (3) with \mathbf{x}_{0j}^\top for j \neq i and get
: \mathbf{x}_{0j}^\top K_0\,\delta\mathbf{x}_i + \mathbf{x}_{0j}^\top\,\delta K\,\mathbf{x}_{0i} = \lambda_{0i}\,\mathbf{x}_{0j}^\top M_0\,\delta\mathbf{x}_i + \lambda_{0i}\,\mathbf{x}_{0j}^\top\,\delta M\,\mathbf{x}_{0i} + \delta\lambda_i\,\mathbf{x}_{0j}^\top M_0\,\mathbf{x}_{0i}.
We use \mathbf{x}_{0j}^\top K_0 = \lambda_{0j}\,\mathbf{x}_{0j}^\top M_0 and \mathbf{x}_{0j}^\top M_0\,\mathbf{x}_{0i} = 0 for j \neq i:
: \lambda_{0j}\,\mathbf{x}_{0j}^\top M_0\,\delta\mathbf{x}_i + \mathbf{x}_{0j}^\top\,\delta K\,\mathbf{x}_{0i} = \lambda_{0i}\,\mathbf{x}_{0j}^\top M_0\,\delta\mathbf{x}_i + \lambda_{0i}\,\mathbf{x}_{0j}^\top\,\delta M\,\mathbf{x}_{0i}
or
: (\lambda_{0i} - \lambda_{0j})\,\mathbf{x}_{0j}^\top M_0\,\delta\mathbf{x}_i = \mathbf{x}_{0j}^\top(\delta K - \lambda_{0i}\,\delta M)\mathbf{x}_{0i}.
As the eigenvalues are assumed to be simple, \mathbf{x}_{0j}^\top M_0\,\delta\mathbf{x}_i = \varepsilon_{ij} gives, for j \neq i,
: \varepsilon_{ij} = \frac{\mathbf{x}_{0j}^\top(\delta K - \lambda_{0i}\,\delta M)\mathbf{x}_{0i}}{\lambda_{0i} - \lambda_{0j}}.
Moreover, (5) (the first-order variation of (2)) with j = i yields
: \varepsilon_{ii} = -\tfrac{1}{2}\,\mathbf{x}_{0i}^\top\,\delta M\,\mathbf{x}_{0i}.
We have obtained all the components of \delta\mathbf{x}_i.
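The coefficients \varepsilon_{ij} can likewise be checked numerically. The sketch below (made-up symmetric matrices, with M_0 = I for simplicity) rebuilds \delta\mathbf{x}_i from the formulas and compares it with the exactly computed perturbed eigenvector:

```python
import numpy as np
from scipy.linalg import eigh

# Made-up symmetric pencil with distinct eigenvalues (M0 = I)
K0 = np.array([[5.0, 1.0], [1.0, 2.0]])
M0 = np.eye(2)
lam0, X0 = eigh(K0, M0)

eps = 1e-6
dK = eps * np.array([[1.0, 0.3], [0.3, -0.5]])
dM = eps * np.array([[0.2, 0.1], [0.1, 0.4]])

def dx(i):
    """First-order eigenvector correction dx_i = sum_j eps_ij x_0j."""
    d = -0.5 * (X0[:, i] @ dM @ X0[:, i]) * X0[:, i]   # eps_ii term
    for j in range(2):
        if j != i:
            e_ij = (X0[:, j] @ (dK - lam0[i] * dM) @ X0[:, i]) / (lam0[i] - lam0[j])
            d = d + e_ij * X0[:, j]
    return d

lam, X = eigh(K0 + dK, M0 + dM)
for i in range(2):
    v = X[:, i] if X[:, i] @ X0[:, i] > 0 else -X[:, i]  # fix the sign freedom
    print(np.linalg.norm(v - (X0[:, i] + dx(i))))        # residual is O(eps^2)
```

The sign fix is needed because an eigenvector returned by a solver is only determined up to a factor of -1.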
Second fork: Straightforward manipulations
Substituting (4) into (3) and rearranging gives
: K_0\sum_{j=1}^N \varepsilon_{ij}\,\mathbf{x}_{0j} + \delta K\,\mathbf{x}_{0i} = \lambda_{0i} M_0\sum_{j=1}^N \varepsilon_{ij}\,\mathbf{x}_{0j} + \lambda_{0i}\,\delta M\,\mathbf{x}_{0i} + \delta\lambda_i\,M_0\,\mathbf{x}_{0i}. \qquad (6)
Because the eigenvectors are M_0-orthogonal when M_0 is positive definite, we can remove the summations by left-multiplying by \mathbf{x}_{0i}^\top:
: \varepsilon_{ii}\,\mathbf{x}_{0i}^\top K_0\,\mathbf{x}_{0i} + \mathbf{x}_{0i}^\top\,\delta K\,\mathbf{x}_{0i} = \lambda_{0i}\,\varepsilon_{ii}\,\mathbf{x}_{0i}^\top M_0\,\mathbf{x}_{0i} + \lambda_{0i}\,\mathbf{x}_{0i}^\top\,\delta M\,\mathbf{x}_{0i} + \delta\lambda_i\,\mathbf{x}_{0i}^\top M_0\,\mathbf{x}_{0i}.
The two terms containing \varepsilon_{ii} are equal because left-multiplying (0) by \mathbf{x}_{0i}^\top gives
: \mathbf{x}_{0i}^\top K_0\,\mathbf{x}_{0i} = \lambda_{0i}\,\mathbf{x}_{0i}^\top M_0\,\mathbf{x}_{0i}.
Canceling those terms in (6) leaves
: \mathbf{x}_{0i}^\top\,\delta K\,\mathbf{x}_{0i} = \lambda_{0i}\,\mathbf{x}_{0i}^\top\,\delta M\,\mathbf{x}_{0i} + \delta\lambda_i\,\mathbf{x}_{0i}^\top M_0\,\mathbf{x}_{0i}.
Rearranging gives
: \delta\lambda_i = \frac{\mathbf{x}_{0i}^\top(\delta K - \lambda_{0i}\,\delta M)\mathbf{x}_{0i}}{\mathbf{x}_{0i}^\top M_0\,\mathbf{x}_{0i}}.
But by (2), this denominator is equal to 1. Thus
: \delta\lambda_i = \mathbf{x}_{0i}^\top(\delta K - \lambda_{0i}\,\delta M)\mathbf{x}_{0i}.
Then, as \lambda_{0i} \neq \lambda_{0j} for i \neq j (assumption: simple eigenvalues), by left-multiplying equation (6) by \mathbf{x}_{0j}^\top for j \neq i:
: \varepsilon_{ij} = \frac{\mathbf{x}_{0j}^\top(\delta K - \lambda_{0i}\,\delta M)\mathbf{x}_{0i}}{\lambda_{0i} - \lambda_{0j}}, \qquad j \neq i.
Or by changing the name of the indices:
: \varepsilon_{ji} = \frac{\mathbf{x}_{0i}^\top(\delta K - \lambda_{0j}\,\delta M)\mathbf{x}_{0j}}{\lambda_{0j} - \lambda_{0i}}, \qquad j \neq i.
To find \varepsilon_{ii}, use the fact that
: \mathbf{x}_i^\top M\,\mathbf{x}_i = 1
implies
: \varepsilon_{ii} = -\tfrac{1}{2}\,\mathbf{x}_{0i}^\top\,\delta M\,\mathbf{x}_{0i}.
Summary of the first order perturbation result
In the case where ''all the matrices are Hermitian positive definite and all the eigenvalues are distinct'',
: \lambda_i = \lambda_{0i} + \mathbf{x}_{0i}^\top(\delta K - \lambda_{0i}\,\delta M)\mathbf{x}_{0i}
: \mathbf{x}_i = \mathbf{x}_{0i}\left(1 - \tfrac{1}{2}\,\mathbf{x}_{0i}^\top\,\delta M\,\mathbf{x}_{0i}\right) + \sum_{j=1,\, j\neq i}^{N} \frac{\mathbf{x}_{0j}^\top(\delta K - \lambda_{0i}\,\delta M)\mathbf{x}_{0i}}{\lambda_{0i} - \lambda_{0j}}\,\mathbf{x}_{0j}
for infinitesimal \delta K and \delta M (the higher-order terms in (3) being neglected).
So far, we have not proved that these higher-order terms may be neglected. This point may be derived using the implicit function theorem; in the next section, we summarize the use of this theorem to obtain a first-order expansion.
Theoretical derivation
Perturbation of an implicit function.
In the next paragraph, we shall use the implicit function theorem; we notice that for a continuously differentiable function f : \mathbb{R}^n \times \mathbb{R}^m \to \mathbb{R}^n, with an invertible Jacobian matrix J_x f(x_0, y_0), from a point (x_0, y_0), solution of f(x_0, y_0) = 0, we get solutions of f(x, y) = 0 with y close to y_0 in the form x = g(y), where g is a continuously differentiable function; moreover the Jacobian matrix of g is provided by the linear system
: J_x f\big(g(y), y\big)\, J_y g(y) + J_y f\big(g(y), y\big) = 0.
As soon as the hypothesis of the theorem is satisfied, the Jacobian matrix of g may be computed with a first-order expansion of f(x_0 + \delta x,\, y_0 + \delta y); we get
: J_x f(x_0, y_0)\,\delta x + J_y f(x_0, y_0)\,\delta y = o(\|\delta x\| + \|\delta y\|);
as \delta x = J_y g(y_0)\,\delta y + o(\|\delta y\|), it is equivalent to the linear system above.
Eigenvalue perturbation: a theoretical basis.
We use the previous paragraph (Perturbation of an implicit function) with somewhat different notations suited to eigenvalue perturbation; we introduce
* \tilde f\big((K, M), (\lambda, \mathbf{x})\big) = \begin{pmatrix} K\mathbf{x} - \lambda M\mathbf{x} \\ \mathbf{x}^\top M \mathbf{x} - 1 \end{pmatrix},
with \tilde f\big((K_0, M_0), (\lambda_{0i}, \mathbf{x}_{0i})\big) = 0. In order to use the implicit function theorem, we study the invertibility of the Jacobian of \tilde f with respect to (\lambda, \mathbf{x}),
: J_{\lambda, \mathbf{x}}\tilde f\,(\delta\lambda, \delta\mathbf{x}) = \begin{pmatrix} (K_0 - \lambda_{0i} M_0)\,\delta\mathbf{x} - \delta\lambda\, M_0\,\mathbf{x}_{0i} \\ 2\,\mathbf{x}_{0i}^\top M_0\,\delta\mathbf{x} \end{pmatrix}.
Indeed, the solution of J_{\lambda, \mathbf{x}}\tilde f\,(\delta\lambda, \delta\mathbf{x}) = \mathbf{y} may be derived with computations similar to the derivation of the expansion. When \lambda_{0i} is a simple eigenvalue, as the eigenvectors \mathbf{x}_{0j}, j = 1, \dots, N form an M_0-orthonormal basis, for any right-hand side we have obtained one solution; therefore, the Jacobian is invertible.
The implicit function theorem provides a continuously differentiable function (K, M) \mapsto \big(\lambda_i(K, M), \mathbf{x}_i(K, M)\big), hence the expansion with little-o notation:
: \lambda_i = \lambda_{0i} + \delta\lambda_i + o(\|\delta K\| + \|\delta M\|), \qquad \mathbf{x}_i = \mathbf{x}_{0i} + \delta\mathbf{x}_i + o(\|\delta K\| + \|\delta M\|),
with \delta\lambda_i = \mathbf{x}_{0i}^\top(\delta K - \lambda_{0i}\,\delta M)\mathbf{x}_{0i} and \delta\mathbf{x}_i = \sum_{j=1}^N \varepsilon_{ij}\,\mathbf{x}_{0j} as above. This is the first-order expansion of the perturbed eigenvalues and eigenvectors, which is thus proved.
Results of sensitivity analysis with respect to the entries of the matrices
The results
This means it is possible to efficiently do a sensitivity analysis on \lambda_i as a function of changes in the entries of the matrices. (Recall that the matrices are symmetric and so changing K_{(k\ell)} will also change K_{(\ell k)}, hence the (2 - \delta_{k\ell}) term.)
: \frac{\partial \lambda_i}{\partial K_{(k\ell)}} = x_{0i(k)}\, x_{0i(\ell)}\left(2 - \delta_{k\ell}\right)
Similarly
: \frac{\partial \lambda_i}{\partial M_{(k\ell)}} = -\lambda_i\, x_{0i(k)}\, x_{0i(\ell)}\left(2 - \delta_{k\ell}\right).
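A finite-difference sketch of the first sensitivity (matrices made up for the example; note that the symmetric partner entry K_{(\ell k)} is perturbed together with K_{(k\ell)}):

```python
import numpy as np
from scipy.linalg import eigh

# Made-up symmetric pencil
K = np.array([[4.0, 1.0], [1.0, 3.0]])
M = np.array([[2.0, 0.5], [0.5, 1.0]])
lam, X = eigh(K, M)

i, k, l = 0, 0, 1   # sensitivity of lambda_0 w.r.t. the entry K[0, 1] (= K[1, 0])

# Formula: d lambda_i / d K_(k,l) = x_0i(k) * x_0i(l) * (2 - delta_kl)
formula = X[k, i] * X[l, i] * (2 - (k == l))

# Finite difference, perturbing both symmetric entries together
h = 1e-7
Kp = K.copy()
Kp[k, l] += h
if k != l:
    Kp[l, k] += h
lam_p, _ = eigh(Kp, M)
fd = (lam_p[i] - lam[i]) / h
print(formula, fd)   # the two values agree to roughly O(h)
```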
Eigenvalue sensitivity, a small example
A simple case is a symmetric matrix depending on a single parameter. Its eigenvalues and eigenvectors can be computed by hand or with the help of online tools such as WIMS (WWW Interactive Multipurpose Server) or using SageMath; the sensitivity of the smallest eigenvalue with respect to the parameter then follows from the formulas above.
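As an illustrative instance (this particular matrix is hypothetical, chosen only so the answer is available in closed form), take M = I and K(b) = \begin{pmatrix} 2 & b \\ b & 0 \end{pmatrix}, whose smallest eigenvalue is 1 - \sqrt{1 + b^2}:

```python
import numpy as np

b = 0.5
# Hypothetical parameter-dependent symmetric matrix (M = identity)
K = np.array([[2.0, b], [b, 0.0]])

# Smallest eigenvalue, numerically and in closed form
w, V = np.linalg.eigh(K)                # eigenvalues in ascending order
lam_min = w[0]
print(lam_min, 1 - np.sqrt(1 + b**2))   # equal up to rounding

# Sensitivity via the perturbation formula: dK/db = [[0, 1], [1, 0]],
# so d(lam_min)/db = x^T (dK/db) x = 2 * x[0] * x[1]
x = V[:, 0]
sens = 2 * x[0] * x[1]
print(sens, -b / np.sqrt(1 + b**2))     # matches the closed-form derivative
```

Note that the sign ambiguity of the eigenvector does not matter here, since the sensitivity involves the product of its two components.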