Sylvester equation

In mathematics, in the field of control theory, a Sylvester equation is a matrix equation of the form

:A X + X B = C.

It is named after the English mathematician James Joseph Sylvester. Given matrices ''A'', ''B'', and ''C'', the problem is to find the possible matrices ''X'' that obey this equation. All matrices are assumed to have coefficients in the complex numbers. For the equation to make sense, the matrices must have appropriate sizes; for example, they could all be square matrices of the same size. More generally, ''A'' and ''B'' must be square matrices of sizes ''n'' and ''m'' respectively, and then ''X'' and ''C'' both have ''n'' rows and ''m'' columns.

A Sylvester equation has a unique solution for ''X'' exactly when ''A'' and −''B'' have no common eigenvalues. More generally, the equation ''AX'' + ''XB'' = ''C'' has been considered as an equation of bounded operators on a (possibly infinite-dimensional) Banach space. In this case, the condition for the uniqueness of a solution ''X'' is almost the same: there exists a unique solution ''X'' exactly when the spectra of ''A'' and −''B'' are disjoint.
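As a quick concrete illustration of this uniqueness criterion, the following Python sketch (using NumPy; neither the library choice nor the example matrices come from the article) checks whether a given pair ''A'', ''B'' yields a uniquely solvable Sylvester equation by comparing the spectra of ''A'' and −''B''.

```python
import numpy as np

# Small test matrices (chosen arbitrarily for illustration).
A = np.array([[1.0, 2.0],
              [0.0, 3.0]])
B = np.array([[4.0, 0.0],
              [1.0, 5.0]])

# AX + XB = C has a unique solution for every C exactly when
# A and -B share no eigenvalue.
eig_A = np.linalg.eigvals(A)
eig_minus_B = np.linalg.eigvals(-B)

# Compare the two spectra up to a small numerical tolerance.
shared = [a for a in eig_A if np.any(np.isclose(a, eig_minus_B))]
print("unique solution for every C:", len(shared) == 0)
```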


Existence and uniqueness of the solutions

Using the Kronecker product notation and the vectorization operator \operatorname{vec}, we can rewrite Sylvester's equation in the form

:(I_m \otimes A + B^{\mathsf T} \otimes I_n) \operatorname{vec}X = \operatorname{vec}C,

where A is of dimension n \times n, B is of dimension m \times m, X is of dimension n \times m and I_k is the k \times k identity matrix. In this form, the equation can be seen as a linear system of dimension mn \times mn.

Theorem. Given matrices A\in \mathbb{C}^{n\times n} and B\in \mathbb{C}^{m\times m}, the Sylvester equation AX+XB=C has a unique solution X\in \mathbb{C}^{n\times m} for any C\in\mathbb{C}^{n\times m} if and only if A and -B do not share any eigenvalue.

Proof. The equation AX+XB=C is a linear system with mn unknowns and the same number of equations. Hence it is uniquely solvable for any given C if and only if the homogeneous equation AX+XB=0 admits only the trivial solution 0.

(i) Assume that A and -B do not share any eigenvalue. Let X be a solution to the homogeneous equation above. Then AX=X(-B), which can be lifted to A^kX = X(-B)^k for each k \ge 0 by mathematical induction. Consequently, p(A) X = X p(-B) for any polynomial p. In particular, let p be the characteristic polynomial of A. Then p(A)=0 by the Cayley–Hamilton theorem; meanwhile, the spectral mapping theorem tells us \sigma(p(-B)) = p(\sigma(-B)), where \sigma(\cdot) denotes the spectrum of a matrix. Since A and -B do not share any eigenvalue, p(\sigma(-B)) does not contain zero, and hence p(-B) is nonsingular. Thus X= 0 as desired. This proves the "if" part of the theorem.

(ii) Now assume that A and -B share an eigenvalue \lambda. Let u be a corresponding right eigenvector for A, let v be a corresponding left eigenvector for -B, and let X=uv^*. Then X\neq 0, and AX+XB = A(uv^*)-(uv^*)(-B) = \lambda uv^*-\lambda uv^* = 0. Hence X is a nontrivial solution to the homogeneous equation, justifying the "only if" part of the theorem. Q.E.D.

As an alternative to the spectral mapping theorem, the nonsingularity of p(-B) in part (i) of the proof can also be demonstrated by Bézout's identity for coprime polynomials. Let q be the characteristic polynomial of -B. Since A and -B do not share any eigenvalue, p and q are coprime. Hence there exist polynomials f and g such that p(z)f(z)+q(z)g(z)\equiv 1. By the Cayley–Hamilton theorem, q(-B)=0. Thus p(-B)f(-B)=I, implying that p(-B) is nonsingular.

The theorem remains true for real matrices with the caveat that one considers their complex eigenvalues. The proof of the "if" part is still applicable; for the "only if" part, note that both \mathrm{Re}(uv^*) and \mathrm{Im}(uv^*) satisfy the homogeneous equation AX+XB=0, and they cannot both be zero simultaneously.
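The vectorized form above can be turned directly into a (dense, and therefore only small-scale) solver. The following Python sketch, using NumPy, illustrates that reformulation; the function name solve_sylvester_vec is made up for this example and is not a standard library routine.

```python
import numpy as np

def solve_sylvester_vec(A, B, C):
    """Solve AX + XB = C via the Kronecker/vectorization form.

    This assembles the mn-by-mn dense linear system from the article,
    so it is meant as an illustration rather than a practical method
    for large matrices.
    """
    n, m = A.shape[0], B.shape[0]
    # (I_m kron A + B^T kron I_n) vec(X) = vec(C), with vec stacking columns.
    K = np.kron(np.eye(m), A) + np.kron(B.T, np.eye(n))
    vec_X = np.linalg.solve(K, C.reshape(-1, order="F"))
    return vec_X.reshape((n, m), order="F")

# Quick check on random matrices, whose spectra are very unlikely to collide.
rng = np.random.default_rng(0)
A = rng.standard_normal((3, 3))
B = rng.standard_normal((2, 2))
C = rng.standard_normal((3, 2))
X = solve_sylvester_vec(A, B, C)
print(np.allclose(A @ X + X @ B, C))  # expected: True
```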


Roth's removal rule

Given two square complex matrices ''A'' and ''B'', of size ''n'' and ''m'', and a matrix ''C'' of size ''n'' by ''m'', one can ask when the following two square matrices of size ''n'' + ''m'' are similar to each other:

:\begin{bmatrix} A & C \\ 0 & B \end{bmatrix} and \begin{bmatrix} A & 0 \\ 0 & B \end{bmatrix}.

The answer is that these two matrices are similar exactly when there exists a matrix ''X'' such that ''AX'' − ''XB'' = ''C''. In other words, ''X'' is a solution to a Sylvester equation. This is known as Roth's removal rule. One easily checks one direction: if ''AX'' − ''XB'' = ''C'', then

:\begin{bmatrix} I_n & X \\ 0 & I_m \end{bmatrix} \begin{bmatrix} A & C \\ 0 & B \end{bmatrix} \begin{bmatrix} I_n & -X \\ 0 & I_m \end{bmatrix} = \begin{bmatrix} A & 0 \\ 0 & B \end{bmatrix}.

Roth's removal rule does not generalize to infinite-dimensional bounded operators on a Banach space. Nevertheless, Roth's removal rule does generalize to systems of Sylvester equations.
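The similarity asserted by Roth's removal rule can be checked numerically. The sketch below (Python with NumPy and SciPy; the random matrices are purely illustrative) solves ''AX'' − ''XB'' = ''C'' and verifies that conjugating the block upper-triangular matrix by the unit upper-triangular block matrix built from ''X'' yields the block-diagonal matrix, exactly as in the identity above.

```python
import numpy as np
from scipy.linalg import solve_sylvester

rng = np.random.default_rng(1)
n, m = 3, 2
A = rng.standard_normal((n, n))
B = rng.standard_normal((m, m))
C = rng.standard_normal((n, m))

# Solve AX - XB = C (solve_sylvester handles AX + XB = C, so pass -B).
X = solve_sylvester(A, -B, C)

# Build the block matrices appearing in Roth's removal rule.
T  = np.block([[A, C], [np.zeros((m, n)), B]])                  # [[A, C], [0, B]]
D  = np.block([[A, np.zeros((n, m))], [np.zeros((m, n)), B]])   # [[A, 0], [0, B]]
S  = np.block([[np.eye(n), X], [np.zeros((m, n)), np.eye(m)]])
Si = np.block([[np.eye(n), -X], [np.zeros((m, n)), np.eye(m)]])

# S @ T @ S^{-1} should equal the block-diagonal matrix, confirming similarity.
print(np.allclose(S @ T @ Si, D))  # expected: True
```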


Numerical solutions

A classical algorithm for the numerical solution of the Sylvester equation is the Bartels–Stewart algorithm, which consists of transforming A and B into Schur form by a QR algorithm, and then solving the resulting triangular system via back-substitution. This algorithm, whose computational cost is \mathcal{O}(n^3) arithmetical operations, is used, among others, by LAPACK and the lyap function in GNU Octave; see also the sylvester function in that language. In some specific image processing applications, the derived Sylvester equation has a closed-form solution.
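For instance, SciPy's scipy.linalg.solve_sylvester provides a Bartels–Stewart-style solver analogous to the LAPACK and GNU Octave functionality mentioned above; the snippet below is a minimal usage sketch with arbitrary example matrices.

```python
import numpy as np
from scipy.linalg import solve_sylvester

A = np.array([[2.0, 0.0], [0.0, 3.0]])
B = np.array([[1.0]])
C = np.array([[4.0], [6.0]])

# solve_sylvester solves AX + XB = C via a Schur-based (Bartels-Stewart) approach.
X = solve_sylvester(A, B, C)
print(X)                               # expected: [[4/3], [3/2]]
print(np.allclose(A @ X + X @ B, C))   # expected: True
```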


See also

* Lyapunov equation, a special case of the Sylvester equation
* Algebraic Riccati equation




External links


Online solver for arbitrary-sized matrices.



