linear algebra
Linear algebra is the branch of mathematics concerning linear equations such as
:a_1x_1+\cdots +a_nx_n=b,
linear maps such as
:(x_1, \ldots, x_n) \mapsto a_1x_1+\cdots +a_nx_n,
and their representations in vector spaces and through matrix (mathemat ...
, the Cayley–Hamilton theorem (named after the mathematicians Arthur Cayley and
William Rowan Hamilton
Sir William Rowan Hamilton (4 August 1805 – 2 September 1865) was an Irish astronomer, mathematician, and physicist who made numerous major contributions to abstract algebra, classical mechanics, and optics. His theoretical works and mathema ...
commutative ring
In mathematics, a commutative ring is a Ring (mathematics), ring in which the multiplication operation is commutative. The study of commutative rings is called commutative algebra. Complementarily, noncommutative algebra is the study of ring prope ...
complex number
In mathematics, a complex number is an element of a number system that extends the real numbers with a specific element denoted , called the imaginary unit and satisfying the equation i^= -1; every complex number can be expressed in the for ...
s or the
integers
An integer is the number zero (0), a positive natural number (1, 2, 3, ...), or the negation of a positive natural number (−1, −2, −3, ...). The negations or additive inverses of the positive natural numbers are referred to as negative in ...
characteristic polynomial
In linear algebra, the characteristic polynomial of a square matrix is a polynomial which is invariant under matrix similarity and has the eigenvalues as roots. It has the determinant and the trace of the matrix among its coefficients. The ...
of an matrix is defined as , where is the
determinant
In mathematics, the determinant is a Scalar (mathematics), scalar-valued function (mathematics), function of the entries of a square matrix. The determinant of a matrix is commonly denoted , , or . Its value characterizes some properties of the ...
identity matrix
In linear algebra, the identity matrix of size n is the n\times n square matrix with ones on the main diagonal and zeros elsewhere. It has unique properties, for example when the identity matrix represents a geometric transformation, the obje ...
. Since each entry of the matrix is either constant or linear in , the determinant of is a degree- monic
polynomial
In mathematics, a polynomial is a Expression (mathematics), mathematical expression consisting of indeterminate (variable), indeterminates (also called variable (mathematics), variables) and coefficients, that involves only the operations of addit ...
in , so it can be written as
By replacing the scalar variable with the matrix , one can define an analogous matrix polynomial expression,
(Here, is the given matrix—not a variable, unlike —so is a constant rather than a function.)
The Cayley–Hamilton theorem states that this polynomial expression is equal to the
zero matrix
In mathematics, particularly linear algebra, a zero matrix or null matrix is a matrix all of whose entries are zero. It also serves as the additive identity of the additive group of m \times n matrices, and is denoted by the symbol O or 0 followe ...
, which is to say that that is, the characteristic polynomial is an annihilating polynomial for
One use for the Cayley–Hamilton
theorem
In mathematics and formal logic, a theorem is a statement (logic), statement that has been Mathematical proof, proven, or can be proven. The ''proof'' of a theorem is a logical argument that uses the inference rules of a deductive system to esta ...
is that it allows to be expressed as a
linear combination
In mathematics, a linear combination or superposition is an Expression (mathematics), expression constructed from a Set (mathematics), set of terms by multiplying each term by a constant and adding the results (e.g. a linear combination of ''x'' a ...
of the lower matrix powers of :
When the ring is a field, the Cayley–Hamilton theorem is equivalent to the statement that the minimal polynomial of a square matrix
divides
In mathematics, a divisor of an integer n, also called a factor of n, is an integer m that may be multiplied by some integer to produce n. In this case, one also says that n is a '' multiple'' of m. An integer n is divisible or evenly divisibl ...
its characteristic polynomial.
A special case of the theorem was first proved by Hamilton in 1853 in terms of inverses of linear functions of
quaternion
In mathematics, the quaternion number system extends the complex numbers. Quaternions were first described by the Irish mathematician William Rowan Hamilton in 1843 and applied to mechanics in three-dimensional space. The algebra of quater ...
s. This corresponds to the special case of certain real or complex matrices. Cayley in 1858 stated the result for and smaller matrices, but only published a proof for the case. As for matrices, Cayley stated “..., I have not thought it necessary to undertake the labor of a formal proof of the theorem in the general case of a matrix of any degree”. The general case was first proved by Ferdinand Frobenius in 1878.
Examples
matrices
For a matrix , the characteristic polynomial is given by , and so is trivial.
matrices
As a concrete example, let
Its characteristic polynomial is given by
The Cayley–Hamilton theorem claims that, if we ''define''
then
We can verify by computation that indeed,
For a generic matrix,
the characteristic polynomial is given by , so the Cayley–Hamilton theorem states that
which is indeed always the case, evident by working out the entries of .
Applications
Determinant and inverse matrix
For a general
invertible matrix
In linear algebra, an invertible matrix (''non-singular'', ''non-degenarate'' or ''regular'') is a square matrix that has an inverse. In other words, if some other matrix is multiplied by the invertible matrix, the result can be multiplied by a ...
, i.e., one with nonzero determinant, −1 can thus be written as an order
polynomial expression
In mathematics, especially in the field of algebra, a polynomial ring or polynomial algebra is a ring (mathematics), ring formed from the set (mathematics), set of polynomials in one or more indeterminate (variable), indeterminates (traditionally ...
in : As indicated, the Cayley–Hamilton theorem amounts to the identity
The coefficients are given by the
elementary symmetric polynomial
In mathematics, specifically in commutative algebra, the elementary symmetric polynomials are one type of basic building block for symmetric polynomials, in the sense that any symmetric polynomial can be expressed as a polynomial in elementary sy ...
s of the
eigenvalue
In linear algebra, an eigenvector ( ) or characteristic vector is a vector that has its direction unchanged (or reversed) by a given linear transformation. More precisely, an eigenvector \mathbf v of a linear transformation T is scaled by a ...
s of . Using Newton identities, the elementary symmetric polynomials can in turn be expressed in terms of power sum symmetric polynomials of the eigenvalues:
where is the trace of the matrix . Thus, we can express in terms of the trace of powers of .
In general, the formula for the coefficients is given in terms of complete exponential Bell polynomials asSee Sect. 2 of . An explicit expression for the coefficients
is provided by :
where the sum is taken over the sets of all integer partitions satisfying the equation
In particular, the determinant of equals . Thus, the determinant can be written as the trace identity:
Likewise, the characteristic polynomial can be written as
and, by multiplying both sides by (note ), one is led to an expression for the inverse of as a trace identity,
Another method for obtaining these coefficients for a general matrix, provided no root be zero, relies on the following alternative expression for the determinant,
Hence, by virtue of the Mercator series,
where the exponential ''only'' needs be expanded to order , since is of order , the net negative powers of automatically vanishing by the C–H theorem. (Again, this requires a ring containing the
rational number
In mathematics, a rational number is a number that can be expressed as the quotient or fraction of two integers, a numerator and a non-zero denominator . For example, is a rational number, as is every integer (for example,
The set of all ...
s.) Differentiation of this expression with respect to allows one to express the coefficients of the characteristic polynomial for general as determinants of matrices,See, e.g., p. 54 of , which solves Jacobi's formula,
where is the adjugate matrix of the next section.
There also exists an equivalent, related recursive algorithm introduced by Urbain Le Verrier and Dmitry Konstantinovich Faddeev—the Faddeev–LeVerrier algorithm, which reads
(see, e.g., .) Observe as the recursion terminates.
See the algebraic proof in the following section, which relies on the modes of the adjugate, .
Specifically, and the above derivative of when one traces it yields
(), and the above recursions, in turn.
; Examples
For instance, the first few Bell polynomials are = 1, , , and .
Using these to specify the coefficients of the characteristic polynomial of a matrix yields
The coefficient gives the determinant of the matrix, minus its trace, while its inverse is given by
It is apparent from the general formula for ''c''''n''−''k'', expressed in terms of Bell polynomials, that the expressions
always give the coefficients of and of in the characteristic polynomial of any matrix, respectively. So, for a matrix , the statement of the Cayley–Hamilton theorem can also be written as
where the right-hand side designates a matrix with all entries reduced to zero. Likewise, this determinant in the case, is now
This expression gives the negative of coefficient of in the general case, as seen below.
Similarly, one can write for a matrix ,
where, now, the determinant is ,
and so on for larger matrices. The increasingly complex expressions for the coefficients is deducible from Newton's identities or the Faddeev–LeVerrier algorithm.
''n''-th power of matrix
The Cayley–Hamilton theorem always provides a relationship between the powers of (though not always the simplest one), which allows one to simplify expressions involving such powers, and evaluate them without having to compute the power ''n'' or any higher powers of .
As an example, for the theorem gives
Then, to calculate , observe
Likewise,
Notice that we have been able to write the matrix power as the sum of two terms. In fact, matrix power of any order can be written as a matrix polynomial of degree at most , where is the size of a square matrix. This is an instance where Cayley–Hamilton theorem can be used to express a matrix function, which we will discuss below systematically.
Matrix functions
Given an
analytic function
In mathematics, an analytic function is a function that is locally given by a convergent power series. There exist both real analytic functions and complex analytic functions. Functions of each type are infinitely differentiable, but complex ...
and the characteristic polynomial of degree of an matrix , the function can be expressed using long division as
where is some quotient polynomial and is a remainder polynomial such that .
By the Cayley–Hamilton theorem, replacing by the matrix gives , so one has
Thus, the analytic function of the matrix can be expressed as a matrix polynomial of degree less than .
Let the remainder polynomial be
Since , evaluating the function at the eigenvalues of yields
This amounts to a system of
linear equation
In mathematics, a linear equation is an equation that may be put in the form
a_1x_1+\ldots+a_nx_n+b=0, where x_1,\ldots,x_n are the variables (or unknowns), and b,a_1,\ldots,a_n are the coefficients, which are often real numbers. The coeffici ...
s, which can be solved to determine the coefficients . Thus, one has
When the eigenvalues are repeated, that is for some , two or more equations are identical; and hence the linear equations cannot be solved uniquely. For such cases, for an eigenvalue with multiplicity , the first derivatives of vanish at the eigenvalue. This leads to the extra linearly independent solutions
which, combined with others, yield the required equations to solve for .
Finding a polynomial that passes through the points is essentially an interpolation problem, and can be solved using
Lagrange
Joseph-Louis Lagrange (born Giuseppe Luigi LagrangiaNewton interpolation techniques, leading to Sylvester's formula.
For example, suppose the task is to find the polynomial representation of
The characteristic polynomial is , and the eigenvalues are . Let . Evaluating at the eigenvalues, one obtains two linear equations, and .
Solving the equations yields and . Thus, it follows that
If, instead, the function were , then the coefficients would have been and ; hence
As a further example, when considering
then the characteristic polynomial is , and the eigenvalues are .
As before, evaluating the function at the eigenvalues gives us the linear equations and ; the solution of which gives, and . Thus, for this case,
which is a
rotation matrix
In linear algebra, a rotation matrix is a transformation matrix that is used to perform a rotation (mathematics), rotation in Euclidean space. For example, using the convention below, the matrix
:R = \begin
\cos \theta & -\sin \theta \\
\sin \t ...
.
Standard examples of such usage is the exponential map from the
Lie algebra
In mathematics, a Lie algebra (pronounced ) is a vector space \mathfrak g together with an operation called the Lie bracket, an alternating bilinear map \mathfrak g \times \mathfrak g \rightarrow \mathfrak g, that satisfies the Jacobi ident ...
matrix exponential
In mathematics, the matrix exponential is a matrix function on square matrix, square matrices analogous to the ordinary exponential function. It is used to solve systems of linear differential equations. In the theory of Lie groups, the matrix exp ...
,
Such expressions have long been known for ,
where the are the
Pauli matrices
In mathematical physics and mathematics, the Pauli matrices are a set of three complex matrices that are traceless, Hermitian, involutory and unitary. Usually indicated by the Greek letter sigma (), they are occasionally denoted by tau () ...
and for ,
which is
Rodrigues' rotation formula
In the theory of three-dimensional rotation, Rodrigues' rotation formula, named after Olinde Rodrigues, is an efficient algorithm for rotating a vector in space, given an axis and angle of rotation. By extension, this can be used to transfo ...
conformal group
In mathematics, the conformal group of an inner product space is the group (mathematics), group of transformations from the space to itself that preserve angles. More formally, it is the group of transformations that preserve the conformal geometr ...
of
spacetime
In physics, spacetime, also called the space-time continuum, is a mathematical model that fuses the three dimensions of space and the one dimension of time into a single four-dimensional continuum. Spacetime diagrams are useful in visualiz ...
, its
simply connected
In topology, a topological space is called simply connected (or 1-connected, or 1-simply connected) if it is path-connected and every Path (topology), path between two points can be continuously transformed into any other such path while preserving ...
cover (to be precise, the simply connected cover of the connected component of ). The expressions obtained apply to the standard representation of these groups. They require knowledge of (some of) the
eigenvalue
In linear algebra, an eigenvector ( ) or characteristic vector is a vector that has its direction unchanged (or reversed) by a given linear transformation. More precisely, an eigenvector \mathbf v of a linear transformation T is scaled by a ...
s of the matrix to exponentiate. For (and hence for ), closed expressions have been obtained for ''all'' irreducible representations, i.e. of any spin.
Algebraic number theory
The Cayley–Hamilton theorem is an effective tool for computing the minimal polynomial of
algebraic integer
In algebraic number theory, an algebraic integer is a complex number that is integral over the integers. That is, an algebraic integer is a complex root of some monic polynomial (a polynomial whose leading coefficient is 1) whose coefficients ...