In mathematics, the Smith normal form (sometimes abbreviated SNF) is a normal form that can be defined for any matrix (not necessarily square) with entries in a principal ideal domain (PID). The Smith normal form of a matrix is diagonal, and can be obtained from the original matrix by multiplying on the left and right by invertible square matrices. In particular, the integers are a PID, so one can always calculate the Smith normal form of an integer matrix. The Smith normal form is very useful for working with finitely generated modules over a PID, and in particular for deducing the structure of a quotient of a free module. It is named after the Irish mathematician Henry John Stephen Smith.
Definition
Let ''A'' be a nonzero ''m''×''n'' matrix over a principal ideal domain ''R''. There exist invertible ''m''×''m'' and ''n''×''n'' matrices ''S'', ''T'' (with coefficients in ''R'') such that the product ''S A T'' is
: <math>\begin{pmatrix} \operatorname{diag}(\alpha_1, \alpha_2, \ldots, \alpha_r) & 0 \\ 0 & 0 \end{pmatrix},</math>
and the diagonal elements <math>\alpha_i</math> satisfy <math>\alpha_i \mid \alpha_{i+1}</math> for all <math>1 \le i < r</math>. This is the Smith normal form of the matrix ''A''. The elements <math>\alpha_i</math> are unique up to multiplication by a unit and are called the ''elementary divisors'', ''invariants'', or ''invariant factors''. They can be computed (up to multiplication by a unit) as
: <math>\alpha_i = \frac{d_i(A)}{d_{i-1}(A)},</math>
where <math>d_i(A)</math> (called the ''i''-th ''determinant divisor'') equals the greatest common divisor of the determinants of all <math>i \times i</math> minors of the matrix ''A'', and <math>d_0(A) := 1</math>.
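For integer matrices, the determinant-divisor formula can be evaluated directly. The following is a minimal sketch, assuming ''R'' is the ring of integers and the matrix is small enough for brute-force enumeration of minors; the function names `det`, `determinant_divisor` and `invariant_factors` are illustrative and not taken from any library.

```python
from itertools import combinations
from math import gcd

def det(M):
    """Integer determinant by Laplace expansion along the first row (small matrices only)."""
    if len(M) == 1:
        return M[0][0]
    total = 0
    for c, entry in enumerate(M[0]):
        if entry == 0:
            continue
        minor = [row[:c] + row[c + 1:] for row in M[1:]]
        total += (-1) ** c * entry * det(minor)
    return total

def determinant_divisor(A, i):
    """d_i(A): gcd of all i-by-i minors of A, with d_0(A) = 1."""
    if i == 0:
        return 1
    m, n = len(A), len(A[0])
    g = 0
    for rows in combinations(range(m), i):
        for cols in combinations(range(n), i):
            g = gcd(g, det([[A[r][c] for c in cols] for r in rows]))
    return g

def invariant_factors(A):
    """Nonzero invariant factors alpha_i = d_i / d_{i-1}, normalised to be positive."""
    factors, prev = [], 1
    for i in range(1, min(len(A), len(A[0])) + 1):
        d = determinant_divisor(A, i)
        if d == 0:          # all i-by-i minors vanish: rank reached
            break
        factors.append(d // prev)
        prev = d
    return factors

A = [[2, 4, 4], [-6, 6, 12], [10, 4, 16]]
print(invariant_factors(A))  # [2, 2, 156]
```

Enumerating all minors grows combinatorially with the matrix size, so this is useful as a check on small examples rather than as a practical method.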
Algorithm
The first goal is to find invertible square matrices ''S'' and ''T'' such that the product ''S A T'' is diagonal. This is the hardest part of the algorithm. Once diagonality is achieved, it becomes relatively easy to put the matrix into Smith normal form. Phrased more abstractly, the goal is to show that, thinking of ''A'' as a map from <math>R^n</math> (the free ''R''-module of rank ''n'') to <math>R^m</math> (the free ''R''-module of rank ''m''), there are isomorphisms <math>S : R^m \to R^m</math> and <math>T : R^n \to R^n</math> such that <math>S \cdot A \cdot T</math> has the simple form of a diagonal matrix. The matrices ''S'' and ''T'' can be found by starting out with identity matrices of the appropriate size, and modifying ''S'' each time a row operation is performed on ''A'' in the algorithm by the same row operation (for example, if row ''i'' is added to row ''j'' of ''A'', then row ''i'' should also be added to row ''j'' of ''S'' to retain the product invariant), and similarly modifying ''T'' for each column operation performed. Since row operations are left-multiplications and column operations are right-multiplications, this preserves the invariant <math>A' = S' \cdot A \cdot T'</math>, where <math>A', S', T'</math> denote current values and ''A'' denotes the original matrix; eventually the matrix <math>A'</math> in this invariant becomes diagonal. Only invertible row and column operations are performed, which ensures that ''S'' and ''T'' remain invertible matrices.
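The bookkeeping for ''S'' and ''T'' can be checked numerically. The sketch below assumes integer matrices stored as lists of rows and maintains the invariant that the working matrix equals ''S'' times the original matrix times ''T'': each row operation on the working matrix `W` is repeated on `S`, and each column operation is repeated on `T`. The helper names are illustrative.

```python
def matmul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def identity(n):
    return [[int(i == j) for j in range(n)] for i in range(n)]

def add_row(M, src, dst, c):
    """In place: row dst += c * row src (a left-multiplication)."""
    M[dst] = [d + c * s for d, s in zip(M[dst], M[src])]

def add_col(M, src, dst, c):
    """In place: column dst += c * column src (a right-multiplication)."""
    for row in M:
        row[dst] += c * row[src]

A = [[2, 4], [6, 8], [10, 12]]      # original 3x2 matrix, never modified
W = [row[:] for row in A]           # working copy A'
S, T = identity(3), identity(2)

# A row operation on W is mirrored by the same row operation on S.
add_row(W, 0, 1, -3)
add_row(S, 0, 1, -3)
assert W == matmul(matmul(S, A), T)

# A column operation on W is mirrored by the same column operation on T.
add_col(W, 0, 1, -2)
add_col(T, 0, 1, -2)
assert W == matmul(matmul(S, A), T)
```

Because every operation applied here is invertible over the integers, `S` and `T` stay invertible throughout.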
For <math>a \in R \setminus \{0\}</math>, write <math>\delta(a)</math> for the number of prime factors of <math>a</math> (these exist and are unique since any PID is also a unique factorization domain). In particular, ''R'' is also a Bézout domain, so it is a gcd domain and the gcd of any two elements satisfies Bézout's identity.
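Over the integers, both ingredients are easy to compute. The following sketch assumes ''R'' is the ring of integers; `bezout` is the extended Euclidean algorithm and `delta` counts prime factors with multiplicity by trial division. Both names are illustrative.

```python
def bezout(a, b):
    """Return (g, s, t) with g = gcd(a, b) >= 0 and a*s + b*t == g."""
    old_r, r = a, b
    old_s, s = 1, 0
    old_t, t = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_s, s = s, old_s - q * s
        old_t, t = t, old_t - q * t
    if old_r < 0:                       # normalise the gcd to be non-negative
        old_r, old_s, old_t = -old_r, -old_s, -old_t
    return old_r, old_s, old_t

def delta(a):
    """Number of prime factors of a nonzero integer, counted with multiplicity."""
    a, count, p = abs(a), 0, 2
    while p * p <= a:
        while a % p == 0:
            a //= p
            count += 1
        p += 1
    return count + (1 if a > 1 else 0)

g, s, t = bezout(12, 18)
assert g == 6 and 12 * s + 18 * t == 6
# 12 does not divide 18, so the gcd has strictly fewer prime factors than 12.
assert delta(g) < delta(12)
```

The strict drop in `delta` when one element does not divide the other is what the termination argument of the algorithm relies on.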
To put a matrix into Smith normal form, one can repeatedly apply the following, where ''t'' loops from 1 to ''m''.
Step I: Choosing a pivot
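Choose <math>j_t</math> to be the smallest column index of ''A'' with a non-zero entry, starting the search at column index <math>j_{t-1}+1</math> if <math>t > 1</math>.
We wish to have <math>a_{t,j_t} \neq 0</math>; if this is the case this step is complete, otherwise there is by assumption some ''k'' with <math>a_{k,j_t} \neq 0</math>, and we can exchange rows ''t'' and ''k'', thereby obtaining <math>a_{t,j_t} \neq 0</math>.
Our chosen pivot is now at position <math>(t, j_t)</math>.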
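A minimal sketch of the pivot search, assuming an integer matrix stored as a list of rows and 0-based indices (so the text's ''t'' = 1 corresponds to `t = 0`). The function name `choose_pivot` is illustrative. It only inspects rows at or below `t`, on the assumption that earlier rows have already been cleared by previous iterations, and it does not track the matrix ''S''.

```python
def choose_pivot(A, t, start_col):
    """Step I for row index t, scanning columns from start_col.

    Returns the pivot column j_t, or None if rows t.. contain no nonzero
    entry in columns start_col.. . Swaps rows in place so A[t][j_t] != 0.
    """
    m, n = len(A), len(A[0])
    for j in range(start_col, n):
        for k in range(t, m):
            if A[k][j] != 0:
                if k != t:
                    A[t], A[k] = A[k], A[t]   # exchange rows t and k
                return j
    return None

A = [[0, 0, 3],
     [0, 5, 1]]
j0 = choose_pivot(A, 0, 0)        # column 0 is zero, so the pivot column is 1
j1 = choose_pivot(A, 1, j0 + 1)   # continue the search to the right of j0
print(j0, j1, A)
```

Returning `None` signals that no further pivot exists, which is the point at which the main loop can stop.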
Step II: Improving the pivot
If there is an entry at position <math>(k, j_t)</math> such that <math>a_{t,j_t} \nmid a_{k,j_t}</math>, then, letting <math>\beta = \gcd(a_{t,j_t}, a_{k,j_t})</math>, we know by the Bézout property that there exist σ, τ in ''R'' such that
: <math>a_{t,j_t} \cdot \sigma + a_{k,j_t} \cdot \tau = \beta.</math>
By left-multiplication with an appropriate invertible matrix ''L'', it can be achieved that row ''t'' of the matrix product is the sum of σ times the original row ''t'' and τ times the original row ''k'', that row ''k'' of the product is another linear combination of those original rows, and that all other rows are unchanged. Explicitly, if σ and τ satisfy the above equation, then for <math>\alpha = a_{t,j_t}/\beta</math> and <math>\gamma = a_{k,j_t}/\beta</math> (these divisions are possible by the definition of β) one has
: <math>\sigma \cdot \alpha + \tau \cdot \gamma = 1,</math>
so that the matrix
: <math>L_0 = \begin{pmatrix} \sigma & \tau \\ -\gamma & \alpha \end{pmatrix}</math>
is invertible, with inverse
: <math>\begin{pmatrix} \alpha & -\tau \\ \gamma & \sigma \end{pmatrix}.</math>
Now ''L'' can be obtained by fitting <math>L_0</math> into rows and columns ''t'' and ''k'' of the identity matrix. By construction the matrix obtained after left-multiplying by ''L'' has entry β at position <math>(t, j_t)</math> (and due to our choice of α and γ it also has an entry 0 at position <math>(k, j_t)</math>, which is useful though not essential for the algorithm). This new entry β divides the entry <math>a_{t,j_t}</math> that was there before, and so in particular <math>\delta(\beta) < \delta(a_{t,j_t})</math>; therefore repeating these steps must eventually terminate. One ends up with a matrix having an entry at position <math>(t, j_t)</math> that divides all entries in column <math>j_t</math>.
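The 2×2 matrix used in this step can be built and checked directly. The sketch below assumes integer entries; `bezout` is a standard extended Euclidean algorithm and `step_two_matrix` is an illustrative name. Here `a` plays the role of the pivot entry and `b` the role of the entry that the pivot fails to divide.

```python
def bezout(a, b):
    """Return (g, s, t) with g = gcd(a, b) >= 0 and a*s + b*t == g."""
    old_r, r, old_s, s, old_t, t = a, b, 1, 0, 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_s, s = s, old_s - q * s
        old_t, t = t, old_t - q * t
    if old_r < 0:
        old_r, old_s, old_t = -old_r, -old_s, -old_t
    return old_r, old_s, old_t

def step_two_matrix(a, b):
    """Return (L0, L0_inv, beta) for pivot entry a and column entry b."""
    beta, sigma, tau = bezout(a, b)
    alpha, gamma = a // beta, b // beta     # exact divisions, since beta divides both
    L0 = [[sigma, tau], [-gamma, alpha]]
    L0_inv = [[alpha, -tau], [gamma, sigma]]
    return L0, L0_inv, beta

a, b = 4, 6
L0, L0_inv, beta = step_two_matrix(a, b)
# L0 maps the column (a, b) to (beta, 0).
image = [L0[0][0] * a + L0[0][1] * b, L0[1][0] * a + L0[1][1] * b]
print(beta, image)  # 2 [2, 0]
```

Embedding `L0` into rows and columns ''t'' and ''k'' of an identity matrix then gives the full matrix ''L'' described above.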
Step III: Eliminating entries
Finally, adding appropriate multiples of row ''t'', it can be achieved that all entries in column <math>j_t</math> except for that at position <math>(t, j_t)</math> are zero. This can be achieved by left-multiplication with an appropriate matrix. However, to make the matrix fully diagonal we need to eliminate nonzero entries on the row of position <math>(t, j_t)</math> as well. This can be achieved by repeating the steps in Step II for columns instead of rows, and using multiplication on the right by the transpose of the obtained matrix ''L''. In general this will result in the zero entries from the prior application of Step III becoming nonzero again.
However, notice that each application of Step II for either rows or columns must continue to reduce the value of <math>\delta(a_{t,j_t})</math>, and so the process must eventually stop after some number of iterations, leading to a matrix where the entry at position <math>(t, j_t)</math> is the only non-zero entry in both its row and column.
At this point, only the block of ''A'' to the lower right of <math>(t, j_t)</math> needs to be diagonalized, and conceptually the algorithm can be applied recursively, treating this block as a separate matrix. In other words, we can increment ''t'' by one and go back to Step I.
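Steps I to III can be combined into a short diagonalization routine. The following is a sketch under stated assumptions, not a definitive implementation: entries are integers, indices are 0-based, the matrices ''S'' and ''T'' are not tracked, and column operations are carried out as row operations on the transpose. All function names are illustrative. The result is diagonal in the sense of the text but does not yet satisfy the divisibility condition.

```python
def bezout(a, b):
    """Return (g, s, t) with g = gcd(a, b) >= 0 and a*s + b*t == g."""
    old_r, r, old_s, s, old_t, t = a, b, 1, 0, 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_s, s = s, old_s - q * s
        old_t, t = t, old_t - q * t
    if old_r < 0:
        old_r, old_s, old_t = -old_r, -old_s, -old_t
    return old_r, old_s, old_t

def reduce_column(A, t, j):
    """Steps II and III: make A[t][j] the only nonzero entry of column j."""
    for k in range(len(A)):
        if k == t or A[k][j] == 0:
            continue
        a, b = A[t][j], A[k][j]
        if b % a != 0:                       # Step II: replace the pivot by the gcd
            beta, sigma, tau = bezout(a, b)
            alpha, gamma = a // beta, b // beta
            row_t, row_k = A[t], A[k]
            A[t] = [sigma * x + tau * y for x, y in zip(row_t, row_k)]
            A[k] = [-gamma * x + alpha * y for x, y in zip(row_t, row_k)]
        else:                                # Step III: subtract a multiple of row t
            q = b // a
            A[k] = [y - q * x for x, y in zip(A[t], A[k])]

def clear_cross(A, t, j):
    """Alternate row and column passes until row t and column j are clear."""
    while True:
        reduce_column(A, t, j)
        if all(A[t][c] == 0 for c in range(len(A[0])) if c != j):
            return
        B = [list(col) for col in zip(*A)]   # transpose: columns become rows
        reduce_column(B, j, t)
        A[:] = [list(col) for col in zip(*B)]
        if all(A[k][j] == 0 for k in range(len(A)) if k != t):
            return

def diagonalize(A):
    """Return (D, pivots): D has nonzero entries only at the pivot positions."""
    A = [row[:] for row in A]
    m, n = len(A), len(A[0])
    t, pivots = 0, []
    for j in range(n):
        if t == m:
            break
        k = next((k for k in range(t, m) if A[k][j] != 0), None)   # Step I
        if k is None:
            continue
        A[t], A[k] = A[k], A[t]
        clear_cross(A, t, j)
        pivots.append((t, j))
        t += 1
    return A, pivots

D, pivots = diagonalize([[2, 4, 4], [-6, 6, 12], [10, 4, 16]])
print(D, pivots)
```

The loop in `clear_cross` terminates for the reason given above: each use of Step II strictly reduces the number of prime factors of the pivot, and a pass that uses only Step III leaves the other direction cleared.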
Final step
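Applying the steps described above to the remaining non-zero columns of the resulting matrix (if any), we get an <math>m \times n</math>-matrix with column indices <math>j_1 < \ldots < j_r</math> where <math>r \le \min(m,n)</math>. The matrix entries at positions <math>(l, j_l)</math> are non-zero, and every other entry is zero.
Now we can move the null columns of this matrix to the right, so that the nonzero entries are on positions <math>(i,i)</math> for <math>1 \le i \le r</math>. For short, write <math>\alpha_i</math> for the element at position <math>(i,i)</math>.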
The condition of divisibility of diagonal entries might not be satisfied. For any index