In mathematics, particularly linear algebra and numerical analysis, the Gram–Schmidt process is a method for orthonormalizing a set of vectors in an inner product space, most commonly the Euclidean space R^''n'' equipped with the standard inner product. The Gram–Schmidt process takes a finite, linearly independent set of vectors ''S'' = {v_1, …, v_''k''} for ''k'' ≤ ''n'' and generates an orthogonal set ''S′'' = {u_1, …, u_''k''} that spans the same ''k''-dimensional subspace of R^''n'' as ''S''.

The method is named after Jørgen Pedersen Gram and Erhard Schmidt, but Pierre-Simon Laplace had been familiar with it before Gram and Schmidt. In the theory of Lie group decompositions it is generalized by the Iwasawa decomposition.

The application of the Gram–Schmidt process to the column vectors of a full column rank matrix yields the QR decomposition (the matrix is decomposed into an orthogonal and a triangular matrix).
The Gram–Schmidt process

We define the projection operator by

\operatorname{proj}_{\mathbf{u}} (\mathbf{v}) = \frac{\langle \mathbf{v}, \mathbf{u}\rangle}{\langle \mathbf{u}, \mathbf{u}\rangle} \mathbf{u},

where

\langle \mathbf{v}, \mathbf{u}\rangle

denotes the inner product of the vectors v and u. This operator projects the vector v orthogonally onto the line spanned by vector u. If u = 0, we define

\operatorname{proj}_\mathbf{0} (\mathbf{v}) := \mathbf{0}

, i.e., the projection map

\operatorname{proj}_\mathbf{0}

is the zero map, sending every vector to the zero vector. The Gram–Schmidt process then works as follows:

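\begin{align}
\mathbf{u}_1 & = \mathbf{v}_1, & \mathbf{e}_1 & = \frac{\mathbf{u}_1}{\|\mathbf{u}_1\|} \\
\mathbf{u}_2 & = \mathbf{v}_2-\operatorname{proj}_{\mathbf{u}_1} (\mathbf{v}_2),
& \mathbf{e}_2 & = \frac{\mathbf{u}_2}{\|\mathbf{u}_2\|} \\
\mathbf{u}_3 & = \mathbf{v}_3-\operatorname{proj}_{\mathbf{u}_1} (\mathbf{v}_3) - \operatorname{proj}_{\mathbf{u}_2} (\mathbf{v}_3),
& \mathbf{e}_3 & = \frac{\mathbf{u}_3}{\|\mathbf{u}_3\|} \\
\mathbf{u}_4 & = \mathbf{v}_4-\operatorname{proj}_{\mathbf{u}_1} (\mathbf{v}_4)-\operatorname{proj}_{\mathbf{u}_2} (\mathbf{v}_4)-\operatorname{proj}_{\mathbf{u}_3} (\mathbf{v}_4),
& \mathbf{e}_4 & = \frac{\mathbf{u}_4}{\|\mathbf{u}_4\|} \\
& \ \  \vdots & & \ \  \vdots \\
\mathbf{u}_k & = \mathbf{v}_k - \sum_{j=1}^{k-1}\operatorname{proj}_{\mathbf{u}_j} (\mathbf{v}_k),
& \mathbf{e}_k & = \frac{\mathbf{u}_k}{\|\mathbf{u}_k\|}.
\end{align}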
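The recursion above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the helper names `inner`, `proj`, and `gram_schmidt` and the three sample vectors are assumptions made for the example, and exact rational arithmetic is used so that orthogonality can be checked without rounding.

```python
from fractions import Fraction

def inner(a, b):
    """Standard dot product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))

def proj(u, v):
    """Project v onto the line spanned by u; the zero map when u = 0."""
    uu = inner(u, u)
    if uu == 0:
        return [0 * x for x in u]
    c = inner(v, u) / uu
    return [c * x for x in u]

def gram_schmidt(vectors):
    """Return the orthogonal (not normalized) vectors u_1, ..., u_k."""
    us = []
    for v in vectors:
        u = list(v)
        for prev in us:
            # Classical form: every projection is taken of the original v.
            p = proj(prev, v)
            u = [a - b for a, b in zip(u, p)]
        us.append(u)
    return us

# Hypothetical sample input: three linearly independent vectors in R^3.
vs = [[Fraction(1), Fraction(1), Fraction(0)],
      [Fraction(1), Fraction(0), Fraction(1)],
      [Fraction(0), Fraction(1), Fraction(1)]]
us = gram_schmidt(vs)
```

Dividing each resulting vector by its norm would give the orthonormal vectors; that step is left out here because square roots are not exact in rational arithmetic.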

The sequence u_1, …, u_''k'' is the required system of orthogonal vectors, and the normalized vectors e_1, …, e_''k'' form an ortho''normal'' set. The calculation of the sequence u_1, …, u_''k'' is known as ''Gram–Schmidt orthogonalization'', while the calculation of the sequence e_1, …, e_''k'' is known as ''Gram–Schmidt orthonormalization'' as the vectors are normalized.

To check that these formulas yield an orthogonal sequence, first compute

\langle \mathbf{u}_1, \mathbf{u}_2 \rangle

by substituting the above formula for u₂: we get zero. Then use this to compute

\langle \mathbf{u}_1, \mathbf{u}_3 \rangle

again by substituting the formula for u₃: we get zero. The general proof proceeds by mathematical induction.

Geometrically, this method proceeds as follows: to compute u_''i'', it projects v_''i'' orthogonally onto the subspace ''U'' generated by u_1, …, u_{''i''−1}, which is the same as the subspace generated by v_1, …, v_{''i''−1}. The vector u_''i'' is then defined to be the difference between v_''i'' and this projection, guaranteed to be orthogonal to all of the vectors in the subspace ''U''.

The Gram–Schmidt process also applies to a linearly independent countably infinite sequence {v_''i''}_''i''. The result is an orthogonal (or orthonormal) sequence {u_''i''}_''i'' such that for every natural number ''n'', the algebraic span of v_1, …, v_''n'' is the same as that of u_1, …, u_''n''.

If the Gram–Schmidt process is applied to a linearly dependent sequence, it outputs the 0 vector on the ''i''th step, assuming that v_''i'' is a linear combination of v_1, …, v_{''i''−1}. If an orthonormal basis is to be produced, then the algorithm should test for zero vectors in the output and discard them, because no multiple of a zero vector can have a length of 1. The number of vectors output by the algorithm will then be the dimension of the space spanned by the original inputs.

A variant of the Gram–Schmidt process using transfinite recursion applied to a (possibly uncountably) infinite sequence of vectors

(v_\alpha)_{\alpha<\lambda}

yields a set of orthonormal vectors

(u_\alpha)_{\alpha<\kappa}

with

\kappa\leq\lambda

such that for any

\alpha\leq\lambda

, the completion of the span of

\{ u_\beta : \beta<\min(\alpha,\kappa) \}

is the same as that of

\{ v_\beta : \beta < \alpha \}

. In particular, when applied to an (algebraic) basis of a Hilbert space (or, more generally, a basis of any dense subspace), it yields a (functional-analytic) orthonormal basis. Note that in the general case the strict inequality

\kappa < \lambda

often holds, even if the starting set was linearly independent, and the span of

(u_\alpha)_{\alpha<\kappa}

need not be a subspace of the span of

(v_\alpha)_{\alpha<\lambda}

(rather, it is a subspace of its completion).
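For the finite, possibly linearly dependent case described above, the "test for zero vectors and discard them" step might look as follows. This is only a sketch: the function name `orthonormalize` and the absolute tolerance `tol` are assumptions introduced here, since in floating-point arithmetic a dependent input rarely produces an exactly zero vector and some threshold has to be chosen.

```python
import math

def orthonormalize(vectors, tol=1e-12):
    """Orthonormalize vectors, dropping those that depend on earlier ones.

    tol is an illustrative absolute threshold (an assumption, not part of
    the method itself) below which a remainder is treated as zero.
    """
    basis = []
    for v in vectors:
        u = [float(x) for x in v]
        for e in basis:
            c = sum(a * b for a, b in zip(u, e))
            u = [a - c * b for a, b in zip(u, e)]
        norm = math.sqrt(sum(a * a for a in u))
        if norm > tol:
            # Only non-zero remainders can be scaled to unit length.
            basis.append([a / norm for a in u])
    return basis

# The second vector is a multiple of the first, so it is discarded.
vs = [[1, 0, 0], [2, 0, 0], [1, 1, 0]]
basis = orthonormalize(vs)
```

Here `len(basis)` equals the dimension of the span of the inputs, which is 2 for this sample.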

Example

Euclidean space

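Consider the following set of vectors in R^2 (with the conventional inner product):

S = \left\{\mathbf{v}_1=\begin{bmatrix} 3 \\ 1\end{bmatrix}, \mathbf{v}_2=\begin{bmatrix}2\\2\end{bmatrix}\right\}.

Now, perform Gram–Schmidt to obtain an orthogonal set of vectors:

\mathbf{u}_1=\mathbf{v}_1=\begin{bmatrix}3\\1\end{bmatrix}

\mathbf{u}_2 = \mathbf{v}_2 - \operatorname{proj}_{\mathbf{u}_1} (\mathbf{v}_2)
= \begin{bmatrix}2\\2\end{bmatrix} - \operatorname{proj}_{\left[\begin{smallmatrix}3 \\ 1\end{smallmatrix}\right]} \begin{bmatrix}2\\2\end{bmatrix}
= \begin{bmatrix}2\\2\end{bmatrix} - \frac{8}{10} \begin{bmatrix} 3 \\1 \end{bmatrix}
= \begin{bmatrix} -2/5 \\6/5 \end{bmatrix}.

We check that the vectors u_1 and u_2 are indeed orthogonal:

\langle\mathbf{u}_1,\mathbf{u}_2\rangle = \left\langle \begin{bmatrix}3\\1\end{bmatrix}, \begin{bmatrix} -2/5 \\ 6/5 \end{bmatrix} \right\rangle = -\frac{6}{5} + \frac{6}{5} = 0,

noting that if the dot product of two vectors is 0 then they are orthogonal. For non-zero vectors, we can then normalize the vectors by dividing out their sizes as shown above:

\mathbf{e}_1 = \frac{1}{\sqrt{10}}\begin{bmatrix}3\\1\end{bmatrix}

\mathbf{e}_2 = \frac{1}{\sqrt{40 \over 25}} \begin{bmatrix}-2/5\\6/5\end{bmatrix}
= \frac{1}{\sqrt{10}} \begin{bmatrix}-1\\3\end{bmatrix}.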
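The arithmetic of this example can be re-checked exactly with rational numbers. The snippet below is a small sketch; the variable names are chosen for illustration and mirror the symbols in the example.

```python
from fractions import Fraction

def inner(a, b):
    """Standard dot product."""
    return sum(x * y for x, y in zip(a, b))

v1 = [Fraction(3), Fraction(1)]
v2 = [Fraction(2), Fraction(2)]

u1 = v1
coeff = inner(v2, u1) / inner(u1, u1)          # 8/10
u2 = [b - coeff * a for a, b in zip(u1, v2)]   # v2 - (8/10) * u1

norm_sq_u1 = inner(u1, u1)                     # squared length of u1
norm_sq_u2 = inner(u2, u2)                     # squared length of u2
```

The squared lengths 10 and 40/25 are the quantities under the square roots in the normalization step.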

Properties

Denote by

\operatorname{GS}(\mathbf{v}_1, \dots, \mathbf{v}_k)

the result of applying the Gram–Schmidt process to a collection of vectors

\mathbf{v}_1, \dots, \mathbf{v}_k

. This yields a map

\operatorname{GS} \colon (\R^n)^{k} \to (\R^n)^{k}

. It has the following properties:

* It is continuous.
* It is orientation preserving in the sense that

\operatorname{or}(\mathbf{v}_1,\dots,\mathbf{v}_k) = \operatorname{or}(\operatorname{GS}(\mathbf{v}_1,\dots,\mathbf{v}_k))

* It commutes with orthogonal maps: let

g \colon \R^n \to \R^n

be orthogonal (with respect to the given inner product). Then we have

\operatorname{GS}(g(\mathbf{v}_1),\dots,g(\mathbf{v}_k)) = \left( g(\operatorname{GS}(\mathbf{v}_1,\dots,\mathbf{v}_k)_1),\dots,g(\operatorname{GS}(\mathbf{v}_1,\dots,\mathbf{v}_k)_k) \right)

Further, a parametrized version of the Gram–Schmidt process yields a (strong) deformation retraction of the general linear group

\mathrm{GL}(\R^n)

onto the orthogonal group

O(\R^n).
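Two of these properties can be spot-checked numerically in the plane. The sketch below is illustrative only: the function names, the rotation angle 0.7, and the sample vectors are assumptions, and a single numerical check is of course not a proof.

```python
import math

def gs(vectors):
    """Orthonormalize a linearly independent list of vectors."""
    es = []
    for v in vectors:
        u = list(v)
        for e in es:
            c = sum(a * b for a, b in zip(v, e))
            u = [a - c * b for a, b in zip(u, e)]
        n = math.sqrt(sum(a * a for a in u))
        es.append([a / n for a in u])
    return es

def rotate(v, theta=0.7):
    """An orthogonal map g of R^2: rotation by theta radians."""
    c, s = math.cos(theta), math.sin(theta)
    return [c * v[0] - s * v[1], s * v[0] + c * v[1]]

def det2(a, b):
    """Determinant of the 2x2 matrix with columns a and b (orientation sign)."""
    return a[0] * b[1] - a[1] * b[0]

vs = [[3.0, 1.0], [2.0, 2.0]]

# Commuting with orthogonal maps: GS(g(v_1), g(v_2)) versus g applied to GS(v_1, v_2).
lhs = gs([rotate(v) for v in vs])
rhs = [rotate(e) for e in gs(vs)]
max_diff = max(abs(a - b) for p, q in zip(lhs, rhs) for a, b in zip(p, q))

# Orientation: the determinant keeps its sign under GS.
same_orientation = (det2(*vs) > 0) == (det2(*gs(vs)) > 0)
```

In this run `max_diff` is at the level of rounding error and `same_orientation` is true.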

Numerical stability

When this process is implemented on a computer, the vectors

\mathbf{u}_k

are often not quite orthogonal, due to rounding errors. For the Gram–Schmidt process as described above (sometimes referred to as "classical Gram–Schmidt") this loss of orthogonality is particularly bad; therefore, it is said that the (classical) Gram–Schmidt process is numerically unstable.

The Gram–Schmidt process can be stabilized by a small modification; this version is sometimes referred to as modified Gram–Schmidt or MGS. This approach gives the same result as the original formula in exact arithmetic and introduces smaller errors in finite-precision arithmetic. Instead of computing the vector u_''k'' as

\mathbf{u}_k = \mathbf{v}_k - \operatorname{proj}_{\mathbf{u}_1} (\mathbf{v}_k) - \operatorname{proj}_{\mathbf{u}_2} (\mathbf{v}_k) - \cdots - \operatorname{proj}_{\mathbf{u}_{k-1}} (\mathbf{v}_k),

it is computed as

\begin{align}
\mathbf{u}_k^{(1)} &= \mathbf{v}_k - \operatorname{proj}_{\mathbf{u}_1} (\mathbf{v}_k), \\
\mathbf{u}_k^{(2)} &= \mathbf{u}_k^{(1)} - \operatorname{proj}_{\mathbf{u}_2} \left(\mathbf{u}_k^{(1)}\right), \\
& \;\; \vdots \\
\mathbf{u}_k^{(k-2)} &= \mathbf{u}_k^{(k-3)} - \operatorname{proj}_{\mathbf{u}_{k-2}} \left(\mathbf{u}_k^{(k-3)}\right), \\
\mathbf{u}_k^{(k-1)} &= \mathbf{u}_k^{(k-2)} - \operatorname{proj}_{\mathbf{u}_{k-1}} \left(\mathbf{u}_k^{(k-2)}\right), \\
\mathbf{e}_k &=  \frac{\mathbf{u}_k^{(k-1)}}{\left\|\mathbf{u}_k^{(k-1)}\right\|}
\end{align}

This method is used in the previous animation, where the intermediate vector v′_3 is used when orthogonalizing the blue vector v_3.

Here is another description of the modified algorithm. Given the vectors

v_1, v_2, \dots, v_n

, in our first step we produce vectors

v_1, v_2^{(1)}, \dots, v_n^{(1)}

by removing components along the direction of v_1. In formulas,

v_k^{(1)} := v_k - \frac{\langle v_k, v_1 \rangle}{\langle v_1, v_1 \rangle} v_1

. After this step we already have two of our desired orthogonal vectors u_1, …, u_''n'', namely

u_1 = v_1, u_2 = v_2^{(1)}

, but we also made

v_3^{(1)}, \dots, v_n^{(1)}

already orthogonal to u_1. Next, we orthogonalize those remaining vectors against

u_2 = v_2^{(1)}

. This means we compute

v_3^{(2)}, v_4^{(2)}, \dots, v_n^{(2)}

by subtraction

v_k^{(2)} := v_k^{(1)} - \frac{\langle v_k^{(1)}, u_2 \rangle}{\langle u_2, u_2 \rangle} u_2

. Now we have stored the vectors

v_1, v_2^{(1)}, v_3^{(2)}, v_4^{(2)}, \dots, v_n^{(2)}

where the first three vectors are already u_1, u_2, u_3 and the remaining vectors are already orthogonal to u_1, u_2. As should be clear now, the next step orthogonalizes

v_4^{(2)}, \dots, v_n^{(2)}

against

u_3 = v_3^{(2)}

. Proceeding in this manner we find the full set of orthogonal vectors u_1, …, u_''n''. If orthonormal vectors are desired, then we normalize as we go, so that the denominators in the subtraction formulas turn into ones.
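The difference between the classical and the modified process can be made visible with a small floating-point experiment. The sketch below is an illustration under stated assumptions: the function names are invented for the example, both variants normalize as they go, and the nearly dependent test vectors (with the parameter `eps = 1e-8`) are a standard kind of stress input chosen here, not something taken from the text above.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def normalize(v):
    n = math.sqrt(dot(v, v))
    return [x / n for x in v]

def classical_gs(vectors):
    es = []
    for v in vectors:
        u = list(v)
        for e in es:
            c = dot(v, e)          # coefficient taken from the ORIGINAL v
            u = [a - c * b for a, b in zip(u, e)]
        es.append(normalize(u))
    return es

def modified_gs(vectors):
    work = [list(v) for v in vectors]
    es = []
    for i in range(len(work)):
        e = normalize(work[i])
        es.append(e)
        # Immediately remove the direction e from all remaining vectors.
        for k in range(i + 1, len(work)):
            c = dot(work[k], e)    # coefficient taken from the UPDATED vector
            work[k] = [a - c * b for a, b in zip(work[k], e)]
    return es

def orthogonality_error(es):
    """Largest |<e_i, e_j>| over all pairs i != j (0 for a perfect result)."""
    return max(abs(dot(es[i], es[j]))
               for i in range(len(es)) for j in range(i))

eps = 1e-8
vs = [[1.0, eps, 0.0, 0.0],
      [1.0, 0.0, eps, 0.0],
      [1.0, 0.0, 0.0, eps]]

cgs_err = orthogonality_error(classical_gs(vs))
mgs_err = orthogonality_error(modified_gs(vs))
```

In double precision `cgs_err` comes out near 0.5, meaning two of the supposedly orthogonal vectors are far from perpendicular, while `mgs_err` stays on the order of `eps`.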

Algorithm

The following MATLAB algorithm implements the Gram–Schmidt orthonormalization for Euclidean vectors. The vectors v_1, …, v_''k'' (columns of matrix V, so that V(:,j) is the ''j''th vector) are replaced by orthonormal vectors (columns of U) which span the same subspace.

    function U = gramschmidt(V)
        % V is an n-by-k matrix whose columns are linearly independent
        [n, k] = size(V);
        U = zeros(n, k);
        U(:,1) = V(:,1) / norm(V(:,1));
        for i = 2:k
            U(:,i) = V(:,i);
            for j = 1:i-1
                % remove the component along the j-th orthonormal vector
                U(:,i) = U(:,i) - (U(:,j)' * U(:,i)) * U(:,j);
            end
            U(:,i) = U(:,i) / norm(U(:,i));
        end
    end

The cost of this algorithm is asymptotically O(''nk''^2) floating point operations, where ''n'' is the dimensionality of the vectors.
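The operation count can be estimated directly from the loop structure. As a rough sketch, assume each inner-loop pass costs about 4''n'' floating point operations (about 2''n'' for the inner product and about 2''n'' for the scaled subtraction) and that the ''k'' normalizations contribute only lower-order O(''nk'') work:

```latex
\sum_{i=2}^{k} \sum_{j=1}^{i-1} 4n
  \;=\; 4n \sum_{i=2}^{k} (i-1)
  \;=\; 4n \cdot \frac{k(k-1)}{2}
  \;=\; 2nk(k-1)
  \;\approx\; 2nk^{2}.
```

The leading term grows like ''nk''^2, with ''n'' the dimensionality of the vectors and ''k'' the number of vectors.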

Via Gaussian elimination

If the rows

\{\mathbf{v}_1, \dots, \mathbf{v}_k\}

are written as a matrix A, then applying Gaussian elimination to the augmented matrix

\left[ A A^\mathsf{T} | A \right]

will produce the orthogonalized vectors in place of A. However, the matrix

A A^\mathsf{T}

must be brought to row echelon form using only the row operation of adding a scalar multiple of one row to another. For example, taking

\mathbf{v}_1 = \begin{bmatrix} 3 & 1\end{bmatrix}, \mathbf{v}_2=\begin{bmatrix}2 & 2\end{bmatrix}

as above, we have

\left[ A A^\mathsf{T} | A \right] = \left[\begin{array}{rr|rr} 10 & 8 & 3 & 1 \\ 8 & 8 & 2 & 2\end{array}\right]

Reducing this to row echelon form, and additionally scaling each row to have a leading 1 (which does not change the direction of the vectors), produces

\left[\begin{array}{rr|rr} 1 & .8 & .3 & .1 \\ 0 & 1 & -.25 & .75\end{array}\right]

The normalized vectors are then

\mathbf{e}_1 = \frac{1}{\sqrt{.3^2+.1^2}}\begin{bmatrix}.3 & .1\end{bmatrix} = \frac{1}{\sqrt{10}} \begin{bmatrix}3 & 1\end{bmatrix}

\mathbf{e}_2 = \frac{1}{\sqrt{.25^2+.75^2}} \begin{bmatrix}-.25 & .75\end{bmatrix} = \frac{1}{\sqrt{10}} \begin{bmatrix}-1 & 3\end{bmatrix},

as in the example above.
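The elimination procedure can be sketched as follows. This is a minimal illustration assuming linearly independent rows (so every pivot is non-zero and no row exchanges are needed); the function name `gs_by_elimination` is invented for the example, and exact rational arithmetic is used. Only the "add a multiple of one row to another" operation is applied, so the right-hand block holds the unnormalized orthogonal vectors.

```python
from fractions import Fraction

def gs_by_elimination(rows):
    """Orthogonalize the rows of A by eliminating on [A A^T | A]."""
    k = len(rows)
    a = [[Fraction(x) for x in r] for r in rows]
    # Left block: Gram matrix A A^T; right block: A itself.
    gram = [[sum(x * y for x, y in zip(a[i], a[j])) for j in range(k)]
            for i in range(k)]
    aug = [gram[i] + a[i] for i in range(k)]
    for col in range(k):
        for r in range(col + 1, k):
            # Subtract a multiple of the pivot row; no scaling, no swaps.
            factor = aug[r][col] / aug[col][col]
            aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    # The right-hand block now contains the orthogonalized rows.
    return [row[k:] for row in aug]

us = gs_by_elimination([[3, 1], [2, 2]])
```

For the rows (3, 1) and (2, 2) this returns (3, 1) and (−2/5, 6/5), matching the vectors u_1 and u_2 of the Euclidean example.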

Determinant formula

The result of the Gram–Schmidt process may be expressed in a non-recursive formula using determinants.

\mathbf{e}_j = \frac{1}{\sqrt{D_{j-1} D_j}} \begin{vmatrix}
\langle \mathbf{v}_1, \mathbf{v}_1 \rangle     & \langle \mathbf{v}_2, \mathbf{v}_1 \rangle     & \cdots & \langle \mathbf{v}_j, \mathbf{v}_1 \rangle \\
\langle \mathbf{v}_1, \mathbf{v}_2 \rangle     & \langle \mathbf{v}_2, \mathbf{v}_2 \rangle     & \cdots & \langle \mathbf{v}_j, \mathbf{v}_2 \rangle \\
\vdots                                         & \vdots                                         & \ddots & \vdots \\
\langle \mathbf{v}_1, \mathbf{v}_{j-1} \rangle & \langle \mathbf{v}_2, \mathbf{v}_{j-1} \rangle & \cdots & \langle \mathbf{v}_j, \mathbf{v}_{j-1} \rangle \\
\mathbf{v}_1                                   & \mathbf{v}_2                                   & \cdots & \mathbf{v}_j
\end{vmatrix}

\mathbf{u}_j = \frac{1}{D_{j-1}} \begin{vmatrix}
\langle \mathbf{v}_1, \mathbf{v}_1 \rangle     & \langle \mathbf{v}_2, \mathbf{v}_1 \rangle     & \cdots & \langle \mathbf{v}_j, \mathbf{v}_1 \rangle \\
\langle \mathbf{v}_1, \mathbf{v}_2 \rangle     & \langle \mathbf{v}_2, \mathbf{v}_2 \rangle     & \cdots & \langle \mathbf{v}_j, \mathbf{v}_2 \rangle \\
\vdots                                         & \vdots                                         & \ddots & \vdots \\
\langle \mathbf{v}_1, \mathbf{v}_{j-1} \rangle & \langle \mathbf{v}_2, \mathbf{v}_{j-1} \rangle & \cdots & \langle \mathbf{v}_j, \mathbf{v}_{j-1} \rangle \\
\mathbf{v}_1                                   & \mathbf{v}_2                                   & \cdots & \mathbf{v}_j
\end{vmatrix}

where ''D''₀ = 1 and, for ''j'' ≥ 1, ''D_j'' is the Gram determinant

D_j = \begin{vmatrix}
\langle \mathbf{v}_1, \mathbf{v}_1 \rangle & \langle \mathbf{v}_2, \mathbf{v}_1 \rangle & \cdots & \langle \mathbf{v}_j, \mathbf{v}_1 \rangle \\
\langle \mathbf{v}_1, \mathbf{v}_2 \rangle & \langle \mathbf{v}_2, \mathbf{v}_2 \rangle & \cdots & \langle \mathbf{v}_j, \mathbf{v}_2 \rangle \\
\vdots & \vdots & \ddots & \vdots \\
\langle \mathbf{v}_1, \mathbf{v}_j \rangle & \langle \mathbf{v}_2, \mathbf{v}_j \rangle & \cdots & \langle \mathbf{v}_j, \mathbf{v}_j \rangle
\end{vmatrix}.

Note that the expression for u_''j'' is a "formal" determinant, i.e. the matrix contains both scalars and vectors; the meaning of this expression is defined to be the result of a cofactor expansion along the row of vectors. The determinant formula for the Gram–Schmidt process is computationally slower (exponentially slower) than the recursive algorithms described above; it is mainly of theoretical interest.
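As a sanity check, the smallest non-trivial case of the formal determinant can be expanded by hand. The worked step below is a sketch for a real inner product space, using the Gram determinant of the single vector v_1, that is, the inner product of v_1 with itself, and u_1 = v_1:

```latex
\mathbf{u}_2
  = \frac{1}{D_1}
    \begin{vmatrix}
      \langle \mathbf{v}_1, \mathbf{v}_1 \rangle & \langle \mathbf{v}_2, \mathbf{v}_1 \rangle \\
      \mathbf{v}_1 & \mathbf{v}_2
    \end{vmatrix}
  = \frac{\langle \mathbf{v}_1, \mathbf{v}_1 \rangle\, \mathbf{v}_2
          - \langle \mathbf{v}_2, \mathbf{v}_1 \rangle\, \mathbf{v}_1}
         {\langle \mathbf{v}_1, \mathbf{v}_1 \rangle}
  = \mathbf{v}_2
    - \frac{\langle \mathbf{v}_2, \mathbf{u}_1 \rangle}{\langle \mathbf{u}_1, \mathbf{u}_1 \rangle}\, \mathbf{u}_1 .
```

The last expression is the second vector minus its projection onto the first, which agrees with the recursive definition of the process.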

Expressed using geometric algebra

Expressed using notation used in geometric algebra, the unnormalized results of the Gram–Schmidt process can be expressed as

\mathbf{u}_k = \mathbf{v}_k - \sum_{j=1}^{k-1} (\mathbf{v}_k \cdot \mathbf{u}_j)\mathbf{u}_j^{-1}\ ,

which is equivalent to the expression using the

\operatorname{proj}

operator defined above. The results can equivalently be expressed as

\mathbf{u}_k = \mathbf{v}_{k}\wedge\mathbf{v}_{k-1}\wedge\cdots\wedge\mathbf{v}_{1}(\mathbf{v}_{k-1}\wedge\cdots\wedge\mathbf{v}_{1})^{-1},

which is closely related to the expression using determinants above.

Alternatives

Other orthogonalization algorithms use Householder transformations or Givens rotations. The algorithms using Householder transformations are more stable than the stabilized Gram–Schmidt process. On the other hand, the Gram–Schmidt process produces the ''j''th orthogonalized vector after the ''j''th iteration, while orthogonalization using Householder reflections produces all the vectors only at the end. This makes only the Gram–Schmidt process applicable for iterative methods like the Arnoldi iteration.

Yet another alternative is motivated by the use of Cholesky decomposition for inverting the matrix of the normal equations in linear least squares. Let ''V'' be a full column rank matrix, whose columns need to be orthogonalized. The matrix

V^* V

is Hermitian and positive definite, so it can be written as

V^* V = L L^*,

using the Cholesky decomposition. The lower triangular matrix ''L'' with strictly positive diagonal entries is invertible. Then the columns of the matrix

U = V\left(L^{-1}\right)^*

are orthonormal and span the same subspace as the columns of the original matrix ''V''. The explicit use of the product

V^* V

makes the algorithm unstable, especially if the product's condition number is large. Nevertheless, this algorithm is used in practice and implemented in some software packages because of its high efficiency and simplicity.

In quantum mechanics there are several orthogonalization schemes with characteristics better suited for certain applications than original Gram–Schmidt. Nevertheless, it remains a popular and effective algorithm for even the largest electronic structure calculations.
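The Cholesky-based alternative described above can be sketched for real matrices as follows. The code is an illustration only: the function name `cholesky_orthonormalize` is invented here, the input is given as a list of column vectors, the entries are assumed real (so the conjugate transpose is just the transpose), and the columns are assumed linearly independent.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cholesky_orthonormalize(columns):
    """Orthonormalize real, independent columns of V via V^T V = L L^T."""
    k = len(columns)
    gram = [[dot(columns[i], columns[j]) for j in range(k)] for i in range(k)]
    # Cholesky factor L: lower triangular with positive diagonal.
    L = [[0.0] * k for _ in range(k)]
    for i in range(k):
        for j in range(i + 1):
            s = gram[i][j] - sum(L[i][m] * L[j][m] for m in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    # U = V (L^{-1})^T means L U^T = V^T; solve by forward substitution.
    # Row i of U^T is the i-th orthonormal column of U.
    us = []
    for i in range(k):
        acc = [float(x) for x in columns[i]]
        for j in range(i):
            acc = [a - L[i][j] * b for a, b in zip(acc, us[j])]
        us.append([a / L[i][i] for a in acc])
    return us

# The two vectors of the Euclidean example, used as the columns of V.
us = cholesky_orthonormalize([[3, 1], [2, 2]])
```

Because the triangular factor has a positive diagonal, the result for this input agrees with the normalized vectors of the Euclidean example up to rounding. Forming the Gram matrix explicitly is what makes the approach sensitive to ill-conditioned inputs.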


External links

* Harvey Mudd College Math Tutorial on the Gram-Schmidt algorithm
* The entry "Gram-Schmidt orthogonalization" has some information and references on the origins of the method.
* Demos: Gram Schmidt process in plane and Gram Schmidt process in space
* Proof: Raymond Puzio, Keenan Kidwell. "proof of Gram-Schmidt orthogonalization algorithm" (version 8). PlanetMath.org. http://planetmath.org/ProofOfGramSchmidtOrthogonalizationProcedure