Uzawa Iteration
In numerical mathematics, the Uzawa iteration is an algorithm for solving saddle point problems. It is named after Hirofumi Uzawa and was originally introduced in the context of concave programming.


Basic idea

We consider a saddle point problem of the form

: \begin{pmatrix} A & B \\ B^* & 0 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} b_1 \\ b_2 \end{pmatrix},

where A is a symmetric positive-definite matrix. Multiplying the first row by B^* A^{-1} and subtracting it from the second row yields the upper-triangular system

: \begin{pmatrix} A & B \\ 0 & -S \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} b_1 \\ b_2 - B^* A^{-1} b_1 \end{pmatrix},

where S := B^* A^{-1} B denotes the Schur complement. Since S is symmetric positive-definite, we can apply standard iterative methods like gradient descent or the conjugate gradient method to

: S x_2 = B^* A^{-1} b_1 - b_2

in order to compute x_2. The vector x_1 can then be reconstructed by solving

: A x_1 = b_1 - B x_2.

It is possible to update x_1 alongside x_2 during the iteration for the Schur complement system and thus obtain an efficient algorithm.
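The elimination above can be illustrated on a small dense example in NumPy (a minimal sketch: the matrices are made-up demonstration data, and forming S explicitly is only reasonable at toy sizes — in practice S is applied matrix-free):

```python
import numpy as np

# Made-up illustrative data: A is symmetric positive-definite and B has
# full column rank, so the Schur complement S is also symmetric
# positive-definite.
A = np.array([[4.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])
B = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])
b1 = np.array([1.0, 2.0, 3.0])
b2 = np.array([1.0, 1.0])

# Schur complement S = B^T A^{-1} B (np.linalg.solve applied to a matrix
# right-hand side solves column by column).
S = B.T @ np.linalg.solve(A, B)

# Reduced system S x2 = B^T A^{-1} b1 - b2, then back-substitution for x1.
x2 = np.linalg.solve(S, B.T @ np.linalg.solve(A, b1) - b2)
x1 = np.linalg.solve(A, b1 - B @ x2)

# Cross-check against solving the full saddle point system directly.
K = np.block([[A, B], [B.T, np.zeros((2, 2))]])
full = np.linalg.solve(K, np.concatenate([b1, b2]))
print(np.allclose(np.concatenate([x1, x2]), full))  # True
```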


Implementation

We start the conjugate gradient iteration by computing the residual

: r_2 := B^* A^{-1} b_1 - b_2 - S x_2 = B^* A^{-1} (b_1 - B x_2) - b_2 = B^* x_1 - b_2

of the Schur complement system, where

: x_1 := A^{-1} (b_1 - B x_2)

denotes the upper half of the solution vector matching the initial guess x_2 for its lower half. We complete the initialization by choosing the first search direction

: p_2 := r_2.

In each step, we compute

: a_2 := S p_2 = B^* A^{-1} B p_2 = B^* p_1

and keep the intermediate result

: p_1 := A^{-1} B p_2

for later. The scaling factor is given by

: \alpha := p_2^* r_2 / p_2^* a_2

and leads to the updates

: x_2 := x_2 + \alpha p_2, \quad r_2 := r_2 - \alpha a_2.

Using the intermediate result p_1 saved earlier, we can also update the upper part of the solution vector

: x_1 := x_1 - \alpha p_1.

Now we only have to construct the new search direction by the Gram–Schmidt process, i.e.,

: \beta := r_2^* a_2 / p_2^* a_2, \quad p_2 := r_2 - \beta p_2.

The iteration terminates if the residual r_2 has become sufficiently small, or if the norm of p_2 is significantly smaller than that of r_2, indicating that the Krylov subspace has been almost exhausted.
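A direct transcription of these steps into NumPy might look as follows (a sketch only: the dense `np.linalg.solve` stands in for whatever fast solver for A is actually available, and the test matrices at the end are made-up data, not from the source):

```python
import numpy as np

def uzawa_cg(A, B, b1, b2, x2, tol=1e-10, maxiter=100):
    """Conjugate gradients on the Schur complement S = B^T A^{-1} B,
    updating x1 alongside x2 so that S is never formed explicitly."""
    solve_A = lambda v: np.linalg.solve(A, v)  # placeholder for a fast A-solver
    x1 = solve_A(b1 - B @ x2)         # upper half matching the guess x2
    r2 = B.T @ x1 - b2                # residual of the Schur complement system
    p2 = r2.copy()                    # first search direction
    for _ in range(maxiter):
        if np.linalg.norm(r2) < tol:
            break
        p1 = solve_A(B @ p2)          # intermediate result p1 = A^{-1} B p2
        a2 = B.T @ p1                 # a2 = S p2, computed matrix-free
        alpha = (p2 @ r2) / (p2 @ a2) # scaling factor
        x2 = x2 + alpha * p2
        r2 = r2 - alpha * a2
        x1 = x1 - alpha * p1          # keep the upper half consistent
        beta = (r2 @ a2) / (p2 @ a2)  # Gram-Schmidt step for the new direction
        p2 = r2 - beta * p2
    return x1, x2

# Made-up test problem: A symmetric positive-definite, B of full column rank.
A = np.array([[4.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 2.0]])
B = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
b1 = np.array([1.0, 2.0, 3.0])
b2 = np.array([1.0, 1.0])
x1, x2 = uzawa_cg(A, B, b1, b2, x2=np.zeros(2))
print(np.linalg.norm(A @ x1 + B @ x2 - b1), np.linalg.norm(B.T @ x1 - b2))
```

Because the Schur complement here is 2×2, conjugate gradients converges in at most two steps in exact arithmetic; both residual norms printed at the end should be near machine precision.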


Modifications and extensions

If solving the linear system A x=b exactly is not feasible, inexact solvers can be applied. If the Schur complement system is ill-conditioned, preconditioners can be employed to improve the speed of convergence of the underlying gradient method. Inequality constraints can be incorporated, e.g., in order to handle obstacle problems.
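The preconditioned variant can be sketched as a classical Uzawa gradient iteration (an illustration under assumptions: the step size `omega`, the callback `precond_solve`, and the demonstration matrices are choices made for this example, not part of the original method description):

```python
import numpy as np

def uzawa_preconditioned(A, B, b1, b2, x2, precond_solve, omega=1.0,
                         tol=1e-8, maxiter=500):
    """Classical Uzawa gradient iteration with an (approximate) Schur
    complement preconditioner: precond_solve(r) should approximate S^{-1} r.
    A sketch only; the inner solve with A could itself be inexact."""
    for _ in range(maxiter):
        x1 = np.linalg.solve(A, b1 - B @ x2)    # inner solve (exact here)
        r2 = B.T @ x1 - b2                      # Schur complement residual
        if np.linalg.norm(r2) < tol:
            return x1, x2
        x2 = x2 + omega * precond_solve(r2)     # preconditioned gradient step
    return np.linalg.solve(A, b1 - B @ x2), x2  # keep x1 consistent with x2

# Made-up demonstration data: A symmetric positive-definite, B full column rank.
A = np.array([[4.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 2.0]])
B = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
b1 = np.array([1.0, 2.0, 3.0])
b2 = np.array([1.0, 1.0])

# Identity "preconditioner": plain gradient descent on the Schur system.
x1, x2 = uzawa_preconditioned(A, B, b1, b2, np.zeros(2), lambda r: r)
print(np.linalg.norm(B.T @ x1 - b2))
```

With the identity in place of a real preconditioner, convergence requires the step size to satisfy omega < 2/λ_max(S); a good approximation of S^{-1} brings the contraction factor close to zero.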


Further reading

* Chen, Zhangxin (2006). "Linear System Solution Techniques". ''Finite Element Methods and Their Applications''. Berlin: Springer. pp. 145–154. ISBN 978-3-540-28078-1.