Block matrix pseudoinverse

In mathematics, a block matrix pseudoinverse is a formula for the pseudoinverse of a partitioned matrix. It is useful for decomposing or approximating many algorithms that update parameters in signal processing and that are based on the least squares method.


Derivation

Consider a column-wise partitioned matrix:
: \begin{bmatrix}\mathbf A & \mathbf B\end{bmatrix},\quad \mathbf A \in \reals^{m \times n},\quad \mathbf B \in \reals^{m \times p},\quad m \geq n + p.
If the above matrix is full column rank, the Moore–Penrose inverse matrices of it and its transpose are
:\begin{align}
  \begin{bmatrix}\mathbf A & \mathbf B\end{bmatrix}^+ &= \left( \begin{bmatrix}\mathbf A & \mathbf B\end{bmatrix}^\textsf{T} \begin{bmatrix}\mathbf A & \mathbf B\end{bmatrix} \right)^{-1} \begin{bmatrix}\mathbf A & \mathbf B\end{bmatrix}^\textsf{T}, \\
  \begin{bmatrix} \mathbf A^\textsf{T} \\ \mathbf B^\textsf{T} \end{bmatrix}^+ &= \begin{bmatrix}\mathbf A & \mathbf B\end{bmatrix} \left( \begin{bmatrix}\mathbf A & \mathbf B\end{bmatrix}^\textsf{T} \begin{bmatrix}\mathbf A & \mathbf B\end{bmatrix} \right)^{-1}.
\end{align}
This computation of the pseudoinverse requires (''n'' + ''p'')-square matrix inversion and does not take advantage of the block form. To reduce computational costs to ''n''- and ''p''-square matrix inversions and to introduce parallelism, treating the blocks separately, one derives
:\begin{align}
  \begin{bmatrix}\mathbf A & \mathbf B\end{bmatrix}^+ &= \begin{bmatrix}
    \left(\mathbf A^\textsf{T} \mathbf P_B^\perp \mathbf A\right)^{-1} \mathbf A^\textsf{T} \mathbf P_B^\perp \\
    \left(\mathbf B^\textsf{T} \mathbf P_A^\perp \mathbf B\right)^{-1} \mathbf B^\textsf{T} \mathbf P_A^\perp
  \end{bmatrix} = \begin{bmatrix}
    \left(\mathbf P_B^\perp \mathbf A\right)^+ \\
    \left(\mathbf P_A^\perp \mathbf B\right)^+
  \end{bmatrix}, \\
  \begin{bmatrix} \mathbf A^\textsf{T} \\ \mathbf B^\textsf{T} \end{bmatrix}^+ &= \begin{bmatrix}
    \mathbf P_B^\perp \mathbf A \left(\mathbf A^\textsf{T} \mathbf P_B^\perp \mathbf A\right)^{-1} & \mathbf P_A^\perp \mathbf B \left(\mathbf B^\textsf{T} \mathbf P_A^\perp \mathbf B\right)^{-1}
  \end{bmatrix} = \begin{bmatrix}
    \left(\mathbf A^\textsf{T} \mathbf P_B^\perp\right)^+ & \left(\mathbf B^\textsf{T} \mathbf P_A^\perp\right)^+
  \end{bmatrix},
\end{align}
where the orthogonal projection matrices are defined by
:\begin{align}
  \mathbf P_A^\perp &= \mathbf I - \mathbf A \left(\mathbf A^\textsf{T} \mathbf A\right)^{-1} \mathbf A^\textsf{T}, \\
  \mathbf P_B^\perp &= \mathbf I - \mathbf B \left(\mathbf B^\textsf{T} \mathbf B\right)^{-1} \mathbf B^\textsf{T}.
\end{align}
The above formulas are not necessarily valid if \begin{bmatrix}\mathbf A & \mathbf B\end{bmatrix} does not have full rank – for example, if \mathbf A \neq 0, then
: \begin{bmatrix}\mathbf A & \mathbf A\end{bmatrix}^+ = \frac{1}{2}\begin{bmatrix} \mathbf A^+ \\ \mathbf A^+ \end{bmatrix} \neq \begin{bmatrix} \left(\mathbf P_A^\perp \mathbf A\right)^+ \\ \left(\mathbf P_A^\perp \mathbf A\right)^+ \end{bmatrix} = 0.
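As a numerical sanity check of the block formula, the following sketch (assuming NumPy and arbitrarily chosen sizes m = 12, n = 4, p = 3; variable names are illustrative) builds the two projectors explicitly, assembles the block pseudoinverse of [A B], and compares it with numpy.linalg.pinv:

import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes with m >= n + p; random Gaussian blocks are full
# column rank with probability 1, so [A  B] satisfies the assumption.
m, n, p = 12, 4, 3
A = rng.standard_normal((m, n))
B = rng.standard_normal((m, p))
M = np.hstack([A, B])                       # the partitioned matrix [A  B]

I = np.eye(m)
# Projectors onto the orthogonal complements of range(A) and range(B).
P_A_perp = I - A @ np.linalg.inv(A.T @ A) @ A.T
P_B_perp = I - B @ np.linalg.inv(B.T @ B) @ B.T

# Block rows (P_B_perp A)^+ and (P_A_perp B)^+ of the pseudoinverse.
top = np.linalg.inv(A.T @ P_B_perp @ A) @ A.T @ P_B_perp
bottom = np.linalg.inv(B.T @ P_A_perp @ B) @ B.T @ P_A_perp
M_pinv_blocks = np.vstack([top, bottom])

print(np.allclose(M_pinv_blocks, np.linalg.pinv(M)))  # expected: True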


Application to least squares problems

Given the same matrices as above, we consider the following least squares problems, which appear as multiple objective optimizations or constrained problems in signal processing. These results make it possible to implement a parallel algorithm for least squares.


Column-wise partitioning in over-determined least squares

Suppose a solution \mathbf x = \begin{bmatrix} \mathbf x_1 \\ \mathbf x_2 \end{bmatrix} solves an over-determined system:
: \begin{bmatrix} \mathbf A & \mathbf B \end{bmatrix} \begin{bmatrix} \mathbf x_1 \\ \mathbf x_2 \end{bmatrix} = \mathbf d,\quad \mathbf d \in \reals^{m}.
Using the block matrix pseudoinverse, we have
:\mathbf x = \begin{bmatrix} \mathbf A & \mathbf B \end{bmatrix}^+\,\mathbf d = \begin{bmatrix} \left(\mathbf P_B^\perp \mathbf A\right)^+ \\ \left(\mathbf P_A^\perp \mathbf B\right)^+ \end{bmatrix}\mathbf d.
Therefore, we have a decomposed solution:
: \mathbf x_1 = \left(\mathbf P_B^\perp \mathbf A\right)^+\,\mathbf d,\quad \mathbf x_2 = \left(\mathbf P_A^\perp \mathbf B\right)^+\,\mathbf d.
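A minimal sketch of the decomposed solution, again assuming NumPy and illustrative sizes: the two blocks x1 and x2 are computed independently and then compared against an ordinary least squares solve of the full partitioned system.

import numpy as np

rng = np.random.default_rng(1)
m, n, p = 20, 3, 2                          # illustrative sizes, m >= n + p
A = rng.standard_normal((m, n))
B = rng.standard_normal((m, p))
d = rng.standard_normal(m)

I = np.eye(m)
P_A_perp = I - A @ np.linalg.inv(A.T @ A) @ A.T
P_B_perp = I - B @ np.linalg.inv(B.T @ B) @ B.T

# Decomposed solution: each block of x is obtained independently.
x1 = np.linalg.pinv(P_B_perp @ A) @ d
x2 = np.linalg.pinv(P_A_perp @ B) @ d

# Reference: ordinary least squares on the full partitioned system.
x_ref, *_ = np.linalg.lstsq(np.hstack([A, B]), d, rcond=None)
print(np.allclose(np.concatenate([x1, x2]), x_ref))   # expected: True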


Row-wise partitioning in under-determined least squares

Suppose a solution \mathbf x solves an under-determined system:
: \begin{bmatrix} \mathbf A^\textsf{T} \\ \mathbf B^\textsf{T} \end{bmatrix}\mathbf x = \begin{bmatrix} \mathbf e \\ \mathbf f \end{bmatrix},\quad \mathbf e \in \reals^{n},\quad \mathbf f \in \reals^{p}.
The minimum-norm solution is given by
:\mathbf x = \begin{bmatrix} \mathbf A^\textsf{T} \\ \mathbf B^\textsf{T} \end{bmatrix}^+\, \begin{bmatrix} \mathbf e \\ \mathbf f \end{bmatrix}.
Using the block matrix pseudoinverse, we have
: \mathbf x = \begin{bmatrix} \left(\mathbf A^\textsf{T}\mathbf P_B^\perp\right)^+ & \left(\mathbf B^\textsf{T}\mathbf P_A^\perp\right)^+ \end{bmatrix} \begin{bmatrix} \mathbf e \\ \mathbf f \end{bmatrix} = \left(\mathbf A^\textsf{T}\mathbf P_B^\perp\right)^+\,\mathbf e + \left(\mathbf B^\textsf{T}\mathbf P_A^\perp\right)^+\,\mathbf f.
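The row-wise (minimum-norm) case can be checked the same way. The sketch below, under the same NumPy assumptions, assembles x from the two column blocks and compares it with the pseudoinverse of the stacked coefficient matrix:

import numpy as np

rng = np.random.default_rng(2)
m, n, p = 10, 3, 2                          # illustrative sizes, m >= n + p
A = rng.standard_normal((m, n))
B = rng.standard_normal((m, p))
e = rng.standard_normal(n)
f = rng.standard_normal(p)

I = np.eye(m)
P_A_perp = I - A @ np.linalg.inv(A.T @ A) @ A.T
P_B_perp = I - B @ np.linalg.inv(B.T @ B) @ B.T

# Minimum-norm solution assembled from the two column blocks.
x = np.linalg.pinv(A.T @ P_B_perp) @ e + np.linalg.pinv(B.T @ P_A_perp) @ f

# Reference: pseudoinverse of the stacked coefficient matrix.
C = np.vstack([A.T, B.T])
x_ref = np.linalg.pinv(C) @ np.concatenate([e, f])
print(np.allclose(x, x_ref))                # expected: True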


Comments on matrix inversion

Instead of \left(\begin{bmatrix}\mathbf A & \mathbf B\end{bmatrix}^\textsf{T} \begin{bmatrix}\mathbf A & \mathbf B\end{bmatrix}\right)^{-1}, we need to calculate directly or indirectly
: \left(\mathbf A^\textsf{T} \mathbf A\right)^{-1},\quad \left(\mathbf B^\textsf{T} \mathbf B\right)^{-1},\quad \left(\mathbf A^\textsf{T} \mathbf P_B^\perp \mathbf A\right)^{-1},\quad \left(\mathbf B^\textsf{T} \mathbf P_A^\perp \mathbf B\right)^{-1}.
In a dense and small system, we can use singular value decomposition, QR decomposition, or Cholesky decomposition to replace the matrix inversions with numerical routines. In a large system, we may employ iterative methods such as Krylov subspace methods. Considering parallel algorithms, we can compute \left(\mathbf A^\textsf{T} \mathbf A\right)^{-1} and \left(\mathbf B^\textsf{T} \mathbf B\right)^{-1} in parallel, and then compute \left(\mathbf A^\textsf{T} \mathbf P_B^\perp \mathbf A\right)^{-1} and \left(\mathbf B^\textsf{T} \mathbf P_A^\perp \mathbf B\right)^{-1}, also in parallel.
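As an illustration of this two-stage pattern, the following sketch (assuming NumPy and SciPy; the helpers apply_P_A_perp and apply_P_B_perp are hypothetical names introduced here) replaces the explicit inverses with Cholesky factorizations and triangular solves, never forms the m-by-m projectors, and keeps the A-side and B-side work independent so each stage could run in parallel:

import numpy as np
from scipy.linalg import cho_factor, cho_solve

rng = np.random.default_rng(3)
m, n, p = 200, 20, 15                       # illustrative sizes, m >= n + p
A = rng.standard_normal((m, n))
B = rng.standard_normal((m, p))
d = rng.standard_normal(m)

# Stage 1 (independent, parallelizable): factor the Gram matrices.
cA = cho_factor(A.T @ A)
cB = cho_factor(B.T @ B)

# Apply the projectors without forming them as m-by-m matrices.
def apply_P_B_perp(X):
    return X - B @ cho_solve(cB, B.T @ X)

def apply_P_A_perp(X):
    return X - A @ cho_solve(cA, A.T @ X)

# Stage 2 (independent, parallelizable): factor the projected Gram matrices.
cPA = cho_factor(A.T @ apply_P_B_perp(A))
cPB = cho_factor(B.T @ apply_P_A_perp(B))

# Decomposed least squares solution of [A  B] x = d, no explicit inverses.
x1 = cho_solve(cPA, A.T @ apply_P_B_perp(d))
x2 = cho_solve(cPB, B.T @ apply_P_A_perp(d))

x_ref, *_ = np.linalg.lstsq(np.hstack([A, B]), d, rcond=None)
print(np.allclose(np.concatenate([x1, x2]), x_ref))   # expected: True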



External links


* The Matrix Reference Manual
* John Burkardt
* The Matrix Cookbook, by Kaare Brandt Petersen
* Lecture 8: Least-norm solutions of underdetermined equations, by Stephen P. Boyd

Categories: Numerical linear algebra; Matrix theory