Blahut–Arimoto Algorithm

The term Blahut–Arimoto algorithm is often used to refer to a class of algorithms for computing numerically either the information theoretic capacity of a channel, the rate-distortion function of a source, or a source encoding (i.e. compression to remove the redundancy). They are iterative algorithms that eventually converge to one of the maxima of the optimization problem that is associated with these information theoretic concepts.


History and application

For the case of channel capacity, the algorithm was independently invented by Suguru Arimoto and Richard Blahut. In addition, Blahut's treatment gives algorithms for computing rate distortion and generalized capacity with input constraints (i.e. the capacity-cost function, analogous to rate-distortion). These algorithms are most applicable to the case of arbitrary finite alphabet sources. Much work has been done to extend them to more general problem instances. Recently, a version of the algorithm that accounts for continuous and multivariate outputs was proposed, with applications in cellular signaling. There also exists a version of the Blahut–Arimoto algorithm for directed information.


Algorithm for Channel Capacity

A discrete memoryless channel (DMC) can be specified using two random variables X, Y with alphabets \mathcal X, \mathcal Y, and a channel law as a conditional probability distribution p(y|x). The channel capacity, defined as

:C := \sup_{p(x)} I(X;Y),

indicates the maximum efficiency with which a channel can communicate, in units of bits per use. Now if we denote the cardinalities |\mathcal X| = n, |\mathcal Y| = m, then p_{Y|X} is an n \times m matrix, whose i^{th} row, j^{th} column entry we denote by w_{ij} = p(y_j|x_i). Arimoto and Blahut independently found the following expression for the capacity of a DMC with this channel law:

:C = \max_{\mathbf p} \max_{Q} \sum_{i=1}^n \sum_{j=1}^m p_i w_{ij} \log\left(\dfrac{q_{ji}}{p_i}\right)

where \mathbf p and Q are maximized over the following requirements:

* \mathbf p is a probability distribution on X. That is, if we write \mathbf p as (p_1, p_2, \dots, p_n), then \sum_{i=1}^n p_i = 1.
* Q = (q_{ji}) is an m \times n matrix that behaves like a transition matrix from Y to X with respect to the channel law. That is, for all 1 \leq i \leq n, 1 \leq j \leq m:
** q_{ji} \geq 0, and q_{ji} = 0 \Leftrightarrow w_{ij} = 0;
** every row sums up to 1, i.e. \sum_{i=1}^n q_{ji} = 1.

Then upon picking a random probability distribution \mathbf p^0 := (p^0_1, p^0_2, \dots, p^0_n) on X, we can generate a sequence (\mathbf p^0, Q^0, \mathbf p^1, Q^1, \dots) iteratively as follows:

:q^t_{ji} := \dfrac{p^t_i w_{ij}}{\sum_{k=1}^n p^t_k w_{kj}}

:p^{t+1}_i := \dfrac{\prod_{j=1}^m \left(q^t_{ji}\right)^{w_{ij}}}{\sum_{k=1}^n \prod_{j=1}^m \left(q^t_{jk}\right)^{w_{kj}}}

for t = 0, 1, 2, \dots. Then, using the theory of optimization, specifically
coordinate descent, Yeung showed that the sequence indeed converges to the required maximum. That is,

:\lim_{t \to \infty} \sum_{i=1}^n \sum_{j=1}^m p^t_i w_{ij} \log\left(\dfrac{q^t_{ji}}{p^t_i}\right) = C.

So given a channel law p(y|x), the capacity can be numerically estimated up to arbitrary precision.
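
As a concrete illustration, the iteration above can be carried out in a few lines of NumPy. The sketch below assumes the channel law is supplied as a row-stochastic n \times m array W with W[i, j] = w_{ij}; the function name ba_capacity, the uniform starting distribution, the tolerance, and the stopping rule are illustrative choices rather than part of the original formulations.

    import numpy as np

    def ba_capacity(W, tol=1e-12, max_iter=10000):
        # W[i, j] = w_ij = p(y_j | x_i); rows of W must sum to 1, and every
        # output symbol is assumed reachable (no all-zero column in W)
        n, m = W.shape
        p = np.full(n, 1.0 / n)                 # p^0: uniform input distribution
        for _ in range(max_iter):
            # q^t_ji = p_i w_ij / sum_k p_k w_kj, stored transposed as qT[i, j]
            qT = p[:, None] * W
            qT = qT / qT.sum(axis=0, keepdims=True)
            # p^{t+1}_i proportional to prod_j (q^t_ji)^{w_ij}; computed in log
            # space, using the convention 0 * log 0 = 0 (w_ij = 0 forces q_ji = 0)
            log_r = (W * np.log(np.where(qT > 0, qT, 1.0))).sum(axis=1)
            p_new = np.exp(log_r - log_r.max())
            p_new = p_new / p_new.sum()
            converged = np.abs(p_new - p).max() < tol
            p = p_new
            if converged:
                break
        # evaluate the capacity expression at the final (p, Q), in bits per use
        qT = p[:, None] * W
        qT = qT / qT.sum(axis=0, keepdims=True)
        terms = np.where(W > 0, W * np.log2(np.where(W > 0, qT, 1.0) / p[:, None]), 0.0)
        return float((p[:, None] * terms).sum()), p

    # Example: binary symmetric channel with crossover probability 0.1;
    # the estimate should approach 1 - H_b(0.1), about 0.531 bits per use.
    W = np.array([[0.9, 0.1],
                  [0.1, 0.9]])
    C, p_opt = ba_capacity(W)
    print(C)      # about 0.5310
    print(p_opt)  # about [0.5, 0.5]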


Algorithm for Rate-Distortion

Suppose we have a source X with probability p(x) of any given symbol. We wish to find an encoding p(\hat x|x) that generates a compressed signal \hat x from the original signal while minimizing the expected distortion \langle d(x,\hat x) \rangle, where the expectation is taken over the joint probability of X and \hat X. We can find an encoding that minimizes the rate-distortion functional locally by repeating the following iteration until convergence:

:p_{t+1}(\hat x) = \sum_{x} p(x) p_t(\hat x|x)

:p_{t+1}(\hat x|x) = \dfrac{p_{t+1}(\hat x) \exp(-\beta d(x,\hat x))}{\sum_{\hat x'} p_{t+1}(\hat x') \exp(-\beta d(x,\hat x'))}

where \beta is a parameter related to the slope in the rate-distortion curve that we are targeting, and is thus related to how much we favor compression versus distortion (higher \beta means less compression).
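
Again as a sketch, the two update equations map directly onto NumPy array operations. In the code below, px is the source distribution p(x), d[x, xh] is the distortion matrix, and beta is the trade-off parameter; the function name ba_rate_distortion, the uniform initial encoding, and the convergence test are illustrative assumptions.

    import numpy as np

    def ba_rate_distortion(px, d, beta, tol=1e-10, max_iter=10000):
        # px[x]: source probabilities; d[x, xh]: distortion between source
        # symbol x and reproduction symbol xh; beta: slope parameter
        n, k = d.shape
        cond = np.full((n, k), 1.0 / k)   # p_0(xhat | x): uniform starting encoding
        A = np.exp(-beta * d)             # Boltzmann factors exp(-beta d(x, xhat))
        for _ in range(max_iter):
            q = px @ cond                 # p_{t+1}(xhat) = sum_x p(x) p_t(xhat | x)
            new = q[None, :] * A          # p_{t+1}(xhat | x), before normalization
            new = new / new.sum(axis=1, keepdims=True)
            converged = np.abs(new - cond).max() < tol
            cond = new
            if converged:
                break
        q = px @ cond
        D = float((px[:, None] * cond * d).sum())    # expected distortion <d(x, xhat)>
        R = float((px[:, None] * cond * np.log2(cond / q[None, :])).sum())  # I(X; Xhat), bits
        return R, D

    # Example: uniform binary source with Hamming distortion; sweeping beta
    # traces out points on the curve R(D) = 1 - H_b(D) for 0 <= D <= 1/2.
    px = np.array([0.5, 0.5])
    d = np.array([[0.0, 1.0],
                  [1.0, 0.0]])
    R, D = ba_rate_distortion(px, d, beta=2.0)
    print(R, D)   # one (rate, distortion) point at the slope set by beta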

