The Viterbi algorithm is a
dynamic programming
Dynamic programming is both a mathematical optimization method and a computer programming method. The method was developed by Richard Bellman in the 1950s and has found applications in numerous fields, from aerospace engineering to economics.
...
algorithm
In mathematics and computer science, an algorithm () is a finite sequence of rigorous instructions, typically used to solve a class of specific problems or to perform a computation. Algorithms are used as specifications for performing ...
for obtaining the
maximum a posteriori probability estimate of the most
likely sequence of hidden states—called the Viterbi path—that results in a sequence of observed events, especially in the context of
Markov information sources and
hidden Markov model
A hidden Markov model (HMM) is a statistical Markov model in which the system being modeled is assumed to be a Markov process — call it X — with unobservable ("''hidden''") states. As part of the definition, HMM requires that there be an ...
s (HMM).
The algorithm has found universal application in decoding the
convolutional code
In telecommunication, a convolutional code is a type of error-correcting code that generates parity symbols via the sliding application of a boolean polynomial function to a data stream. The sliding application represents the 'convolution' of th ...
s used in both
CDMA
Code-division multiple access (CDMA) is a channel access method used by various radio communication technologies. CDMA is an example of multiple access, where several transmitters can send information simultaneously over a single communicatio ...
and
GSM
The Global System for Mobile Communications (GSM) is a standard developed by the European Telecommunications Standards Institute (ETSI) to describe the protocols for second-generation ( 2G) digital cellular networks used by mobile devices such ...
digital cellular,
dial-up
Dial-up Internet access is a form of Internet access that uses the facilities of the public switched telephone network (PSTN) to establish a connection to an Internet service provider (ISP) by dialing a telephone number on a conventional telepho ...
modems, satellite, deep-space communications, and
802.11
IEEE 802.11 is part of the IEEE 802 set of local area network (LAN) technical standards, and specifies the set of media access control (MAC) and physical layer (PHY) protocols for implementing wireless local area network (WLAN) computer com ...
wireless LANs. It is now also commonly used in
speech recognition
Speech recognition is an interdisciplinary subfield of computer science and computational linguistics that develops methodologies and technologies that enable the recognition and translation of spoken language into text by computers with the ...
,
speech synthesis
Speech synthesis is the artificial production of human speech. A computer system used for this purpose is called a speech synthesizer, and can be implemented in software or hardware products. A text-to-speech (TTS) system converts normal langua ...
,
diarization
Speaker diarisation ( or diarization) is the process of partitioning an audio stream containing human speech into homogeneous segments according to the identity of each speaker. It can enhance the readability of an automatic speech transcription b ...
,
keyword spotting Keyword spotting (or more simply, word spotting) is a problem that was historically first defined in the context of speech processing.
In speech processing, keyword spotting deals with the identification of keywords in utterances.
Keyword spottin ...
,
computational linguistics
Computational linguistics is an interdisciplinary field concerned with the computational modelling of natural language, as well as the study of appropriate computational approaches to linguistic questions. In general, computational linguistics ...
, and
bioinformatics
Bioinformatics () is an interdisciplinary field that develops methods and software tools for understanding biological data, in particular when the data sets are large and complex. As an interdisciplinary field of science, bioinformatics combi ...
. For example, in
speech-to-text (speech recognition), the acoustic signal is treated as the observed sequence of events, and a string of text is considered to be the "hidden cause" of the acoustic signal. The Viterbi algorithm finds the most likely string of text given the acoustic signal.
History
The Viterbi algorithm is named after
Andrew Viterbi
Andrew James Viterbi (born Andrea Giacomo Viterbi, March 9, 1935) is an American electrical engineer and businessman who co-founded Qualcomm Inc. and invented the Viterbi algorithm. He is the Presidential Chair Professor of Electrical Engineer ...
, who proposed it in 1967 as a decoding algorithm for
convolutional codes
In telecommunication, a convolutional code is a type of error-correcting code that generates parity symbols via the sliding application of a boolean polynomial function to a data stream. The sliding application represents the 'convolution' of th ...
over noisy digital communication links. It has, however, a history of
multiple invention, with at least seven independent discoveries, including those by Viterbi,
Needleman and Wunsch, and
Wagner and Fischer.
It was introduced to
Natural Language Processing
Natural language processing (NLP) is an interdisciplinary subfield of linguistics, computer science, and artificial intelligence concerned with the interactions between computers and human language, in particular how to program computers to proc ...
as a method of
part-of-speech tagging as early as 1987.
''Viterbi path'' and ''Viterbi algorithm'' have become standard terms for the application of dynamic programming algorithms to maximization problems involving probabilities.
For example, in
statistical parsing a dynamic programming algorithm can be used to discover the single most likely context-free derivation (parse) of a string, which is commonly called the "Viterbi parse". Another application is in
target tracking, where the track is computed that assigns a maximum likelihood to a sequence of observations.
Extensions
A generalization of the Viterbi algorithm, termed the ''max-sum algorithm'' (or ''max-product algorithm'') can be used to find the most likely assignment of all or some subset of
latent variables in a large number of
graphical models, e.g.
Bayesian network
A Bayesian network (also known as a Bayes network, Bayes net, belief network, or decision network) is a probabilistic graphical model that represents a set of variables and their conditional dependencies via a directed acyclic graph (DAG). Ba ...
s,
Markov random fields and
conditional random fields. The latent variables need, in general, to be connected in a way somewhat similar to a
hidden Markov model
A hidden Markov model (HMM) is a statistical Markov model in which the system being modeled is assumed to be a Markov process — call it X — with unobservable ("''hidden''") states. As part of the definition, HMM requires that there be an ...
(HMM), with a limited number of connections between variables and some type of linear structure among the variables. The general algorithm involves ''message passing'' and is substantially similar to the
belief propagation
A belief is an attitude that something is the case, or that some proposition is true. In epistemology, philosophers use the term "belief" to refer to attitudes about the world which can be either true or false. To believe something is to take i ...
algorithm (which is the generalization of the
forward-backward algorithm).
With the algorithm called
iterative Viterbi decoding Iterative Viterbi decoding is an algorithm that spots the subsequence ''S'' of an observation ''O'' = having the highest average probability (i.e., probability scaled by the length of ''S'') of being generated by a given hidden Markov model ''M'' w ...
one can find the subsequence of an observation that matches best (on average) to a given hidden Markov model. This algorithm is proposed by Qi Wang et al. to deal with
turbo code
In information theory, turbo codes (originally in French ''Turbocodes'') are a class of high-performance forward error correction (FEC) codes developed around 1990–91, but first published in 1993. They were the first practical codes to closel ...
. Iterative Viterbi decoding works by iteratively invoking a modified Viterbi algorithm, reestimating the score for a filler until convergence.
An alternative algorithm, the
Lazy Viterbi algorithm
Lazy is the adjective for laziness, a lack of desire to expend effort.
It may also refer to:
Music Groups and musicians
* Lazy (band), a Japanese rock band
* Lazy Lester, American blues harmonica player Leslie Johnson (1933–2018)
* Lazy Bill ...
, has been proposed. For many applications of practical interest, under reasonable noise conditions, the lazy decoder (using Lazy Viterbi algorithm) is much faster than the original
Viterbi decoder
A Viterbi decoder uses the Viterbi algorithm for decoding a bitstream that has been
encoded using a convolutional code or trellis code.
There are other algorithms for decoding a convolutionally encoded stream (for example, the Fano algorithm). T ...
(using Viterbi algorithm). While the original Viterbi algorithm calculates every node in the
trellis of possible outcomes, the Lazy Viterbi algorithm maintains a prioritized list of nodes to evaluate in order, and the number of calculations required is typically fewer (and never more) than the ordinary Viterbi algorithm for the same result. However, it is not so easy to parallelize in hardware.
Pseudocode
This algorithm generates a path
, which is a sequence of states
that generate the observations
with
, where
is the number of possible observations in the observation space
.
Two 2-dimensional tables of size
are constructed:
* Each element