For
parsing algorithms in
computer science
Computer science is the study of computation, automation, and information. Computer science spans theoretical disciplines (such as algorithms, theory of computation, information theory, and automation) to practical disciplines (includin ...
, the inside–outside algorithm is a way of re-estimating production probabilities in a
probabilistic context-free grammar Grammar theory to model symbol strings originated from work in computational linguistics aiming to understand the structure of natural languages. Probabilistic context free grammars (PCFGs) have been applied in probabilistic modeling of RNA structur ...
. It was introduced by
James K. Baker
James K. Baker and his wife Janet M. Baker are the co-founders of Dragon Systems. Together they are credited with the creation of Dragon NaturallySpeaking.
James Baker is an expert in speech recognition technology and a Distinguished Career ...
in 1979 as a generalization of the
forward–backward algorithm
The forward–backward algorithm is an inference algorithm for hidden Markov models which computes the posterior marginals of all hidden state variables given a sequence of observations/emissions o_:= o_1,\dots,o_T, i.e. it computes, for all hi ...
for parameter estimation on
hidden Markov model
A hidden Markov model (HMM) is a statistical Markov model in which the system being modeled is assumed to be a Markov process — call it X — with unobservable ("''hidden''") states. As part of the definition, HMM requires that there be an ob ...
s to
stochastic context-free grammar Grammar theory to model symbol strings originated from work in computational linguistics aiming to understand the structure of natural languages. Probabilistic context free grammars (PCFGs) have been applied in probabilistic modeling of RNA struct ...
s. It is used to compute expectations, for example as part of the
expectation–maximization algorithm
In statistics, an expectation–maximization (EM) algorithm is an iterative method to find (local) maximum likelihood or maximum a posteriori (MAP) estimates of parameters in statistical models, where the model depends on unobserved latent variab ...
(an unsupervised learning algorithm).
Inside and outside probabilities
The inside probability
is the total probability of generating words
, given the root nonterminal
and a grammar
:
:
The outside probability
is the total probability of beginning with the start symbol
and generating the nonterminal
and all the words outside
, given a grammar
:
:
Computing inside probabilities
Base Case:
General case:
Suppose there is a rule
in the grammar, then the probability of generating
starting with a subtree rooted at
is:
The inside probability
is just the sum over all such possible rules:
Computing outside probabilities
Base Case:
Here the start symbol is
.
General case:
Suppose there is a rule
in the grammar that generates
.
Then the ''left'' contribution of that rule to the outside probability
is:
Now suppose there is a rule
in the grammar. Then the ''right''
contribution of that rule to the outside probability
is:
The outside probability
is the sum of the left and right
contributions over all such rules:
References
* J. Baker (1979)
Trainable grammars for speech recognition In J. J. Wolf and D. H. Klatt, editors, ''Speech communication papers presented at the 97th meeting of the Acoustical Society of America'', pages 547–550, Cambridge, MA, June 1979. MIT.
*
Karim Lari,
Steve J. Young (1990)
The estimation of stochastic context-free grammars using the inside–outside algorithm ''Computer Speech and Language'', 4:35–56.
*
Karim Lari,
Steve J. Young (1991)
Applications of stochastic context-free grammars using the Inside–Outside algorithm ''Computer Speech and Language'', 5:237–257.
* Fernando Pereira, Yves Schabes (1992)
Inside–outside reestimation from partially bracketed corpora ''Proceedings of the 30th annual meeting on Association for Computational Linguistics, Association for Computational Linguistics'', 128–135.
External links
Inside-outside algorithm - Fei XiaThe Inside-Outside Algorithm - Michael Collins
{{DEFAULTSORT:Inside-outside algorithm
Parsing algorithms