HOME

TheInfoList



OR:

Statistical parsing is a group of
parsing Parsing, syntax analysis, or syntactic analysis is the process of analyzing a string of symbols, either in natural language, computer languages or data structures, conforming to the rules of a formal grammar. The term ''parsing'' comes from Lati ...
methods within
natural language processing Natural language processing (NLP) is an interdisciplinary subfield of linguistics, computer science, and artificial intelligence concerned with the interactions between computers and human language, in particular how to program computers to pro ...
. The methods have in common that they associate
grammar In linguistics, the grammar of a natural language is its set of structure, structural constraints on speakers' or writers' composition of clause (linguistics), clauses, phrases, and words. The term can also refer to the study of such constraint ...
rules with a probability. Grammar rules are traditionally viewed in
computational linguistics Computational linguistics is an Interdisciplinarity, interdisciplinary field concerned with the computational modelling of natural language, as well as the study of appropriate computational approaches to linguistic questions. In general, comput ...
as defining the valid sentences in a language. Within this mindset, the idea of associating each rule with a probability then provides the relative frequency of any given grammar rule and, by deduction, the probability of a complete parse for a sentence. (The probability associated with a grammar rule may be induced, but the application of that grammar rule within a parse tree and the computation of the probability of the parse tree based on its component rules is a form of deduction.) Using this concept, statistical parsers make use of a procedure to search over a space of all candidate parses, and the computation of each candidate's probability, to derive the most probable parse of a sentence. The
Viterbi algorithm The Viterbi algorithm is a dynamic programming algorithm for obtaining the maximum a posteriori probability estimate of the most likely sequence of hidden states—called the Viterbi path—that results in a sequence of observed events, especiall ...
is one popular method of searching for the most probable parse. "Search" in this context is an application of
search algorithm In computer science, a search algorithm is an algorithm designed to solve a search problem. Search algorithms work to retrieve information stored within particular data structure, or calculated in the search space of a problem domain, with eith ...
s in
artificial intelligence Artificial intelligence (AI) is intelligence—perceiving, synthesizing, and inferring information—demonstrated by machines, as opposed to intelligence displayed by animals and humans. Example tasks in which this is done include speech re ...
. As an example, think about the sentence "The can can hold water". A reader would instantly see that there is an object called "the can" and that this object is performing the action 'can' (i.e. is able to); and the thing the object is able to do is "hold"; and the thing the object is able to hold is "water". Using more linguistic terminology, "The can" is a noun phrase composed of a determiner followed by a noun, and "can hold water" is a verb phrase which is itself composed of a verb followed by a verb phrase. But is this the only interpretation of the sentence? Certainly "The can can" is a perfectly valid noun-phrase referring to a type of dance, and "hold water" is also a valid verb-phrase, although the coerced meaning of the combined sentence is non-obvious. This lack of meaning is not seen as a problem by most linguists (for a discussion on this point, see
Colorless green ideas sleep furiously ''Colorless green ideas sleep furiously'' is a sentence composed by Noam Chomsky in his 1957 book ''Syntactic Structures'' as an example of a sentence that is grammatically well-formed, but semantically nonsensical. The sentence was originally ...
) but from a pragmatic point of view it is desirable to obtain the first interpretation rather than the second and statistical parsers achieve this by ranking the interpretations based on their probability. (In this example various assumptions about the
grammar In linguistics, the grammar of a natural language is its set of structure, structural constraints on speakers' or writers' composition of clause (linguistics), clauses, phrases, and words. The term can also refer to the study of such constraint ...
have been made, such as a simple left-to-right derivation rather than head-driven, its use of noun-phrases rather than the currently fashionable determiner-phrases, and no type-check preventing a concrete noun being combined with an abstract verb phrase. None of these assumptions affect the thesis of the argument and a comparable argument can be made using any other grammatical formalism.) There are a number of methods that statistical parsing algorithms frequently use. While few algorithms will use all of these they give a good overview of the general field. Most statistical parsing algorithms are based on a modified form of
chart parsing In computer science, a chart parser is a type of parser suitable for ambiguous grammars (including grammars of natural languages). It uses the dynamic programming approach—partial hypothesized results are stored in a structure called a chart and ...
. The modifications are necessary to support an extremely large number of grammatical rules and therefore search space, and essentially involve applying classical
artificial intelligence Artificial intelligence (AI) is intelligence—perceiving, synthesizing, and inferring information—demonstrated by machines, as opposed to intelligence displayed by animals and humans. Example tasks in which this is done include speech re ...
algorithms to the traditionally exhaustive search. Some examples of the optimisations are only searching a likely subset of the search space ( stack search), for optimising the search probability ( Baum-Welch algorithm) and for discarding parses that are too similar to be treated separately (
Viterbi algorithm The Viterbi algorithm is a dynamic programming algorithm for obtaining the maximum a posteriori probability estimate of the most likely sequence of hidden states—called the Viterbi path—that results in a sequence of observed events, especiall ...
).


Notable people in statistical parsing

*
Eugene Charniak Eugene Charniak is a professor of computer Science and cognitive Science at Brown University. He holds an A.B. in Physics from the University of Chicago and a Ph.D. from M.I.T. in Computer Science. His research has always been in the area of l ...
Author o
Statistical techniques for natural language parsing
amongst many other contributions *
Fred Jelinek Frederick Jelinek (18 November 1932 – 14 September 2010) was a Czech-American researcher in information theory, automatic speech recognition, and natural language processing. He is well known for his oft-quoted statement, "Every time I fire ...
br>Applied and developed numerous techniques from Information Theory to build the field
*
David Magerman David Mitchell Magerman (born 1968) is an American computer scientist and philanthropist. He spent 22 years working for an investment management company and hedge fund, Renaissance Technologies. Early life and education Magerman was born to Melv ...
br>Major contributor to turning the field from theoretical to practical by managing data
* James Curran Applying the
MaxEnt The principle of maximum entropy states that the probability distribution which best represents the current state of knowledge about a system is the one with largest entropy, in the context of precisely stated prior data (such as a proposition ...
br>algorithm, word representation, and other contributions
*
Michael Collins (computational linguist) Michael J. Collins (born 4 March 1970) is a researcher in the field of computational linguistics. He is the Vikram S. Pandit Professor of Computer Science at Columbia University. His research interests are in natural language processing as ...
br>First very high performance statistical parser
*
Joshua Goodman Joshua () or Yehoshua ( ''Yəhōšuaʿ'', Tiberian: ''Yŏhōšuaʿ,'' lit. 'Yahweh is salvation') ''Yēšūaʿ''; syr, ܝܫܘܥ ܒܪ ܢܘܢ ''Yəšūʿ bar Nōn''; el, Ἰησοῦς, ar , يُوشَعُ ٱبْنُ نُونٍ '' Yūšaʿ ...
Hypergraphs, and other generalizations between different methods


See also

*
Statistical machine translation Statistical machine translation (SMT) is a machine translation paradigm where translations are generated on the basis of statistical models whose parameters are derived from the analysis of bilingual text corpora. The statistical approach contrast ...
*
Statistical semantics In linguistics, statistical semantics applies the methods of statistics to the problem of determining the meaning of words or phrases, ideally through unsupervised learning, to a degree of precision at least sufficient for the purpose of informat ...
*
Stochastic context-free grammar Grammar theory to model symbol strings originated from work in computational linguistics aiming to understand the structure of natural languages. Probabilistic context free grammars (PCFGs) have been applied in probabilistic modeling of RNA structur ...
Natural language parsing Statistical natural language processing