Data-oriented parsing (DOP, also data-oriented processing) is a probabilistic model in computational linguistics. DOP was conceived by Remko Scha in 1990 with the aim of developing a performance-oriented grammar framework. Unlike other probabilistic models, DOP takes into account all subtrees contained in a treebank rather than being restricted to, for example, 2-level subtrees (like PCFGs), thus allowing for more context-sensitive information.

Several variants of DOP have been developed. The initial version developed by Rens Bod in 1992 was based on tree-substitution grammar (R. Bod, "A computational model of language performance: Data oriented parsing", in: COLING 1992 Volume 3: The 15th International Conference on Computational Linguistics, https://www.aclweb.org/anthology/C92-3126.pdf), while more recently, DOP has been combined with lexical-functional grammar (LFG). The resulting DOP-LFG finds an application in machine translation. Other work on learning and parameter estimation for DOP has also found its way into machine translation.
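
The contrast between "all subtrees in a treebank" and the 2-level subtrees of a PCFG can be made concrete with a small sketch. The Python below is illustrative only: the two-sentence treebank, the tuple encoding of trees, and the function names (`fragments_rooted_at`, `all_fragments`, `is_depth_one`, `relative_frequencies`) are hypothetical and not taken from any DOP implementation. It also assumes the simple relative-frequency weighting usually associated with the first tree-substitution-based DOP model, which the text above does not spell out.

```python
from collections import Counter
from itertools import product

# A tree is a nested tuple: (label, child, child, ...). A child is either
# another tuple (nonterminal) or a str (word). A 1-tuple such as ("NP",)
# marks a frontier nonterminal: a substitution site left open in a fragment.

# Toy treebank, invented for illustration.
TREEBANK = [
    ("S", ("NP", "John"), ("VP", ("V", "sleeps"))),
    ("S", ("NP", "Mary"), ("VP", ("V", "sleeps"))),
]


def fragments_rooted_at(node):
    """Return every fragment whose root is this node.

    Each nonterminal child is either cut off (kept as a frontier label)
    or expanded by one of its own fragments; word children are always kept.
    """
    label, children = node[0], node[1:]
    options = []
    for child in children:
        if isinstance(child, str):
            options.append([child])
        else:
            options.append([(child[0],)] + fragments_rooted_at(child))
    return [(label,) + combo for combo in product(*options)]


def all_fragments(tree):
    """Collect the fragments rooted at every nonterminal node of the tree."""
    result = list(fragments_rooted_at(tree))
    for child in tree[1:]:
        if not isinstance(child, str):
            result.extend(all_fragments(child))
    return result


def is_depth_one(fragment):
    """True for 2-level fragments, i.e. the ones a PCFG rule can express."""
    return all(isinstance(c, str) or len(c) == 1 for c in fragment[1:])


def relative_frequencies(treebank):
    """Weight each fragment by its count among fragments with the same root."""
    counts = Counter(f for tree in treebank for f in all_fragments(tree))
    root_totals = Counter()
    for frag, n in counts.items():
        root_totals[frag[0]] += n
    return {frag: n / root_totals[frag[0]] for frag, n in counts.items()}


probs = relative_frequencies(TREEBANK)
pcfg_like = {f: p for f, p in probs.items() if is_depth_one(f)}
print(len(probs), "distinct fragments,", len(pcfg_like), "of depth one")
```

Filtering with `is_depth_one` leaves only the fragments that correspond to context-free rules; the remaining fragments span several levels of the tree and are what carry the additional context. The number of fragments grows very quickly with tree size, so enumerating them explicitly as done here is only practical for toy examples.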


References


External links


* Remko Scha Research on DOP
* Khalil Sima'an: Learning DOP models from treebanks; Computational Complexity
* Andy Way (1999). A hybrid architecture for robust MT using LFG-DOP. Journal of Experimental and Theoretical Artificial Intelligence 11(3):441–471.