Deep linguistic processing is a natural language processing framework which draws on theoretical and descriptive linguistics. It models language predominantly by way of theoretical syntactic/semantic theory (e.g. CCG, HPSG, LFG, TAG, the Prague School). Deep linguistic processing approaches differ from "shallower" methods in that they yield more expressive and structural representations which directly capture long-distance dependencies and underlying predicate-argument structures.
The knowledge-intensive approach of deep linguistic processing requires considerable computational power, and was in the past sometimes judged to be intractable. However, research in the early 2000s made considerable advances in the efficiency of deep processing. Today, efficiency is no longer a major problem for applications using deep linguistic processing.
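The kind of structural representation at stake can be illustrated with a toy predicate-argument structure. The following is a minimal, hand-built sketch in the spirit of what a deep parser (for example, an HPSG grammar producing Minimal Recursion Semantics) might output; the sentence, predicate names and variable labels are illustrative assumptions, not the output of any actual grammar or tool.

```python
# A minimal sketch only: a hand-built predicate-argument structure of the kind
# a deep parser (e.g. an HPSG grammar producing Minimal Recursion Semantics)
# might emit. The sentence, predicate names and variables are illustrative
# assumptions, not the output of any particular system.
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Predication:
    predicate: str                 # lemma plus a coarse part-of-speech label
    args: Dict[str, str] = field(default_factory=dict)  # role -> variable


# "Which report did the committee say the manager filed?"
# A shallow, window-based analysis sees "the manager filed" with no object,
# whereas a deep analysis links the fronted "which report" (variable x1)
# back to the ARG2 role of "file" across the intervening clause: a
# long-distance dependency made explicit in the representation.
sentence_semantics = [
    Predication("_which_q",     {"ARG0": "x1"}),
    Predication("_report_n",    {"ARG0": "x1"}),
    Predication("_committee_n", {"ARG0": "x2"}),
    Predication("_say_v",       {"ARG0": "e1", "ARG1": "x2", "ARG2": "e2"}),
    Predication("_manager_n",   {"ARG0": "x3"}),
    Predication("_file_v",      {"ARG0": "e2", "ARG1": "x3", "ARG2": "x1"}),
]

if __name__ == "__main__":
    for p in sentence_semantics:
        roles = ", ".join(f"{role}={var}" for role, var in p.args.items())
        print(f"{p.predicate}({roles})")
```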


Contrast to "shallow linguistic processing"

Traditionally, deep linguistic processing has been concerned with computational grammar development (for use in both parsing and generation). These grammars were manually developed and maintained, and were computationally expensive to run. In recent years, machine learning approaches (also known as shallow linguistic processing) have fundamentally altered the field of natural language processing: robust, wide-coverage NLP tools can be created rapidly and with substantially less manual labor, so deep linguistic processing methods have received less attention. However, some computational linguists believe that detailed syntactic and semantic representations are necessary for computers to understand natural language or to draw inferences. Moreover, while humans can easily understand a sentence and its meaning, shallow linguistic processing may lack such human-like 'understanding' of language. For example (Schäfer 2007):

:a) ''Things would be different if Microsoft were located in Georgia.'' In sentence (a), a shallow information extraction system might wrongly infer that Microsoft's headquarters are located in Georgia, whereas a human understands from the sentence that Microsoft is not in fact located in Georgia (a simplified illustration follows these examples).
:b) ''The National Institute for Psychobiology in Israel was established in May 1971 as the Israel Center for Psychobiology by Prof. Joel.'' In sentence (b), a shallow system could wrongly infer that Israel was established in May 1971, whereas a human knows that it is the National Institute for Psychobiology that was established then.
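The contrast can be mimicked with a deliberately simplified sketch. Both "components" below are just regular expressions standing in for real systems; the point is only that the shallow pattern asserts a fact from sentence (a), while an analysis that notices the conditional ("if ... were") context declines to do so. The function names and patterns are illustrative assumptions, not any actual system's behaviour.

```python
import re

# Sentence (a) from the text above.
SENTENCE = "Things would be different if Microsoft were located in Georgia."

def shallow_extract(text):
    """Naive pattern-based extraction: fires on any 'X was/were/is located in Y'
    substring, wrongly yielding located_in(Microsoft, Georgia) here."""
    m = re.search(r"(\w+) (?:was|were|is) located in (\w+)", text)
    return ("located_in", m.group(1), m.group(2)) if m else None

def deep_aware_extract(text):
    """Crude stand-in for what a deep analysis provides: the clause containing
    'located' is the antecedent of a conditional (an irrealis context), so the
    proposition is not asserted as a fact about the actual world."""
    clause_is_conditional = re.search(r"\bif\b.+\bwere\b", text, re.IGNORECASE)
    return None if clause_is_conditional else shallow_extract(text)

print(shallow_extract(SENTENCE))     # ('located_in', 'Microsoft', 'Georgia') -- wrong
print(deep_aware_extract(SENTENCE))  # None -- the conditional blocks the inference
```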
In summary, deep linguistic processing provides a knowledge-rich analysis of language through manually developed grammars and language resources, whereas shallow linguistic processing provides a knowledge-lean analysis through the statistical or machine-learning manipulation of texts and/or annotated linguistic resources.


Sub-communities

"Deep" computational linguists are divided in different sub-communities based on the grammatical formalism they adopted for deep linguistic processing. The major sub-communities includes the: *DEep Linguistic Processing with HPSG - INitiative (
DELPH-IN Deep Linguistic Processing with HPSG - INitiative (DELPH-IN) is a collaboration where computational linguists worldwide develop natural language processing tools for deep linguistic processing of human language. The goal of DELPH-IN is to combine ...
) collaboration working with the
HPSG Head-driven phrase structure grammar (HPSG) is a highly lexicalized, constraint-based grammar developed by Carl Pollard and Ivan Sag. It is a type of phrase structure grammar, as opposed to a dependency grammar, and it is the immediate successor t ...
formalism. Th
HPSG Conference
is the central conference to share knowledge/advancement of
HPSG Head-driven phrase structure grammar (HPSG) is a highly lexicalized, constraint-based grammar developed by Carl Pollard and Ivan Sag. It is a type of phrase structure grammar, as opposed to a dependency grammar, and it is the immediate successor t ...
based deep processing.
ParGram/ParSem
is international collaboration on
LFG LFG may refer to: * Landfill gas, a waste gas containing methane and other gases emitted by landfills * Lexical functional grammar, a theory of syntax * Lagged Fibonacci generator, an example of a pseudorandom number generator * "Looking for group ...
-based grammar and semantics development. Th
LFG Conference
is the central conference to share knowledge/advancement of
LFG LFG may refer to: * Landfill gas, a waste gas containing methane and other gases emitted by landfills * Lexical functional grammar, a theory of syntax * Lagged Fibonacci generator, an example of a pseudorandom number generator * "Looking for group ...
based deep processing. *XTAG Research group working with the TAG formalism. Th
TAG+ conference
is the central conference to share knowledge/advancement of TAG based deep processing. The shortlist above is not exhaustively representative of all the communities working on deep linguistic processing.


See also

*Combinatory categorial grammar
*Head-driven phrase structure grammar
*Lexical functional grammar
*Natural language processing
*Tree-adjoining grammar


References

*U. Schäfer. 2007. ''Integrating Deep and Shallow Natural Language Processing Components – Representations and Hybrid Architectures''. Ph.D. thesis, Faculty of Mathematics and Computer Science, Saarland University, Saarbrücken, Germany.