HOME

TheInfoList



OR:

A parse tree or parsing tree or derivation tree or concrete syntax tree is an ordered, rooted
tree In botany, a tree is a perennial plant with an elongated stem, or trunk, usually supporting branches and leaves. In some usages, the definition of a tree may be narrower, including only woody plants with secondary growth, plants that are ...
that represents the
syntactic In linguistics, syntax () is the study of how words and morphemes combine to form larger units such as phrases and sentences. Central concerns of syntax include word order, grammatical relations, hierarchical sentence structure (constituency) ...
structure of a string according to some
context-free grammar In formal language theory, a context-free grammar (CFG) is a formal grammar whose production rules are of the form :A\ \to\ \alpha with A a ''single'' nonterminal symbol, and \alpha a string of terminals and/or nonterminals (\alpha can be em ...
. The term ''parse tree'' itself is used primarily in
computational linguistics Computational linguistics is an interdisciplinary field concerned with the computational modelling of natural language, as well as the study of appropriate computational approaches to linguistic questions. In general, computational linguistics ...
; in theoretical syntax, the term ''syntax tree'' is more common. Concrete syntax trees reflect the syntax of the input language, making them distinct from the
abstract syntax tree In computer science, an abstract syntax tree (AST), or just syntax tree, is a tree representation of the abstract syntactic structure of text (often source code) written in a formal language. Each node of the tree denotes a construct occurr ...
s used in computer programming. Unlike Reed-Kellogg
sentence diagram A sentence diagram is a pictorial representation of the grammatical structure of a sentence. The term "sentence diagram" is used more when teaching written language, where sentences are ''diagrammed''. The model shows the relations between words ...
s used for teaching grammar, parse trees do not use distinct symbol shapes for different types of constituents. Parse trees are usually constructed based on either the constituency relation of constituency grammars (
phrase structure grammar The term phrase structure grammar was originally introduced by Noam Chomsky as the term for grammar studied previously by Emil Post and Axel Thue (Post canonical systems). Some authors, however, reserve the term for more restricted grammars in th ...
s) or the dependency relation of
dependency grammar Dependency grammar (DG) is a class of modern grammatical theories that are all based on the dependency relation (as opposed to the ''constituency relation'' of phrase structure) and that can be traced back primarily to the work of Lucien Tesni� ...
s. Parse trees may be generated for sentences in
natural language In neuropsychology, linguistics, and philosophy of language, a natural language or ordinary language is any language that has evolved naturally in humans through use and repetition without conscious planning or premeditation. Natural languages ...
s (see
natural language processing Natural language processing (NLP) is an interdisciplinary subfield of linguistics, computer science, and artificial intelligence concerned with the interactions between computers and human language, in particular how to program computers to proc ...
), as well as during
processing Processing is a free graphical library and integrated development environment (IDE) built for the electronic arts, new media art, and visual design communities with the purpose of teaching non-programmers the fundamentals of computer programming ...
of computer languages, such as
programming language A programming language is a system of notation for writing computer programs. Most programming languages are text-based formal languages, but they may also be graphical. They are a kind of computer language. The description of a programming ...
s. A related concept is that of phrase marker or P-marker, as used in transformational generative grammar. A phrase marker is a linguistic expression marked as to its phrase structure. This may be presented in the form of a tree, or as a bracketed expression. Phrase markers are generated by applying
phrase structure rules Phrase structure rules are a type of rewrite rule used to describe a given language's syntax and are closely associated with the early stages of transformational grammar, proposed by Noam Chomsky in 1957. They are used to break down a natural langu ...
, and themselves are subject to further transformational rules. A set of possible parse trees for a
syntactically ambiguous Syntactic ambiguity, also called structural ambiguity, amphiboly or amphibology, is a situation where a sentence may be interpreted in more than one way due to ambiguous sentence structure. Syntactic ambiguity arises not from the range of mean ...
sentence is called a "parse forest."


Nomenclature

A parse tree is made up of nodes and branches. In the picture the parse tree is the entire structure, starting from S and ending in each of the leaf nodes (John, ball, the, hit). In a parse tree, each node is either a ''root'' node, a ''branch'' node, or a ''leaf'' node. In the above example, S is a root node, NP and VP are branch nodes, while John, ball, the, and hit are all leaf nodes. Nodes can also be referred to as parent nodes and child nodes. A ''parent'' node is one which has at least one other node linked by a branch under it. In the example, S is a parent of both NP and VP. A ''child'' node is one which has at least one node directly above it to which it is linked by a branch of the tree. Again from our example, hit is a child node of V. A nonterminal function is a function (node) which is either a root or a branch in that tree whereas a terminal function is a function (node) in a parse tree which is a leaf.


Constituency-based parse trees

The constituency-based parse trees of constituency grammars (
phrase structure grammar The term phrase structure grammar was originally introduced by Noam Chomsky as the term for grammar studied previously by Emil Post and Axel Thue (Post canonical systems). Some authors, however, reserve the term for more restricted grammars in th ...
s) distinguish between terminal and non-terminal nodes. The
interior node In computer science, a tree is a widely used abstract data type that represents a hierarchical tree structure with a set of connected nodes. Each node in the tree can be connected to many children (depending on the type of tree), but must be co ...
s are labeled by non-terminal categories of the grammar, while the
leaf node In computer science, a tree is a widely used abstract data type that represents a hierarchical tree structure with a set of connected nodes. Each node in the tree can be connected to many children (depending on the type of tree), but must be con ...
s are labeled by
terminal Terminal may refer to: Computing Hardware * Terminal (electronics), a device for joining electrical circuits together * Terminal (telecommunication), a device communicating over a line * Computer terminal, a set of primary input and output devi ...
categories. The image below represents a constituency-based parse tree; it shows the syntactic structure of the
English English usually refers to: * English language * English people English may also refer to: Peoples, culture, and language * ''English'', an adjective for something of, from, or related to England ** English national ...
sentence ''John hit the ball'': ::: The parse tree is the entire structure, starting from S and ending in each of the leaf nodes (''John'', ''hit'', ''the'', ''ball''). The following abbreviations are used in the tree: ::* S for sentence, the top-level structure in this example ::* NP for
noun phrase In linguistics, a noun phrase, or nominal (phrase), is a phrase that has a noun or pronoun as its head or performs the same grammatical function as a noun. Noun phrases are very common cross-linguistically, and they may be the most frequently oc ...
. The first (leftmost) NP, a single noun "John", serves as the subject of the sentence. The second one is the
object Object may refer to: General meanings * Object (philosophy), a thing, being, or concept ** Object (abstract), an object which does not exist at any particular time or place ** Physical object, an identifiable collection of matter * Goal, an ...
of the sentence. ::* VP for
verb phrase In linguistics, a verb phrase (VP) is a syntactic unit composed of a verb and its arguments except the subject of an independent clause or coordinate clause. Thus, in the sentence ''A fat man quickly put the money into the box'', the words ''q ...
, which serves as the
predicate Predicate or predication may refer to: * Predicate (grammar), in linguistics * Predication (philosophy) * several closely related uses in mathematics and formal logic: **Predicate (mathematical logic) **Propositional function **Finitary relation, o ...
::* V for
verb A verb () is a word ( part of speech) that in syntax generally conveys an action (''bring'', ''read'', ''walk'', ''run'', ''learn''), an occurrence (''happen'', ''become''), or a state of being (''be'', ''exist'', ''stand''). In the usual descr ...
. In this case, it's a
transitive verb A transitive verb is a verb that accepts one or more objects, for example, 'cleaned' in ''Donald cleaned the window''. This contrasts with intransitive verbs, which do not have objects, for example, 'panicked' in ''Donald panicked''. Transiti ...
''hit''. ::* D for
determiner A determiner, also called determinative ( abbreviated ), is a word, phrase, or affix that occurs together with a noun or noun phrase and generally serves to express the reference of that noun or noun phrase in the context. That is, a determine ...
, in this instance the
definite article An article is any member of a class of dedicated words that are used with noun phrases to mark the identifiability of the referents of the noun phrases. The category of articles constitutes a part of speech. In English, both "the" and "a(n)" a ...
"the" ::* N for
noun A noun () is a word that generally functions as the name of a specific object or set of objects, such as living creatures, places, actions, qualities, states of existence, or ideas.Example nouns for: * Living creatures (including people, alive, ...
Each node in the tree is either a ''root'' node, a ''branch'' node, or a ''leaf'' node. A root node is a node that doesn't have any branches on top of it. Within a sentence, there is only ever one root node. A branch node is a parent node that connects to two or more child nodes. A leaf node, however, is a terminal node that does not dominate other nodes in the tree. S is the root node, NP and VP are branch nodes, and ''John'' (N), ''hit'' (V), ''the'' (D), and ''ball'' (N) are all leaf nodes. The leaves are the lexical tokens of the sentence. A parent node is one that has at least one other node linked by a branch under it. In the example, S is a parent of both N and VP. A child node is one that has at least one node directly above it to which it is linked by a branch of a tree. From the example, ''hit'' is a child node of V. The terms ''mother'' and ''daughter'' are also sometimes used for this relationship.


Dependency-based parse trees

The dependency-based parse trees of
dependency grammar Dependency grammar (DG) is a class of modern grammatical theories that are all based on the dependency relation (as opposed to the ''constituency relation'' of phrase structure) and that can be traced back primarily to the work of Lucien Tesni� ...
sSee for example Ágel et al. 2003/2006. see all nodes as terminal, which means they do not acknowledge the distinction between terminal and non-terminal categories. They are simpler on average than constituency-based parse trees because they contain fewer nodes. The dependency-based parse tree for the example sentence above is as follows: ::: This parse tree lacks the phrasal categories (S, VP, and NP) seen in the constituency-based counterpart above. Like the constituency-based tree,
constituent Constituent or constituency may refer to: Politics * An individual voter within an electoral district, state, community, or organization * Advocacy group or constituency * Constituent assembly * Constituencies of Namibia Other meanings * Consti ...
structure is acknowledged. Any complete sub-tree of the tree is a constituent. Thus this dependency-based parse tree acknowledges the subject noun ''John'' and the object noun phrase ''the ball'' as constituents just like the constituency-based parse tree does. The constituency vs. dependency distinction is far-reaching. Whether the additional syntactic structure associated with constituency-based parse trees is necessary or beneficial is a matter of debate.


Phrase markers

Phrase markers, or P-markers, were introduced in early transformational generative grammar, as developed by
Noam Chomsky Avram Noam Chomsky (born December 7, 1928) is an American public intellectual: a linguist, philosopher, cognitive scientist, historian, social critic, and political activist. Sometimes called "the father of modern linguistics", Chomsky i ...
and others. A phrase marker representing the
deep structure Deep structure and surface structure (also D-structure and S-structure although those abbreviated forms are sometimes used with distinct meanings) are concepts used in linguistics, specifically in the study of syntax in the Chomskyan tradition of t ...
of a sentence is generated by applying
phrase structure rule Phrase structure rules are a type of rewrite rule used to describe a given language's syntax and are closely associated with the early stages of transformational grammar, proposed by Noam Chomsky in 1957. They are used to break down a natural lan ...
s. Then, this application may undergo further transformations. Phrase markers may be presented in the form of
trees In botany, a tree is a perennial plant with an elongated stem, or trunk, usually supporting branches and leaves. In some usages, the definition of a tree may be narrower, including only woody plants with secondary growth, plants that are u ...
(as in the above section on constituency-based parse trees), but are often given instead in the form of "bracketed expressions", which occupy less space in the memory. For example, a bracketed expression corresponding to the constituency-based tree given above may be something like : : S\_[_\mathit\_\text_[_\mathit\_[_V\_\text.html" ;"title="\mathit\_\text.html" ;"title="S\ [_\mathit\ \text">S\ [_\mathit\ \text [_\mathit\ [_V\ \text">\mathit\_\text.html" ;"title="S\ [_\mathit\ \text">S\ [_\mathit\ \text [_\mathit\ [_V\ \text [_\mathit\ [_\mathit\ \text]\ [_N\ \text As with trees, the precise construction of such expressions and the amount of detail shown can depend on the theory being applied and on the points that the query author wishes to illustrate.


See also

*
Abstract syntax tree In computer science, an abstract syntax tree (AST), or just syntax tree, is a tree representation of the abstract syntactic structure of text (often source code) written in a formal language. Each node of the tree denotes a construct occurr ...
*
Constituent (linguistics) In syntactic analysis, a constituent is a word or a group of words that function as a single unit within a hierarchical structure. The constituent structure of sentences is identified using ''tests for constituents''. These tests apply to a porti ...
*
Dependency grammar Dependency grammar (DG) is a class of modern grammatical theories that are all based on the dependency relation (as opposed to the ''constituency relation'' of phrase structure) and that can be traced back primarily to the work of Lucien Tesni� ...
*
Computational linguistics Computational linguistics is an interdisciplinary field concerned with the computational modelling of natural language, as well as the study of appropriate computational approaches to linguistic questions. In general, computational linguistics ...
*
Parsing Parsing, syntax analysis, or syntactic analysis is the process of analyzing a string of symbols, either in natural language, computer languages or data structures, conforming to the rules of a formal grammar. The term ''parsing'' comes from ...
(syntax analysis) *
Phrase structure grammar The term phrase structure grammar was originally introduced by Noam Chomsky as the term for grammar studied previously by Emil Post and Axel Thue (Post canonical systems). Some authors, however, reserve the term for more restricted grammars in th ...
*
Sentence diagram A sentence diagram is a pictorial representation of the grammatical structure of a sentence. The term "sentence diagram" is used more when teaching written language, where sentences are ''diagrammed''. The model shows the relations between words ...
*
Terminal and nonterminal symbols In computer science, terminal and nonterminal symbols are the lexical elements used in specifying the production rules constituting a formal grammar. ''Terminal symbols'' are the elementary symbols of the language defined by a formal grammar. ...


Notes


References

* Ágel, V., Ludwig Eichinger, Hans-Werner Eroms, Peter Hellwig, Hans Heringer, and Hennig Lobin (eds.) 2003/6
Dependency and valency: An international handbook of contemporary research
Berlin: Walter de Gruyter. *Carnie, A. 2013
Syntax: A generative introduction
3rd edition. Malden, MA: Wiley-Blackwell. *Chiswell, Ian and Wilfrid Hodges 2007. Mathematical logic. Oxford: Oxford University Press. *Aho, A. V., Sethi, R., and Ullman, J. D. 1986. Compilers: Principles, techniques, & tools. Reading, MA: Addison-Wesley.


External links


Syntax Tree Editor

Linguistic Tree Constructor

phpSyntaxTree
– Online parse tree drawing site
phpSyntaxTree (Unicode)
– Online parse tree drawing site (improved version that supports Unicode)
rSyntaxTree
Enhanced version of phpSyntaxTree in Ruby with Unicode and Vectorized graphics
Qtree
LaTeX Latex is an emulsion (stable dispersion) of polymer microparticles in water. Latexes are found in nature, but synthetic latexes are common as well. In nature, latex is found as a milky fluid found in 10% of all flowering plants (angiosperms ...
package for drawing parse trees
TreeForm Syntax Tree Drawing Software


Introduction and Transformation
OpenCourseOnline
Dependency Parse Introduction (Christopher Manning)

{{Parsers Syntax Generative syntax Trees (data structures)