Treebank
In linguistics, a treebank is a parsed text corpus that annotates syntactic or semantic sentence structure. The construction of parsed corpora in the early 1990s revolutionized computational linguistics, which benefitted from large-scale empirical data.

Etymology
The term ''treebank'' was coined by linguist Geoffrey Leech in the 1980s, by analogy to other repositories such as a seedbank or bloodbank. This is because both syntactic and semantic structure are commonly represented compositionally as a tree structure. The term ''parsed corpus'' is often used interchangeably with the term treebank, with the emphasis on the primacy of sentences rather than trees.

Construction
Treebanks are often created on top of a corpus that has already been annotated with part-of-speech tags. In turn, treebanks are sometimes enhanced with semantic or other linguistic information. Treebanks can be created completely manually, where linguists annotate each sentence with syntactic structur ...
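As a rough illustration of the tree representation described in this entry, the following Python sketch reads one annotated sentence into a nested structure. The bracketed notation, the labels (S, NP, VP, DT, NN, VBZ), and the function names `parse_bracketed` and `leaves` are assumptions chosen for this example; the entry itself does not prescribe a storage format.

```python
def parse_bracketed(text):
    """Parse a bracketed tree string into nested (label, children) tuples.

    Leaves (the words) are kept as plain strings.
    """
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()
    pos = 0

    def read():
        nonlocal pos
        if tokens[pos] != "(":
            raise ValueError(f"expected '(' at token {pos}")
        pos += 1                          # consume "("
        label = tokens[pos]
        pos += 1
        children = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                children.append(read())   # nested constituent
            else:
                children.append(tokens[pos])  # a word
                pos += 1
        pos += 1                          # consume ")"
        return (label, children)

    tree = read()
    if pos != len(tokens):
        raise ValueError("trailing tokens after the tree")
    return tree


def leaves(tree):
    """Return the words of the sentence in left-to-right order."""
    if isinstance(tree, str):
        return [tree]
    _, children = tree
    return [word for child in children for word in leaves(child)]


# Hypothetical annotated sentence, with part-of-speech tags above the words.
example = "(S (NP (DT the) (NN dog)) (VP (VBZ barks)))"
tree = parse_bracketed(example)
print(tree[0], leaves(tree))
```

Reading the leaves back recovers the original sentence, which reflects the idea that a parsed corpus is still a corpus of sentences with structure layered on top.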




Computational Linguistics
Computational linguistics is an interdisciplinary field concerned with the computational modelling of natural language, as well as the study of appropriate computational approaches to linguistic questions. In general, computational linguistics draws upon linguistics, computer science, artificial intelligence, mathematics, logic, philosophy, cognitive science, cognitive psychology, psycholinguistics, anthropology and neuroscience, among others. Computational linguistics is closely related to mathematical linguistics.

Origins
The field has overlapped with artificial intelligence since the efforts in the United States in the 1950s to use computers to automatically translate texts from foreign languages, particularly Russian scientific journals, into English. Since rule-based approaches were able to make arithmetic (systematic) calculations much faster and more accurately than humans, it was expected that lexicon, morphology, syntax and semantics could be learned using explicit rules, a ...


Text Corpus
In linguistics and natural language processing, a corpus (plural: corpora) or text corpus is a dataset consisting of natively digital and older, digitized language resources, either annotated or unannotated. Annotated corpora have been used in corpus linguistics for statistical hypothesis testing, checking occurrences or validating linguistic rules within a specific language territory.

Overview
A corpus may contain texts in a single language (''monolingual corpus'') or text data in multiple languages (''multilingual corpus''). In order to make the corpora more useful for doing linguistic research, they are often subjected to a process known as annotation. An example of annotating a corpus is part-of-speech tagging, or ''POS-tagging'', in which information about each word's part of speech (verb, noun, adjective, etc.) is added to the corpus in the form of ''tags''. Another example is indicating the lemma (base) form of each word ...


Part-of-speech Tagging
In corpus linguistics, part-of-speech tagging (POS tagging, PoS tagging, or POST), also called grammatical tagging, is the process of marking up a word in a text (corpus) as corresponding to a particular part of speech, based on both its definition and its context. A simplified form of this is commonly taught to school-age children, in the identification of words as nouns, verbs, adjectives, adverbs, etc. Once performed by hand, POS tagging is now done in the context of computational linguistics, using algorithms which associate discrete terms, as well as hidden parts of speech, with a set of descriptive tags. POS-tagging algorithms fall into two distinctive groups: rule-based and stochastic. E. Brill's tagger, one of the first and most widely used English POS taggers, employs rule-based algorithms.

Principle
Part-of-speech tagging is harder than just having a list of words and their parts of speech, because some words can represent more than one part of speech at different t ...
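To make the ambiguity problem concrete, here is a minimal Python sketch that contrasts a plain word-list lookup with a tagger that also consults the previous tag. The lexicon, the tag names, and the two context rules are invented for this illustration; this is not Brill's tagger or any published rule set.

```python
# Toy lexicon: each word lists its possible tags, assumed most frequent first.
# Words, tags, and ordering are illustrative assumptions, not a real tagset.
LEXICON = {
    "the": ["DET"],
    "sailor": ["NOUN"],
    "dogs": ["NOUN", "VERB"],
    "hatch": ["VERB", "NOUN"],
}


def lookup_tag(words):
    """Baseline: always take the first listed tag, ignoring context."""
    return [LEXICON.get(w.lower(), ["NOUN"])[0] for w in words]


def rule_tag(words):
    """Pick among the candidate tags using the previously assigned tag."""
    tags = []
    for w in words:
        candidates = LEXICON.get(w.lower(), ["NOUN"])  # unknown words default to NOUN
        prev = tags[-1] if tags else None
        if prev == "DET" and "NOUN" in candidates:
            tag = "NOUN"          # a determiner is usually followed by a noun
        elif prev == "NOUN" and "VERB" in candidates:
            tag = "VERB"          # a noun is often followed by a verb
        else:
            tag = candidates[0]
        tags.append(tag)
    return tags


sentence = "The sailor dogs the hatch".split()
print(list(zip(sentence, lookup_tag(sentence))))
print(list(zip(sentence, rule_tag(sentence))))
```

The lookup baseline tags "dogs" as a noun and "hatch" as a verb, while the context rules reverse both choices for this sentence. Hand-written rules like these do not generalize far, which is one motivation for the stochastic approaches mentioned in the entry.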


Dependency Grammar
Dependency grammar (DG) is a class of modern grammatical theories that are all based on the dependency relation (as opposed to the ''constituency relation'' of phrase structure) and that can be traced back primarily to the work of Lucien Tesnière. Dependency is the notion that linguistic units, e.g. words, are connected to each other by directed links. The (finite) verb is taken to be the structural center of clause structure. All other syntactic units (words) are either directly or indirectly connected to the verb in terms of the directed links, which are called ''dependencies''. Dependency grammar differs from phrase structure grammar in that while it can identify phrases it tends to overlook phrasal nodes. A dependency structure is determined by the relation between a word (a head) and its dependents. Dependency structures are flatter than phrase structures in part because they lack a finite verb phrase constit ...
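The head-and-dependent relation described in this entry can be sketched with a simple array of head indices. The sentence, the particular attachments, and the helper names `find_root`, `dependents`, and `path_to_root` are assumptions made for this example and do not follow any specific annotation scheme.

```python
words = ["the", "dog", "chased", "a", "cat"]
# heads[i] is the index of the head of words[i]; -1 marks the root (the finite verb).
heads = [1, 2, -1, 4, 2]


def find_root(heads):
    """Return the index of the single word that has no head."""
    roots = [i for i, h in enumerate(heads) if h == -1]
    if len(roots) != 1:
        raise ValueError("expected exactly one root")
    return roots[0]


def dependents(heads, i):
    """Return the indices of the words whose head is word i."""
    return [d for d, h in enumerate(heads) if h == i]


def path_to_root(heads, i):
    """Follow the directed links from word i up to the root."""
    path = [i]
    while heads[path[-1]] != -1:
        nxt = heads[path[-1]]
        if nxt in path:
            raise ValueError("cycle in dependency structure")
        path.append(nxt)
    return path


root = find_root(heads)
print(words[root], [words[d] for d in dependents(heads, root)])
print([words[i] for i in path_to_root(heads, 0)])
```

Every word reaches the verb by following its chain of heads, either directly ("dog", "cat") or indirectly ("the", "a"), and there is no separate node for a phrase, which is why the structure is flatter than a phrase structure tree.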


HPSG
Head-driven phrase structure grammar (HPSG) is a highly lexicalized, constraint-based grammar developed by Carl Pollard and Ivan Sag. It is a type of phrase structure grammar, as opposed to a dependency grammar, and it is the immediate successor to generalized phrase structure grammar. HPSG draws from other fields such as computer science (data type theory and knowledge representation) and uses Ferdinand de Saussure's notion of the sign. It uses a uniform formalism and is organized in a modular way which makes it attractive for natural language processing. An HPSG includes principles and grammar rules and lexicon entries which are normally not considered to belong to a grammar. The formalism is based on lexicalism. This means that the lexicon is more than just a list of entries; it is in itself richly structured. Individual entries are marked with types. Types form a hierarchy. Early versions of the grammar were very lexicalized with few grammatica ...


Parser
Parsing, syntax analysis, or syntactic analysis is a process of analyzing a string of symbols, either in natural language, computer languages or data structures, conforming to the rules of a formal grammar by breaking it into parts. The term ''parsing'' comes from Latin ''pars'' (''orationis''), meaning part (of speech). The term has slightly different meanings in different branches of linguistics and computer science. Traditional sentence parsing is often performed as a method of understanding the exact meaning of a sentence or word, sometimes with the aid of devices such as sentence diagrams. It usually emphasizes the importance of grammatical divisions such as subject and predicate. Within computational linguistics the term is used to refer to the formal analysis by a computer of a sentence or other string of words into its constituents, resulting in a parse tree showing their syntactic relation to each other, which may also contain semantic information. Some parsing a ...
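As a small illustration of breaking a string into constituents according to a formal grammar, the Python sketch below implements a recursive-descent parser for a deliberately tiny toy grammar (S → NP VP, NP → DET NOUN, VP → VERB NP | VERB). The grammar, the tag names, and the function `parse_sentence` are assumptions made for this example; practical natural-language parsers handle ambiguity and far larger grammars.

```python
def parse_sentence(tagged):
    """Parse a list of (word, tag) pairs into a nested-tuple parse tree.

    Raises ValueError if the input does not match the toy grammar.
    """
    pos = 0

    def expect(tag):
        nonlocal pos
        if pos < len(tagged) and tagged[pos][1] == tag:
            word = tagged[pos][0]
            pos += 1
            return (tag, word)
        raise ValueError(f"expected {tag} at position {pos}")

    def noun_phrase():
        # NP -> DET NOUN
        return ("NP", expect("DET"), expect("NOUN"))

    def verb_phrase():
        # VP -> VERB NP | VERB
        verb = expect("VERB")
        if pos < len(tagged):
            return ("VP", verb, noun_phrase())
        return ("VP", verb)

    # S -> NP VP
    tree = ("S", noun_phrase(), verb_phrase())
    if pos != len(tagged):
        raise ValueError(f"unexpected input at position {pos}")
    return tree


tagged = [("the", "DET"), ("dog", "NOUN"), ("chased", "VERB"),
          ("a", "DET"), ("cat", "NOUN")]
print(parse_sentence(tagged))
```

The returned tuple nests subject and predicate under the sentence node, mirroring the traditional subject/predicate division as well as the parse tree produced in computational settings.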




Corpus Linguistics
Corpus linguistics is an empirical method for the study of language by way of a text corpus (plural ''corpora''). Corpora are balanced, often stratified collections of authentic, "real world", text of speech or writing that aim to represent a given linguistic variety. Today, corpora are generally machine-readable data collections. Corpus linguistics proposes that a reliable analysis of a language is more feasible with corpora collected in the field—the natural context ("realia") of that language—with minimal experimental interference. Large collections of text, though corpora may also be small in terms of running words, allow linguists to run quantitative analyses on linguistic concepts that may be difficult to test in a qualitative manner. The text-corpus method uses the body of texts in any natural language to derive the set of abstract rules which govern that language. Those results can be used to explore the relationships between that subject language and other language ...


Theoretical Linguistics
Theoretical linguistics is a term in linguistics that, like the related term general linguistics, can be understood in different ways. Both can be taken as a reference to the theory of language, or the branch of linguistics that inquires into the nature of language and seeks to answer fundamental questions as to what language is, or what the common ground of all languages is. The goal of theoretical linguistics can also be the construction of a general theoretical framework for the description of language. Another use of the term depends on the organisation of linguistics into different sub-fields. The term 'theoretical linguistics' is commonly juxtaposed with applied linguistics. This perspective implies that the aspiring language professional, e.g. a student, must first learn the ''theory'' i.e. properties of the linguistic system, or what Ferdinand de Saussure called ''internal linguistics''. This is followed by ''practice,'' or studies in the applied field. The dichotomy is ...


Communication University Of China
The Communication University of China (CUC) is a public university in Chaoyang, Beijing, China. It is affiliated with the Ministry of Education. The university is part of the Double First-Class Construction and Project 211. CUC developed from what used to be a training center for technicians of the Central Broadcasting Bureau that was founded in 1954. In April 1959, it was upgraded to the Beijing Broadcasting College with the approval of the State Council. In August 2004, the Beijing Broadcasting College was renamed the Communication University of China. CUC is located in the eastern part of Beijing near the ancient canal. It occupies 463,700 square meters of land and has a total of 499,800 square meters of buildings.

History
CUC's history dates back to March 3, 1954, when the first training class for broadcasting professionals was held by the then Central Radio Administration. This then led to the founding of Beijing Broadcasting College in 1958. On September 7, 1959, CUC's precursor ...


Association For Computational Linguistics
The Association for Computational Linguistics (ACL) is a scientific and professional organization for people working on natural language processing. Its namesake conference is one of the primary high-impact conferences for natural language processing research, along with EMNLP. The conference is held each summer in locations where significant computational linguistics research is carried out. It was founded in 1962, originally named the Association for Machine Translation and Computational Linguistics (AMTCL). It became the ACL in 1968. The ACL has a European (EACL), a North American (NAACL), and an Asian (AACL) chapter.

History
The ACL was founded in 1962 as the Association for Machine Translation and Computational Linguistics (AMTCL). The initial membership was about 100. In 1965, the AMTCL took over the journal ''Mechanical Translation and Computational Linguistics''. This journal was succeeded by many other journals: the ''American Journal of Computational Linguistics'' ...




The House At The End Of The Street
''House at the End of the Street'' is a 2012 American psychological thriller film directed by Mark Tonderai that stars Jennifer Lawrence. The film's plot revolves around a teenage girl, Elissa, who along with her newly divorced mother Sarah, moves to a new neighborhood only to discover that the house at the end of the street was the site of a gruesome double homicide committed by a 13-year-old girl named Carrie Anne who had disappeared without a trace four years prior. Elissa then starts a relationship with Carrie Anne's older brother Ryan, who lives in the same house, but nothing is as it appears to be. Although filming had been completed in 2010, the film was not released until 2012 by Relativity Media. Despite a negative response from critics, the film was a moderate commercial success, grossing $44 million.

Plot
Newly divorced doctor Sarah Cassidy and her 17-year-old daughter Elissa move to a small, upscale suburb. They are disturbed to discover the house they are moving into ...


Psycholinguistics
Psycholinguistics or psychology of language is the study of the interrelation between linguistic factors and psychological aspects. The discipline is mainly concerned with the mechanisms by which language is processed and represented in the mind and brain; that is, the psychological and neurobiological factors that enable humans to acquire, use, comprehend, and produce language. Psycholinguistics is concerned with the cognitive faculties and processes that are necessary to produce the grammatical constructions of language. It is also concerned with the perception of these constructions by a listener. Initial forays into psycholinguistics were in the philosophical and educational fields, mainly due to their location in departments other than applied sciences (e.g., cohesive data on how the human brain functioned). Modern research makes use of biology, neuroscience, cognitive science, linguistics, and information science to study how the mind-brain process ...