Morphological Dictionary

	Morphological Dictionary In the fields of computational linguistics and applied linguistics, a morphological dictionary is a linguistic resource that contains correspondences between the surface forms and lexical forms of words. Surface forms of words are those found in natural language text. The lexical form corresponding to a surface form is the lemma followed by grammatical information (for example the part of speech, gender and number). In English, ''give'', ''gives'', ''giving'', ''gave'' and ''given'' are surface forms of the verb ''give''. The lexical form would be "give", verb. There are two kinds of morphological dictionaries: morpheme-aligned dictionaries and full-form (non-aligned) dictionaries. Notable examples and formalisms Universal Morphologies Inspired by the success of the Universal Dependencies for cross-linguistic annotation of syntactic dependencies, similar efforts have emerged for morphology, e.g., UniMorph and UDer. These feature simple tabular (tab-separated) formats with one form i ...
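The correspondence between surface forms and lexical forms described above can be sketched as a pair of lookup tables. This is a minimal illustration of a full-form (non-aligned) dictionary, not a real resource: the entry list, the tag string "verb", and the function names `analyze` and `generate` are assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical toy entries: (surface form, lemma, grammatical information).
# The five forms are the English example from the text above.
ENTRIES = [
    ("give", "give", "verb"),
    ("gives", "give", "verb"),
    ("giving", "give", "verb"),
    ("gave", "give", "verb"),
    ("given", "give", "verb"),
]


def build_analyzer(entries):
    """Index entries by surface form: surface -> list of (lemma, tags)."""
    index = defaultdict(list)
    for surface, lemma, tags in entries:
        index[surface].append((lemma, tags))
    return dict(index)


def build_generator(entries):
    """Index entries by lexical form: (lemma, tags) -> list of surface forms."""
    index = defaultdict(list)
    for surface, lemma, tags in entries:
        index[(lemma, tags)].append(surface)
    return dict(index)


ANALYZER = build_analyzer(ENTRIES)
GENERATOR = build_generator(ENTRIES)


def analyze(surface):
    """Return every lexical form recorded for a surface form (empty if unknown)."""
    return ANALYZER.get(surface, [])


def generate(lemma, tags):
    """Return every surface form recorded for a lexical form (empty if unknown)."""
    return GENERATOR.get((lemma, tags), [])
```

Lists are used as values because one surface form may correspond to several lexical forms in a realistic dictionary, and one lexical form to several surface forms, as with ''give'' here.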
	Computational Linguistics Computational linguistics is an interdisciplinary field concerned with the computational modelling of natural language, as well as the study of appropriate computational approaches to linguistic questions. In general, computational linguistics draws upon linguistics, computer science, artificial intelligence, mathematics, logic, philosophy, cognitive science, cognitive psychology, psycholinguistics, anthropology and neuroscience, among others. Sub-fields and related areas Traditionally, computational linguistics emerged as an area of artificial intelligence performed by computer scientists who had specialized in the application of computers to the processing of a natural language. With the formation of the Association for Computational Linguistics (ACL) and the establishment of independent conference series, the field consolidated during the 1970s and 1980s. The Association for Computational Linguistics defines computational linguistics as: The term "comp ...
	Applied Linguistics Applied linguistics is an interdisciplinary field which identifies, investigates, and offers solutions to language-related real-life problems. Some of the academic fields related to applied linguistics are education, psychology, communication research, information science, natural language processing, anthropology, and sociology. Domain Applied linguistics is an interdisciplinary field. Major branches of applied linguistics include bilingualism and multilingualism, conversation analysis, contrastive linguistics, language assessment, literacies, discourse analysis, language pedagogy, second language acquisition, language planning and policy, interlinguistics, stylistics, language teacher education, forensic linguistics, and translation. Journals Major journals of the field include ''Research Methods in Applied Linguistics'', ''Annual Review of Applied Linguistics'', ''Applied Linguistics'', ''Studies in Second Language Acquisition'', ''Applied Psycholinguistics'', ...
	Lemma (morphology) In morphology and lexicography, a lemma (plural ''lemmas'' or ''lemmata'') is the canonical form, dictionary form, or citation form of a set of word forms. In English, for example, ''break'', ''breaks'', ''broke'', ''broken'' and ''breaking'' are forms of the same lexeme, with ''break'' as the lemma by which they are indexed. ''Lexeme'', in this context, refers to the set of all the inflected or alternating forms in the paradigm of a single word, and ''lemma'' refers to the particular form that is chosen by convention to represent the lexeme. Lemmas have special significance in highly inflected languages such as Arabic, Turkish and Russian. The process of determining the ''lemma'' for a given lexeme is called lemmatisation. The lemma can be viewed as the chief of the principal parts, although lemmatisation is at least partly arbitrary. Morphology The form of a word that is chosen to serve as the lemma is usually the least marked form, but there are several exceptions s ...
	Part Of Speech In grammar, a part of speech or part-of-speech (abbreviated as POS or PoS, also known as word class or grammatical category) is a category of words (or, more generally, of lexical items) that have similar grammatical properties. Words that are assigned to the same part of speech generally display similar syntactic behavior (they play similar roles within the grammatical structure of sentences), sometimes similar morphological behavior in that they undergo inflection for similar properties, and even similar semantic behavior. Commonly listed English parts of speech are noun, verb, adjective, adverb, pronoun, preposition, conjunction, interjection, numeral, article, and determiner. Terms other than ''part of speech'' (used particularly in modern linguistic classifications, which often make more precise distinctions than the traditional scheme does) include word class, lexical class, and lexical category. Some authors restrict the term ''lexical category'' to refer only to a par ...
	Grammatical Gender In linguistics, a grammatical gender system is a specific form of noun class system, in which nouns are assigned to gender categories that are often not related to their real-world qualities. In languages with grammatical gender, most or all nouns inherently carry one value of the grammatical category called ''gender''; the values present in a given language (of which there are usually two or three) are called the ''genders'' of that language. Whereas some authors use the term "grammatical gender" as a synonym of "noun class", others use different definitions for each; many authors prefer "noun classes" when none of the inflections in a language relate to sex. Gender systems are used in approximately one half of the world's languages. According to one definition: "Genders are classes of nouns reflected in the behaviour of associated words." Overview Languages with grammatical gender usually have two to four different genders, but some are attested with up to 20. Gender contras ...
	Grammatical Number In linguistics, grammatical number is a grammatical category of nouns, pronouns, adjectives and verb agreement that expresses count distinctions (such as "one", "two" or "three or more"). English and other languages present number categories of singular or plural, both of which are cited by using the hash sign (#) or by the numero signs "No." and "Nos." respectively. Some languages also have a dual, trial and paucal number or other arrangements. The count distinctions typically, but not always, correspond to the actual count of the referents of the marked noun or pronoun. The word "number" is also used in linguistics to describe the distinction between certain grammatical aspects that indicate the number of times an event occurs, such as the semelfactive aspect, the iterative aspect, etc. For that use of the term, see "Grammatical aspect". Overview Most languages of the world have formal means to express differences of number. One widespread distinction, found in English ...
	Universal Dependencies Universal Dependencies, frequently abbreviated as UD, is an international cooperative project to create treebanks of the world's languages. These treebanks are openly accessible and available. Core applications are automated text processing in the field of natural language processing (NLP) and research into natural language syntax and grammar, especially within linguistic typology. The project's primary aim is to achieve cross-linguistic consistency of annotation, while still permitting language-specific extensions when necessary. The annotation scheme has its roots in three related projects: Stanford Dependencies, Google universal part-of-speech tags, and the Interset interlingua for morphosyntactic tagsets. The UD annotation scheme ...
	Tab-separated Values A tab-separated values (TSV) file is a simple text format for storing data in a tabular structure, e.g., a database table or spreadsheet data, and a way of exchanging information between databases. Each record in the table is one line of the text file. Each field value of a record is separated from the next by a tab character. The TSV format is thus a variation of the comma-separated values format. TSV is a simple file format that is widely supported, so it is often used in data exchange to move tabular data between different computer programs that support the format. For example, a TSV file might be used to transfer information from a database program to a spreadsheet. The IANA standard for TSV achieves simplicity by simply disallowing tabs within fields. Example The head of the Iris flower data set can be stored as a TSV using the following plain text (note that the HTML rendering may convert tabs to spaces): Sepal length Sepal width Petal length Petal width ...
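As a rough sketch of the format described above (one record per line, fields separated by tab characters, no tabs inside fields), the following reads and writes TSV text with Python's standard `csv` module. The sample data and its column names (`lemma`, `form`, `features`) are assumptions chosen to resemble a morphological table; they are not taken from any particular resource.

```python
import csv
import io

# Hypothetical sample: a header line followed by two records.
SAMPLE = "lemma\tform\tfeatures\ngive\tgives\tV;PRS;3;SG\ngive\tgave\tV;PST\n"


def read_tsv(text):
    """Parse TSV text into a list of dicts keyed by the header row."""
    # QUOTE_NONE: quote characters are treated as ordinary data.
    reader = csv.reader(io.StringIO(text), delimiter="\t", quoting=csv.QUOTE_NONE)
    rows = list(reader)
    header, records = rows[0], rows[1:]
    return [dict(zip(header, record)) for record in records]


def write_tsv(header, records):
    """Serialize dict records to TSV text, rejecting tabs or newlines in fields."""
    lines = ["\t".join(header)]
    for record in records:
        values = [record[name] for name in header]
        if any("\t" in value or "\n" in value for value in values):
            raise ValueError("tabs and newlines are not allowed inside fields")
        lines.append("\t".join(values))
    return "\n".join(lines) + "\n"
```

Rejecting tabs inside fields, rather than escaping them, mirrors the simplification the text attributes to the IANA definition of the format.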
	Finite-state Transducer A finite-state transducer (FST) is a finite-state machine with two memory ''tapes'', following the terminology for Turing machines: an input tape and an output tape. This contrasts with an ordinary finite-state automaton, which has a single tape. An FST is a type of finite-state automaton (FSA) that maps between two sets of symbols. An FST is more general than an FSA. An FSA defines a formal language by defining a set of accepted strings, while an FST defines relations between sets of strings. An FST reads a set of strings on the input tape and generates a set of relations on the output tape. An FST can be thought of as a translator or relater between strings in a set. In morphological parsing, for example, a string of letters is given to the FST as input, and the FST outputs a string of morphemes. Overview An automaton can be said to ''recognize'' a string if we view the content of its tape as input. In other words, the automaton computes a function that maps ...
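The morphological-parsing use mentioned above can be illustrated with a very small transducer. This is a simplified sketch, not a general FST: it is deterministic (at most one transition per state and input symbol), whereas FSTs in general define relations and may be nondeterministic. The states, the tags `+N+SG` and `+N+PL`, and the function name `transduce` are all assumptions made for the example.

```python
# Transition table: (state, input symbol) -> (next state, output string).
# This toy machine only knows the noun "cat" and its plural "cats".
TRANSITIONS = {
    (0, "c"): (1, "c"),
    (1, "a"): (2, "a"),
    (2, "t"): (3, "t"),
    (3, "s"): (4, "+N+PL"),  # the plural suffix is rewritten as a tag
}

# Accepting states, each with a string emitted when the input ends there.
FINAL_OUTPUT = {
    3: "+N+SG",  # input ended right after the stem: singular
    4: "",       # plural tag was already emitted on the "s" transition
}


def transduce(word, transitions=TRANSITIONS, final_output=FINAL_OUTPUT, start=0):
    """Map an input string to an output string, or None if it is rejected."""
    state = start
    output = []
    for symbol in word:
        step = transitions.get((state, symbol))
        if step is None:
            return None  # no transition for this symbol: reject
        state, emitted = step
        output.append(emitted)
    if state not in final_output:
        return None  # input ended in a non-accepting state: reject
    return "".join(output) + final_output[state]
```

Here `transduce("cats")` follows the four transitions and ends in an accepting state, while an input such as `"ca"` stops in a non-accepting state and is rejected.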
	Interlinear Gloss In linguistics and pedagogy, an interlinear gloss is a gloss (series of brief explanations, such as definitions or pronunciations) placed between lines, such as between a line of original text and its translation into another language. When glossed, each line of the original text acquires one or more corresponding lines of transcription known as an interlinear text or interlinear glossed text (IGT), or "interlinear" for short. Such glosses help the reader follow the relationship between the source text and its translation, and the structure of the original language. In its simplest form, an interlinear gloss is simply a literal, word-for-word translation of the source text. History Interlinear glosses have been used for a variety of purposes over a long period of time. One common usage has been to annotate bilingual textbooks for language education. This sort of interlinearization serves to help make the meaning of a source text explicit without attempting to formally ...
	OntoLex OntoLex is the short name of a vocabulary for lexical resources in the web of data (OntoLex-Lemon) and the short name of the W3C community group that created it (W3C Ontology-Lexica Community Group). OntoLex-Lemon vocabulary The OntoLex-Lemon vocabulary represents a vocabulary for publishing lexical data as a knowledge graph, in an RDF format and/or as Linguistic Linked Open Data. Since its publication as a W3C Community report in 2016, it has served as "a de facto standard to represent ontology-lexica on the Web". OntoLex-Lemon is a revision of the Lemon vocabulary originally proposed by McCrae et al. (2011). The core elements of OntoLex-Lemon, shown in Fig. 1, are: * lexical entry: unit of analysis of the lexicon, which groups together one or more forms and one or more senses or concepts. It can provide additional morphosyntactic information, e.g., one part of speech. Note that every lexical entry can have at most one part of speech, for representing groups of lexical entries w ...
	Morphology (linguistics) In linguistics, morphology is the study of words, how they are formed, and their relationship to other words in the same language. It analyzes the structure of words and parts of words such as stems, root words, prefixes, and suffixes. Morphology also looks at parts of speech, intonation and stress, and the ways context can change a word's pronunciation and meaning. Morphology differs from morphological typology, which is the classification of languages based on their use of words, and lexicology, which is the study of words and how they make up a language's vocabulary. While words, along with clitics, are generally accepted as being the smallest units of syntax, in most languages, if not all, many words can be related to other words by rules that collectively describe the grammar for that language. For example, English speakers recognize that the words ''dog'' and ''dogs'' are closely related, differentiated only by the plurality morpheme "-s", only found bound to ...