Uby Turpin
   HOME
*





Uby Turpin
UBY is a large-scale lexical-semantic resource for natural language processing (NLP) developed at the Ubiquitous Knowledge Processing Lab (UKP) in the department of Computer Science of the Technische Universität Darmstadt . UBY is based on the ISO standard Lexical Markup Framework (LMF) and combines information from several expert-constructed and collaboratively constructed resources for English and German. UBY applies a word sense alignment approach (subfield of word sense disambiguation) for combining information about nouns and verbs. Currently, UBY contains 12 integrated resources in English and German. Included resources * English resources: WordNet, Wiktionary, Wikipedia, FrameNet, VerbNet, OmegaWiki * German resources: German Wikipedia The German Wikipedia (german: Deutschsprachige Wikipedia) is the German-language edition of Wikipedia, a free and publicly editable online encyclopedia. Founded on March 16, 2001, it is the second-oldest Wikipedia (after the ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]  


picture info

Natural Language Processing
Natural language processing (NLP) is an interdisciplinary subfield of linguistics, computer science, and artificial intelligence concerned with the interactions between computers and human language, in particular how to program computers to process and analyze large amounts of natural language data. The goal is a computer capable of "understanding" the contents of documents, including the contextual nuances of the language within them. The technology can then accurately extract information and insights contained in the documents as well as categorize and organize the documents themselves. Challenges in natural language processing frequently involve speech recognition, natural-language understanding, and natural-language generation. History Natural language processing has its roots in the 1950s. Already in 1950, Alan Turing published an article titled "Computing Machinery and Intelligence" which proposed what is now called the Turing test as a criterion of intelligence, t ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]  


ISO/TC 37
ISO/TC 37 is a technical committee within the International Organization for Standardization (ISO) that prepares standards and other documents concerning methodology and principles for terminology and language resources. Title: Terminology and other language and content resources Scope: Standardization of principles, methods and applications relating to terminology and other language and content resources in the contexts of multilingual communication and cultural diversity ISO/TC 37 is a so-called "horizontal committee", providing guidelines for all other technical committees that develop standards on how to manage their terminological problems. However, the standards developed by ISO/TC 37 are not restricted to ISO. Collaboration with industry is sought to ensure that the requirements and needs from all possible users of standards concerning terminology, language and structured content are duly and timely addressed. Involvement in standards development is open to all stake ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]  


EuroWordNet
EuroWordNet is a system of semantic networks for European languages, based on WordNet. Each language develops its own wordnet but they are interconnected with ''interlingual links'' stored in the ''Interlingual Index'' (ILI). Unlike the original Princeton WordNet, most of the other wordnets are not freely available. Languages The original EuroWordNet project dealt with Dutch, Italian, Spanish, German, French, Czech, and Estonian. These wordnets are now frozen, but wordnets for other languages have been developed to varying degrees. License Some examples of EuroWordNet are available for free. Access to the full database, however, is charged. In some cases, OpenThesaurus and BabelNet may serve as a free alternative. See also * vidby * Babbel External links Lexical databases Computational linguistics Online dictionaries {{comp-ling-stub ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]  


picture info

Machine Translation
Machine translation, sometimes referred to by the abbreviation MT (not to be confused with computer-aided translation, machine-aided human translation or interactive translation), is a sub-field of computational linguistics that investigates the use of software to translate text or speech from one language to another. On a basic level, MT performs mechanical substitution of words in one language for words in another, but that alone rarely produces a good translation because recognition of whole phrases and their closest counterparts in the target language is needed. Not all words in one language have equivalent words in another language, and many words have more than one meaning. Solving this problem with corpus statistical and neural techniques is a rapidly growing field that is leading to better translations, handling differences in linguistic typology, translation of idioms, and the isolation of anomalies. Current machine translation software often allows for customizat ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]  


Text Classification
Document classification or document categorization is a problem in library science, information science and computer science. The task is to assign a document to one or more classes or categories. This may be done "manually" (or "intellectually") or algorithmically. The intellectual classification of documents has mostly been the province of library science, while the algorithmic classification of documents is mainly in information science and computer science. The problems are overlapping, however, and there is therefore interdisciplinary research on document classification. The documents to be classified may be texts, images, music, etc. Each kind of document possesses its special classification problems. When not otherwise specified, text classification is implied. Documents may be classified according to their subjects or according to other attributes (such as document type, author, printing year etc.). In the rest of this article only subject classification is considere ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]  




Word Sense Disambiguation
Word-sense disambiguation (WSD) is the process of identifying which sense of a word is meant in a sentence or other segment of context. In human language processing and cognition, it is usually subconscious/automatic but can often come to conscious attention when ambiguity impairs clarity of communication, given the pervasive polysemy in natural language. In computational linguistics, it is an open problem that affects other computer-related writing, such as discourse, improving relevance of search engines, anaphora resolution, coherence, and inference. Given that natural language requires reflection of neurological reality, as shaped by the abilities provided by the brain's neural networks, computer science has had a long-term challenge in developing the ability in computers to do natural language processing and machine learning. Many techniques have been researched, including dictionary-based methods that use the knowledge encoded in lexical resources, supervised machine le ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]  


picture info

BabelNet
BabelNet is a multilingual lexicalized semantic network and ontology developed at the NLP group of the Sapienza University of Rome.R. Navigli and S. P Ponzetto. 2012BabelNet: The Automatic Construction, Evaluation and Application of a Wide-Coverage Multilingual Semantic Network Artificial Intelligence, 193, Elsevier, pp. 217-250. BabelNet was automatically created by linking Wikipedia to the most popular computational lexicon of the English language, WordNet. The integration is done using an automatic mapping and by filling in lexical gaps in resource-poor languages by using statistical machine translation. The result is an encyclopedic dictionary that provides concepts and named entities lexicalized in many languages and connected with large amounts of semantic relations. Additional lexicalizations and definitions are added by linking to free-license wordnets, OmegaWiki, the English Wiktionary, Wikidata, FrameNet, VerbNet and others. Similarly to WordNet, BabelNet groups words i ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]  


Academic Free License
The Academic Free License (AFL) is a permissive free software license written in 2002 by Lawrence E. Rosen, a former general counsel of the Open Source Initiative (OSI). The license grants similar rights to the BSD, MIT, UoI/NCSA and Apache licenses licenses allowing the software to be made proprietary but was written to correct perceived problems with those licenses, the AFL: *makes clear what software is being licensed by including a statement following the software's copyright notice; *includes a complete copyright grant to the software; *contains a complete patent grant to the software; *makes clear that no trademark rights are granted to the licensor's trademarks; *warrants that the licensor either owns the copyright or is distributing the software under a license; *is itself copyrighted, with the right granted to copy and distribute without modification. The Free Software Foundation consider all AFL versions through 3.0 as incompatible with the GNU GPL. though Eric S. ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]  


picture info

CC By SA
A Creative Commons (CC) license is one of several public copyright licenses that enable the free distribution of an otherwise copyrighted "work".A "work" is any creative material made by a person. A painting, a graphic, a book, a song/lyrics to a song, or a photograph of almost anything are all examples of "works". A CC license is used when an author wants to give other people the right to share, use, and build upon a work that the author has created. CC provides an author flexibility (for example, they might choose to allow only non-commercial uses of a given work) and protects the people who use or redistribute an author's work from concerns of copyright infringement as long as they abide by the conditions that are specified in the license by which the author distributes the work. There are several types of Creative Commons licenses. Each license differs by several combinations that condition the terms of distribution. They were initially released on December 16, 2002, by ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]  


UBY-LMF
UBY-LMF is a format for standardizing lexical resources for Natural Language Processing (NLP). UBY-LMF conforms to the ISO standard for lexicons: LMF, designed within the ISO-TC37, and constitutes a so-called serialization of this abstract standard. In accordance with the LMF, all attributes and other linguistic terms introduced in UBY-LMF refer to standardized descriptions of their meaning in ISOCat. UBY-LMF has been implemented in Java and is actively developed as aOpen Source project on Google Code Based on this Java implementation, the large scale electronic lexicon UBY has automatically been created - it is the result of using UBY-LMF to standardize a range of diverse lexical resources frequently used for NLP applications. In 2013, UBY contains 10 lexicons which are pairwise interlinked at the sense level: * English WordNet, Wiktionary, Wikipedia, FrameNet, VerbNet, OmegaWiki * German Wiktionary, Wikipedia, GermaNet GermaNet is a semantic network for the German language. ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]  




ISO 12620
Linguistic categories include * Lexical category, a part of speech such as ''noun'', ''preposition'', etc. * Syntactic category, a similar concept which can also include phrasal categories * Grammatical category, a grammatical feature such as ''tense'', ''gender'', etc. The definition of linguistic categories is a major concern of linguistic theory, and thus, the definition and naming of categories varies across different theoretical frameworks and grammatical traditions for different languages. The operationalization of linguistic categories in lexicography, computational linguistics, natural language processing, corpus linguistics, and terminology management typically requires resource-, problem- or application-specific definitions of linguistic categories. In Cognitive linguistics it has been argued that linguistic categories have a prototype structure like that of the categories of common words in a language.John R Taylor (1995) ''Linguistic Categorization: Prototypes in Ling ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]