Example-based Machine Translation
Example-based machine translation (EBMT) is a method of machine translation often characterized by its use of a bilingual corpus with parallel texts as its main knowledge base at run-time. It is essentially translation by analogy and can be viewed as an implementation of a case-based reasoning approach to machine learning. Translation by analogy: At the foundation of example-based machine translation is the idea of translation by analogy. When applied to the process of human translation, the idea that translation takes place by analogy is a rejection of the idea that people translate sentences by doing deep linguistic analysis. Instead, it is founded on the belief that people translate by first decomposing a sentence into phrases, then translating these phrases, and finally composing the translated fragments into one long sentence. Phrases are translated by analogy to previous translations. The principle of translation by analogy is encoded to example-b ...
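The decompose, translate, and compose steps described above can be illustrated with a toy sketch. It is not how any particular EBMT system works: the example base `EXAMPLES`, the function name `translate_by_analogy`, and the greedy longest-match strategy are all assumptions made for illustration, and composition here is plain concatenation in source order.

```python
# Hypothetical example base: English-to-Spanish phrase pairs taken from
# imagined previous translations.
EXAMPLES = {
    "good morning": "buenos días",
    "my friend": "mi amigo",
    "thank you": "gracias",
}


def translate_by_analogy(sentence, examples=EXAMPLES):
    """Decompose a sentence into known phrases, translate each, and recompose."""
    words = sentence.lower().split()
    output = []
    i = 0
    while i < len(words):
        # Try the longest candidate phrase starting at position i first.
        for j in range(len(words), i, -1):
            phrase = " ".join(words[i:j])
            if phrase in examples:
                output.append(examples[phrase])
                i = j
                break
        else:
            # No stored example covers this word: pass it through unchanged.
            output.append(words[i])
            i += 1
    return " ".join(output)


print(translate_by_analogy("Good morning my friend"))
```

A realistic system would match against whole example sentences, handle reordering, and rank competing fragments; the sketch only shows the shape of the pipeline.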


Machine Translation
Machine translation, sometimes referred to by the abbreviation MT (not to be confused with computer-aided translation, machine-aided human translation or interactive translation), is a sub-field of computational linguistics that investigates the use of software to translate text or speech from one language to another. On a basic level, MT performs mechanical substitution of words in one language for words in another, but that alone rarely produces a good translation because recognition of whole phrases and their closest counterparts in the target language is needed. Not all words in one language have equivalent words in another language, and many words have more than one meaning. Solving this problem with corpus-based statistical and neural techniques is a rapidly growing field that is leading to better translations, handling differences in linguistic typology, translation of idioms, and the isolation of anomalies. Current machine translation software often allows for customizat ...


Adverb
An adverb is a word or an expression that generally modifies a verb, adjective, another adverb, determiner, clause, preposition, or sentence. Adverbs typically express manner, place, time, frequency, degree, level of certainty, etc., answering questions such as ''how'', ''in what way'', ''when'', ''where'', ''to what extent''. This is called the adverbial function and may be performed by single words (adverbs) or by multi-word adverbial phrases and adverbial clauses. Adverbs are traditionally regarded as one of the parts of speech. Modern linguists note that the term "adverb" has come to be used as a kind of "catch-all" category, used to classify words with various types of syntactic behavior, not necessarily having much in common except that they do not fit into any of the other available categories (noun, adjective, preposition, etc.). Functions: The English word ''adverb'' derives (through French) from Latin ''adverbium'', from ''ad-'' ("to"), ''verbum'' ("word", "verb"), ...


Statistical Machine Translation
Statistical machine translation (SMT) is a machine translation paradigm where translations are generated on the basis of statistical models whose parameters are derived from the analysis of bilingual text corpora. The statistical approach contrasts with the rule-based approaches to machine translation as well as with example-based machine translation, and has more recently been superseded by neural machine translation in many applications (see this article's final section). The first ideas of statistical machine translation were introduced by Warren Weaver in 1949, including the ideas of applying Claude Shannon's information theory. Statistical machine translation was re-introduced in the late 1980s and early 1990s by researchers at IBM's Thomas J. Watson Research Center and has contributed to the significant resurgence in interest in machine translation in recent years. Before the introduction of neural machine translation, it was by far the most widely studied machine translati ...


Natural Language Processing
Natural language processing (NLP) is an interdisciplinary subfield of linguistics, computer science, and artificial intelligence concerned with the interactions between computers and human language, in particular how to program computers to process and analyze large amounts of natural language data. The goal is a computer capable of "understanding" the contents of documents, including the contextual nuances of the language within them. The technology can then accurately extract information and insights contained in the documents as well as categorize and organize the documents themselves. Challenges in natural language processing frequently involve speech recognition, natural-language understanding, and natural-language generation. History: Natural language processing has its roots in the 1950s. Already in 1950, Alan Turing published an article titled "Computing Machinery and Intelligence" which proposed what is now called the Turing test as a criterion of intelligence, t ...


Translation Memory
A translation memory (TM) is a database that stores "segments", which can be sentences, paragraphs or sentence-like units (headings, titles or elements in a list) that have previously been translated, in order to aid human translators. The translation memory stores the source text and its corresponding translation in language pairs called "translation units". Individual words are handled by terminology bases and are not within the domain of TM. Software programs that use translation memories are sometimes known as translation memory managers (TMM) or translation memory systems (TM systems, not to be confused with a translation management system (TMS), which is another type of software focused on managing the translation process). Translation memories are typically used in conjunction with a dedicated computer-assisted translation (CAT) tool, word processing program, terminology management system, multilingual dictionary, or even raw machine translation output. Research indicat ...
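As a rough illustration of the data structure described above, the following sketch stores translation units as (source, target) pairs and retrieves the closest stored segment for a new one. The class name `TranslationMemory`, the use of `difflib.SequenceMatcher` for similarity, and the 0.75 threshold are assumptions for this example and do not describe any specific TM product.

```python
from difflib import SequenceMatcher


class TranslationMemory:
    """Minimal in-memory store of translation units (source/target pairs)."""

    def __init__(self):
        self.units = []  # list of (source_segment, target_segment)

    def add(self, source, target):
        self.units.append((source, target))

    def lookup(self, segment, threshold=0.75):
        """Return (source, target, score) for the best match, or None."""
        best_unit = None
        best_score = 0.0
        for source, target in self.units:
            # Character-level similarity in [0, 1]; 1.0 is an exact match.
            score = SequenceMatcher(None, segment.lower(), source.lower()).ratio()
            if score > best_score:
                best_unit, best_score = (source, target), score
        if best_unit is not None and best_score >= threshold:
            return best_unit[0], best_unit[1], best_score
        return None


tm = TranslationMemory()
tm.add("Save the file.", "Enregistrez le fichier.")
print(tm.lookup("Save the file."))
```

The translator would then accept, edit, or reject the suggested target segment; segments scoring below the threshold yield no suggestion.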




Programming By Example
In computer science, programming by example (PbE), also termed programming by demonstration or more generally demonstrational programming, is an end-user development technique for teaching a computer new behavior by demonstrating actions on concrete examples. The system records user actions and infers a generalized program that can be used on new examples. PbE is intended to be easier to do than traditional computer programming, which generally requires learning and using a programming language. Many PbE systems have been developed as research prototypes, but few have found widespread real-world application. More recently, PbE has proved to be a useful paradigm for creating scientific work-flows. PbE is used in two independent clients for the BioMOBY protocol: Seahawk and Gbrowse moby. The term programming by demonstration (PbD) has also been adopted mostly by robotics researchers for teaching new behaviors to a robot through a physical demonstration of the task. The usual distinc ...
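The idea of inferring a generalized program from concrete examples can be sketched in a few lines. This is only a toy: the hypothesis space `CANDIDATES` and the function `infer_program` are invented for illustration, and real PbE systems search far richer program spaces.

```python
# Hypothetical hypothesis space: a handful of named string transformations.
CANDIDATES = {
    "upper": str.upper,
    "lower": str.lower,
    "title": str.title,
    "reverse": lambda s: s[::-1],
    "strip": str.strip,
}


def infer_program(examples):
    """Return (name, function) for the first candidate consistent with all
    (input, output) examples, or None if no candidate fits."""
    for name, fn in CANDIDATES.items():
        if all(fn(inp) == out for inp, out in examples):
            return name, fn
    return None


# The user demonstrates the desired behavior on two concrete inputs.
name, program = infer_program([("abc", "ABC"), ("hi", "HI")])
print(name, program("new input"))
```

When several candidates fit the examples, this sketch simply returns the first one; asking the user for a further example is one way to resolve such ambiguity.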


Hindustani Language
Hindustani is the ''lingua franca'' of Northern and Central India and Pakistan. Hindustani is a pluricentric language with two standard registers, known as Hindi and Urdu. Thus, the language is sometimes called Hindi–Urdu. Despite these standard registers, colloquial speech in Hindustani often exists on a spectrum between these standards. Ancestors of the language were known as ''Hindui'', ''Hindavi'', ''Zabān-e Hind'', ''Zabān-e Hindustan'', ''Hindustan ki boli'', Rekhta, and Hindi. Its regional dialects became known as ''Zabān-e Dakhani'' in southern India, ''Zabān-e Gujari'' in Gujarat, and as ''Zabān-e Dehlavi'' or Urdu around Delhi. It is an Indo-Aryan language, deriving its base primarily from the Western Hindi dialect of Delhi, also known as Khariboli. Hindustani is a pluricentric language, best characterised as a continuum between two standardised registers: Modern Standard Hindi and Modern ...


Particle (grammar)
In grammar, the term ''particle'' has a traditional meaning, as a part of speech that cannot be inflected, and a modern meaning, as a function word associated with another word or phrase, generally in order to impart meaning. Although a particle may have an intrinsic meaning, and indeed may fit into other grammatical categories, the fundamental idea of the particle is to add context to the sentence, expressing a mood or indicating a specific action. In English, for instance, the phrase "oh well" has no purpose in speech other than to convey a mood. The word "up" would be a particle in the phrase "look up" (as in "look up this topic"), implying that one researches something, rather than literally gazing skywards. Many languages use particles, in varying amounts and for varying reasons. In Hindi, for instance, they may be used as honorifics, or to indicate emphasis or negation. In some languages they are more clearly defined, such as Chinese, whic ...


Preposition
Prepositions and postpositions, together called adpositions (or broadly, in traditional grammar, simply prepositions), are a class of words used to express spatial or temporal relations (''in'', ''under'', ''towards'', ''before'') or mark various semantic roles (''of'', ''for''). A preposition or postposition typically combines with a noun phrase, this being called its complement, or sometimes object. A preposition comes before its complement; a postposition comes after its complement. English generally has prepositions rather than postpositions – words such as ''in'', ''under'' and ''of'' precede their objects, such as ''in England'', ''under the table'', ''of Jane'' – although there are a few exceptions including "ago" and "notwithstanding", as in "three days ago" and "financial limitations notwithstanding". Some languages that use a different word order have postpositions instead, or have both types. The phrase formed by a preposition or postposition together with its comp ...


Phrasal Verb
In the traditional grammar of Modern English, a phrasal verb typically constitutes a single semantic unit composed of a verb followed by a particle (examples: ''turn down'', ''run into'' or ''sit up''), sometimes combined with a preposition (examples: ''get together with'', ''run out of'' or ''feed off of''). Alternative terms include verb-adverb combination, verb-particle construction, two-part word/verb or three-part word/verb (depending on the number of particles) and multi-word verb. Phrasal verbs ordinarily cannot be understood based upon the meanings of the individual parts alone but must be considered as a whole: the meaning is non-compositional and thus unpredictable. Phrasal verbs are differentiated from other classifications of multi-word verbs and free combinations by criteria based on idiomaticity, replacement by a single-word verb, wh-question formation and particle movement. Types: The category "phrasal verb" is mainly used in English as a second language teac ...


Text Corpus
In linguistics, a corpus (plural ''corpora'') or text corpus is a language resource consisting of a large and structured set of texts (nowadays usually electronically stored and processed). In corpus linguistics, corpora are used for statistical analysis and hypothesis testing, checking occurrences or validating linguistic rules within a specific language territory. In search technology, a corpus is the collection of documents which is being searched. Overview: A corpus may contain texts in a single language (''monolingual corpus'') or text data in multiple languages (''multilingual corpus''). In order to make the corpora more useful for doing linguistic research, they are often subjected to a process known as annotation. An example of annotating a corpus is part-of-speech tagging, or ''POS-tagging'', in which information about each word's part of speech (verb, noun, adjective, etc.) is added to the corpus in the form o ...




Rule-based Machine Translation
Rule-based machine translation (RBMT; "Classical Approach" of MT) refers to machine translation systems based on linguistic information about source and target languages, retrieved mainly from (unilingual, bilingual or multilingual) dictionaries and grammars covering the main semantic, morphological, and syntactic regularities of each language. Given input sentences (in some source language), an RBMT system transforms them into output sentences (in some target language) on the basis of morphological, syntactic, and semantic analysis of both the source and the target languages involved in a concrete translation task. History: The first RBMT systems were developed in the early 1970s. The most important steps of this evolution were the emergence of the following RBMT systems: * Systran (http://www.systran.de/) * Japanese MT systems (http://aamt.info/english/mtsys.htm, http://www.wtec.org/loyola/ar93_94/mt.htm) Today, other common RBMT syste ...