Evaluation Of Machine Translation
Various methods for the evaluation of machine translation have been employed. This article focuses on the evaluation of the output of machine translation, rather than on performance or usability evaluation.
Round-trip translation
A typical way for lay people to assess machine translation quality is to translate from a source language to a target language and back to the source language with the same engine. Though intuitively this may seem like a good method of evaluation, it has been shown that round-trip translation is a "poor predictor of quality". The reason is reasonably intuitive: a round-trip translation tests not one system but two, namely the language pair of the engine for translating into the target language and the language pair translating back from the target language. Consider the following examples of round-trip translation performed from English to Italian and Portuguese from Somers (2005): : : In t ...
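The two-system argument above can be illustrated with a minimal sketch. Everything in it is hypothetical: the word-for-word lookup tables stand in for real engines, and the target-side tokens (`x_riverbank`, `x_moneybank`, `x_loan`) are invented placeholders rather than words of any real language. It is not taken from Somers (2005); it only shows how a round trip can look perfect when the forward translation is wrong, and look bad when the forward translation is right.

```python
def translate(sentence, table):
    """Word-for-word lookup; unknown words pass through unchanged."""
    return " ".join(table.get(word, word) for word in sentence.split())


def round_trip(sentence, forward, backward):
    """Return (intermediate target text, text translated back to the source)."""
    intermediate = translate(sentence, forward)
    return intermediate, translate(intermediate, backward)


# Hypothetical toy "engines". Target-side tokens are invented placeholders,
# not words of any real language.
FORWARD_WRONG_SENSE = {"bank": "x_riverbank", "loan": "x_loan"}
BACKWARD_FORGIVING = {"x_riverbank": "bank", "x_loan": "loan"}

FORWARD_RIGHT_SENSE = {"bank": "x_moneybank", "loan": "x_loan"}
BACKWARD_FAULTY = {"x_moneybank": "bench", "x_loan": "loan"}

source = "bank loan"

# Case 1: the forward engine picks the wrong sense, yet the round trip is perfect.
mid_1, back_1 = round_trip(source, FORWARD_WRONG_SENSE, BACKWARD_FORGIVING)

# Case 2: the forward engine picks the right sense, yet the round trip looks bad.
mid_2, back_2 = round_trip(source, FORWARD_RIGHT_SENSE, BACKWARD_FAULTY)

print(mid_1, "->", back_1)
print(mid_2, "->", back_2)
```

In both cases the round-trip result says little about the forward engine alone, because an error in either direction can be masked or introduced by the other.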


Machine Translation
Machine translation is the use of computational techniques to translate text or speech from one language to another, including the contextual, idiomatic and pragmatic nuances of both languages. Early approaches were mostly rule-based or statistical. These methods have since been superseded by neural machine translation and large language models.
History
Origins
The origins of machine translation can be traced back to the work of Al-Kindi, a ninth-century Arabic cryptographer who developed techniques for systemic language translation, including cryptanalysis, frequency analysis, and probability and statistics, which are used in modern machine translation. The idea of machine translation later appeared in the 17th century. In 1629, René Descartes proposed a universal language, with equivalent ideas in different tongues sharing one symbol. The idea of using digital computers for translation of natural languages was proposed as early as 1947 by England's A. D. Booth and Warr ...


BLEU
Bleu or BLEU may refer to:
* Three Colors: Blue, a 1993 film
* BLEU (Bilingual Evaluation Understudy), a machine translation evaluation metric
* Belgium–Luxembourg Economic Union
* Blue cheese, a type of cheese
* Parti bleu, 19th century political group in Quebec, Canada
* Bleu (blue-rare), synonymous with "extra rare", indicating a barely-cooked meat preparation; very red and cold
* Le Bleu, a 2001 album by Justin King
People
* Bleu (musician), a member of the pop group L.E.O.
* Corbin Bleu, an American actor, model, dancer and singer
* Yung Bleu, an American record producer, rapper and singer also known as "Bleu"
* Deis, a character from the Breath of Fire role-playing video game series known as "Bleu" in the English versions
See also
* Blue (other) Blue is a color. Blue may also refer to: Places * Blue, Arizona, an unincorporated community in the United States * Blue, Oklahoma, an unincorporated community in the United States * Blue, W ...


Bonnie Dorr
Bonnie Jean Dorr is an American computer scientist specializing in natural language processing, machine translation, automatic summarization, social computing, and explainable artificial intelligence. She is a professor and director of the Natural Language Processing Research Laboratory in the Department of Computer & Information Science & Engineering at the University of Florida in Gainesville, Florida. She is professor emerita of computer science and linguistics and former dean at the University of Maryland, College Park, former associate director at the Florida Institute for Human and Machine Cognition, and former president of the Association for Computational Linguistics.
Education
Dorr is a graduate of Boston University, and earned both a Master's (1986) and a Ph.D. (1990) from the Massachusetts Institute of Technology. Her dissertation, Lexical Conceptual Structure and Machine Translation, was supervised by Robert C. Berwick.
Academic career
Dorr joined the University o ...


Machine Translation Software Usability
The sections below give objective criteria for evaluating the usability of machine translation software output.
Stationarity or canonical form
Do repeated translations converge on a single expression in both languages? That is, does the translation method show stationarity or produce a canonical form? Does the translation become stationary without losing the original meaning? This metric has been criticized as not being well correlated with BLEU (BiLingual Evaluation Understudy) scores.
Adaptive to colloquialism, argot or slang
Is the system adaptive to colloquialism, argot or slang? The French language has many rules for creating words in the speech and writing of popular culture. Two such rules are: (a) The reverse spelling of words such as femme to meuf. (This is called verlan.) (b) The attachment of the suffix -ard to a noun or verb to form a proper noun. For example, the noun faluche means "student hat". The word faluchard formed from faluche ...




Comparison Of Machine Translation Applications
Machine translation is an algorithm which attempts to translate text or speech from one natural language to another.
General information
Basic general information for popular machine translation applications.
Languages features comparison
The following table compares the number of languages which the following machine translation programs can translate between. (Moses and Moses for Mere Mortals allow you to train translation models for any language pair, though the user must provide collections of translated texts (a parallel corpus). The Moses site provides links to training corpora.) This is not an all-encompassing list. Some applications have many more language pairs than those listed below. This is a general comparison of key languages only. A full and accurate list of the language pairs supported by each product should be available on that product's website.
Multi-pair translations
Paired translations
See also
* Machine translation * Machine tran ...


Amazon Mechanical Turk
Amazon Mechanical Turk (MTurk) is a crowdsourcing website with which businesses can hire remotely located "crowdworkers" to perform discrete on-demand tasks that computers are currently unable to do as economically. It is operated under Amazon Web Services, and is owned by Amazon. Employers, known as requesters, post jobs known as Human Intelligence Tasks (HITs), such as identifying specific content in an image or video, writing product descriptions, or answering survey questions. Workers, colloquially known as Turkers or crowdworkers, browse among existing jobs and complete them in exchange for a fee set by the requester. To place jobs, requesters use an open application programming interface (API), or the more limited MTurk Requester site. Requesters could register from 49 approved countries.
History
The service was conceived by Venky Harinarayan in a U.S. patent disclosure in 2001. Amazon coined the term artificial artificial intelligence for processes th ...


Translation
Translation is the communication of the meaning of a source-language text by means of an equivalent target-language text. The English language draws a terminological distinction (which does not exist in every language) between translating (a written text) and interpreting (oral or signed communication between users of different languages); under this distinction, translation can begin only after the appearance of writing within a language community. A translator always risks inadvertently introducing source-language words, grammar, or syntax into the target-language rendering. On the other hand, such "spill-overs" have sometimes imported useful source-language calques and loanwords that have enriched target languages. Translators, including early translators of sacred texts, have helped shape the very languages into which they have translated. Becau ...


Harmonic Mean
In mathematics, the harmonic mean is a kind of average, one of the Pythagorean means. It is the most appropriate average for ratios and rates such as speeds, and is normally only used for positive arguments. The harmonic mean is the reciprocal of the arithmetic mean of the reciprocals of the numbers, that is, the generalized f-mean with f(x) = \frac{1}{x}. For example, the harmonic mean of 1, 4, and 4 is
:\left(\frac{1^{-1} + 4^{-1} + 4^{-1}}{3}\right)^{-1} = \frac{3}{\frac{1}{1} + \frac{1}{4} + \frac{1}{4}} = \frac{3}{1.5} = 2\,.
Definition
The harmonic mean H of the positive real numbers x_1, x_2, \ldots, x_n is
:H(x_1, x_2, \ldots, x_n) = \frac{n}{\frac{1}{x_1} + \frac{1}{x_2} + \cdots + \frac{1}{x_n}} = \frac{n}{\sum_{i=1}^n \frac{1}{x_i}}.
It is the reciprocal of the arithmetic mean of the reciprocals, and vice versa:
:\begin{align} H(x_1, x_2, \ldots, x_n) &= \frac{1}{A\left(\frac{1}{x_1}, \frac{1}{x_2}, \ldots, \frac{1}{x_n}\right)}, \\ A(x_1, x_2, \ldots, x_n) &= \frac{1}{H\left(\frac{1}{x_1}, \frac{1}{x_2}, \ldots, \frac{1}{x_n}\right)}, \end{align}
where the arithmetic mean is A(x_1, x_2, \ldots, x_n) = \tfrac1n \sum_{i=1}^n x_i. The harmonic mean is a Schur-concave function, and is greater than or equal to the minimum of its arguments: for positive a ...
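The definition above (n divided by the sum of the reciprocals) can be checked with a short sketch. The function name `harmonic_mean` and the choice of exact `Fraction` arithmetic are assumptions made for illustration; Python's standard library also ships `statistics.harmonic_mean`, which would normally be preferred.

```python
from fractions import Fraction


def harmonic_mean(values):
    """Return n / (1/x_1 + ... + 1/x_n) for positive inputs, as an exact Fraction."""
    values = list(values)
    if not values:
        raise ValueError("harmonic mean needs at least one value")
    if any(v <= 0 for v in values):
        raise ValueError("this sketch only accepts positive values")
    # Sum of reciprocals, kept exact to avoid floating-point rounding.
    reciprocal_sum = sum(1 / Fraction(v) for v in values)
    return len(values) / reciprocal_sum


# Worked example from the text: the harmonic mean of 1, 4, and 4.
result = harmonic_mean([1, 4, 4])
print(result)  # 2
```

The result is never below the smallest input, in line with the bound stated in the text.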


Speech Recognition
Speech recognition is an interdisciplinary subfield of computer science and computational linguistics that develops methodologies and technologies that enable the recognition and translation of spoken language into text by computers. It is also known as automatic speech recognition (ASR), computer speech recognition or speech-to-text (STT). It incorporates knowledge and research in the computer science, linguistics and computer engineering fields. The reverse process is speech synthesis. Some speech recognition systems require "training" (also called "enrollment") where an individual speaker reads text or isolated vocabulary into the system. The system analyzes the person's specific voice and uses it to fine-tune the recognition of that person's speech, resulting in increased accuracy. Systems that do not use training are called "speaker-independent" systems. Systems that use training are called "speaker-dependent" systems. Speech recognition applications include voice user interfaces ...


Levenshtein Distance
In information theory, linguistics, and computer science, the Levenshtein distance is a string metric for measuring the difference between two sequences. The Levenshtein distance between two words is the minimum number of single-character edits (insertions, deletions or substitutions) required to change one word into the other. It is named after Soviet mathematician Vladimir Levenshtein, who defined the metric in 1965. Levenshtein distance may also be referred to as edit distance, although that term may also denote a larger family of distance metrics known collectively as edit distance. It is closely related to pairwise string alignments.
Definition
The Levenshtein distance between two strings a, b (of length |a| and |b| respectively) is given by \operatorname{lev}(a, b) where
: \operatorname{lev}(a, b) = \begin{cases} |a| & \text{if } |b| = 0, \\ |b| & \text{if } |a| = 0, \\ \operatorname{lev}\big(\operatorname{tail}(a),\operatorname{tail}(b)\big) & \text{if } \operatorname{head}(a) = \operatorname{head}(b), \\ 1 ...
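The prose definition above (the minimum number of single-character insertions, deletions or substitutions) can be computed with the usual dynamic-programming table. The sketch below is one common way to do it, not necessarily the formulation the article goes on to give; the function name `levenshtein` is an assumption, and only two rows of the table are kept to save memory.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character edits turning a into b."""
    # Keep the shorter string on the inner loop so each row is as short as possible.
    if len(a) < len(b):
        a, b = b, a
    # previous[j] = distance between the empty prefix of a and the first j chars of b.
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]  # distance from the first i chars of a to the empty string
        for j, char_b in enumerate(b, start=1):
            substitution_cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,                      # delete char_a
                current[j - 1] + 1,                   # insert char_b
                previous[j - 1] + substitution_cost,  # substitute or match
            ))
        previous = current
    return previous[-1]


print(levenshtein("kitten", "sitting"))  # 3
```

The empty-string cases of the definition fall out of the initial row and the first entry of each new row.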


N-gram
An n-gram is a sequence of n adjacent symbols in a particular order. The symbols may be n adjacent letters (including punctuation marks and blanks), syllables, or rarely whole words found in a language dataset; or adjacent phonemes extracted from a speech-recording dataset, or adjacent base pairs extracted from a genome. They are collected from a text corpus or speech corpus. If Latin numerical prefixes are used, then an n-gram of size 1 is called a "unigram", size 2 a "bigram" (or, less commonly, a "digram"), etc. If English cardinal numbers are used instead of the Latin prefixes, then they are called "four-gram", "five-gram", etc. Similarly, Greek numerical prefixes such as "monomer", "dimer", "trimer", "tetramer", "pentamer", etc., or English cardinal numbers, "one-mer", "two-mer", "three-mer", etc., are used in computational biology for polymers or oligomers of a known size, called k-mer, ''k'' ...
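As a minimal sketch of the definition above, the following function slides a window of size n over any sequence of symbols. The name `ngrams` and the tuple output format are choices made for illustration; the same function works on characters, words, or any other symbol type.

```python
def ngrams(symbols, n):
    """Return every run of n adjacent symbols, in order, as tuples."""
    if n < 1:
        raise ValueError("n must be at least 1")
    symbols = list(symbols)
    # A sequence shorter than n has no n-grams, so the range is empty.
    return [tuple(symbols[i:i + n]) for i in range(len(symbols) - n + 1)]


# Word bigrams from a tokenized sentence.
print(ngrams("to be or not".split(), 2))

# Character trigrams from a string.
print(ngrams("gram", 3))
```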




Text Domain
Genre is any style or form of communication in any mode (written, spoken, digital, artistic, etc.) with socially agreed-upon conventions developed over time. In popular usage, it normally describes a category of literature, music, or other forms of art or entertainment, based on some set of stylistic criteria, as in literary genres, film genres, music genres, comics genres, etc. Often, works fit into multiple genres by way of borrowing and recombining these conventions. Stand-alone texts, works, or pieces of communication may have individual styles, but genres are amalgams of these texts based on agreed-upon or socially inferred conventions. Some genres may have rigid, strictly adhered-to guidelines, while others may show great flexibility. The proper use of a specific genre is important for a successful transfer of information (media-adequacy). Critical discussion of genre perhaps began with a classification system for ancient Greek literature, as set out in Aristotle's ''Poe ...