Evaluation Of Machine Translation
   HOME
*





Evaluation Of Machine Translation
Various methods for the evaluation for machine translation have been employed. This article focuses on the evaluation of the output of machine translation, rather than on performance or usability evaluation. Round-trip translation A typical way for lay people to assess machine translation quality is to translate from a source language to a target language and back to the source language with the same engine. Though intuitively this may seem like a good method of evaluation, it has been shown that round-trip translation is a "poor predictor of quality". The reason why it is such a poor predictor of quality is reasonably intuitive. A round-trip translation is not testing one system, but two systems: the language pair of the engine for translating ''into'' the target language, and the language pair translating ''back from'' the target language. Consider the following examples of round-trip translation performed from English to Italian and Portuguese from Somers (2005): : : In the ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]  


picture info

Machine Translation
Machine translation, sometimes referred to by the abbreviation MT (not to be confused with computer-aided translation, machine-aided human translation or interactive translation), is a sub-field of computational linguistics that investigates the use of software to translate text or speech from one language to another. On a basic level, MT performs mechanical substitution of words in one language for words in another, but that alone rarely produces a good translation because recognition of whole phrases and their closest counterparts in the target language is needed. Not all words in one language have equivalent words in another language, and many words have more than one meaning. Solving this problem with corpus statistical and neural techniques is a rapidly growing field that is leading to better translations, handling differences in linguistic typology, translation of idioms, and the isolation of anomalies. Current machine translation software often allows for customizat ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]  


BLEU
Bleu or BLEU may refer to: * the French word for blue * '' Three Colors: Blue'', a 1993 movie * BLEU (Bilingual Evaluation Understudy), a machine translation evaluation metric * Belgium–Luxembourg Economic Union * Blue cheese, a type of cheese * Parti bleu, 19th century political group in Quebec, Canada * ''Bleu'' (blue-rare), synonymous with "extra rare", indicating a barely-cooked meat preparation; very red and cold * ''Le Bleu'' (2001 album) album by Justin King People * Bleu (musician), a member of pop-group L.E.O. * Corbin Bleu, an American actor, model, dancer and vocalist * Deis, a character from the ''Breath of Fire'' role-playing videogame series who is known as "Bleu" in the English versions See also * Blue (other) * Lebleu (other) * Les Bleus (other) Les Bleus may refer to: National team of France ''Les Bleus'' (French for "The Blues") is often used in a French sporting context, and in particular may refer to: * France's national team: ** ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]  


Bonnie Dorr
Bonnie Jean Dorr is an American computer scientist specializing in natural language processing and machine translation. She is a professor emerita of computer science and linguistics at the University of Maryland, College Park, an associate director and senior research scientist at the Florida Institute for Human and Machine Cognition, and the former president of the Association for Computational Linguistics. Education and career Dorr is a graduate of Boston University, and earned a Ph.D. in 1990 from the Massachusetts Institute of Technology. Her dissertation, ''Lexical Conceptual Structure and Machine Translation'', was supervised by Robert C. Berwick. Dorr joined the University of Maryland faculty in 1992. At Maryland, she became the founding co-director of the Computational Linguistics and Information Processing Laboratory, and associate dean of the university's College of Computer Math and Natural Sciences. She has also worked as a program director at DARPA beginning in 201 ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]  


Machine Translation Software Usability
The sections below give objective criteria for evaluating the usability of machine translation software output. Stationarity or canonical form Do repeated translations converge on a single expression in both languages? I.e. does the translation method show stationarity or produce a canonical form? Does the translation become stationary without losing the original meaning? This metric has been criticized as not being well correlated with BLEU (BiLingual Evaluation Understudy) scores. Adaptive to colloquialism, argot or slang Is the system adaptive to colloquialism, argot or slang? The French language has many rules for creating words in the speech and writing of popular culture. Two such rules are: (a) The reverse spelling of words such as ''femme'' to ''meuf''. (This is called verlan.) (b) The attachment of the suffix ''-ard'' to a noun or verb to form a proper noun. For example, the noun ''faluche'' means "student hat". The word ''faluchard'' formed from ''faluche'' c ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]  




Comparison Of Machine Translation Applications
Machine translation is an algorithm which attempts to translate text or speech from one natural language to another. General information Basic general information for popular machine translation applications. Languages features comparison The following table compares the number of languages which the following machine translation programs can translate between. (Moses and Moses for Mere Mortals allow you to train translation models for any language pair, though collections of translated texts (parallel corpus) need to be provided by the user. The Moses site provides links to training corpora.) This is not an all-encompassing list. Some applications have many more language pairs than those listed below. This is a general comparison of key languages only. A full and accurate list of language pairs supported by each product should be found on each of the product's websites. See also * Machine translation * Machine translation software usability * Computer-assisted tr ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]  


picture info

Amazon Mechanical Turk
Amazon Mechanical Turk (MTurk) is a crowdsourcing website for businesses to hire remotely located "crowdworkers" to perform discrete on-demand tasks that computers are currently unable to do. It is operated under Amazon Web Services, and is owned by Amazon. Employers (known as ''requesters'') post jobs known as ''Human Intelligence Tasks'' (HITs), such as identifying specific content in an image or video, writing product descriptions, or answering survey questions. Workers, colloquially known as ''Turkers'' or ''crowdworkers'', browse among existing jobs and complete them in exchange for a fee set by the employer. To place jobs, the requesting programs use an open application programming interface (API), or the more limited MTurk Requester site. As of April 2019, Requesters could register from 49 approved countries. History The service was conceived by Venky Harinarayan in a US patent disclosure in 2001.Multiple sources: * * Amazon coined the term ''artificial artificial intelli ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]  


picture info

Translation
Translation is the communication of the Meaning (linguistic), meaning of a #Source and target languages, source-language text by means of an Dynamic and formal equivalence, equivalent #Source and target languages, target-language text. The English language draws a terminology, terminological distinction (which does not exist in every language) between ''translating'' (a written text) and ''Language interpretation, interpreting'' (oral or Sign language, signed communication between users of different languages); under this distinction, translation can begin only after the appearance of writing within a language community. A translator always risks inadvertently introducing source-language words, grammar, or syntax into the target-language rendering. On the other hand, such "spill-overs" have sometimes imported useful source-language calques and loanwords that have enriched target languages. Translators, including early translators of sacred texts, have helped shape the very l ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]  


picture info

Harmonic Mean
In mathematics, the harmonic mean is one of several kinds of average, and in particular, one of the Pythagorean means. It is sometimes appropriate for situations when the average rate is desired. The harmonic mean can be expressed as the reciprocal of the arithmetic mean of the reciprocals of the given set of observations. As a simple example, the harmonic mean of 1, 4, and 4 is : \left(\frac\right)^ = \frac = \frac = 2\,. Definition The harmonic mean ''H'' of the positive real numbers x_1, x_2, \ldots, x_n is defined to be :H = \frac = \frac = \left(\frac\right)^. The third formula in the above equation expresses the harmonic mean as the reciprocal of the arithmetic mean of the reciprocals. From the following formula: :H = \frac. it is more apparent that the harmonic mean is related to the arithmetic and geometric means. It is the reciprocal dual of the arithmetic mean for positive inputs: :1/H(1/x_1 \ldots 1/x_n) = A(x_1 \ldots x_n) The harmonic mean is a Schur-con ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]  


Speech Recognition
Speech recognition is an interdisciplinary subfield of computer science and computational linguistics that develops methodologies and technologies that enable the recognition and translation of spoken language into text by computers with the main benefit of searchability. It is also known as automatic speech recognition (ASR), computer speech recognition or speech to text (STT). It incorporates knowledge and research in the computer science, linguistics and computer engineering fields. The reverse process is speech synthesis. Some speech recognition systems require "training" (also called "enrollment") where an individual speaker reads text or isolated vocabulary into the system. The system analyzes the person's specific voice and uses it to fine-tune the recognition of that person's speech, resulting in increased accuracy. Systems that do not use training are called "speaker-independent" systems. Systems that use training are called "speaker dependent". Speech recognition ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]  


picture info

Levenshtein Distance
In information theory, linguistics, and computer science, the Levenshtein distance is a string metric for measuring the difference between two sequences. Informally, the Levenshtein distance between two words is the minimum number of single-character edits (insertions, deletions or substitutions) required to change one word into the other. It is named after the Soviet mathematician Vladimir Levenshtein, who considered this distance in 1965. Levenshtein distance may also be referred to as ''edit distance'', although that term may also denote a larger family of distance metrics known collectively as edit distance. It is closely related to pairwise string alignments. Definition The Levenshtein distance between two strings a, b (of length , a, and , b, respectively) is given by \operatorname(a, b) where : \operatorname(a, b) = \begin , a, & \text , b, = 0, \\ , b, & \text , a, = 0, \\ \operatorname\big(\operatorname(a),\operatorname(b)\big) & \text a = b \\ 1 + \min ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]  


N-gram
In the fields of computational linguistics and probability, an ''n''-gram (sometimes also called Q-gram) is a contiguous sequence of ''n'' items from a given sample of text or speech. The items can be phonemes, syllables, letters, words or base pairs according to the application. The ''n''-grams typically are collected from a text or speech corpus. When the items are words, -grams may also be called ''shingles''. Using Latin numerical prefixes, an ''n''-gram of size 1 is referred to as a "unigram"; size 2 is a "bigram" (or, less commonly, a "digram"); size 3 is a "trigram". English cardinal numbers are sometimes used, e.g., "four-gram", "five-gram", and so on. In computational biology, a polymer or oligomer of a known size is called a ''k''-mer instead of an ''n''-gram, with specific names using Greek numerical prefixes such as "monomer", "dimer", "trimer", "tetramer", "pentamer", etc., or English cardinal numbers, "one-mer", "two-mer", "three-mer", etc. Applications ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]  


picture info

Text Domain
Genre () is any form or type of communication in any mode (written, spoken, digital, artistic, etc.) with socially-agreed-upon conventions developed over time. In popular usage, it normally describes a category of literature, music, or other forms of art or entertainment, whether written or spoken, audio or visual, based on some set of stylistic criteria, yet genres can be aesthetic, rhetorical, communicative, or functional. Genres form by conventions that change over time as cultures invent new genres and discontinue the use of old ones. Often, works fit into multiple genres by way of borrowing and recombining these conventions. Stand-alone texts, works, or pieces of communication may have individual styles, but genres are amalgams of these texts based on agreed-upon or socially inferred conventions. Some genres may have rigid, strictly adhered-to guidelines, while others may show great flexibility. Genre began as an absolute classification system for ancient Greek literature, ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]