Caitra

Caitra is a computer-assisted translation (CAT) tool developed by the University of Edinburgh. Provided as an online platform, Caitra is based on AJAX Web 2.0 technologies and the Moses decoder. The web interface of the tool is implemented with Ruby on Rails, an open-source web framework, and C++. Caitra assists human translators by offering suggestions and alternative translations.


History

Machine translation (MT) systems are typically used by readers who do not need a thorough translation and want quick access to text in a foreign language. Professional translators, by contrast, usually require more advanced machine translation tools to make their work easier and to deliver higher-quality translations to their clients. The Trans-Type project (Langlais et al., 2000) pioneered the use of MT as an aid to human translators. Its translation tool suggested different translations for a segment, and the translator could either accept a suggestion or overwrite it with their own translation, which in turn triggered new suggestions from the tool. This approach is, however, not necessarily suitable for professional translators. Tools with post-editing facilities have also been developed as a middle ground between typical MT and human translation, in order to integrate the two and achieve the desired results. The School of Informatics and the Machine Translation Group of the University of Edinburgh created a research program, CAITRA, to analyze the benefits of different types of MT and to explore the interaction between the machine and the user in order to develop new CAT tools.


Properties

Caitra is programmed with an open-source web framework, Ruby on Rails (Thomas and Hansson, 2008). The online platform uses Ajax-style Web 2.0 technologies (Raymond, 2007) connected to a MySQL database-driven back-end. The machine translation back-end is powered by the statistical MT system Moses (Koehn et al., 2007). C++ is used to speed up the generation of translation suggestions. The tool is provided online by the School of Informatics as a means of studying users' interaction with the tool and of allowing members to suggest additional features and fixes to the program. The user enters text into the provided text box, and Caitra processes it when the user clicks the "Upload" icon. Processing may take a few minutes, during which Caitra finds different options for the translation, one of which is selected by default. Once processing is finished, the translator is offered several types of assistance, presented in an interface. The unit of translation is the sentence, so Caitra works on only one sentence at a time.


Interactive machine translation

The Trans-Type project (Langlais et al., 2000) investigated interactive machine translation, in which sentence segments are translated with the aid of a CAT tool that suggests several options for the translation. Human translators may choose one of them or provide their own translation if they do not like the suggestions. This process is similar to the auto-completion feature found in several office programs. The statistical translation system is used to generate the predictions. The predictions are offered as short phrases, in line with the statistical phrase-based translation model, which also makes them easier for the user to read. The suggestions and user actions are stored in a large database. During user interaction, Caitra quickly matches the user input against a graph using a string edit distance measure. The prediction is the optimal completion path that matches the user input with (a) minimal string edit distance and (b) highest sentence translation probability. This computation takes place on the server and is implemented in C++, as Philipp Koehn explains. Once the user accepts a suggestion or types a new segment, a new suggestion is displayed. The acceptance rate of suggestions depends on the language pair and the complexity of the text. Preliminary studies of CAITRA suggest that users usually accept 50-80% of the predictions generated by the system.
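The matching criterion described above can be illustrated with a minimal Python sketch. It is not Caitra's implementation, which runs in C++ on the server and searches a graph. The sketch assumes, purely for illustration, a flat list of candidate sentence translations with made-up probabilities, and it assumes that edit distance is compared first and translation probability is used only to break ties. The function names `prefix_edit_distance` and `predict_completion` are hypothetical.

```python
from typing import List, Optional, Tuple


def prefix_edit_distance(typed: str, candidate: str) -> Tuple[int, int]:
    """Return (distance, end): the smallest Levenshtein distance between
    `typed` and any prefix of `candidate`, and the length of that prefix."""
    # prev[j] holds the distance between typed[:i] and candidate[:j].
    prev = list(range(len(candidate) + 1))
    for i, typed_char in enumerate(typed, start=1):
        curr = [i]
        for j, cand_char in enumerate(candidate, start=1):
            cost = 0 if typed_char == cand_char else 1
            curr.append(min(prev[j] + 1,          # delete from typed
                            curr[j - 1] + 1,      # insert into typed
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    # On ties, prefer the longer matched prefix (an assumption of this sketch).
    end = min(range(len(prev)), key=lambda j: (prev[j], -j))
    return prev[end], end


def predict_completion(typed: str,
                       candidates: List[Tuple[str, float]]) -> Optional[str]:
    """Pick the candidate with (a) minimal edit distance to the typed text
    and (b) highest probability among ties; return the remaining text."""
    best_key = None
    best_completion = None
    for text, probability in candidates:
        distance, end = prefix_edit_distance(typed, text)
        key = (distance, -probability)
        if best_key is None or key < best_key:
            best_key = key
            best_completion = text[end:]
    return best_completion


# Hypothetical candidates; the probabilities are invented for the example.
candidates = [
    ("the house is small", 0.6),
    ("the home is small", 0.3),
    ("this house is little", 0.1),
]
print(repr(predict_completion("the hose", candidates)))  # ' is small'
```

Here the mistyped input "the hose" is one edit away from both "the house" and "the home", so the higher-probability candidate wins and its remainder is offered as the completion. A real system would search a graph of translation options rather than whole candidate sentences, so that many alternatives share common paths.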


Translation process

Once the text is uploaded, users can see the result of the machine translation and edit the text based on the predictions. The prediction table is displayed by clicking the edit icon. The text is divided into sentences, which are in turn divided into smaller units. Predictions for these units appear in a box, and the most likely suggestion is shown in a different colour at the top of the table. Users accept a prediction by clicking on it, and the system updates the selection according to the user input. The database consists of a large number of pairs of source texts and their translations. The most likely prediction is the result of previous matches in the database. The user's choices are recorded in the database to be used in future translations. These predictions help not only professional translators, but also novice translators who do not know the vocabulary and people without knowledge of the foreign language.


Post-editing machine translation process

Users can review their translation and make any changes needed to correct possible mistakes. The changes appear in the output display.


User activity

Caitra records the time users take to accept a prediction or to write their own translation. Actions are weighted differently for future predictions depending on what the user did and how long the translation took. Every action, pause or movement is relevant for improving future translations.


References

* Philipp Koehn, Hieu Hoang, Alexandra Birch, Chris Callison-Burch, Marcello Federico, Nicola Bertoldi, Brooke Cowan, Wade Shen, Christine Moran, Richard Zens, Chris Dyer, Ondrej Bojar, Alexandra Constantin, Evan Herbst (2007). "Moses: Open Source Toolkit for Statistical Machine Translation". Annual Meeting of the Association for Computational Linguistics (ACL), demonstration session, Prague, Czech Republic, June 2007.
* Olivia Craciunescu. "Machine Translation and Computer-Assisted Translation: a New Way of Translating?"


External links


Caitra Official website

Statistical Machine Translation Group at the University of Edinburgh

Moses Official website. University of Edinburgh


See also

* Computer-assisted translation
* Computer-assisted reviewing
* Machine translation