Automatic Language Translator

IBM's Automatic Language Translator was a machine translation system that converted Russian documents into English. It used an optical disc that stored 170,000 word-for-word and statement-for-statement translations and a custom computer to look them up at high speed. Built for the US Air Force's Foreign Technology Division, the AN/GSQ-16 (or XW-2), as it was known to the Air Force, was primarily used to convert Soviet technical documents for distribution to western scientists. The translator was installed in 1959, dramatically upgraded in 1964, and was eventually replaced by a mainframe running SYSTRAN in 1970.


History


Photoscopic store

The translator began in a June 1953 contract from the US Navy to the International Telemeter Corporation (ITC) of Los Angeles. This was not for a translation system, but a pure research and development contract for a high-performance photographic online storage medium consisting of small black rectangles embedded in a plastic disk. When the initial contract ran out, what was then the Rome Air Development Center (RADC) took up further funding from 1954 onwards (Hutchins, pg. 171).

The system was developed by Gilbert King, chief of engineering at ITC, along with a team that included Louis Ridenour. The store evolved into a 16-inch plastic disk with data recorded as a series of microscopic black rectangles or clear spots. Only the outermost 4 inches of the disk were used for storage, which increased the linear speed of the portion being accessed. When the disk spun at 2,400 RPM it had an access speed of about 1 Mbit/sec. In total, the system stored 30 Mbits, making it the highest-density online system of its era.


Mark I

In 1954 IBM gave an influential demonstration of machine translation, known today as the "Georgetown-IBM experiment". Run on an IBM 701 mainframe, the translation system knew only 250 words of Russian limited to the field of organic chemistry, and only 6 grammar rules for combining them. Nevertheless, the results were extremely promising, and widely reported in the press.

At the time, most researchers in the nascent machine translation field felt that the major challenge to providing reasonable translations was building a large library, as storage devices of the era were both too small and too slow to be useful in this role (Hutchins, pg. 172). King felt that the photoscopic store was a natural solution to the problem, and pitched the idea of an automated translation system based on the photostore to the Air Force. RADC proved interested, and provided a research grant in May 1956. At the time, the Air Force also provided a grant to researchers at the University of Washington who were working on the problem of producing an optimal translation dictionary for the project.

King advocated a simple word-for-word approach to translations. He thought that the natural redundancies in language would allow even a poor translation to be understood, and that local context alone was enough to provide reasonable guesses when faced with ambiguous terms. He stated that "the success of the human in achieving a probability of .50 in anticipating the words in a sentence is largely due to his experience and the real meanings of the words already discovered" (King, 1956). In other words, simply translating the words alone would allow a human to effectively read a document, because they would be able to reason out the proper meaning from the context provided by earlier words.

In 1958 King moved to IBM's Thomas J. Watson Research Center, and continued development of the photostore-based translator. Over time, King changed the approach from a pure word-for-word translator to one that stored "stems and endings", which broke words into parts that could be combined back together to form complete words again.

The first machine, "Mark I", was demonstrated in July 1959 and consisted of a 65,000-word dictionary and a custom tube-based computer to do the lookups. Texts were hand-copied onto punched cards using custom Cyrillic terminals, and then input into the machine for translation. The results were less than impressive, but were enough to suggest that a larger and faster machine would be a reasonable development. In the meantime, the Mark I was applied to translations of the Soviet newspaper Pravda. The results continued to be questionable, but King declared it a success, stating in Scientific American that the system was "...found, in an operational evaluation, to be quite useful by the Government."
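The "stems and endings" idea mentioned above can be illustrated with a minimal sketch. It is not the machine's actual lookup procedure, which the text does not describe; the two tables, their entries and the longest-stem-first rule are all assumptions made for illustration, and the grammatical notes ignore the case ambiguity of real Russian endings.

```python
# Hypothetical toy tables; the real dictionary held tens of thousands of entries.
STEMS = {"книг": "book", "журнал": "journal"}
ENDINGS = {"": "singular", "а": "singular", "и": "plural", "ы": "plural"}


def lookup(word):
    """Split a word into a known stem plus a known ending.

    Returns (English gloss, grammatical note), or None if no split matches.
    Longer stems are tried first so they win over shorter ones.
    """
    for cut in range(len(word), 0, -1):
        stem, ending = word[:cut], word[cut:]
        if stem in STEMS and ending in ENDINGS:
            return STEMS[stem], ENDINGS[ending]
    return None


for w in ["книга", "книги", "журналы", "спутник"]:
    print(w, "->", lookup(w))
```

Storing stems and endings separately means one stem entry can cover every inflected form of a word, which matters when dictionary space is the limiting resource.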


Mark II

On 4 October 1957 the USSR launched Sputnik 1, the first artificial satellite. This caused a wave of concern in the US, whose own Project Vanguard was caught flat-footed and then failed repeatedly in spectacular fashion. This embarrassing turn of events led to a huge investment in US science and technology, including the formation of DARPA, NASA and a variety of intelligence efforts that would attempt to avoid being surprised in this fashion again.

After a short period, the intelligence efforts were centralized at Wright-Patterson Air Force Base as the Foreign Technology Division (FTD, now known as the National Air and Space Intelligence Center), run by the Air Force with input from the DIA and other organizations. FTD was tasked with the translation of Soviet and other Warsaw Bloc technical and scientific journals so researchers in the "west" could keep up to date on developments behind the Iron Curtain. Most of these documents were publicly available, but FTD also made a number of one-off translations of other materials upon request.

Assuming there was a shortage of qualified translators, the FTD became extremely interested in King's efforts at IBM. Funding for an upgraded machine was soon forthcoming, and work began on a "Mark II" system based around a transistorized computer with a faster and higher-capacity 10-inch glass-based optical disc spinning at 2,400 RPM. Another addition was an optical character reader provided by a third party, which the developers hoped would eliminate the time-consuming process of copying the Russian text into machine-readable cards.

In 1960 the Washington team also joined IBM, bringing their dictionary efforts with them. The dictionary continued to expand as additional storage was made available, reaching 170,000 words and terms by the time it was installed at the FTD. A major software update was also incorporated in the Mark II, which King referred to as "dictionary stuffing". Stuffing was an attempt to deal with the problems of ambiguous words by "stuffing" prefixes onto them from earlier words in the text. These modified words would match with similarly stuffed words in the dictionary, reducing the number of false positives.

In 1962 King left IBM for Itek, a military contractor in the process of rapidly acquiring new technologies. Development at IBM continued, and the system went fully operational at FTD in February 1964.

The system was demonstrated at the 1964 New York World's Fair. The version at the Fair included a 150,000-word dictionary, with about one-third of the words in phrases. About 3,500 of these were stored in core memory to improve performance, and an average speed of 20 words per minute was claimed. The results for the carefully selected input text were quite impressive. After its return to the FTD, it was used continually until 1970, when it was replaced by a machine running SYSTRAN.
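The "dictionary stuffing" technique described earlier in this section can be sketched as a lookup that first tries a key built from the previous word plus the current word, and falls back to the plain entry when no stuffed entry exists. The text does not give the actual encoding King used, so the key format (previous word, a "|" separator, current word), the table contents and the function name below are all hypothetical.

```python
# Hypothetical toy dictionaries. "мир" is ambiguous: "peace" or "world".
PLAIN = {"за": "for", "весь": "whole", "мир": "peace"}
# Stuffed entries: the current word prefixed with the preceding word.
STUFFED = {"весь|мир": "world"}


def translate(words):
    """Word-for-word translation that prefers context-stuffed entries."""
    out = []
    prev = None
    for word in words:
        stuffed_key = f"{prev}|{word}" if prev is not None else None
        if stuffed_key in STUFFED:
            out.append(STUFFED[stuffed_key])   # context resolves the ambiguity
        else:
            out.append(PLAIN.get(word, word))  # unknown words pass through unchanged
        prev = word
    return out


print(translate(["за", "мир"]))    # plain entry is used for "мир"
print(translate(["весь", "мир"]))  # stuffed entry overrides the plain one
```

The design trade-off is dictionary size: every disambiguated context needs its own stuffed entry, which fits the text's observation that the dictionary grew as more storage became available.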


ALPAC Report

In 1964 the United States Department of Defense commissioned the United States National Academy of Sciences (NAS) to prepare a report on the state of machine translation. The NAS formed the "Automatic Language Processing Advisory Committee", or ALPAC, and published its findings in 1966. The report, Language and Machines: Computers in Translation and Linguistics, was highly critical of the existing efforts. It showed that the systems were no faster than human translators, and that the supposed lack of translators was in fact a surplus; as a result of supply and demand, human translation was relatively inexpensive, about $6 per 1,000 words. Worse, the output of the FTD system was harder to use as well: tests using physics papers as input showed that a reader working from the machine translation was "10 percent less accurate, 21 percent slower, and had a comprehension level 29 percent lower than when he used human translation."

The ALPAC report was as influential as the Georgetown experiment had been a decade earlier; in the immediate aftermath of its publication, the US government suspended almost all funding for machine translation research (John Hutchins, "ALPAC: the (in)famous report"). Ongoing work at IBM and Itek had ended by 1966, leaving the field to the Europeans, who continued development of systems like SYSTRAN and Logos.


References


Notes

These numbers for the early disk systems appear to be inaccurate – another document from the same author suggests that these figures are actually for the later version used on the Mark II translator.


Bibliography

* G.W. King, G.W. Brown and L.N. Ridenour, "Photographic Techniques for Information Storage", Proceedings of the IRE, Volume 41 Issue 10 (October 1953), pp. 1421–1428
* G.W. King, "Stochastic Methods of Mechanical Translation", Mechanical Translation, Volume 3 Issue 2 (1956), pp. 38–39
* J.L. Craft, E.H. Goldman, W.B. Strohm, "A Table Look-up Machine for Processing of Natural Languages", IBM Journal, July 1961, pp. 192–203
* Automatic Language Processing Advisory Committee, "Language and Machines: Computers in Translation and Linguistics", National Research Council, 1966 (widely known as the "ALPAC Report")
* John Hutchins (ed), "Gilbert W. King and the IBM-USAF Translator", Early Years in Machine Translation, John Benjamins, 2000 (RADC-TDR-62-105)
* Charles Bourne and Trudi Bellardo Hahn, "A History of Online Information Services, 1963–1976", MIT Press, 2003