computing Computing is any goal-oriented activity requiring, benefiting from, or creating computer, computing machinery. It includes the study and experimentation of algorithmic processes, and the development of both computer hardware, hardware and softw ...

, binary translation is a form of binary recompilation where sequences of instructions are translated from a source

instruction set In computer science, an instruction set architecture (ISA) is an abstract model that generally defines how software controls the CPU in a computer or a family of computers. A device or program that executes instructions described by that ISA, s ...

(ISA) to the target instruction set with respect to the

operating system An operating system (OS) is system software that manages computer hardware and software resources, and provides common daemon (computing), services for computer programs. Time-sharing operating systems scheduler (computing), schedule tasks for ...

for which the binary was compiled for. In some cases such as instruction set simulation, the target instruction set may be the same as the source instruction set, providing testing and debugging features such as instruction trace, conditional breakpoints and hot spot detection. The two main types are static and dynamic binary translation. Translation can be done in hardware (for example, by circuits in a CPU) or in software (e.g. run-time engines, static recompiler, emulators, all are typically slow).

Motivation

Binary translation is motivated by a lack of a binary for a target platform, the lack of source code to compile for the target platform, or otherwise difficulty in compiling the source for the target platform. Statically recompiled binaries run potentially faster than their respective emulated binaries, as the emulation overhead is removed. This is similar to the difference in performance between interpreted and compiled programs in general.

Static binary translation

A translator using static binary translation aims to convert all of the code of an

executable file In computer science, executable code, an executable file, or an executable program, sometimes simply referred to as an executable or binary, causes a computer "to perform indicated tasks according to encoded instructions", as opposed to a da ...

into code that runs on the target architecture and platform without having to run the code first, as is done in dynamic binary translation. This is very difficult to do correctly, since not all the code can be discovered by the translator. For example, some parts of the executable may be reachable only through indirect branches, whose value is known only at run-time. One such static binary translator uses universal superoptimizer peephole technology (developed by Sorav Bansal and Alex Aiken from

Stanford University Leland Stanford Junior University, commonly referred to as Stanford University, is a Private university, private research university in Stanford, California, United States. It was founded in 1885 by railroad magnate Leland Stanford (the eighth ...

) to perform efficient translation between possibly many source and target pairs, with considerably low development costs and high performance of the target binary. In experiments of PowerPC-to-x86 translations, some binaries even outperformed native versions, but on average they ran at two-thirds of native speed.

Examples for static binary translations

In 1960s

Honeywell Honeywell International Inc. is an American publicly traded, multinational conglomerate corporation headquartered in Charlotte, North Carolina. It primarily operates in four areas of business: aerospace, building automation, industrial automa ...

provided a program called the Liberator for their Honeywell 200 series of computers; it could translate programs for the

IBM 1400 series The IBM 1400 series are second-generation (transistor) mid-range business decimal computers that IBM marketed in the early 1960s. The computers were offered to replace tabulating machines like the IBM 407. The 1400-series machines stored infor ...

of computers into programs for the Honeywell 200 series. In 1995 Norman Ramsey at Bell Communications Research and Mary F. Fernandez at Department of Computer Science,

Princeton University Princeton University is a private university, private Ivy League research university in Princeton, New Jersey, United States. Founded in 1746 in Elizabeth, New Jersey, Elizabeth as the College of New Jersey, Princeton is the List of Colonial ...

developed ''The New Jersey Machine-Code Toolkit'' that had the basic tools for static assembly translation. In 2004 Scott Elliott and Phillip R. Hutchinson at

Nintendo is a Japanese Multinational corporation, multinational video game company headquartered in Kyoto. It develops, publishes, and releases both video games and video game consoles. The history of Nintendo began when craftsman Fusajiro Yamauchi ...

developed a tool to generate "C" code from

Game Boy The is a handheld game console developed by Nintendo, launched in the Japanese home market on April 21, 1989, followed by North America later that year and other territories from 1990 onwards. Following the success of the Game & Watch single-ga ...

binary that could then be compiled for a new platform and linked against a hardware library for use in airline entertainment systems. In 2014, an

ARM architecture ARM (stylised in lowercase as arm, formerly an acronym for Advanced RISC Machines and originally Acorn RISC Machine) is a family of reduced instruction set computer, RISC instruction set architectures (ISAs) for central processing unit, com ...

version of the 1998

video game A video game or computer game is an electronic game that involves interaction with a user interface or input device (such as a joystick, game controller, controller, computer keyboard, keyboard, or motion sensing device) to generate visual fe ...

'' StarCraft'' was generated by static recompilation and additional

reverse engineering Reverse engineering (also known as backwards engineering or back engineering) is a process or method through which one attempts to understand through deductive reasoning how a previously made device, process, system, or piece of software accompl ...

of the original

x86 x86 (also known as 80x86 or the 8086 family) is a family of complex instruction set computer (CISC) instruction set architectures initially developed by Intel, based on the 8086 microprocessor and its 8-bit-external-bus variant, the 8088. Th ...

version. The

Pandora In Greek mythology, Pandora was the first human woman created by Hephaestus on the instructions of Zeus. As Hesiod related it, each god cooperated by giving her unique gifts. Her other name—inscribed against her figure on a white-ground '' ky ...

handheld community was capable of developing the required tools on their own and achieving such translations successfully several times. Another example is the NES-to-

statically recompiled version of the videogame '' Super Mario Bros.'' which was generated under usage of

LLVM LLVM, also called LLVM Core, is a target-independent optimizer and code generator. It can be used to develop a Compiler#Front end, frontend for any programming language and a Compiler#Back end, backend for any instruction set architecture. LLVM i ...

in 2013. For instance, a successful x86-to- x64 static recompilation was generated for the procedural terrain generator of the video game '' Cube World'' in 2014.

Dynamic binary translation

Dynamic binary translation (DBT) looks at a short sequence of code—typically on the order of a single

basic block In compiler construction, a basic block is a straight-line code sequence with no branches in except to the entry and no branches out except at the exit. This restricted form makes a basic block highly amenable to analysis. Compilers usually decom ...

—then translates it and caches the resulting sequence. Code is only translated as it is discovered and when possible, and branch instructions are made to point to already translated and saved code (

memoization In computing, memoization or memoisation is an optimization technique used primarily to speed up computer programs by storing the results of expensive function calls to pure functions and returning the cached result when the same inputs occur ag ...

). Dynamic binary translation differs from simple emulation (eliminating the emulator's main read-decode-execute loop—a major performance bottleneck), paying for this by large overhead during translation time. This overhead is hopefully amortized as translated code sequences are executed multiple times. More advanced dynamic translators employ

dynamic recompilation In computer science, dynamic recompilation is a feature of some emulators and virtual machines, where the system may recompile some part of a program during execution. By compiling during execution, the system can tailor the generated code to ...

where the translated code is instrumented to find out what portions are executed a large number of times, and these portions are optimized aggressively. This technique is reminiscent of a

JIT compiler In computing, just-in-time (JIT) compilation (also dynamic translation or run-time compilations) is compiler, compilation (of Source code, computer code) during execution of a program (at run time (program lifecycle phase), run time) rather than b ...

, and in fact such compilers (e.g.

Sun The Sun is the star at the centre of the Solar System. It is a massive, nearly perfect sphere of hot plasma, heated to incandescence by nuclear fusion reactions in its core, radiating the energy from its surface mainly as visible light a ...

's HotSpot technology) can be viewed as dynamic translators from a virtual instruction set (the

bytecode Bytecode (also called portable code or p-code) is a form of instruction set designed for efficient execution by a software interpreter. Unlike human-readable source code, bytecodes are compact numeric codes, constants, and references (normal ...

) to a real one.

Examples for dynamic binary translations in software

Apple Computer Apple Inc. is an American multinational corporation and technology company headquartered in Cupertino, California, in Silicon Valley. It is best known for its consumer electronics, software, and services. Founded in 1976 as Apple Computer Co ...

implemented a dynamic translating

emulator In computing, an emulator is Computer hardware, hardware or software that enables one computer system (called the ''host'') to behave like another computer system (called the ''guest''). An emulator typically enables the host system to run sof ...

for

M68K The Motorola 68000 series (also known as 680x0, m68000, m68k, or 68k) is a family of 32-bit complex instruction set computer (CISC) microprocessors. During the 1980s and early 1990s, they were popular in personal computers and workstations and w ...

code in their

PowerPC PowerPC (with the backronym Performance Optimization With Enhanced RISC – Performance Computing, sometimes abbreviated as PPC) is a reduced instruction set computer (RISC) instruction set architecture (ISA) created by the 1991 Apple Inc., App ...

line of Macintoshes, which achieved a very high level of reliability, performance and compatibility (see Mac 68K emulator). This allowed Apple to bring the machines to market with only a partially native

, and end users could adopt the new, faster architecture without risking their investment in software. Partly because the emulator was so successful, many parts of the operating system remained emulated. A full transition to a PowerPC native

(OS) was not made until the release of

Mac OS X macOS, previously OS X and originally Mac OS X, is a Unix, Unix-based operating system developed and marketed by Apple Inc., Apple since 2001. It is the current operating system for Apple's Mac (computer), Mac computers. With ...

(10.0) in 2001. (The Mac OS X "

Classic A classic is an outstanding example of a particular style; something of Masterpiece, lasting worth or with a timeless quality; of the first or Literary merit, highest quality, class, or rank – something that Exemplification, exemplifies its ...

" runtime environment continued to offer this emulation capability on PowerPC Macs until Mac OS X 10.5.) * Mac OS X 10.4.4 for Intel-based Macs introduced the Rosetta dynamic translation layer to ease Apple's transition from PPC-based hardware to x86. Developed for Apple by Transitive Corporation, the Rosetta software is an implementation of Transitive's QuickTransit solution. * QuickTransit during its product lifespan also provided SPARC→

, x86→

and MIPS→

Itanium 2 Itanium (; ) is a discontinued family of 64-bit Intel microprocessors that implement the Intel Itanium architecture (formerly called IA-64). The Itanium architecture originated at Hewlett-Packard (HP), and was later jointly developed by HP and I ...

translation support. * DEC achieved similar success with its translation tools to help users migrate from the CISC VAX architecture to the

Alpha Alpha (uppercase , lowercase ) is the first letter of the Greek alphabet. In the system of Greek numerals, it has a value of one. Alpha is derived from the Phoenician letter ''aleph'' , whose name comes from the West Semitic word for ' ...

RISC In electronics and computer science, a reduced instruction set computer (RISC) is a computer architecture designed to simplify the individual instructions given to the computer to accomplish tasks. Compared to the instructions given to a comp ...

architecture. * HP ARIES (Automatic Re-translation and Integrated Environment Simulation) is a software dynamic binary translation system that combines fast code interpretation with two phase dynamic translation to transparently and accurately execute HP 9000

HP-UX HP-UX (from "Hewlett Packard Unix") is a proprietary software, proprietary implementation of the Unix operating system developed by Hewlett Packard Enterprise; current versions support HPE Integrity Servers, based on Intel's Itanium architect ...

applications on

11i for

HPE Integrity Servers HPE Integrity Servers is a series of server (computing), server computers produced by Hewlett Packard Enterprise (formerly Hewlett-Packard) since 2003, based on the Itanium processor. The Integrity brand name was inherited by HP from Tandem Com ...

. The ARIES fast interpreter emulates a complete set of non-privileged

PA-RISC Precision Architecture reduced instruction set computer, RISC (PA-RISC) or Hewlett Packard Precision Architecture (HP/PA or simply HPPA), is a computer, general purpose computer instruction set architecture (ISA) developed by Hewlett-Packard f ...

instructions with no user intervention. During interpretation, it monitors the application's execution pattern and translates only the frequently executed code into native

Itanium Itanium (; ) is a discontinued family of 64-bit computing, 64-bit Intel microprocessors that implement the Intel Itanium architecture (formerly called IA-64). The Itanium architecture originated at Hewlett-Packard (HP), and was later jointly dev ...

code at runtime. ARIES implements two phase dynamic translation, a technique in which translated code in first phase collects runtime profile information which is used during second phase translation to further optimize the translated code. ARIES stores the dynamically translated code in memory buffer called code cache. Further references to translated basic blocks execute directly in the code cache and do not require additional interpretation or translation. The targets of translated code blocks are back-patched to ensure execution takes place in code cache most of the time. At the end of the emulation, ARIES discards all the translated code without modifying the original application. The ARIES emulation engine also implements Environment Emulation which emulates an HP 9000

application's system calls, signal delivery, exception management, threads management, emulation of HP GDB for debugging, and core file creation for the application. * DEC created the FX!32 binary translator for converting

applications to Alpha applications. *

Sun Microsystems Sun Microsystems, Inc., often known as Sun for short, was an American technology company that existed from 1982 to 2010 which developed and sold computers, computer components, software, and information technology services. Sun contributed sig ...

' Wabi software included dynamic translation from x86 to SPARC instructions. * In January 2000,

Transmeta Transmeta Corporation was an American fabless semiconductor company based in Santa Clara, California. It developed low power x86 compatible microprocessors based on a VLIW core and a software layer called Code Morphing Software. Code Morphing ...

Corporation announced a novel processor design named Crusoe. From the FAQ on their web site, *

Intel Intel Corporation is an American multinational corporation and technology company headquartered in Santa Clara, California, and Delaware General Corporation Law, incorporated in Delaware. Intel designs, manufactures, and sells computer compo ...

Corporation developed and implemented an IA-32 Execution Layer - a dynamic binary translator designed to support IA-32 applications on

-based systems, which was included in

Microsoft Microsoft Corporation is an American multinational corporation and technology company, technology conglomerate headquartered in Redmond, Washington. Founded in 1975, the company became influential in the History of personal computers#The ear ...

Windows Server Windows Server (formerly Windows NT Server) is a brand name for Server (computing), server-oriented releases of the Windows NT operating system (OS) that have been developed by Microsoft since 1993. The first release under this brand name i ...

for

architecture, as well as in several flavors of

Linux Linux ( ) is a family of open source Unix-like operating systems based on the Linux kernel, an kernel (operating system), operating system kernel first released on September 17, 1991, by Linus Torvalds. Linux is typically package manager, pac ...

, including

Red Hat Red Hat, Inc. (formerly Red Hat Software, Inc.) is an American software company that provides open source software products to enterprises and is a subsidiary of IBM. Founded in 1993, Red Hat has its corporate headquarters in Raleigh, North ...

and Suse. It allowed IA-32 applications to run faster than they would using the native IA-32 mode on Itanium processors. *

Dolphin A dolphin is an aquatic mammal in the cetacean clade Odontoceti (toothed whale). Dolphins belong to the families Delphinidae (the oceanic dolphins), Platanistidae (the Indian river dolphins), Iniidae (the New World river dolphins), Pontopori ...

(an emulator for the

GameCube The is a PowerPC-based home video game console developed and marketed by Nintendo. It was released in Japan on September 14, 2001, in North America on November 18, 2001, in Europe on May 3, 2002, and in Australia on May 17, 2002. It is the suc ...

/ Wii) performs JIT recompilation of PowerPC code to x86 and AArch64. * Microsoft Virtual PC supports binary translation for 32-bit guest operating systems. *

VMware Workstation VMware Workstation Pro (known as VMware Workstation until release of VMware Workstation 12 in 2015) is a hosted (Type 2) hypervisor that runs on x64 versions of Windows and Linux operating systems. It enables users to set up virtual machines (VM ...

12 or earlier are known to support binary translation for 32-bit guest operating systems.

Examples for dynamic binary translations in hardware

Nvidia Nvidia Corporation ( ) is an American multinational corporation and technology company headquartered in Santa Clara, California, and incorporated in Delaware. Founded in 1993 by Jensen Huang (president and CEO), Chris Malachowsky, and Curti ...

Tegra Tegra is a system on a chip (SoC) series developed by Nvidia for mobile devices such as smartphones, personal digital assistants, and mobile Internet devices. The Tegra integrates an ARM architecture central processing unit (CPU), graphics pr ...

K1 Denver translates ARM instructions over a slow hardware decoder to its native microcode instructions and uses a software binary translator for hot code.

Motivation

Static binary translation

Examples for static binary translations

Dynamic binary translation

Examples for dynamic binary translations in software

Examples for dynamic binary translations in hardware

See also

References

Further reading