bytecode

Bytecode (also called portable code or p-code) is a form of instruction set designed for efficient execution by a software interpreter. Unlike human-readable source code, bytecodes are compact numeric codes, constants, and references (normally numeric addresses) that encode the result of compiler parsing and semantic analysis of things like the type, scope, and nesting depth of program objects.

The name bytecode stems from instruction sets that have one-byte opcodes followed by optional parameters. Intermediate representations such as bytecode may be output by programming language implementations to ease interpretation, or may be used to reduce hardware and operating system dependence by allowing the same code to run cross-platform, on different devices. Bytecode may be either directly executed on a virtual machine (a p-code machine, i.e., an interpreter) or further compiled into machine code for better performance.

Since bytecode instructions are processed by software, they may be arbitrarily complex, but they are nonetheless often akin to traditional hardware instructions: virtual stack machines are the most common, but virtual register machines have also been built. Different parts may often be stored in separate files, similar to object modules, but dynamically loaded during execution.
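To make the idea of one-byte opcodes followed by optional parameters concrete, here is a minimal sketch of a virtual stack machine in Python. The opcode values and the names PUSH, ADD, MUL, HALT and run are hypothetical, chosen only for illustration; they do not correspond to any real virtual machine.

```python
# Hypothetical one-byte opcodes, invented for this illustration only.
PUSH, ADD, MUL, HALT = 0x01, 0x02, 0x03, 0xFF


def run(code: bytes) -> int:
    """Interpret a tiny bytecode program on an operand stack."""
    stack = []
    pc = 0  # program counter: offset of the next opcode
    while True:
        op = code[pc]
        pc += 1
        if op == PUSH:
            stack.append(code[pc])  # a one-byte operand follows the opcode
            pc += 1
        elif op == ADD:
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == MUL:
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
        elif op == HALT:
            return stack.pop()
        else:
            raise ValueError(f"unknown opcode {op:#04x} at offset {pc - 1}")


# Encodes (2 + 3) * 4 in nine bytes.
program = bytes([PUSH, 2, PUSH, 3, ADD, PUSH, 4, MUL, HALT])
print(run(program))  # 20
```

A register-based design would instead name source and destination registers in each instruction's operands, trading larger instructions for fewer of them.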


Execution

A bytecode program may be executed by parsing and directly executing the instructions, one at a time. This kind of bytecode interpreter is very portable. Some systems, called dynamic translators or just-in-time (JIT) compilers, translate bytecode into machine code as necessary at runtime. This makes the virtual machine hardware-specific but does not lose the portability of the bytecode. For example, Java and Smalltalk code is typically stored in bytecode format, which is then typically JIT-compiled to machine code before execution. This introduces a delay before a program runs, while the bytecode is compiled to native machine code, but improves execution speed considerably compared to interpreting source code directly, normally by around an order of magnitude (10x).

Because of this performance advantage, many language implementations today execute a program in two phases: first compiling the source code into bytecode, then passing the bytecode to the virtual machine. There are bytecode-based virtual machines of this sort for Java, Raku, Python, PHP, Tcl, mawk and Forth (however, Forth is seldom compiled via bytecodes in this way, and its virtual machine is more generic instead). The implementations of Perl and Ruby 1.8 instead work by walking an abstract syntax tree representation derived from the source code.

More recently, the authors of V8 and Dart challenged the notion that intermediate bytecode is needed for a fast and efficient VM implementation. As originally designed, both implementations compiled source code directly to machine code with a JIT compiler and no bytecode intermediary; V8 has since added a bytecode interpreter, Ignition.
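The two-phase model described above can be observed in CPython using only built-in functions. The following is a small sketch rather than a description of how any particular program is run; the variable names are arbitrary.

```python
import dis

source = "x * 2 + 1"

# Phase 1: compile the source text into a code object containing bytecode.
code_obj = compile(source, "<example>", "eval")
raw = code_obj.co_code  # the bytecode itself, as a bytes object

# Phase 2: hand the compiled code to the virtual machine for execution.
value = eval(code_obj, {"x": 20})
print(value)  # 41

# Opcode names differ between CPython versions, so they are only listed here.
opnames = [ins.opname for ins in dis.get_instructions(code_obj)]
print(len(raw), "bytes of bytecode:", opnames)
```

The same code object can be evaluated repeatedly with different values of x without parsing the source again, which is where the two-phase approach saves time.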


Examples

* ActionScript executes in the ActionScript Virtual Machine (AVM), which is part of Flash Player and AIR. ActionScript code is typically transformed into bytecode format by a compiler. Examples of compilers include the one built into Adobe Flash Professional and the one built into Adobe Flash Builder and available in the Adobe Flex SDK.
* Adobe Flash objects
* BANCStar, originally bytecode for an interface-building tool but also used as a language
* Berkeley Packet Filter
* eBPF
* Berkeley Pascal
* Byte Code Engineering Library
* C to Java virtual machine compilers
* CLISP implementation of Common Lisp compiled only to bytecode for many years; however, it now also supports compiling to native code with the help of GNU lightning
* CMUCL and Scieneer Common Lisp implementations of Common Lisp can compile either to native code or to bytecode, which is far more compact
* Common Intermediate Language, executed by the Common Language Runtime and used by .NET languages such as C#
* Dalvik bytecode, designed for the Android platform, is executed by the Dalvik virtual machine
* Dis bytecode, designed for the Inferno operating system, is executed by the Dis virtual machine
* EiffelStudio for the Eiffel programming language
* EM, the Amsterdam Compiler Kit virtual machine, used as an intermediate compiling language and as a modern bytecode language
* Emacs is a text editor with most of its functions implemented in Emacs Lisp, its built-in dialect of Lisp. These features are compiled into bytecode. This architecture allows users to customize the editor with a high-level language, which after compiling into bytecode yields reasonable performance.
* Embeddable Common Lisp implementation of Common Lisp can compile to bytecode or C code
* Common Lisp provides a disassemble function which prints to the standard output the underlying code of a specified function. The result is implementation-dependent and may or may not resolve to bytecode. Its inspection can be used for debugging and optimization purposes. Steel Bank Common Lisp, for instance, compiles to native code and produces:

    (disassemble '(lambda (x) (print x)))
    ; disassembly for (LAMBDA (X))
    ; 2436F6DF:       850500000F22     TEST EAX, [#x220F0000]     ; no-arg-parsing entry point
    ;       E5:       8BD6             MOV EDX, ESI
    ;       E7:       8B05A8F63624     MOV EAX, [#x2436F6A8]      ; #<FDEFINITION object for PRINT>
    ;       ED:       B904000000       MOV ECX, 4
    ;       F2:       FF7504           PUSH DWORD PTR [EBP+4]
    ;       F5:       FF6005           JMP DWORD PTR [EAX+5]
    ;       F8:       CC0A             BREAK 10                   ; error trap
    ;       FA:       02               BYTE #X02
    ;       FB:       18               BYTE #X18                  ; INVALID-ARG-COUNT-ERROR
    ;       FC:       4F               BYTE #X4F                  ; ECX

* Ericsson implementation of Erlang uses BEAM bytecodes
* Ethereum's Virtual Machine (EVM) is the runtime environment, using its own bytecode, for transaction execution in Ethereum (smart contracts).
* Icon and Unicon programming languages
* Infocom used the Z-machine to make its software applications more portable
* Java bytecode, which is executed by the Java virtual machine
** ASM
** BCEL
** Javassist
* Keiko bytecode used by the Oberon-2 programming language to make it and the Oberon operating system more portable.
* KEYB, the MS-DOS/PC DOS keyboard driver with its resource file KEYBOARD.SYS containing layout information and short p-code sequences executed by an interpreter inside the resident driver.
* LLVM IR
* LSL, a scripting language used in virtual worlds, compiles into bytecode running on a virtual machine. Second Life has the original Mono version; Inworldz developed the Phlox version.
* Lua language uses a register-based bytecode virtual machine
* m-code of the MATLAB language
* Malbolge is an esoteric machine language for a ternary virtual machine.
* Microsoft P-code used in Visual C++ and Visual Basic
* Multiplan
* O-code of the BCPL programming language
* OCaml language optionally compiles to a compact bytecode form
* p-code of UCSD Pascal implementation of the Pascal language
* Parrot virtual machine
* Pick BASIC, also referred to as Data BASIC or MultiValue BASIC
* The R environment for statistical computing offers a bytecode compiler through the compiler package, standard since R version 2.13.0. It is possible to compile this version of R so that the base and recommended packages exploit this.
* Pyramid 2000 adventure game
* Python scripts are compiled on execution to Python's bytecode language, and the compiled files (.pyc) are cached in a __pycache__ directory inside the script's folder. Compiled code can be analysed and investigated using a built-in tool for debugging the low-level bytecode. The tool can be started from the shell, for example (the exact output depends on the Python version):

    >>> import dis  # "dis" - Disassembler of Python byte code into mnemonics.
    >>> dis.dis('print("Hello, World!")')
      1           0 LOAD_NAME                0 (print)
                  2 LOAD_CONST               0 ('Hello, World!')
                  4 CALL_FUNCTION            1
                  6 RETURN_VALUE

* Scheme 48 implementation of Scheme using a bytecode interpreter
* Bytecodes of many implementations of the Smalltalk language
* The Spin interpreter built into the Parallax Propeller microcontroller
* The SQLite database engine translates SQL statements into a bespoke bytecode format.
* Apple SWEET16
* Tcl
* TIMI is used by compilers on the IBM i platform.
* Tiny BASIC
* Visual FoxPro compiles to bytecode
* WebAssembly
* YARV and Rubinius for Ruby
* ZCODE
* Zend Engine opcodes for PHP
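As an illustration of the SQLite entry in the list above, the bytecode that SQLite generates for a statement can be listed with the EXPLAIN prefix. The sketch below uses Python's standard sqlite3 module; the table t and its columns are hypothetical, and the exact opcodes emitted vary between SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table, created only so the query has something to refer to.
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, name TEXT)")

# EXPLAIN returns one row per bytecode instruction instead of running the query.
rows = conn.execute("EXPLAIN SELECT name FROM t WHERE id = 1").fetchall()
opcodes = [row[1] for row in rows]  # column 1 holds the opcode name

for row in rows:
    # Columns 0 to 4 are the instruction address, opcode, and operands p1 to p3.
    print(row[0], row[1], row[2], row[3], row[4])

conn.close()
```

Each printed row is one instruction of the program that SQLite's virtual machine would execute to answer the query.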


See also

* Computing platform
* Intermediate representation
* Runtime system

