HOME

TheInfoList



OR:

A disassembler is a
computer program A computer program is a sequence or set of instructions in a programming language for a computer to execute. Computer programs are one component of software, which also includes documentation and other intangible components. A computer program ...
that translates
machine language In computer programming, machine code is any low-level programming language, consisting of machine language instructions, which are used to control a computer's central processing unit (CPU). Each instruction causes the CPU to perform a very ...
into assembly language—the inverse operation to that of an
assembler Assembler may refer to: Arts and media * Nobukazu Takemura, avant-garde electronic musician, stage name Assembler * Assemblers, a fictional race in the ''Star Wars'' universe * Assemblers, an alternative name of the superhero group Champions of ...
. A disassembler differs from a
decompiler A decompiler is a computer program that translates an executable file to a high-level source file which can be recompiled successfully. It does therefore the opposite of a typical compiler, which translates a high-level language to a low-level l ...
, which targets a
high-level language In computer science, a high-level programming language is a programming language with strong abstraction from the details of the computer. In contrast to low-level programming languages, it may use natural language ''elements'', be easier to use, ...
rather than an assembly language. Disassembly, the output of a disassembler, is often formatted for human-readability rather than suitability for input to an assembler, making it principally a
reverse-engineering Reverse engineering (also known as backwards engineering or back engineering) is a process or method through which one attempts to understand through deductive reasoning how a previously made device, process, system, or piece of software accompli ...
tool. Assembly language
source code In computing, source code, or simply code, is any collection of code, with or without comments, written using a human-readable programming language, usually as plain text. The source code of a program is specially designed to facilitate the w ...
generally permits the use of constants and programmer comments. These are usually removed from the assembled
machine code In computer programming, machine code is any low-level programming language, consisting of machine language instructions, which are used to control a computer's central processing unit (CPU). Each instruction causes the CPU to perform a ve ...
by the assembler. If so, a disassembler operating on the machine code would produce disassembly lacking these constants and comments; the disassembled output becomes more difficult for a human to interpret than the original annotated source code. Some disassemblers provide a built-in code commenting feature where the generated output gets enriched with comments regarding called API functions or parameters of called functions. Some disassemblers make use of the symbolic debugging information present in object files such as
ELF An elf () is a type of humanoid supernatural being in Germanic mythology and folklore. Elves appear especially in North Germanic mythology. They are subsequently mentioned in Snorri Sturluson's Icelandic Prose Edda. He distinguishes "ligh ...
. For example, IDA allows the human user to make up mnemonic symbols for values or regions of code in an interactive session: human insight applied to the disassembly process often parallels human creativity in the code writing process. On CISC platforms with variable-width instructions, more than one disassembly may be valid. Disassemblers do not handle code that varies during execution.


Problems of disassembly

Writing a disassembler which produces code which, when assembled, produces exactly the original binary is possible; however, there are often differences. This poses demands on the expressivity of the assembler. For example, an x86 assembler takes an arbitrary choice between two binary codes for something as simple as MOV AX,BX. If the original code uses the other choice, the original code simply cannot be reproduced at any given point in time. However, even when a fully correct disassembly is produced, problems remain if the program requires modification. For example, the same machine language jump instruction can be generated by assembly code to jump to a specified location (for example, to execute specific code), or to jump a specified number of bytes (for example, to skip over an unwanted branch). A disassembler cannot know what is intended, and may use either syntax to generate a disassembly which reproduces the original binary. However, if a programmer wants to add instructions between the jump instruction and its destination, it is necessary to understand the program's operation to determine whether the jump should be absolute or relative, i.e., whether its destination should remain at a fixed location, or be moved so as to skip both the original and added instructions.


Examples of disassemblers

A disassembler may be stand-alone or interactive. A stand-alone disassembler, when executed, generates an assembly language file which can be examined; an interactive one shows the effect of any change the user makes immediately. For example, the disassembler may initially not know that a section of the program is actually code, and treat it as data; if the user specifies that it is code, the resulting disassembled code is shown immediately, allowing the user to examine it and take further action during the same run. Any interactive
debugger A debugger or debugging tool is a computer program used to test and debug other programs (the "target" program). The main use of a debugger is to run the target program under controlled conditions that permit the programmer to track its executi ...
will include some way of viewing the disassembly of the program being debugged. Often, the same disassembly tool will be packaged as a standalone disassembler distributed along with the debugger. For example,
objdump objdump is a command-line program for displaying various information about object files on Unix-like operating systems. For instance, it can be used as a disassembler to view an executable in assembly form. It is part of the GNU Binutils for fine ...
, part of
GNU Binutils The GNU Binary Utilities, or , are a set of programming tools for creating and managing binary programs, object files, libraries, profile data, and assembly source code. Tools They were originally written by programmers at Cygnus Solutions. ...
, is related to the interactive debugger
gdb The GNU Debugger (GDB) is a portable debugger that runs on many Unix-like systems and works for many programming languages, including Ada, C, C++, Objective-C, Free Pascal, Fortran, Go, and partially others. History GDB was first written by ...
. *
Binary Ninja Binary Ninja is a reverse-engineering platform developed by Vector 35 Inc. It can disassemble a binary and display the disassembly in linear or graph views. It performs automated in-depth analysis of the code, generating information that helps to ...
* DEBUG *
Interactive Disassembler The Interactive Disassembler (IDA) is a disassembler for computer software which generates assembly language source code from machine-executable code. It supports a variety of executable formats for different processors and operating systems. I ...
(IDA) *
Ghidra Ghidra (pronounced gee-druh; ) is a free and open source reverse engineering tool developed by the National Security Agency (NSA) of the United States. The binaries were released at RSA Conference in March 2019; the sources were published one mo ...
* Hiew * Hopper Disassembler * PE Explorer Disassembler * Netwide Disassembler (Ndisasm), companion to the Netwide Assembler (NASM). * OLIVER (
CICS IBM CICS (Customer Information Control System) is a family of mixed-language application servers that provide online transaction management and connectivity for applications on IBM mainframe systems under z/OS and z/VSE. CICS family products ...
interactive test/debug) includes disassemblers for Assembler, COBOL, and
PL/1 PL/I (Programming Language One, pronounced and sometimes written PL/1) is a procedural, imperative computer programming language developed and published by IBM. It is designed for scientific, engineering, business and system programming. I ...
*
OllyDbg OllyDbg (named after its author, Oleh Yuschuk) was an x86 debugger that emphasizes binary code analysis, which is useful when source code is not available. It traces registers, recognizes procedures, API calls, switches, tables, constants a ...
is a 32-bit assembler level analysing debugger *
Radare2 Radare2 (also known as r2) is a complete framework for reverse-engineering and analyzing binaries; composed of a set of small utilities that can be used together or independently from the command line. Built around a disassembler for computer soft ...
* SIMON (batch interactive test/debug) includes disassemblers for Assembler, COBOL, and PL/1 * Sourcer, a commenting 16-bit/32-bit disassembler for
DOS DOS is shorthand for the MS-DOS and IBM PC DOS family of operating systems. DOS may also refer to: Computing * Data over signalling (DoS), multiplexing data onto a signalling channel * Denial-of-service attack (DoS), an attack on a communicat ...
,
OS/2 OS/2 (Operating System/2) is a series of computer operating systems, initially created by Microsoft and IBM under the leadership of IBM software designer Ed Iacobucci. As a result of a feud between the two companies over how to position OS/2 r ...
and
Windows Windows is a group of several proprietary graphical operating system families developed and marketed by Microsoft. Each family caters to a certain sector of the computing industry. For example, Windows NT for consumers, Windows Server for ser ...
by V Communications in the 1990s


Disassemblers and emulators

A dynamic disassembler can be incorporated into the output of an
emulator In computing, an emulator is hardware or software that enables one computer system (called the ''host'') to behave like another computer system (called the ''guest''). An emulator typically enables the host system to run software or use pe ...
or
hypervisor A hypervisor (also known as a virtual machine monitor, VMM, or virtualizer) is a type of computer software, firmware or hardware that creates and runs virtual machines. A computer on which a hypervisor runs one or more virtual machines is called ...
to 'trace out', line-by-line, the real time execution of any executed machine instructions. In this case, as well as lines containing the disassembled machine code, the register(s) and/or data change(s) (or any other changes of "
state State may refer to: Arts, entertainment, and media Literature * ''State Magazine'', a monthly magazine published by the U.S. Department of State * ''The State'' (newspaper), a daily newspaper in Columbia, South Carolina, United States * ''Our S ...
", such as condition codes) that each individual instruction causes can be shown alongside or beneath the disassembled instruction. This provides extremely powerful debugging information for ultimate problem resolution, although the size of the resultant output can sometimes be quite large, especially if active for an entire program's execution. OLIVER provided these features from the early 1970s as part of its
CICS IBM CICS (Customer Information Control System) is a family of mixed-language application servers that provide online transaction management and connectivity for applications on IBM mainframe systems under z/OS and z/VSE. CICS family products ...
debugging product offering and is now to be found incorporated into the XPEDITER product from
Compuware Compuware Corporation was an American software company based in Detroit, Michigan. The company offers products aimed at the information technology (IT) departments of large businesses, and its services also include testing, development, automation ...
.


Length disassembler

A length disassembler, also known as length disassembler engine (LDE), is a tool that, given a sequence of bytes (instructions), outputs the number of bytes taken by the parsed instruction. Notable open source projects for the x86 architecture include ldisasm, Tiny x86 Length Disassembler and Extended Length Disassembler Engine for x86-64.


See also

*
Control-flow graph In computer science, a control-flow graph (CFG) is a representation, using graph notation, of all paths that might be traversed through a program during its execution. The control-flow graph was discovered by Frances E. Allen, who noted that ...
* Data-flow analysis *
Decompiler A decompiler is a computer program that translates an executable file to a high-level source file which can be recompiled successfully. It does therefore the opposite of a typical compiler, which translates a high-level language to a low-level l ...


References


Further reading

* *


External links

* List of x86 disassemblers in Wikibooks
Transformation Wiki on disassembly

Boomerang
A general, open source, retargetable decompiler of machine code programs *
Online Disassembler
a free online disassembler of arms, mips, ppc, and x86 code {{Authority control Debugging Reverse engineering