
Reverse engineering (also known as backwards engineering or back engineering) is a process or method through which one attempts to understand, through deductive reasoning, how a previously made device, process, system, or piece of software accomplishes a task with very little (if any) insight into exactly how it does so. It is essentially the process of opening up or dissecting a system to see how it works, in order to duplicate or enhance it. Depending on the system under consideration and the technologies employed, the knowledge gained during reverse engineering can help with repurposing obsolete objects, doing security analysis, or learning how something works. Although the process is specific to the object on which it is being performed, all reverse engineering processes consist of three basic steps: information extraction, modeling, and review. Information extraction is the gathering of all relevant information for performing the operation. Modeling is the combining of the gathered information into an abstract model, which can be used as a guide for designing the new object or system. Review is the testing of the model to ensure the validity of the chosen abstraction. Reverse engineering is applicable in the fields of computer engineering, mechanical engineering, design, electronic engineering, software engineering, chemical engineering, and systems biology.


Overview

There are many reasons for performing reverse engineering in various fields. Reverse engineering has its origins in the analysis of hardware for commercial or military advantage. However, the reverse engineering process, as such, is not concerned with creating a copy or changing the artifact in some way. It is only an analysis to deduce design features from products with little or no additional knowledge about the procedures involved in their original production. In some cases, the goal of the reverse engineering process can simply be a redocumentation of legacy systems (Michael L. Nelson, "A Survey of Reverse Engineering and Program Comprehension", ODU CS 551 Software Engineering Survey, April 19, 1996). Even when the reverse-engineered product is that of a competitor, the goal may not be to copy it but to perform competitor analysis. Reverse engineering may also be used to create interoperable products, and despite some narrowly tailored United States and European Union legislation, the legality of using specific reverse engineering techniques for that purpose has been hotly contested in courts worldwide for more than two decades.

Software reverse engineering can improve the understanding of the underlying source code for the maintenance and improvement of the software. Relevant information can be extracted to inform software development decisions, and graphical representations of the code can provide alternate views of the source code, which can help to detect and fix a software bug or vulnerability. As software develops, its design information and improvements are often lost over time, but that lost information can usually be recovered with reverse engineering. The process can also cut down the time required to understand the source code, thus reducing the overall cost of software development. Reverse engineering can also help to detect and eliminate malicious code written into the software, with the help of better code detectors. Reversing source code can be used to find alternate uses of it, such as detecting unauthorized replication of the source code where it was not intended to be used, or revealing how a competitor's product was built. That process is commonly used for "cracking" software and media to remove their copy protection, or to create a possibly improved copy or even a knockoff, which is usually the goal of a competitor or a hacker.
Malware developers often use reverse engineering techniques to find vulnerabilities in an operating system in order to build a computer virus that can exploit them. Reverse engineering is also used in cryptanalysis to find vulnerabilities in substitution ciphers, symmetric-key algorithms, or public-key cryptography.

There are other uses of reverse engineering:
* Interfacing. Reverse engineering can be used when a system is required to interface with another system and how the two systems would negotiate is to be established. Such requirements typically exist for interoperability.
* Military or commercial espionage. Learning about an enemy's or competitor's latest research by stealing or capturing a prototype and dismantling it may result in the development of a similar product or a better countermeasure against it.
* Obsolescence. Integrated circuits are often designed on proprietary systems and built on production lines that become obsolete in only a few years. When systems using those parts can no longer be maintained because the parts are no longer made, the only way to incorporate the functionality into new technology is to reverse-engineer the existing chip and then redesign it with newer tools, using the understanding gained as a guide. Another obsolescence-related problem that reverse engineering can solve is the need to support (maintain and supply for continuous operation) existing legacy devices that are no longer supported by their original equipment manufacturer. The problem is particularly critical in military operations.
* Product security analysis. This examines how a product works by determining the specifications of its components, estimating costs, and identifying potential patent infringement. Also part of product security analysis is acquiring sensitive data by disassembling and analyzing the design of a system component (Internet Engineering Task Force, RFC 2828, Internet Security Glossary). Another intent may be to remove copy protection or to circumvent access restrictions.
* Competitive technical intelligence. The aim is to understand what one's competitor is actually doing, rather than what it says that it is doing.
* Saving money. Finding out what a piece of electronics can do may spare a user from purchasing a separate product.
* Repurposing. Obsolete objects are reused in a different but useful manner.
* Design. Production and design companies have applied reverse engineering to practical craft-based manufacturing processes. The companies can work on "historical" manufacturing collections through 3D scanning, 3D re-modeling, and re-design. In 2013, the Italian manufacturers Baldi and Savio Firmino, together with the University of Florence, optimized their innovation, design, and production processes.


Common situations


Machines

As computer-aided design (CAD) has become more popular, reverse engineering has become a viable method to create a 3D virtual model of an existing physical part for use in 3D CAD, CAM, CAE, or other software. The reverse-engineering process involves measuring an object and then reconstructing it as a 3D model. The physical object can be measured using 3D scanning technologies like CMMs, laser scanners, structured light digitizers, or industrial CT scanning (computed tomography). The measured data alone, usually represented as a point cloud, lacks topological information and design intent. The former may be recovered by converting the point cloud to a triangular-faced mesh. Reverse engineering aims to go beyond producing such a mesh and to recover the design intent in terms of simple analytical surfaces where appropriate (planes, cylinders, etc.) as well as possibly NURBS surfaces to produce a boundary-representation CAD model. Recovery of such a model allows a design to be modified to meet new requirements, a manufacturing plan to be generated, and so on.

Hybrid modeling is a commonly used term when NURBS and parametric modeling are implemented together. Using a combination of geometric and freeform surfaces can provide a powerful method of 3D modeling. Areas of freeform data can be combined with exact geometric surfaces to create a hybrid model. A typical example is the reverse engineering of a cylinder head, which includes freeform cast features, such as water jackets, and high-tolerance machined areas.

Reverse engineering is also used by businesses to bring existing physical geometry into digital product development environments, to make a digital 3D record of their own products, or to assess competitors' products. It is used to analyze how a product works, what it does, and what components it has; to estimate costs; and to identify potential patent infringement. Value engineering, a related activity that is also used by businesses, involves deconstructing and analyzing products. However, its objective is to find opportunities for cost-cutting.
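The recovery of simple analytical surfaces from measured points, mentioned above, can be illustrated with a minimal sketch. It fits a plane of the form z = a*x + b*y + c to a small synthetic point cloud by ordinary least squares. The function names (`solve3`, `fit_plane`, `max_residual`) and the sample data are hypothetical, and this is only one simple fitting technique, not the method of any particular CAD or scanning package.

```python
# Illustrative sketch: least-squares plane fit z = a*x + b*y + c.
# All names and data here are hypothetical, not taken from any CAD tool.

def solve3(m, v):
    """Solve the 3x3 system m * x = v by Gaussian elimination with partial pivoting."""
    a = [row[:] + [rhs] for row, rhs in zip(m, v)]
    n = 3
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            raise ValueError("degenerate point set; cannot fit a plane")
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(col + 1, n):
            f = a[r][col] / a[col][col]
            for c in range(col, n + 1):
                a[r][c] -= f * a[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = a[i][n] - sum(a[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / a[i][i]
    return x


def fit_plane(points):
    """Return (a, b, c) minimizing the sum of (a*x + b*y + c - z)**2."""
    sxx = sum(x * x for x, _, _ in points)
    sxy = sum(x * y for x, y, _ in points)
    syy = sum(y * y for _, y, _ in points)
    sx = sum(x for x, _, _ in points)
    sy = sum(y for _, y, _ in points)
    sxz = sum(x * z for x, _, z in points)
    syz = sum(y * z for _, y, z in points)
    sz = sum(z for _, _, z in points)
    n = float(len(points))
    # Normal equations of the least-squares problem.
    m = [[sxx, sxy, sx], [sxy, syy, sy], [sx, sy, n]]
    return tuple(solve3(m, [sxz, syz, sz]))


def max_residual(points, plane):
    """Largest vertical distance between any point and the fitted plane."""
    a, b, c = plane
    return max(abs(a * x + b * y + c - z) for x, y, z in points)


# Synthetic "scan" sampled from the plane z = 2x - y + 3, without noise.
cloud = [(x, y, 2 * x - y + 3) for x in range(4) for y in range(4)]
plane = fit_plane(cloud)
print(plane, max_residual(cloud, plane))
```

This parameterization cannot represent vertical planes, and real scan data is noisy and contains several surfaces at once, so a practical tool would likely segment the cloud first and use a more robust fit.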


PCB reverse engineering

Reverse engineering of printed circuit boards involves recreating fabrication data for a particular circuit board. This is done to allow benchmarking and to support legacy systems.


Software

In 1990, the Institute of Electrical and Electronics Engineers (IEEE) defined (software) reverse engineering (SRE) as "the process of analyzing a subject system to identify the system's components and their interrelationships and to create representations of the system in another form or at a higher level of abstraction", in which the "subject system" is the end product of software development. Reverse engineering is a process of examination only; the software system under consideration is not modified, which would otherwise be re-engineering or restructuring. Reverse engineering can be performed from any stage of the product cycle, not necessarily from the functional end product.

There are two components in reverse engineering: redocumentation and design recovery. Redocumentation is the creation of a new representation of the computer code so that it is easier to understand. Design recovery is the use of deduction or reasoning from general knowledge or personal experience of the product to understand the product's functionality fully. It can also be seen as "going backwards through the development cycle." In this model, the output of the implementation phase (in source code form) is reverse-engineered back to the analysis phase, in an inversion of the traditional waterfall model. Another term for this technique is program comprehension. The Working Conference on Reverse Engineering (WCRE) has been held yearly to explore and expand the techniques of reverse engineering. Computer-aided software engineering (CASE) and automated code generation have contributed greatly to the field of reverse engineering.

Software anti-tamper technology like obfuscation is used to deter both reverse engineering and re-engineering of proprietary software and software-powered systems. In practice, two main types of reverse engineering emerge. In the first case, source code is already available, but higher-level aspects of the program, which are perhaps poorly documented or documented but no longer valid, are discovered. In the second case, no source code is available, and any effort towards discovering one possible source code for the software is regarded as reverse engineering. The second usage of the term is more familiar to most people. Reverse engineering of software can make use of the clean room design technique to avoid copyright infringement.

On a related note, black box testing in software engineering has a lot in common with reverse engineering. The tester usually has the API but aims to find bugs and undocumented features by bashing the product from outside.

Other purposes of reverse engineering include security auditing, removal of copy protection ("cracking"), circumvention of access restrictions often present in consumer electronics, customization of embedded systems (such as engine management systems), in-house repairs or retrofits, enabling of additional features on low-cost "crippled" hardware (such as some graphics card chipsets), or even mere satisfaction of curiosity.


Binary software

Binary reverse engineering is performed if the source code for a piece of software is unavailable. This process is sometimes termed reverse code engineering, or RCE. For example, decompilation of binaries for the Java platform can be accomplished by using Jad. One famous case of reverse engineering was the first non-IBM implementation of the PC BIOS, which launched the historic IBM PC compatible industry that has been the overwhelmingly dominant computer hardware platform for many years. Reverse engineering of software is protected in the US by the fair use exception in copyright law.

The Samba software, which allows systems that do not run Microsoft Windows to share files with systems that do, is a classic example of software reverse engineering, since the Samba project had to reverse-engineer unpublished information about how Windows file sharing worked so that non-Windows computers could emulate it. The Wine project does the same thing for the Windows API, and OpenOffice.org is one party doing that for the Microsoft Office file formats. The ReactOS project is even more ambitious in its goals, striving to provide binary (ABI and API) compatibility with the current Windows operating systems of the NT branch, which allows software and drivers written for Windows to run on a clean-room reverse-engineered free software (GPL) counterpart. WindowsSCOPE allows for reverse-engineering the full contents of a Windows system's live memory, including a binary-level, graphical reverse engineering of all running processes.

Another classic, if not well-known, example is that in 1987 Bell Laboratories reverse-engineered the Mac OS System 4.1, originally running on the Apple Macintosh SE, so that it could run on RISC machines of their own.


Binary software techniques

Reverse engineering of software can be accomplished by various methods. The three main groups of software reverse engineering are:
1. Analysis through observation of information exchange, most prevalent in protocol reverse engineering, which involves using bus analyzers and packet sniffers, such as for accessing a computer bus or computer network connection and revealing the traffic data thereon. Bus or network behavior can then be analyzed to produce a standalone implementation that mimics that behavior. That is especially useful for reverse engineering device drivers. Sometimes, reverse engineering on embedded systems is greatly assisted by tools deliberately introduced by the manufacturer, such as JTAG ports or other debugging means. In Microsoft Windows, low-level debuggers such as SoftICE are popular.
2. Disassembly using a disassembler, meaning the raw machine language of the program is read and understood in its own terms, only with the aid of machine-language mnemonics. It works on any computer program but can take quite some time, especially for those who are not used to machine code. The Interactive Disassembler is a particularly popular tool.
3. Decompilation using a decompiler, a process that tries, with varying results, to recreate the source code in some high-level language for a program only available in machine code or bytecode.
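As a small illustration of the disassembly approach described in the list above, the sketch below uses Python's standard `dis` module to list the instruction mnemonics of a compiled function. Note the assumption: this operates on Python bytecode, not native machine code, so it is only an analogue of what a machine-code disassembler does. The `checksum` function and the `mnemonics` helper are hypothetical examples.

```python
import dis


def checksum(data):
    """Toy routine standing in for the code under analysis (hypothetical)."""
    total = 0
    for byte in data:
        total = (total + byte) % 256
    return total


def mnemonics(func):
    """Return the opcode mnemonics of a function's compiled bytecode, in order."""
    return [ins.opname for ins in dis.get_instructions(func)]


ops = mnemonics(checksum)
# The exact opcode sequence depends on the Python version in use.
print(" ".join(ops))
```

Calling `dis.dis(checksum)` instead prints a fuller listing with offsets and operands, which is closer to what an analyst reads when working from a disassembly.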


Software classification

Software classification is the process of identifying similarities between different software binaries (such as two different versions of the same binary) in order to detect code relations between software samples. The task was traditionally done manually for several reasons (such as patch analysis for vulnerability detection and copyright infringement), but it can now be done somewhat automatically for large numbers of samples. This method is used mostly for long and thorough reverse engineering tasks (complete analysis of a complex algorithm or big piece of software). In general, statistical classification is considered to be a hard problem, which is also true for software classification, and so there are few solutions or tools that handle this task well.
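One simple way to score the similarity of two binaries, shown here only as a sketch, is to compare their sets of byte n-grams with the Jaccard index. The function names and the choice of 4-byte n-grams are assumptions for illustration. Practical tools usually compare higher-level features such as functions or control-flow graphs.

```python
def byte_ngrams(data: bytes, n: int = 4) -> set:
    """Return the set of all n-byte substrings of data."""
    return {data[i:i + n] for i in range(len(data) - n + 1)}


def jaccard_similarity(a: bytes, b: bytes, n: int = 4) -> float:
    """Jaccard index of the byte n-gram sets of two binaries (0.0 to 1.0)."""
    grams_a, grams_b = byte_ngrams(a, n), byte_ngrams(b, n)
    if not grams_a and not grams_b:
        return 1.0  # two inputs too short to yield n-grams count as identical
    return len(grams_a & grams_b) / len(grams_a | grams_b)


if __name__ == "__main__":
    # Two hypothetical code fragments sharing the same first four bytes.
    old = bytes.fromhex("554889e59090c3")
    new = bytes.fromhex("554889e531c0c3")
    print(jaccard_similarity(old, new))
```

A score near 1.0 suggests that two samples share most of their byte sequences, while a score near 0.0 suggests unrelated content.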


Source code

A number of UML tools refer to the process of importing and analysing source code to generate UML diagrams as "reverse engineering" (see List of UML tools). Although UML is one approach to providing "reverse engineering," more recent advances in international standards activities have resulted in the development of the Knowledge Discovery Metamodel (KDM). The standard delivers an ontology for the intermediate (or abstracted) representation of programming language constructs and their interrelationships. An Object Management Group standard (on its way to becoming an ISO standard as well), KDM has started to take hold in industry with the development of tools and analysis environments that can deliver the extraction and analysis of source, binary, and byte code. For source code analysis, KDM's granular standards architecture enables the extraction of software system flows (data, control, and call maps), architectures, and business layer knowledge (rules, terms, and process). The standard enables the use of a common data format (XMI), which allows the various layers of system knowledge to be correlated for either detailed analysis (such as root cause, impact) or derived analysis (such as business process extraction). Although efforts to represent language constructs can be never-ending because of the number of languages, the continuous evolution of software languages, and the development of new languages, the standard does allow for the use of extensions to support the broad language set as well as evolution. KDM is compatible with UML, BPMN, RDF, and other standards, enabling migration into other environments and thus leveraging system knowledge for efforts such as software system transformation and enterprise business layer analysis.
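To make the idea of extracting a call map from source code concrete, here is a minimal sketch that uses Python's standard `ast` module. It is not KDM and produces no XMI. The function `call_map` and the sample program are assumptions for illustration, and only calls to plain names are recorded (method calls such as `obj.method()` are ignored).

```python
import ast
from collections import defaultdict


def call_map(source: str) -> dict:
    """Map each function defined in source to the plain names it calls."""
    tree = ast.parse(source)
    calls = defaultdict(set)
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            # Calls inside nested functions are also attributed to the outer one.
            for inner in ast.walk(node):
                if isinstance(inner, ast.Call) and isinstance(inner.func, ast.Name):
                    calls[node.name].add(inner.func.id)
    return {name: sorted(callees) for name, callees in calls.items()}


# Hypothetical program under analysis.
SAMPLE = '''
def load(path):
    return open(path).read()

def main():
    text = load("input.txt")
    print(len(text))
'''

if __name__ == "__main__":
    for caller, callees in call_map(SAMPLE).items():
        print(caller, "->", ", ".join(callees))
```

The resulting mapping is the kind of raw material from which a diagram or a more abstract model could be generated.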


Protocols

Protocols are sets of rules that describe message formats and how messages are exchanged: the protocol state machine. Accordingly, the problem of protocol reverse-engineering can be partitioned into two subproblems: message format and state-machine reverse-engineering. The message formats have traditionally been reverse-engineered by a tedious manual process, which involved analysis of how protocol implementations process messages, but recent research proposed a number of automatic solutions (P. M. Comparetti, G. Wondracek, C. Kruegel, and E. Kirda, "Prospex: Protocol specification extraction," in Proceedings of the 2009 30th IEEE Symposium on Security and Privacy, pp. 110–125, Washington, 2009, IEEE Computer Society). Typically, the automatic approaches group observed messages into clusters by using various clustering analyses, or they emulate the protocol implementation while tracing the message processing. There has been less work on reverse-engineering of state-machines of protocols. In general, the protocol state-machines can be learned either through a process of offline learning, which passively observes communication and attempts to build the most general state-machine accepting all observed sequences of messages, or through online learning, which allows interactive generation of probing sequences of messages and listening to responses to those probing sequences. In general, offline learning of small state-machines is known to be NP-complete, but online learning can be done in polynomial time. An automatic offline approach has been demonstrated by Comparetti et al. and an online approach by Cho et al. Other components of typical protocols, like encryption and hash functions, can be reverse-engineered automatically as well. Typically, the automatic approaches trace the execution of protocol implementations and try to detect buffers in memory holding unencrypted packets.
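The offline setting can be illustrated with a prefix tree acceptor, a common starting point in passive automata learning. The sketch below is not the method of Comparetti et al. It builds the most specific machine that accepts exactly the observed message sequences and omits the state-merging step that would generalize it. The message names are hypothetical.

```python
def build_prefix_tree(sequences):
    """Build a prefix tree acceptor from observed message-type sequences.

    Returns (transitions, accepting), where transitions maps
    (state, message) to the next state and state 0 is the initial state.
    """
    transitions = {}
    accepting = set()
    next_state = 1
    for seq in sequences:
        state = 0
        for msg in seq:
            key = (state, msg)
            if key not in transitions:
                transitions[key] = next_state
                next_state += 1
            state = transitions[key]
        accepting.add(state)  # the state reached at the end of a session
    return transitions, accepting


def accepts(transitions, accepting, seq):
    """Check whether the machine accepts a message sequence."""
    state = 0
    for msg in seq:
        state = transitions.get((state, msg))
        if state is None:
            return False
    return state in accepting


if __name__ == "__main__":
    # Hypothetical sessions captured from network traffic.
    traces = [["HELO", "MAIL", "DATA", "QUIT"], ["HELO", "QUIT"]]
    machine = build_prefix_tree(traces)
    print(accepts(*machine, ["HELO", "QUIT"]))
```

Merging compatible states of such a tree is what yields a smaller, more general machine. Finding the smallest machine consistent with the observations is the hard part that the complexity result above refers to.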


Integrated circuits/smart cards

Reverse engineering is an invasive and destructive form of analyzing a smart card. The attacker uses chemicals to etch away layer after layer of the smart card and takes pictures with a scanning electron microscope (SEM). That technique can reveal the complete hardware and software part of the smart card. The major problem for the attacker is to bring everything into the right order to find out how everything works. The makers of the card try to hide keys and operations by mixing up memory positions, such as by bus scrambling. In some cases, it is even possible to attach a probe to measure voltages while the smart card is still operational. The makers of the card employ sensors to detect and prevent that attack. That attack is not very common because it requires both a large investment in effort and special equipment that is generally available only to large chip manufacturers. Furthermore, the payoff from this attack is low since other security techniques are often used, such as shadow accounts. It is still uncertain whether attacks against chip-and-PIN cards to replicate encryption data and then to crack PINs would provide a cost-effective attack on multifactor authentication.
Full reverse engineering proceeds in several major steps. The first step after images have been taken with a SEM is stitching the images together, which is necessary because each layer cannot be captured by a single shot. A SEM needs to sweep across the area of the circuit and take several hundred images to cover the entire layer. Image stitching takes as input several hundred pictures and outputs a single properly overlapped picture of the complete layer. Next, the stitched layers need to be aligned because the sample, after etching, cannot be put into the exact same position relative to the SEM each time. Therefore, the stitched versions will not overlap in the correct fashion, as on the real circuit. Usually, three corresponding points are selected in each image, and a transformation is computed from them and applied. To extract the circuit structure, the aligned, stitched images need to be segmented, which highlights the important circuitry and separates it from the uninteresting background and insulating materials. Finally, the wires can be traced from one layer to the next, and the netlist of the circuit, which contains all of the circuit's information, can be reconstructed.
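The alignment step can be sketched as follows, assuming that the transformation is a 2D affine map (the text does not say which kind of transformation is used). Three non-collinear point pairs determine such a map exactly. The function names are illustrative, and the small linear systems are solved with Cramer's rule.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as three rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def solve3(m, rhs):
    """Solve the 3x3 linear system m * x = rhs with Cramer's rule."""
    d = det3(m)
    if abs(d) < 1e-12:
        raise ValueError("the three source points are collinear")
    solution = []
    for col in range(3):
        replaced = [list(row) for row in m]
        for r in range(3):
            replaced[r][col] = rhs[r]
        solution.append(det3(replaced) / d)
    return tuple(solution)


def affine_from_points(src, dst):
    """Return ((a, b, c), (d, e, f)) such that each src point (x, y)
    maps to (a*x + b*y + c, d*x + e*y + f), its partner in dst."""
    m = [[x, y, 1.0] for x, y in src]
    row_u = solve3(m, [u for u, _ in dst])
    row_v = solve3(m, [v for _, v in dst])
    return row_u, row_v


def apply_affine(transform, point):
    """Apply an affine transform to a single (x, y) point."""
    (a, b, c), (d, e, f) = transform
    x, y = point
    return (a * x + b * y + c, d * x + e * y + f)


if __name__ == "__main__":
    # Hypothetical landmarks picked on two stitched layer images.
    layer_a = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
    layer_b = [(2.0, 3.0), (3.0, 3.0), (2.0, 4.0)]
    transform = affine_from_points(layer_a, layer_b)
    print(apply_affine(transform, (5.0, 5.0)))
```

In practice the computed transform would be applied to every pixel of one layer image so that it overlays the other.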


Military applications

Reverse engineering is often used by people to copy other nations' technologies, devices, or information that have been obtained by regular troops in the field or by intelligence operations. It was often used during the Second World War and the Cold War. Here are well-known examples from the Second World War and later:
* Jerry can: British and American forces in WW2 noticed that the Germans had gasoline cans with an excellent design. They reverse-engineered copies of those cans, which became popularly known as "Jerry cans."
* Panzerschreck: The Germans captured an American bazooka during the Second World War and reverse-engineered it to create the larger Panzerschreck.
* Tupolev Tu-4: In 1944, three American B-29 bombers on missions over Japan were forced to land in the Soviet Union. The Soviets, who did not have a similar strategic bomber, decided to copy the B-29. Within three years, they had developed the Tu-4, a nearly perfect copy.
* SCR-584 radar: Copied by the Soviet Union after the Second World War, it is known in a few modifications: СЦР-584 and Бинокль-Д.
* V-2 rocket: Technical documents for the V-2 and related technologies were captured by the Western Allies at the end of the war. The Americans focused their reverse engineering efforts via Operation Paperclip, which led to the development of the PGM-11 Redstone rocket. The Soviets used captured German engineers to reproduce technical documents and plans and worked from captured hardware to make their clone of the rocket, the R-1. Thus began the postwar Soviet rocket program, which led to the R-7 and the beginning of the space race.
* K-13/R-3S missile (NATO reporting name AA-2 Atoll): This Soviet reverse-engineered copy of the AIM-9 Sidewinder was made possible after a Taiwanese (ROCAF) AIM-9B hit a Chinese PLA MiG-17 without exploding in September 1958. The missile became lodged within the airframe, and the pilot returned to base with what Soviet scientists would describe as a university course in missile development.
* BGM-71 TOW missile: In May 1975, negotiations between Iran and Hughes Missile Systems on co-production of the TOW and Maverick missiles stalled over disagreements in the pricing structure, and the subsequent 1979 revolution ended all plans for such co-production. Iran later succeeded in reverse-engineering the missile and now produces its own copy, the Toophan.
* China has reverse-engineered many examples of Western and Russian hardware, from fighter aircraft to missiles and HMMWV cars, such as the MiG-15, 17, 19, and 21 (which became the J-2, 5, 6, and 7) and the Su-33 (which became the J-15). More recent analyses of China's military growth have pointed to the inherent limitations of habitual reverse engineering for advanced weapon systems.
* During the Second World War, Polish and British cryptographers studied captured German "Enigma" message encryption machines for weaknesses. Their operation was then simulated on electromechanical devices, "bombes", which tried all the possible scrambler settings of the "Enigma" machines and so helped to break coded messages that had been sent by the Germans.
* Also during the Second World War, British scientists analyzed and defeated a series of increasingly sophisticated radio navigation systems used by the Luftwaffe to perform guided bombing missions at night. The British countermeasures to the system were so effective that in some cases, German aircraft were led by signals to land at RAF bases since they believed that they had returned to German territory.


Gene networks

Reverse engineering concepts have been applied to biology as well, specifically to the task of understanding the structure and function of gene regulatory networks. They regulate almost every aspect of biological behavior and allow cells to carry out physiological processes and responses to perturbations. Understanding the structure and the dynamic behavior of gene networks is therefore one of the paramount challenges of systems biology, with immediate practical repercussions in several applications that are beyond basic research. There are several methods for reverse engineering gene regulatory networks by using molecular biology and data science methods. They have been generally divided into six classes:
* Coexpression methods are based on the notion that if two genes exhibit a similar expression profile, they may be related, although causation cannot be inferred from coexpression alone.
* Sequence motif methods analyze gene promoters to find specific transcription factor binding domains. If a transcription factor is predicted to bind a promoter of a specific gene, a regulatory connection can be hypothesized.
* Chromatin ImmunoPrecipitation (ChIP) methods investigate the genome-wide profile of DNA binding of chosen transcription factors to infer their downstream gene networks.
* Orthology methods transfer gene network knowledge from one species to another.
* Literature methods implement text mining and manual research to identify putative or experimentally proven gene network connections.
* Transcriptional complexes methods leverage information on protein-protein interactions between transcription factors, thus extending the concept of gene networks to include transcriptional regulatory complexes.
Often, gene network reliability is tested by genetic perturbation experiments followed by dynamic modelling, based on the principle that removing one network node has predictable effects on the functioning of the remaining nodes of the network. Applications of the reverse engineering of gene networks range from understanding mechanisms of plant physiology to the highlighting of new targets for anticancer therapy.
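The coexpression class of methods listed above can be sketched in a few lines. The example assumes that similarity of expression profiles is measured with the Pearson correlation coefficient and that an arbitrary threshold of 0.9 decides which gene pairs are linked. Both choices, like the gene names and values, are assumptions for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equally long profiles."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("profiles must have the same length of at least 2")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("a constant profile has no defined correlation")
    return cov / sqrt(var_x * var_y)


def coexpression_edges(profiles, threshold=0.9):
    """Return (gene1, gene2, r) for every gene pair with |r| >= threshold."""
    genes = sorted(profiles)
    edges = []
    for i, g1 in enumerate(genes):
        for g2 in genes[i + 1:]:
            r = pearson(profiles[g1], profiles[g2])
            if abs(r) >= threshold:
                edges.append((g1, g2, r))
    return edges


if __name__ == "__main__":
    # Hypothetical expression levels measured under four conditions.
    profiles = {
        "geneA": [1.0, 2.0, 3.0, 4.0],
        "geneB": [2.0, 4.0, 6.0, 8.0],
        "geneC": [1.0, -1.0, 1.0, -1.0],
    }
    for g1, g2, r in coexpression_edges(profiles):
        print(g1, g2, round(r, 3))
```

An edge produced this way is only a hypothesis of relatedness, which is why it is usually checked against perturbation experiments or the other method classes.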


Overlap with patent law

Reverse engineering applies primarily to gaining understanding of a process or artifact in which the manner of its construction, use, or internal processes has not been made clear by its creator. Patented items do not of themselves have to be reverse-engineered to be studied, for the essence of a patent is that inventors provide a detailed public disclosure themselves and in return receive legal protection of the invention that is involved. However, an item produced under one or more patents could also include other technology that is not patented and not disclosed. Indeed, one common motivation of reverse engineering is to determine whether a competitor's product involves patent infringement or copyright infringement.


Legality


United States

In the United States, even if an artifact or process is protected by trade secrets, reverse-engineering the artifact or process is often lawful if it has been legitimately obtained. Reverse engineering of computer software often falls under both contract law, as a breach of contract, and any other relevant laws. That is because most end-user license agreements specifically prohibit it, and US courts have ruled that if such terms are present, they override the copyright law that expressly permits it (see ''Bowers v. Baystate Technologies''). According to Section 103(f) of the Digital Millennium Copyright Act (17 U.S.C. § 1201 (f)), a person in legal possession of a program may reverse-engineer and circumvent its protection if that is necessary to achieve "interoperability," a term that broadly covers other devices and programs that can interact with it, make use of it, and use and transfer data to and from it in useful ways. A limited exemption exists that allows the knowledge thus gained to be shared and used for interoperability purposes.


European Union

EU Directive 2009/24 on the legal protection of computer programs, which superseded an earlier (1991) directive, governs reverse engineering in the European Union. The directive states:


See also

* Antikythera mechanism
* Backward induction
* Benchmarking
* Bus analyzer
* Chonda
* Clone (computing)
* Clean room design
* CMM
* Code morphing
* Connectix Virtual Game Station
* Counterfeiting
* Cryptanalysis
* Decompile
* Deformulation
* Digital Millennium Copyright Act (DMCA)
* Disassembler
* Dongle
* Forensic engineering
* Industrial CT scanning
* Interactive Disassembler
* Knowledge Discovery Metamodel
* Laser scanner
* List of production topics
* Listeroid Engines
* Logic analyzer
* ''Paycheck''
* Product teardown
* Repurposing
* Reverse architecture
* Round-trip engineering
* Retrodiction
* ''Sega v. Accolade''
* Software archaeology
* Software cracking
* Structured light digitizer
* Value engineering

