Code refactoring
   HOME

TheInfoList



OR:

In
computer programming Computer programming is the process of performing a particular computation (or more generally, accomplishing a specific computing result), usually by designing and building an executable computer program. Programming involves tasks such as anal ...
and software design, code refactoring is the process of restructuring existing
computer code A computer is a machine that can be programmed to carry out sequences of arithmetic or logical operations ( computation) automatically. Modern digital electronic computers can perform generic sets of operations known as programs. These pr ...
—changing the '' factoring''—without changing its external behavior. Refactoring is intended to improve the design, structure, and/or implementation of the
software Software is a set of computer programs and associated software documentation, documentation and data (computing), data. This is in contrast to Computer hardware, hardware, from which the system is built and which actually performs the work. ...
(its '' non-functional'' attributes), while preserving its
functionality Function or functionality may refer to: Computing * Function key, a type of key on computer keyboards * Function model, a structured representation of processes in a system * Function object or functor or functionoid, a concept of object-oriente ...
. Potential advantages of refactoring may include improved code readability and reduced complexity; these can improve the
source code In computing, source code, or simply code, is any collection of code, with or without comments, written using a human-readable programming language, usually as plain text. The source code of a program is specially designed to facilitate the w ...
s
maintainability In engineering, maintainability is the ease with which a product can be maintained to: * correct defects or their cause, * Repair or replace faulty or worn-out components without having to replace still working parts, * prevent unexpected working ...
and create a simpler, cleaner, or more expressive internal
architecture Architecture is the art and technique of designing and building, as distinguished from the skills associated with construction. It is both the process and the product of sketching, conceiving, planning, designing, and constructing building ...
or
object model In computing, object model has two related but distinct meanings: # The properties of objects in general in a specific computer programming language, technology, notation or methodology that uses them. Examples are the object models of ''Java'', ...
to improve extensibility. Another potential goal for refactoring is improved performance; software engineers face an ongoing challenge to write programs that perform faster or use less memory. Typically, refactoring applies a series of standardized basic ''micro-refactorings'', each of which is (usually) a tiny change in a computer program's source code that either preserves the behavior of the software, or at least does not modify its conformance to functional requirements. Many development environments provide automated support for performing the mechanical aspects of these basic refactorings. If done well, code refactoring may help software developers discover and fix hidden or dormant bugs or vulnerabilities in the system by simplifying the underlying logic and eliminating unnecessary levels of complexity. If done poorly, it may fail the requirement that external functionality not be changed, and may thus introduce new bugs.


Motivation

Refactoring is usually motivated by noticing a
code smell In computer programming, a code smell is any characteristic in the source code of a program that possibly indicates a deeper problem. Determining what is and is not a code smell is subjective, and varies by language, developer, and development meth ...
. For example, the method at hand may be very long, or it may be a near
duplicate Duplication, duplicate, and duplicator may refer to: Biology and genetics * Gene duplication, a process which can result in free mutation * Chromosomal duplication, which can cause Bloom and Rett syndrome * Polyploidy, a phenomenon also known ...
of another nearby method. Once recognized, such problems can be addressed by ''refactoring'' the source code, or transforming it into a new form that behaves the same as before but that no longer "smells". For a long routine, one or more smaller subroutines can be extracted; or for duplicate routines, the duplication can be removed and replaced with one shared function. Failure to perform refactoring can result in accumulating
technical debt In software development, technical debt (also known as design debt or code debt) is the implied cost of additional rework caused by choosing an easy (limited) solution now instead of using a better approach that would take longer. Analogous with ...
; on the other hand, refactoring is one of the primary means of repaying technical debt.


Benefits

There are two general categories of benefits to the activity of refactoring. #
Maintainability In engineering, maintainability is the ease with which a product can be maintained to: * correct defects or their cause, * Repair or replace faulty or worn-out components without having to replace still working parts, * prevent unexpected working ...
. It is easier to fix bugs because the source code is easy to read and the intent of its author is easy to grasp. This might be achieved by reducing large monolithic routines into a set of individually concise, well-named, single-purpose methods. It might be achieved by moving a method to a more appropriate class, or by removing misleading comments. # Extensibility. It is easier to extend the capabilities of the application if it uses recognizable design patterns, and it provides some flexibility where none before may have existed. Performance engineering can remove inefficiencies in programs, known as software bloat, arising from traditional software-development strategies that aim to minimize an application's development time rather than the time it takes to run. Performance engineering can also tailor
software Software is a set of computer programs and associated software documentation, documentation and data (computing), data. This is in contrast to Computer hardware, hardware, from which the system is built and which actually performs the work. ...
to the hardware on which it runs, for example, to take advantage of parallel processors and vector units.


Challenges

Refactoring requires extracting software system structure, data models, and intra-application dependencies to get back knowledge of an existing software system. The turnover of teams implies missing or inaccurate knowledge of the current state of a system and about design decisions made by departing developers. Further code refactoring activities may require additional effort to regain this knowledge. Refactoring activities generate architectural modifications that deteriorate the structural architecture of a software system. Such deterioration affects architectural properties such as maintainability and comprehensibility which can lead to a complete re-development of software systems. Code refactoring activities are secured with
software intelligence Software Intelligence is insight into the structural condition of software assets produced by software designed to analyze database structure, software framework and source code to better understand and control complex software systems in Informatio ...
when using tools and techniques providing data about algorithms and sequences of code execution. Providing a comprehensible format for the inner-state of software system structure, data models, and intra-components dependencies is a critical element to form a high-level understanding and then refined views of what needs to be modified, and how.


Testing

Automatic unit tests should be set up before refactoring to ensure routines still behave as expected. Unit tests can bring stability to even large refactors when performed with a single
atomic commit In the field of computer science, an atomic commit is an operation that applies a set of distinct changes as a single operation. If the changes are applied, then the atomic commit is said to have succeeded. If there is a failure before the atomic ...
. A common strategy to allow safe and atomic refactors spanning multiple projects is to store all projects in a single repository, known as
monorepo In version-control systems, a monorepo ("mono" meaning 'single' and "repo" being short for ' repository') is a software-development strategy in which the code for a number of projects is stored in the same repository. This practice dates back to a ...
. With unit testing in place, refactoring is then an iterative cycle of making a small
program transformation A program transformation is any operation that takes a computer program and generates another program. In many cases the transformed program is required to be semantically equivalent to the original, relative to a particular formal semantics and ...
, testing it to ensure correctness, and making another small transformation. If at any point a test fails, the last small change is undone and repeated in a different way. Through many small steps the program moves from where it was to where you want it to be. For this very iterative process to be practical, the tests must run very quickly, or the programmer would have to spend a large fraction of their time waiting for the tests to finish. Proponents of
extreme programming Extreme programming (XP) is a software development methodology intended to improve software quality and responsiveness to changing customer requirements. As a type of agile software development,"Human Centred Technology Workshop 2006 ", 2006, P ...
and other agile software development describe this activity as an integral part of the software development cycle.


Techniques

Here are some examples of micro-refactorings; some of these may only apply to certain languages or language types. A longer list can be found in Martin Fowler's refactoring book and website.(these are only about OOP however
Refactoring techniques in Fowler's refactoring Website
/ref> Many development environments provide automated support for these micro-refactorings. For instance, a programmer could click on the name of a variable and then select the "Encapsulate field" refactoring from a
context menu A context menu (also called contextual, shortcut, and pop up or pop-up menu) is a menu in a graphical user interface (GUI) that appears upon user interaction, such as a right-click mouse operation. A context menu offers a limited set of choic ...
. The IDE would then prompt for additional details, typically with sensible defaults and a preview of the code changes. After confirmation by the programmer it would carry out the required changes throughout the code. * Techniques that allow for more
understanding Understanding is a psychological process related to an abstract or physical object, such as a person, situation, or message whereby one is able to use concepts to model that object. Understanding is a relation between the knower and an object ...
** Program Dependence Graph - explicit representation of data and control dependencies ** System Dependence Graph - representation of procedure calls between PDG **
Software intelligence Software Intelligence is insight into the structural condition of software assets produced by software designed to analyze database structure, software framework and source code to better understand and control complex software systems in Informatio ...
- reverse engineers the initial state to understand existing intra-application dependencies * Techniques that allow for more abstraction ** Encapsulate field – force code to access the field with getter and setter methods ** Generalize type – create more general types to allow for more code sharing ** Replace type-checking code with state/strategy ** Replace conditional with polymorphism * Techniques for breaking code apart into more logical pieces ** Componentization breaks code down into reusable semantic units that present clear, well-defined, simple-to-use interfaces. ** Extract class moves part of the code from an existing class into a new class. ** Extract method, to turn part of a larger
method Method ( grc, μέθοδος, methodos) literally means a pursuit of knowledge, investigation, mode of prosecuting such inquiry, or system. In recent centuries it more often means a prescribed process for completing a task. It may refer to: *Scien ...
into a new method. By breaking down code in smaller pieces, it is more easily understandable. This is also applicable to
function Function or functionality may refer to: Computing * Function key, a type of key on computer keyboards * Function model, a structured representation of processes in a system * Function object or functor or functionoid, a concept of object-oriente ...
s. * Techniques for improving names and location of code ** Move method or move field – move to a more appropriate class or source file ** Rename method or rename field – changing the name into a new one that better reveals its purpose ** Pull up – in
object-oriented programming Object-oriented programming (OOP) is a programming paradigm based on the concept of "objects", which can contain data and code. The data is in the form of fields (often known as attributes or ''properties''), and the code is in the form of ...
(OOP), move to a superclass ** Push down – in OOP, move to a subclass * Automatic
clone detection In computer programming, duplicate code is a sequence of source code that occurs more than once, either within a program or across different programs owned or maintained by the same entity. Duplicate code is generally considered undesirable for a n ...


Hardware refactoring

While the term ''refactoring'' originally referred exclusively to refactoring of software code, in recent years code written in
hardware description language In computer engineering, a hardware description language (HDL) is a specialized computer language used to describe the structure and behavior of electronic circuits, and most commonly, digital logic circuits. A hardware description language en ...
s has also been refactored. The term ''hardware refactoring'' is used as a shorthand term for refactoring of code in hardware description languages. Since hardware description languages are not considered to be
programming language A programming language is a system of notation for writing computer programs. Most programming languages are text-based formal languages, but they may also be graphical. They are a kind of computer language. The description of a programming ...
s by most hardware engineers, hardware refactoring is to be considered a separate field from traditional code refactoring. Automated refactoring of analog hardware descriptions (in VHDL-AMS) has been proposed by Zeng and Huss. In their approach, refactoring preserves the simulated behavior of a hardware design. The non-functional measurement that improves is that refactored code can be processed by standard synthesis tools, while the original code cannot. Refactoring of digital hardware description languages, albeit manual refactoring, has also been investigated by
Synopsys Synopsys is an American electronic design automation (EDA) company that focuses on silicon design and verification, silicon intellectual property and software security and quality. Products include tools for logic synthesis and physical de ...
fellow A fellow is a concept whose exact meaning depends on context. In learned or professional societies, it refers to a privileged member who is specially elected in recognition of their work and achievements. Within the context of higher education ...
Mike Keating. His target is to make complex systems easier to understand, which increases the designers' productivity.


History

The first known use of the term "refactoring" in the published literature was in a September, 1990 article by William Opdyke and Ralph Johnson. Griswold's Ph.D. thesis, Opdyke's Ph.D. thesis, published in 1992, also used this term. Although refactoring code has been done informally for decades, William Griswold's 1991 Ph.D. dissertation is one of the first major academic works on refactoring functional and procedural programs, followed by William Opdyke's 1992 dissertation on the refactoring of object-oriented programs, although all the theory and machinery have long been available as
program transformation A program transformation is any operation that takes a computer program and generates another program. In many cases the transformed program is required to be semantically equivalent to the original, relative to a particular formal semantics and ...
systems. All of these resources provide a catalog of common methods for refactoring; a refactoring method has a description of how to apply the
method Method ( grc, μέθοδος, methodos) literally means a pursuit of knowledge, investigation, mode of prosecuting such inquiry, or system. In recent centuries it more often means a prescribed process for completing a task. It may refer to: *Scien ...
and indicators for when you should (or should not) apply the method. Martin Fowler's book ''Refactoring: Improving the Design of Existing Code'' is the canonical reference. The terms "factoring" and "factoring out" have been used in this way in the Forth community since at least the early 1980s. Chapter Six of Leo Brodie's book '' Thinking Forth'' (1984) is dedicated to the subject. In extreme programming, the Extract Method refactoring technique has essentially the same meaning as factoring in Forth; to break down a "word" (or
function Function or functionality may refer to: Computing * Function key, a type of key on computer keyboards * Function model, a structured representation of processes in a system * Function object or functor or functionoid, a concept of object-oriente ...
) into smaller, more easily maintained functions. Refactorings can also be reconstructed posthoc to produce concise descriptions of complex software changes recorded in software repositories like CVS or SVN.


Automated code refactoring

Many software
editors Editing is the process of selecting and preparing written, photographic, visual, audible, or cinematic material used by a person or an entity to convey a message or information. The editing process can involve correction, condensation, or ...
and
IDEs Ides or IDES may refer to: Calendar dates * Ides (calendar), a day in the Roman calendar that fell roughly in the middle of the month. In March, May, July, and October it was the 15th day of the month; in other months it was the 13th. **Ides of Mar ...
have automated refactoring support. It is possible to refactor application code as well as test code. Here is a list of a few of these editors, or so-called refactoring browsers. *
DMS Software Reengineering Toolkit The DMS Software Reengineering Toolkit is a proprietary set of program transformation tools available for automating custom source program analysis, modification, translation or generation of software systems for arbitrary mixtures of source langu ...
(Implements large-scale refactoring for C, C++, C#, COBOL, Java, PHP and other languages) * Eclipse based: ** Eclipse (for
Java Java (; id, Jawa, ; jv, ꦗꦮ; su, ) is one of the Greater Sunda Islands in Indonesia. It is bordered by the Indian Ocean to the south and the Java Sea to the north. With a population of 151.6 million people, Java is the world's mos ...
, and to a lesser extent, C++, PHP, Ruby and JavaScript) ** PyDev (for
Python Python may refer to: Snakes * Pythonidae, a family of nonvenomous snakes found in Africa, Asia, and Australia ** ''Python'' (genus), a genus of Pythonidae found in Africa and Asia * Python (mythology), a mythical serpent Computing * Python (pro ...
) ** Photran (a Fortran plugin for the Eclipse IDE) *
Embarcadero Delphi Delphi is a general-purpose programming language and a software product that uses the Delphi dialect of the Object Pascal programming language and provides an integrated development environment (IDE) for rapid application development of desktop, ...
* IntelliJ based: ** Resharper (for C#) **
AppCode JetBrains s.r.o. (formerly IntelliJ Software s.r.o.) is a Czech software development company which makes tools for software developers and project managers. , the company has offices in Prague; Munich; Berlin; Boston, Massachusetts; Amsterda ...
(for
Objective-C Objective-C is a general-purpose, object-oriented programming language that adds Smalltalk-style messaging to the C programming language. Originally developed by Brad Cox and Tom Love in the early 1980s, it was selected by NeXT for its NeXT ...
, C and C++) **
IntelliJ IDEA IntelliJ IDEA is an integrated development environment (IDE) written in Java for developing computer software written in Java, Kotlin, Groovy, and other JVM-based languages. It is developed by JetBrains (formerly known as IntelliJ) and is av ...
(for
Java Java (; id, Jawa, ; jv, ꦗꦮ; su, ) is one of the Greater Sunda Islands in Indonesia. It is bordered by the Indian Ocean to the south and the Java Sea to the north. With a population of 151.6 million people, Java is the world's mos ...
) **
PyCharm PyCharm is an integrated development environment (IDE) used for programming in Python. It provides code analysis, a graphical debugger, an integrated unit tester, integration with version control systems, and supports web development with Djan ...
(for
Python Python may refer to: Snakes * Pythonidae, a family of nonvenomous snakes found in Africa, Asia, and Australia ** ''Python'' (genus), a genus of Pythonidae found in Africa and Asia * Python (mythology), a mythical serpent Computing * Python (pro ...
) **
WebStorm JetBrains s.r.o. (formerly IntelliJ Software s.r.o.) is a Czech software development company which makes tools for software developers and project managers. , the company has offices in Prague; Munich; Berlin; Boston, Massachusetts; Amsterda ...
(for
JavaScript JavaScript (), often abbreviated as JS, is a programming language that is one of the core technologies of the World Wide Web, alongside HTML and CSS. As of 2022, 98% of websites use JavaScript on the client side for webpage behavior, of ...
) **
PhpStorm JetBrains s.r.o. (formerly IntelliJ Software s.r.o.) is a Czech software development company which makes tools for software developers and project managers. , the company has offices in Prague; Munich; Berlin; Boston, Massachusetts; Amsterda ...
(for
PHP PHP is a general-purpose scripting language geared toward web development. It was originally created by Danish-Canadian programmer Rasmus Lerdorf in 1993 and released in 1995. The PHP reference implementation is now produced by The PHP Group. ...
) **
Android Studio Android Studio is the official integrated development environment (IDE) for Google's Android operating system, built on JetBrains' IntelliJ IDEA software and designed specifically for Android development. It is available for download on Windows ...
(for
Java Java (; id, Jawa, ; jv, ꦗꦮ; su, ) is one of the Greater Sunda Islands in Indonesia. It is bordered by the Indian Ocean to the south and the Java Sea to the north. With a population of 151.6 million people, Java is the world's mos ...
and C++) *
JDeveloper JDeveloper is a freeware IDE supplied by Oracle Corporation. It offers features for development in Java, XML, SQL and PL/SQL, HTML, JavaScript, BPEL and PHP. JDeveloper covers the full development lifecycle from design through coding, debuggi ...
(for
Java Java (; id, Jawa, ; jv, ꦗꦮ; su, ) is one of the Greater Sunda Islands in Indonesia. It is bordered by the Indian Ocean to the south and the Java Sea to the north. With a population of 151.6 million people, Java is the world's mos ...
) *
NetBeans NetBeans is an integrated development environment (IDE) for Java. NetBeans allows applications to be developed from a set of modular software components called ''modules''. NetBeans runs on Windows, macOS, Linux and Solaris. In addition to Java ...
(for
Java Java (; id, Jawa, ; jv, ꦗꦮ; su, ) is one of the Greater Sunda Islands in Indonesia. It is bordered by the Indian Ocean to the south and the Java Sea to the north. With a population of 151.6 million people, Java is the world's mos ...
) * Smalltalk: Most dialects include powerful refactoring tools. Many use the original refactoring browser produced in the early '90s by Ralph Johnson. * Visual Studio based: **
Visual Studio Visual Studio is an integrated development environment (IDE) from Microsoft. It is used to develop computer programs including websites, web apps, web services and mobile apps. Visual Studio uses Microsoft software development platforms such ...
(for .NET and C++) ** CodeRush (addon for Visual Studio) ** Visual Assist (addon for Visual Studio with refactoring support for C# and C++) * Wing IDE (for
Python Python may refer to: Snakes * Pythonidae, a family of nonvenomous snakes found in Africa, Asia, and Australia ** ''Python'' (genus), a genus of Pythonidae found in Africa and Asia * Python (mythology), a mythical serpent Computing * Python (pro ...
) *
Xcode Xcode is Apple's integrated development environment (IDE) for macOS, used to develop software for macOS, iOS, iPadOS, watchOS, and tvOS. It was initially released in late 2003; the latest stable release is version 14.2, released on December 13, ...
(for C,
Objective-C Objective-C is a general-purpose, object-oriented programming language that adds Smalltalk-style messaging to the C programming language. Originally developed by Brad Cox and Tom Love in the early 1980s, it was selected by NeXT for its NeXT ...
, and
Swift Swift or SWIFT most commonly refers to: * SWIFT, an international organization facilitating transactions between banks ** SWIFT code * Swift (programming language) * Swift (bird), a family of birds It may also refer to: Organizations * SWIFT, ...
) *
Qt Creator Qt Creator is a cross-platform C++, JavaScript and QML integrated development environment (IDE) which simplifies GUI application development. It is part of the SDK for the Qt GUI application development framework and uses the Qt API, which e ...
(for C++, Objective-C and QML)


See also

* Amelioration pattern * Code review * Database refactoring *
Decomposition (computer science) Decomposition in computer science, also known as factoring, is breaking a complex problem or system into parts that are easier to conceive, understand, program, and maintain. Overview There are different types of decomposition defined in comput ...
*
Modular programming Modular programming is a software design technique that emphasizes separating the functionality of a Computer program, program into independent, interchangeable modules, such that each contains everything necessary to execute only one aspect of th ...
*
Obfuscated code In software development, obfuscation is the act of creating source or machine code that is difficult for humans or computers to understand. Like obfuscation in natural language, it may use needlessly roundabout expressions to compose statemen ...
*
Prefactoring Prefactoring is the application of experience to the creation of new software systems. Its relationship to its namesake refactoring is that lessons learned from refactoring are part of that experience. Experience is captured in guidelines that ca ...
*
Separation of concerns In computer science, separation of concerns is a design principle for separating a computer program into distinct sections. Each section addresses a separate '' concern'', a set of information that affects the code of a computer program. A concern ...
*
Software peer review In software development, peer review is a type of software review in which a work product (document, code, or other) is examined by author's colleagues, in order to evaluate the work product's technical content and quality. Purpose The purpose ...
*
Test-driven development Test-driven development (TDD) is a software development process relying on software requirements being converted to test cases before software is fully developed, and tracking all software development by repeatedly testing the software against al ...


References


Further reading

* * * * * * *


External links


What Is Refactoring?
(c2.com article)
Martin Fowler's homepage about refactoring
* {{Authority control Extreme programming Technology neologisms