HOME

TheInfoList



OR:

In compiler theory, common subexpression elimination (CSE) is a compiler optimization that searches for instances of identical expressions (i.e., they all evaluate to the same value), and analyzes whether it is worthwhile replacing them with a single variable holding the computed value.


Example

In the following code: a = b * c + g; d = b * c * e; it may be worth transforming the code to: tmp = b * c; a = tmp + g; d = tmp * e; if the cost of storing and retrieving tmp is less than the cost of calculating b * c an extra time.


Principle

The possibility to perform CSE is based on
available expression In the field of compiler optimizations, available expressions is an analysis algorithm that determines for each point in the program the set of expressions that need not be recomputed. Those expressions are said to be ''available'' at such a point. ...
analysis (a
data flow analysis In the pursuit of knowledge, data (; ) is a collection of discrete values that convey information, describing quantity, quality, fact, statistics, other basic units of meaning, or simply sequences of symbols that may be further interpreted. ...
). An expression b*c is available at a point ''p'' in a program if: * every path from the initial node to p evaluates b*c before reaching ''p'', * and there are no assignments to b or c after the evaluation but before ''p''. The cost/benefit analysis performed by an optimizer will calculate whether the cost of the store to tmp is less than the cost of the multiplication; in practice other factors such as which values are held in which registers are also significant. Compiler writers distinguish two kinds of CSE: * local common subexpression elimination works within a single
basic block In compiler construction, a basic block is a straight-line code sequence with no branches in except to the entry and no branches out except at the exit. This restricted form makes a basic block highly amenable to analysis. Compilers usually deco ...
* global common subexpression elimination works on an entire procedure, Both kinds rely on
data flow analysis In the pursuit of knowledge, data (; ) is a collection of discrete values that convey information, describing quantity, quality, fact, statistics, other basic units of meaning, or simply sequences of symbols that may be further interpreted. ...
of which expressions are available at which points in a program.


Benefits

The benefits of performing CSE are great enough that it is a commonly used optimization. In simple cases like in the example above, programmers may manually eliminate the duplicate expressions while writing the code. The greatest source of CSEs are intermediate code sequences generated by the compiler, such as for
array An array is a systematic arrangement of similar objects, usually in rows and columns. Things called an array include: {{TOC right Music * In twelve-tone and serial composition, the presentation of simultaneous twelve-tone sets such that the ...
indexing calculations, where it is not possible for the developer to manually intervene. In some cases language features may create many duplicate expressions. For instance, C macros, where macro expansions may result in common subexpressions not apparent in the original source code. Compilers need to be judicious about the number of temporaries created to hold values. An excessive number of temporary values creates register pressure possibly resulting in spilling registers to memory, which may take longer than simply recomputing an arithmetic result when it is needed.


See also

*
Global value numbering Value numbering is a technique of determining when two computations in a program are equivalent and eliminating one of them with a semantics-preserving optimization. Global value numbering Global value numbering (GVN) is a compiler optimization bas ...
*
Loop-invariant code motion In computer programming, loop-invariant code consists of statements or expressions (in an imperative programming language) that can be moved outside the body of a loop without affecting the semantics of the program. Loop-invariant code motion ( ...


References

*
Steven S. Muchnick Steven Stanley Muchnick (1945-2020) was a noted computer science researcher, best known as author of the 1997 treatise on compilers, ''"Advanced Compiler Design and Implementation."'' Background In 1974, Muchnick was awarded a PhD in computer sc ...
,
Advanced Compiler Design and Implementation
' ( Morgan Kaufmann, 1997) pp. 378–396 * John Cocke
"Global Common Subexpression Elimination."
''Proceedings of a Symposium on Compiler Construction'', ''ACM SIGPLAN Notices'' 5(7), July 1970, pages 850-856. * Briggs, Preston, Cooper, Keith D., and Simpson, L. Taylor.
Value Numbering
" ''Software-Practice and Experience'', 27(6), June 1997, pages 701-724. {{Compiler optimizations Compiler optimizations