In compiler construction, strength reduction is a compiler optimization where expensive operations are replaced with equivalent but less expensive operations. The classic example of strength reduction converts "strong" multiplications inside a loop into "weaker" additions – something that frequently occurs in array addressing.

Examples of strength reduction include:
* replacing a multiplication within a loop with an addition
* replacing an exponentiation within a loop with a multiplication


Code analysis

Most of a program's execution time is typically spent in a small section of code (called a hot spot), and that code is often inside a loop that is executed over and over.

A compiler uses methods to identify loops and recognize the characteristics of register values within those loops. For strength reduction, the compiler is interested in:
* Loop invariants: the values which do not change within the body of a loop.
* Induction variables: the values which are being iterated each time through the loop.

Loop invariants are essentially constants within a loop, but their value may change outside of the loop. Induction variables are changing by known amounts. The terms are relative to a particular loop: when loops are nested, an induction variable in the outer loop can be a loop invariant in the inner loop.

Strength reduction looks for expressions involving a loop invariant and an induction variable. Some of those expressions can be simplified. For example, the multiplication of the loop invariant c and the induction variable i in

    c = 7;
    for (i = 0; i < N; i++)
    ...

can be replaced with successive weaker additions

    c = 7;
    k = 0;
    for (i = 0; i < N; i++)
    ...

(Both loops are sketched in full below.)
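A minimal C sketch of this transformation, assuming the loop stores the products into an array y (the array name and the function wrappers are illustrative assumptions, not from the text):

    /* Before strength reduction: a multiply of the loop invariant c by
       the induction variable i on every iteration. */
    void fill_before(int y[], int N)
    {
        int c = 7;
        for (int i = 0; i < N; i++)
            y[i] = c * i;
    }

    /* After strength reduction: the running sum k starts at 0 and is
       bumped by the invariant c each time around, replacing the multiply. */
    void fill_after(int y[], int N)
    {
        int c = 7;
        int k = 0;
        for (int i = 0; i < N; i++) {
            y[i] = k;
            k = k + c;
        }
    }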


Strength reduction example

Below is an example that will strength-reduce all the loop multiplications that arose from array indexing address calculations. Imagine a simple loop that sets an array to the identity matrix.

    for (i = 0; i < n; i++)
    ...

A sketch of the full loop nest follows.
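A C-style sketch consistent with the rest of the example (zero every element of row i, then set the diagonal element to 1.0); the function name is an illustrative assumption, and the article's A[i,j] notation corresponds to A[i][j] here:

    void make_identity(int n, double A[n][n])
    {
        for (int i = 0; i < n; i++)
        {
            for (int j = 0; j < n; j++)
                A[i][j] = 0.0;     /* zero row i */
            A[i][i] = 1.0;         /* then set the diagonal element */
        }
    }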


Intermediate code

The compiler will view this code as

    0010          ; for (i = 0; i < n; i++)
    0020   ...
    0410 G0001:

This expresses the 2-dimensional array ''A'' as a 1-dimensional array of size n*n, so that whenever the high-level code expresses A[x, y] it will internally be A[(x*n)+y] for any given valid indices x and y.
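A minimal C sketch of that flattening; the accessor function and its name are assumptions for illustration, not from the article:

    /* Row-major flattening: the 2-D index pair (x, y) becomes the
       single offset (x*n)+y into a flat array of n*n elements. */
    double get_element(const double *A, int n, int x, int y)
    {
        return A[x * n + y];       /* A[x, y] in the high-level code */
    }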


Many optimizations

The compiler will start doing many optimizations – not just strength reduction. Expressions that are constant (invariant) within a loop will be hoisted out of the loop. Constants can be loaded outside of both loops, such as the floating-point registers fr3 and fr4. Recognition that some variables don't change allows registers to be merged; n is constant, so r2, r4, r7, r12 can be hoisted and collapsed. The common value i*n is computed in (the hoisted) r8 and r13, so they collapse. The innermost loop (0120-0260) has been reduced from 11 to 7 intermediate instructions. The only multiply that remains in the innermost loop is line 0210's multiply by 8.

    0010          ; for (i = 0; i < n; i++)
    0020   ...
    0410 G0001:

There are more optimizations to do. Register r3 is the main variable in the innermost loop (0140-0260); it gets incremented by 1 each time through the loop. Register r8 (which is invariant in the innermost loop) is added to r3. Instead of using r3, the compiler can eliminate r3 and use r9. The loop, instead of being controlled by r3 = 0 to n-1, can be controlled by r9 = r8+0 to r8+n-1. That adds four instructions and kills four instructions, but there's one fewer instruction inside the loop.

    0110   ; r3 = #0 killed        ; j = 0
    0115   r9 = r8                 ; new assignment
    0117   r20 = r8 + r2           ; new limit
    0120 G0002:
    0140   ; cmp r3, r2 killed     ; j < n
    0145   cmp r9, r20             ; r8 + j < r8 + n = r20
    0150   bge G0003
    0160
    0170                           ; A[i,j] = 0.0;
    0200   ; r9 = r8 + r3 killed   ; calculate subscript i * n + j
    0210   r10 = r9 * #8           ; calculate byte address
    0230   fstore fr3, A[r10]
    0240
    0250   ; r3 = r3 + #1 killed   ; j++
    0255   r9 = r9 + #1            ; new loop variable
    0260   br G0002

Now r9 is the loop variable, but it interacts with the multiply by 8. Here we get to do some strength reduction: the multiply by 8 can be reduced to successive additions of 8. Now there are no multiplications inside the loop.

    0115   r9 = r8                 ; new assignment
    0117   r20 = r8 + r2           ; new limit
    0118   r10 = r8 * #8           ; initial value of r10
    0120 G0002:
    0145   cmp r9, r20             ; r8 + j < r8 + n = r20
    0150   bge G0003
    0160
    0170                           ; A[i,j] = 0.0;
    0210   ; r10 = r9 * #8 killed  ; calculate byte address
    0230   fstore fr3, A[r10]
    0240
    0245   r10 = r10 + #8          ; strength-reduced multiply
    0255   r9 = r9 + #1            ; loop variable
    0260   br G0002

Registers r9 and r10 (= 8*r9) aren't both needed; r9 can be eliminated in the loop. The loop is now 5 instructions.

    0115   ; r9 = r8 killed
    0117   r20 = r8 + r2           ; limit
    0118   r10 = r8 * #8           ; initial value of r10
    0119   r22 = r20 * #8          ; new limit
    0120 G0002:
    0145   ; cmp r9, r20 killed    ; r8 + j < r8 + n = r20
    0147   cmp r10, r22            ; r10 = 8*(r8 + j) < 8*(r8 + n) = r22
    0150   bge G0003
    0160
    0170                           ; A[i,j] = 0.0;
    0230   fstore fr3, A[r10]
    0240
    0245   r10 = r10 + #8          ; strength-reduced multiply
    0255   ; r9 = r9 + #1 killed   ; loop variable
    0260   br G0002
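A hedged C-level analog of the strength-reduced inner loop (not the compiler's actual intermediate code): the byte address held in r10 becomes a pointer that is bumped by one element per iteration instead of being recomputed as 8*(i*n + j). The function and parameter names are assumptions.

    #include <stddef.h>

    /* Zero row i of a flat, row-major n*n array of doubles. */
    void zero_row(double *A, int n, int i)
    {
        double *p   = A + (size_t)i * n;   /* initial value, like r10 = r8 * #8 */
        double *end = p + n;               /* reduced limit, like r22 */
        for (; p < end; p++)               /* r10 = r10 + #8: pointer bump replaces the multiply */
            *p = 0.0;
    }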


Outer loop

Back to the whole picture:

    0010          ; for (i = 0; i < n; i++)
    0020   ...
    0410 G0001:

There are now four multiplications within the outer loop that increments r1. Register r8 = r1*r2 at 0190 can be strength reduced by setting it before entering the loop (0055) and incrementing it by r2 at the bottom of the loop (0385). The value r8*8 (at 0118) can be strength reduced by initializing it (0056) and adding 8*r2 to it when r8 gets incremented (0386).

Register r20 is being incremented by the invariant/constant r2 each time through the loop at 0117. After being incremented, it is multiplied by 8 to create r22 at 0119. That multiplication can be strength reduced by adding 8*r2 each time through the loop.

    0010          ; for (i = 0; i < n; i++)
    0020   ...
    0410 G0001:
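A hedged C-level analog of the outer-loop reductions: the product i*n (register r8) becomes a running offset that is bumped by n at the bottom of each outer iteration, so no multiply is needed to find the start of the next row. Names are assumptions for illustration.

    #include <stddef.h>

    /* Zero an n*n array held in row-major order without computing i*n. */
    void zero_matrix(double *A, int n)
    {
        size_t row = 0;                    /* plays the role of r8 = i * n */
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < n; j++)
                A[row + j] = 0.0;          /* subscript i*n + j with no multiply */
            row += (size_t)n;              /* r8 incremented by r2 (= n) at the loop bottom */
        }
    }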


The last multiply

That leaves the two loops with only one multiplication operation (at 0330) within the outer loop and no multiplications within the inner loop.

    0010          ; for (i = 0; i < n; i++)
    0020   ...
    0410 G0001:

At line 0320, r14 is the sum of r8 and r1, and r8 and r1 are being incremented in the loop. Register r8 is being bumped by r2 (= n) and r1 is being bumped by 1. Consequently, r14 is being bumped by n+1 each time through the loop. The last loop multiply at 0330 can be strength reduced by adding (r2+1)*8 each time through the loop.

    0010          ; for (i = 0; i < n; i++)
    0020   ...
    0410 G0001:

There's still more to go. Constant folding will recognize that r1 = 0 in the preamble, so several instructions will clean up. Register r8 isn't used in the loop, so it can disappear. Furthermore, r1 is only being used to control the loop, so r1 can be replaced by a different induction variable such as r40. Where i went 0 <= i < n, register r40 goes 0 <= r40 < 8*n*n.

    0010          ; for (i = 0; i < n; i++)
    0020   ...
    0410 G0001:
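A hedged sketch of where the example ends up at the C level: no multiplications remain in either loop. The zeroing store walks one element at a time, and the diagonal store walks n+1 elements at a time, mirroring the (r2+1)*8 increment described above. The function and variable names are assumptions.

    /* Set a flat, row-major n*n array of doubles to the identity matrix,
       advancing the two running addresses by additions only. */
    void set_identity(double *A, int n)
    {
        double *p    = A;                  /* running store address (r10) */
        double *diag = A;                  /* running diagonal address, stride n+1 elements */
        for (int i = 0; i < n; i++) {
            double *row_end = p + n;
            for (; p < row_end; p++)       /* zero row i, one element per step */
                *p = 0.0;
            *diag = 1.0;                   /* A[i,i] = 1.0 */
            diag += n + 1;                 /* strength-reduced (n+1)*8 byte increment */
        }
    }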


Other strength reduction operations

::''This material is disputed. It is better described as peephole optimization and instruction assignment.''

Operator strength reduction uses mathematical identities to replace slow math operations with faster operations. The benefits depend on the target CPU and sometimes on the surrounding code (which can affect the availability of other functional units within the CPU).

* replacing integer division or multiplication by a power of 2 with an arithmetic shift or logical shift
* replacing integer multiplication by a constant with a combination of shifts, adds or subtracts
* replacing integer division by a constant with a multiplication, taking advantage of the limited range of machine integers. This method also works if the divisor is a non-integer sufficiently greater than 1, e.g. √2 or π.
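Hedged C examples of these identities, written for unsigned operands (for signed values an arithmetic right shift rounds toward negative infinity, unlike C's division, so real compilers emit a small fix-up); 0xCCCCCCCD is the standard ceil(2^35/10) reciprocal for 32-bit unsigned division by 10:

    #include <stdint.h>

    static uint32_t times8(uint32_t x) { return x << 3; }        /* x * 8 as a shift       */
    static uint32_t div8(uint32_t x)   { return x >> 3; }        /* x / 8 as a shift       */
    static uint32_t times9(uint32_t x) { return (x << 3) + x; }  /* x * 9 as shift-and-add */

    /* x / 10 as a multiply by a precomputed reciprocal followed by a shift. */
    static uint32_t div10(uint32_t x)
    {
        return (uint32_t)(((uint64_t)x * 0xCCCCCCCDu) >> 35);
    }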


Induction variable (orphan)

Induction variable or recursive strength reduction replaces a function of some systematically changing variable with a simpler calculation using previous values of the function. In a procedural programming language this would apply to an expression involving a loop variable, and in a declarative language it would apply to the argument of a recursive function. For example,

    f x = ... (3 ** x) ... (f (x + 1)) ...

becomes

    f x = f' x 1
      where f' x z = ... z ... (f' (x + 1) (3 * z)) ...

Here the modified recursive function takes a second parameter z = 3 ** x, allowing the expensive computation (3 ** x) to be replaced by the cheaper (3 * z).
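A procedural (C) rendering of the same idea, assuming the function is tabulating powers of 3; the function and array names are illustrative assumptions:

    #include <math.h>

    /* Before: the exponentiation 3 ** x is recomputed on every iteration. */
    void powers_before(double out[], int n)
    {
        for (int x = 0; x < n; x++)
            out[x] = pow(3.0, (double)x);
    }

    /* After: the extra value z carries 3 ** x from one iteration to the
       next, so the exponentiation becomes a single multiply (3 * z). */
    void powers_after(double out[], int n)
    {
        double z = 1.0;                    /* z = 3 ** 0 */
        for (int x = 0; x < n; x++) {
            out[x] = z;
            z *= 3.0;
        }
    }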


See also

* Memoization
* Partial evaluation
* Fast inverse square root

