Use-def Chain
   HOME

TheInfoList



OR:

Within
computer science Computer science is the study of computation, information, and automation. Computer science spans Theoretical computer science, theoretical disciplines (such as algorithms, theory of computation, and information theory) to Applied science, ...
, a use-definition chain (or UD chain) is a
data structure In computer science, a data structure is a data organization and storage format that is usually chosen for Efficiency, efficient Data access, access to data. More precisely, a data structure is a collection of data values, the relationships amo ...
that consists of a use ''U'', of a variable, and all the definitions ''D'' of that variable that can reach that use without any other intervening definitions. A UD Chain generally means the assignment of some value to a variable. A counterpart of a ''UD Chain'' is a definition-use chain (or DU chain), which consists of a definition ''D'' of a variable and all the uses ''U'' reachable from that definition without any other intervening definitions. Both UD and DU chains are created by using a form of
static code analysis In computer science, static program analysis (also known as static analysis or static simulation) is the analysis of computer programs performed without executing them, in contrast with dynamic program analysis, which is performed on programs duri ...
known as
data flow analysis Data-flow analysis is a technique for gathering information about the possible set of values calculated at various points in a computer program. It forms the foundation for a wide variety of compiler optimizations and program verification techn ...
. Knowing the use-def and def-use chains for a program or subprogram is a prerequisite for many
compiler optimization An optimizing compiler is a compiler designed to generate code that is optimized in aspects such as minimizing program execution time, memory usage, storage size, and power consumption. Optimization is generally implemented as a sequence of op ...
s, including
constant propagation Constant folding and constant propagation are related compiler optimizations used by many modern compilers. An advanced form of constant propagation known as sparse conditional constant propagation can more accurately propagate constants and sim ...
and
common subexpression elimination In compiler theory, common subexpression elimination (CSE) is a compiler optimization that searches for instances of identical expressions (i.e., they all evaluate to the same value), and analyzes whether it is worthwhile replacing them with a sin ...
.


Purpose

Making the use-define or define-use chains is a step in liveness analysis, so that logical representations of all the variables can be identified and tracked through the code. Consider the following snippet of code: int x = 0; /* A */ x = x + y; /* B */ /* 1, some uses of x */ x = 35; /* C */ /* 2, some more uses of x */ Notice that x is assigned a value at three points (marked A, B, and C). However, at the point marked "1", the use-def chain for x should indicate that its current value must have come from line B (and its value at line B must have come from line A). Contrariwise, at the point marked "2", the use-def chain for x indicates that its current value must have come from line C. Since the value of the x in block 2 does not depend on any definitions in block 1 or earlier, x might as well be a different variable there; practically speaking, it ''is'' a different variable — call it x2. int x = 0; /* A */ x = x + y; /* B */ /* 1, some uses of x */ int x2 = 35; /* C */ /* 2, some uses of x2 */ The process of splitting x into two separate variables is called live range splitting. See also
static single assignment form In compiler design, static single assignment form (often abbreviated as SSA form or simply SSA) is a type of intermediate representation (IR) where each variable is assigned exactly once. SSA is used in most high-quality optimizing compilers for ...
.


Setup

The list of statements determines a strong order among statements. *Statements are labeled using the following conventions: , where ''i'' is an integer in ; and ''n'' is the number of statements in the
basic block In compiler construction, a basic block is a straight-line code sequence with no branches in except to the entry and no branches out except at the exit. This restricted form makes a basic block highly amenable to analysis. Compilers usually decom ...
*Variables are identified in italic (e.g., ''v'',''u'' and ''t'') *Every variable is assumed to have a definition in the context or scope. (In
static single assignment In compiler design, static single assignment form (often abbreviated as SSA form or simply SSA) is a type of intermediate representation (IR) where each variable is assigned exactly once. SSA is used in most high-quality optimizing compilers for ...
form, use-define chains are explicit because each chain contains a single element.) For a variable, such as ''v'', its declaration is identified as ''V'' (italic capital letter), and for short, its declaration is identified as . In general, a declaration of a variable can be in an outer scope (e.g., a
global variable In computer programming, a global variable is a variable with global scope, meaning that it is visible (hence accessible) throughout the program, unless shadowed. The set of all global variables is known as the ''global environment'' or ''global ...
).


Definition of a variable

When a variable, ''v'', is on the
LHS The initials LHS are used for: Schools Australia * Leumeah High School, New South Wales * Lyneham High School, Australian Capital Territory * Lurnea High School, New South Wales United Kingdom * Larbert High School, Stenhousemuir, Scotland * Lith ...
of an assignment statement, such as , then is a definition of ''v''. Every variable (''v'') has at least one definition by its declaration (''V'') (or initialization).


Use of a variable

If variable, ''v'', is on the RHS of statement , there is a statement, with ''i'' < ''j'' and , that it is a definition of ''v'' and it has a use at (or, in short, when a variable, ''v'', is on the RHS of a statement , then ''v'' has a use at statement ).


Execution

Consider the sequential execution of the list of statements, , and what can now be observed as the computation at statement, ''j'': *A definition at statement with ''i'' < ''j'' is alive at ''j'', if it has a use at a statement with ''k'' ≥ ''j''. The set of alive definitions at statement ''i'' is denoted as and the number of alive definitions as , A(i), . ( is a simple but powerful concept: theoretical and practical results in space complexity theory, access complexity(I/O complexity),
register allocation In compiler optimization, register allocation is the process of assigning local automatic variables and Expression (computer science), expression results to a limited number of processor registers. Register allocation can happen over a basic bloc ...
and
cache locality In computer science, locality of reference, also known as the principle of locality, is the tendency of a processor to access the same set of memory locations repetitively over a short period of time. There are two basic types of reference localit ...
exploitation are based on .) *A definition at statement kills all previous definitions ( with ''k'' < ''i'') for the same variables.


Execution example for def-use-chain

This example is based on a Java algorithm for finding the gcd. (It is not important to understand what this function does.) /** * @param(a, b) The values used to calculate the divisor. * @return The greatest common divisor of a and b. */ int gcd(int a, int b) To find out all def-use-chains for variable d, do the following steps: #Search for the first time the variable is defined (write access). #: In this case it is "" (l.7) # Search for the first time the variable is read. #: In this case it is "" # Write down this information in the following style: ame of the variable you are creating a def-use-chain for, the concrete write access, the concrete read access#: In this case it is: Repeat these steps in the following style: combine each write access with each read access (but NOT the other way round). The result should be: , d=b, return d , d=b, while(d!=0) , d=b, if(c>d)
, d=b, c=c-d The comma is a punctuation mark that appears in several variants in different languages. Some typefaces render it as a small line, slightly curved or straight, but inclined from the vertical; others give it the appearance of a miniature fille ...
, d=b, d=d-c , d=d-c, while(d!=0) , d=d-c, if(c>d) , d=d-c, c=c-d , d=d-c, d=d-c
You have to take care, if the variable is changed by the time. For example: From line 7 down to line 13 in the source code, is not redefined / changed. At line 14, could be redefined. This is why you have to recombine this write access on with all possible read accesses which could be reached. In this case, only the code beyond line 10 is relevant. Line 7, for example, cannot be reached again. For your understanding, you can imagine 2 different variables : 1, d1=b, return d1 1, d1=b, while(d1!=0) 1, d1=b, if(c>d1) 1, d1=b, c=c-d1 1, d1=b, d1=d1-c 2, d2=d2-c, while(d2!=0) 2, d2=d2-c, if(c>d2) 2, d2=d2-c, c=c-d2 2, d2=d2-c, d2=d2-c As a result, you could get something like this. The variable would be replaced by /** * @param(a, b) The values used to calculate the divisor. * @return The greatest common divisor of a and b. **/ int gcd(int a, int b)


Method of building a ''use-def'' (or ''ud'') chain

# Set definitions in statement # For each in , find live definitions that have use in statement # Make a link among definitions and uses # Set the statement , as definition statement # Kill previous definitions With this algorithm, two things are accomplished: #A
directed acyclic graph In mathematics, particularly graph theory, and computer science, a directed acyclic graph (DAG) is a directed graph with no directed cycles. That is, it consists of vertices and edges (also called ''arcs''), with each edge directed from one ...
(DAG) is created on the variable uses and definitions. The DAG specifies a data dependency among assignment statements, as well as a
partial order In mathematics, especially order theory, a partial order on a set is an arrangement such that, for certain pairs of elements, one precedes the other. The word ''partial'' is used to indicate that not every pair of elements needs to be comparable ...
(therefore parallelism among statements). #When statement is reached, there is a list of ''live'' variable assignments. If only one assignment is live, for example,
constant propagation Constant folding and constant propagation are related compiler optimizations used by many modern compilers. An advanced form of constant propagation known as sparse conditional constant propagation can more accurately propagate constants and sim ...
might be used.


References

{{Compiler optimizations Compiler optimizations Data-flow analysis