Data Dependency
   HOME

TheInfoList



OR:

A data dependency in
computer science Computer science is the study of computation, automation, and information. Computer science spans theoretical disciplines (such as algorithms, theory of computation, information theory, and automation) to practical disciplines (includi ...
is a situation in which a
program statement In computer programming, a statement is a syntactic unit of an imperative programming language that expresses some action to be carried out. A program written in such a language is formed by a sequence of one or more statements. A statement may ha ...
(instruction) refers to the data of a preceding statement. In
compiler theory In computing, a compiler is a computer program that translates computer code written in one programming language (the ''source'' language) into another language (the ''target'' language). The name "compiler" is primarily used for programs that ...
, the technique used to discover data dependencies among statements (or instructions) is called
dependence analysis In compiler theory, dependence analysis produces execution-order constraints between statements/instructions. Broadly speaking, a statement ''S2'' depends on ''S1'' if ''S1'' must be executed before ''S2''. Broadly, there are two classes of depen ...
. There are three types of dependencies: data, name, and control.


Data dependencies

Assuming statement S_1 and S_2, S_2 depends on S_1 if: :\left (S_1) \cap O(S_2)\right\cup \left (S_1) \cap I(S_2)\right\cup \left (S_1) \cap O(S_2)\right\neq \varnothing where: * I(S_i) is the set of memory locations read by * O(S_j) is the set of memory locations written by and * there is a feasible run-time execution path from S_1 to This Condition is called Bernstein Condition, named by A. J. Bernstein. Three cases exist: * Anti-dependence: I(S_1) \cap O(S_2) \neq \varnothing, S_1 \rightarrow S_2 and S_1 reads something before S_2 overwrites it * Flow (data) dependence: O(S_1) \cap I(S_2) \neq \varnothing, S_1 \rightarrow S_2 and S_1 writes before something read by S_2 * Output dependence: O(S_1) \cap O(S_2) \neq \varnothing, S_1 \rightarrow S_2 and both write the same memory location.


Flow dependency (True dependency)

A Flow dependency, also known as a data dependency or true dependency or read-after-write (RAW), occurs when an instruction depends on the result of a previous instruction: also known as name dependency 1. A = 3 2. B = A 3. C = B Instruction 3 is truly dependent on instruction 2, as the final value of C depends on the instruction updating B. Instruction 2 is truly dependent on instruction 1, as the final value of B depends on the instruction updating A. Since instruction 3 is truly dependent upon instruction 2 and instruction 2 is truly dependent on instruction 1, instruction 3 is also truly dependent on instruction 1.
Instruction level parallelism Instruction-level parallelism (ILP) is the parallel or simultaneous execution of a sequence of instructions in a computer program. More specifically ILP refers to the average number of instructions run per step of this parallel execution. Disc ...
is therefore not an option in this example.


Anti-dependency

An anti-dependency, also known as write-after-read (WAR), occurs when an instruction requires a value that is later updated. In the following example, instruction 2 anti-depends on instruction 3 — the ordering of these instructions cannot be changed, nor can they be executed in parallel (possibly changing the instruction ordering), as this would affect the final value of A. 1. B = 3 2. A = B + 1 3. B = 7 Example : MUL R3,R1,R2 ADD R2,R5,R6 It is clear that there is anti-dependence between these 2 instructions. At first we read R2 then in second instruction we are Writing a new value for it. An anti-dependency is an example of a ''name dependency''. That is, renaming of variables could remove the dependency, as in the next example: 1. B = 3 N. B2 = B 2. A = B2 + 1 3. B = 7 A new variable, B2, has been declared as a copy of B in a new instruction, instruction N. The anti-dependency between 2 and 3 has been removed, meaning that these instructions may now be executed in parallel. However, the modification has introduced a new dependency: instruction 2 is now truly dependent on instruction N, which is truly dependent upon instruction 1. As flow dependencies, these new dependencies are impossible to safely remove.


Output dependency

An output dependency, also known as write-after-write (WAW), occurs when the ordering of instructions will affect the final output value of a variable. In the example below, there is an output dependency between instructions 3 and 1 — changing the ordering of instructions in this example will change the final value of A, thus these instructions cannot be executed in parallel. 1. B = 3 2. A = B + 1 3. B = 7 As with anti-dependencies, output dependencies are ''name dependencies''. That is, they may be removed through renaming of variables, as in the below modification of the above example: 1. B2 = 3 2. A = B2 + 1 3. B = 7 A commonly used naming convention for data dependencies is the following: Read-after-Write or RAW (flow dependency), Write-After-Read or WAR (anti-dependency), or Write-after-Write or WAW (output dependency).


Control dependency

An instruction B has a control dependency on a preceding instruction A if the outcome of A determines whether B should be executed or not. In the following example, the instruction S_2 has a control dependency on instruction S_1. However, S_3 does not depend on S_1 because S_3 is always executed irrespective of the outcome of S_1. S1. if (a

b) S2. a = a + b S3. b = a + b Intuitively, there is control dependence between two statements A and B if * B could be possibly executed after A * The outcome of the execution of A will determine whether B will be executed or not. A typical example is that there are control dependences between the condition part of an if statement and the statements in its true/false bodies. A formal definition of control dependence can be presented as follows: A statement S_2 is said to be control dependent on another statement S_1
iff In logic and related fields such as mathematics and philosophy, "if and only if" (shortened as "iff") is a biconditional logical connective between statements, where either both statements are true or both are false. The connective is bicon ...
* there exists a path P from S_1 to S_2 such that every statement S_iS_1 within P will be followed by S_2 in each possible path to the end of the program and * S_1 will not necessarily be followed by S_2, i.e. there is an execution path from S_1 to the end of the program that does not go through S_2. Expressed with the help of (post-)dominance the two conditions are equivalent to * S_2 post-dominates all S_i * S_2 does not post-dominate S_1


Construction of control dependences

Control dependences are essentially the dominance frontier in the reverse graph of the
control-flow graph In computer science, a control-flow graph (CFG) is a representation, using graph notation, of all paths that might be traversed through a program during its execution. The control-flow graph was discovered by Frances E. Allen, who noted that ...
(CFG). Thus, one way of constructing them, would be to construct the post-dominance frontier of the CFG, and then reversing it to obtain a control dependence graph. The following is a pseudo-code for constructing the post-dominance frontier: for each X in a bottom-up traversal of the post-dominator tree do: PostDominanceFrontier(X) ← ∅ for each Y ∈ Predecessors(X) do: if immediatePostDominator(Y) ≠ X: then PostDominanceFrontier(X) ← PostDominanceFrontier(X) ∪ done for each Z ∈ Children(X) do: for each Y ∈ PostDominanceFrontier(Z) do: if immediatePostDominator(Y) ≠ X: then PostDominanceFrontier(X) ← PostDominanceFrontier(X) ∪ done done done Here, Children(X) is the set of nodes in the CFG that are immediately post-dominated by X, and Predecessors(X) are the set of nodes in the CFG that directly precede X in the CFG. Note that node X shall be processed only after all its Children have been processed. Once the post-dominance frontier map is computed, reversing it will result in a map from the nodes in the CFG to the nodes that have a control dependence on them.


Implications

Conventional programs are written assuming the
sequential execution model In mathematics, a sequence is an enumerated collection of objects in which repetitions are allowed and order matters. Like a set, it contains members (also called ''elements'', or ''terms''). The number of elements (possibly infinite) is called ...
. Under this model, instructions execute one after the other, atomically (i.e., at any given point in time, only one instruction is executed) and in the order specified by the program. However, dependencies among statements or instructions may hinder parallelism — parallel execution of multiple instructions, either by a parallelizing compiler or by a processor exploiting
instruction-level parallelism Instruction-level parallelism (ILP) is the parallel or simultaneous execution of a sequence of instructions in a computer program. More specifically ILP refers to the average number of instructions run per step of this parallel execution. Disc ...
. Recklessly executing multiple instructions without considering related dependences may cause danger of getting wrong results, namely
hazards A hazard is a potential source of harm. Substances, events, or circumstances can constitute hazards when their nature would allow them, even just theoretically, to cause damage to health, life, property, or any other interest of value. The probabi ...
.


References

{{reflist Compilers Analysis of parallel algorithms