Dataflow architecture is a
dataflow-based
computer architecture that directly contrasts the traditional
von Neumann architecture or
control flow architecture. Dataflow architectures have no
program counter
The program counter (PC), commonly called the instruction pointer (IP) in Intel x86 and Itanium microprocessors, and sometimes called the instruction address register (IAR), the instruction counter, or just part of the instruction sequencer, ...
, in concept: the executability and execution of instructions is solely determined based on the availability of input arguments to the instructions,
so that the order of instruction execution may be hard to predict.
Although no commercially successful general-purpose computer hardware has used a dataflow architecture, it has been successfully implemented in specialized hardware such as in
digital signal processing
Digital signal processing (DSP) is the use of digital processing, such as by computers or more specialized digital signal processors, to perform a wide variety of signal processing operations. The digital signals processed in this manner are a ...
,
network routing,
graphics processing,
telemetry, and more recently in data warehousing, and
artificial intelligence
Artificial intelligence (AI) is the capability of computer, computational systems to perform tasks typically associated with human intelligence, such as learning, reasoning, problem-solving, perception, and decision-making. It is a field of re ...
(as: polymorphic dataflow Convolution Engine, structure-driven, dataflow
scheduling). It is also very relevant in many software architectures today including
database
In computing, a database is an organized collection of data or a type of data store based on the use of a database management system (DBMS), the software that interacts with end users, applications, and the database itself to capture and a ...
engine designs and
parallel computing
Parallel computing is a type of computing, computation in which many calculations or Process (computing), processes are carried out simultaneously. Large problems can often be divided into smaller ones, which can then be solved at the same time. ...
frameworks.
Synchronous dataflow architectures tune to match the workload presented by real-time data path applications such as wire speed packet forwarding. Dataflow architectures that are deterministic in nature enable programmers to manage complex tasks such as processor
load balancing, synchronization and accesses to common resources.
Meanwhile, there is a clash of terminology, since the term ''
dataflow'' is used for a subarea of parallel programming: for
dataflow programming.
History
Hardware architectures for dataflow was a major topic in
computer architecture research in the 1970s and early 1980s.
Jack Dennis of
MIT pioneered the field of static dataflow architectures while the Manchester Dataflow Machine
[Manchester Dataflow Research Project, Research Reports: Abstracts, September 1997](_blank)
/ref> and MIT Tagged Token architecture were major projects in dynamic dataflow.
The research, however, never overcame the problems related to:
* Efficiently broadcasting data tokens in a massively parallel system.
* Efficiently dispatching instruction tokens in a massively parallel system.
* Building content-addressable memory
Content-addressable memory (CAM) is a special type of computer memory used in certain very-high-speed searching applications. It is also known as associative memory or associative storage and compares input search data against a table of stored ...
(CAM) large enough to hold all of the dependencies of a real program.
Instructions and their data dependencies proved to be too fine-grained to be effectively distributed in a large network. That is, the time for the instructions and tagged results to travel through a large connection network was longer than the time to do many computations.
Maurice Wilkes
Sir Maurice Vincent Wilkes (26 June 1913 – 29 November 2010) was an English computer scientist who designed and helped build the EDSAC, Electronic Delay Storage Automatic Calculator (EDSAC), one of the earliest stored-program computers, and ...
wrote in 1995 that "Data flow stands apart as being the most radical of all approaches to parallelism and the one that has been least successful. ... If any practical machine based on data flow ideas and offering real power ever emerges, it will be very different from what the originators of the concept had in mind."[M. V. Wilkes, ''Computing Perspectives'', Morgan Kaufmann, 1995, ISBN 1-55860-317-4, page 79.]
Out-of-order execution
In computer engineering, out-of-order execution (or more formally dynamic execution) is an instruction scheduling paradigm used in high-performance central processing units to make use of instruction cycles that would otherwise be wasted. In t ...
(OOE) has become the dominant computing paradigm since the 1990s. It is a form of restricted dataflow. This paradigm introduced the idea of an ''execution window''. The ''execution window'' follows the sequential order of the von Neumann architecture, however within the window, instructions are allowed to be completed in data dependency order. This is accomplished in CPUs that dynamically tag the data dependencies of the code in the execution window. The logical complexity of dynamically keeping track of the data dependencies, restricts ''OOE'' CPU
A central processing unit (CPU), also called a central processor, main processor, or just processor, is the primary processor in a given computer. Its electronic circuitry executes instructions of a computer program, such as arithmetic, log ...
s to a small number of execution units (2-6) and limits the execution window sizes to the range of 32 to 200 instructions, much smaller than envisioned for full dataflow machines.
Dataflow architecture topics
Static and dynamic dataflow machines
Designs that use conventional memory addresses as data dependency tags are called static dataflow machines. These machines did not allow multiple instances of the same routines to be executed simultaneously because the simple tags could not differentiate between them.
Designs that use content-addressable memory
Content-addressable memory (CAM) is a special type of computer memory used in certain very-high-speed searching applications. It is also known as associative memory or associative storage and compares input search data against a table of stored ...
(CAM) are called dynamic dataflow machines. They use tags in memory to facilitate parallelism.
Compiler
Normally, in the control flow architecture, compiler
In computing, a compiler is a computer program that Translator (computing), translates computer code written in one programming language (the ''source'' language) into another language (the ''target'' language). The name "compiler" is primaril ...
s analyze program source code
In computing, source code, or simply code or source, is a plain text computer program written in a programming language. A programmer writes the human readable source code to control the behavior of a computer.
Since a computer, at base, only ...
for data dependencies between instructions in order to better organize the instruction sequences in the binary output files. The instructions are organized sequentially but the dependency information itself is not recorded in the binaries. Binaries compiled for a dataflow machine contain this dependency information.
A dataflow compiler records these dependencies by creating unique tags for each dependency instead of using variable names. By giving each dependency a unique tag, it allows the non-dependent code segments in the binary to be executed ''out of order'' and in parallel. Compiler detects the loops, break statements and various programming control syntax for data flow.
Programs
Programs are loaded into the CAM of a dynamic dataflow computer. When all of the tagged operands of an instruction become available (that is, output from previous instructions and/or user input), the instruction is marked as ready for execution by an execution unit
In computer engineering, an execution unit (E-unit or EU) is a part of a processing unit that performs the operations and calculations forwarded from the instruction unit. It may have its own internal control sequence unit (not to be confused w ...
.
This is known as ''activating'' or ''firing'' the instruction. Once an instruction is completed by an execution unit, its output data is sent (with its tag) to the CAM. Any instructions that are dependent upon this particular datum (identified by its tag value) are then marked as ready for execution. In this way, subsequent instructions are executed in proper order, avoiding race condition
A race condition or race hazard is the condition of an electronics, software, or other system where the system's substantive behavior is dependent on the sequence or timing of other uncontrollable events, leading to unexpected or inconsistent ...
s. This order may differ from the sequential order envisioned by the human programmer, the programmed order.
Instructions
An instruction, along with its required data operands, is transmitted to an execution unit as a packet, also called an ''instruction token''. Similarly, output data is transmitted back to the CAM as a ''data token''. The packetization of instructions and results allows for parallel execution of ready instructions on a large scale.
Dataflow networks deliver the instruction tokens to the execution units and return the data tokens to the CAM. In contrast to the conventional von Neumann architecture, data tokens are not permanently stored in memory, rather they are transient messages that only exist when in transit to the instruction storage.
See also
* Parallel computing
Parallel computing is a type of computing, computation in which many calculations or Process (computing), processes are carried out simultaneously. Large problems can often be divided into smaller ones, which can then be solved at the same time. ...
* SISAL
Sisal (, ; ''Agave sisalana'') is a species of flowering plant native to southern Mexico, but widely cultivated and naturalized in many other countries. It yields a stiff fibre used in making rope and various other products. The sisal fiber is ...
* Binary Modular Dataflow Machine (BMDFM)
* Systolic array
In parallel computer architectures, a systolic array is a homogeneous network of tightly coupled data processing units (DPUs) called cells or nodes. Each node or DPU independently computes a partial result as a function of the data received fro ...
* Transport triggered architecture
* Network on a chip
A network on a chip or network-on-chip (NoC or )This article uses the convention that "NoC" is pronounced . Therefore, it uses the convention "a" for the indefinite article corresponding to NoC ("a NoC"). Other sources may pronounce it as an ...
(NoC)
** System on a chip
A system on a chip (SoC) is an integrated circuit that combines most or all key components of a computer or Electronics, electronic system onto a single microchip. Typically, an SoC includes a central processing unit (CPU) with computer memory, ...
(SoC)
* In-memory computing
The term is used for two different things:
# In computer science, in-memory processing, also called compute-in-memory (CIM), or processing-in-memory (PIM), is a computer architecture in which data operations are available directly on the data ...
References
{{Hardware acceleration
Hardware acceleration
Classes of computers
Computer architecture