HOME

TheInfoList



OR:

Distributed
data flow In computing, dataflow is a broad concept, which has various meanings depending on the application and context. In the context of software architecture, data flow relates to stream processing or reactive programming. Software architecture Dataf ...
(also abbreviated as ''distributed flow'') refers to a set of
events Event may refer to: Gatherings of people * Ceremony, an event of ritual significance, performed on a special occasion * Convention (meeting), a gathering of individuals engaged in some common interest * Event management, the organization of ev ...
in a
distributed application A distributed system is a system whose components are located on different networked computers, which communicate and coordinate their actions by passing messages to one another from any system. Distributed computing is a field of computer sci ...
or
protocol Protocol may refer to: Sociology and politics * Protocol (politics), a formal agreement between nation states * Protocol (diplomacy), the etiquette of diplomacy and affairs of state * Etiquette, a code of personal behavior Science and technolog ...
. Distributed data flows serve a purpose analogous to variables or method parameters in
programming language A programming language is a system of notation for writing computer programs. Most programming languages are text-based formal languages, but they may also be graphical. They are a kind of computer language. The description of a programming ...
s such as
Java Java (; id, Jawa, ; jv, ꦗꦮ; su, ) is one of the Greater Sunda Islands in Indonesia. It is bordered by the Indian Ocean to the south and the Java Sea to the north. With a population of 151.6 million people, Java is the world's List ...
, in that they can represent state that is stored or communicated by a layer of software. Unlike variables or parameters, which represent a unit of state that resides in a single location, distributed flows are dynamic and distributed: they simultaneously appear in multiple locations within the network at the same time. As such, distributed flows are a more natural way of modeling the semantics and inner workings of certain classes of distributed systems. In particular, the distributed data flow abstraction has been used as a convenient way of expressing the high-level logical relationships between parts of distributed protocols.Ostrowski, K., Birman, K., Dolev, D., and Sakoda, C. (2009). "Implementing Reliable Event Streams in Large Systems via Distributed Data Flows and Recursive Delegation", ''3rd ACM International Conference on Distributed Event-Based Systems (DEBS 2009)'', Nashville, TN, USA, July 6–9, 2009, http://www.cs.cornell.edu/~krzys/krzys_debs2009.pdfOstrowski, K., Birman, K., and Dolev, D. (2009). "Distributed Data Flow Language for Multi-Party Protocols", ''5th ACM SIGOPS Workshop on Programming Languages and Operating Systems (PLOS 2009)'', Big Sky, MT, USA. October 11, 2009, http://www.cs.cornell.edu/~krzys/krzys_plos2009.pdfOstrowski, K., Birman, K., Dolev, D. (2009). "Programming Live Distributed Objects with Distributed Data Flows", Submitted to the ''International Conference on Object Oriented Programming, Systems, Languages and Applications (OOPSLA 2009)'', http://www.cs.cornell.edu/~krzys/krzys_oopsla2009.pdf


Informal properties

A distributed data flow satisfies the following informal properties. * Asynchronous, non-blocking, and one-way. Each event represents a single instance of a non-blocking, one-way,
asynchronous method invocation In multithreaded computer programming, asynchronous method invocation (AMI), also known as asynchronous method calls or the asynchronous pattern is a design pattern in which the call site is not blocked while waiting for the called code to finish ...
or other form of explicit or implicit
message passing In computer science, message passing is a technique for invoking behavior (i.e., running a program) on a computer. The invoking program sends a message to a process (which may be an actor or object) and relies on that process and its supporting i ...
between two layers or software components. For example, each event might represent a single request to
multicast In computer networking, multicast is group communication where data transmission is addressed to a group of destination computers simultaneously. Multicast can be one-to-many or many-to-many distribution. Multicast should not be confused with ...
a
packet Packet may refer to: * A small container or pouch ** Packet (container), a small single use container ** Cigarette packet ** Sugar packet * Network packet, a formatted unit of data carried by a packet-mode computer network * Packet radio, a fo ...
, issued by an
application layer An application layer is an abstraction layer that specifies the shared communications protocols and Interface (computing), interface methods used by Host (network), hosts in a communications network. An ''application layer'' abstraction is speci ...
to an underlying multicast protocol. The requirement that events are one-way and asynchronous is important. Invocations of methods that may return results would normally be represented as two separate flows: one flow that represents the requests, and another flow that represents responses. * Homogeneous, unidirectional, and uniform. All events in the distributed flow serve the same functional and logical purpose, and are related to one another; generally, we require that they represent method calls or message exchanges between instances of the same functional
layers Layer or layered may refer to: Arts, entertainment, and media * ''Layers'' (Kungs album) * ''Layers'' (Les McCann album) * ''Layers'' (Royce da 5'9" album) *"Layers", the title track of Royce da 5'9"'s sixth studio album *Layer, a female Maveric ...
, or instances of the same components, but perhaps on different
nodes In general, a node is a localized swelling (a "knot") or a point of intersection (a Vertex (graph theory), vertex). Node may refer to: In mathematics *Vertex (graph theory), a vertex in a mathematical graph *Vertex (geometry), a point where two ...
within a
computer network A computer network is a set of computers sharing resources located on or provided by network nodes. The computers use common communication protocols over digital interconnections to communicate with each other. These interconnections are ...
. Furthermore, all events must flow in the same direction (i.e., one type of a layer or component always produces, and the other always consumes the events), and carry the same type of a
payload Payload is the object or the entity which is being carried by an aircraft or launch vehicle. Sometimes payload also refers to the carrying capacity of an aircraft or launch vehicle, usually measured in terms of weight. Depending on the nature of ...
. For example, a set of events that includes all multicast requests issued by the same application layer to the same multicast protocol is a distributed flow. On the other hand, a set of events that includes multicast requests made by different applications to different multicast protocols would not be considered a distributed flow, and neither would be a set of events that represent multicast requests as well as acknowledgments and error notifications. * Concurrent, continuous, and distributed. The flow usually includes all events that flow between the two layers of software, simultaneously at different locations, and over a finite or infinite period of time. Thus, in general, events in a distributed flow are distributed both in space (they occur at different nodes) and in time (they occur at different times). For example, the flow of multicast requests would include all such requests made by instances of the given application on different nodes; normally, such flow would include events that occur on all nodes participating in the given multicast protocol. A flow, in which all events occur at the same node would be considered degenerate.


Formal representation

Formally, we represent each event in a distributed flow as a quadruple of the form (x,t,k,v), where ''x'' is the location (e.g., the network address of a physical node) at which the event occurs, ''t'' is the time at which this happens, ''k'' is a version, or a sequence number identifying the particular event, and ''v'' is a value that represents the event payload (e.g., all the arguments passed in a method call). Each distributed flow is a (possibly infinite) set of such quadruples that satisfies the following three formal properties. * For any finite point in time ''t'', there can be only finitely many events in the flow that occur at time ''t'' or earlier. This implies that in which flow, one can always point to the point in time at which the flow originated. The flow itself can be infinite; in such case, at any point in time, eventually a new event will appear in the flow. * For any pair of events ''e_1'' and ''e_2'' that occur at the same location, if e_1 occurs at an earlier time than e_2, then the version number in e_1 must also be smaller than that of e_2. * For any pair of events ''e_1'' and ''e_2'' that occur at the same location, if the two events have the same version numbers, they must also have the same values. In addition to the above, flows can have a number of additional properties. * Consistency. A distributed flow is said to be ''consistent'' if events with the same version always have the same value, even if they occur at different locations. Consistent flows typically represent various sorts of global decisions made by the protocol or application. * Monotonicity. A distributed flow is said to be ''weakly monotonic'' if for any pair of events e_1 and e_2 that occur at the same location, if e_1 has a smaller version than e_2, then e_1 must carry a smaller value than e_2. A distributed flow is said to be ''strongly monotonic'' (or simply ''monotonic'') if this is true even for pairs of events e_1 and e_2 that occur at different locations. Strongly monotonic flows are always consistent. They typically represent various sorts of irreversible decisions. Weakly monotonic flows may or may not be consistent.


References

{{DEFAULTSORT:Distributed Data Flow Network protocols Distributed computing architecture