HOME

TheInfoList



OR:

In
computer programming Computer programming is the process of performing a particular computation (or more generally, accomplishing a specific computing result), usually by designing and building an executable computer program. Programming involves tasks such as ana ...
, data-driven programming is a
programming paradigm Programming paradigms are a way to classify programming languages based on their features. Languages can be classified into multiple paradigms. Some paradigms are concerned mainly with implications for the execution model of the language, suc ...
in which the program statements describe the data to be matched and the processing required rather than defining a sequence of steps to be taken. Standard examples of data-driven languages are the text-processing languages
sed sed ("stream editor") is a Unix utility that parses and transforms text, using a simple, compact programming language. It was developed from 1973 to 1974 by Lee E. McMahon of Bell Labs, and is available today for most operating systems. sed w ...
and
AWK AWK (''awk'') is a domain-specific language designed for text processing and typically used as a data extraction and reporting tool. Like sed and grep, it is a filter, and is a standard feature of most Unix-like operating systems. The AWK lang ...
, where the data is a sequence of lines in an
input stream In computer science, a stream is a sequence of data elements made available over time. A stream can be thought of as items on a conveyor belt being processed one at a time rather than in large batches. Streams are processed differently from ...
– these are thus also known as line-oriented languages – and pattern matching is primarily done via
regular expression A regular expression (shortened as regex or regexp; sometimes referred to as rational expression) is a sequence of characters that specifies a search pattern in text. Usually such patterns are used by string-searching algorithms for "find" or ...
s or line numbers.


Related paradigms

Data-driven programming is similar to
event-driven programming In computer programming, event-driven programming is a programming paradigm in which the flow of the program is determined by events such as user actions ( mouse clicks, key presses), sensor outputs, or message passing from other programs or t ...
, in that both are structured as pattern matching and resulting processing, and are usually implemented by a
main loop In computer science, the event loop is a programming construct or design pattern that waits for and dispatches events or messages in a program. The event loop works by making a request to some internal or external "event provider" (that generally ...
, though they are typically applied to different domains. The condition/action model is also similar to
aspect-oriented programming In computing, aspect-oriented programming (AOP) is a programming paradigm that aims to increase modularity by allowing the separation of cross-cutting concerns. It does so by adding behavior to existing code (an advice) ''without'' modifying t ...
, where when a
join point In computer science, a join point is a point in the control flow of a program where the control flow can arrive via two different paths. In particular, it's a basic block that has more than one predecessor. In aspect-oriented programming a set of ...
(condition) is reached, a
pointcut In aspect-oriented programming, a pointcut is a set of join points. Pointcut specifies where exactly to apply advice, which allows separation of concerns and helps in modularizing business logic. Pointcuts are often specified using class names or m ...
(action) is executed. A similar paradigm is used in some
tracing Tracing may refer to: Computer graphics * Image tracing, digital image processing to convert raster graphics into vector graphics * Path tracing, a method of rendering images of three-dimensional scenes such that the global illumination is faithf ...
frameworks such as
DTrace DTrace is a comprehensive dynamic tracing framework originally created by Sun Microsystems for troubleshooting kernel and application problems on production systems in real time. Originally developed for Solaris, it has since been released under ...
, where one lists probes (instrumentation points) and associated actions, which execute when the condition is satisfied. Adapting
abstract data type In computer science, an abstract data type (ADT) is a mathematical model for data types. An abstract data type is defined by its behavior (Semantics (computer science), semantics) from the point of view of a ''User (computing), user'', of the dat ...
design methods to
object-oriented programming Object-oriented programming (OOP) is a programming paradigm based on the concept of "objects", which can contain data and code. The data is in the form of fields (often known as attributes or ''properties''), and the code is in the form of pr ...
results in a data-driven design. This type of design is sometimes used in object-oriented programming to define classes during the conception of a piece of software.


Applications

Data-driven programming is typically applied to streams of structured data, for filtering, transforming, aggregating (such as computing statistics), or calling other programs. Typical streams include
log files In computing, logging is the act of keeping a log of events that occur in a computer system, such as problems, errors or just information on current operations. These events may occur in the operating system or in other software. A message or l ...
,
delimiter-separated values Formats that use delimiter-separated values (also DSV)DSV stands for ''Delimiter Separated Values'' store two-dimensional arrays of data by separating the values in each row with specific delimiter characters. Most database and spreadsheet program ...
, or email messages, notably for
email filtering Email filtering is the processing of email to organize it according to specified criteria. The term can apply to the intervention of human intelligence, but most often refers to the automatic processing of messages at an SMTP server, possibly appl ...
. For example, an AWK program may take as input a stream of log statements, and for example send all to the console, write ones starting with WARNING to a "WARNING" file, and send an email to a
sysadmin A system administrator, or sysadmin, or admin is a person who is responsible for the upkeep, configuration, and reliable operation of computer systems, especially multi-user computers, such as servers. The system administrator seeks to ensu ...
in case any line starts with "ERROR". It could also record how many warnings are logged per day. Alternatively, one can process streams of delimiter-separated values, processing each line or aggregated lines, such as the sum or max. In email, a language like
procmail procmail is an email server software component — specifically, a message delivery agent (MDA). It was one of the earliest mail filter programs. It is typically used in Unix-like mail systems, using the mbox and Maildir storage formats. procm ...
can specify conditions to match on some emails, and what actions to take (deliver, bounce, discard, forward, etc.). Some data-driven languages are
Turing-complete In computability theory, a system of data-manipulation rules (such as a computer's instruction set, a programming language, or a cellular automaton) is said to be Turing-complete or computationally universal if it can be used to simulate any Tur ...
, such as AWK and even sed, while others are intentionally very limited, notably for filtering. An extreme example of the latter is
pcap In the field of computer network administration, pcap is an application programming interface (API) for capturing network traffic. While the name is an abbreviation of ''packet capture'', that is not the API's proper name. Unix-like systems ...
, which only consists of filtering, with the only action being “capture”. Less extremely,
sieve A sieve, fine mesh strainer, or sift, is a device for separating wanted elements from unwanted material or for controlling the particle size distribution of a sample, using a screen such as a woven mesh or net or perforated sheet material. T ...
has filters and actions, but in the base standard has no variables or loops, only allowing stateless filtering statements: each input element is processed independently. Variables allow state, which allow operations that depend on more than one input element, such as aggregation (summing inputs) or throttling (allow at most 5 mails per hour from each sender, or limiting repeated log messages). Data-driven languages frequently have a default action: if no condition matches, line-oriented languages may print the line (as in sed), or deliver a message (as in sieve). In some applications, such as filtering, matching is may be done ''exclusively'' (so only ''first'' matching statement), while in other cases ''all'' matching statements are applied. In either case, failure to match ''any'' pattern may be "default behavior" or can be seen as an error, to be caught by a catch-all statement at the end.


Benefits and issues

While the benefits and issues may vary between implementation, there are a few big potential benefits of and problems with this paradigm. Functionality simply requires that it knows the
abstract data type In computer science, an abstract data type (ADT) is a mathematical model for data types. An abstract data type is defined by its behavior (Semantics (computer science), semantics) from the point of view of a ''User (computing), user'', of the dat ...
of the variables it is working with. Functions and
interfaces Interface or interfacing may refer to: Academic journals * Interface (journal), ''Interface'' (journal), by the Electrochemical Society * ''Interface, Journal of Applied Linguistics'', now merged with ''ITL International Journal of Applied Lin ...
can be used on all objects with the same data fields, for instance the object's "position". Data can be grouped into objects or "entities" according to preference with little to no consequence. While data-driven design does prevent coupling of data and functionality, in some cases, data-driven programming has been argued to lead to bad
object-oriented design Object-oriented design (OOD) is the process of planning a system of interacting objects for the purpose of solving a software problem. It is one approach to software design. Overview An object contains encapsulated data and procedures grouped t ...
, especially when dealing with more abstract data. This is because a purely data-driven object or entity is defined by the way it is represented. Any attempt to change the structure of the object would immediately break the functions that rely on it. As an example, one might represent driving directions as a series of intersections (two intersecting streets) where the driver must turn right or left. If an intersection (in the United States) is represented in data by the zip code (5-digit number) and two
street name A street name is an identifying name given to a street or road. In toponymic terminology, names of streets and roads are referred to as hodonyms (from Greek ‘road’, and ‘name’). The street name usually forms part of the address (th ...
s (strings of text), bugs may appear when a city where streets intersect multiple times is encountered. While this example may be oversimplified, restructuring of data is a fairly common problem in software engineering, either to eliminate bugs, increase efficiency, or support new features.


Languages

*
AWK AWK (''awk'') is a domain-specific language designed for text processing and typically used as a data extraction and reporting tool. Like sed and grep, it is a filter, and is a standard feature of most Unix-like operating systems. The AWK lang ...
* Oz *
Perl Perl is a family of two high-level, general-purpose, interpreted, dynamic programming languages. "Perl" refers to Perl 5, but from 2000 to 2019 it also referred to its redesigned "sister language", Perl 6, before the latter's name was offici ...
– data-driven programming as in AWK and sed is one paradigm supported by Perl *
sed sed ("stream editor") is a Unix utility that parses and transforms text, using a simple, compact programming language. It was developed from 1973 to 1974 by Lee E. McMahon of Bell Labs, and is available today for most operating systems. sed w ...
*
Lua Lua or LUA may refer to: Science and technology * Lua (programming language) * Latvia University of Agriculture * Last universal ancestor, in evolution Ethnicity and language * Lua people, of Laos * Lawa people, of Thailand sometimes referred t ...
*
Clojure Clojure (, like ''closure'') is a dynamic and functional dialect of the Lisp programming language on the Java platform. Like other Lisp dialects, Clojure treats code as data and has a Lisp macro system. The current development process is comm ...

Tab (language)
* fdm *
maildrop Maildrop is a Mail delivery agent used by the Courier Mail Server. The maildrop Mail Delivery Agent (MDA) also includes filtering functionality. Maildrop receives mail via stdin and delivers in both Maildir and mbox formats. Features Mai ...
*
procmail procmail is an email server software component — specifically, a message delivery agent (MDA). It was one of the earliest mail filter programs. It is typically used in Unix-like mail systems, using the mbox and Maildir storage formats. procm ...
*
Sieve A sieve, fine mesh strainer, or sift, is a device for separating wanted elements from unwanted material or for controlling the particle size distribution of a sample, using a screen such as a woven mesh or net or perforated sheet material. T ...
*
BASIC BASIC (Beginners' All-purpose Symbolic Instruction Code) is a family of general-purpose, high-level programming languages designed for ease of use. The original version was created by John G. Kemeny and Thomas E. Kurtz at Dartmouth College ...


See also

* Data-directed programming *
Backus–Naur form In computer science, Backus–Naur form () or Backus normal form (BNF) is a metasyntax notation for context-free grammars, often used to describe the syntax of languages used in computing, such as computer programming languages, document formats ...


References


External links


"The important part is moving program logic away from hardwired control structures and into data."
{{Types of programming languages Programming paradigms