Pipeline programming

In software engineering, a pipeline consists of a chain of processing elements (processes, threads, coroutines, functions, etc.), arranged so that the output of each element is the input of the next. The concept is analogous to a physical pipeline. Usually some amount of buffering is provided between consecutive elements. The information that flows in these pipelines is often a stream of records, bytes, or bits, and the elements of a pipeline may be called filters. This is also called the pipe(s) and filters design pattern, which is monolithic. Its advantages are simplicity and low cost, while its disadvantages are lack of elasticity, fault tolerance and scalability. Connecting elements into a pipeline is analogous to function composition.

Narrowly speaking, a pipeline is linear and one-directional, though sometimes the term is applied to more general flows. For example, a primarily one-directional pipeline may have some communication in the other direction, known as a return channel or backchannel, as in the lexer hack, or a pipeline may be fully bi-directional. Flows with one-directional trees and directed acyclic graph topologies behave similarly to linear pipelines. The lack of cycles in such flows makes them simple, and thus they may be loosely referred to as "pipelines".
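The analogy with function composition can be made concrete with a small sketch. It assumes each filter is a Python generator function that consumes one stream and yields another; the helper compose_pipeline and the three filter names are hypothetical and chosen only for illustration.

```python
from functools import reduce


def compose_pipeline(*stages):
    """Chain stages left to right, so each stage's output feeds the next."""
    def run(source):
        return reduce(lambda stream, stage: stage(stream), stages, source)
    return run


def strip_lines(lines):
    # Filter 1: remove surrounding whitespace from every record.
    for line in lines:
        yield line.strip()


def drop_empty(lines):
    # Filter 2: discard records that are empty after stripping.
    for line in lines:
        if line:
            yield line


def to_upper(lines):
    # Filter 3: transform each remaining record.
    for line in lines:
        yield line.upper()


pipeline = compose_pipeline(strip_lines, drop_empty, to_upper)
result = list(pipeline(["  alpha ", "", " beta"]))
print(result)
```

Here pipeline(source) is equivalent to to_upper(drop_empty(strip_lines(source))), which is ordinary function composition written in data-flow order.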


Implementation

Pipelines are often implemented in a multitasking OS by launching all elements at the same time as processes, and automatically servicing the data read requests of each process with the data written by the upstream process. This can be called a multiprocessed pipeline. In this way, the scheduler will naturally switch the CPU among the processes so as to minimize its idle time. In other common models, elements are implemented as lightweight threads or as coroutines to reduce the OS overhead often involved with processes. Depending on the OS, threads may be scheduled directly by the OS or by a thread manager. Coroutines are always scheduled by a coroutine manager of some form.

Read and write requests are usually blocking operations. This means that the execution of the source process, upon writing, is suspended until all data can be written to the destination process. Likewise, the execution of the destination process, upon reading, is suspended until at least some of the requested data can be obtained from the source process. This cannot lead to a deadlock, where both processes would wait indefinitely for each other to respond, since at least one of the processes will soon have its request serviced by the operating system, and continue to run.

For performance, most operating systems implementing pipes use pipe buffers, which allow the source process to provide more data than the destination process is currently able or willing to receive. Under most Unixes and Unix-like operating systems, a special command is also available, typically called "buffer", that implements a pipe buffer of potentially much larger and configurable size. This command can be useful if the destination process is significantly slower than the source process, but it is desired that the source process complete its task as soon as possible.

For example, suppose the source process is a command that reads an audio track from a CD and the destination process is a command that compresses the waveform audio data to a format like MP3. In this case, buffering the entire track in a pipe buffer would allow the CD drive to spin down more quickly, and enable the user to remove the CD from the drive before the encoding process has finished.

Such a buffer command can be implemented using system calls for reading and writing data. Wasteful busy waiting can be avoided by using facilities such as poll or select, or multithreading.

Some notable examples of pipeline software systems include:

* RaftLib – C/C++ Apache 2.0 License
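A multiprocessed pipeline of the kind described in this section can be sketched with Python's standard subprocess module. The two inline child programs (producer_code and consumer_code) are made up for illustration; the operating system supplies the pipe, its buffer, and the blocking read and write behaviour.

```python
import subprocess
import sys

# Hypothetical stage programs, run as separate OS processes.
producer_code = "for i in range(5): print(i)"
consumer_code = "import sys\nprint(sum(int(line) for line in sys.stdin))"

# Launch both elements at once; the producer's stdout is an OS pipe.
producer = subprocess.Popen(
    [sys.executable, "-c", producer_code],
    stdout=subprocess.PIPE,
)
# The consumer reads directly from the producer's pipe.
consumer = subprocess.Popen(
    [sys.executable, "-c", consumer_code],
    stdin=producer.stdout,
    stdout=subprocess.PIPE,
)
# Close the parent's copy of the read end so that only the consumer holds it.
producer.stdout.close()

output, _ = consumer.communicate()  # blocks until the consumer has finished
producer.wait()
total = int(output.decode().strip())
print(total)
```

The parent process never copies the data itself. The consumer's reads block until the producer has written something, and the producer's writes would block if the kernel's pipe buffer filled up.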


VM/CMS and z/OS

CMS Pipelines is a port of the pipeline idea to VM/CMS and z/OS systems. It supports much more complex pipeline structures than Unix shells, with steps taking multiple input streams and producing multiple output streams. (Such functionality is supported by the Unix kernel, but few programs use it as it makes for complicated syntax and blocking modes, although some shells do support it via arbitrary file descriptor assignment.) Traditional application programs on IBM mainframe operating systems have no standard input and output streams to allow redirection or piping. Instead of spawning processes with external programs, CMS Pipelines features a lightweight dispatcher to concurrently execute instances of more than 200 built-in programs that implement typical UNIX utilities and interface to devices and operating system services. In addition to the built-in programs, CMS Pipelines defines a framework to allow user-written REXX programs with input and output streams that can be used in the pipeline.

Data on IBM mainframes typically resides in a record-oriented filesystem, and connected I/O devices operate in record mode rather than stream mode. As a consequence, data in CMS Pipelines is handled in record mode. For text files, a record holds one line of text. In general, CMS Pipelines does not buffer the data but passes records of data in a lock-step fashion from one program to the next. This ensures a deterministic flow of data through a network of interconnected pipelines.
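The lock-step, unbuffered passing of records can be loosely imitated with Python generators. This is only an analogy under stated assumptions, not CMS Pipelines syntax or its dispatcher: the stage names source, upper and sink and the trace list are hypothetical. Each record travels through every stage before the next record is read.

```python
trace = []  # records the order in which stages touch each record


def source(records):
    for record in records:
        trace.append(f"source emits {record!r}")
        yield record


def upper(stage_input):
    for record in stage_input:
        trace.append(f"upper handles {record!r}")
        yield record.upper()


def sink(stage_input):
    received = []
    for record in stage_input:
        trace.append(f"sink receives {record!r}")
        received.append(record)
    return received


# One record per line of text, as in a record-oriented file.
result = sink(upper(source(["first line", "second line"])))
for entry in trace:
    print(entry)
```

Because no stage holds more than one record at a time, the trace order is the same on every run: the first record reaches the sink before the source emits the second.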


Object pipelines

Besides byte stream-based pipelines, there are also object pipelines. In an object pipeline, processing elements output objects instead of text. PowerShell includes an internal object pipeline that transfers .NET objects between functions within the PowerShell runtime. Channels, found in the Limbo programming language, are another example of this metaphor.
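The difference from a text pipeline can be illustrated with a short sketch in which stages exchange structured objects, so downstream stages read fields directly instead of parsing text. The ProcessInfo class, its sample data, and the stage functions are hypothetical; they are not PowerShell commands or .NET types.

```python
from dataclasses import dataclass


@dataclass
class ProcessInfo:
    name: str
    memory_mb: int


def list_processes():
    # Source stage: emits objects rather than formatted text lines.
    yield ProcessInfo("editor", 120)
    yield ProcessInfo("browser", 900)
    yield ProcessInfo("shell", 15)


def where(items, predicate):
    # Filter stage: keeps the objects that satisfy the predicate.
    for item in items:
        if predicate(item):
            yield item


def sort_by(items, key):
    # Sorting needs the whole input, so this stage consumes its stream fully.
    return iter(sorted(items, key=key))


def select(items, attribute):
    # Projection stage: extracts one field from each object.
    for item in items:
        yield getattr(item, attribute)


heavy = where(list_processes(), lambda p: p.memory_mb > 100)
names = list(select(sort_by(heavy, key=lambda p: p.memory_mb), "name"))
print(names)
```

No stage has to split columns or convert strings to numbers, which is the main practical gain of passing objects between elements.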


Pipelines in GUIs

Graphical environments such as RISC OS and ROX Desktop also use pipelines. Rather than providing a save dialog box containing a file manager to let the user specify where a program should write data, RISC OS and ROX provide a save dialog box containing an icon (and a field to specify the name). The destination is specified by dragging and dropping the icon. The user can drop the icon anywhere an already-saved file could be dropped, including onto icons of other programs. If the icon is dropped onto a program's icon, that program is loaded and the contents that would otherwise have been saved are passed in on the new program's standard input stream.

For instance, a user browsing the World Wide Web might come across a .gz compressed image which they want to edit and re-upload. Using GUI pipelines, they could drag the link to their de-archiving program, drag the icon representing the extracted contents to their image editor, edit it, open the "save as" dialog, and drag its icon to their uploading software.

Conceptually, this method could be used with a conventional save dialog box, but this would require the user's programs to have an obvious and easily accessible location in the filesystem. As this is often not the case, GUI pipelines are rare.


Other considerations

The name "pipeline" comes from a rough analogy with physical plumbing, in that a pipeline usually allows information to flow in only one direction, as water often flows in a pipe. Pipes and filters can be viewed as a form of functional programming, using byte streams as data objects. More specifically, they can be seen as a particular form of monad for I/O ("Monadic I/O and UNIX shell programming"). The concept of a pipeline is also central to the Cocoon web development framework and to any implementation of XProc (the W3C standard), where it allows a source stream to be modified before eventual display. This pattern encourages the use of text streams as the input and output of programs. This reliance on text has to be accounted for when creating graphical shells for text programs.


See also

* Anonymous pipe
* Component-based software engineering
* Flow-based programming
* GStreamer for a multimedia framework built on plugin pipelines
* Graphics pipeline
* Iteratees
* Named pipe, an operating system construct intermediate to anonymous pipe and file
* Pipeline (computing) for other computer-related versions of the concept
* Kahn process networks to extend the pipeline concept to a more generic directed graph structure
* Pipeline (Unix) for details specific to Unix
* Plumber – "intelligent pipes" developed as part of Plan 9
* Producer–consumer problem – for implementation aspects of software pipelines
* Software design pattern
* Stream processing
* XML pipeline for processing of XML files


External links


Pipeline Processing.

