Overview
The most outstanding feature of algorithmic skeletons, which differentiates them from other high-level parallel programming models, is that orchestration and synchronization of the parallel activities are implicitly defined by the skeleton patterns. Programmers do not have to specify the synchronizations between the application's sequential parts. This has two implications. First, as the communication/data access patterns are known in advance, cost models can be applied to schedule skeleton programs. Second, algorithmic skeleton programming reduces the number of errors when compared to traditional lower-level parallel programming models (Threads, MPI).
Example program
The following example is based on the Javpartition(...)
which implements the well-known QuickSort pivot and swap scheme.
Range r
. In this case we simply invoke Java's default (Arrays.sort) method for the given sub-array.
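The divide-and-conquer structure described in this example can be illustrated with a minimal sketch. It does not use a skeleton library: it is written against the standard java.util.concurrent fork/join classes, and the names QuickSortSketch, SortTask and THRESHOLD, as well as the fields of Range, are hypothetical choices made for illustration only. Sub-arrays below the threshold are sorted with Arrays.sort; larger ones are split with a pivot-and-swap partition and the two halves are sorted in parallel.

```java
import java.util.Arrays;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveAction;

/** Divide-and-conquer QuickSort sketch; all names are hypothetical. */
final class QuickSortSketch {

    /** Assumed shape of a range: an array plus inclusive left/right bounds. */
    static final class Range {
        final int[] array;
        final int left;
        final int right;

        Range(int[] array, int left, int right) {
            this.array = array;
            this.left = left;
            this.right = right;
        }
    }

    /** Sub-arrays shorter than this are sorted sequentially (arbitrary value). */
    static final int THRESHOLD = 16;

    /** Pivot-and-swap partition (Lomuto scheme); returns the pivot's final index. */
    static int partition(int[] a, int left, int right) {
        int pivot = a[right];
        int store = left;
        for (int i = left; i < right; i++) {
            if (a[i] < pivot) {
                swap(a, i, store++);
            }
        }
        swap(a, store, right);
        return store;
    }

    static void swap(int[] a, int i, int j) {
        int t = a[i];
        a[i] = a[j];
        a[j] = t;
    }

    /** One divide-and-conquer step: test the condition, then execute or split. */
    static final class SortTask extends RecursiveAction {
        private final Range r;

        SortTask(Range r) {
            this.r = r;
        }

        @Override
        protected void compute() {
            // Condition: small (or empty) ranges are handled sequentially.
            if (r.right - r.left < THRESHOLD) {
                if (r.left < r.right) {
                    Arrays.sort(r.array, r.left, r.right + 1); // upper bound is exclusive
                }
                return;
            }
            // Split: partition around a pivot, then sort both halves in parallel.
            int p = partition(r.array, r.left, r.right);
            invokeAll(new SortTask(new Range(r.array, r.left, p - 1)),
                      new SortTask(new Range(r.array, p + 1, r.right)));
            // Merge: nothing to do, because the sort is performed in place.
        }
    }

    /** Sorts the array in place using the common fork/join pool. */
    static void sort(int[] data) {
        if (data.length > 1) {
            ForkJoinPool.commonPool().invoke(new SortTask(new Range(data, 0, data.length - 1)));
        }
    }

    public static void main(String[] args) {
        int[] data = {5, 2, 9, 1, 7, 3, 8, 6, 4, 0};
        sort(data);
        System.out.println(Arrays.toString(data));
    }
}
```

In a skeleton framework the condition, split, execute and merge steps would be supplied as separate pieces of sequential code and the pattern itself would take care of scheduling; here that role is played by the fork/join pool.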
Frameworks and libraries
ASSIST
ASSIST is a programming environment which provides programmers with a structured coordination language (M. Aldinucci, M. Coppola, M. Danelutto, N. Tonellotto, M. Vanneschi, and C. Zoccolo, "High level grid programming with ASSIST", Computational Methods in Science and Technology, 12(1):21–32, 2006). The coordination language can express parallel programs as an arbitrary graph of software modules. The module graph describes how a set of modules interact with each other using a set of typed data streams. The modules can be sequential or parallel. Sequential modules can be written in C, C++, or Fortran, while parallel modules are programmed with a special ASSIST parallel module (''parmod''). AdHoc, a hierarchical and fault-tolerant Distributed Shared Memory (DSM) system, is used to interconnect streams of data between processing elements by providing a repository with get/put/remove/execute operations. Research around AdHoc has focused on transparency, scalability, and fault-tolerance of the data repository. While not a classical skeleton framework, in the sense that no skeletons are provided, ASSIST's generic ''parmod'' can be specialized into classical skeletons such as ''farm'', ''map'', etc. ASSIST also supports autonomic control of ''parmods'', and can be subject to a performance contract by dynamically adapting the number of resources used.
CO2P3S
CO2P3S (Correct Object-Oriented Pattern-based Parallel Programming System) is a pattern-oriented development environment, which achieves parallelism using threads in Java. CO2P3S is concerned with the complete development process of a parallel application. Programmers interact through a programming GUI to choose a pattern and its configuration options. Then, programmers fill the hooks required for the pattern, and new code is generated as a framework in Java for the parallel execution of the application. The generated framework uses three levels, in descending order of abstraction: patterns layer, intermediate code layer, and native code layer. Thus, advanced programmers may intervene in the generated code at multiple levels to tune the performance of their applications. The generated code is mostly
Calcium & Skandium
Calcium is greatly inspired by Lithium and Muskel. As such, it provides algorithmic skeleton programming as a Java library. Both task and data parallel skeletons are fully nestable, and are instantiated via parametric skeleton objects, not inheritance. Calcium supports the execution of skeleton applications on top of the
Eden
Eden is a parallel programming language for distributed memory environments, which extends Haskell. Processes are defined explicitly to achieve parallel programming, while their communications remain implicit. Processes communicate through unidirectional channels, which connect one writer to exactly one reader. Programmers only need to specify which data a process depends on. Eden's process model provides direct control over process granularity, data distribution and communication topology. Eden is not a skeleton language in the sense that skeletons are not provided as language constructs. Instead, skeletons are defined on top of Eden's lower-level process abstraction, supporting both task and
eSkel
The Edinburgh Skeleton Library (eSkel) is provided in C and runs on top of MPI. The first version of eSkel was described in, while a later version is presented in. In, nesting-mode and interaction-mode for skeletons are defined. The nesting-mode can be either transient or persistent, while the interaction-mode can be either implicit or explicit. Transient nesting means that the nested skeleton is instantiated for each invocation and destroyed afterwards, while persistent means that the skeleton is instantiated once and the same skeleton instance will be invoked throughout the application. Implicit interaction means that the flow of data between skeletons is completely defined by the skeleton composition, while explicit means that data can be generated or removed from the flow in a way not specified by the skeleton composition. For example, a skeleton that produces an output without ever receiving an input has explicit interaction. Performance prediction for scheduling and resource mapping, mainly for pipelines, has been explored by Benoit et al. They provided a performance model for each mapping, based on process algebra, and determined the best scheduling strategy based on the results of the model. More recent works have addressed the problem of adaptation in structured parallel programming, in particular for the pipe skeleton.
FastFlow
FastFlow
HDC
Higher-order Divide and Conquer (HDC) is a subset of the functional language
HOC-SA
HOC-SA is a
JaSkel
JaSkel is a Java-based skeleton framework providing skeletons such as farm, pipe and heartbeat. Skeletons are specialized using inheritance. Programmers implement the abstract methods for each skeleton to provide their application-specific code. Skeletons in JaSkel are provided in sequential, concurrent and dynamic versions. For example, the concurrent farm can be used in shared memory environments (threads), but not in distributed environments (clusters), where the distributed farm should be used. To change from one version to the other, programmers must change their classes' signature to inherit from a different skeleton. The nesting of skeletons uses the basic Java Object class, and therefore no type system is enforced during the skeleton composition. The distribution aspects of the computation are handled in JaSkel using AOP, more specifically the AspectJ implementation. Thus, JaSkel can be deployed on both cluster and Grid-like infrastructures. Nevertheless, a drawback of the JaSkel approach is that the nesting of the skeleton strictly relates to the deployment infrastructure. Thus, a double nesting of farm yields a better performance than a single farm on hierarchical infrastructures. This defeats the purpose of using AOP to separate the distribution and functional concerns of the skeleton program.
Lithium & Muskel
Lithium and its successor Muskel are skeleton frameworks developed at the University of Pisa, Italy. Both of them provide nestable skeletons to the programmer as Java libraries. The evaluation of a skeleton application follows a formal definition of operational semantics introduced by Aldinucci and Danelutto, which can handle both task and data parallelism. The semantics describe both functional and parallel behavior of the skeleton language using a labeled transition system. Additionally, several performance optimizations are applied, such as skeleton rewriting techniques, task lookahead, and server-to-server lazy binding. At the implementation level, Lithium exploits macro-data flow to achieve parallelism. When the input stream receives a new parameter, the skeleton program is processed to obtain a macro-data flow graph. The nodes of the graph are macro-data flow instructions (MDFi) which represent the sequential pieces of code provided by the programmer. Tasks are used to group together several MDFi, and are consumed by idle processing elements from a task pool. When the computation of the graph is concluded, the result is placed into the output stream and thus delivered back to the user. Muskel also provides non-functional features such as Quality of Service (QoS); security between task pool and interpreters; and resource discovery, load balancing, and fault tolerance when interfaced with Java / Jini Parallel Framework (JJPF), a distributed execution framework. Muskel also provides support for combining structured with unstructured programming, and recent research has addressed extensibility.
Mallba
Mallba is a library for combinatorial optimization supporting exact, heuristic and hybrid search strategies. Each strategy is implemented in Mallba as a generic skeleton which can be used by providing the required code. For exact search algorithms, Mallba provides branch-and-bound and dynamic-optimization skeletons. For local search heuristics Mallba supports:
Marrow
Marrow is a C++ algorithmic skeleton framework for the orchestration of
Muesli
The Muenster Skeleton Library Muesli is a C++ template library which re-implements many of the ideas and concepts introduced in Skil, e.g. higher order functions, currying, and polymorphic type
P3L, SkIE, SKElib
P3L (Pisa Parallel Programming Language) is a skeleton-based coordination language. P3L provides skeleton constructs which are used to coordinate the parallel or sequential execution of C code. A compiler named Anacleto is provided for the language. Anacleto uses implementation templates to compile P3L code into a target architecture. Thus, a skeleton can have several templates, each optimized for a different architecture. A template implements a skeleton on a specific architecture and provides a parametric process graph with a performance model. The performance model can then be used to decide program transformations which can lead to performance optimizations. A P3L module corresponds to a properly defined skeleton construct with input and output streams, and other sub-modules or sequential C code. Modules can be nested using the two tier model, where the outer level is composed of task parallel skeletons, while data parallel skeletons may be used in the inner level. Type verification is performed at the data flow level, when the programmer explicitly specifies the type of the input and output streams, and by specifying the flow of data between sub-modules. SkIE (Skeleton-based Integrated Environment) is quite similar to P3L, as it is also based on a coordination language, but provides advanced features such as debugging tools, performance analysis, visualization and graphical user interface. Instead of directly using the coordination language, programmers interact with a graphical tool, where parallel modules based on skeletons can be composed. SKElib builds upon the contributions of P3L and SkIE by inheriting, among others, the template system. It differs from them because a coordination language is no longer used; instead, skeletons are provided as a library in C, with performance similar to that achieved in P3L. Contrary to Skil, another C-like skeleton framework, type safety is not addressed in SKElib.
PAS and EPAS
PAS (Parallel Architectural Skeletons) is a framework for skeleton programming developed in C++ and MPI. Programmers use an extension of C++ to write their skeleton applications. The code is then passed through a Perl script which expands the code to pure C++ where skeletons are specialized through inheritance. In PAS, every skeleton has a Representative (Rep) object which must be provided by the programmer and is in charge of coordinating the skeleton's execution. Skeletons can be nested in a hierarchical fashion via the Rep objects. Besides the skeleton's execution, the Rep also explicitly manages the reception of data from the higher level skeleton, and the sending of data to the sub-skeletons. A parametrized communication/synchronization protocol is used to send and receive data between parent and sub-skeletons. An extension of PAS labeled as SuperPas and later as EPAS addresses skeleton extensibility concerns. With the EPAS tool, new skeletons can be added to PAS. A Skeleton Description Language (SDL) is used to describe the skeleton pattern by specifying the topology with respect to a virtual processor grid. The SDL can then be compiled into native C++ code, which can be used as any other skeleton.
SBASCO
SBASCO (Skeleton-BAsed Scientific COmponents) is a programming environment oriented towards efficient development of parallel and distributed numerical applications. SBASCO aims at integrating two programming models: skeletons and components with a custom composition language. An application view of a component provides a description of its interfaces (input and output type), while a configuration view provides, in addition, a description of the component's internal structure and processor layout. A component's internal structure can be defined using three skeletons: farm, pipe and multi-block. SBASCO addresses domain decomposable applications through its multi-block skeleton. Domains are specified through arrays (mainly two dimensional), which are decomposed into sub-arrays with possible overlapping boundaries. The computation then takes place in an iterative BSP-like fashion. The first stage consists of local computations, while the second stage performs boundary exchanges. A use case is presented for a reaction-diffusion problem in. Two types of components are presented in. Scientific Components (SC) which provide the functional code; and Communication Aspect Components (CAC) which encapsulate non-functional behavior such as communication, distribution processor layout and replication. For example, SC components are connected to a CAC component which can act as a manager at runtime by dynamically re-mapping processors assigned to a SC. A use case showing improved performance when using CAC components is shown in.
SCL
The Structured Coordination Language (SCL) was one of the earliest skeleton programming languages. It provides a coordination language approach for skeleton programming over software components. SCL is considered a base language, and was designed to be integrated with a host language, for example Fortran or C, used for developing sequential software components. In SCL, skeletons are classified into three types: configuration, elementary and computation. Configuration skeletons abstract patterns for commonly used data structures such as distributed arrays (ParArray). Elementary skeletons correspond to data parallel skeletons such as map, scan, and fold. Computation skeletons abstract the control flow and correspond mainly to task parallel skeletons such as farm, SPMD, and iterateUntil. The coordination language approach was used in conjunction with performance models for programming traditional parallel machines as well as parallel heterogeneous machines that have different multiple cores on each processing node.
SkePU
SkePU is a skeleton programming framework for multicore CPUs and multi-GPU systems. It is a C++ template library with six data-parallel skeletons and one task-parallel skeleton, two container types, and support for execution on multi-GPU systems with both CUDA and OpenCL. Recently, support for hybrid execution, performance-aware dynamic scheduling and load balancing has been developed in SkePU by implementing a backend for the StarPU runtime system. SkePU is being extended for GPU clusters.
SKiPPER & QUAFF
SKiPPER is a domain specific skeleton library for vision applications which provides skeletons in CAML, and thus relies on CAML for type safety. Skeletons are presented in two ways: declarative and operational. Declarative skeletons are directly used by programmers, while their operational versions provide an architecture specific target implementation. From the runtime environment, CAML skeleton specifications, and application specific functions (provided in C by the programmer), new C code is generated and compiled to run the application on the target architecture. One of the interesting things about SKiPPER is that the skeleton program can be executed sequentially for debugging. Different approaches have been explored in SKiPPER for writing operational skeletons: static data-flow graphs, parametric process networks, hierarchical task graphs, and tagged-token data-flow graphs. QUAFF is a more recent skeleton library written in C++ and MPI. QUAFF relies on template-based meta-programming techniques to reduce runtime overheads and perform skeleton expansions and optimizations at compilation time. Skeletons can be nested and sequential functions are stateful. Besides type checking, QUAFF takes advantage of C++ templates to generate, at compilation time, new C/MPI code. QUAFF is based on the CSP-model, where the skeleton program is described as a process network and production rules (single, serial, par, join).
SkeTo
The SkeTo project is a C++ library which achieves parallelization using MPI. SkeTo is different from other skeleton libraries because instead of providing nestable parallelism patterns, SkeTo provides parallel skeletons for parallel data structures such as lists, trees, and matrices. The data structures are typed using templates, and several parallel operations can be invoked on them. For example, the list structure provides parallel operations such as map, reduce, scan, zip, shift, etc. Additional research around SkeTo has also focused on optimization strategies by transformation, and more recently domain specific optimizations. For example, SkeTo provides a fusion transformation which merges two successive function invocations into a single one, thus decreasing the function call overheads and avoiding the creation of intermediate data structures passed between functions.
Skil
Skil is an imperative language for skeleton programming. Skeletons are not directly part of the language but are implemented with it. Skil uses a subset of the C language which provides functional-language-like features such as higher order functions, currying and polymorphic types. When Skil is compiled, such features are eliminated and regular C code is produced. Thus, Skil transforms polymorphic higher order functions into monomorphic first order C functions. Skil does not support nestable composition of skeletons. Data parallelism is achieved using specific data parallel structures, for example to spread arrays among available processors. Filter skeletons can be used.
STAPL Skeleton Framework
In the STAPL Skeleton Framework, skeletons are defined as parametric data flow graphs, letting them scale beyond 100,000 cores. In addition, this framework addresses composition of skeletons as point-to-point composition of their corresponding data flow graphs through the notion of ports, allowing new skeletons to be easily added to the framework. As a result, this framework eliminates the need for reimplementation and global synchronizations in composed skeletons. The STAPL Skeleton Framework supports nested composition and can switch between parallel and sequential execution in each level of nesting. This framework benefits from the scalable implementation of STAPL parallel containers and can run skeletons on various containers including vectors, multidimensional arrays, and lists.
T4P
T4P was one of the first systems introduced for skeleton programming. The system relied heavily on functional programming properties, and five skeletons were defined as higher order functions: Divide-and-Conquer, Farm, Map, Pipe and RaMP. A program could have more than one implementation, each using a combination of different skeletons. Furthermore, each skeleton could have different parallel implementations. A methodology based on functional program transformations guided by performance models of the skeletons was used to select the most appropriate skeleton to be used for the program as well as the most appropriate implementation of the skeleton.
Frameworks comparison
* Activity years is the known span of activity. The dates in this column correspond to the first and last publication dates of a related article in a scientific journal or conference proceedings. Note that a project may still be active beyond the activity span, and that no publication for it was found beyond the given date.
* Programming language is the interface with which programmers interact to code their skeleton applications. These languages are diverse, encompassing paradigms such as functional languages, coordination languages, markup languages, imperative languages, object-oriented languages, and even graphical user interfaces. Inside the programming language, skeletons have been provided either as language constructs or libraries. Providing skeletons as language constructs implies the development of a custom domain specific language and its compiler. This was clearly the stronger trend at the beginning of skeleton research. The more recent trend is to provide skeletons as libraries, in particular with object-oriented languages such as C++ and Java.
* Execution language is the language in which the skeleton applications are run or compiled. It was recognized very early that the programming languages (especially in the functional cases) were not efficient enough to execute the skeleton programs. Therefore, skeleton programming languages were simplified by executing skeleton applications in other languages. Transformation processes were introduced to convert the skeleton applications (defined in the programming language) into an equivalent application in the target execution language. Different transformation processes were introduced, such as code generation or instantiation of lower-level skeletons (sometimes called operational skeletons) which were capable of interacting with a library in the execution language. The transformed application also gave the opportunity to introduce target architecture code, customized for performance, into the transformed application. Table 1 shows that a favorite for execution language has been the C language.
* Distribution library provides the functionality to achieve parallel/distributed computations. The big favorite in this sense has been MPI, which is not surprising since it integrates well with the C language, and is probably the most used tool for parallelism in cluster computing. The dangers of directly programming with the distribution library are, of course, safely hidden away from the programmers, who never interact with the distribution library. Recently, the trend has been to develop skeleton frameworks capable of interacting with more than one distribution library. For example, CO2P3S can use Threads, RMI or Sockets; Mallba can use Netstream or MPI; and JaSkel uses AspectJ to execute the skeleton applications on different skeleton frameworks.
* Type safety refers to the capability of detecting type incompatibility errors in a skeleton program. Since the first skeleton frameworks were built on functional languages such as Haskell, type safety was simply inherited from the host language. Nevertheless, as custom languages were developed for skeleton programming, compilers had to be written to take type checking into consideration, which was not as difficult because skeleton nesting was not fully supported. Recently, however, as skeleton frameworks began to be hosted on object-oriented languages with full nesting, the type safety issue has resurfaced. Unfortunately, type checking has been mostly overlooked (with the exception of QUAFF), especially in Java-based skeleton frameworks.
* Skeleton nesting is the capability of hierarchical composition of skeleton patterns. Skeleton nesting was identified as an important feature in skeleton programming from the very beginning, because it allows the composition of more complex patterns starting from a basic set of simpler patterns. Nevertheless, it has taken the community a long time to fully support arbitrary nesting of skeletons, mainly because of the scheduling and type verification difficulties. The trend is clear that recent skeleton frameworks support full nesting of skeletons.
* File access is the capability to access and manipulate files from an application. In the past, skeleton programming has proven useful mostly for computationally intensive applications, where small amounts of data require large amounts of computation time. Nevertheless, many distributed applications require or produce large amounts of data during their computation. This is the case for astrophysics, particle physics, bio-informatics, etc. Thus, providing file transfer support that integrates with skeleton programming is a key concern which has been mostly overlooked.
* Skeleton set is the list of supported skeleton patterns. Skeleton sets vary greatly from one framework to the other and, more strikingly, some skeletons with the same name have different semantics on different frameworks. The most common skeleton patterns in the literature are probably farm, pipe, and map.
See also