HOME

TheInfoList



OR:

Cilk, Cilk++, Cilk Plus and OpenCilk are general-purpose
programming language A programming language is a system of notation for writing computer programs. Most programming languages are text-based formal languages, but they may also be graphical. They are a kind of computer language. The description of a programming ...
s designed for multithreaded
parallel computing Parallel computing is a type of computation in which many calculations or processes are carried out simultaneously. Large problems can often be divided into smaller ones, which can then be solved at the same time. There are several different f ...
. They are based on the C and
C++ C++ (pronounced "C plus plus") is a high-level general-purpose programming language created by Danish computer scientist Bjarne Stroustrup as an extension of the C programming language, or "C with Classes". The language has expanded significan ...
programming languages, which they extend with constructs to express parallel loops and the fork–join idiom. Originally developed in the 1990s at the
Massachusetts Institute of Technology The Massachusetts Institute of Technology (MIT) is a private land-grant research university in Cambridge, Massachusetts. Established in 1861, MIT has played a key role in the development of modern technology and science, and is one of th ...
(MIT) in the group of
Charles E. Leiserson Charles Eric Leiserson is a computer scientist, specializing in the theory of parallel computing and distributed computing, and particularly practical applications thereof. As part of this effort, he developed the Cilk multithreaded language. ...
, Cilk was later commercialized as Cilk++ by a spinoff company, Cilk Arts. That company was subsequently acquired by
Intel Intel Corporation is an American multinational corporation and technology company headquartered in Santa Clara, California. It is the world's largest semiconductor chip manufacturer by revenue, and is one of the developers of the x86 ser ...
, which increased compatibility with existing C and C++ code, calling the result Cilk Plus. After Intel stopped supporting Cilk Plus in 2017, MIT is again developing Cilk in the form of OpenCilk.


History


MIT Cilk

The Cilk programming language grew out of three separate projects at the MIT Laboratory for Computer Science: * Theoretical work on scheduling multi-threaded applications. * StarTech – a parallel
chess program In computer chess, a chess engine is a computer program that analyzes chess or chess variant positions, and generates a move or list of moves that it regards as strongest. A chess engine is usually a back end with a command-line interface wit ...
built to run on the Thinking Machines Corporation's Connection Machine model CM-5. * PCM/Threaded-C – a C-based package for scheduling continuation-passing-style threads on the CM-5 In April 1994 the three projects were combined and christened "Cilk". The name Cilk is not an acronym, but an allusion to "nice threads" (
silk Silk is a natural protein fiber, some forms of which can be woven into textiles. The protein fiber of silk is composed mainly of fibroin and is produced by certain insect larvae to form cocoons. The best-known silk is obtained from th ...
) and the C programming language. The Cilk-1 compiler was released in September 1994. The original Cilk language was based on
ANSI C ANSI C, ISO C, and Standard C are successive standards for the C programming language published by the American National Standards Institute (ANSI) and ISO/IEC JTC 1/SC 22/WG 14 of the International Organization for Standardization (ISO) and th ...
, with the addition of Cilk-specific keywords to signal parallelism. When the Cilk keywords are removed from Cilk source code, the result should always be a valid C program, called the ''serial elision'' (or ''C elision'') of the full Cilk program, with the same semantics as the Cilk program running on a single processor. Despite several similarities, Cilk is not directly related to AT&T Bell Labs'
Concurrent C Concurrent means happening at the same time. Concurrency, concurrent, or concurrence may refer to: Law * Concurrence, in jurisprudence, the need to prove both ''actus reus'' and ''mens rea'' * Concurring opinion (also called a "concurrence"), a ...
. Cilk was implemented as a translator to C, targeting the GNU C Compiler (GCC). The last version, Cilk 5.4.6, is available from the MIT Computer Science and Artificial Intelligence Laboratory (CSAIL), but is no longer supported. A showcase for Cilk's capabilities was the Cilkchess parallel chess-playing program, which won several computer chess prizes in the 1990s, including the 1996 Open Dutch Computer Chess Championship.


Cilk Arts and Cilk++

Prior to , the market for Cilk was restricted to high-performance computing. The emergence of multicore processors in mainstream computing meant that hundreds of millions of new parallel computers were being shipped every year. Cilk Arts was formed to capitalize on that opportunity: in 2006, Leiserson launched Cilk Arts to create and bring to market a modern version of Cilk that supports the commercial needs of an upcoming generation of programmers. The company closed a Series A venture financing round in October 2007, and its product, Cilk++ 1.0, shipped in December, 2008. Cilk++ differs from Cilk in several ways: support for C++, support for loops, and hyperobjects a new construct designed to solve data race problems created by parallel accesses to global variables. Cilk++ was
proprietary software Proprietary software is software that is deemed within the free and open-source software to be non-free because its creator, publisher, or other rightsholder or rightsholder partner exercises a legal monopoly afforded by modern copyright and i ...
. Like its predecessor, it was implemented as a Cilk-to-C++ compiler. It supported the
Microsoft Microsoft Corporation is an American multinational technology corporation producing computer software, consumer electronics, personal computers, and related services headquartered at the Microsoft Redmond campus located in Redmond, Washi ...
and GNU compilers.


Intel Cilk Plus

On July 31, 2009, Cilk Arts announced on its web site that its products and engineering team were now part of
Intel Intel Corporation is an American multinational corporation and technology company headquartered in Santa Clara, California. It is the world's largest semiconductor chip manufacturer by revenue, and is one of the developers of the x86 ser ...
Corp. In early 2010, the Cilk website at www.cilk.com began redirecting to the Intel website (as of early 2017, the original Cilk website no longer resolves to a host). Intel and Cilk Arts integrated and advanced the technology further resulting in a September 2010 release of Intel
Cilk Plus Cilk, Cilk++, Cilk Plus and OpenCilk are general-purpose programming languages designed for multithreaded parallel computing. They are based on the C and C++ programming languages, which they extend with constructs to express parallel loo ...
. Cilk Plus adopts simplifications, proposed by Cilk Arts in Cilk++, to eliminate the need for several of the original Cilk keywords while adding the ability to spawn functions and to deal with variables involved in reduction operations. Cilk Plus differs from Cilk and Cilk++ by adding array extensions, being incorporated in a commercial compiler (from Intel), and compatibility with existing debuggers. Cilk Plus was first implemented in the
Intel C++ Compiler Intel oneAPI DPC++/C++ Compiler and Intel C++ Compiler Classic are Intel’s C, C++, SYCL, and Data Parallel C++ (DPC++) compilers for Intel processor-based systems, available for Windows, Linux, and macOS operating systems. Overview Intel ...
with the release of the Intel compiler in Intel Composer XE 2010. An open source ( BSD-licensed) implementation was contributed by Intel to the
GNU Compiler Collection The GNU Compiler Collection (GCC) is an optimizing compiler produced by the GNU Project supporting various programming languages, hardware architectures and operating systems. The Free Software Foundation (FSF) distributes GCC as free softwar ...
(GCC), which shipped Cilk Plus support in version 4.9, except for the keyword, which was added in GCC 5.0. In February 2013, Intel announced a
Clang Clang is a compiler front end for the C, C++, Objective-C, and Objective-C++ programming languages, as well as the OpenMP, OpenCL, RenderScript, CUDA, and HIP frameworks. It acts as a drop-in replacement for the GNU Compiler Collection ...
fork In cutlery or kitchenware, a fork (from la, furca 'pitchfork') is a utensil, now usually made of metal, whose long handle terminates in a head that branches into several narrow and often slightly curved tine (structural), tines with which one ...
with Cilk Plus support. The Intel Compiler, but not the open source implementations, comes with a race detector and a performance analyzer. Intel later discontinued it, recommending its users switch to instead using either
OpenMP OpenMP (Open Multi-Processing) is an application programming interface (API) that supports multi-platform shared-memory multiprocessing programming in C, C++, and Fortran, on many platforms, instruction-set architectures and operating syst ...
or Intel's own TBB library for their parallel programming needs.


Differences between versions

In the original MIT Cilk implementation, the first Cilk keyword is in fact cilk, which identifies a function which is written in Cilk. Since Cilk procedures can call C procedures directly, but C procedures cannot directly call or
spawn Spawn or spawning may refer to: * Spawn (biology), the eggs and sperm of aquatic animals Arts, entertainment, and media * Spawn (character), a fictional character in the comic series of the same name and in the associated franchise ** '' Spawn: A ...
Cilk procedures, this keyword is needed to distinguish Cilk code from C code. Cilk Plus removes this restriction, as well as the cilk keyword, so C and C++ functions can call into Cilk Plus code and vice versa.


Deprecation of Cilk Plus

In May, 2017, GCC 7.1 was released and marked Cilk Plus support as deprecated. Intel itself announced in September 2017 that they would deprecate Cilk Plus with the 2018 release of the Intel Software Development Tools. In May 2018, GCC 8.1 was released with Cilk Plus support removed.


OpenCilk

After Cilk Plus support was deprecated by Intel, MIT has taken on the development of Cilk in the OpenCilk implementation, focusing on the LLVM/Clang fork now termed "Tapir". OpenCilk remains largely compatible with Intel Cilk Plus. Its first stable version was released in March 2021.


Language features

The principle behind the design of the Cilk language is that the programmer should be responsible for ''exposing'' the parallelism, identifying elements that can safely be executed in parallel; it should then be left to the run-time environment, particularly the scheduler, to decide during execution how to actually divide the work between processors. It is because these responsibilities are separated that a Cilk program can run without rewriting on any number of processors, including one.


Task parallelism: spawn and sync

Cilk's main addition to C are two keywords that together allow writing task-parallel programs. * The keyword, when preceding a function call (), indicates that the function call () can safely run in parallel with the statements following it in the calling function. Note that the scheduler is not ''obligated'' to run this procedure in parallel; the keyword merely alerts the scheduler that it can do so. * A statement indicates that execution of the current function cannot proceed until all previously spawned function calls have completed. This is an example of a
barrier A barrier or barricade is a physical structure which blocks or impedes something. Barrier may also refer to: Places * Barrier, Kentucky, a community in the United States * Barrier, Voerendaal, a place in the municipality of Voerendaal, Netherl ...
method. (In Cilk Plus, the keywords are spelled and , or and if the Cilk Plus headers are included.) Below is a recursive implementation of the
Fibonacci Fibonacci (; also , ; – ), also known as Leonardo Bonacci, Leonardo of Pisa, or Leonardo Bigollo Pisano ('Leonardo the Traveller from Pisa'), was an Italian mathematician from the Republic of Pisa, considered to be "the most talented Wester ...
function in Cilk, with parallel recursive calls, which demonstrates the , and keywords. The original Cilk required any function using these to be annotated with the keyword, which is gone as of Cilk Plus. (Cilk program code is not numbered; the numbers have been added only to make the discussion easier to follow.) cilk int fib(int n) If this code was executed by a ''single'' processor to determine the value of , that processor would create a frame for , and execute lines 1 through 5. On line 6, it would create spaces in the frame to hold the values of and . On line 8, the processor would have to suspend the current frame, create a new frame to execute the procedure , execute the code of that frame until reaching a return statement, and then resume the frame with the value of fib(1) placed into 's variable. On the next line, it would need to suspend again to execute and place the result in 's variable. When the code is executed on a ''multiprocessor'' machine, however, execution proceeds differently. One processor starts the execution of ; when it reaches line 8, however, the keyword modifying the call to tells the processor that it can safely give the job to a second processor: this second processor can create a frame for , execute its code, and store its result in 's frame when it finishes; the first processor continues executing the code of at the same time. A processor is not obligated to assign a spawned procedure elsewhere; if the machine only has two processors and the second is still busy on when the processor executing gets to the procedure call, the first processor will suspend and execute itself, as it would if it were the only processor. Of course, if another processor is available, then it will be called into service, and all three processors would be executing separate frames simultaneously. (The preceding description is not entirely accurate. Even though the common terminology for discussing Cilk refers to processors making the decision to spawn off work to other processors, it is actually the scheduler which assigns procedures to processors for execution, using a policy called ''work-stealing'', described later.) If the processor executing were to execute line 13 before both of the other processors had completed their frames, it would generate an incorrect result or an error; would be trying to add the values stored in and , but one or both of those values would be missing. This is the purpose of the keyword, which we see in line 11: it tells the processor executing a frame that it must suspend its own execution until all the procedure calls it has spawned off have returned. When is allowed to proceed past the statement in line 11, it can only be because and have completed and placed their results in and , making it safe to perform calculations on those results. The code example above uses the syntax of Cilk-5. The original Cilk (Cilk-1) used a rather different syntax that required programming in an explicit continuation-passing style, and the Fibonacci examples looks as follows: thread fib(cont int k, int n) thread sum(cont int k, int x, int y) Inside 's recursive case, the keyword indicates the creation of a ''successor'' thread (as opposed to the ''child'' threads created by ), which executes the subroutine after waiting for the ''continuation variables'' and to be filled in by the recursive calls. The base case and use a operation to set their continuation variable to the value of , effectively "returning" the value to the successor thread.


Inlets

The two remaining Cilk keywords are slightly more advanced, and concern the use of ''inlets''. Ordinarily, when a Cilk procedure is spawned, it can return its results to the parent procedure only by putting those results in a variable in the parent's frame, as we assigned the results of our spawned procedure calls in the example to x and y. The alternative is to use an inlet. An inlet is a function internal to a Cilk procedure which handles the results of a spawned procedure call as they return. One major reason to use inlets is that all the inlets of a procedure are guaranteed to operate atomically with regards to each other and to the parent procedure, thus avoiding the bugs that could occur if the multiple returning procedures tried to update the same variables in the parent frame at the same time. * The inlet keyword identifies a function defined within the procedure as an inlet. * The abort keyword can only be used inside an inlet; it tells the scheduler that any other procedures that have been spawned off by the parent procedure can safely be aborted. Inlets were removed when Cilk became Cilk++, and are not present in Cilk Plus.


Parallel loops

Cilk++ added an additional construct, the parallel loop, denoted in Cilk Plus. These loops look like void loop(int *a, int n) This implements the parallel map idiom: the body of the loop, here a call to followed by an assignment to the array , is executed for each value of from zero to in an indeterminate order. The optional "grain size" pragma determines the coarsening: any sub-array of one hundred or fewer elements is processed sequentially. Although the Cilk specification does not specify the exact behavior of the construct, the typical implementation is a divide-and-conquer recursion, as if the programmer had written static void recursion(int *a, int start, int end) void loop(int *a, int n) The reasons for generating a divide-and-conquer program rather than the obvious alternative, a loop that spawn-calls the loop body as a function, lie in both the grainsize handling and in efficiency: doing all the spawning in a single task makes load balancing a bottleneck. A review of various parallel loop constructs on HPCwire found the construct to be quite general, but noted that the Cilk Plus specification did not stipulate that its iterations need to be data-independent, so a compiler cannot automatically vectorize a loop. The review also noted the fact that reductions (e.g., sums over arrays) need additional code.


Reducers and hyperobjects

Cilk++ added a kind of objects called ''hyperobjects'', that allow multiple strands to share state without
race condition A race condition or race hazard is the condition of an electronics, software, or other system where the system's substantive behavior is Sequential logic, dependent on the sequence or timing of other uncontrollable events. It becomes a software ...
s and without using explicit locks. Each strand has a view on the hyperobject that it can use and update; when the strands synchronize, the views are combined in a way specified by the programmer. The most common type of hyperobject is a reducer, which corresponds to the reduction clause in
OpenMP OpenMP (Open Multi-Processing) is an application programming interface (API) that supports multi-platform shared-memory multiprocessing programming in C, C++, and Fortran, on many platforms, instruction-set architectures and operating syst ...
or to the algebraic notion of a
monoid In abstract algebra, a branch of mathematics, a monoid is a set equipped with an associative binary operation and an identity element. For example, the nonnegative integers with addition form a monoid, the identity element being 0. Monoid ...
. Each reducer has an
identity element In mathematics, an identity element, or neutral element, of a binary operation operating on a set is an element of the set that leaves unchanged every element of the set when the operation is applied. This concept is used in algebraic structures su ...
and an
associative operation In mathematics, the associative property is a property of some binary operations, which means that rearranging the parentheses in an expression will not change the result. In propositional logic, associativity is a valid rule of replacement f ...
that combines two values. The archetypal reducer is
summation In mathematics, summation is the addition of a sequence of any kind of numbers, called ''addends'' or ''summands''; the result is their ''sum'' or ''total''. Beside numbers, other types of values can be summed as well: functions, vectors, ma ...
of numbers: the identity element is zero, and the associative ''reduce'' operation computes a sum. This reducer is built into Cilk++ and Cilk Plus: // Compute ∑ foo(i) for i from 0 to N, in parallel. cilk::reducer_opadd result(0); cilk_for (int i = 0; i < N; i++) result += foo(i); Other reducers can be used to construct
linked list In computer science, a linked list is a linear collection of data elements whose order is not given by their physical placement in memory. Instead, each element points to the next. It is a data structure consisting of a collection of nodes which ...
s or strings, and programmers can define custom reducers. A limitation of hyperobjects is that they provide only limited
determinacy Determinacy is a subfield of set theory, a branch of mathematics, that examines the conditions under which one or the other player of a game has a winning strategy, and the consequences of the existence of such strategies. Alternatively and si ...
. Burckhardt ''et al.'' point out that even the sum reducer can result in non-deterministic behavior, showing a program that may produce either or depending on the scheduling order: void add1(cilk::reducer_opadd &r) // ... cilk::reducer_opadd r(0); cilk_spawn add1(r); if (r

0) cilk_sync; output(r.get_value());


Array notation

Intel Cilk Plus adds notation to express high-level operations on entire arrays or sections of arrays; e.g., an axpy-style function that is ordinarily written // y ← α x + y void axpy(int n, float alpha, const float *x, float *y) can in Cilk Plus be expressed as y :n+= alpha * x :n This notation helps the compiler to effectively vectorize the application. Intel Cilk Plus allows C/C++ operations to be applied to multiple array elements in parallel, and also provides a set of built-in functions that can be used to perform vectorized shifts, rotates, and reductions. Similar functionality exists in Fortran 90; Cilk Plus differs in that it never allocates temporary arrays, so memory usage is easier to predict.


Elemental functions

In Cilk Plus, an elemental function is a regular function which can be invoked either on scalar arguments or on array elements in parallel. They are similar to the kernel functions of
OpenCL OpenCL (Open Computing Language) is a software framework, framework for writing programs that execute across heterogeneous computing, heterogeneous platforms consisting of central processing units (CPUs), graphics processing units (GPUs), d ...
.


#pragma simd

This pragma gives the compiler permission to vectorize a loop even in cases where auto-vectorization might fail. It is the simplest way to manually apply vectorization.


Work-stealing

The Cilk scheduler uses a policy called "work-stealing" to divide procedure execution efficiently among multiple processors. Again, it is easiest to understand if we look first at how Cilk code is executed on a single-processor machine. The processor maintains a
stack Stack may refer to: Places * Stack Island, an island game reserve in Bass Strait, south-eastern Australia, in Tasmania’s Hunter Island Group * Blue Stack Mountains, in Co. Donegal, Ireland People * Stack (surname) (including a list of people ...
on which it places each frame that it has to suspend in order to handle a procedure call. If it is executing ''fib(2)'', and encounters a recursive call to ''fib(1)'', it will save ''fib(2)'''s state, including its variables and where the code suspended execution, and put that state on the stack. It will not take a suspended state off the stack and resume execution until the procedure call that caused the suspension, and any procedures called in turn by that procedure, have all been fully executed. With multiple processors, things of course change. Each processor still has a stack for storing frames whose execution has been suspended; however, these stacks are more like
deque In computer science, a double-ended queue (abbreviated to deque, pronounced ''deck'', like "cheque") is an abstract data type that generalizes a queue, for which elements can be added to or removed from either the front (head) or back (tail). I ...
s, in that suspended states can be removed from either end. A processor can still only remove states from its ''own'' stack from the same end that it puts them on; however, any processor which is not currently working (having finished its own work, or not yet having been assigned any) will pick another processor at random, through the scheduler, and try to "steal" work from the opposite end of their stack suspended states, which the stealing processor can then begin to execute. The states which get stolen are the states that the processor stolen from would get around to executing last.


See also

* Grand Central Dispatch *
Intel Concurrent Collections Concurrent Collections (known as CnC) is a programming model for software frameworks to expose parallelism in applications. The Concurrent Collections conception originated from tagged stream processing development with HP TStreams. TStreams Arou ...
(CnC) *
Intel Parallel Building Blocks Intel Parallel Building Blocks (PBB) was a collection of three programming solutions designed for multithreaded parallel computing. PBB consisted of Cilk Plus, Threading Building Blocks (TBB) and Intel Array Building Blocks (ArBB).
(PBB) ** Intel Array Building Blocks (ArBB) *
Intel Parallel Studio Intel Parallel Studio XE was a software development product developed by Intel that facilitated native code development on Windows, macOS and Linux in C++ and Fortran for parallel computing. Parallel programming enables software programs to t ...
*
NESL NESL is a parallel programming language developed at Carnegie Mellon by the SCandAL project and released in 1993. It integrates various ideas from parallel algorithms, functional programming, and array programming languages. The most important ...
*
OpenMP OpenMP (Open Multi-Processing) is an application programming interface (API) that supports multi-platform shared-memory multiprocessing programming in C, C++, and Fortran, on many platforms, instruction-set architectures and operating syst ...
*
Parallel computing Parallel computing is a type of computation in which many calculations or processes are carried out simultaneously. Large problems can often be divided into smaller ones, which can then be solved at the same time. There are several different f ...
*
Sieve C++ Parallel Programming System A sieve, fine mesh strainer, or sift, is a device for separating wanted elements from unwanted material or for controlling the particle size distribution of a sample, using a screen such as a woven mesh or net or perforated sheet material. ...
*
Threading Building Blocks oneAPI Threading Building Blocks (oneTBB; formerly Threading Building Blocks or TBB), is a C++ template library developed by Intel for parallel programming on multi-core processors. Using TBB, a computation is broken down into tasks that ca ...
(TBB) *
Unified Parallel C Unified Parallel C (UPC) is an extension of the C programming language designed for high-performance computing on large-scale parallel machines, including those with a common global address space ( SMP and NUMA) and those with distributed memory ...


References


External links

* for OpenCilk
Intel's Cilk Plus website

Cilk Project website at MIT
* Arch D. Robison
"Cilk Plus: Language Support for Thread and Vector Parallelism"
an
"Parallel Programming with Cilk Plus"
July 16, 2012. {{Parallel computing Articles with example code Concurrent programming languages C programming language family Massachusetts Institute of Technology software Intel software Programming languages created in 1994