In
parallel computing
Parallel computing is a type of computation in which many calculations or processes are carried out simultaneously. Large problems can often be divided into smaller ones, which can then be solved at the same time. There are several different fo ...
, an embarrassingly parallel workload or problem (also called embarrassingly parallelizable, perfectly parallel, delightfully parallel or pleasingly parallel) is one where little or no effort is needed to separate the problem into a number of parallel tasks. This is often the case where there is little or no dependency or need for communication between those parallel tasks, or for results between them.
[Section 1.4.4 of: ]
Thus, these are different from
distributed computing
A distributed system is a system whose components are located on different computer network, networked computers, which communicate and coordinate their actions by message passing, passing messages to one another from any system. Distributed com ...
problems that need communication between tasks, especially communication of intermediate results. They are easy to perform on
server farm
A server farm or server cluster is a collection of Server (computing), computer servers, usually maintained by an organization to supply server functionality far beyond the capability of a single machine. They often consist of thousands of compu ...
s which lack the special infrastructure used in a true
supercomputer
A supercomputer is a computer with a high level of performance as compared to a general-purpose computer. The performance of a supercomputer is commonly measured in floating-point operations per second ( FLOPS) instead of million instructions ...
cluster. They are thus well suited to large, Internet-based
volunteer computing
Volunteer computing is a type of distributed computing in which people donate their computers' unused resources to a research-oriented project, and sometimes in exchange for credit points. The fundamental idea behind it is that a modern desktop co ...
platforms such as
BOINC
The Berkeley Open Infrastructure for Network Computing (BOINC, pronounced – rhymes with "oink") is an open-source middleware system for volunteer computing (a type of distributed computing). Developed originally to support SETI@home, it beca ...
, and do not suffer from
parallel slowdown
Parallel slowdown is a phenomenon in parallel computing where parallelization of a parallel algorithm beyond a certain point causes the program to run slower (take more time to run to completion).
Parallel slowdown is typically the result of a ...
. The opposite of embarrassingly parallel problems are
inherently serial problems, which cannot be parallelized at all.
A common example of an embarrassingly parallel problem is 3D video rendering handled by a
graphics processing unit
A graphics processing unit (GPU) is a specialized electronic circuit designed to manipulate and alter memory to accelerate the creation of images in a frame buffer intended for output to a display device. GPUs are used in embedded systems, mobi ...
, where each frame (forward method) or pixel (
ray tracing method) can be handled with no interdependency.
Some forms of
password cracking
In cryptanalysis and computer security, password cracking is the process of recovering passwords from data that has been stored in or transmitted by a computer system in scrambled form. A common approach (brute-force attack) is to repeatedly try ...
are another embarrassingly parallel task that is easily distributed on
central processing unit
A central processing unit (CPU), also called a central processor, main processor or just processor, is the electronic circuitry that executes instructions comprising a computer program. The CPU performs basic arithmetic, logic, controlling, an ...
s,
CPU core
A central processing unit (CPU), also called a central processor, main processor or just processor, is the electronic circuitry that executes instructions comprising a computer program. The CPU performs basic arithmetic, logic, controlling, and ...
s, or clusters.
Etymology
"Embarrassingly" is used here in the same sense as in the phrase "an
embarrassment of riches
An embarrassment of riches is an idiom that means an overabundance of something, or too much of a good thing, that originated in 1738 as John Ozell's translation of a French play, ''L'Embarras des richesses'' (1726), by Léonor Jean Christine ...
", meaning an overabundance—here referring to parallelization problems which are "embarrassingly easy". The term may also imply embarrassment on the part of developers or compilers: "Because so many important problems remain unsolved mainly due to their intrinsic computational complexity, it would be embarrassing not to develop parallel implementations of polynomial
homotopy
In topology, a branch of mathematics, two continuous functions from one topological space to another are called homotopic (from grc, ὁμός "same, similar" and "place") if one can be "continuously deformed" into the other, such a deforma ...
continuation methods." The term is first found in the literature in a 1986 book on multiprocessors by
MATLAB
MATLAB (an abbreviation of "MATrix LABoratory") is a proprietary multi-paradigm programming language and numeric computing environment developed by MathWorks. MATLAB allows matrix manipulations, plotting of functions and data, implementation ...
's creator
Cleve Moler
Cleve Barry Moler is an American mathematician and computer programmer specializing in numerical analysis. In the mid to late 1970s, he was one of the authors of LINPACK and EISPACK, Fortran libraries for numerical computing. He invented MATLA ...
,
who claims to have invented the term.
An alternative term, ''pleasingly parallel'', has gained some use, perhaps to avoid the negative connotations of embarrassment in favor of a positive reflection on the parallelizability of the problems: "Of course, there is nothing embarrassing about these programs at all."
Examples
Some examples of embarrassingly parallel problems include:
*
Monte Carlo analysis
Monte Carlo methods, or Monte Carlo experiments, are a broad class of computational algorithms that rely on repeated random sampling to obtain numerical results. The underlying concept is to use randomness to solve problems that might be determini ...
* Distributed relational database queries usin
distributed set processing
*
Numerical integration
In analysis, numerical integration comprises a broad family of algorithms for calculating the numerical value of a definite integral, and by extension, the term is also sometimes used to describe the numerical solution of differential equations ...
* Serving static files on a webserver to multiple users at once.
* Bulk processing of unrelated files of similar nature in general, such as photo gallery resizing and conversion.
* The
Mandelbrot set
The Mandelbrot set () is the set of complex numbers c for which the function f_c(z)=z^2+c does not diverge to infinity when iterated from z=0, i.e., for which the sequence f_c(0), f_c(f_c(0)), etc., remains bounded in absolute value.
This ...
,
Perlin noise
Perlin noise is a type of gradient noise developed by Ken Perlin.
History
Ken Perlin developed Perlin noise in 1983 as a result of his frustration with the "machine-like" look of computer-generated imagery (CGI) at the time. He formally descr ...
and similar images, where each point is calculated independently.
*
Rendering of
computer graphics
Computer graphics deals with generating images with the aid of computers. Today, computer graphics is a core technology in digital photography, film, video games, cell phone and computer displays, and many specialized applications. A great de ...
. In
computer animation
Computer animation is the process used for digitally generating animations. The more general term computer-generated imagery (CGI) encompasses both static scenes (still images) and dynamic images (moving images), while computer animation refe ...
, each
frame
A frame is often a structural system that supports other components of a physical construction and/or steel frame that limits the construction's extent.
Frame and FRAME may also refer to:
Physical objects
In building construction
*Framing (con ...
or pixel may be rendered independently (see
parallel rendering Parallel rendering (or distributed rendering) is the application of parallel programming to the computational domain of computer graphics. Rendering graphics can require massive computational resources for complex scenes that arise in scientific vi ...
).
* Some
brute-force search
In computer science, brute-force search or exhaustive search, also known as generate and test, is a very general problem-solving technique and algorithmic paradigm that consists of systematically enumerating all possible candidates for the soluti ...
es in
cryptography
Cryptography, or cryptology (from grc, , translit=kryptós "hidden, secret"; and ''graphein'', "to write", or ''-logia'', "study", respectively), is the practice and study of techniques for secure communication in the presence of adver ...
. Notable real-world examples include
distributed.net and
proof-of-work
Proof of work (PoW) is a form of cryptographic proof in which one party (the ''prover'') proves to others (the ''verifiers'') that a certain amount of a specific computational effort has been expended. Verifiers can subsequently confirm this exp ...
systems used in
cryptocurrency
A cryptocurrency, crypto-currency, or crypto is a digital currency designed to work as a medium of exchange through a computer network that is not reliant on any central authority, such as a government or bank, to uphold or maintain it. It i ...
.
*
BLAST
Blast or The Blast may refer to:
* Explosion, a rapid increase in volume and release of energy in an extreme manner
*Detonation, an exothermic front accelerating through a medium that eventually drives a shock front
Film
* ''Blast'' (1997 film) ...
searches in
bioinformatics
Bioinformatics () is an interdisciplinary field that develops methods and software tools for understanding biological data, in particular when the data sets are large and complex. As an interdisciplinary field of science, bioinformatics combi ...
with split databases.
* Large scale
facial recognition system
A facial recognition system is a technology capable of matching a human face from a digital image or a video frame against a database of faces. Such a system is typically employed to authenticate users through ID verification services, and wo ...
s that compare thousands of arbitrary acquired faces (e.g., a security or surveillance video via
closed-circuit television
Closed-circuit television (CCTV), also known as video surveillance, is the use of video cameras to transmit a signal to a specific place, on a limited set of monitors. It differs from broadcast television in that the signal is not openly t ...
) with similarly large number of previously stored faces (e.g., a ''
rogues gallery
A rogues' gallery (or rogues gallery) is a police collection of mug shots or other images of criminal suspects kept for identification purposes.
History
In 1855, Allan Pinkerton, founder of the Pinkerton National Detective Agency, established a ...
'' or similar
watch list).
* Computer simulations comparing many independent scenarios.
*
Genetic algorithm
In computer science and operations research, a genetic algorithm (GA) is a metaheuristic inspired by the process of natural selection that belongs to the larger class of evolutionary algorithms (EA). Genetic algorithms are commonly used to gene ...
s.
*
Ensemble calculations of
numerical weather prediction
Numerical weather prediction (NWP) uses mathematical models of the atmosphere and oceans to predict the weather based on current weather conditions. Though first attempted in the 1920s, it was not until the advent of computer simulation in th ...
.
* Event simulation and reconstruction in
particle physics
Particle physics or high energy physics is the study of fundamental particles and forces that constitute matter and radiation. The fundamental particles in the universe are classified in the Standard Model as fermions (matter particles) an ...
.
* The
marching squares algorithm.
* Sieving step of the
quadratic sieve The quadratic sieve algorithm (QS) is an integer factorization algorithm and, in practice, the second fastest method known (after the general number field sieve). It is still the fastest for integers under 100 decimal digits or so, and is considerab ...
and the
number field sieve
In number theory, the general number field sieve (GNFS) is the most efficient classical algorithm known for factoring integers larger than . Heuristically, its complexity for factoring an integer (consisting of bits) is of the form
:\exp\left( ...
.
* Tree growth step of the
random forest
Random forests or random decision forests is an ensemble learning method for classification, regression and other tasks that operates by constructing a multitude of decision trees at training time. For classification tasks, the output of th ...
machine learning technique.
*
Discrete Fourier transform
In mathematics, the discrete Fourier transform (DFT) converts a finite sequence of equally-spaced samples of a function into a same-length sequence of equally-spaced samples of the discrete-time Fourier transform (DTFT), which is a complex- ...
where each harmonic is independently calculated.
*
Convolutional neural network
In deep learning, a convolutional neural network (CNN, or ConvNet) is a class of artificial neural network (ANN), most commonly applied to analyze visual imagery. CNNs are also known as Shift Invariant or Space Invariant Artificial Neural Netwo ...
s running on
GPU
A graphics processing unit (GPU) is a specialized electronic circuit designed to manipulate and alter memory to accelerate the creation of images in a frame buffer intended for output to a display device. GPUs are used in embedded systems, mobi ...
s.
*
Hyperparameter grid search in machine learning.
* Parallel search in
constraint programming
Constraint programming (CP) is a paradigm for solving combinatorial problems that draws on a wide range of techniques from artificial intelligence, computer science, and operations research. In constraint programming, users declaratively state th ...
Implementations
* In
R (programming language)
R is a programming language for statistical computing and graphics supported by the R Core Team and the R Foundation for Statistical Computing. Created by statisticians Ross Ihaka and Robert Gentleman, R is used among data miners, bioinform ...
– The Simple Network of Workstations (SNOW) package implements a simple mechanism for using a set of workstations or a
Beowulf cluster
A Beowulf cluster is a computer cluster of what are normally identical, commodity-grade computers networked into a small local area network with libraries and programs installed which allow processing to be shared among them. The result is a hig ...
for embarrassingly parallel computations.
Simple Network of Workstations (SNOW) package
/ref> Similar R packages include "future", "parallel" and others.
See also
* Amdahl's law
In computer architecture, Amdahl's law (or Amdahl's argument) is a formula which gives the theoretical speedup in latency of the execution of a task at fixed workload that can be expected of a system whose resources are improved. It states that ...
defines value '' P'', which would be almost or exactly equal to 1 for embarrassingly parallel problems.
* Map (parallel pattern) Map is an idiom in parallel computing where a simple operation is applied to all elements of a sequence, potentially in parallel. It is used to solve embarrassingly parallel problems: those problems that can be decomposed into independent subtask ...
*Multiprocessing
Multiprocessing is the use of two or more central processing units (CPUs) within a single computer system. The term also refers to the ability of a system to support more than one processor or the ability to allocate tasks between them. There ar ...
*Massively parallel
Massively parallel is the term for using a large number of computer processors (or separate computers) to simultaneously perform a set of coordinated computations in parallel. GPUs are massively parallel architecture with tens of thousands of t ...
*Parallel computing
Parallel computing is a type of computation in which many calculations or processes are carried out simultaneously. Large problems can often be divided into smaller ones, which can then be solved at the same time. There are several different fo ...
*Process-oriented programming
Process-oriented programming is a programming paradigm that separates the concerns of data structures and the concurrent processes that act upon them. The data structures in this case are typically persistent, complex, and large scale - the subject ...
*Shared-nothing architecture A shared-nothing architecture (SN) is a distributed computing architecture in which each update request is satisfied by a single node (processor/memory/storage unit) in a computer cluster. The intent is to eliminate contention among nodes. Nodes do ...
(SN)
*Symmetric multiprocessing
Symmetric multiprocessing or shared-memory multiprocessing (SMP) involves a multiprocessor computer hardware and software architecture where two or more identical processors are connected to a single, shared main memory, have full access to all ...
(SMP)
*Connection Machine
A Connection Machine (CM) is a member of a series of massively parallel supercomputers that grew out of doctoral research on alternatives to the traditional von Neumann architecture of computers by Danny Hillis at Massachusetts Institute of Techno ...
*Cellular automaton
A cellular automaton (pl. cellular automata, abbrev. CA) is a discrete model of computation studied in automata theory. Cellular automata are also called cellular spaces, tessellation automata, homogeneous structures, cellular structures, tessel ...
* CUDA framework
*Manycore processor
Manycore processors are special kinds of multi-core processors designed for a high degree of parallel processing, containing numerous simpler, independent processor cores (from a few tens of cores to thousands or more). Manycore processors are us ...
*Vector processor
In computing, a vector processor or array processor is a central processing unit (CPU) that implements an instruction set where its instructions are designed to operate efficiently and effectively on large one-dimensional arrays of data called ...
References
External links
Embarrassingly Parallel Computations
Engineering a Beowulf-style Compute Cluster
*
Star-P: High Productivity Parallel Computing
{{Parallel computing
Parallel computing
Distributed computing problems
Applications of distributed computing