Barrier (computer science)

In parallel computing, a barrier is a type of synchronization method. A barrier for a group of threads or processes in the source code means that any thread/process must stop at this point and cannot proceed until all other threads/processes reach this barrier. Many collective routines and directive-based parallel languages impose implicit barriers. For example, a parallel ''do'' loop in Fortran with OpenMP will not be allowed to continue on any thread until the last iteration is completed (an analogous C++ example is sketched below). This is in case the program relies on the result of the loop immediately after its completion. In message passing, any global communication (such as reduction or scatter) may imply a barrier. In concurrent computing, a barrier may be in a ''raised'' or ''lowered'' state. The term latch is sometimes used to refer to a barrier that starts in the raised state and cannot be re-raised once it is in the lowered state. The term count-down latch is sometimes used to refer to a latch that is automatically lowered once a pre-determined number of threads/processes have arrived.
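
As a concrete illustration of such an implicit barrier, the following sketch uses an OpenMP parallel loop in C++ rather than the Fortran form mentioned above; the array and loop body are arbitrary examples.

#include <cstdio>
#include <vector>

int main()
{
    std::vector<int> squares(1000);

    #pragma omp parallel for        // iterations are split among threads; implicit barrier at loop end
    for (int i = 0; i < 1000; ++i)
        squares[i] = i * i;

    // No thread proceeds past the loop until every iteration has finished,
    // so all results can safely be read here.
    std::printf("last value: %d\n", squares[999]);
    return 0;
}

Compiled without OpenMP support, the pragma is simply ignored and the loop runs sequentially, so the example remains valid either way.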


Implementation

The basic barrier has mainly two variables, one of which records the pass/stop state of the barrier, while the other keeps the total number of threads that have entered the barrier. The barrier state is initialized to "stop" by the first threads coming into the barrier. Whenever a thread enters, it checks the number of threads already in the barrier: only if it is the last one does the thread set the barrier state to "pass", so that all the threads can get out of the barrier. Otherwise, the incoming thread is trapped in the barrier and keeps testing whether the barrier state has changed from "stop" to "pass"; it gets out only when the barrier state changes to "pass". A C++ sketch of this procedure, using a struct barrier_type and a routine barrier(barrier_type* b, int p) for p processors, is given at the end of this section.

The potential problem is that, because all the threads repeatedly access the same global variable for the pass/stop state, the communication traffic is rather high, which decreases scalability. This problem can be resolved by regrouping the threads and using a multi-level barrier, e.g. a Combining Tree Barrier. Hardware implementations may also have the advantage of higher scalability.
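
The following is a minimal sketch, not a canonical implementation: it keeps the barrier_type and barrier(barrier_type* b, int p) names used above, but the member variables, the use of C++11 atomics, and the extra departure phase that re-arms the barrier for repeated use are illustrative assumptions.

#include <atomic>

struct barrier_type
{
    std::atomic<int> count{0};   // number of threads currently inside the barrier
    std::atomic<int> flag{0};    // barrier state: 0 = "stop", 1 = "pass"
};

// barrier for p processors
void barrier(barrier_type* b, int p)
{
    // Re-arming phase: do not enter while a previous use is still releasing.
    while (b->flag.load() == 1) {}

    // Arrival: the last thread to arrive switches the state to "pass".
    if (b->count.fetch_add(1) + 1 == p) {
        b->flag.store(1);
    } else {
        // Not the last thread: spin until the state changes to "pass".
        while (b->flag.load() == 0) {}
    }

    // Departure: the last thread to leave resets the state to "stop",
    // making the barrier usable for the next episode.
    if (b->count.fetch_sub(1) - 1 == 0) {
        b->flag.store(0);
    }
}

Every waiting thread spins on the single shared flag, which is the source of the traffic problem described above; the sense-reversal and combining-tree variants below address reuse and contention more directly.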


Sense-Reversal Centralized Barrier

Instead of using the same value to represent pass/stop every time, successive uses of the barrier use opposite values for the pass/stop state, which removes the need to re-arm the barrier in between. For example, if barrier 1 uses 0 to stop the threads, barrier 2 will use 1 to stop threads, and barrier 3 will use 0 to stop threads again, and so on. Each processor keeps a private local_sense value recording which flag value means "pass" for its current barrier. A C++ sketch is given below.
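
As before, this is a minimal sketch under stated assumptions: the names barrier_type, local_sense and barrier(barrier_type* b, int p) follow the text, while the member variables, the C++11 atomics and the use of thread_local storage for the per-processor sense are illustrative choices.

#include <atomic>

struct barrier_type
{
    std::atomic<int> counter{0};   // threads that have arrived at this use of the barrier
    std::atomic<int> flag{0};      // the value that currently means "pass"; alternates 0/1
};

thread_local int local_sense = 0;  // private per processor

// barrier for p processors
void barrier(barrier_type* b, int p)
{
    local_sense = 1 - local_sense;                 // the "pass" value expected this time
    if (b->counter.fetch_add(1) + 1 == p) {
        b->counter.store(0);                       // last arrival resets the counter ...
        b->flag.store(local_sense);                // ... and flips the flag, releasing everyone
    } else {
        while (b->flag.load() != local_sense) {}   // spin until the flag shows this use's "pass" value
    }
}

Because the expected "pass" value alternates on every use, a stale flag value left over from the previous barrier cannot release threads at the next one, so no separate re-arming phase is needed.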


Combining Tree Barrier

A Combining Tree Barrier is a hierarchical way of implementing a barrier that resolves the scalability problem by avoiding the case in which all threads spin on the same location. In a k-Tree Barrier, all threads are equally divided into subgroups of k threads and a first round of synchronization is done within these subgroups. Once all subgroups have completed their synchronization, the first thread in each subgroup enters the second level for further synchronization. In the second level, as in the first, the threads form new subgroups of k threads and synchronize within groups, sending out one thread in each subgroup to the next level, and so on. Eventually, in the final level there is only one subgroup to be synchronized. After the final-level synchronization, the releasing signal is transmitted to the upper levels and all threads get past the barrier. A C++ sketch follows.
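
The following is a minimal sketch of a k-Tree Barrier under stated assumptions: arrivals are combined up a tree of per-node counters so that at most k threads ever update the same counter, while the release, for simplicity, still goes through a single shared sense flag; the class and member names are illustrative, not taken from any particular library.

#include <algorithm>
#include <atomic>
#include <memory>
#include <vector>

struct tree_node
{
    std::atomic<int> count{0};   // arrivals at this node in the current barrier episode
    int expected = 0;            // arrivals needed before this node reports upward (at most k)
    tree_node* parent = nullptr; // null at the root
};

struct tree_barrier
{
    std::vector<std::unique_ptr<tree_node>> nodes;
    std::vector<tree_node*> leaf_of;   // leaf node assigned to each thread id
    std::atomic<int> release{0};       // shared sense flag, flipped at the root

    // Build a tree for p threads with fan-in k (assumes p >= 1 and k >= 2).
    tree_barrier(int p, int k) : leaf_of(p)
    {
        std::vector<tree_node*> level;
        for (int i = 0; i < p; i += k) {               // one leaf per subgroup of k threads
            nodes.push_back(std::make_unique<tree_node>());
            tree_node* n = nodes.back().get();
            n->expected = std::min(k, p - i);
            for (int t = i; t < std::min(i + k, p); ++t) leaf_of[t] = n;
            level.push_back(n);
        }
        while (level.size() > 1) {                     // one parent per subgroup of k nodes
            std::vector<tree_node*> next;
            for (std::size_t i = 0; i < level.size(); i += k) {
                nodes.push_back(std::make_unique<tree_node>());
                tree_node* n = nodes.back().get();
                std::size_t end = std::min(i + k, level.size());
                n->expected = static_cast<int>(end - i);
                for (std::size_t j = i; j < end; ++j) level[j]->parent = n;
                next.push_back(n);
            }
            level = std::move(next);
        }
    }

    // Called by thread 'tid' with its own local_sense variable (initially 0).
    void wait(int tid, int& local_sense)
    {
        local_sense = 1 - local_sense;
        tree_node* cur = leaf_of[tid];
        // The last thread to arrive at each node carries the arrival up to the next level.
        while (cur != nullptr && cur->count.fetch_add(1) + 1 == cur->expected) {
            cur->count.store(0);                       // reset the node for the next episode
            cur = cur->parent;
        }
        if (cur == nullptr) {
            release.store(local_sense);                // last arrival at the root releases everyone
        } else {
            while (release.load() != local_sense) {}   // spin until released
        }
    }
};

With k equal to the total number of threads this degenerates into the centralized barrier above; smaller values of k trade additional tree levels for less contention on each counter.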


Hardware Barrier Implementation

The hardware barrier uses hardware to implement the above basic barrier model. The simplest hardware implementation uses dedicated wires to transmit the barrier signals. These dedicated wires perform an OR/AND operation to act as the pass/block flag and thread counter. For small systems such a model works, and communication speed is not a major concern. In large multiprocessor systems this hardware design can give the barrier implementation high latency. A dedicated network connection among processors, analogous in structure to the Combining Tree Barrier, is one implementation used to lower the latency (N.R. Adiga, et al., "An Overview of the BlueGene/L Supercomputer", ''Proceedings of the Conference on High Performance Networking and Computing'', 2002).


See also

* Fork–join model
* Rendezvous (Plan 9)
* Memory barrier

