HOME

TheInfoList



OR:

PAM (Parallel Augmented Maps) is an open-source parallel C++ library implementing the interface for sequence, ordered sets, ordered
maps A map is a symbolic depiction of interrelationships, commonly spatial, between things within a space. A map may be annotated with text and graphics. Like any graphic, a map may be fixed to paper or other durable media, or may be displayed on ...
, and
augmented map In computer science, an associative array, key-value store, map, symbol table, or dictionary is an abstract data type that stores a collection of (key, value) pairs, such that each possible key appears at most once in the collection. In mathem ...
s. The library is available on GitHub. It uses the underlying balanced binary tree structure using join-based algorithms. PAM supports four balancing schemes, including
AVL trees In computer science, an AVL tree (named after inventors Adelson-Velsky and Landis) is a self-balancing binary search tree. In an AVL tree, the heights of the two child subtrees of any node differ by at most one; if at any time they differ by m ...
, red-black trees,
treap In computer science, the treap and the randomized binary search tree are two closely related forms of binary search tree data structures that maintain a dynamic set of ordered keys and allow binary searches among the keys. After any sequence of i ...
s and weight-balanced trees. PAM is a parallel library and is also safe for concurrency. Its parallelism can be supported by
cilk Cilk, Cilk++, Cilk Plus and OpenCilk are general-purpose programming languages designed for multithreaded parallel computing. They are based on the C and C++ programming languages, which they extend with constructs to express parallel loop ...
,
OpenMP OpenMP is an application programming interface (API) that supports multi-platform shared-memory multiprocessing programming in C, C++, and Fortran, on many platforms, instruction-set architectures and operating systems, including Solaris, ...
or the scheduler in PBBS. Theoretically, all algorithms in PAM are work-efficient and have polylogarithmic depth. PAM uses underlying persistent tree structure such that multi-versioning is allowed. PAM also supports efficient GC.


Interface


Sequences

To define a sequence, users need to specify the key type of the sequence. PAM supports functions on sequences including construction, find an entry with a certain rank, first, last, next, previous, size, empty, filter, map-reduce, concatenating, etc.


Ordered sets

To define an ordered set, users need to specify the key type and the comparison function defining a
total ordering In mathematics, a total order or linear order is a partial order in which any two elements are comparable. That is, a total order is a binary relation \leq on some set X, which satisfies the following for all a, b and c in X: # a \leq a ( re ...
on the key type. On top of the sequence interface, PAM also supports functions for ordered sets including insertion, deletion, union,
intersection In mathematics, the intersection of two or more objects is another object consisting of everything that is contained in all of the objects simultaneously. For example, in Euclidean geometry, when two lines in a plane are not parallel, their ...
,
difference Difference commonly refers to: * Difference (philosophy), the set of properties by which items are distinguished * Difference (mathematics), the result of a subtraction Difference, The Difference, Differences or Differently may also refer to: Mu ...
, etc.


Ordered maps

To define an ordered map, users need to specify the key type, the comparison function on the key type, and the value type. On top of the ordered set interface, PAM also supports functions for ordered maps, such as insertion with combining values.


Augmented maps

To define an
augmented map In computer science, an associative array, key-value store, map, symbol table, or dictionary is an abstract data type that stores a collection of (key, value) pairs, such that each possible key appears at most once in the collection. In mathem ...
, users need to specify the key type, the comparison function on the key type, the value type, the augmented value type, the base function, the combine function and the identity of the combine function. On top of the ordered map interface, PAM also supports functions for augmented maps, such as aug_range. In addition to the tree structures, PAM also implements the prefix structure for augmented maps.


Implementation for Example Applications

The library also provides example implementations for a number of applications, including 1D stabbing query (using interval trees, 2D range query (using a
range tree In computer science, a range tree is an ordered tree data structure, ordered tree data structure to hold a list of points. It allows all points within a given range to be Range query (computer science), reported efficiently, and is typically used ...
and a sweepline algorithm), 2D segment query (using a
segment tree In computer science, the segment tree is a data structure used for storing information about Interval (mathematics), intervals or segments. It allows querying which of the stored segments contain a given point. A similar data structure is the i ...
and a sweepline algorithm), 2D rectangle query (using a tree structure and a sweepline algorithm), inverted index searching, etc.


Used in applications

The library has been tested in various applications, including database benchmarks, 2D
segment tree In computer science, the segment tree is a data structure used for storing information about Interval (mathematics), intervals or segments. It allows querying which of the stored segments contain a given point. A similar data structure is the i ...
, 2D
interval tree In computer science, an interval tree is a tree data structure to hold intervals. Specifically, it allows one to efficiently find all intervals that overlap with any given interval or point. It is often used for windowing queries, for instance, t ...
,
inverted index In computer science, an inverted index (also referred to as a postings list, postings file, or inverted file) is a database index storing a mapping from content, such as words or numbers, to its locations in a table, or in a document or a set of d ...
and
multiversion concurrency control Multiversion concurrency control (MCC or MVCC), is a non-locking concurrency control method commonly used by database management systems to provide concurrent access to the database and in programming languages to implement transactional memory. ...
.


References

{{reflist


External links


PAM
the parallel augmented map library. C++ libraries Computer libraries