A conc-tree
[Prokopec, A. et al. (2015]
Conc-Trees for Functional and Parallel Programming
Research Paper, 2015[Prokopec A. (2014]
Data Structures and Algorithms for Data-Parallel Computing in a Managed Runtime
Doctoral Thesis, 2014 is a
data structure
In computer science, a data structure is a data organization, management, and storage format that is usually chosen for efficient access to data. More precisely, a data structure is a collection of data values, the relationships among them, a ...
that stores element sequences, and provides amortized
O(1) time append and prepend operations, O(log n) time insert and remove operations and O(log n) time concatenation. This data structure is particularly viable for functional task-parallel and data-parallel programming, and is relatively simple to implement compared to other data-structures with similar asymptotic complexity.
Conc-trees were designed to improve efficiency of data-parallel operations that do not require sequential left-to-right iteration order,
[Steele, G. (2009]
Organizing Functional Code for Parallel Execution; or, foldl and foldr Considered Slightly Harmful and improve constant factors in these operations by avoiding unnecessary copies of the data.
Orthogonally, they are used to efficiently aggregate data in functional-style
task parallelism, task-parallel algorithms, as an implementation of the conc-list data abstraction.
[Steel, G. (2011]
How to Think about Parallel Programming: Not! Conc-list is a parallel programming counterpart to
cons, functional cons-lists, and was originally introduced by the
Fortress language.
Operations
The basic conc-tree operation is concatenation. Conc-trees work on the following basic data-types:
trait Conc
case class Empty extends Conc
case class Single elem: T) extends Conc
case class <> left: Conc right: Conc extends Conc
The ''<>'' type represents inner nodes, and is pronounced ''conc'', inspired by ''::'' (the
cons type) in functional lists, used for sequential programming.
Concatenation in O(log n) time then works by ensuring that the difference in levels (i.e. heights) between any two sibling trees is one or less, similar to invariants maintained in
AVL trees
In computer science, an AVL tree (named after inventors Adelson-Velsky and Landis) is a self-balancing binary search tree. It was the first such data structure to be invented. In an AVL tree, the heights of the two child subtrees of any node d ...
. This invariant ensures that the height of the tree (length of the longest path from the root to some leaf) is always logarithmic in the number of elements in the tree. Concatenation is implemented as follows:
def concat(xs: Conc ys: Conc
Amortized O(1) time appends (or prepends) are achieved by introducing a new inner node type called ''Append'', and using it to encode a logarithmic-length list of conc-trees, strictly decreasing in height. Every ''Append'' node ''ap'' must satisfy the following invariants:
1. Level of ''ap.left.right'' is always strictly larger than the level of ''ap.right''.
2. The tree ''ap.right'' never contains any ''Append'' nodes (i.e. it is in the normalized form, composed only from ''<>'', ''Single'' and ''Empty'').
With these invariants, appending is isomorphic to binary number addition—two adjacent trees of the same height can be linked on constant time, with at most a logarithmic number of carry operations. This is illustrated in the following figure, where an element is being appended to a conc-tree that corresponds to a binary number 11:

This binary number representation is similar to that of ''purely functional random access lists'' by Okasaki, with the difference that random access lists require all the trees to be ''complete''
binary tree
In computer science, a binary tree is a k-ary k = 2 tree data structure in which each node has at most two children, which are referred to as the ' and the '. A recursive definition using just set theory notions is that a (non-empty) binary t ...
s, whereas conc-trees are more relaxed, and only require balanced trees. These more relaxed invariants allow conc-trees to retain logarithmic time concatenation, while random access lists allow only O(n) concatenation.
The following is an implementation of an ''append'' method that is worst-case O(log n) time and amortized O(1) time:
case class Append left: Conc right: Conc extends Conc
private def append xs: Append ys: Conc =
if (xs.right.level > ys.level) new Append(xs, ys)
else
}
Conc-tree constructed this way never has more than O(log n) ''Append'' nodes, and can be converted back to the normalized form (one using only ''<>'', ''Single'' and ''Empty'' nodes) in O(log n) time.
A detailed demonstration of these operations can be found in online resources,
Parallel Programming lecture on Conc-Trees at EPFL
/ref> or in the original conc-tree paper. It was shown that these basic operations can be extended to support worst-case O(1) deque
In computer science, a double-ended queue (abbreviated to deque, pronounced ''deck'', like "cheque") is an abstract data type that generalizes a queue, for which elements can be added to or removed from either the front (head) or back (tail). I ...
operations, while keeping the O(log n) concatenation time bound, at the cost of increasing the constant factors of all operations.
References
{{Reflist
Trees (data structures)