computing Computing is any goal-oriented activity requiring, benefiting from, or creating computer, computing machinery. It includes the study and experimentation of algorithmic processes, and the development of both computer hardware, hardware and softw ...

, a rose tree is a term for the value of a

tree data structure In computer science, a tree is a widely used abstract data type that represents a hierarchical tree structure with a set of connected nodes. Each node in the tree can be connected to many children (depending on the type of tree), but must be conn ...

with a variable and unbounded number of branches per node. The term is mostly used in the

functional programming In computer science, functional programming is a programming paradigm where programs are constructed by Function application, applying and Function composition (computer science), composing Function (computer science), functions. It is a declarat ...

community, e.g., in the context of the

Bird–Meertens formalism The Bird–Meertens formalism (BMF) is a calculus for deriving programs from program specifications (in a functional programming setting) by a process of equational reasoning. It was devised by Richard Bird and Lambert Meertens as part of their ...

. Apart from the multi-branching property, the most essential characteristic of rose trees is the coincidence of

bisimilarity In theoretical computer science a bisimulation is a binary relation between state transition systems, associating systems that behave in the same way in that one system simulates the other and vice versa. Intuitively two systems are bisimilar if ...

with

identity Identity may refer to: * Identity document * Identity (philosophy) * Identity (social science) * Identity (mathematics) Arts and entertainment Film and television * ''Identity'' (1987 film), an Iranian film * ''Identity'' (2003 film), an ...

: two distinct rose trees are never bisimilar.

Naming

The name "rose tree" was coined by

Lambert Meertens Lambert Guillaume Louis Théodore Meertens or L.G.L.T. Meertens (born 10 May 1944, in Amsterdam) is a Dutch computer scientist and professor. , he is a researcher at the Kestrel Institute, a nonprofit computer science research center in Palo Al ...

to evoke the similarly named, and similarly structured, common rhododendron.

We shall call such trees ''rose trees'', a literal translation of ''rhododendron'' (Greek = rose, = tree), because of resemblance to the habitus of this shrub, except that the latter does not grow upside-down on the Northern hemisphere.

Recursive definition

Well-founded In mathematics, a binary relation is called well-founded (or wellfounded or foundational) on a set (mathematics), set or, more generally, a Class (set theory), class if every non-empty subset has a minimal element with respect to ; that is, t ...

rose trees can be defined by a recursive construction of entities of the following types:

A ''base entity'' is an element of a predefined ground set of values (the "tip"-values).
A ''branching entity'' (alternatively, a ''forking entity'' or a ''forest entity'') is either of the following sub-types:
1. A ''set'' of entities.
2. A ''sequence'' of entities.
3. A partial ''map'' from a predefined set of ''names'' to entities.
Any of (a)(b)(c) can be empty. Note that (b) can be seen as a special case of (c) – a sequence is just a map from an initial segment of the set \mathbb of natural numbers.
A ''pairing entity'' is an ordered pair such that is a branching entity and is an element of a predefined set of "label" values. Since a pairing entity can only contain a branching entity as its component, there is an induced division into sub-types (3a), (3b) or (3c) corresponding to sub-types of branching entities.

Typically, only some combinations of entity types are used for the construction. The original paper only considers 1+2b ("sequence-forking" rose trees) and 1+2a ("set-forking" rose trees). In later literature, the 1+2b variant is usually introduced by the following definition:

data Tree a = Leaf a ,  Node  ree a

A rose tree ..is either a leaf containing a value, or a node that can have an arbitrary list of subtrees. The most common definition used in functional programming (particularly in

Haskell Haskell () is a general-purpose, statically typed, purely functional programming language with type inference and lazy evaluation. Designed for teaching, research, and industrial applications, Haskell pioneered several programming language ...

) combines 3+2b:

data Rose α = Node α  ose α

''An element of Rose α consists of a labelled node together with a list of subtrees''. That is, a rose tree is a pairing entity (type 3) whose branching entity is a sequence (thus of type 2b) of rose trees. Sometimes even the combination 1+3b is considered. The following table provides a summary of the most established combinations of entities.

''Notes:''

General definition

General rose trees can be defined via bisimilarity of accessible pointed multidigraphs with appropriate labelling of nodes and arrows. These structures are generalization of the notion of accessible pointed graph (abbreviated as ''apg'') from

non-well-founded set theory Non-well-founded set theories are variants of axiomatic set theory that allow sets to be elements of themselves and otherwise violate the rule of well-foundedness. In non-well-founded set theories, the foundation axiom of ZFC is replaced by axio ...

. We will use the ''apq'' acronym for the below described multidigraph structures. This is meant as an abbreviation of "accessible pointed quiver" where ''

quiver A quiver is a container for holding arrows or Crossbow bolt, bolts. It can be carried on an archer's body, the bow, or the ground, depending on the type of shooting and the archer's personal preference. Quivers were traditionally made of leath ...

'' is an established synonym for "multidigraph". In a correspondence to the types of entities used in the recursive definition, each node of an apq is assigned a type (1), (2a), (2b), (2c) or (3). The apqs are subject to conditions that mimic the properties of recursively constructed entities.

1. A node of type (1) is an element of the predefined set of ground values.
2. A node of type (1) does not appear as the source of an arrow.
1. A node of type (3) appears as the source of exactly one arrow.
2. The target of the arrow mentioned in (a) is a node of type (2).
Two distinct arrows with the same source node of type (2a) have distinct targets.
A node is labelled iff it is of type (3). The label belongs to the predefined set .
1. An arrow is labelled by an index from $\mathbb$ if its source node is of type (2b).
2. An arrow is labelled by a name from a predefined set if its source node is of type (2c).
3. Otherwise an arrow is unlabelled.
Labels of arrows with the same source node are distinct.
Labels of arrows with the same source node of type (2b) form an initial segment of $\mathbb$ .

A ''bisimilarity'' between apqs and is a relation between nodes such that the roots of and are -related and for every pair of -related nodes, the following are satisfied:

The nodes and have the same type.
If and are of type (1) then they are identical.
If and are of type (3) then they have the same label.
For every arrow of whose source node is there exists an arrow of whose source is and
1. the target nodes of and are -related,
2. the labels of and , if defined, are identical.
A symmetric condition is satisfied with and interchanged.

Two apqs and are said to be ''bisimilar'' if there exists a bisimilarity relation for them. This establishes an equivalence relation on the class of all apqs. A ''rose tree'' is then some fixed representation of the class of apqs that are bisimilar to some given apq . If the root node of is of type (1) then , thus can be represented by this root node. Otherwise, is a

proper class Proper may refer to: Mathematics * Proper map, in topology, a property of continuous function between topological spaces, if inverse images of compact subsets are compact * Proper morphism, in algebraic geometry, an analogue of a proper map f ...

– in this case the representation can be provided by

Scott's trick In set theory, Scott's trick is a method for giving a definition of equivalence classes for equivalence relations on a proper class (Jech 2003:65) by referring to levels of the cumulative hierarchy. The method relies on the axiom of regularity bu ...

to be the set of those elements of that have the lowest rank. As a result of the above set-theoretic construction, the class of all rose trees is defined, depending on the sets (ground values), (arrow names) and (node labels) as the definitory constituents. Subsequently, the structure of apqs can be carried over to a labelled multidigraph structure over . That is, elements of can themselves be considered as "nodes" with induced type assignment, node labelling and arrows. The class of arrows is a subclass of , that is, arrows are either source-target couples or source-label-target triples according to the type of the source. For every element of there is an induced apq such that is the root node of and the respective sets and of nodes and arrows of are formed by those elements of and that are accessible via a path of arrows starting at . The induced apq is bisimilar to apqs used for the construction of .

Pathname maps

Rose trees that do not contain set-branching nodes (type 2a) can be represented by pathname maps. A ''pathname'' is just a finite sequence of arrow labels. For an arrow path (a finite sequence of consecutive arrows), the pathname of is the corresponding sequence of arrow labels. Here it is assumed that each arrow is labelled ( denotes the labelling function). In general, each arrow path needs to be first reduced by removing all its arrows sourced at pairing nodes (type 3). A pathname is ''resolvable'' iff there exists a root-originating arrow path whose pathname is . Such is uniquely given up to a possible unlabelled last arrow (sourced at a pairing node). The ''target node'' of a non-empty resolvable path is the target node of the last arrow of the correspondent root-originating arrow path that does not end with an unlabelled arrow. The target of the empty path is the root node. Given a rose tree that does not contain set-branching nodes, the ''pathname map'' of is a map that assigns each resolvable pathname its ''value'' according to the following general scheme:

Recall that is the set of arrow labels (

\mathbb

is the set of natural numbers and is the set of arrow names) is the set of node labels, and is the set of ground values. The additional symbols and respectively mean an indicator of a resolvable pathname and the set of type tags, . The map is defined by the following prescription ( denotes the target of ): : It can be shown that different rose trees have different pathname maps. For "homogeneous" rose trees there is no need for type tagging, and their pathname map can be defined as summarized below:

In each case, there is a simple axiomatization in terms of pathnames: # is a non-empty prefix-closed subset of or . In case of , also needs to be "left-sibling-closed" to form a ''tree domain'', see Encoding by sequences. #In case of a nested list or a nested dictionary value, if is a pathname that is non-maximal in , then . In particular, a rose tree in the most common "Haskell" sense is just a map from a non-empty prefix-closed and left-sibling-closed set of finite sequences of natural numbers to a set . Such a definition is mostly used outside the branch of functional programming, see Tree (automata theory). Typically, documents that use this definition do not mention the term "rose tree" at all. ''Notes:''

Examples

The diagrams below show two examples of rose trees together with the correspondent Haskell code. In both cases, the Data.Tree module is used as it is provided by the Haskell containers package. The module introduces rose trees as pairing entities by the following definition: data Tree a = Node Both examples are contrived so as to demonstrate the concept of "sharing of substructures" which is a distinguished feature of rose trees. In both cases, the labelling function is injective (so that the labels 'a', 'b', 'c' or 'd' uniquely identify a subtree / node) which does not need to be satisfied in general. The natural numbers (0,1,2 or 3) along the arrows indicate the zero-based position in which a tree appears in the subForest sequence of a particular "super-tree". As a consequence of possible repetitions in subForest, there can be multiple arrows between nodes. In each of the examples, the rose tree in question is labelled by 'a' and equals the value of the a variable in the code. In both diagrams, the tree is pointed to by a source-less arrow.

Well-founded rose tree

import Data.Tree main :: IO () main = do let d = Node let c = Node let b = Node let a = Node print a

Non-well-founded rose tree

import Data.Tree main :: IO () main = do let root x = case x of 'a' -> (x, ,'b' 'b' -> (x, ,'c' 'c' -> (x, ,'a' let a = unfoldTree root 'a' putStrLn (take 900 (show a) ++ " ... (and so on)")

The first example presents a well-founded rose tree a obtained by an incremental construction. First d is constructed, then c then b and finally a. The rose tree can be represented by the pathname map shown on the left. The second example presents a non-well-founded rose tree a built by a breadth-first constructor unfoldTree. The rose tree is a Moore machine, see

notes Note, notes, or NOTE may refer to: Music and entertainment * Musical note, a pitched sound (or a symbol for a sound) in music * ''Notes'' (album), a 1987 album by Paul Bley and Paul Motian * ''Notes'', a common (yet unofficial) shortened versi ...

above. Its pathname map is defined by be respectively equal to or or according to where is the number of occurrences of in .

Relation to tree data structures

The general definition provides a connection to tree data structures: :''Rose trees are tree structures modulo bisimilarity.''

Mapping tree data structures to their values

The "tree structures" are those apqs (labelled multidigraphs from the general definition) in which each node is accessible by a ''unique'' arrow path. Every rose tree is bisimilar to such a tree structure (since every apq is bisimilar to its unfolding) and every such tree structure is bisimilar to ''exactly one'' rose tree which can therefore be regarded as the ''value'' of the tree structure. The diagram on the right shows an example of such a structure-to-value mapping. In the upper part of the diagram, a node-labelled

ordered tree In graph theory, a tree is an undirected graph in which any two vertices are connected by path, or equivalently a connected acyclic undirected graph. A forest is an undirected graph in which any two vertices are connected by path, or equival ...

is displayed, containing 23 nodes. In the lower part, a rose tree is shown that is the value of . (In both and , sibling arrows are implicitly ordered from left to right.) There is an induced subtree-to-subvalue mapping which is partially displayed by blue arrows. Observe that the mapping is many-to-one: distinct tree data structures can have the same value. As a particular consequence, a rose tree in general is not a tree in terms of "subvalue" relationship between its subvalues, see #Terminological_controversy.

Tree data type

The value mapping described above can be used to clarify the difference between the terms "tree data structure" and "tree data type": :''A tree data type is a set of values of tree data structures''. Note that there are 2 degrees of discrepancy between the terms. This becomes apparent when one compares a ''single'' tree data type with a ''single'' tree data structure. A single tree data type contains (infinitely) many values each of which is represented by (infinitely) many tree data structures. For example, given a set of labels, the set of rose trees in the Haskell sense (3b) with labels taken from is a single tree data type. All the above examples of rose trees belong to this data type. ''Notes:''

Terminological controversy

As it can be observed in the above text and diagrams, the term "rose tree" is controversial. There are two interrelated issues:

Obscure meaning of "node".
Discrepancy between "tree" and "sharing of substructures".

Interestingly, the term "node" does not appear in the original paper except for a single occurrence of "nodes" in an informal paragraph on page 20. In later literature the word is used abundantly. This can already be observed in the quoted comments to the definitions: * ''A rose tree ..is either a leaf ..or a node ../q>''. * ''An element of Rose α consists of a labelled node ../q>''. In particular, the definition of rose trees in the most common Haskell sense suggests that (within the context of discourse) "node" and "tree" are synonyms. Does it mean that every rose tree is coincident with its root node? If so, is such a property considered specific to rose trees or does it also apply to other trees? Such questions are left unanswered. The (B) problem becomes apparent when looking at the diagrams of the above examples. Both diagrams are faithful in the sense that each node is drawn ''exactly once''. One can immediately see that the underlying graphs are not trees. Using a quotation from

Tree (graph theory) In graph theory, a tree is an undirected graph in which any two vertices are connected by path, or equivalently a connected acyclic undirected graph. A forest is an undirected graph in which any two vertices are connected by path, or equi ...

:''The various kinds of data structures referred to as trees in computer science have underlying graphs that are trees in graph theory ../q>'' one can conclude that rose trees in general are not trees in usual meaning known from computer science.

Bayesian rose tree

There is at least one adoption of the term "rose tree" in computer science in which "sharing of substructures" is precluded. The concept of a ''Bayesian rose tree'' is based on the following definition of rose trees:

is a rose tree if either for some data point or where 's are rose trees over disjoint sets of data points.

References

{{CS-Trees Trees (data structures)