tree data structure
In computer science, a tree is a widely used abstract data type that represents a hierarchical tree structure with a set of connected nodes. Each node in the tree can be connected to many children (depending on the type of tree), but must be conn ...
in which each internal node has exactly four children. Quadtrees are the two-dimensional analog of
octree
An octree is a tree data structure in which each internal node has exactly eight children. Octrees are most often used to partition a three-dimensional space by recursively subdividing it into eight octants. Octrees are the three-dimensional anal ...
s and are most often used to partition a two-dimensional space by recursively subdividing it into four quadrants or regions. The data associated with a leaf cell varies by application, but the leaf cell represents a "unit of interesting spatial information".
The subdivided regions may be square or rectangular, or may have arbitrary shapes. This data structure was named a quadtree by
Raphael Finkel
Raphael Finkel (born 1951) is an American computer scientist and a professor at the University of Kentucky. He compiled the first version of the Jargon File. He is the author of ''An Operating Systems Vade Mecum'',J.L. Bentley in 1974. A similar partitioning is also known as a ''Q-tree''. All forms of quadtrees share some common features:
* They decompose space into adaptable cells
* Each cell (or bucket) has a maximum capacity. When maximum capacity is reached, the bucket splits
* The tree directory follows the spatial decomposition of the quadtree.
A tree-pyramid (T-pyramid) is a "complete" tree; every node of the T-pyramid has four child nodes except leaf nodes; all leaves are on the same level, the level that corresponds to individual pixels in the image. The data in a tree-pyramid can be stored compactly in an array as an
implicit data structure In computer science, an implicit data structure or space-efficient data structure is a data structure that stores very little information other than the main or required data: a data structure that requires low overhead. They are called "implicit" ...
Quadtrees may be classified according to the type of data they represent, including areas, points, lines and curves. Quadtrees may also be classified by whether the shape of the tree is independent of the order in which data is processed. The following are common types of quadtrees.
Region quadtree
The region quadtree represents a partition of space in two dimensions by decomposing the region into four equal quadrants, subquadrants, and so on with each leaf node containing data corresponding to a specific subregion. Each node in the tree either has exactly four children, or has no children (a leaf node). The height of quadtrees that follow this decomposition strategy (i.e. subdividing subquadrants as long as there is interesting data in the subquadrant for which more refinement is desired) is sensitive to and dependent on the spatial distribution of interesting areas in the space being decomposed. The region quadtree is a type of
trie
In computer science, a trie, also called digital tree or prefix tree, is a type of ''k''-ary search tree, a tree data structure used for locating specific keys from within a set. These keys are most often strings, with links between nodes def ...
.
A region quadtree with a depth of n may be used to represent an image consisting of 2n × 2n pixels, where each pixel value is 0 or 1. The root node represents the entire image region. If the pixels in any region are not entirely 0s or 1s, it is subdivided. In this application, each leaf node represents a block of pixels that are all 0s or all 1s. Note the potential savings in terms of space when these trees are used for storing images; images often have many regions of considerable size that have the same colour value throughout. Rather than store a big 2-D array of every pixel in the image, a quadtree can capture the same information potentially many divisive levels higher than the pixel-resolution sized cells that we would otherwise require. The tree resolution and overall size is bounded by the pixel and image sizes.
A region quadtree may also be used as a variable resolution representation of a data field. For example, the temperatures in an area may be stored as a quadtree, with each leaf node storing the average temperature over the subregion it represents.
Point quadtree
The point quadtree is an adaptation of a
binary tree
In computer science, a binary tree is a k-ary k = 2 tree data structure in which each node has at most two children, which are referred to as the ' and the '. A recursive definition using just set theory notions is that a (non-empty) binary t ...
used to represent two-dimensional point data. It shares the features of all quadtrees but is a true tree as the center of a subdivision is always on a point. It is often very efficient in comparing two-dimensional, ordered data points, usually operating in O(log n) time. Point quadtrees are worth mentioning for completeness, but they have been surpassed by ''k''-d trees as tools for generalized binary search.
Point quadtrees are constructed as follows. Given the next point to insert, we find the cell in which it lies and add it to the tree. The new point is added such that the cell that contains it is divided into quadrants by the vertical and horizontal lines that run through the point. Consequently, cells are rectangular but not necessarily square. In these trees, each node contains one of the input points.
Since the division of the plane is decided by the order of point-insertion, the tree's height is sensitive to and dependent on insertion order. Inserting in a "bad" order can lead to a tree of height linear in the number of input points (at which point it becomes a linked-list). If the point-set is static, pre-processing can be done to create a tree of balanced height.
Node structure for a point quadtree
A node of a point quadtree is similar to a node of a
binary tree
In computer science, a binary tree is a k-ary k = 2 tree data structure in which each node has at most two children, which are referred to as the ' and the '. A recursive definition using just set theory notions is that a (non-empty) binary t ...
, with the major difference being that it has four pointers (one for each quadrant) instead of two ("left" and "right") as in an ordinary binary tree. Also a key is usually decomposed into two parts, referring to x and y coordinates. Therefore, a node contains the following information:
* four pointers: quad NW’ quad NE’ quad SW’ and quad SE’* point; which in turn contains:
** key; usually expressed as x, y coordinates
** value; for example a name
Point-region (PR) quadtree
Point-region (PR) quadtrees are very similar to region quadtrees. The difference is the type of information stored about the cells. In a region quadtree, a uniform value is stored that applies to the entire area of the cell of a leaf. The cells of a PR quadtree, however, store a list of points that exist within the cell of a leaf. As mentioned previously, for trees following this decomposition strategy the height depends on the spatial distribution of the points. Like the point quadtree, the PR quadtree may also have a linear height when given a "bad" set.
Edge quadtree
Edge quadtrees (much like PM quadtrees) are used to store lines rather than points. Curves are approximated by subdividing cells to a very fine resolution, specifically until there is a single line segment per cell. Near corners/vertices, edge quadtrees will continue dividing until they reach their maximum level of decomposition. This can result in extremely unbalanced trees which may defeat the purpose of indexing.
Polygonal map (PM) quadtree
The polygonal map quadtree (or PM Quadtree) is a variation of quadtree which is used to store collections of polygons that may be degenerate (meaning that they have isolated vertices or edges).
A big difference between PM quadtrees and edge quadtrees is that the cell under consideration is not subdivided if the segments meet at a vertex in the cell.
There are three main classes of PM Quadtrees, which vary depending on what information they store within each black node. PM3 quadtrees can store any amount of non-intersecting edges and at most one point. PM2 quadtrees are the same as PM3 quadtrees except that all edges must share the same end point. Finally PM1 quadtrees are similar to PM2, but black nodes can contain a point and its edges or just a set of edges that share a point, but you cannot have a point and a set of edges that do not contain the point.
Compressed quadtrees
This section summarizes a subsection from a book by
Sariel Har-Peled
Sariel Har-Peled (born July 14, 1971, in Jerusalem) is an Israeli–American computer scientist known for his research in computational geometry. He is a Donald Biggar Willett Professor in Engineering at the University of Illinois at Urbana–Champ ...
.
If we were to store every node corresponding to a subdivided cell, we may end up storing a lot of empty nodes. We can cut down on the size of such sparse trees by only storing subtrees whose leaves have interesting data (i.e. "important subtrees"). We can actually cut down on the size even further. When we only keep important subtrees, the pruning process may leave long paths in the tree where the intermediate nodes have degree two (a link to one parent and one child). It turns out that we only need to store the node at the beginning of this path (and associate some meta-data with it to represent the removed nodes) and attach the subtree rooted at its end to . It is still possible for these compressed trees to have a linear height when given "bad" input points.
Although we trim a lot of the tree when we perform this compression, it is still possible to achieve logarithmic-time search, insertion, and deletion by taking advantage of ''Z''-order curves. The ''Z''-order curve maps each cell of the full quadtree (and hence even the compressed quadtree) in time to a one-dimensional line (and maps it back in time too), creating a total order on the elements. Therefore, we can store the quadtree in a data structure for ordered sets (in which we store the nodes of the tree).
We must state a reasonable assumption before we continue: we assume that given two real numbers