Metric Trees
   HOME

TheInfoList



OR:

A metric tree is any
tree In botany, a tree is a perennial plant with an elongated stem, or trunk, usually supporting branches and leaves. In some usages, the definition of a tree may be narrower, including only woody plants with secondary growth, plants that are ...
data structure In computer science, a data structure is a data organization, management, and storage format that is usually chosen for efficient access to data. More precisely, a data structure is a collection of data values, the relationships among them, a ...
specialized to index data in
metric space In mathematics, a metric space is a set together with a notion of ''distance'' between its elements, usually called points. The distance is measured by a function called a metric or distance function. Metric spaces are the most general settin ...
s. Metric trees exploit properties of metric spaces such as the
triangle inequality In mathematics, the triangle inequality states that for any triangle, the sum of the lengths of any two sides must be greater than or equal to the length of the remaining side. This statement permits the inclusion of degenerate triangles, but ...
to make accesses to the data more efficient. Examples include the
M-tree In computer science, M-trees are tree data structures that are similar to R-trees and B-trees. It is constructed using a metric and relies on the triangle inequality for efficient range and k-nearest neighbor (k-NN) queries. While M-trees can perf ...
, vp-trees, cover trees, MVP trees, and
BK-tree A BK-tree is a metric tree suggested by Walter Austin Burkhard and Robert M. Keller specifically adapted to discrete metric spaces. For simplicity, consider integer discrete metric d(x,y). Then, BK-tree is defined in the following way. An arbitrar ...
s.


Multidimensional search

Most algorithms and data structures for searching a dataset are based on the classical
binary search In computer science, binary search, also known as half-interval search, logarithmic search, or binary chop, is a search algorithm that finds the position of a target value within a sorted array. Binary search compares the target value to the m ...
algorithm, and generalizations such as the
k-d tree In computer science, a ''k''-d tree (short for ''k-dimensional tree'') is a space-partitioning data structure for organizing points in a ''k''-dimensional space. ''k''-d trees are a useful data structure for several applications, such as search ...
or
range tree In computer science, a range tree is an ordered tree data structure to hold a list of points. It allows all points within a given range to be reported efficiently, and is typically used in two or higher dimensions. Range trees were introduced by ...
work by interleaving the
binary search algorithm In computer science, binary search, also known as half-interval search, logarithmic search, or binary chop, is a search algorithm that finds the position of a target value within a sorted array. Binary search compares the target value to the m ...
over the separate coordinates and treating each spatial coordinate as an independent search constraint. These data structures are well-suited for
range query A range query is a common database operation that retrieves all records where some value is between an upper and lower boundary. For example, list all employees with 3 to 5 years' experience. Range queries are unusual because it is not generally ...
problems asking for every point (x,y) that satisfies \mbox_x \leq x \leq \mbox_x and \mbox_y \leq y \leq \mbox_y. A limitation of these multidimensional search structures is that they are only defined for searching over objects that can be treated as vectors. They aren't applicable for the more general case in which the algorithm is given only a collection of objects and a function for measuring the distance or similarity between two objects. If, for example, someone were to create a function that returns a value indicating how similar one image is to another, a natural algorithmic problem would be to take a dataset of images and find the ones that are similar according to the function to a given query image.


Metric data structures

If there is no structure to the
similarity measure In statistics and related fields, a similarity measure or similarity function or similarity metric is a real-valued function that quantifies the similarity between two objects. Although no single definition of a similarity exists, usually such meas ...
then a
brute force search In computer science, brute-force search or exhaustive search, also known as generate and test, is a very general problem-solving technique and algorithmic paradigm that consists of systematically enumerating all possible candidates for the soluti ...
requiring the comparison of the query image to every image in the dataset is the best that can be done . If, however, the similarity function satisfies the
triangle inequality In mathematics, the triangle inequality states that for any triangle, the sum of the lengths of any two sides must be greater than or equal to the length of the remaining side. This statement permits the inclusion of degenerate triangles, but ...
then it is possible to use the result of each comparison to prune the set of candidates to be examined. The first article on metric trees, as well as the first use of the term "metric tree", published in the open literature was by
Jeffrey Uhlmann Jeffrey K. Uhlmann is an American research scientist who is probably best known for his mathematical generalizations of the Kalman filter. Most of his publications and patents have been in the field of data fusion. He is also known for being a cult ...
in 1991. Other researchers were working independently on similar data structures. In particular, Peter Yianilos claimed to have independently discovered the same method, which he called a vantage point tree (VP-tree). The research on metric tree data structures blossomed in the late 1990s and included an examination by Google co-founder
Sergey Brin Sergey Mikhailovich Brin (russian: link=no, Сергей Михайлович Брин; born August 21, 1973) is an American business magnate, computer scientist, and internet entrepreneur, who co-founded Google with Larry Page. Brin was the ...
of their use for very large databases. The first textbook on metric data structures was published in 2006.


Open source implementations

*
Matlab MATLAB (an abbreviation of "MATrix LABoratory") is a proprietary multi-paradigm programming language and numeric computing environment developed by MathWorks. MATLAB allows matrix manipulations, plotting of functions and data, implementation ...
: Metric trees are implemented in the metricTree class that is part of the United States Naval Research Laboratory's free
Tracker Component Library Tracker may refer to: Arts and entertainment Fictional characters * Tracker (G.I. Joe), Tracker (''G.I. Joe''), in the ''G.I. Joe'' universe * Tracker (PAW Patrol), Tracker (''PAW Patrol''), in the animated television series ''PAW Patrol'' * Trac ...
.


References

{{CS-Trees Trees (data structures)