HOME

TheInfoList



OR:

In
computer science Computer science is the study of computation, automation, and information. Computer science spans theoretical disciplines (such as algorithms, theory of computation, information theory, and automation) to Applied science, practical discipli ...
, the range searching problem consists of processing a set ''S'' of objects, in order to determine which objects from ''S'' intersect with a query object, called the ''range''. For example, if ''S'' is a set of points corresponding to the coordinates of several cities, find the subset of cities within a given range of
latitude In geography, latitude is a coordinate that specifies the north– south position of a point on the surface of the Earth or another celestial body. Latitude is given as an angle that ranges from –90° at the south pole to 90° at the north pol ...
s and
longitude Longitude (, ) is a geographic coordinate that specifies the east–west position of a point on the surface of the Earth, or another celestial body. It is an angular measurement, usually expressed in degrees and denoted by the Greek letter l ...
s. The range searching problem and the
data structure In computer science, a data structure is a data organization, management, and storage format that is usually chosen for efficient access to data. More precisely, a data structure is a collection of data values, the relationships among them, a ...
s that solve it are a fundamental topic of
computational geometry Computational geometry is a branch of computer science devoted to the study of algorithms which can be stated in terms of geometry. Some purely geometrical problems arise out of the study of computational geometric algorithms, and such problems ar ...
. Applications of the problem arise in areas such as
geographical information systems A geographic information system (GIS) is a type of database containing geographic data (that is, descriptions of phenomena for which location is relevant), combined with software tools for managing, analyzing, and visualizing those data. In a b ...
(GIS),
computer-aided design Computer-aided design (CAD) is the use of computers (or ) to aid in the creation, modification, analysis, or optimization of a design. This software is used to increase the productivity of the designer, improve the quality of design, improve c ...
(CAD) and
database In computing, a database is an organized collection of data stored and accessed electronically. Small databases can be stored on a file system, while large databases are hosted on computer clusters or cloud storage. The design of databases sp ...
s.


Variations

There are several variations of the problem, and different data structures may be necessary for different variations. In order to obtain an efficient solution, several aspects of the problem need to be specified: * Object types: Algorithms depend on whether ''S'' consists of
point Point or points may refer to: Places * Point, Lewis, a peninsula in the Outer Hebrides, Scotland * Point, Texas, a city in Rains County, Texas, United States * Point, the NE tip and a ferry terminal of Lismore, Inner Hebrides, Scotland * Point ...
s, lines,
line segment In geometry, a line segment is a part of a straight line that is bounded by two distinct end points, and contains every point on the line that is between its endpoints. The length of a line segment is given by the Euclidean distance between ...
s,
box A box (plural: boxes) is a container used for the storage or transportation of its contents. Most boxes have flat, parallel, rectangular sides. Boxes can be very small (like a matchbox) or very large (like a shipping box for furniture), and can ...
es,
polygon In geometry, a polygon () is a plane figure that is described by a finite number of straight line segments connected to form a closed ''polygonal chain'' (or ''polygonal circuit''). The bounded plane region, the bounding circuit, or the two toge ...
s.... The simplest and most studied objects to search are points. * Range types: The query ranges also need to be drawn from a predetermined set. Some well-studied sets of ranges, and the names of the respective problems are
axis-aligned rectangle A rectilinear polygon is a polygon all of whose polygon side, sides meet at right angles. Thus the interior angle at each vertex is either 90° or 270°. Rectilinear polygons are a special case of isothetic polygons. In many cases another def ...
s (orthogonal range searching),
simplices In geometry, a simplex (plural: simplexes or simplices) is a generalization of the notion of a triangle or tetrahedron to arbitrary dimensions. The simplex is so-named because it represents the simplest possible polytope in any given dimension. ...
,
halfspace Half-space may refer to: * Half-space (geometry), either of the two parts into which a plane divides Euclidean space * Half-space (punctuation), a spacing character half the width of a regular space * (Poincaré) Half-space model, a model of 3-di ...
s, and
sphere A sphere () is a Geometry, geometrical object that is a solid geometry, three-dimensional analogue to a two-dimensional circle. A sphere is the Locus (mathematics), set of points that are all at the same distance from a given point in three ...
s/
circle A circle is a shape consisting of all points in a plane that are at a given distance from a given point, the centre. Equivalently, it is the curve traced out by a point that moves in a plane so that its distance from a given point is const ...
s. * Query types: If the list of all objects that intersect the query range must be reported, the problem is called
range reporting In computational geometry and database theory, a range reporting query asks for a list of the points that match the query. The query is often specified by a geometric shape, containing all the points that should match, and is called a range. Range r ...
, and the query is called a ''reporting query''. Sometimes, only the number of objects that intersect the range is required. In this case, the problem is called ''range counting'', and the query is called a ''counting query''. The ''emptiness query'' reports whether there is at least one object that intersects the range. In the ''semigroup version'', a
commutative In mathematics, a binary operation is commutative if changing the order of the operands does not change the result. It is a fundamental property of many binary operations, and many mathematical proofs depend on it. Most familiar as the name o ...
semigroup In mathematics, a semigroup is an algebraic structure consisting of a set together with an associative internal binary operation on it. The binary operation of a semigroup is most often denoted multiplicatively: ''x''·''y'', or simply ''xy'', ...
(S,+) is specified, each point is assigned a weight from S, and it is required to report the semigroup sum of the weights of the points that intersect the range. *
Dynamic Dynamics (from Greek δυναμικός ''dynamikos'' "powerful", from δύναμις ''dynamis'' "power") or dynamic may refer to: Physics and engineering * Dynamics (mechanics) ** Aerodynamics, the study of the motion of air ** Analytical dyna ...
range searching vs. static range searching: In the static setting the set ''S'' is known in advance. In dynamic setting objects may be inserted or deleted between queries. * Offline range searching: Both the set of objects and the whole set of queries are known in advance.


Data structures


Orthogonal range searching

In orthogonal range searching, the set ''S'' consists of n points in d dimensions, and the query consists of intervals in each of those dimensions. Thus, the query consists of a multi-dimensional
axis-aligned rectangle A rectilinear polygon is a polygon all of whose polygon side, sides meet at right angles. Thus the interior angle at each vertex is either 90° or 270°. Rectilinear polygons are a special case of isothetic polygons. In many cases another def ...
. With an output size of k, Jon Bentley used a k-d tree to achieve (in
Big O notation Big ''O'' notation is a mathematical notation that describes the limiting behavior of a function when the argument tends towards a particular value or infinity. Big O is a member of a family of notations invented by Paul Bachmann, Edmund Lan ...
) O(n) space and O\big(n^ + k\big) query time. Bentley also proposed using range trees, which improved query time to O(\log^d n + k) but increased space to O(n\log^ n).
Dan Willard Dan Edward Willard is an American computer scientist and logician, and is a professor of computer science at the University at Albany. Education and career Willard did his undergraduate studies in mathematics at Stony Brook University, graduati ...
used downpointers, a special case of
fractional cascading In computer science, fractional cascading is a technique to speed up a sequence of binary searches for the same value in a sequence of related data structures. The first binary search in the sequence takes a logarithmic amount of time, as is standar ...
to reduce the query time further to O(\log^ n + k). While the above results were achieved in the
pointer machine In theoretical computer science a pointer machine is an "atomistic" ''abstract computational machine'' model akin to the random-access machine. A pointer algorithm is an algorithm restricted to the pointer machine model. Depending on the type, a ...
model, further improvements have been made in the
word RAM In theoretical computer science, the word RAM (word random-access machine) model is a model of computation in which a random-access machine does bitwise operations on a word of bits. Michael Fredman and Dan Willard created it in 1990 to simulate p ...
model of computation In computer science, and more specifically in computability theory and computational complexity theory, a model of computation is a model which describes how an output of a mathematical function is computed given an input. A model describes how ...
in low dimensions (2D, 3D, 4D).
Bernard Chazelle Bernard Chazelle (born November 5, 1955) is a French-American computer scientist. He is currently the Eugene Higgins Professor of Computer Science at Princeton University. Much of his work is in computational geometry, where he is known for his ...
used compress range trees to achieve O(\log n) query time and O(n) space for range counting. Joseph JaJa and others later improved this query time to O\left(\dfrac\right) for range counting, which matches a lower bound and is thus
asymptotically optimal In computer science, an algorithm is said to be asymptotically optimal if, roughly speaking, for large inputs it performs at worst a constant factor (independent of the input size) worse than the best possible algorithm. It is a term commonly en ...
. As of 2015, the best results (in low dimensions (2D, 3D, 4D)) for range reporting found by
Timothy M. Chan Timothy Moon-Yew Chan is a Founder ProfessorTwo ...
, Kasper Larsen, and Mihai Pătrașcu, also using compressed range trees in the word RAM model of computation, are one of the following: * O(n) space, O(\log ^\epsilon n + k \log ^\epsilon n) query time * O(n \log \log n) space, O(\log \log n + k \log \log n) query time * O(n \log ^\epsilon n) space, O(\log \log n + k) query time In the orthogonal case, if one of the bounds is
infinity Infinity is that which is boundless, endless, or larger than any natural number. It is often denoted by the infinity symbol . Since the time of the ancient Greeks, the philosophical nature of infinity was the subject of many discussions amo ...
, the query is called three-sided. If two of the bounds are infinity, the query is two-sided, and if none of the bounds are infinity, then the query is four-sided.


Dynamic range searching

While in static range searching the set ''S'' is known in advance,
dynamic Dynamics (from Greek δυναμικός ''dynamikos'' "powerful", from δύναμις ''dynamis'' "power") or dynamic may refer to: Physics and engineering * Dynamics (mechanics) ** Aerodynamics, the study of the motion of air ** Analytical dyna ...
range searching, insertions and deletions of points are allowed. In the incremental version of the problem, only insertions are allowed, whereas the decremental version only allows deletions. For the orthogonal case,
Kurt Mehlhorn Kurt Mehlhorn (born 29 August 1949) is a German theoretical computer scientist. He has been a vice president of the Max Planck Society and is director of the Max Planck Institute for Computer Science. Education and career Mehlhorn graduated i ...
and Stefan Näher created a data structure for dynamic range searching which uses dynamic fractional cascading to achieve O(n \log n) space and O(\log n \log \log n + k) query time. Both incremental and decremental versions of the problem can be solved with O(\log n + k) query time, but it is unknown whether general dynamic range searching can be done with that query time.


Colored range searching

The problem of colored range counting considers the case where points have categorical attributes. If the categories are considered as colors of points in geometric space, then a query is for how many colors appear in a particular range. Prosenjit Gupta and others described a data structure in 1995 which solved 2D orthogonal colored range counting in O(n^2\log ^2 n) space and O(\log ^2 n) query time.


Applications

In addition to being considered in
computational geometry Computational geometry is a branch of computer science devoted to the study of algorithms which can be stated in terms of geometry. Some purely geometrical problems arise out of the study of computational geometric algorithms, and such problems ar ...
, range searching, and orthogonal range searching in particular, has applications for range queries in
database In computing, a database is an organized collection of data stored and accessed electronically. Small databases can be stored on a file system, while large databases are hosted on computer clusters or cloud storage. The design of databases sp ...
s. Colored range searching is also used for and motivated by searching through categorical data. For example, determining the rows in a database of bank accounts which represent people whose age is between 25 and 40 and who have between $10000 and $20000 might be an orthogonal range reporting problem where age and money are two dimensions.


See also

*
Range tree In computer science, a range tree is an ordered tree data structure to hold a list of points. It allows all points within a given range to be reported efficiently, and is typically used in two or higher dimensions. Range trees were introduced by ...
*
Range query A range query is a common database operation that retrieves all records where some value is between an upper and lower boundary. For example, list all employees with 3 to 5 years' experience. Range queries are unusual because it is not generally ...


References


Further reading

* *{{citation, first=Jiří, last=Matoušek, authorlink=Jiří Matoušek (mathematician), title=Geometric range searching, journal= ACM Computing Surveys, volume=26, issue=4, pages=421–461, year=1994, doi=10.1145/197405.197408, s2cid=17729301, url=https://refubium.fu-berlin.de/handle/fub188/17795. Geometric data structures Database theory