
Vector overlay is an operation (or class of operations) in a geographic information system (GIS) for integrating two or more vector spatial data sets. Terms such as ''polygon overlay'', ''map overlay'', and ''topological overlay'' are often used synonymously, although they are not identical in the range of operations they include. Overlay has been one of the core elements of spatial analysis in GIS since its early development. Some overlay operations, especially Intersect and Union, are implemented in all GIS software and are used in a wide variety of analytical applications, while others are less common. Overlay is based on the fundamental principle of geography known as areal integration, in which different topics (say, climate, topography, and agriculture) can be directly compared based on a common location. It is also based on the mathematics of set theory and point-set topology. The basic approach of a vector overlay operation is to take in two or more layers composed of vector shapes, and output a layer consisting of new shapes created from the topological relationships discovered between the input shapes. A range of specific operators allows for different types of input, and different choices in what to include in the output.


History

Prior to the advent of GIS, the overlay principle had developed as a method of literally superimposing different thematic maps (typically an isarithmic map or a chorochromatic map) drawn on transparent film (e.g., cellulose acetate) to see the interactions and find locations with specific combinations of characteristics. The technique was largely developed by landscape architects. Warren Manning appears to have used this approach to compare aspects of Billerica, Massachusetts, although his published accounts only reproduce the maps without explaining the technique. Jacqueline Tyrwhitt published instructions for the technique in an English textbook in 1950. Ian McHarg was perhaps most responsible for widely publicizing this approach to planning in ''Design with Nature'' (1969), in which he gave several examples of projects on which he had consulted, such as transportation planning and land conservation.

The first true GIS, the Canada Geographic Information System (CGIS), developed during the 1960s and completed in 1971, was based on a rudimentary vector data model, and one of its earliest functions was polygon overlay. Another early vector GIS, the Polygon Information Overlay System (PIOS), developed by ESRI for San Diego County, California in 1971, also supported polygon overlay. It used the point-in-polygon algorithm to find intersections quickly. Unfortunately, the results of overlay in these early systems were often prone to error.

Carl Steinitz, a landscape architect, helped found the Harvard Laboratory for Computer Graphics and Spatial Analysis, in part to develop GIS as a digital tool to implement McHarg's methods. In 1975, Thomas Peucker and Nicholas Chrisman of the Harvard Lab introduced the POLYVRT data model, one of the first to explicitly represent topological relationships and attributes in vector data. They envisioned a system that could handle multiple overlapping "polygon networks" (layers) by computing ''Least Common Geographic Units'' (LCGU), the areas where a pair of polygons overlapped, with attributes inherited from the original polygons. Chrisman and James Dougenik implemented this strategy in the WHIRLPOOL program, released in 1979 as part of the Odyssey project to develop a general-purpose GIS. This system implemented several improvements over the earlier approaches in CGIS and PIOS, and its algorithm became part of the core of GIS software for decades to come.


Algorithm

The goal of all overlay operations is to take in vector layers and create a layer that ''integrates'' both the geometry and the attributes of the inputs. Usually, both inputs are polygon layers, but lines and points are allowed in many operations, with simpler processing. Since the original implementation, the basic strategy of the polygon overlay algorithm has remained the same, although the vector data structures that are used have evolved.

# Given the two input polygon layers, extract the boundary lines.
# Cracking part A: In each layer, identify edges shared between polygons. Break each line at the junction of shared edges and remove duplicates to create a set of topologically planar connected lines. ''In early topological data structures such as POLYVRT and the ARC/INFO coverage, the data was natively stored this way, so this step was unnecessary.''
# Cracking part B: Find any intersections between lines from the two inputs. At each intersection, split both lines. Then merge the two line layers into a single set of topologically planar connected lines.
# Assembling part A: Find each minimal closed ring of lines, and use it to create a polygon. Each of these will be a least common geographic unit (LCGU), with at most one "parent" polygon from each of the two inputs.
# Assembling part B: Create an attribute table that includes the columns from both inputs. For each LCGU, determine its parent polygon from each input layer, and copy that polygon's attributes into the LCGU's row in the new table; if the LCGU was not in any of the polygons of one of the input layers, leave the values for that layer as null.

Parameters are usually available to allow the user to calibrate the algorithm for a particular situation. One of the earliest was the ''snapping'' or ''fuzzy tolerance'', a threshold distance. Any two lines that stay within this distance of each other are collapsed into a single line, avoiding unwanted narrow ''sliver polygons'' that can occur when lines that should be coincident (for example, a river and a boundary that should follow it ''de jure'') are digitized separately with slightly different vertices.
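The "Cracking part B" step can be illustrated with a minimal Python sketch. It is not the WHIRLPOOL implementation or that of any GIS package: the function names (`segment_intersection`, `split_segment`, `crack`) are hypothetical, lines are simplified to straight two-point segments, collinear overlaps are ignored, and no fuzzy tolerance is applied.

```python
from itertools import product


def segment_intersection(seg_a, seg_b):
    """Return the point where two segments meet, or None.

    Parallel and collinear segments are ignored in this sketch.
    """
    (x1, y1), (x2, y2) = seg_a
    (x3, y3), (x4, y4) = seg_b
    denom = (x2 - x1) * (y4 - y3) - (y2 - y1) * (x4 - x3)
    if denom == 0:
        return None  # parallel or collinear: no single crossing point
    # Parametric positions of the crossing along each segment (0..1 = on it).
    t = ((x3 - x1) * (y4 - y3) - (y3 - y1) * (x4 - x3)) / denom
    u = ((x3 - x1) * (y2 - y1) - (y3 - y1) * (x2 - x1)) / denom
    if 0 <= t <= 1 and 0 <= u <= 1:
        return (x1 + t * (x2 - x1), y1 + t * (y2 - y1))
    return None


def split_segment(seg, points):
    """Split one segment at the given points, keeping pieces in order."""
    start, end = seg
    cuts = sorted(
        set(points) - {start, end},
        key=lambda p: (p[0] - start[0]) ** 2 + (p[1] - start[1]) ** 2,
    )
    vertices = [start, *cuts, end]
    return list(zip(vertices, vertices[1:]))


def crack(lines_a, lines_b):
    """Split both inputs at their mutual intersections and merge them."""
    cuts_a = [[] for _ in lines_a]
    cuts_b = [[] for _ in lines_b]
    # Brute-force comparison of every pair of segments from the two layers.
    for (i, seg_a), (j, seg_b) in product(enumerate(lines_a), enumerate(lines_b)):
        point = segment_intersection(seg_a, seg_b)
        if point is not None:
            cuts_a[i].append(point)
            cuts_b[j].append(point)
    merged = []
    for seg, cuts in zip(list(lines_a) + list(lines_b), cuts_a + cuts_b):
        merged.extend(split_segment(seg, cuts))
    return merged


# Two diagonals that cross once are cracked into four pieces.
pieces = crack([((0, 0), (2, 2))], [((0, 2), (2, 0))])
```

The pairwise loop compares every segment of one layer with every segment of the other, which is adequate for illustration only; production systems typically narrow the candidate pairs with a spatial index or a sweep-line approach.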


Operators

The basic algorithm can be modified in a number of ways to return different forms of integration between the two input layers. These different ''overlay operators'' are used to answer a variety of questions, although some are far more commonly implemented and used than others. The most common are closely analogous to operators in set theory and Boolean logic, and have adopted their terms. As in these algebraic systems, the overlay operators may be ''commutative'' (giving the same result regardless of order) and/or ''associative'' (more than two inputs giving the same result regardless of the order in which they are paired).

* Intersect (ArcGIS, QGIS, Manifold, TNTmips; AND in GRASS): The result includes only the LCGUs where the two input layers intersect (overlap); that is, those with both "parents." This is identical to the set theoretic intersection of the input layers. Intersect is probably the most commonly used operator in this list. ''Commutative, associative''
* Union (ArcGIS, QGIS, Manifold, TNTmips; OR in GRASS): The result includes all of the LCGUs, both those where the inputs intersect and where they do not. This is identical to the set theoretic union of the input layers. ''Commutative, associative''
* Subtract (TNTmips; Erase in ArcGIS; Difference in QGIS; NOT in GRASS; missing from Manifold): The result includes only the portions of polygons in one layer that do not overlap with the other layer; that is, the LCGUs that have no parent from the other layer. ''Non-commutative, non-associative''
* Exclusive or (Symmetrical Difference in ArcGIS, QGIS; Exclusive Union in TNTmips; XOR in GRASS; missing from Manifold): The result includes the portions of polygons in both layers that do not overlap; that is, all LCGUs that have exactly one parent. This could also be achieved by computing the intersection and the union, then subtracting the intersection from the union, or by subtracting each layer from the other, then computing the union of the two subtractions. ''Commutative, associative''
* Clip (ArcGIS, QGIS, GRASS, Manifold; Extract Inside in TNTmips): The result includes the portions of polygons of one layer where they intersect the other layer. The outline is the same as the intersection, but the interior only includes the polygons of one layer rather than computing the LCGUs. ''Non-commutative, non-associative''
* Cover (Update in ArcGIS and Manifold; Replace in TNTmips; not available in QGIS or GRASS): The result includes one layer intact, with the portions of the polygons of the other layer only where the two layers do not intersect. It is called "cover" because the result looks like one layer is covering the other; it is called "update" in ArcGIS because the most common use is when the two layers represent the same theme, but one represents recent changes (e.g., new parcels) that need to replace the older ones in the same location. It can be replicated by subtracting one layer from the other, then computing the union of that result with the original first layer. ''Non-commutative, non-associative''
* Divide (Identity in ArcGIS and Manifold; not available in QGIS, TNTmips, or GRASS): The result includes all of the LCGUs that cover one of the input layers, excluding those that are only in the other layer. It is called "divide" because it has the appearance of one layer being used to divide the polygons of the other layer. It can be replicated by computing the intersection, then subtracting one layer from the other, then computing the union of these two results. ''Non-commutative, non-associative''
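Several of the operators above differ only in which LCGUs they keep, based on which "parents" each LCGU has. The following Python sketch illustrates that selection logic under simplifying assumptions: the `LCGU` class and the function names are hypothetical, geometry is omitted entirely, and the cracking and assembling steps are assumed to have already produced the LCGU list. Clip and Cover are left out because they do not return LCGUs in the same way.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class LCGU:
    """A least common geographic unit, reduced to its parent identifiers."""
    name: str
    parent_a: Optional[str]  # parent polygon in layer A, or None
    parent_b: Optional[str]  # parent polygon in layer B, or None


def intersect(lcgus):
    """Keep LCGUs that have a parent in both layers."""
    return [u for u in lcgus if u.parent_a is not None and u.parent_b is not None]


def union(lcgus):
    """Keep every LCGU."""
    return list(lcgus)


def subtract(lcgus):
    """Keep LCGUs of layer A that have no parent in layer B (A minus B)."""
    return [u for u in lcgus if u.parent_a is not None and u.parent_b is None]


def exclusive_or(lcgus):
    """Keep LCGUs that have exactly one parent."""
    return [u for u in lcgus if (u.parent_a is None) != (u.parent_b is None)]


def divide(lcgus):
    """Keep all LCGUs covering layer A (Identity in ArcGIS and Manifold)."""
    return [u for u in lcgus if u.parent_a is not None]


# Polygon a1 and polygon b1 partially overlap, giving three LCGUs.
lcgus = [
    LCGU("u1", "a1", None),   # only in layer A
    LCGU("u2", "a1", "b1"),   # where the layers overlap
    LCGU("u3", None, "b1"),   # only in layer B
]

xor_names = [u.name for u in exclusive_or(lcgus)]
# Exclusive or equals the union with the intersection removed.
union_minus_intersect = [u.name for u in union(lcgus) if u not in intersect(lcgus)]
```

Swapping the roles of `parent_a` and `parent_b` in `subtract` or `divide` gives a different result, which is one way to see why those operators are non-commutative.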


Boolean overlay algebra

One of the most common uses of polygon overlay is to perform a suitability analysis, also known as a suitability model or multi-criteria evaluation. The task is to find the region that meets a set of criteria, each of which can be represented by a region. For example, the habitat of a species of wildlife might need to be A) within certain vegetation cover types, B) within a threshold distance of a water source (computed using a buffer), and C) not within a threshold distance of significant roads. Each of the criteria can be considered ''boolean'' in the sense of Boolean logic, because for any point in space, each criterion is either present or not present, and the point is either in the final habitat area or it is not (in practice the criteria may be vague, but handling vagueness requires more complex fuzzy suitability analysis methods). That is, which vegetation polygon the point is in is not important, only whether it is suitable or not suitable. This means that the criteria can be expressed as a Boolean logic expression, in this case, H = A and B and not C.

In a task such as this, the overlay procedure can be simplified because the individual polygons within each layer are not important, and can be dissolved into a single ''boolean region'' (consisting of one or more disjoint polygons but no adjacent polygons) representing the region that meets the criterion. With these inputs, each of the operators of Boolean logic corresponds exactly to one of the polygon overlay operators: intersect = AND, union = OR, subtract = AND NOT, exclusive or = XOR. The habitat region above would therefore be generated by computing the intersection of A and B, and subtracting C from the result. This particular use of polygon overlay can thus be treated as an algebra that is homomorphic to Boolean logic, which enables the use of GIS to solve many spatial tasks that can be reduced to simple logic.
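The correspondence between overlay operators and Boolean logic can be illustrated with Python's built-in set operators. This is only an analogy under a stated assumption: each dissolved boolean region is approximated here by a set of grid-cell coordinates rather than by vector polygons, and the cell values are invented for the example.

```python
# Each boolean region is stood in for by the set of grid cells it covers.
# (Assumption for illustration: a real overlay works on polygon geometry.)
A = {(0, 0), (0, 1), (1, 0), (1, 1)}   # suitable vegetation cover
B = {(0, 1), (1, 1), (2, 1)}           # within the water-source buffer
C = {(1, 1), (2, 2)}                   # within the road buffer

# intersect = AND, union = OR, subtract = AND NOT, exclusive or = XOR
a_and_b = A & B
a_or_b = A | B
a_and_not_c = A - C
a_xor_b = A ^ B

# H = A and B and not C: intersect A with B, then subtract C.
habitat = (A & B) - C

# The same region, evaluated cell by cell as a Boolean expression.
all_cells = A | B | C
habitat_by_logic = {cell for cell in all_cells
                    if cell in A and cell in B and cell not in C}
```

Both `habitat` and `habitat_by_logic` contain the same cells, which mirrors the claim that chaining overlay operators gives the same region as evaluating the Boolean expression at every location.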


Lines and points

Vector overlay is most commonly performed using two polygon layers as input and creating a third polygon layer. However, it is possible to perform the same algorithm (or at least parts of it) on points and lines. The following operations are typically supported in GIS software:

* Intersect: The output will be of the same dimension as the lower of the inputs: points intersected with any layer yield points, and lines intersected with lines or polygons yield lines. This is often used as a form of ''spatial join'', as it merges the attribute tables of the two layers analogous to a table join. An example of this would be allocating students to school districts. Because it is rare for a point to fall exactly on a line or another point, the fuzzy tolerance is often used here. QGIS has separate operations for computing a line intersection as lines (to find coincident lines) and as points.
* Subtract: The output will be of the same dimension as the primary input: points minus another layer yield points, and lines minus another layer yield lines.
* Clip: While the primary input can be points or lines, the clipping layer is usually required to be polygons, producing the same geometry as the primary input, but only including those features (or parts of lines) that are within the clipping polygons. This operation might also be considered a form of ''spatial query'', as it retains the features of one layer based on its topological relationship to another.
* Union: Normally, both input layers are expected to be of the same dimensionality, producing an output layer including both sets of features. ArcGIS and GRASS do not allow this option with points or lines.
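The point-on-polygon Intersect described above, used as a spatial join, can be sketched in Python with a ray-casting point-in-polygon test. The function names, the student and district records, and the coordinates are all hypothetical; polygons are simple rings without holes, and points lying exactly on a boundary are not treated specially (no fuzzy tolerance is applied).

```python
def point_in_polygon(point, ring):
    """Ray-casting test; ring is a list of (x, y) vertices (not repeated)."""
    x, y = point
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Does this edge straddle the horizontal line through the point?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside  # ray to the right crosses this edge
    return inside


def intersect_points_with_polygons(points, polygons):
    """Keep points inside some polygon and merge the two attribute rows."""
    joined = []
    for pt in points:
        for poly in polygons:
            if point_in_polygon(pt["xy"], poly["ring"]):
                row = {k: v for k, v in pt.items() if k != "xy"}
                row.update({k: v for k, v in poly.items() if k != "ring"})
                joined.append(row)
    return joined


students = [
    {"name": "Ana", "xy": (1, 1)},
    {"name": "Ben", "xy": (5, 1)},
    {"name": "Cy", "xy": (9, 9)},   # outside every district
]
districts = [
    {"district": "West", "ring": [(0, 0), (4, 0), (4, 4), (0, 4)]},
    {"district": "East", "ring": [(4, 0), (8, 0), (8, 4), (4, 4)]},
]

joined = intersect_points_with_polygons(students, districts)
```

Because this is an intersect, the student who falls in no district is dropped from the output, in the same way that an inner table join drops unmatched rows.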


Implementations

Vector overlay is included in some form in virtually every GIS software package that supports vector analysis, although the interface and underlying algorithms vary significantly.

* Esri GIS software has included polygon overlay since the first release of ARC/INFO in 1982. Each generation of Esri software (ARC/INFO, ArcGIS, ArcGIS Pro) has included a set of separate tools for each of the overlay operators (Intersect, Union, Clip, etc.). ArcGIS Pro added an alternative set of Pairwise Overlay tools (as of v2.7) that use parallel processing to process very large datasets more efficiently.
* GRASS GIS (open source), although it was originally raster-based, has included overlay as part of its vector system since GRASS 3.0 (1988). Most of the polygon overlay operators are collected into a single v.overlay command, with another provided as a separate command.
* QGIS (open source) originally incorporated GRASS as its analytical engine, but has gradually developed its own processing framework, including vector overlay.
* Manifold System implements overlay in its transformation system.
* The Turf JavaScript API includes the most common overlay methods, although these operate on individual input polygon objects, not on entire layers.
* TNTmips includes several tools for overlay among its vector analysis processes.



