Spatial statistics

Spatial analysis or spatial statistics includes any of the formal techniques which study entities using their topological, geometric, or geographic properties. Spatial analysis includes a variety of techniques, many still in their early development, using different analytic approaches and applied in fields as diverse as astronomy, with its studies of the placement of galaxies in the cosmos, and chip fabrication engineering, with its use of "place and route" algorithms to build complex wiring structures. In a more restricted sense, spatial analysis is the technique applied to structures at the human scale, most notably in the analysis of geographic data or transcriptomics data. Complex issues arise in spatial analysis, many of which are neither clearly defined nor completely resolved, but form the basis for current research. The most fundamental of these is the problem of defining the spatial location of the entities being studied. Classification of the techniques of spatial analysis is difficult because of the large number of different fields of research involved, the different fundamental approaches which can be chosen, and the many forms the data can take.


History

Spatial analysis began with early attempts at cartography and surveying. Land surveying goes back to at least 1400 BC in Egypt, where the dimensions of taxable land plots were measured with measuring ropes and plumb bobs. Many fields have contributed to its rise in modern form. Biology contributed through botanical studies of global plant distributions and local plant locations, ethological studies of animal movement, landscape ecological studies of vegetation blocks, ecological studies of spatial population dynamics, and the study of biogeography. Epidemiology contributed with early work on disease mapping, notably John Snow's mapping of a cholera outbreak, with research on mapping the spread of disease and with location studies for health care delivery. Statistics has contributed greatly through work in spatial statistics. Economics has contributed notably through spatial econometrics. Geographic information systems are currently a major contributor due to the importance of geographic software in the modern analytic toolbox. Remote sensing has contributed extensively in morphometric and clustering analysis. Computer science has contributed extensively through the study of algorithms, notably in computational geometry. Mathematics continues to provide the fundamental tools for analysis and to reveal the complexity of the spatial realm, for example, with recent work on fractals and scale invariance. Scientific modelling provides a useful framework for new approaches.


Fundamental issues

Spatial analysis confronts many fundamental issues in the definition of its objects of study, in the construction of the analytic operations to be used, in the use of computers for analysis, in the limitations and particularities of the analyses which are known, and in the presentation of analytic results. Many of these issues are active subjects of modern research. Common errors often arise in spatial analysis, some due to the mathematics of space, some due to the particular ways data are presented spatially, and some due to the tools which are available. Census data, because it protects individual privacy by aggregating data into local units, raises a number of statistical issues. The fractal nature of a coastline makes precise measurement of its length difficult if not impossible. Computer software that fits straight lines to the curve of a coastline can easily calculate the lengths of the lines it defines. However, these straight lines may have no inherent meaning in the real world, as was shown for the coastline of Britain. These problems represent a challenge in spatial analysis because of the power of maps as media of presentation. When results are presented as maps, the presentation combines spatial data which are generally accurate with analytic results which may be inaccurate, leading to an impression that analytic results are more accurate than the data would indicate.


Spatial characterization

The definition of the spatial presence of an entity constrains the possible analysis which can be applied to that entity and influences the final conclusions that can be reached. While this property is fundamentally true of all analysis, it is particularly important in spatial analysis because the tools to define and study entities favor specific characterizations of the entities being studied. Statistical techniques favor the spatial definition of objects as points because there are very few statistical techniques which operate directly on line, area, or volume elements. Computer tools favor the spatial definition of objects as homogeneous and separate elements because of the limited number of database elements and computational structures available, and the ease with which these primitive structures can be created.


Spatial dependence

Spatial dependence is the spatial relationship of variable values (for themes defined over space, such as rainfall) or locations (for themes defined as objects, such as cities). Spatial dependence is measured as the existence of statistical dependence in a collection of random variables, each of which is associated with a different geographical location. Spatial dependence is of importance in applications where it is reasonable to postulate the existence of a corresponding set of random variables at locations that have not been included in a sample. Thus rainfall may be measured at a set of rain gauge locations, and such measurements can be considered as outcomes of random variables, but rainfall clearly occurs at other locations and would again be random. Because rainfall exhibits properties of autocorrelation, spatial interpolation techniques can be used to estimate rainfall amounts at locations near measured locations.

As with other types of statistical dependence, the presence of spatial dependence generally leads to estimates of an average value from a sample being less accurate than had the samples been independent, although if negative dependence exists a sample average can be better than in the independent case. A different problem from that of estimating an overall average is that of spatial interpolation: here the problem is to estimate the unobserved random outcomes of variables at locations intermediate to places where measurements are made, on the assumption that there is spatial dependence between the observed and unobserved random variables.

Tools for exploring spatial dependence include spatial correlation, spatial covariance functions and semivariograms. Methods for spatial interpolation include kriging, which is a type of best linear unbiased prediction. The topic of spatial dependence is of importance to geostatistics and spatial analysis.
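
As an illustration of one of the exploratory tools named above, the following is a minimal sketch of an empirical semivariogram: for every pair of locations it bins the separation distance and averages half the squared difference of the values. The function name, the bin width and the rain gauge readings are all hypothetical, chosen only to make the example self-contained.

```python
import math
from collections import defaultdict

def empirical_semivariogram(points, values, bin_width):
    """Return {bin_index: semivariance}; bin b covers distances
    in the half-open interval [b * bin_width, (b + 1) * bin_width)."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    n = len(points)
    for i in range(n):
        for j in range(i + 1, n):
            h = math.dist(points[i], points[j])   # separation distance
            b = int(h // bin_width)
            sums[b] += (values[i] - values[j]) ** 2
            counts[b] += 1
    # Semivariance is half the mean squared difference within each bin.
    return {b: sums[b] / (2 * counts[b]) for b in sorted(counts)}

# Hypothetical rain gauges along a transect, with made-up readings.
gauges = [(0, 0), (1, 0), (2, 0), (3, 0), (4, 0)]
rain = [10.0, 11.0, 13.0, 16.0, 20.0]

gamma = empirical_semivariogram(gauges, rain, bin_width=1.0)
for b, g in gamma.items():
    print(f"lag bin {b}: semivariance {g:.2f}")
```

In this made-up data the semivariance grows with separation distance, which is the signature of positive spatial dependence: nearby gauges record more similar values than distant ones.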


Spatial auto-correlation

Spatial dependency is the co-variation of properties within geographic space: characteristics at proximal locations appear to be correlated, either positively or negatively. Spatial dependency leads to the spatial autocorrelation problem in statistics since, like temporal autocorrelation, it violates the assumption of independence among observations on which standard statistical techniques rely. For example, regression analyses that do not compensate for spatial dependency can have unstable parameter estimates and yield unreliable significance tests. Spatial regression models (see below) capture these relationships and do not suffer from these weaknesses. It is also appropriate to view spatial dependency as a source of information rather than something to be corrected. Locational effects also manifest as spatial heterogeneity, or the apparent variation in a process with respect to location in geographic space. Unless a space is uniform and boundless, every location will have some degree of uniqueness relative to the other locations. This affects the spatial dependency relations and therefore the spatial process. Spatial heterogeneity means that overall parameters estimated for the entire system may not adequately describe the process at any given location.


Spatial association

Spatial association is the degree to which things are similarly arranged in space. Analysis of the distribution patterns of two phenomena is done by map overlay. If the distributions are similar, then the spatial association is strong, and vice versa. In a Geographic Information System, the analysis can be done quantitatively. For example, a set of observations (as points or extracted from raster cells) at matching locations can be intersected and examined by regression analysis. Like spatial autocorrelation, this can be a useful tool for spatial prediction. In spatial modeling, the concept of spatial association allows the use of covariates in a regression equation to predict the geographic field and thus produce a map.


The second dimension of spatial association

The second dimension of spatial association (SDA) reveals the association between spatial variables through extracting geographical information at locations outside samples. SDA effectively uses the missing geographical information outside sample locations in methods of the first dimension of spatial association (FDA), which explore spatial association using observations at sample locations.


Scaling

Spatial measurement scale is a persistent issue in spatial analysis; more detail is available at the modifiable areal unit problem (MAUP) topic entry. Landscape ecologists developed a series of scale invariant metrics for aspects of ecology that are fractal in nature. In more general terms, no scale independent method of analysis is widely agreed upon for spatial statistics.


Sampling

Spatial sampling involves determining a limited number of locations in geographic space for faithfully measuring phenomena that are subject to dependency and heterogeneity. Dependency suggests that since one location can predict the value of another location, we do not need observations in both places. But heterogeneity suggests that this relation can change across space, and therefore we cannot trust an observed degree of dependency beyond a region that may be small. Basic spatial sampling schemes include random, clustered and systematic. These basic schemes can be applied at multiple levels in a designated spatial hierarchy (e.g., urban area, city, neighborhood). It is also possible to exploit ancillary data, for example, using property values as a guide in a spatial sampling scheme to measure educational attainment and income. Spatial models such as autocorrelation statistics, regression and interpolation (see below) can also dictate sample design.
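
The three basic schemes can be sketched as follows for a rectangular study area. This is only an illustration under simple assumptions: the function names, the rectangular extent, the square cluster footprint and the seed are hypothetical, and a real design would also account for the dependency and heterogeneity discussed above.

```python
import random

def random_sample(n, width, height, rng):
    """Simple random sampling: n independent uniform locations."""
    return [(rng.uniform(0, width), rng.uniform(0, height)) for _ in range(n)]

def systematic_sample(nx, ny, width, height):
    """Systematic sampling: one location at the centre of each grid cell."""
    dx, dy = width / nx, height / ny
    return [((i + 0.5) * dx, (j + 0.5) * dy)
            for i in range(nx) for j in range(ny)]

def clustered_sample(n_clusters, per_cluster, radius, width, height, rng):
    """Clustered sampling: random cluster centres, then locations
    scattered within a square of half-width `radius` around each centre."""
    points = []
    for _ in range(n_clusters):
        cx = rng.uniform(radius, width - radius)
        cy = rng.uniform(radius, height - radius)
        for _ in range(per_cluster):
            points.append((cx + rng.uniform(-radius, radius),
                           cy + rng.uniform(-radius, radius)))
    return points

rng = random.Random(42)  # fixed seed so the sketch is reproducible
print(len(random_sample(16, 100.0, 100.0, rng)), "random locations")
print(len(systematic_sample(4, 4, 100.0, 100.0)), "systematic locations")
print(len(clustered_sample(4, 4, 5.0, 100.0, 100.0, rng)), "clustered locations")
```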


Common errors in spatial analysis

The fundamental issues in spatial analysis lead to numerous problems in analysis including bias, distortion and outright errors in the conclusions reached. These issues are often interlinked but various attempts have been made to separate out particular issues from each other.


Length

In discussing the coastline of Britain, Benoit Mandelbrot showed that certain spatial concepts are inherently nonsensical despite presumption of their validity. Lengths in ecology depend directly on the scale at which they are measured and experienced. So while surveyors commonly measure the length of a river, this length only has meaning in the context of the relevance of the measuring technique to the question under study.

[Figures: the coastline of Britain measured using 200 km, 100 km and 50 km linear measurements]
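
The dependence of measured length on the measuring scale can be reproduced with a synthetic fractal. The sketch below uses the Koch curve as a stand-in for a coastline (an assumption made for illustration, not real coastline data): each refinement corresponds to measuring with a ruler one third as long, and the measured length grows by a factor of 4/3 each time instead of converging.

```python
import math

def koch_refine(points):
    """Replace every segment of the polyline with the four-segment Koch motif."""
    out = [points[0]]
    cos60, sin60 = 0.5, math.sqrt(3) / 2
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        dx, dy = (x1 - x0) / 3, (y1 - y0) / 3
        a = (x0 + dx, y0 + dy)              # end of the first third
        b = (x0 + 2 * dx, y0 + 2 * dy)      # start of the last third
        # Apex of the bump: the middle third rotated by 60 degrees about a.
        peak = (a[0] + dx * cos60 - dy * sin60,
                a[1] + dx * sin60 + dy * cos60)
        out += [a, peak, b, (x1, y1)]
    return out

def polyline_length(points):
    return sum(math.dist(p, q) for p, q in zip(points, points[1:]))

curve = [(0.0, 0.0), (1.0, 0.0)]
for level in range(5):
    ruler = 3.0 ** -level   # segment length at this level of detail
    print(f"ruler {ruler:.4f}  measured length {polyline_length(curve):.4f}")
    curve = koch_refine(curve)
```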


Locational fallacy

The locational fallacy refers to error due to the particular spatial characterization chosen for the elements of study, in particular the choice of placement for the spatial presence of the element. Spatial characterizations may be simplistic or even wrong. Studies of humans often reduce the spatial existence of humans to a single point, for instance their home address. This can easily lead to poor analysis, for example, when considering disease transmission which can happen at work or at school and therefore far from the home. The spatial characterization may implicitly limit the subject of study. For example, the spatial analysis of crime data has recently become popular, but these studies can only describe the particular kinds of crime which can be described spatially. This leads to many maps of assault but not to any maps of embezzlement, with political consequences for the conceptualization of crime and the design of policies to address the issue.


Atomic fallacy

This describes errors due to treating elements as separate 'atoms' outside of their spatial context. The fallacy is about transferring individual conclusions to spatial units.


Ecological fallacy

The ecological fallacy describes errors due to performing analyses on aggregate data when trying to reach conclusions on the individual units. Errors occur in part from spatial aggregation. For example, a pixel represents the average surface temperature within an area. The ecological fallacy would be to assume that all points within the area have the same temperature.
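
A small numerical sketch can make the gap between aggregate and individual conclusions concrete. The data below are invented for illustration: three hypothetical areas, each holding three individuals with two attributes. The area averages are perfectly positively correlated, while inside every area the individual values are perfectly negatively correlated, so inferring the individual relationship from the averages would be wrong.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical individuals (attribute_x, attribute_y) grouped by area.
areas = {
    "A": [(1, 3), (2, 2), (3, 1)],
    "B": [(4, 6), (5, 5), (6, 4)],
    "C": [(7, 9), (8, 8), (9, 7)],
}

# Aggregate level: one (mean_x, mean_y) pair per area.
area_means = [
    (sum(x for x, _ in pts) / len(pts), sum(y for _, y in pts) / len(pts))
    for pts in areas.values()
]
between = pearson([m[0] for m in area_means], [m[1] for m in area_means])

# Individual level: correlation inside each area.
within = {
    name: pearson([x for x, _ in pts], [y for _, y in pts])
    for name, pts in areas.items()
}

print("correlation of area averages:", round(between, 3))
for name, r in within.items():
    print(f"correlation inside area {name}:", round(r, 3))
```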


Solutions to the fundamental issues


Geographic space

A mathematical space exists whenever we have a set of observations and quantitative measures of their attributes. For example, we can represent individuals' incomes or years of education within a coordinate system where the location of each individual can be specified with respect to both dimensions. The distance between individuals within this space is a quantitative measure of their differences with respect to income and education. However, in spatial analysis, we are concerned with specific types of mathematical spaces, namely, geographic space. In geographic space, the observations correspond to locations in a spatial measurement framework that capture their proximity in the real world. The locations in a spatial measurement framework often represent locations on the surface of the Earth, but this is not strictly necessary. A spatial measurement framework can also capture proximity with respect to, say, interstellar space or within a biological entity such as a liver. The fundamental tenet is Tobler's First Law of Geography: if the interrelation between entities increases with proximity in the real world, then representation in geographic space and assessment using spatial analysis techniques are appropriate.

The Euclidean distance between locations often represents their proximity, although this is only one possibility. There are an infinite number of distances in addition to Euclidean that can support quantitative analysis. For example, "Manhattan" (or "Taxicab") distances, where movement is restricted to paths parallel to the axes, can be more meaningful than Euclidean distances in urban settings. In addition to distances, other geographic relationships such as connectivity (e.g., the existence or degree of shared borders) and direction can also influence the relationships among entities. It is also possible to compute minimal cost paths across a cost surface; for example, this can represent proximity among locations when travel must occur across rugged terrain.

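The difference between the Euclidean and Manhattan distances discussed in the preceding section can be shown with a short sketch. The function names and the two locations are hypothetical; coordinates are assumed to be planar and expressed in the same units.

```python
import math

def euclidean(p, q):
    """Straight-line distance between two planar points."""
    return math.hypot(q[0] - p[0], q[1] - p[1])

def manhattan(p, q):
    """Distance when movement is restricted to paths parallel to the axes."""
    return abs(q[0] - p[0]) + abs(q[1] - p[1])

home, office = (0, 0), (3, 4)   # hypothetical locations on a street grid
print(euclidean(home, office))  # 5.0
print(manhattan(home, office))  # 7
```

On a regular street grid the Manhattan value is the length of the shortest walkable route, which is why it can be the more meaningful measure of proximity in urban settings.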

Types

"Spatial data comes in many varieties and it is not easy to arrive at a system of classification that is simultaneously exclusive, exhaustive, imaginative, and satisfying." -- G. Upton & B. Fingelton (Graham J. Upton & Bernard Fingelton: Spatial Data Analysis by Example, Volume 1: Point Pattern and Quantitative Data. John Wiley & Sons, New York, 1985.)


Spatial data analysis

Urban and Regional Studies deal with large tables of spatial data obtained from censuses and surveys. It is necessary to simplify the huge amount of detailed information in order to extract the main trends. Multivariable analysis (or factor analysis, FA) allows a change of variables, transforming the many variables of the census, usually correlated between themselves, into fewer independent "Factors" or "Principal Components" which are, actually, the eigenvectors of the data correlation matrix weighted by the inverse of their eigenvalues. This change of variables has two main advantages:

1. Since information is concentrated on the first new factors, it is possible to keep only a few of them while losing only a small amount of information; mapping them produces fewer and more significant maps.
2. The factors, actually the eigenvectors, are orthogonal by construction, i.e. not correlated. In most cases, the dominant factor (with the largest eigenvalue) is the Social Component, separating rich and poor in the city. Since factors are not correlated, other smaller processes than social status, which would have remained hidden otherwise, appear on the second, third, ... factors.

Factor analysis depends on measuring distances between observations: the choice of a significant metric is crucial. The Euclidean metric (Principal Component Analysis), the Chi-Square distance (Correspondence Analysis) and the Generalized Mahalanobis distance (Discriminant Analysis) are among the more widely used. More complicated models, using communalities or rotations, have been proposed.

The use of multivariate methods in spatial analysis really began in the 1950s (although some examples go back to the beginning of the century) and culminated in the 1970s, with the increasing power and accessibility of computers. Already in 1948, in a seminal publication, two sociologists, Wendell Bell and Eshref Shevky, had shown that most city populations in the US and in the world could be represented with three independent factors: (1) the "socio-economic status", opposing rich and poor districts and distributed in sectors running along highways from the city center; (2) the "life cycle", i.e. the age structure of households, distributed in concentric circles; and (3) "race and ethnicity", identifying patches of migrants located within the city. In 1961, in a groundbreaking study, British geographers used FA to classify British towns. Brian J. Berry, at the University of Chicago, and his students made wide use of the method, applying it to most of the important cities in the world and exhibiting common social structures. The use of factor analysis in geography, made so easy by modern computers, has been very wide but not always very wise.

Since the vectors extracted are determined by the data matrix, it is not possible to compare factors obtained from different censuses. One solution consists in fusing several census matrices into a single table which may then be analyzed. This, however, assumes that the definition of the variables has not changed over time, and it produces very large tables that are difficult to manage. A better solution, proposed by psychometricians, groups the data in a "cubic matrix" with three entries (for instance, locations, variables, time periods). A Three-Way Factor Analysis then produces three groups of factors related by a small cubic "core matrix". This method, which exhibits data evolution over time, has not been widely used in geography. In Los Angeles, however, it has exhibited the role, traditionally ignored, of Downtown as an organizing center for the whole city during several decades.
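
The change of variables described in this section can be illustrated on a toy table. The sketch below is a simplification rather than a full factor analysis: it builds the correlation matrix of three invented census variables and extracts only the dominant eigenvector by power iteration (a technique swapped in here to keep the example dependency-free). The tract values, variable order and function names are all hypothetical.

```python
import math

def correlation_matrix(rows):
    """Pearson correlation matrix of the columns of `rows`."""
    n, m = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(m)]
    sds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / n)
           for j in range(m)]
    z = [[(r[j] - means[j]) / sds[j] for j in range(m)] for r in rows]
    return [[sum(zr[a] * zr[b] for zr in z) / n for b in range(m)]
            for a in range(m)]

def mat_vec(matrix, v):
    return [sum(a * b for a, b in zip(row, v)) for row in matrix]

def dominant_eigenpair(matrix, iterations=500):
    """Power iteration: largest eigenvalue and its unit eigenvector."""
    m = len(matrix)
    v = [1.0 / math.sqrt(m)] * m
    for _ in range(iterations):
        w = mat_vec(matrix, v)
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigenvalue = sum(a * b for a, b in zip(v, mat_vec(matrix, v)))
    return eigenvalue, v

# Hypothetical census tracts: (income, years of education, household size).
tracts = [
    (20, 9, 4.1),
    (35, 11, 2.9),
    (50, 12, 3.6),
    (65, 14, 2.4),
    (80, 16, 3.2),
    (95, 17, 2.7),
]

corr = correlation_matrix(tracts)
eigenvalue, loadings = dominant_eigenpair(corr)

# For a correlation matrix the eigenvalues sum to the number of variables,
# so this ratio is the share of total variance carried by the first factor.
print("share of variance on first factor:", round(eigenvalue / len(corr), 3))
print("loadings:", [round(x, 3) for x in loadings])
```

In this invented table income and education load on the first factor with the same sign and household size with the opposite sign, which loosely mirrors the "social" component described above.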


Spatial autocorrelation

Spatial autocorrelation statistics measure and analyze the degree of dependency among observations in a geographic space. Classic spatial autocorrelation statistics include Moran's I, Geary's C, Getis's G and the standard deviational ellipse. These statistics require constructing a spatial weights matrix that reflects the intensity of the geographic relationship between observations in a neighborhood, e.g., the distances between neighbors, the lengths of shared border, or whether they fall into a specified directional class such as "west". Classic spatial autocorrelation statistics compare the spatial weights to the covariance relationship at pairs of locations. Spatial autocorrelation that is more positive than expected from random indicates the clustering of similar values across geographic space, while significant negative spatial autocorrelation indicates that neighboring values are more dissimilar than expected by chance, suggesting a spatial pattern similar to a chess board.

Spatial autocorrelation statistics such as Moran's I and Geary's C are global in the sense that they estimate the overall degree of spatial autocorrelation for a dataset. The possibility of spatial heterogeneity suggests that the estimated degree of autocorrelation may vary significantly across geographic space. Local spatial autocorrelation statistics provide estimates disaggregated to the level of the spatial analysis units, allowing assessment of the dependency relationships across space. G statistics compare neighborhoods to a global average and identify local regions of strong autocorrelation. Local versions of the I and C statistics are also available.
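
As a concrete illustration of a global statistic, the sketch below computes Moran's I from its usual definition, I = (n / W) * sum_ij w_ij (x_i - mean)(x_j - mean) / sum_i (x_i - mean)^2, where W is the sum of all weights. The binary weights for a row of cells and the two value patterns are hypothetical, and no significance test is included.

```python
def morans_i(values, weights):
    """Global Moran's I for `values` given a square spatial weights matrix."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    total_w = sum(sum(row) for row in weights)
    cross = sum(weights[i][j] * dev[i] * dev[j]
                for i in range(n) for j in range(n))
    return (n / total_w) * cross / sum(d * d for d in dev)

def chain_weights(n):
    """Binary contiguity for n cells in a row: w_ij = 1 if cells touch."""
    return [[1 if abs(i - j) == 1 else 0 for j in range(n)] for i in range(n)]

w = chain_weights(6)
clustered = [1, 1, 1, 9, 9, 9]     # similar values sit next to each other
alternating = [1, 9, 1, 9, 1, 9]   # one-dimensional "chess board"

print(round(morans_i(clustered, w), 3))    # 0.6
print(round(morans_i(alternating, w), 3))  # -1.0
```

The clustered pattern yields a positive value and the alternating pattern a negative one, matching the interpretation given above.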


Spatial heterogeneity


Spatial interpolation

Spatial interpolation methods estimate the variables at unobserved locations in geographic space based on the values at observed locations. Basic methods include inverse distance weighting, which attenuates the influence of an observed value as distance from the observed location increases. Kriging is a more sophisticated method that interpolates across space according to a spatial lag relationship that has both systematic and random components. This can accommodate a wide range of spatial relationships for the hidden values between observed locations. Kriging provides optimal estimates given the hypothesized lag relationship, and error estimates can be mapped to determine if spatial patterns exist.
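Inverse distance weighting can be sketched in a few lines. The example below is a minimal version that assumes planar coordinates, a power parameter of 2 and weights of the form 1/d^power; the function name `idw` and the sample points are illustrative only.

```python
import math


def idw(known, target, power=2.0):
    """Inverse distance weighted estimate at `target`.

    known  -- list of (x, y, value) observations
    target -- (x, y) location to estimate
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for x, y, value in known:
        d = math.hypot(target[0] - x, target[1] - y)
        if d == 0.0:
            return value  # target coincides with an observed location
        w = 1.0 / d ** power  # influence decays as distance grows
        weighted_sum += w * value
        weight_total += w
    return weighted_sum / weight_total


samples = [(0.0, 0.0, 10.0), (2.0, 0.0, 20.0)]
midpoint = idw(samples, (1.0, 0.0))    # equidistant from both samples
near_first = idw(samples, (0.5, 0.0))  # pulled toward the first sample
```

The midpoint estimate is the plain average of the two samples, while the estimate at (0.5, 0) lies much closer to the nearer observation. Unlike kriging, this deterministic method yields no error estimate.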


Spatial regression

Spatial regression methods capture spatial dependency in regression analysis, avoiding statistical problems such as unstable parameters and unreliable significance tests, as well as providing information on spatial relationships among the variables involved. Depending on the specific technique, spatial dependency can enter the regression model as relationships between the independent variables and the dependent variable, between the dependent variable and a spatial lag of itself, or in the error terms. Geographically weighted regression (GWR) is a local version of spatial regression that generates parameters disaggregated by the spatial units of analysis. This allows assessment of the spatial heterogeneity in the estimated relationships between the independent and dependent variables. The use of Bayesian hierarchical modeling in conjunction with Markov chain Monte Carlo (MCMC) methods has recently been shown to be effective in modeling complex relationships using Poisson-Gamma-CAR, Poisson-lognormal-SAR, or overdispersed logit models. Statistical packages for implementing such Bayesian models using MCMC include WinBUGS, CrimeStat and many packages available for the R programming language. Spatial stochastic processes, such as Gaussian processes, are also increasingly being deployed in spatial regression analysis. Model-based versions of GWR, known as spatially varying coefficient models, have been applied to conduct Bayesian inference. Spatial stochastic processes can be made computationally efficient and scalable through Gaussian process models such as Gaussian predictive processes and nearest neighbor Gaussian processes (NNGP).
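To make the idea of locally disaggregated parameters concrete, the following sketch fits a geographically weighted regression with a single independent variable. It assumes a Gaussian distance kernel with a fixed bandwidth and uses the closed-form weighted least squares solution; the function name `gwr_fit`, the kernel choice and the toy data are assumptions for illustration, and real GWR software also handles bandwidth selection and multiple predictors.

```python
import math


def gwr_fit(coords, x, y, bandwidth):
    """Return one (intercept, slope) pair per location.

    Each local fit is a weighted simple regression in which observations
    closer to the focal location receive larger weights.
    """
    results = []
    for cx, cy in coords:
        # Gaussian kernel weights based on distance to the focal location
        w = [math.exp(-0.5 * (math.hypot(cx - px, cy - py) / bandwidth) ** 2)
             for px, py in coords]
        sw = sum(w)
        mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
        my = sum(wi * yi for wi, yi in zip(w, y)) / sw
        sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
        sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
        slope = sxy / sxx
        results.append((my - slope * mx, slope))
    return results


coords = [(float(i), 0.0) for i in range(8)]
x = [1.0, 3.0, 2.0, 4.0, 1.0, 3.0, 2.0, 4.0]

# Spatially constant relationship: every local fit recovers y = 2 + 3x
uniform = gwr_fit(coords, x, [2.0 + 3.0 * xi for xi in x], bandwidth=1.5)

# Heterogeneous relationship: slope 2 in the west, slope 5 in the east
y_mixed = [(2.0 if i < 4 else 5.0) * xi for i, xi in enumerate(x)]
local = gwr_fit(coords, x, y_mixed, bandwidth=1.5)
```

In the heterogeneous case the local slope at the western end stays near 2 and the slope at the eastern end stays near 5, which is the kind of spatial variation in coefficients that a single global regression would average away.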


Spatial interaction

Spatial interaction models, or "gravity models", estimate the flow of people, material or information between locations in geographic space. Factors can include origin propulsive variables such as the number of commuters in residential areas, destination attractiveness variables such as the amount of office space in employment areas, and proximity relationships between the locations measured in terms such as driving distance or travel time. In addition, the topological, or connective, relationships between areas must be identified, particularly considering the often conflicting relationship between distance and topology; for example, two spatially close neighborhoods may not display any significant interaction if they are separated by a highway. After specifying the functional forms of these relationships, the analyst can estimate model parameters using observed flow data and standard estimation techniques such as ordinary least squares or maximum likelihood. Competing-destinations versions of spatial interaction models include the proximity among the destinations (or origins) in addition to the origin-destination proximity; this captures the effects of destination (origin) clustering on flows. Computational methods such as artificial neural networks can also estimate spatial interaction relationships among locations and can handle noisy and qualitative data.
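The estimation step can be illustrated with a deliberately simple functional form. The sketch below assumes flows follow T = k * O * D / d^beta (unit exponents on the origin and destination variables, power-law distance decay) and estimates k and beta by ordinary least squares on the log-linear form; the function name `fit_gravity` and the synthetic flows are hypothetical.

```python
import math


def fit_gravity(flows):
    """Estimate (k, beta) in T = k * O * D / d**beta by OLS.

    flows -- list of (origin_mass, destination_mass, distance, observed_flow)
    The model is linearised as log(T / (O * D)) = log(k) - beta * log(d).
    """
    xs = [math.log(d) for _, _, d, _ in flows]
    ys = [math.log(t / (o * dest)) for o, dest, _, t in flows]
    n = len(flows)
    mx = sum(xs) / n
    my = sum(ys) / n
    slope = (sum((a - mx) * (b - my) for a, b in zip(xs, ys))
             / sum((a - mx) ** 2 for a in xs))
    intercept = my - slope * mx
    return math.exp(intercept), -slope


# Synthetic flows generated with k = 0.5 and beta = 2 (no noise)
pairs = [(100.0, 50.0, 2.0), (80.0, 120.0, 5.0),
         (60.0, 30.0, 1.5), (200.0, 90.0, 8.0)]
observed = [(o, d_mass, dist, 0.5 * o * d_mass / dist ** 2.0)
            for o, d_mass, dist in pairs]
k_hat, beta_hat = fit_gravity(observed)
```

Because the synthetic flows contain no noise, the fit recovers the generating parameters. With real flow data, zero flows and heteroscedastic errors are common, which is one reason maximum likelihood estimation is often preferred over this log-linear shortcut.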


Simulation and modeling

Spatial interaction models are aggregate and top-down: they specify an overall governing relationship for flow between locations. This characteristic is also shared by urban models such as those based on mathematical programming, flows among economic sectors, or bid-rent theory. An alternative modeling perspective is to represent the system at the highest possible level of disaggregation and study the bottom-up emergence of complex patterns and relationships from behavior and interactions at the individual level.
Complex adaptive systems theory as applied to spatial analysis suggests that simple interactions among proximal entities can lead to intricate, persistent and functional spatial entities at aggregate levels. Two fundamentally spatial simulation methods are cellular automata and agent-based modeling. Cellular automata modeling imposes a fixed spatial framework such as grid cells and specifies rules that dictate the state of a cell based on the states of its neighboring cells. As time progresses, spatial patterns emerge as cells change states based on their neighbors; this alters the conditions for future time periods. For example, cells can represent locations in an urban area and their states can be different types of land use. Patterns that can emerge from the simple interactions of local land uses include office districts and urban sprawl. Agent-based modeling uses software entities (agents) that have purposeful behavior (goals) and can react, interact and modify their environment while seeking their objectives. Unlike the cells in cellular automata, agents can be mobile with respect to space. For example, one could model traffic flow and dynamics using agents representing individual vehicles that try to minimize travel time between specified origins and destinations. While pursuing minimal travel times, the agents must avoid collisions with other vehicles also seeking to minimize their travel times. Cellular automata and agent-based modeling are complementary modeling strategies. They can be integrated into a common geographic automata system where some agents are fixed while others are mobile. Calibration plays a pivotal role in both CA and ABM simulation and modeling approaches. Initial approaches to CA proposed robust calibration approaches based on stochastic Monte Carlo methods. ABM approaches rely on agents' decision rules (in many cases extracted from qualitative research methods such as questionnaires). Recent machine learning algorithms calibrate using training sets, for instance in order to understand the qualities of the built environment.
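A cellular automaton of the kind described above can be sketched with a grid of land-use states and a single neighbourhood rule. In this toy example the rule (a vacant cell becomes urban once at least two of its eight neighbours are urban) and the function name `step` are assumptions chosen for brevity, not a calibrated urban growth model.

```python
def step(grid, threshold=2):
    """Apply one synchronous update to a land-use grid.

    Cells hold 0 (vacant) or 1 (urban). A vacant cell becomes urban when
    at least `threshold` of its eight neighbours are urban; urban cells
    stay urban. Cells outside the grid are treated as vacant.
    """
    rows, cols = len(grid), len(grid[0])
    new = [row[:] for row in grid]  # write to a copy so updates are simultaneous
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] == 1:
                continue
            urban_neighbours = 0
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    if dr == 0 and dc == 0:
                        continue
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < rows and 0 <= cc < cols:
                        urban_neighbours += grid[rr][cc]
            if urban_neighbours >= threshold:
                new[r][c] = 1
    return new


seed = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
]
after_one = step(seed)
urban_count = sum(map(sum, after_one))
```

After one step the 2 x 2 urban core has spread to the cells along its edges while the diagonal corners remain vacant, since each corner touches only one urban cell. Repeated calls to `step` let the pattern keep growing, which is how the conditions for later time periods come to depend on earlier ones.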


Multiple-point geostatistics (MPS)

Spatial analysis of a conceptual geological model is the main purpose of any MPS algorithm. The method analyzes the spatial statistics of the geological model, called the training image, and generates realizations of the phenomena that honor those input multiple-point statistics. A recent MPS algorithm used to accomplish this task is the pattern-based method by Honarkhah. In this method, a distance-based approach is employed to analyze the patterns in the training image. This allows the reproduction of the multiple-point statistics and the complex geometrical features of the training image. Each output of the MPS algorithm is a realization that represents a random field. Together, several realizations may be used to quantify spatial uncertainty. Another recent method, presented by Tahmasebi et al., uses a cross-correlation function to improve the spatial pattern reproduction. They call their MPS simulation method the CCSIM algorithm. This method is able to quantify the spatial connectivity, variability and uncertainty. Furthermore, the method is not sensitive to the type of data and is able to simulate both categorical and continuous scenarios. The CCSIM algorithm can be used for stationary, non-stationary and multivariate systems, and it can produce models of high visual quality.


Geospatial and hydrospatial analysis

Geospatial and hydrospatial analysis, or just spatial analysis, is an approach to applying statistical analysis and other analytic techniques to data which has a geographical or spatial aspect. Such analysis would typically employ software capable of rendering maps, processing spatial data, and applying analytical methods to terrestrial or geographic datasets, including the use of geographic information systems and geomatics.


Geographical information system usage

Geographic information systems (GIS), a large domain that provides a variety of capabilities designed to capture, store, manipulate, analyze, manage, and present all types of geographical data, use geospatial and hydrospatial analysis in a variety of contexts, operations and applications.


Basic applications

Geospatial and hydrospatial analysis, using GIS, was developed for problems in the environmental and life sciences, in particular ecology, geology and epidemiology. It has extended to almost all industries, including defense, intelligence, utilities, natural resources (e.g. oil and gas, forestry), social sciences, medicine and public safety (e.g. emergency management and criminology), disaster risk reduction and management (DRRM), and climate change adaptation (CCA). Spatial statistics typically result from observation rather than experimentation. The term hydrospatial is used particularly for the aquatic side, covering the water surface, column, bottom, sub-bottom and the coastal zones.


Basic operations

Vector-based GIS is typically related to operations such as map overlay (combining two or more maps or map layers according to predefined rules), simple buffering (identifying regions of a map within a specified distance of one or more features, such as towns, roads or rivers) and similar basic operations. This reflects (and is reflected in) the use of the term spatial analysis within the Open Geospatial Consortium (OGC) "simple feature specifications". For raster-based GIS, widely used in the environmental sciences and remote sensing, this typically means a range of actions applied to the grid cells of one or more maps (or images), often involving filtering and/or algebraic operations (map algebra). These techniques involve processing one or more raster layers according to simple rules, resulting in a new map layer, for example replacing each cell value with some combination of its neighbours' values, or computing the sum or difference of specific attribute values for each grid cell in two matching raster datasets. Descriptive statistics, such as cell counts, means, variances, maxima, minima, cumulative values, frequencies and a number of other measures and distance computations, are also often included in the generic term spatial analysis. Spatial analysis includes a large variety of statistical techniques (descriptive, exploratory, and explanatory statistics) that apply to data that vary spatially and which can vary over time. More advanced statistical techniques include Getis-Ord Gi* and Anselin Local Moran's I, which are used to determine clustering patterns of spatially referenced data.
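The two raster operations mentioned above, a cell-by-cell difference and a neighbourhood combination, can be sketched on small grids held as nested lists. The function names `local_difference` and `focal_mean`, the 3 x 3 window and the edge handling are assumptions made for this example; GIS packages offer many variants.

```python
def local_difference(a, b):
    """Cell-by-cell difference of two matching rasters (a local operation)."""
    return [[x - y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


def focal_mean(raster):
    """Replace each cell with the mean of its 3 x 3 neighbourhood.

    Edge and corner cells use only the neighbours that exist, so the
    window shrinks at the border instead of padding with invented values.
    """
    rows, cols = len(raster), len(raster[0])
    out = []
    for r in range(rows):
        out_row = []
        for c in range(cols):
            window = [raster[rr][cc]
                      for rr in range(max(0, r - 1), min(rows, r + 2))
                      for cc in range(max(0, c - 1), min(cols, c + 2))]
            out_row.append(sum(window) / len(window))
        out.append(out_row)
    return out


layer_a = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
layer_b = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]
change = local_difference(layer_a, layer_b)
smoothed = focal_mean(layer_a)
```

Both functions return a new layer with the same dimensions as the input, which is what allows map algebra operations to be chained.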


Advanced operations

Geospatial and hydrospatial analysis goes beyond 2D and 3D mapping operations and spatial statistics. It is multi-dimensional and also temporal, and includes:
* Surface analysis: in particular, analysing the properties of physical surfaces, such as gradient, aspect and visibility, and analysing surface-like data "fields";
* Network analysis: examining the properties of natural and man-made networks in order to understand the behaviour of flows within and around such networks; and locational analysis. GIS-based network analysis may be used to address a wide range of practical problems such as route selection and facility location (core topics in the field of operations research), and problems involving flows such as those found in hydrospatial, hydrology and transportation research. In many instances location problems relate to networks and as such are addressed with tools designed for this purpose, but in others existing networks may have little or no relevance or may be impractical to incorporate within the modeling process. Problems that are not specifically network constrained, such as new road or pipeline routing, regional warehouse location, mobile phone mast positioning or the selection of rural community health care sites, may be effectively analysed (at least initially) without reference to existing physical networks. Locational analysis "in the plane" is also applicable where suitable network datasets are not available, or are too large or expensive to be utilised, or where the location algorithm is very complex or involves the examination or simulation of a very large number of alternative configurations.
* Geovisualization: the creation and manipulation of images, maps, diagrams, charts, 3D views and their associated tabular datasets. GIS packages increasingly provide a range of such tools, providing static or rotating views, draping images over 2.5D surface representations, providing animations and fly-throughs, dynamic linking and brushing, and spatio-temporal visualisations. This latter class of tools is the least developed, reflecting in part the limited range of suitable compatible datasets and the limited set of analytical methods available, although this picture is changing rapidly.
All these facilities augment the core tools utilised in spatial analysis throughout the analytical process (exploration of data, identification of patterns and relationships, construction of models, and communication of results).
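As a small example of surface analysis, the sketch below derives slope from a gridded elevation model. It assumes square cells, uses central differences on interior cells only, and reports slope in degrees; the function name `slope_degrees` and the synthetic inclined plane are illustrative assumptions, and production tools typically use larger kernels and also derive aspect.

```python
import math


def slope_degrees(dem, cell_size):
    """Slope in degrees for interior cells of an elevation grid.

    The gradient is approximated with central differences in the column
    (x) and row (y) directions. Border cells are left as None because
    they lack a neighbour on one side.
    """
    rows, cols = len(dem), len(dem[0])
    out = [[None] * cols for _ in range(rows)]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            dz_dx = (dem[r][c + 1] - dem[r][c - 1]) / (2 * cell_size)
            dz_dy = (dem[r + 1][c] - dem[r - 1][c]) / (2 * cell_size)
            out[r][c] = math.degrees(math.atan(math.hypot(dz_dx, dz_dy)))
    return out


# Synthetic surface: elevation rises 10 units per 10-unit cell along x
dem = [[10 * c for c in range(4)] for _ in range(4)]
slopes = slope_degrees(dem, cell_size=10)
```

On this inclined plane every interior cell has a rise-over-run of 1, so the computed slope is 45 degrees.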


Mobile geospatial and hydrospatial computing

Traditionally, geospatial and hydrospatial computing has been performed primarily on personal computers (PCs) or servers. Due to the increasing capabilities of mobile devices, however, geospatial computing on mobile devices is a fast-growing trend. The portable nature of these devices, as well as the presence of useful sensors, such as Global Navigation Satellite System (GNSS) receivers and barometric pressure sensors, makes them useful for capturing and processing geospatial and hydrospatial information in the field. In addition to the local processing of geospatial information on mobile devices, another growing trend is cloud-based geospatial computing. In this architecture, data can be collected in the field using mobile devices and then transmitted to cloud-based servers for further processing and ultimate storage. In a similar manner, geospatial and hydrospatial information can be made available to connected mobile devices via the cloud, allowing access to vast databases of geospatial and hydrospatial information wherever a wireless data connection is available.


Geographic information science and spatial analysis

Geographic information systems (GIS) and the underlying geographic information science that advances these technologies have a strong influence on spatial analysis. The increasing ability to capture and handle geographic data means that spatial analysis is occurring within increasingly data-rich environments. Geographic data capture systems include remotely sensed imagery, environmental monitoring systems such as intelligent transportation systems, and location-aware technologies such as mobile devices that can report location in near-real time. GIS provide platforms for managing these data, computing spatial relationships such as distance, connectivity and directional relationships between spatial units, and visualizing both the raw data and spatial analytic results within a cartographic context. Subtypes include:
* Geovisualization (GVis) combines scientific visualization with digital cartography to support the exploration and analysis of geographic data and information, including the results of spatial analysis or simulation. GVis leverages the human orientation towards visual information processing in the exploration, analysis and communication of geographic data and information. In contrast with traditional cartography, GVis is typically three- or four-dimensional (the latter including time) and user-interactive.
* Geographic knowledge discovery (GKD) is the human-centered process of applying efficient computational tools for exploring massive spatial databases. GKD includes geographic data mining, but also encompasses related activities such as data selection, data cleaning and pre-processing, and interpretation of results. GVis can also serve a central role in the GKD process. GKD is based on the premise that massive databases contain interesting (valid, novel, useful and understandable) patterns that standard analytical techniques cannot find. GKD can serve as a hypothesis-generating process for spatial analysis, producing tentative patterns and relationships that should be confirmed using spatial analytical techniques.
* Spatial decision support systems (SDSS) take existing spatial data and use a variety of mathematical models to make projections into the future. This allows urban and regional planners to test intervention decisions prior to implementation.


See also

General topics:
* Buffer analysis
* Cartography
* Complete spatial randomness
* Concepts and Techniques in Modern Geography
* Cost distance analysis
* GeoComputation
* Geospatial intelligence
* Geospatial predictive modeling
* Dimensionally Extended nine-Intersection Model (DE-9IM)
* Geographic information science
* Mathematical statistics
* Modifiable areal unit problem
* Point process
* Proximity analysis
* Spatial autocorrelation
* Spatial descriptive statistics
* Spatial relation
* Technical geography
* Terrain analysis
* Tobler's first law of geography
* Tobler's second law of geography
* List of spatial analysis software

Specific applications:
* Boundary problem (in spatial analysis)
* Extrapolation domain analysis
* Fuzzy architectural spatial analysis
* Geodemographic segmentation
* Geographic information systems
* Geoinformatics
* Geostatistics
* Permeability (spatial and transport planning)
* Spatial econometrics
* Spatial epidemiology
* Suitability analysis
* Viewshed analysis


References


Further reading

* Abler, R., J. Adams, and P. Gould (1971) ''Spatial Organization–The Geographer's View of the World'', Englewood Cliffs, NJ: Prentice-Hall.
* Anselin, L. (1995) "Local indicators of spatial association – LISA". ''Geographical Analysis'', 27, 93–115.
* Benenson, I. and P. M. Torrens (2004) ''Geosimulation: Automata-Based Modeling of Urban Phenomena''. Wiley.
* Fotheringham, A. S., C. Brunsdon and M. Charlton (2000) ''Quantitative Geography: Perspectives on Spatial Data Analysis'', Sage.
* Fotheringham, A. S. and M. E. O'Kelly (1989) ''Spatial Interaction Models: Formulations and Applications'', Kluwer Academic.
* MacEachren, A. M. and D. R. F. Taylor (eds.) (1994) ''Visualization in Modern Cartography'', Pergamon.
* Levine, N. (2010) ''CrimeStat: A Spatial Statistics Program for the Analysis of Crime Incident Locations''. Version 3.3. Ned Levine & Associates, Houston, TX and the National Institute of Justice, Washington, DC. Ch. 1-17 + 2 update chapters.
* Miller, H. J. and J. Han (eds.) (2001) ''Geographic Data Mining and Knowledge Discovery'', Taylor and Francis.
* O'Sullivan, D. and D. Unwin (2002) ''Geographic Information Analysis'', Wiley.
* Fisher MM, Leung Y (2001) Geocomputational Modelling: techniques and applications. Springer Verlag, Berlin.
* Openshaw S and Abrahart RJ (2000) GeoComputation. CRC Press.
* Diappi Lidia (2004) Evolving Cities: Geocomputation in Territorial Planning. Ashgate, England.
* Longley PA, Brooks SM, McDonnell R, Macmillan B (1998) Geocomputation, a primer. John Wiley and Sons, Chichester.
* Murgante B., Borruso G., Lapucci A. (2009) "Geocomputation and Urban Planning" ''Studies in Computational Intelligence'', Vol. 176. Springer-Verlag, Berlin.
* Fischer M., Leung Y. (2010) "GeoComputational Modelling: Techniques and Applications" Advances in Spatial Science. Springer-Verlag, Berlin.
* Murgante B., Borruso G., Lapucci A. (2011) "Geocomputation, Sustainability and Environmental Planning" ''Studies in Computational Intelligence'', Vol. 348. Springer-Verlag, Berlin.


External links


ICA Commission on Geospatial Analysis and Modeling

An educational resource about spatial statistics and geostatistics

A comprehensive guide to principles, techniques & software tools

Social and Spatial Inequalities

National Center for Geographic Information and Analysis (NCGIA)

International Cartographic Association (ICA), the world body for mapping and GIScience professionals