The modifiable areal unit problem (MAUP) is a source of statistical bias that can significantly impact the results of statistical hypothesis tests. MAUP affects results when point-based measures of spatial phenomena, such as population density or illness rates, are aggregated into districts. The resulting summary values (e.g., totals, rates, proportions, densities) are influenced by both the shape and scale of the aggregation unit. For example, census data may be aggregated into county districts, census tracts, postcode areas, police precincts, or any other arbitrary spatial partition. Thus the results of data aggregation depend on the mapmaker's choice of which "modifiable areal unit" to use in their analysis. A census choropleth map calculating population density using state boundaries will yield radically different results from a map that calculates density based on county boundaries. Furthermore, census district boundaries are also subject to change over time, meaning the MAUP must be considered when comparing past data to current data.
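The state-versus-county contrast can be made concrete with a small sketch. The county names, populations, and areas below are invented for illustration only; the point is that a single coarse unit reports one density while its finer sub-units report very different ones.

```python
# Hypothetical counties inside one hypothetical state:
# name -> (population, area in square kilometres). All values are made up.
counties = {
    "A": (900_000, 100.0),
    "B": (50_000, 400.0),
    "C": (30_000, 500.0),
    "D": (20_000, 1000.0),
}


def density(population, area_km2):
    """Population per square kilometre."""
    return population / area_km2


# Fine scale: one density value per county.
county_density = {name: density(p, a) for name, (p, a) in counties.items()}

# Coarse scale: the same people and land summarised as a single state value.
state_population = sum(p for p, _ in counties.values())
state_area = sum(a for _, a in counties.values())
state_density = density(state_population, state_area)

for name, value in county_density.items():
    print(f"county {name}: {value:.0f} people per km^2")
print(f"whole state: {state_density:.0f} people per km^2")
```

A choropleth map drawn at the state level would shade the whole area with the single state value, hiding the spread between the densest and sparsest counties.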


Background

The issue was first recognized by Gehlke and Biehl in 1934 and later described in detail in an entry in the Concepts and Techniques in Modern Geography (CATMOG) series by Stan Openshaw (1984) and in the book by Giuseppe Arbia (1988). In particular, Openshaw (1984) observed that "the areal units (zonal objects) used in many geographical studies are arbitrary, modifiable, and subject to the whims and fancies of whoever is doing, or did, the aggregating". The problem is especially apparent when the aggregate data are used for cluster analysis for spatial epidemiology, spatial statistics or choropleth mapping, in which misinterpretations can easily be made without realizing it. Many fields of science, especially human geography, are prone to disregard the MAUP when drawing inferences from statistics based on aggregated data. MAUP is closely related to the topic of ecological fallacy and ecological bias (Arbia, 1988).

Ecological bias caused by MAUP has been documented as two separate effects that usually occur simultaneously during the analysis of aggregated data. First, the scale effect causes variation in statistical results between different levels of aggregation (radial distance). Therefore, the association between variables depends on the size of areal units for which data are reported. Generally, correlation increases as areal unit size increases. Second, the zone effect describes variation in correlation statistics caused by the regrouping of data into different configurations at the same scale (areal shape).

Since the 1930s, research has found extra variation in statistical results because of the MAUP. The standard methods of calculating within-group and between-group variance do not account for the extra variance seen in MAUP studies as the groupings change. Repeating an analysis over multiple sets of spatial groupings can be used to calculate upper and lower limits, as well as average values, of regression parameters. The MAUP is a critical source of error in spatial studies, whether observational or experimental. As such, unit consistency, particularly in a time-series cross-sectional (TSCS) context, is essential. Further, robustness checks of unit sensitivity to alternative spatial aggregation should be routinely performed to mitigate associated biases on resulting statistical estimates.
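The scale and zone effects can be seen in a toy example. The sketch below uses a made-up 4 x 4 grid of cells (not data from any of the cited studies): x follows a diagonal trend and y equals x plus a blocky disturbance of +3 or -3. The same cells are then summarised without aggregation, as 2 x 2 blocks, as row strips, and as column strips. The three aggregated zonings each contain four zones of four cells, so they differ only in shape.

```python
import math

N = 4  # side length of the hypothetical grid of individual-level cells

# x follows a diagonal trend; the disturbance is +3 in the top-left and
# bottom-right quadrants and -3 elsewhere. All values are invented.
x = [[r + c + 1 for c in range(N)] for r in range(N)]
noise = [[3 if (r < 2) == (c < 2) else -3 for c in range(N)] for r in range(N)]
y = [[x[r][c] + noise[r][c] for c in range(N)] for r in range(N)]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    sxx = sum((a - mx) ** 2 for a in xs)
    syy = sum((b - my) ** 2 for b in ys)
    return sxy / math.sqrt(sxx * syy)


def zone_means(grid, zone_of):
    """Average the cell values of `grid` within each zone label."""
    cells = {}
    for r in range(N):
        for c in range(N):
            cells.setdefault(zone_of(r, c), []).append(grid[r][c])
    return [sum(cells[z]) / len(cells[z]) for z in sorted(cells)]


# Each zoning maps a cell (row, column) to a zone label.
zonings = {
    "individual cells": lambda r, c: (r, c),
    "2x2 blocks": lambda r, c: (r // 2, c // 2),
    "row strips": lambda r, c: r,
    "column strips": lambda r, c: c,
}

results = {
    name: pearson(zone_means(x, zone_of), zone_means(y, zone_of))
    for name, zone_of in zonings.items()
}

for name, r_value in results.items():
    print(f"{name:>16}: r = {r_value:.3f}")
```

In this constructed case the unaggregated correlation is about 0.47. The strip zonings average the disturbance away and report a correlation of 1.0 (a scale effect), while the 2 x 2 blocks, at the same scale as the strips, report about 0.43 (a zone effect). Real data will not behave this neatly; the grid was chosen so that the arithmetic can be checked by hand.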


Suggested solutions

Several suggestions have been made in the literature to reduce aggregation bias during regression analysis. A researcher might correct the variance-covariance matrix using samples from individual-level data. Alternatively, one might focus on local spatial regression rather than global regression. A researcher might also attempt to design areal units to maximize a particular statistical result. Others have argued that it may be difficult to construct a single set of optimal aggregation units for multiple variables, each of which may exhibit non-stationarity and spatial autocorrelation across space in different ways. Others have suggested developing statistics that change across scales in a predictable way, perhaps using fractal dimension as a scale-independent measure of spatial relationships. Still others have suggested Bayesian hierarchical models as a general methodology for combining aggregated and individual-level data for ecological inference.

Studies of the MAUP based on empirical data can provide only limited insight because the relationships between multiple spatial variables cannot be controlled. Data simulation is necessary to have control over various properties of individual-level data. Simulation studies have demonstrated that the spatial support of variables can affect the magnitude of ecological bias caused by spatial data aggregation.


MAUP sensitivity analysis

Using simulations for univariate data, Larsen advocated the use of a variance ratio to investigate the effect of spatial configuration, spatial association, and data aggregation. A detailed description of the variation of statistics due to MAUP is presented by Reynolds, who demonstrates the importance of the spatial arrangement and spatial autocorrelation of data values. Reynolds's simulation experiments were expanded by Swift in a series of nine exercises that began with simulated regression analysis and spatial trend, then focused on the topic of MAUP in the context of spatial epidemiology. A method of MAUP sensitivity analysis is presented that demonstrates that the MAUP is not entirely a problem (Swift, A., Liu, L., and Uber, J. (2008). "Reducing MAUP bias of correlation statistics between water quality and GI illness." Computers, Environment and Urban Systems 32, 134–148). MAUP can be used as an analytical tool to help understand spatial heterogeneity and spatial autocorrelation.

This topic is of particular importance because in some cases data aggregation can obscure a strong correlation between variables, making the relationship appear weak or even negative. Conversely, MAUP can make unrelated random variables appear to be significantly associated. Multivariate regression parameters are more sensitive to MAUP than correlation coefficients. Until a more analytical solution to MAUP is discovered, spatial sensitivity analysis using a variety of areal units is recommended as a methodology to estimate the uncertainty of correlation and regression coefficients due to ecological bias. An example of data simulation and re-aggregation using the ArcPy library is available (Swift, A. (2017). "Crime mapping data simulation", https://app.box.com/s/a84w16x7hffljjvkhtlr72eisj4qiene).

In transport planning, MAUP is associated with the design of traffic analysis zones (TAZs). A major point of departure in understanding problems in transportation analysis is the recognition that spatial analysis has some limitations associated with the discretization of space. Among them, modifiable areal units and boundary problems are directly or indirectly related to transportation planning and analysis through the design of traffic analysis zones, since most transport studies require, directly or indirectly, the definition of TAZs. The modifiable boundary and the scale issues should both be given specific attention during the specification of a TAZ because of the effects these factors exert on the statistical and mathematical properties of spatial patterns (i.e., the MAUP). In the studies of Viegas, Martinez and Silva (2009a, 2009b), the authors propose a method that recognizes that the results obtained from the study of spatial data are not independent of the scale, and that the aggregation effects are implicit in the choice of zonal boundaries. The delineation of zonal boundaries of TAZs has a direct impact on the realism and accuracy of the results obtained from transportation forecasting models.

In these studies, the MAUP effects on the TAZ definition and the transportation demand models are measured and analyzed using different grids (varying in size and in origin location). This analysis was developed by building an application integrated in commercial GIS software and by using a case study (Lisbon Metropolitan Area) to test its implementability and performance. The results reveal the conflict between statistical and geographic precision, and their relationship with the loss of information in the traffic assignment step of the transportation planning models.
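A sensitivity analysis over grids of different size and origin, as described above, might be sketched as follows. This is a minimal illustration on simulated points, not the method of Swift or of Viegas, Martinez and Silva: the data-generating model, the cell sizes, the origin offsets, and the function names are all assumptions chosen for the example, and it uses plain Python rather than ArcPy or any GIS software.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    sxx = sum((a - mx) ** 2 for a in xs)
    syy = sum((b - my) ** 2 for b in ys)
    return sxy / math.sqrt(sxx * syy)


def simulate_points(n, seed=0):
    """Simulate individual-level records on a 10 x 10 study area (assumed model)."""
    rng = random.Random(seed)
    points = []
    for _ in range(n):
        px, py = rng.uniform(0, 10), rng.uniform(0, 10)
        x = px + rng.gauss(0, 3)       # variable with a west-to-east trend
        y = 0.5 * x + rng.gauss(0, 3)  # variable weakly dependent on x
        points.append((px, py, x, y))
    return points


def aggregate(points, cell, origin):
    """Average x and y within square grid cells of side `cell` anchored at `origin`."""
    zones = {}
    for px, py, x, y in points:
        key = (math.floor((px - origin) / cell), math.floor((py - origin) / cell))
        zones.setdefault(key, []).append((x, y))
    xs = [sum(v[0] for v in vals) / len(vals) for vals in zones.values()]
    ys = [sum(v[1] for v in vals) / len(vals) for vals in zones.values()]
    return xs, ys


def sensitivity(points, cells, origins):
    """Correlation of zone means for every combination of cell size and origin."""
    results = {}
    for cell in cells:
        for origin in origins:
            xs, ys = aggregate(points, cell, origin)
            if len(xs) >= 3:  # skip zonings with too few zones to be meaningful
                results[(cell, origin)] = pearson(xs, ys)
    return results


points = simulate_points(500)
results = sensitivity(points, cells=(1.0, 2.0, 5.0), origins=(0.0, -0.5))

r_values = list(results.values())
summary = {
    "min": min(r_values),
    "max": max(r_values),
    "mean": sum(r_values) / len(r_values),
}

for (cell, origin), r_value in sorted(results.items()):
    print(f"cell={cell:>4}, origin={origin:>5}: r = {r_value:.3f}")
print(summary)
```

The spread between the minimum and maximum gives a rough indication of how much the estimated correlation depends on the choice of areal units. The same loop could report regression coefficients instead of correlations.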


See also

General topics
* Arbia's law of geography
* Concepts and Techniques in Modern Geography
* Representation theory
* Spatial analysis

Specific applications
* Boundary problem (in spatial analysis)
* Gerrymandering
* Spatial econometrics
* Spatial epidemiology


Sources

* This article contains quotations from Modifiable areal unit problem at the GIS Wiki, which is available under the Creative Commons Attribution 3.0 Unported (CC BY 3.0) license.
* Unwin, D. J. (1996). "GIS, spatial analysis and spatial statistics." Progress in Human Geography, 20: 540–551.
* Cressie, N. (1996). "Change of Support and the Modifiable Areal Unit Problem." Geographical Systems, 3: 159–180.
* Viegas, J., E.A. Silva, L. Martinez (2009a). "Effects of the Modifiable Areal Unit Problem on the Delineation of Traffic Analysis Zones." Environment and Planning B: Planning and Design, 36(4): 625–643.
* Viegas, J., E.A. Silva, L. Martinez (2009b). "A traffic analysis zone definition: a new methodology and algorithm." Transportation, 36 (5): 6

